Coming Soon: Project Aryabhatta, a new era of AI education.
24 Papers Live • Open Source

30 Foundational AI Papers in 30 Days

From RNNs to Transformers — complete implementations you can run, learn from, and build upon. Every paper, every line of code, explained.

"

If you really learn all of these, you'll know 90% of what matters today

— Ilya Sutskever
24 PAPERS LIVE
300+ ACTIVE USERS
15+ COUNTRIES
32K+ LINES OF CODE

24 Papers, Fully Implemented

Each paper comes with deep explanations, clean code, visualizations, and exercises. Click any card to explore.

Day 01

The Unreasonable Effectiveness of RNNs

Character-level language models that generate Shakespeare, code, and music

NumPy · 5 exercises
Day 02

Understanding LSTM Networks

Gates, memory cells, and learning long-term dependencies (minimal sketch below)

PyTorch · 5 exercises
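To make the gating story concrete, here is a minimal, self-contained sketch of a single LSTM step. It is an illustration, not the repo's implementation; W, U, and b are assumed stacked gate parameters.

import torch

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W: (input_dim, 4*hidden), U: (hidden, 4*hidden), b: (4*hidden,)
    z = x_t @ W + h_prev @ U + b                      # all four gate pre-activations
    i, f, g, o = z.chunk(4, dim=-1)                   # input, forget, candidate, output
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)
    c_t = f * c_prev + i * g                          # erase old memory, write new candidate
    h_t = o * torch.tanh(c_t)                         # gated read-out of the memory cell
    return h_t, c_t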
Day 03

Recurrent Neural Network Regularization

Dropout, layer norm, and preventing overfitting in sequence models

PyTorch · 5 exercises
Day 04

Minimizing Description Length

Compression as the key to intelligence and model selection

Python · 5 exercises
Day 05

The MDL Principle Tutorial

Two-part codes, prequential MDL, and normalized maximum likelihood

Python · 5 exercises
Day 06

The First Law of Complexodynamics

Information equilibration and evolutionary dynamics in complex systems

Python · 5 exercises
Day 07

The Coffee Automaton

Cellular automata, chaos theory, and emergent behavior

Python · 5 exercises
Day 08

ImageNet Classification with CNNs

AlexNet — the paper that sparked the deep learning revolution

PyTorch · 5 exercises
Day 09

Deep Residual Learning (ResNet)

Skip connections enabling 1000+ layer networks (sketch below)

PyTorch · 5 exercises
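A minimal sketch of the core idea, assuming a fixed channel count (illustrative only, not the course's exact block): the skip connection adds the input back onto the convolutional branch, so gradients always have a direct path through the "+ x".

import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # skip connection: identity path around the block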
Day 10

Identity Mappings in ResNets

Pre-activation and improved gradient flow

PyTorch · 5 exercises
Day 11

Multi-Scale Context with Dilated Convolutions

Exponentially expanding receptive fields without resolution loss (sketch below)

PyTorch · 5 exercises
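A small sketch of the trick, with assumed channel counts and input size: matching padding to dilation keeps the spatial resolution fixed while the receptive field grows roughly exponentially with depth.

import torch
import torch.nn as nn

context = nn.Sequential(
    nn.Conv2d(64, 64, 3, padding=1, dilation=1),   # sees a 3x3 neighbourhood
    nn.Conv2d(64, 64, 3, padding=2, dilation=2),   # receptive field widens
    nn.Conv2d(64, 64, 3, padding=4, dilation=4),   # ...without any downsampling
)
x = torch.randn(1, 64, 56, 56)
print(context(x).shape)                            # torch.Size([1, 64, 56, 56])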
Day 12

Dropout

A simple way to prevent neural networks from overfitting

PyTorch · 5 exercises
Day 13

Attention Is All You Need

The Transformer architecture that revolutionized AI

PyTorch · 5 exercises
Day 14

The Annotated Transformer

Line-by-line PyTorch implementation with explanations

PyTorch · 5 exercises
Day 15

Bahdanau Attention (NMT)

The original attention mechanism before Transformers (sketch below)

PyTorch · 5 exercises
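A compact sketch of additive attention under assumed dimensions (not the repo's code): a small MLP scores every encoder state against the current decoder state, and the context vector is their weighted sum.

import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim)
        self.W_dec = nn.Linear(dec_dim, attn_dim)
        self.v = nn.Linear(attn_dim, 1)

    def forward(self, dec_state, enc_states):
        # enc_states: (B, T, enc_dim), dec_state: (B, dec_dim)
        scores = self.v(torch.tanh(self.W_enc(enc_states) +
                                   self.W_dec(dec_state).unsqueeze(1)))
        weights = torch.softmax(scores, dim=1)           # (B, T, 1) over source positions
        context = (weights * enc_states).sum(dim=1)      # weighted sum of encoder states
        return context, weights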
Day 16

Order Matters (Seq2Seq for Sets)

Teaching networks when order doesn't matter in inputs

PyTorch · 5 exercises
Day 17

Neural Turing Machines

Differentiable external memory with content-based addressing (sketch below)

PyTorch · 5 exercises
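A minimal sketch of content-based addressing alone (the full NTM adds location-based shifts and write heads): the controller's key is compared with every memory row by cosine similarity and sharpened by a scalar beta.

import torch
import torch.nn.functional as F

def content_address(memory, key, beta):
    # memory: (N, M) rows, key: (M,), beta: sharpening scalar > 0
    sim = F.cosine_similarity(memory, key.unsqueeze(0), dim=-1)   # (N,) similarities
    return torch.softmax(beta * sim, dim=-1)                      # weights over memory rows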
Day 18

Pointer Networks

Attention as output — pointing at input elements for variable-size combinatorial problems (sketch below)

PyTorch · 5 exercises
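A small sketch of the pointer head (illustrative, with assumed dimensions): unlike standard attention, the scores are not used to build a context vector; they are themselves the output, a distribution over the input positions.

import torch
import torch.nn as nn

class PointerHead(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.W1 = nn.Linear(dim, dim)
        self.W2 = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, 1)

    def forward(self, dec_state, enc_states):
        # enc_states: (B, T, dim), dec_state: (B, dim)
        scores = self.v(torch.tanh(self.W1(enc_states) +
                                   self.W2(dec_state).unsqueeze(1))).squeeze(-1)
        return torch.softmax(scores, dim=-1)   # distribution over the T input elements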
Day 19

Relational Reasoning

Learning relationships between objects (Sort-of-CLEVR, Relation Networks)

PyTorch · 7 exercises
Day 20

Relational RNNs

Memory as a set of interacting slots — solving problems LSTMs can't

PyTorch · 5 exercises
Day 21

Neural Message Passing

Unifying graph neural networks — messages, updates, and readouts for molecular prediction

PyTorch · 5 exercises
Day 22

Deep Speech 2

End-to-end speech recognition — replacing the entire ASR pipeline with a single neural network

PyTorch · 5 exercises
Day 23

Variational Lossy Autoencoder

Solving posterior collapse by limiting the decoder's receptive field

PyTorch · 5 exercises
Day 24

GPipe: Efficient Training of Giant Neural Networks

Scaling models beyond memory limits with pipeline parallelism and micro-batching (sketch below)

PyTorch · 5 exercises
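The sketch below shows only the micro-batching half of the idea, under an assumed model interface; real GPipe additionally partitions the layers across devices and schedules the micro-batches through that pipeline.

import torch

def run_in_microbatches(model, batch, n_micro=4):
    # Split the batch so each forward pass holds fewer activations in memory at once.
    outputs = [model(mb) for mb in batch.chunk(n_micro, dim=0)]
    return torch.cat(outputs, dim=0)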

Coming Next

Day 25 Scaling Laws for Neural Language Models
Day 26 MoE: Outrageously Large Neural Networks
+4 more

Your 30-Day Journey

From sequence models to modern architectures — a complete curriculum.

01

The Foundations

Complete

Days 1-7 · RNNs, LSTMs, regularization, compression, and complexity theory

RNNs · LSTMs · Dropout · MDL · Complexity · Automata
02

The Deep Learning Explosion

Complete

Days 8-12 · CNNs, residual learning, and the vision revolution

AlexNet · ResNet · ResNet V2 · Dilated Conv · Dropout
03

The Transformer Era

Complete

Days 13-16 · Attention mechanisms and sequence-to-sequence learning

Attention · Annotated Transformer · Bahdanau · Order Matters
04

Specialized Architectures

Complete

Days 17-22 · Memory networks, graphs, reasoning, and speech

Neural Turing Machines · Pointer Networks · Relational Reasoning · Relational RNNs · MPNNs · Deep Speech 2
05

Generative Models & Scale

In Progress

Days 23-28 · GANs, VAEs, diffusion, and scaling laws

VLAE · GPipe
06

Modern Extensions

Coming

Days 29-30 · RLHF, alignment, and the path to ChatGPT

Learn by Building

Every paper comes with complete, runnable implementations. No "left as an exercise" — we build everything from scratch so you truly understand how these systems work.

Production Code

Clean, documented, runs everywhere

5+ Exercises Per Paper

With complete solutions

Interactive Notebooks

Run and experiment live

Deep Explanations

Theory meets practice

transformer.py
import math

import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    """Multi-head attention from scratch."""

    def __init__(self, d_model, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.d_k = d_model // n_heads
        self.W_q = nn.Linear(d_model, d_model)
        self.W_k = nn.Linear(d_model, d_model)
        self.W_v = nn.Linear(d_model, d_model)
        self.W_o = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, d_model = x.shape
        # Project to queries, keys, values and split into heads
        Q = self.W_q(x).view(B, T, self.n_heads, self.d_k).transpose(1, 2)
        K = self.W_k(x).view(B, T, self.n_heads, self.d_k).transpose(1, 2)
        V = self.W_v(x).view(B, T, self.n_heads, self.d_k).transpose(1, 2)
        # Scaled dot-product attention
        scores = Q @ K.transpose(-2, -1) / math.sqrt(self.d_k)
        attn = F.softmax(scores, dim=-1)
        # Recombine heads and project back to d_model
        out = (attn @ V).transpose(1, 2).reshape(B, T, d_model)
        return self.W_o(out)
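
A quick shape check of the block above (the batch size, sequence length, and d_model here are illustrative choices, not values fixed by the course code):

import torch

x = torch.randn(2, 10, 512)                  # 2 sequences, 10 tokens, d_model = 512
mha = MultiHeadAttention(d_model=512, n_heads=8)
print(mha(x).shape)                          # torch.Size([2, 10, 512])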

Ready to Master the Foundations?

Join the journey. Learn AI the right way — by building from scratch.