From RNNs to Transformers — complete implementations you can run, learn from, and build upon. Every paper, every line of code, explained.
If you really learn all of these, you'll know 90% of what matters today.
— Ilya Sutskever
Each paper comes with deep explanations, clean code, visualizations, and exercises. Click any card to explore.
Character-level language models that generate Shakespeare, code, and music
Gates, memory cells, and learning long-term dependencies
Dropout, layer norm, and preventing overfitting in sequence models
Compression as the key to intelligence and model selection
Two-part codes, prequential MDL, and normalized maximum likelihood
Information equilibration and evolutionary dynamics in complex systems
Cellular automata, chaos theory, and emergent behavior
AlexNet — the paper that sparked the deep learning revolution
Skip connections enabling 1000+ layer networks
Pre-activation and improved gradient flow
Exponentially expanding receptive fields without resolution loss
A simple way to prevent neural networks from overfitting
The Transformer architecture that revolutionized AI
Line-by-line PyTorch implementation with explanations
The original attention mechanism before Transformers
Teaching networks to handle inputs where order doesn't matter
Differentiable external memory with content-based addressing
Attention as output — pointing at input elements for variable-size combinatorial problems
Learning relationships between objects (Sort-of-CLEVR, Relation Networks)
Memory as a set of interacting slots — solving problems LSTMs can't
Unifying graph neural networks — messages, updates, and readouts for molecular prediction
End-to-end speech recognition — replacing the entire ASR pipeline with a single neural network
Solving posterior collapse by limiting the decoder's receptive field
Scaling models beyond memory limits with pipeline parallelism and micro-batching
From sequence models to modern architectures — a complete curriculum.
Days 1-7 · RNNs, LSTMs, regularization, compression, and complexity theory
Days 8-12 · CNNs, residual learning, and the vision revolution
Days 13-16 · Attention mechanisms and sequence-to-sequence learning
Days 17-22 · Memory networks, graphs, reasoning, and speech
Days 23-28 · GANs, VAEs, diffusion, and scaling laws
Days 29-30 · RLHF, alignment, and the path to ChatGPT
Every paper comes with complete, runnable implementations. No "left as an exercise" — we build everything from scratch so you truly understand how these systems work.
Clean, documented, runs everywhere
With complete solutions
Run and experiment live
Theory meets practice
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    """Multi-head attention from scratch."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.d_k = d_model // n_heads  # dimension per head
        self.W_q = nn.Linear(d_model, d_model)
        self.W_k = nn.Linear(d_model, d_model)
        self.W_v = nn.Linear(d_model, d_model)
        self.W_o = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, _ = x.shape
        # Project inputs and split into heads: (B, n_heads, T, d_k)
        Q = self.W_q(x).view(B, T, self.n_heads, self.d_k).transpose(1, 2)
        K = self.W_k(x).view(B, T, self.n_heads, self.d_k).transpose(1, 2)
        V = self.W_v(x).view(B, T, self.n_heads, self.d_k).transpose(1, 2)
        # Scaled dot-product attention
        scores = Q @ K.transpose(-2, -1) / math.sqrt(self.d_k)
        attn = F.softmax(scores, dim=-1)
        out = (attn @ V).transpose(1, 2).contiguous().view(B, T, -1)  # merge heads
        return self.W_o(out)
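As a quick sanity check, here is a minimal sketch of how the module above could be exercised on a toy batch (the shapes and values are illustrative assumptions, not taken from the course materials):
x = torch.randn(2, 10, 64)   # (batch, seq_len, d_model); random toy input
mha = MultiHeadAttention(d_model=64, n_heads=8)
out = mha(x)
print(out.shape)             # torch.Size([2, 10, 64]): attention preserves the input shape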
Join the journey. Learn AI the right way — by building from scratch.