# Clear Explanation of the Algorithms Used in AI Engineering From Scratch: A Complete Review

> Explore algorithms used in AI Engineering From Scratch. Get clear, step-by-step explanations with code and tests in this comprehensive review.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: deep-dive
- Published: 2026-06-06

---

**Yes, the *AI Engineering From Scratch* repository provides a clear, step-by-step explanation of every algorithm it covers, pairing theoretical markdown documentation with minimal, self-contained reference implementations and unit tests for each of its 435 lessons.**

The *AI Engineering From Scratch* curriculum, maintained by `rohitg00`, is an open-source educational project that teaches modern AI systems from first principles across **435 structured lessons**. Readers looking for a clear explanation of the algorithms used will find that every lesson couples a theoretical [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) write-up with a minimal `code/` implementation and automated tests. This design ensures that mathematical derivations for tokenizers, transformers, reinforcement learning, and optimization techniques are always traceable to short, runnable Python files.

## Clear Explanation of the Algorithms Used: Structure of Every Lesson

Each lesson in the repository follows a strict three-part structure that separates *why* an algorithm works from *how* to build it.

1. **Theory in plain English**: The [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) file inside each lesson directory introduces the conceptual foundation, prerequisite knowledge, and learning objectives. Equations are rendered in LaTeX and link back to original research papers, such as Sennrich et al. 2016 for BPE or Schulman et al. 2017 for PPO.

2. **Minimal reference code**: The companion `code/` script contains only a few dozen lines, starts with a header comment pointing to the documentation, and uses only dependencies permitted by the CI-enforced allow-list.

3. **Unit-test proof**: The `code/tests/` directory for each lesson includes at least five tests that exercise the implementation and can be executed with `python3 -m unittest discover`.
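
As a rough illustration of the third part, a lesson's test module might resemble the sketch below. The `softmax` function and the test names are hypothetical stand-ins chosen for this example; they are not copied from the repository.

```python
import math
import unittest


def softmax(xs):
    """Hypothetical lesson function: numerically stable softmax over a list."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class TestSoftmax(unittest.TestCase):
    def test_sums_to_one(self):
        self.assertAlmostEqual(sum(softmax([1.0, 2.0, 3.0])), 1.0)

    def test_uniform_for_equal_inputs(self):
        for p in softmax([5.0, 5.0]):
            self.assertAlmostEqual(p, 0.5)

    def test_stable_for_large_inputs(self):
        # Subtracting the max keeps math.exp from overflowing.
        self.assertAlmostEqual(sum(softmax([1000.0, 1001.0])), 1.0)
```

Saved under a lesson's `code/tests/` directory with a `test_*.py` filename, a module like this would be picked up by `python3 -m unittest discover`.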

Because the entire curriculum is generated into a static site via [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js), the public README remains a browsable table that links directly to every algorithm’s derivation and source.

## Algorithm Families Covered in the Curriculum

The repository spans foundational NLP, deep-learning architecture, reinforcement learning, distributed systems, and AI safety. Below are the major algorithm families and the exact paths where their clear explanations and implementations live.

- **Byte-Pair Encoding (BPE) tokenizer**: The curriculum explains the greedy compression algorithm repurposed for sub-word tokenization, including the merge-selection rule, special-token handling, and training data pipelines. The derivation lives in [`phases/10-llms-from-scratch/01-tokenizers/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/01-tokenizers/docs/en.md), and the reference implementation is in [`phases/10-llms-from-scratch/01-tokenizers/code/bpe.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/01-tokenizers/code/bpe.py).

- **Transformer building blocks**: Lessons cover scaled dot-product attention, multi-head attention, positional encoding, feed-forward blocks, residual connections, and layer normalization. The theory is documented in [`phases/07-transformers-deep-dive/05-full-transformer/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/07-transformers-deep-dive/05-full-transformer/docs/en.md), with illustrative code in [`phases/07-transformers-deep-dive/05-full-transformer/code/transformer.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/07-transformers-deep-dive/05-full-transformer/code/transformer.py).

- **Speculative decoding**: Readers learn about draft-model generation, the verification step, failure-mode analysis, and the speed-vs.-quality trade-off. See [`phases/10-llms-from-scratch/25-speculative-decoding/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/25-speculative-decoding/docs/en.md) and [`phases/10-llms-from-scratch/25-speculative-decoding/code/speculative.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/25-speculative-decoding/code/speculative.py).

- **Reinforcement Learning (RL) algorithms**: PPO, Q-learning, Monte Carlo methods, policy gradients, and RLHF are derived from first principles, including the clipped objective, advantage estimators, and the policy gradient theorem. PPO theory is in [`phases/09-reinforcement-learning/08-ppo/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/09-reinforcement-learning/08-ppo/docs/en.md) with code in [`phases/09-reinforcement-learning/08-ppo/code/ppo.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/09-reinforcement-learning/08-ppo/code/ppo.py); Monte Carlo methods are explained in [`phases/09-reinforcement-learning/03-monte-carlo-methods/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/09-reinforcement-learning/03-monte-carlo-methods/docs/en.md) with code in [`phases/09-reinforcement-learning/03-monte-carlo-methods/code/mc.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/09-reinforcement-learning/03-monte-carlo-methods/code/mc.py).

- **Multi-Agent Reinforcement Learning (MARL)**: MADDPG, QMIX, and MAPPO are taught alongside discussions of non-stationarity, credit assignment, and cooperative versus competitive settings. Documentation is in [`phases/16-multi-agent-and-swarms/20-marl-maddpg-qmix-mappo/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/16-multi-agent-and-swarms/20-marl-maddpg-qmix-mappo/docs/en.md), and the implementation is in [`phases/16-multi-agent-and-swarms/20-marl-maddpg-qmix-mappo/code/marl.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/16-multi-agent-and-swarms/20-marl-maddpg-qmix-mappo/code/marl.py).

- **Swarm optimization**: Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and Genetic Algorithms are mapped to prompt-parameter optimization, including fitness-function design and convergence diagnostics. The explanation is in [`phases/16-multi-agent-and-swarms/19-swarm-optimization-pso-aco/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/16-multi-agent-and-swarms/19-swarm-optimization-pso-aco/docs/en.md), with code in [`phases/16-multi-agent-and-swarms/19-swarm-optimization-pso-aco/code/swarm.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/16-multi-agent-and-swarms/19-swarm-optimization-pso-aco/code/swarm.py).

- **Differential privacy for LLMs**: The curriculum defines (ε, δ)-differential privacy and implements the DP-SGD algorithm with clipping and noise injection. Theory is in [`phases/18-ethics-safety-alignment/22-differential-privacy-for-llms/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/18-ethics-safety-alignment/22-differential-privacy-for-llms/docs/en.md), and the implementation is in [`phases/18-ethics-safety-alignment/22-differential-privacy-for-llms/code/dp_sgd.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/18-ethics-safety-alignment/22-differential-privacy-for-llms/code/dp_sgd.py).

- **Token-bucket rate limiting**: A proof of burst handling, refill-rate math, and a practical API-gate implementation are provided in [`phases/11-llm-engineering/11-caching-cost/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/11-llm-engineering/11-caching-cost/docs/en.md), with the algorithm implemented in [`phases/11-llm-engineering/11-caching-cost/code/token_bucket.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/11-llm-engineering/11-caching-cost/code/token_bucket.py).

- **All-reduce collective operations**: The two-pass reduce-scatter plus all-gather algorithm, bandwidth-optimal variants, and NCCL topology hints are explained in [`phases/19-capstone-projects/76-collective-ops-from-scratch/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/19-capstone-projects/76-collective-ops-from-scratch/docs/en.md) and implemented in [`phases/19-capstone-projects/76-collective-ops-from-scratch/code/all_reduce.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/19-capstone-projects/76-collective-ops-from-scratch/code/all_reduce.py).

- **Constitutional AI self-improvement**: The generate-evaluate-select loop, deterministic grading rubric, and policy-gradient on synthetic rewards are covered in [`phases/10-llms-from-scratch/09-constitutional-ai-self-improvement/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/09-constitutional-ai-self-improvement/docs/en.md), with code in [`phases/10-llms-from-scratch/09-constitutional-ai-self-improvement/code/constitutional_ai.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/09-constitutional-ai-self-improvement/code/constitutional_ai.py).
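
To make one of the families above concrete, the clip-and-noise step at the core of DP-SGD can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the repository's `dp_sgd.py`: the function name `dp_sgd_step` and the parameters `clip_norm`, `noise_multiplier`, and `lr` are hypothetical names chosen for the example.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.0, lr=0.1, rng=None):
    """One DP-SGD update: clip each example's gradient, sum, add noise, average."""
    rng = np.random.default_rng() if rng is None else rng
    grads = np.asarray(per_example_grads, dtype=float)   # shape (batch, dim)
    # Scale each per-example gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Gaussian noise with std = noise_multiplier * clip_norm is added to the sum.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / grads.shape[0]
    return params - lr * noisy_mean
```

Clipping bounds how much any single example can move the parameters, and the noise scale is tied to that bound; the resulting (ε, δ) guarantee depends on a privacy accountant that this sketch does not include.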

## Code-Level Examples of Key Algorithms

To demonstrate how the repository grounds theory in practice, here are short snippets based on the reference implementations.

### BPE Merge Step

The `bpe_merge` function in [`phases/10-llms-from-scratch/01-tokenizers/code/bpe.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/01-tokenizers/code/bpe.py) implements a single iteration of the Byte-Pair Encoding merge rule.

```python
# File: phases/10-llms-from-scratch/01-tokenizers/code/bpe.py
# See docs/en.md for the mathematical justification.
import re


def bpe_merge(vocab: dict[str, int], merges: list[tuple[str, str]]) -> dict[str, int]:
    """Perform a single BPE merge on `vocab` (space-separated symbols -> frequency)."""
    a, b = merges[0]                     # the most frequent pair
    new_token = a + b
    # Match the pair only at symbol boundaries, so merging ("a", "b")
    # cannot touch the tail of a longer symbol such as "xa".
    pattern = re.compile(r"(?<!\S)" + re.escape(a + " " + b) + r"(?!\S)")
    new_vocab = {}
    for token, freq in vocab.items():
        # Replace occurrences of the pair with the new token
        new_tokenized = pattern.sub(lambda _: new_token, token)
        new_vocab[new_tokenized] = freq
    return new_vocab
```

The accompanying lesson explains why the most frequent adjacent pair is selected, how the merge reduces total symbol count, and how the loop updates the merge table.
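
The pair-selection step described above can be sketched as follows. The sketch assumes the same vocabulary format as the snippet (space-separated symbols mapped to word frequencies); the helper name `most_frequent_pair` is hypothetical and not taken from the repository.

```python
from collections import Counter


def most_frequent_pair(vocab: dict[str, int]) -> tuple[str, str]:
    """Return the adjacent symbol pair with the highest total frequency."""
    counts: Counter = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        # Every adjacent pair in the word contributes the word's frequency.
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts.most_common(1)[0][0]


vocab = {"l o w": 5, "l o w e r": 2, "s l o t": 1}
best = most_frequent_pair(vocab)   # ("l", "o") occurs 5 + 2 + 1 = 8 times
```

Passing `[best]` as the `merges` argument of a merge function like the one above would then rewrite every `l o` into `lo`. The sketch raises `IndexError` on a vocabulary with no adjacent pairs, which a fuller implementation would treat as the stopping condition.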

### Scaled Dot-Product Attention

The scaled dot-product attention mechanism is implemented in [`phases/07-transformers-deep-dive/05-full-transformer/code/attention.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/07-transformers-deep-dive/05-full-transformer/code/attention.py) using NumPy.

```python
# File: phases/07-transformers-deep-dive/05-full-transformer/code/attention.py
# Minimal implementation of scaled dot-product attention (applied per head).
import numpy as np


def attention(Q, K, V, mask=None):
    """Compute attention(Q, K, V) = softmax(QKᵀ / √d_k) V."""
    dk = Q.shape[-1]
    # Swap the last two axes of K; ndarray.transpose(-2, -1) would not do this.
    scores = Q @ np.swapaxes(K, -2, -1) / np.sqrt(dk)
    if mask is not None:
        # mask is True where attention is allowed; other positions get a large negative score.
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

The lesson documentation derives the scaling factor √dₖ and describes the purpose of the mask before showing how multiple heads are concatenated.
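
The head-concatenation step mentioned above can be sketched in NumPy under some simplifying assumptions: no learned projection matrices, no mask, and a model dimension evenly divisible by the number of heads. The names `split_heads`, `merge_heads`, and `multi_head_self_attention` are hypothetical and are not taken from the repository.

```python
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def split_heads(x, num_heads):
    """(seq, d_model) -> (num_heads, seq, d_head)."""
    seq, d_model = x.shape
    return x.reshape(seq, num_heads, d_model // num_heads).transpose(1, 0, 2)


def merge_heads(x):
    """(num_heads, seq, d_head) -> (seq, num_heads * d_head)."""
    num_heads, seq, d_head = x.shape
    return x.transpose(1, 0, 2).reshape(seq, num_heads * d_head)


def multi_head_self_attention(x, num_heads):
    """Projection-free multi-head self-attention, for illustration only."""
    q = k = v = split_heads(x, num_heads)
    d_head = q.shape[-1]
    scores = q @ np.swapaxes(k, -2, -1) / np.sqrt(d_head)   # (heads, seq, seq)
    return merge_heads(softmax(scores) @ v)
```

Each head attends over its own slice of the feature dimension, and `merge_heads` concatenates the per-head outputs back into a `(seq, d_model)` array. A full layer would additionally apply learned query, key, value, and output projections.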

### PPO Clipped Objective

Reinforcement learning in the curriculum culminates in the PPO clipped objective, which is implemented in [`phases/09-reinforcement-learning/08-ppo/code/ppo.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/09-reinforcement-learning/08-ppo/code/ppo.py).

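```python
# File: phases/09-reinforcement-learning/08-ppo/code/ppo.py
# Core PPO update step.
import numpy as np


def ppo_loss(old_logp, new_logp, advantages, eps=0.2):
    """Clipped surrogate loss over arrays of log-probabilities and advantages."""
    ratio = np.exp(new_logp - old_logp)               # π_θ / π_θ_old
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    # Take the pessimistic (smaller) objective per sample, then negate it
    # so that minimizing the loss maximizes the objective.
    return -np.mean(np.minimum(unclipped, clipped))
```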

The documentation proves why clipping stabilizes training, and the unit tests compare this loss against a reference implementation.
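
A small numeric check can make the effect of clipping tangible. The sketch below redefines the loss so that it runs on its own; the probabilities and advantages are made-up values for illustration.

```python
import numpy as np


def ppo_loss(old_logp, new_logp, advantages, eps=0.2):
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))


old_logp = np.log(np.array([0.5, 0.5]))
advantages = np.array([1.0, 1.0])

# Ratio 1.1 lies inside [0.8, 1.2], so the objective still rewards the change.
inside = ppo_loss(old_logp, np.log(np.array([0.55, 0.55])), advantages)
# Ratios 1.5 and 1.8 both exceed 1.2, so both are clipped to the same value.
far = ppo_loss(old_logp, np.log(np.array([0.75, 0.75])), advantages)
farther = ppo_loss(old_logp, np.log(np.array([0.9, 0.9])), advantages)
```

With positive advantages, `far` and `farther` come out equal: once the ratio leaves the clip range, pushing the policy further yields no additional reduction in loss, which removes the incentive for very large policy updates.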

### Token-Bucket Rate Limiter

Engineering concepts are treated with the same rigor. The token-bucket algorithm in [`phases/11-llm-engineering/11-caching-cost/code/token_bucket.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/11-llm-engineering/11-caching-cost/code/token_bucket.py) demonstrates burst handling.

```python
# File: phases/11-llm-engineering/11-caching-cost/code/token_bucket.py
import time


class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity            # max tokens
        self.tokens = capacity              # start full so an initial burst is allowed
        self.refill_rate = refill_rate      # tokens per second
        # A monotonic clock never jumps backwards when the system time is adjusted.
        self.last_ts = time.monotonic()

    def allow(self, n=1):
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_ts) * self.refill_rate)
        self.last_ts = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

The lesson derives the refill-rate math and proves how the algorithm enables bursty traffic while guaranteeing a long-term average rate.
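
The burst-then-throttle behavior can be demonstrated deterministically by injecting a fake clock. The `clock` parameter and the `FakeClock` helper below are assumptions added for this sketch; they are not part of the snippet above.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_rate = refill_rate
        self.clock = clock                  # injectable time source (assumption)
        self.last_ts = clock()

    def allow(self, n=1):
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_ts) * self.refill_rate)
        self.last_ts = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
bucket = TokenBucket(capacity=3, refill_rate=1.0, clock=clock)
burst = [bucket.allow() for _ in range(4)]        # three allowed, fourth rejected
clock.now += 2.0                                  # two seconds pass: two tokens refill
after_wait = [bucket.allow() for _ in range(3)]   # two allowed, third rejected
```

The capacity sets the largest burst the bucket will admit at once, while the refill rate caps the sustained request rate.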

## How the Curriculum Maintains Synchronization

A frequent problem in educational repositories is documentation drift. The *AI Engineering From Scratch* project mitigates this by treating each lesson as a **single commit**, as defined in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md).

- The [`LESSON_TEMPLATE.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/LESSON_TEMPLATE.md) enforces standard front matter that lists learning objectives, prerequisites, and estimated time.
- A CI-enforced dependency allow-list prevents prohibited third-party packages from entering reference implementations.
- The static site generator ([`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) → [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js)) updates the public README automatically, ensuring browsable lesson tables always point to current doc and code paths.
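
A dependency allow-list of this kind can be enforced with a short AST scan. The following is a hypothetical sketch of such a check, not the repository's actual CI script; the contents of `ALLOWED` and the function name `disallowed_imports` are assumptions.

```python
import ast

# Hypothetical allow-list; the repository's real list may differ.
ALLOWED = frozenset({"math", "random", "time", "unittest", "numpy"})


def disallowed_imports(source: str, allowed: frozenset = ALLOWED) -> set:
    """Return top-level module names imported by `source` that are not allowed."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b as c` counts as a dependency on package `a`.
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are local to the lesson and skipped.
            found.add(node.module.split(".")[0])
    return found - allowed
```

A CI job could run a check like this over every lesson's `code/` files and fail the build whenever the returned set is non-empty.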

## Summary

- The *AI Engineering From Scratch* repository contains **435 lessons** that each explain one algorithm or system concept from first principles.
- Every lesson provides a **clear explanation of the algorithms used** in [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md), a minimal implementation in `code/`, and unit tests in `code/tests/`.
- Core algorithm families include **BPE**, **transformer attention**, **PPO**, **speculative decoding**, **MARL**, **swarm optimization**, **differential privacy**, and **distributed collective operations**.
- File paths such as [`phases/10-llms-from-scratch/01-tokenizers/code/bpe.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/01-tokenizers/code/bpe.py) and [`phases/09-reinforcement-learning/08-ppo/code/ppo.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/09-reinforcement-learning/08-ppo/code/ppo.py) directly link the theory to executable source.
- The repository uses automated tooling and a strict lesson template to prevent documentation drift.

## Frequently Asked Questions

### Does AI Engineering From Scratch explain the math behind each algorithm?

Yes. Every lesson includes a [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) file that derives the mathematics in LaTeX, such as the BPE merge rule, the attention-score formula, and the PPO clipped objective. These documents cite original research papers and directly reference the companion code files.

### Are the code implementations runnable on their own?

Yes. Each `code/` script is minimal and self-contained, typically only a few dozen lines. They avoid unnecessary dependencies through a CI-enforced allow-list, and every lesson includes at least five unit tests that can be executed with `python3 -m unittest discover`.

### How does the repository prevent documentation from becoming outdated?

Each lesson is treated as a single commit per the guidelines in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md). The curriculum is also generated into a static site via [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js), which produces [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) and updates the README automatically so that lesson tables always link to current documentation and implementations.

### Which advanced algorithms are covered beyond basic transformers?

According to the source code, the curriculum covers speculative decoding, multi-agent reinforcement learning algorithms such as MADDPG and QMIX, swarm optimization including PSO and ACO, differential privacy via DP-SGD, and distributed all-reduce collective operations implemented from scratch.