# The 20 Phases of the AI Engineering Curriculum: A Complete Developer Roadmap

> Explore the 20 phases of the AI engineering curriculum. This comprehensive developer roadmap guides you through fundamental tools to advanced projects with code and assessments.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: architecture
- Published: 2026-06-10

---

**The AI engineering curriculum is organized into 20 sequential phases that build from fundamental tooling to advanced capstone projects, with each phase residing under the `phases/` directory and containing structured lessons with code implementations, documentation, and assessments.**

The open-source repository `rohitg00/ai-engineering-from-scratch` provides a comprehensive, implementation-first educational framework for mastering modern AI engineering. Spanning **20 phases of the AI engineering curriculum**, this resource progresses from environment setup and mathematical foundations to autonomous multi-agent systems and production infrastructure, following a pedagogical structure defined in the repository's main [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) (lines 57-80).

## Curriculum Architecture and Lesson Structure

Each phase follows a uniform directory structure to keep the learning path consistent. Within any `phases/<NN>-<phase-name>/` directory, each lesson contains four standardized components: a `code/` folder with implementation files, a [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) file with instructional content, an `outputs/` directory for generated artifacts, and a [`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json) file for knowledge validation. This scaffolding is specified in the [`LESSON_TEMPLATE.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/LESSON_TEMPLATE.md) file in the repository root, so learners encounter the same navigation pattern whether they are exploring `phases/01-math-foundations` or `phases/14-agent-engineering`.
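The four-component layout described above can be checked with a few lines of Python. This is a minimal sketch, not the repository's own `scripts/audit_lessons.py`; the function name `missing_components` and the `REQUIRED` tuple are hypothetical names introduced here for illustration.

```python
from pathlib import Path

# The four components named in the text above. Whether every lesson in the
# repository actually contains all four is an assumption, not verified here.
REQUIRED = ("code", "docs/en.md", "outputs", "quiz.json")


def missing_components(lesson_dir):
    """Return the required entries that are absent from one lesson directory."""
    lesson = Path(lesson_dir)
    return [rel for rel in REQUIRED if not (lesson / rel).exists()]
```

Calling `missing_components` on a lesson path returns an empty list when all four entries exist, which makes it easy to loop over a phase and report incomplete lessons.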

The repository uses automated tooling to maintain this structure. The [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) script validates lesson completeness, while [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) generates the public-facing website `aiengineeringfromscratch.com` from the markdown curriculum. Progress tracking is maintained in [`ROADMAP.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/ROADMAP.md), which documents completion status and upcoming work for each phase.

## The 20 Phases: From Zero to Production

The curriculum follows a strict numerical ordering (00-19) in which each phase depends on competencies developed in earlier phases. Based on the repository's directory layout, the phases are:

0. **Setup & Tooling** (`phases/00-setup-and-tooling`) – Development environment configuration, Git workflows, Docker containerization, Jupyter notebooks, and system profiling.

1. **Math Foundations** (`phases/01-math-foundations`) – Linear algebra, calculus, probability theory, optimization algorithms, and graph theory fundamentals.

2. **ML Fundamentals** (`phases/02-ml-fundamentals`) – Classical machine learning including regression, decision trees, SVMs, clustering algorithms, and scikit-learn pipelines.

3. **Deep Learning Core** (`phases/03-deep-learning-core`) – Perceptrons, multi-layer neural networks, backpropagation mechanics, optimizer implementations, and construction of a mini deep-learning framework.

4. **Vision** (`phases/04-computer-vision`) – Convolutional operations, CNN architectures, object detection, image segmentation, diffusion models, Vision Transformers (ViT), and 3D vision systems.

5. **NLP: Foundations to Advanced** (`phases/05-nlp-foundations-to-advanced`) – Text tokenization, word embeddings, sequence-to-sequence models, attention mechanisms, LLM-style text generation, and Retrieval-Augmented Generation (RAG).

6. **Speech & Audio** (`phases/06-speech-and-audio`) – Waveform processing, spectrogram analysis, Automatic Speech Recognition (ASR), OpenAI Whisper integration, Text-to-Speech (TTS), and voice cloning techniques.

7. **Transformers Deep Dive** (`phases/07-transformers-deep-dive`) – Self-attention mechanisms, multi-head attention, positional encodings, BERT/GPT architecture implementations, Mixture of Experts (MoE), and KV-cache optimization.

8. **Generative AI** (`phases/08-generative-ai`) – Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), diffusion models, latent diffusion, ControlNet, and video/audio generation systems.

9. **Reinforcement Learning** (`phases/09-reinforcement-learning`) – Markov Decision Processes (MDPs), dynamic programming, Q-learning, Deep Q-Networks (DQN), policy gradients, PPO, RLHF, and multi-agent environments.

10. **LLMs from Scratch** (`phases/10-llms-from-scratch`) – Byte-Pair Encoding (BPE) tokenizers, mini-GPT pre-training implementations, distributed training strategies, RLHF fine-tuning, and model quantization.

11. **LLM Engineering** (`phases/11-llm-engineering`) – Advanced prompt engineering, RAG pipeline construction, LoRA fine-tuning, function calling APIs, and safety guardrails.

12. **Multimodal AI** (`phases/12-multimodal-ai`) – Vision-language models (CLIP, BLIP-2), audio-language integration, video understanding, and omni-modal architectures.

13. **Tools & Protocols** (`phases/13-tools-and-protocols`) – Tool-use interfaces, Model Context Protocol (MCP) fundamentals, server/client architectures, security considerations, and request routing.

14. **Agent Engineering** (`phases/14-agent-engineering`) – Agent control loops, planning algorithms, memory system design, LangGraph implementations, AutoGen frameworks, and agent evaluation benchmarks.

15. **Autonomous Systems** (`phases/15-autonomous-systems`) – Architectures for self-contained agents and autonomous system design patterns.

16. **Multi-Agent & Swarms** (`phases/16-multi-agent-and-swarms`) – Inter-agent coordination protocols, hierarchical orchestration, and emergent swarm dynamics.

17. **Infrastructure & Production** (`phases/17-infrastructure-and-production`) – Model deployment strategies, observability tooling, logging infrastructure, autoscaling, and CI/CD pipelines for AI services.

18. **Ethics & Alignment** (`phases/18-ethics-and-alignment`) – AI safety protocols, bias mitigation techniques, interpretability methods (mechanistic interpretability), and Constitutional AI.

19. **Capstone Projects** (`phases/19-capstone-projects`) – End-to-end real-world projects requiring integration of the full technology stack, from data ingestion to deployed inference.
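The `NN-name` folder convention used in the list above lends itself to a simple consistency check. The sketch below is an illustration only: the regular expression and the helper names `parse_phase` and `numbering_gaps` are assumptions made for this example, not code from the repository.

```python
import re

# Two digits, a hyphen, then a lowercase hyphen-separated slug,
# e.g. "07-transformers-deep-dive". The pattern is inferred from the
# folder names listed above and is an assumption.
PHASE_DIR = re.compile(r"^(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)$")


def parse_phase(name):
    """Split a phase folder name into (number, slug)."""
    match = PHASE_DIR.match(name)
    if match is None:
        raise ValueError(f"not a phase folder name: {name!r}")
    return int(match.group(1)), match.group(2)


def numbering_gaps(names, first=0, last=19):
    """Return the phase numbers in first..last that have no matching folder."""
    seen = {parse_phase(n)[0] for n in names}
    return [i for i in range(first, last + 1) if i not in seen]
```

Passing the directory names found under `phases/` to `numbering_gaps` returns an empty list when all phases 00 through 19 are present.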

## Programmatically Exploring the Curriculum

You can interact with the curriculum structure programmatically using the repository's predictable file organization. The `phases/` directory contains numbered folders that allow for automated traversal and validation.

### List All Lessons Within a Phase

To enumerate all lessons in a specific phase (for example, Phase 1), use Python to traverse the directory structure:

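```python
import pathlib


def list_lessons(phase_folder: str) -> list[str]:
    """Return the sorted lesson directory names inside one phase folder."""
    base = pathlib.Path('phases') / phase_folder
    # Keep directories only; loose files in the phase folder are not lessons.
    return sorted(p.name for p in base.iterdir() if p.is_dir())


# Run from the repository root so that 'phases/' resolves correctly.
print(list_lessons('01-math-foundations'))   # → ['01-linear-algebra-intuition', ...]
```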

### Inspect Lesson Documentation Metadata

Each lesson's documentation resides in [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) with standardized front-matter. Extract the metadata header using standard Unix tools:

```bash
# Show the metadata header of lesson 01 in Phase 1
sed -n '1,15p' phases/01-math-foundations/01-linear-algebra-intuition/docs/en.md
```
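For readers who prefer to stay in Python, the same header inspection can be sketched as follows. The helper name `read_header` is hypothetical, and the handling of `---` delimiters assumes YAML-style front matter, which the source does not confirm; without such delimiters the function simply returns the first `max_lines` lines, like the `sed` command.

```python
from pathlib import Path


def read_header(path, max_lines=15):
    """Return the leading lines of a Markdown file, like `sed -n '1,15p'`.

    If the file opens with a '---' line (an ASSUMED front-matter delimiter),
    stop at the closing '---' instead of returning all max_lines lines.
    """
    with Path(path).open(encoding="utf-8") as handle:
        # zip() stops after max_lines lines or at end of file, whichever is first.
        lines = [line.rstrip("\n") for _, line in zip(range(max_lines), handle)]
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return lines[: i + 1]
    return lines
```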

### Execute Lesson Implementations

Individual lessons contain runnable code in their `code/` directories. For example, to run the perceptron implementation from Phase 3:

```bash
# Python example – run the perceptron implementation from Phase 3
python phases/03-deep-learning-core/01-the-perceptron/code/perceptron.py
```

### Query the Complete Phase Catalogue

To generate an index of all 20 phases programmatically, use Node.js to read the directory structure:

```js
const fs = require('fs');
const path = require('path');

// Keep only directories so stray files under phases/ are ignored.
const phases = fs.readdirSync('phases')
  .filter(name => fs.lstatSync(path.join('phases', name)).isDirectory());

console.log('All phases:', phases);
```

## Summary

- The **20 phases of the AI engineering curriculum** provide a sequential learning path from basic tooling (Phase 0) to complex capstone projects (Phase 19).
- Each phase resides in a numbered folder under `phases/` (e.g., `phases/10-llms-from-scratch`) and contains standardized lesson subdirectories with `code/`, `docs/`, `outputs/`, and [`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json).
- The curriculum structure is defined in the repository's [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) and enforced through templates ([`LESSON_TEMPLATE.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/LESSON_TEMPLATE.md)) and automation scripts ([`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py)).
- Key documentation includes [`ROADMAP.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/ROADMAP.md) for tracking completion status and [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) for definitional references.
- The repository supports programmatic exploration, allowing learners to list lessons, inspect documentation, and execute code implementations via Python, Bash, or Node.js.

## Frequently Asked Questions

### What is the recommended order for completing the 20 phases?

The curriculum is designed for **strict sequential progression** from Phase 0 through Phase 19. Each phase builds upon competencies developed in earlier phases. For example, Phase 10 (LLMs from Scratch) requires an understanding of transformers from Phase 7 and of deep learning fundamentals from Phase 3. The [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) presents these dependencies as a directed graph in a Mermaid diagram.
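A dependency graph like this can be turned into a study order with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+). Only the two prerequisites of Phase 10 are taken from the text above; the other edges and the name `study_order` are assumptions added so the example has something to sort, and they do not reproduce the README's actual diagram.

```python
from graphlib import TopologicalSorter

# Map each phase number to the set of phases it depends on.
prerequisites = {
    3: {2},       # ASSUMED edge, for illustration only
    7: {3},       # ASSUMED edge, for illustration only
    10: {3, 7},   # stated in the text above
}


def study_order(deps):
    """Return one valid order in which every prerequisite comes first."""
    return list(TopologicalSorter(deps).static_order())


order = study_order(prerequisites)
print(order)
```

For a curriculum that is numbered in dependency order, the result should agree with the numeric ordering; a mismatch would signal an edge that points from a later phase back to an earlier one.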

### How are lessons structured within each phase?

Every lesson follows a **four-component template**: implementation code in a `code/` directory, instructional content in [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md), output artifacts in `outputs/`, and assessment questions in [`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json). This structure is standardized across all phases, from `phases/00-setup-and-tooling` to `phases/19-capstone-projects`, ensuring learners know exactly where to find practical implementations versus theoretical explanations.

### What tools are available for tracking curriculum completion?

The repository includes **automated validation scripts** located in the `scripts/` directory, particularly [`audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py), which validates lesson completeness against the required structure. Additionally, [`ROADMAP.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/ROADMAP.md) provides a human-readable tracking document that records completion status and upcoming work for each phase, while [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) generates the public website that renders the curriculum state.

### Does the curriculum cover production deployment and MLOps?

Yes, **Phase 17 (Infrastructure & Production)** specifically addresses deployment strategies, observability, logging, scaling, and CI/CD for AI services. Earlier phases (particularly Phase 13: Tools & Protocols) introduce the Model Context Protocol (MCP) and tool interfaces that are essential for production agent systems, while Phase 18 covers the ethical and alignment considerations necessary for safe deployment.