# How the rohitg00/ai-engineering-from-scratch Repository Is Organized by Lessons

> Explore the rohitg00/ai-engineering-from-scratch repository and learn how its lessons are organized. Discover standardized documentation, code, quizzes, and artifacts within a clear structure.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: getting-started
- Published: 2026-06-10

---

**The rohitg00/ai-engineering-from-scratch repository follows a strict hierarchical structure where 20 curriculum phases contain self-contained lessons, each standardized with documentation, code implementations, quizzes, and output artifacts in predictable folder locations.**

The rohitg00/ai-engineering-from-scratch repository is a curriculum-style codebase designed for systematic AI engineering education. Every lesson follows a standardized folder structure enforced by the **Lesson Contract** documented in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md), making the content discoverable for both human learners and automated tooling. This organizational pattern ensures that all 20 phases of the curriculum maintain consistency from "Math Foundations" through "Agent Engineering."

## Hierarchical Phase-Lesson Structure

The repository organizes content under a `phases/` directory using a strict naming convention that enables programmatic discovery and navigation.

### Top-Level Organization

Each phase resides in `phases/<phase-number>-<phase-slug>/`, and each lesson sits inside its phase in a directory named `<lesson-number>-<lesson-slug>/`. For example, lesson 01 of the `01-math-foundations` phase lives at `phases/01-math-foundations/01-linear-algebra-intuition/`.
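The naming pattern can be split into its parts mechanically. The following is a minimal sketch, not code from the repository: `LessonRef` and `parse_lesson_path` are hypothetical names, and the two-digit prefix and lowercase hyphenated slug are assumptions based on the example path.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

# Assumed segment shape: two digits, a hyphen, then a lowercase hyphenated slug.
SEGMENT = re.compile(r"^(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)$")

class LessonRef(NamedTuple):
    """Hypothetical container for the parts of a lesson path."""
    phase_number: int
    phase_slug: str
    lesson_number: int
    lesson_slug: str

def parse_lesson_path(path: str) -> Optional[LessonRef]:
    """Return the parsed parts, or None if the path does not fit the pattern."""
    parts = PurePosixPath(path).parts  # a trailing slash is ignored
    if len(parts) != 3 or parts[0] != "phases":
        return None
    phase = SEGMENT.match(parts[1])
    lesson = SEGMENT.match(parts[2])
    if phase is None or lesson is None:
        return None
    return LessonRef(int(phase[1]), phase[2], int(lesson[1]), lesson[2])

print(parse_lesson_path("phases/01-math-foundations/01-linear-algebra-intuition/"))
```

Returning `None` for paths that do not fit lets a caller skip unrelated directories without special-casing them.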

The [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) (lines 51-66) serves as the central navigation hub, listing all phases in collapsible markdown tables. Each table entry links directly to the lesson directory and specifies the implementation languages, such as Python, TypeScript, Rust, or Julia.

### The 20-Phase Curriculum

According to the source documentation, the curriculum spans **20 phases** numbered 0 through 19. The [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) presents these phases with markdown tables structured as:

```markdown
| # | Lesson | Type | Lang |
|---|--------|------|------|
| 01 | [Linear Algebra Intuition](phases/01-math-foundations/01-linear-algebra-intuition/) | Learn | Python, Julia |
| 02 | [Vectors, Matrices & Operations](phases/01-math-foundations/02-vectors-matrices-operations/) | Build | Python, Julia |
```

This table format allows the site generator ([`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js)) to parse links and produce [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) for the live web interface.
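To make the link-parsing idea concrete, here is a rough Python sketch of pulling lesson entries out of rows shaped like the sample table. It is not the logic of `site/build.js`, which this article does not reproduce. The `ROW` pattern and `parse_lesson_rows` are hypothetical, and the four-column row shape is assumed from the sample.

```python
import re

# Assumed row shape, taken from the sample table:
# | NN | [Title](path) | Type | Lang, Lang |
ROW = re.compile(
    r"^\|\s*(\d+)\s*\|\s*\[([^\]]+)\]\(([^)]+)\)\s*\|"
    r"\s*([^|]+?)\s*\|\s*([^|]+?)\s*\|\s*$"
)

def parse_lesson_rows(markdown: str) -> list[dict]:
    """Extract lesson entries from markdown table rows; other lines are skipped."""
    rows = []
    for line in markdown.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # header, separator, or a non-table line
        number, title, path, kind, langs = m.groups()
        rows.append({
            "number": number,
            "title": title,
            "path": path,
            "type": kind,
            "languages": [lang.strip() for lang in langs.split(",")],
        })
    return rows

sample = """\
| # | Lesson | Type | Lang |
|---|--------|------|------|
| 01 | [Linear Algebra Intuition](phases/01-math-foundations/01-linear-algebra-intuition/) | Learn | Python, Julia |
| 02 | [Vectors, Matrices & Operations](phases/01-math-foundations/02-vectors-matrices-operations/) | Build | Python, Julia |
"""

for row in parse_lesson_rows(sample):
    print(row["number"], row["title"], row["languages"])
```

A line-anchored regular expression is enough here because each lesson occupies exactly one row. A table with different columns would need a different pattern.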

## The Four Core Components of Every Lesson

Every lesson directory contains four mandatory components that satisfy the Lesson Contract defined in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) (lines 63-84).

### Documentation ([`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md))

The [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) file contains the human-readable narrative, learning objectives, and prerequisites. This file includes front-matter metadata, notably the `**Languages:**` declaration, which must match the implementation files present in the `code/` directory.
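The language-matching rule lends itself to a small check. The sketch below rests on two assumptions: the declaration sits on a single line such as `**Languages:** Python, Julia`, and each language is implemented as `code/main.<ext>` using the extensions this article names. The function names are hypothetical and do not come from the repository's tooling.

```python
import re
from pathlib import Path

# Languages and extensions named in this article.
EXTENSIONS = {"Python": ".py", "TypeScript": ".ts", "Rust": ".rs", "Julia": ".jl"}

# Assumed declaration format: **Languages:** Python, Julia
LANG_LINE = re.compile(r"^\*\*Languages:\*\*\s*(.+)$", re.MULTILINE)

def declared_languages(doc_text: str) -> set[str]:
    """Languages listed on the declaration line, or an empty set if absent."""
    m = LANG_LINE.search(doc_text)
    if m is None:
        return set()
    return {name.strip() for name in m.group(1).split(",") if name.strip()}

def implemented_languages(code_dir: Path) -> set[str]:
    """Languages inferred from the main.<ext> files present in code/."""
    if not code_dir.is_dir():
        return set()
    by_ext = {ext: name for name, ext in EXTENSIONS.items()}
    return {by_ext[p.suffix] for p in code_dir.glob("main.*") if p.suffix in by_ext}

def languages_match(lesson_dir: Path) -> bool:
    """True when the declared languages equal the implemented ones."""
    doc = lesson_dir / "docs" / "en.md"
    if not doc.is_file():
        return False
    declared = declared_languages(doc.read_text(encoding="utf-8"))
    return bool(declared) and declared == implemented_languages(lesson_dir / "code")
```

Comparing sets rather than lists means the order of the declaration does not matter, only which languages appear.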

### Code Implementation (`code/`)

The `code/` directory houses `main.<ext>`, the minimal reference implementation, where the extension matches the lesson's declared languages (`.py`, `.ts`, `.rs`, or `.jl`). The directory also includes a `tests/` subdirectory containing unit tests that verify the implementation.

### Assessment ([`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json))

Each lesson includes a [`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json) file containing exactly **six questions**: one pre-assessment, three checkpoint questions, and two post-assessment questions. The questions appear in this fixed order, with the other fields elided here:

```json
{
  "questions": [
    {"stage": "pre", ...},
    {"stage": "check", ...},
    {"stage": "check", ...},
    {"stage": "check", ...},
    {"stage": "post", ...},
    {"stage": "post", ...}
  ]
}
```

### Generated Artifacts (`outputs/`)

The `outputs/` directory stores reusable AI artifacts produced by completing the lesson. These may include **prompts**, **skills**, **agents**, or **MCP servers** that can be installed or deployed directly into production environments.
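The four components described in this section can be checked for presence with a few filesystem tests. This is a minimal sketch under the layout given above, not the repository's own audit. `missing_components` is a hypothetical helper, and it only checks that files and directories exist, not what they contain.

```python
from pathlib import Path

def missing_components(lesson_dir: Path) -> list[str]:
    """List the expected lesson components that are absent from lesson_dir."""
    code_dir = lesson_dir / "code"
    missing = []
    if not (lesson_dir / "docs" / "en.md").is_file():
        missing.append("docs/en.md")
    # Any main.<ext> file counts; the extension is not validated here.
    if not (code_dir.is_dir() and any(code_dir.glob("main.*"))):
        missing.append("code/main.<ext>")
    if not (code_dir / "tests").is_dir():
        missing.append("code/tests/")
    if not (lesson_dir / "quiz.json").is_file():
        missing.append("quiz.json")
    if not (lesson_dir / "outputs").is_dir():
        missing.append("outputs/")
    return missing

# Example usage (run from the repository root)
lesson = Path("phases/01-math-foundations/01-linear-algebra-intuition")
print(missing_components(lesson) or "all components present")
```

Returning the list of missing items, rather than a single boolean, makes the result usable as an error message.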

## Programmatic Lesson Discovery

Because the repository follows strict naming conventions, you can discover and validate lessons programmatically. The following Python script mirrors the directory conventions described in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) and extracts lesson metadata:

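```python
import pathlib
import re

# Assumes this script sits one directory below the repository root (e.g. scripts/).
repo_root = pathlib.Path(__file__).resolve().parent.parent
# Phase and lesson directories share the <NN>-<slug> naming shape.
lesson_pattern = re.compile(r"^\d{2}-(.+)$")

def read_title(doc_path):
    """Return the first level-1 heading in a lesson doc, or None if unavailable."""
    if not doc_path.is_file():
        return None
    with doc_path.open(encoding="utf-8") as f:
        for line in f:
            if line.startswith("# "):
                return line[2:].strip()
    return None

def discover_lessons():
    lessons = []
    for phase_dir in sorted((repo_root / "phases").iterdir()):
        # Skip files and directories that do not follow the naming convention.
        if not phase_dir.is_dir() or not lesson_pattern.match(phase_dir.name):
            continue
        for lesson_dir in sorted(phase_dir.iterdir()):
            if not lesson_dir.is_dir() or not lesson_pattern.match(lesson_dir.name):
                continue
            title = read_title(lesson_dir / "docs" / "en.md")
            lessons.append({
                "phase": phase_dir.name,
                "lesson": lesson_dir.name,
                "title": title or "Untitled",
            })
    return lessons

if __name__ == "__main__":
    for info in discover_lessons():
        print(f"{info['phase']} → {info['lesson']}: {info['title']}")
```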

To validate that a lesson's quiz follows the required schema, use this validation function:

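```python
import json
from pathlib import Path

# Required stage order: one pre, three check, two post.
EXPECTED_STAGES = ["pre", "check", "check", "check", "post", "post"]

def load_quiz(lesson_path: Path) -> dict:
    with (lesson_path / "quiz.json").open(encoding="utf-8") as f:
        return json.load(f)

def validate_quiz(quiz: dict) -> bool:
    """Check for exactly six questions in the required stage order."""
    questions = quiz.get("questions", [])
    if len(questions) != 6:
        return False
    # .get() keeps a question without a "stage" key from raising KeyError.
    stages = [q.get("stage") for q in questions]
    return stages == EXPECTED_STAGES

# Example usage (run from the repository root)
lesson_dir = Path("phases/01-math-foundations/01-linear-algebra-intuition")
quiz = load_quiz(lesson_dir)
print("Quiz valid?", validate_quiz(quiz))
```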

The CI pipeline defined in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) automatically runs similar audits on every merge, ensuring the README counts stay synchronized and the site data rebuilds.

## Summary

- **Hierarchical Structure**: Lessons live at `phases/<phase>-<slug>/<lesson>-<slug>/` following strict naming conventions.
- **Mandatory Components**: Every lesson must contain [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md), `code/main.<ext>` with tests, [`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json), and an `outputs/` directory.
- **Lesson Contract**: [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) enforces that declared languages match implementation files and that quizzes contain exactly six questions in the prescribed order.
- **Automated Discovery**: The consistent structure enables parsing by [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) and validation through CI workflows.

## Frequently Asked Questions

### What is the naming convention for lesson directories?

Lesson directories use the format `<NN>-<lesson-slug>` where `NN` is a two-digit number. This convention appears in both the folder structure and the markdown tables of [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md), allowing the site generator to parse paths programmatically.

### How does the repository validate that lessons follow the required structure?

The **Lesson Contract** documented in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) (lines 63-84) defines hard rules that are enforced by the CI workflow in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml). This pipeline audits lessons, verifies that front-matter languages match code implementations, and checks that each [`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json) contains exactly six questions in the correct sequence.

### Where are the unit tests located within each lesson?

Unit tests reside in the `code/tests/` subdirectory within each lesson folder. According to the repository conventions, every lesson must include tests that verify the `main.<ext>` implementation, with the file extension matching the languages declared in the [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) front-matter.

### How is the curriculum website generated from the repository structure?

The [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) script parses the markdown links in [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md), [`ROADMAP.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/ROADMAP.md), and [`GLOSSARY.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/GLOSSARY.md) to produce [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js). This data file powers the live website. The process is automated by the [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) GitHub Actions workflow, which rebuilds the site on every merge to the main branch.