# How Phases and Prerequisites in AI Engineering from Scratch Form a Validated Learning DAG

> How phases and prerequisites in AI Engineering from Scratch form a learning DAG, and how lesson dependencies are declared and validated.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: internals
- Published: 2026-06-07

---

**Phases and prerequisites in AI Engineering from Scratch form a directed acyclic graph (DAG) encoded in lesson front matter, where each [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) file explicitly lists earlier phases or lessons that must be completed before advancement.**

The `rohitg00/ai-engineering-from-scratch` curriculum is organized into numbered phases that group related lessons. Each lesson stores its documentation and metadata under `phases/<phase-slug>/<lesson-slug>/docs/en.md`, where a `**Prerequisites:**` declaration names the earlier concepts a learner must master first. Because every dependency is written down in the same place and format, it is explicit, parseable by scripts, and can be validated automatically.

## How Lesson Front Matter Declares Phase Dependencies

Inside every lesson directory, the [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) file contains a front matter block that specifies the lesson's **Prerequisites**. These are not loose recommendations; they are strict, declarative links expressed as plain text such as `Phase 1 (Linear Algebra Intuition)` or `Phase 19 Track A lessons 25-29`.

The repository stores lessons at predictable paths like:

```text
phases/<phase-slug>/<lesson-slug>/docs/en.md
```

For example, the *Perceptron* lesson in Phase 3 declares a prerequisite from Phase 1. As written in [`phases/03-deep-learning-core/01-the-perceptron/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/03-deep-learning-core/01-the-perceptron/docs/en.md), its front matter states:

```text
**Prerequisites:** Phase 1 (Linear Algebra Intuition)
```

Similarly, advanced capstone lessons reference multiple prerequisite groups. In [`phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md), the front matter lists both cross-phase safety work and intra-phase track lessons:

```text
**Prerequisites:** Phase 18 safety lessons, Phase 19 Track A lessons 25-29
```
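Since these declarations are plain text, any tooling has to recover structure from them. The following is a minimal sketch of one way to do that, assuming prerequisite groups are comma-separated and follow the `Phase <n> ... lessons <a>-<b>` wording seen in the two examples above. The function name `parse_prerequisites` and both regular expressions are hypothetical and are not taken from the repository.

```python
import re

# Assumed wording: "Phase <n>", optionally followed by "lessons <a>-<b>".
PHASE_RE = re.compile(r"Phase\s+(\d+)")
RANGE_RE = re.compile(r"lessons?\s+(\d+)(?:\s*-\s*(\d+))?")


def parse_prerequisites(line):
    """Split a '**Prerequisites:** ...' line into structured entries."""
    marker = "**Prerequisites:**"
    if marker not in line:
        return []
    body = line.split(marker, 1)[1].strip()
    entries = []
    for group in body.split(","):
        group = group.strip()
        if not group:
            continue
        phase = PHASE_RE.search(group)
        span = RANGE_RE.search(group)
        entries.append({
            "text": group,
            "phase": int(phase.group(1)) if phase else None,
            "first_lesson": int(span.group(1)) if span else None,
            # A single lesson ("lesson 7") is treated as the range 7-7.
            "last_lesson": int(span.group(2) or span.group(1)) if span else None,
        })
    return entries


for entry in parse_prerequisites(
    "**Prerequisites:** Phase 18 safety lessons, Phase 19 Track A lessons 25-29"
):
    print(entry)
```

Entries that name no phase keep `phase` as `None`, so free-form prerequisites survive parsing instead of being dropped. A prerequisite that itself contains a comma would be split incorrectly by this sketch.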

## Cross-Phase vs. Intra-Phase Ordering

The prerequisite system creates two distinct types of dependencies that together define a complete learning path.

**Cross-phase ordering** allows a lesson to depend on concepts from any earlier phase. A Phase 3 lesson can directly require a Phase 1 intuition lesson, ensuring foundational math is in place before deep learning mechanics are introduced.

**Intra-phase ordering** keeps later lessons within the same phase dependent on earlier ones. In Phase 19, downstream capstone lessons repeatedly reference `Phase 19 Track A lessons 20-29` as prerequisites. This forces mastery of tokenizers, transformer blocks, and other foundational components before attempting integrated end-to-end projects.

**Multiple prerequisites** are supported per lesson. Some entries list several distinct groups, such as safety curricula and track-specific lessons, so that one lesson can depend on many earlier ones.
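The distinction between the two kinds of ordering can be made concrete by comparing the phase number in a lesson's path with the phase number named in each prerequisite. This is an illustrative sketch only: the helpers `phase_of_lesson` and `classify` are hypothetical, and the code assumes that the numeric prefix of the phase slug (for example `19-capstone-projects`) is the phase number.

```python
import re

PHASE_RE = re.compile(r"Phase\s+(\d+)")


def phase_of_lesson(lesson_path):
    """Read the phase number from a path like 'phases/19-capstone-projects/...'."""
    match = re.search(r"phases/(\d+)-", lesson_path)
    if match is None:
        raise ValueError(f"no numbered phase slug in {lesson_path!r}")
    return int(match.group(1))


def classify(lesson_path, prerequisite_text):
    """Return 'cross-phase', 'intra-phase', or 'unknown' for one prerequisite."""
    match = PHASE_RE.search(prerequisite_text)
    if match is None:
        return "unknown"  # free text that names no phase
    if int(match.group(1)) == phase_of_lesson(lesson_path):
        return "intra-phase"
    return "cross-phase"


gate = "phases/19-capstone-projects/87-end-to-end-safety-gate"
for text in ["Phase 18 safety lessons", "Phase 19 Track A lessons 25-29"]:
    print(text, "->", classify(gate, text))
```

For the safety gate lesson, the first group is classified as cross-phase and the second as intra-phase, matching the description above.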

## Automated Validation of the Prerequisite DAG

Because the curriculum is intended to be linearizable, the repository provides automation that treats prerequisites as a directed acyclic graph rather than plain text. The [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) file documents the high-level curriculum architecture, describing phases, lessons, and the **Prerequisites** convention used throughout the repo.

The [`scripts/scaffold-lesson.sh`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/scaffold-lesson.sh) tooling validates that any new lesson's `**Prerequisites:**` list references only earlier lessons or phases, which prevents contributors from accidentally pointing a lesson at content that comes later in the curriculum. Meanwhile, [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) runs in CI to detect cycles or missing links, enforcing the DAG structure across the entire repository. If an audit finds a cycle or a prerequisite that does not resolve to existing content, the build fails.
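To illustrate what an "earlier content only" rule can look like, here is a small sketch of a backward-reference check. It is not the logic of `scaffold-lesson.sh` or `audit_lessons.py`, whose internals are not reproduced here. The function `forward_references` and its regular expressions are hypothetical, and the code assumes that phase and lesson slugs start with their number, as in `19-capstone-projects/87-end-to-end-safety-gate`.

```python
import re

PHASE_RE = re.compile(r"Phase\s+(\d+)")
RANGE_RE = re.compile(r"lessons?\s+(\d+)(?:\s*-\s*(\d+))?")
LESSON_PATH_RE = re.compile(r"phases/(\d+)-[^/]+/(\d+)-")


def forward_references(lesson_path, prerequisites):
    """Return the prerequisite strings that do not point strictly backwards."""
    match = LESSON_PATH_RE.search(lesson_path)
    if match is None:
        raise ValueError(f"unexpected lesson path: {lesson_path!r}")
    own_phase, own_lesson = int(match.group(1)), int(match.group(2))

    offenders = []
    for text in prerequisites:
        phase = PHASE_RE.search(text)
        if phase is None:
            continue  # free-text prerequisite; nothing to compare
        target_phase = int(phase.group(1))
        if target_phase > own_phase:
            offenders.append(text)
        elif target_phase == own_phase:
            # Same phase: the last referenced lesson must precede this one.
            # A same-phase reference without a lesson range is flagged too,
            # because it would include the lesson itself.
            span = RANGE_RE.search(text)
            last = int(span.group(2) or span.group(1)) if span else None
            if last is None or last >= own_lesson:
                offenders.append(text)
    return offenders


gate = "phases/19-capstone-projects/87-end-to-end-safety-gate"
print(forward_references(gate, ["Phase 18 safety lessons",
                                "Phase 19 Track A lessons 25-29"]))
```

A check of this shape is sufficient for acyclicity under its assumptions: if every edge points to a strictly earlier (phase, lesson) position, a cycle would need at least one edge that does not, so none can exist.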

## Querying the Phase-to-Prerequisite Graph Programmatically

You can extract the full dependency graph directly from the lesson files. Below is a Python script that scans all `phases/**/docs/en.md` paths, parses the `**Prerequisites:**` line with a regular expression, and builds a serializable mapping of lesson identifiers to their prerequisite strings.

```python
import json
import pathlib
import re

# Path to a local clone of the repository; adjust to your checkout.
ROOT = pathlib.Path("ai-engineering-from-scratch")
DOCS = sorted(ROOT.glob("phases/**/docs/en.md"))

prereq_pat = re.compile(r"\*\*Prerequisites:\*\*\s*(.+)")

graph = {}  # lesson → list of prerequisite strings

for doc in DOCS:
    # Lesson directory relative to the repo root,
    # e.g. phases/03-deep-learning-core/01-the-perceptron
    lesson_id = doc.parent.parent.relative_to(ROOT).as_posix()

    with doc.open(encoding="utf-8") as f:
        for line in f:
            m = prereq_pat.search(line)
            if m:
                # Naive comma split: it would also split a prerequisite
                # that contains a comma itself.
                prereqs = [p.strip() for p in m.group(1).split(",") if p.strip()]
                graph[lesson_id] = prereqs
                break  # only the first Prerequisites line is used

# Pretty-print a few entries.
for k, v in list(graph.items())[:5]:
    print(f"{k} → {v}")

# Optionally, dump the whole graph as JSON for downstream tooling.
with open("prereq_graph.json", "w", encoding="utf-8") as out:
    json.dump(graph, out, indent=2)
```

Running this against the repository produces entries such as:

```text
phases/03-deep-learning-core/01-the-perceptron → ['Phase 1 (Linear Algebra Intuition)']
phases/19-capstone-projects/87-end-to-end-safety-gate → ['Phase 18 safety lessons', 'Phase 19 Track A lessons 25-29']
```

This output can be fed into graph visualization tools or used to generate a topological ordering of lessons for custom learning dashboards.
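As a rough illustration of the topological-ordering idea, the sketch below collapses a lesson-to-prerequisites mapping to the phase level and orders the phases with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The helper `phase_order` and the `sample` mapping are hypothetical. The sample reuses only the two entries shown above, and the code assumes each phase slug starts with its phase number.

```python
import re
from graphlib import TopologicalSorter

PHASE_RE = re.compile(r"Phase\s+(\d+)")


def phase_order(graph):
    """Order phases so that each appears after the phases it depends on."""
    sorter = TopologicalSorter()
    for lesson_id, prerequisites in graph.items():
        own = re.search(r"phases/(\d+)-", lesson_id)
        if own is None:
            continue  # path does not follow the assumed layout
        own_phase = int(own.group(1))
        sorter.add(own_phase)
        for text in prerequisites:
            target = PHASE_RE.search(text)
            # Intra-phase references add no edge at the phase level.
            if target and int(target.group(1)) != own_phase:
                sorter.add(own_phase, int(target.group(1)))
    # static_order() raises graphlib.CycleError if phases depend on each other in a loop.
    return list(sorter.static_order())


sample = {
    "phases/03-deep-learning-core/01-the-perceptron": [
        "Phase 1 (Linear Algebra Intuition)",
    ],
    "phases/19-capstone-projects/87-end-to-end-safety-gate": [
        "Phase 18 safety lessons",
        "Phase 19 Track A lessons 25-29",
    ],
}
print(phase_order(sample))
```

The result is one valid ordering among possibly several; the only guarantee is that every phase comes after the phases it depends on. Ordering individual lessons would first require resolving each free-text prerequisite to concrete lesson identifiers.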

## Summary

- **Phases and prerequisites in AI Engineering from Scratch** create an explicit DAG encoded in lesson front matter.
- Each lesson under `phases/<phase-slug>/<lesson-slug>/docs/en.md` declares `**Prerequisites:**` as plain text links to earlier content.
- **Cross-phase** and **intra-phase** dependencies are both supported, allowing flexible but strict ordering.
- The [`scripts/scaffold-lesson.sh`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/scaffold-lesson.sh) generator and the [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) CI audit validate references and prevent cycles.
- A simple Python scanner can parse every [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) file to build a machine-readable prerequisite graph.

## Frequently Asked Questions

### What is the relationship between phases and prerequisites in AI Engineering from Scratch?

Phases are the high-level organizational buckets, while prerequisites are the declarative metadata inside each lesson that links it to specific earlier phases or lessons. Together they form a directed acyclic graph that defines a strict, linearizable learning path.

### Where are prerequisites declared in the repository?

Prerequisites are declared in the front matter of each lesson's documentation file, located at `phases/<phase-slug>/<lesson-slug>/docs/en.md`. The conventional format is `**Prerequisites:** [list of prior lessons or phases]`.

### How does the repository prevent circular dependencies between lessons?

The [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) CI script audits every lesson's prerequisite list to detect cycles or missing links. Additionally, [`scripts/scaffold-lesson.sh`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/scaffold-lesson.sh) validates that new lessons only reference earlier content, ensuring the curriculum remains a true DAG.

### Can a lesson depend on multiple prerequisites from different phases?

Yes. Lessons can list several prerequisite groups separated by commas. For example, the end-to-end safety gate lesson in Phase 19 requires both `Phase 18 safety lessons` and `Phase 19 Track A lessons 25-29`.