How Phases and Prerequisites in AI Engineering from Scratch Form a Validated Learning DAG
Phases and prerequisites in AI Engineering from Scratch form a directed acyclic graph (DAG) encoded in lesson front matter, where each docs/en.md file explicitly lists earlier phases or lessons that must be completed before advancement.
According to its source, the rohitg00/ai-engineering-from-scratch curriculum is organized into numbered phases that group related lessons. Each lesson stores its documentation and metadata under phases/<phase-slug>/<lesson-slug>/docs/en.md, where a **Prerequisites:** declaration names the earlier concepts a learner must master. This metadata-driven design keeps every dependency explicit, machine-readable, and automatically validated.
How Lesson Front Matter Declares Phase Dependencies
Inside every lesson directory, the docs/en.md file contains a front matter block that specifies the lesson's Prerequisites. These are not loose recommendations; they are strict, declarative links expressed as plain text such as Phase 1 (Linear Algebra Intuition) or Phase 19 Track A lessons 25-29.
The repository stores lessons at predictable paths like:
phases/<phase-slug>/<lesson-slug>/docs/en.md
For example, the Perceptron lesson in Phase 3 declares a prerequisite from Phase 1. As written in phases/03-deep-learning-core/01-the-perceptron/docs/en.md, its front matter states:
**Prerequisites:** Phase 1 (Linear Algebra Intuition)
Similarly, advanced capstone lessons reference multiple prerequisite groups. In phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md, the front matter lists both cross-phase safety work and intra-phase track lessons:
**Prerequisites:** Phase 18 safety lessons, Phase 19 Track A lessons 25-29
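To make the declaration format concrete, here is a minimal sketch that turns one **Prerequisites:** line into structured references. The PrereqRef class, the parse_prerequisites function, and both regular expressions are hypothetical names introduced for illustration; they are not part of the repository, and the sketch assumes that comma-separated groups each mention a "Phase N" and optionally a "lessons A-B" range.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns, inferred from the two example lines above.
PHASE = re.compile(r"Phase\s+(\d+)")
LESSON_RANGE = re.compile(r"lessons?\s+(\d+)(?:\s*-\s*(\d+))?")


@dataclass
class PrereqRef:
    phase: int                 # phase number the prerequisite points at
    lessons: Optional[range]   # lesson numbers, if the text names a range
    raw: str                   # original text of this prerequisite group


def parse_prerequisites(line: str) -> list:
    """Split a '**Prerequisites:** ...' line into PrereqRef objects."""
    _, _, body = line.partition("**Prerequisites:**")
    refs = []
    for part in body.split(","):
        part = part.strip()
        phase = PHASE.search(part)
        if phase is None:
            continue  # skip groups that name no phase
        span = LESSON_RANGE.search(part)
        lessons = None
        if span:
            first = int(span.group(1))
            last = int(span.group(2) or first)
            lessons = range(first, last + 1)
        refs.append(PrereqRef(int(phase.group(1)), lessons, part))
    return refs


refs = parse_prerequisites(
    "**Prerequisites:** Phase 18 safety lessons, Phase 19 Track A lessons 25-29"
)
for ref in refs:
    print(ref)
```

A structured form like this is only as reliable as the free-text conventions it assumes; any prerequisite phrased differently would need its own pattern.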
Cross-Phase vs. Intra-Phase Ordering
The prerequisite system creates two distinct types of dependencies that together define a complete learning path.
Cross-phase ordering allows a lesson to depend on concepts from any earlier phase. A Phase 3 lesson can directly require a Phase 1 intuition lesson, ensuring foundational math is in place before deep learning mechanics are introduced.
Intra-phase ordering keeps later lessons within the same phase dependent on earlier ones. In Phase 19, downstream capstone lessons repeatedly reference Phase 19 Track A lessons 20-29 as prerequisites. This forces mastery of tokenizers, transformer blocks, and other foundational components before attempting integrated end-to-end projects.
Multiple prerequisites are supported per lesson. Some entries list several distinct groups, such as safety curricula and track-specific lessons, to capture many-to-one dependency requirements.
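The distinction between the two dependency types can be sketched in a few lines of Python. The classify_prerequisite function and the PHASE_REF pattern are hypothetical helpers written for this article, assuming that each prerequisite string mentions its target as "Phase N".

```python
import re

# Hypothetical pattern: picks the first "Phase N" mention out of a prerequisite string.
PHASE_REF = re.compile(r"Phase\s+(\d+)")


def classify_prerequisite(lesson_phase: int, prereq: str) -> str:
    """Label one prerequisite string relative to the lesson's own phase."""
    match = PHASE_REF.search(prereq)
    if match is None:
        return "unresolved"         # free text with no phase number
    ref_phase = int(match.group(1))
    if ref_phase == lesson_phase:
        return "intra-phase"
    if ref_phase < lesson_phase:
        return "cross-phase"
    return "forward-reference"      # points at a later phase


# (lesson's phase, prerequisite text) pairs taken from the examples above.
examples = [
    (3, "Phase 1 (Linear Algebra Intuition)"),
    (19, "Phase 18 safety lessons"),
    (19, "Phase 19 Track A lessons 25-29"),
]
labels = [classify_prerequisite(phase, text) for phase, text in examples]
for (phase, text), label in zip(examples, labels):
    print(f"Phase {phase} lesson <- {text!r}: {label}")
```

The extra "forward-reference" label is an addition of this sketch; it marks the case the ordering rules are meant to exclude.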
Automated Validation of the Prerequisite DAG
Because the curriculum is intended to be linearizable, the repository provides automation that treats prerequisites as a directed acyclic graph rather than plain text. The AGENTS.md file documents the high-level curriculum architecture, describing phases, lessons, and the Prerequisites convention used throughout the repo.
The scripts/scaffold-lesson.sh tooling validates that any new lesson's **Prerequisites:** list references only earlier lessons or phases. This prevents contributors from accidentally linking forward in time. Meanwhile, scripts/audit_lessons.py runs in CI to detect cycles or missing links, enforcing the DAG structure across the entire repository. If an audit finds a cycle or an unreachable prerequisite, the build fails.
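The two checks described above, rejecting forward references and detecting cycles, can be illustrated with a short sketch. This is not the code in scripts/scaffold-lesson.sh or scripts/audit_lessons.py; it is an independent illustration that assumes lessons have already been resolved to hypothetical (phase number, lesson number) tuples, and it uses the standard-library graphlib module (Python 3.9+). Apart from lesson 87 and lesson 25 of Phase 19, the lesson numbers in the sample data are made up.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical identifier scheme: (phase number, lesson number).
Lesson = tuple[int, int]


def forward_references(deps: dict[Lesson, set[Lesson]]) -> list[tuple[Lesson, Lesson]]:
    """Return (lesson, prerequisite) pairs where the prerequisite does not come earlier."""
    return sorted(
        (lesson, prereq)
        for lesson, prereqs in deps.items()
        for prereq in prereqs
        if prereq >= lesson
    )


def has_cycle(deps: dict[Lesson, set[Lesson]]) -> bool:
    """True if the lesson -> prerequisites mapping contains a dependency cycle."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError:
        return True
    return False


# Illustrative data only.
valid = {(3, 1): {(1, 1)}, (19, 87): {(18, 1), (19, 25)}}
broken = {(3, 1): {(3, 2)}, (3, 2): {(3, 1)}}

print(forward_references(valid), has_cycle(valid))
print(forward_references(broken), has_cycle(broken))
```

When identifiers are totally ordered, as these tuples are, the forward-reference check alone already rules out cycles; the separate cycle check matters when prerequisites are resolved without such an ordering.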
Querying the Phase-to-Prerequisite Graph Programmatically
You can extract the full dependency graph directly from the lesson files. Below is a Python script that scans all phases/**/docs/en.md paths, parses the **Prerequisites:** line with a regular expression, and builds a serializable mapping of lesson identifiers to their prerequisite strings.
import json
import pathlib
import re

# Path to a local checkout of the repository; adjust to match your environment.
ROOT = pathlib.Path("/cache/repos/github.com/rohitg00/ai-engineering-from-scratch/main")
DOCS = sorted(ROOT.glob("phases/**/docs/en.md"))

# Captures everything after the bold "Prerequisites:" marker on a line.
prereq_pat = re.compile(r"\*\*Prerequisites:\*\*\s*(.+)")

graph = {}  # lesson path -> list of prerequisite strings

for doc in DOCS:
    # Lesson directory relative to ROOT, e.g. phases/03-deep-learning-core/01-the-perceptron
    lesson_id = doc.parent.parent.relative_to(ROOT).as_posix()
    with doc.open(encoding="utf-8") as f:
        for line in f:
            m = prereq_pat.search(line)
            if m:
                # Naive split: assumes commas only separate prerequisite groups.
                graph[lesson_id] = [p.strip() for p in m.group(1).split(",")]
                break  # only the first Prerequisites line is used

# Pretty-print a few entries.
for k, v in list(graph.items())[:5]:
    print(f"{k} → {v}")

# Optionally, dump the whole graph as JSON for downstream tooling.
with open("prereq_graph.json", "w", encoding="utf-8") as out:
    json.dump(graph, out, indent=2)
Running this against the repository produces entries such as:
phases/03-deep-learning-core/01-the-perceptron → ['Phase 1 (Linear Algebra Intuition)']
phases/19-capstone-projects/87-end-to-end-safety-gate → ['Phase 18 safety lessons', 'Phase 19 Track A lessons 25-29']
This output can be fed into graph visualization tools or used to generate a topological ordering of lessons for custom learning dashboards.
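As one possible way to derive an ordering, the sketch below collapses the scanner's output to a phase-level graph and sorts it with the standard-library graphlib.TopologicalSorter (Python 3.9+). The sample graph reuses the two entries shown above; the PHASE_REF and LESSON_PHASE patterns and the phase_deps name are assumptions made for illustration. Same-phase references are dropped, since at phase granularity they would appear as self-loops.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical patterns for reading phase numbers out of the scanner's keys and values.
PHASE_REF = re.compile(r"Phase\s+(\d+)")
LESSON_PHASE = re.compile(r"phases/(\d+)-")

# Same shape as the scanner's output: lesson path -> prerequisite strings.
graph = {
    "phases/03-deep-learning-core/01-the-perceptron": [
        "Phase 1 (Linear Algebra Intuition)",
    ],
    "phases/19-capstone-projects/87-end-to-end-safety-gate": [
        "Phase 18 safety lessons",
        "Phase 19 Track A lessons 25-29",
    ],
}

# Collapse to phase number -> set of prerequisite phase numbers.
phase_deps = {}
for lesson_id, prereqs in graph.items():
    own = int(LESSON_PHASE.match(lesson_id).group(1))
    deps = phase_deps.setdefault(own, set())
    for text in prereqs:
        for num in PHASE_REF.findall(text):
            if int(num) != own:  # skip intra-phase references (self-loops)
                deps.add(int(num))

# static_order() yields every phase after all of its prerequisites.
order = list(TopologicalSorter(phase_deps).static_order())
print(order)
```

A lesson-level ordering would additionally require resolving free-text groups such as "Track A lessons 25-29" to concrete lesson paths, which this sketch does not attempt.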
Summary
- Phases and prerequisites in AI Engineering from Scratch create an explicit DAG encoded in lesson front matter.
- Each lesson under phases/<phase-slug>/<lesson-slug>/docs/en.md declares **Prerequisites:** as plain text links to earlier content.
- Cross-phase and intra-phase dependencies are both supported, allowing flexible but strict ordering.
- The scripts/scaffold-lesson.sh generator and scripts/audit_lessons.py CI pipeline validate references and prevent cycles.
- A simple Python scanner can parse every docs/en.md file to build a machine-readable prerequisite graph.
Frequently Asked Questions
What is the relationship between phases and prerequisites in AI Engineering from Scratch?
Phases are the high-level organizational buckets, while prerequisites are the declarative metadata inside each lesson that links it to specific earlier phases or lessons. Together they form a directed acyclic graph that defines a strict, linearizable learning path.
Where are prerequisites declared in the repository?
Prerequisites are declared in the front matter of each lesson's documentation file, located at phases/<phase-slug>/<lesson-slug>/docs/en.md. The conventional format is **Prerequisites:** [list of prior lessons or phases].
How does the repository prevent circular dependencies between lessons?
The scripts/audit_lessons.py CI script audits every lesson's prerequisite list to detect cycles or missing links. Additionally, scripts/scaffold-lesson.sh validates that new lessons only reference earlier content, ensuring the curriculum remains a true DAG.
Can a lesson depend on multiple prerequisites from different phases?
Yes. Lessons can list several prerequisite groups separated by commas. For example, the end-to-end safety gate lesson in Phase 19 requires both Phase 18 safety lessons and Phase 19 Track A lessons 25-29.