# Distinctions Between Learn, Build, and Reference Lesson Types in AI Engineering From Scratch

> Understand the differences between Learn, Build, and Reference lesson types in AI engineering from scratch. Discover how each type contributes to foundational AI knowledge and practical application.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: deep-dive
- Published: 2026-06-05

---

**In the `rohitg00/ai-engineering-from-scratch` curriculum, Learn lessons deliver theory without executable code, Build lessons require a `code/` directory with unit tests and an `outputs/` artifact, and Reference lessons provide reusable tables or figures that other lessons can cite.**

The repository organizes its self-contained lessons into three mutually exclusive categories, declared in each lesson's required Markdown front-matter. Knowing a lesson's type tells a learner whether to expect a conceptual reading, a hands-on coding exercise, or a canonical data table before opening any file.

## How Lesson Types Are Enforced in the Curriculum

Every lesson begins with a Markdown front-matter block governed by the **Lesson contract** in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) (lines 72–76). One required field is:

```markdown
**Type:** <Learn | Build | Reference>
```

The CI job [`audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/audit_lessons.py) validates this field across the repository. Any structural mismatch, such as a **Learn** lesson that contains a `code/` folder, is flagged as a contract violation and must be corrected.
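The field check described above can be sketched in a few lines of Python. This is an illustration only, not the logic of the repository's `audit_lessons.py`: the function name `parse_lesson_type`, the regular expression, and the "exactly one `**Type:**` line" rule are assumptions made for the example.

```python
import re

# Hypothetical helper for illustration; not taken from audit_lessons.py.
VALID_TYPES = {"Learn", "Build", "Reference"}

# Matches a line such as "**Type:** Build" (trailing spaces allowed).
TYPE_LINE = re.compile(r"^\*\*Type:\*\*[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def parse_lesson_type(markdown_text: str) -> str:
    """Return the declared lesson type, or raise ValueError if it is missing or invalid."""
    matches = TYPE_LINE.findall(markdown_text)
    if len(matches) != 1:
        raise ValueError(f"expected exactly one **Type:** line, found {len(matches)}")
    declared = matches[0]
    if declared not in VALID_TYPES:
        raise ValueError(f"unknown lesson type: {declared!r}")
    return declared


if __name__ == "__main__":
    sample = "# Some Lesson\n\n**Type:** Build  \n**Languages:** Python  \n"
    print(parse_lesson_type(sample))
```

Raising on anything other than one valid value mirrors the "exactly one of three types" rule; a real audit would more likely collect violations and report them together.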

## Learn Lessons: Theory Without Code

**Learn** lessons introduce background knowledge, ethical frameworks, and mathematical foundations. As implemented in `rohitg00/ai-engineering-from-scratch`, these lessons rely on long-form narrative text and conceptual assessment rather than implementation.

Key characteristics include:

- **No `code/` directory** or runnable implementation.
- A [`quiz.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/quiz.json) file that checks understanding of the concepts presented.
- Reference tables or figures may appear for illustration, but **no executable code** is included.

For example, [`phases/18-ethics-safety-alignment/30-dual-use-risk-cyber-bio-chem-nuclear/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/18-ethics-safety-alignment/30-dual-use-risk-cyber-bio-chem-nuclear/docs/en.md) sets `**Type:** Learn` and provides a pure conceptual overview of dual-use threats:

```markdown
# Dual-Use Risk: Cyber-Bio-Chem-Nuclear

> Understanding the spectrum of dual-use threats.

**Type:** Learn
**Languages:** None
**Prerequisites:** None
**Time:** ~15 min

## Learning Objectives

- Define dual-use risk categories.
- Identify real-world examples.
- Explain mitigation strategies.
```
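A structural check for the Learn rules could look roughly like the sketch below. It assumes that `quiz.json` sits at the top of the lesson directory, next to `docs/`; the article does not state its location, and the function name `learn_lesson_violations` is hypothetical.

```python
from pathlib import Path


def learn_lesson_violations(lesson_dir: Path) -> list[str]:
    """List contract problems for a directory declared as a Learn lesson.

    Assumption: quiz.json lives directly inside the lesson directory.
    """
    problems = []
    if (lesson_dir / "code").is_dir():
        problems.append("Learn lesson must not contain a code/ directory")
    if not (lesson_dir / "quiz.json").is_file():
        problems.append("Learn lesson is expected to ship a quiz.json")
    return problems
```

An empty list means the directory satisfies both assumed rules.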

## Build Lessons: Implementation With Tests and Outputs

**Build** lessons are where learners apply theory by writing code. Each Build lesson must contain a `code/` folder with a `main.<lang>` implementation and a `tests/` suite containing at least five unit tests. The generated artifact (a skill, prompt, agent, or MCP server) is stored under `outputs/`.

Key characteristics include:

- A **`code/` folder** with the primary implementation.
- A **`tests/` directory** with at least five unit tests.
- An **`outputs/` directory** for the reusable artifact produced by the lesson.
- An explanatory [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) with the usual front-matter still accompanies the code, but the emphasis is on hands-on construction.

For example, [`phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md) declares `**Type:** Build` and expects the learner to implement a safety gate. The companion `code/` directory holds the actual implementation and its test suite:

```markdown
# End-to-End Safety Gate

> Assemble a safety-gate that blocks harmful completions.

**Type:** Build
**Languages:** Python
**Prerequisites:** 86-constitutional-rules-engine
**Time:** ~45 min

## Learning Objectives

- Implement a rule-engine API.
- Write unit tests for safety checks.
- Deploy the gate as a reusable skill.
```

## Reference Lessons: Reusable Tables and Figures

**Reference** lessons supply canonical material that multiple other lessons can consult. These typically contain tables, charts, or frozen data sets, such as model-size tables or hyper-parameter defaults, and do not require a `code/` directory.

Key characteristics include:

- **Tables, charts, or canonical data** that can be cited by Learn or Build lessons.
- Figures may be stored in `site/assets/figures/`.
- Consumed by the site generator ([`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js)) to populate lookup tables and [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js).

For example, [`phases/10-llms-from-scratch/08-dpo/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/08-dpo/docs/en.md) sets `**Type:** Reference` and provides a frozen checkpoint table for DPO training:

```markdown
# DPO Reference Model Table

> Canonical frozen SFT checkpoints used for DPO.

**Type:** Reference
**Languages:** None
**Prerequisites:** None
**Time:** ~5 min

| Model | Size | Reference Checkpoint |
|-------|------|----------------------|
| Llama-3.2-1B-Instruct-spec | 1 B | `meta-llama/Llama-3.2-1B-Instruct-spec` |
| Qwen-3-0.6B-spec | 0.6 B | `Qwen/Qwen3-0.6B-spec` |
```
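To illustrate why a Reference table is easy for tooling to consume, here is a minimal Python sketch that turns a pipe table like the one above into row dictionaries. The function `parse_pipe_table` is hypothetical and handles only the simple case: one table per document and no `|` characters inside cells.

```python
def parse_pipe_table(markdown_text: str) -> list[dict[str, str]]:
    """Parse a document containing a single Markdown pipe table into row dicts."""
    rows = [
        line.strip()
        for line in markdown_text.splitlines()
        if line.strip().startswith("|")
    ]
    if len(rows) < 3:  # header, separator, and at least one data row
        return []

    def split_cells(row: str) -> list[str]:
        return [cell.strip() for cell in row.strip("|").split("|")]

    header = split_cells(rows[0])
    # rows[1] is the |---|---| separator and carries no data.
    return [dict(zip(header, split_cells(row))) for row in rows[2:]]


SAMPLE = """\
| Model | Size | Reference Checkpoint |
|-------|------|----------------------|
| Llama-3.2-1B-Instruct-spec | 1 B | `meta-llama/Llama-3.2-1B-Instruct-spec` |
| Qwen-3-0.6B-spec | 0.6 B | `Qwen/Qwen3-0.6B-spec` |
"""

if __name__ == "__main__":
    for record in parse_pipe_table(SAMPLE):
        print(record["Model"], "->", record["Reference Checkpoint"])
```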

## How the Three Types Fit Into the Curriculum Pipeline

The repository uses these categories to create a structured learning flow:

1. **Learning pipeline** – Students start with **Learn** lessons to acquire theoretical foundations in ethics, mathematics, or model bias.
2. **Construction pipeline** – Once the concept is established, a **Build** lesson guides the learner to implement the algorithm from scratch, such as writing a transformer block or a training loop.
3. **Reference assets** – **Reference** lessons act as canonical sources of data that other lessons cite. The site build script [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) reads links of the form `![FIG_NNN](../../site/assets/figures/…)` and incorporates them into [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js).
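The figure-link pattern in the last step can be matched with a regular expression. The real generator is JavaScript (`site/build.js`); the Python sketch below is only an illustration of the pattern. It assumes `NNN` is a run of digits and that the path contains no spaces, and the file name `example.png` is made up.

```python
import re

# Assumption: FIG_ is followed by digits; the path has no spaces or ')'.
FIGURE_LINK = re.compile(
    r"!\[(FIG_\d+)\]\((\.\./\.\./site/assets/figures/[^)\s]+)\)"
)


def find_figure_links(markdown_text: str) -> list[tuple[str, str]]:
    """Return (figure_id, relative_path) pairs for every matching image link."""
    return FIGURE_LINK.findall(markdown_text)


if __name__ == "__main__":
    text = (
        "Intro paragraph.\n"
        "![FIG_001](../../site/assets/figures/example.png)\n"  # hypothetical file
        "![logo](./logo.png)\n"
    )
    print(find_figure_links(text))
```

Only the first image matches; the second has neither the `FIG_` label nor the figures path.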

Because the **Lesson contract** in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) (lines 72–76) strictly defines these boundaries, the automated [`audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/audit_lessons.py) job can enforce them and keep the entire curriculum machine-parsable.
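As a rough picture of what "machine-parsable" buys, the sketch below walks a checkout and tallies declared lesson types. It assumes every lesson document lives at `phases/<phase>/<lesson>/docs/en.md`, which matches the example paths in this article but is not confirmed as a universal rule; `tally_lesson_types` is a hypothetical name.

```python
import re
from collections import Counter
from pathlib import Path

TYPE_LINE = re.compile(r"^\*\*Type:\*\*[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def tally_lesson_types(repo_root: Path) -> Counter:
    """Count lessons per declared type; lessons without a type count as MISSING."""
    counts: Counter = Counter()
    # Assumed layout: phases/<phase>/<lesson>/docs/en.md
    for doc in sorted(repo_root.glob("phases/*/*/docs/en.md")):
        match = TYPE_LINE.search(doc.read_text(encoding="utf-8"))
        counts[match.group(1) if match else "MISSING"] += 1
    return counts


if __name__ == "__main__":
    # Run from the root of a local clone.
    print(tally_lesson_types(Path(".")))
```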

## Summary

- **Learn** lessons teach concepts through narrative text and quizzes, and they must not contain a `code/` directory.
- **Build** lessons require hands-on coding, a `code/` folder with `main.<lang>`, a `tests/` suite of at least five unit tests, and an `outputs/` artifact.
- **Reference** lessons provide reusable tables or figures for citation across the curriculum, often feeding the site generator at [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js).
- The **Lesson contract** in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) and the [`audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/audit_lessons.py) CI job enforce these distinctions across all phases.

## Frequently Asked Questions

### Can a Learn lesson contain executable code?

No. According to the Lesson contract in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md), a **Learn** lesson must not include a `code/` directory or runnable implementation. If [`audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/audit_lessons.py) finds a `code/` directory inside a Learn lesson, it flags a contract violation, because Learn lessons are meant to be strictly conceptual.

### What are the testing requirements for a Build lesson?

A **Build** lesson must include a `code/` folder containing a `main.<lang>` implementation and a `tests/` directory with at least five unit tests. The lesson should also produce a reusable artifact stored under `outputs/`.

### How are Reference lessons consumed by the site generator?

The site build script [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) reads figures and tables from Reference lessons, often linked via patterns like `![FIG_NNN](../../site/assets/figures/…)`, and incorporates them into [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js). This allows other lessons to cite canonical data without duplicating it.

### Where are the three lesson types formally defined?

The three types are formally defined in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) at lines 72–76 as part of the Lesson contract. The curriculum requires every lesson’s front-matter to declare exactly one of the three values: `Learn`, `Build`, or `Reference`.