How Phase Directories Are Structured in AI Engineering From Scratch: Naming Patterns Explained

Phase directories use a strict zero-padded two-digit numbering system combined with kebab-case descriptors to organize lessons chronologically and enable automated tooling to locate content without hardcoded paths.

The rohitg00/ai-engineering-from-scratch curriculum organizes educational content into sequential phases using a rigid directory structure. Each phase directory follows a predictable naming pattern that allows scripts and site generators to locate lessons automatically. Understanding these conventions is essential for contributing lessons or building custom tooling around the repository.

Phase Directory Hierarchy

The repository uses a three-tier hierarchy under the top-level phases/ folder to separate broad topics from specific lessons and their assets.

Top-Level Phase Folders

Each phase resides in a folder following the pattern phases/<NN>-<phase-name>/, where <NN> represents a zero-padded two-digit integer (e.g., 04) and <phase-name> uses kebab-case (lower-case words separated by hyphens). For example, phases/04-computer-vision/ contains all lessons related to computer vision.

Lesson Subdirectories

Inside each phase folder, individual lessons follow the pattern phases/<NN>-<phase-name>/<NN>-<lesson-name>/. The <NN> prefix matches the parent phase number, creating an explicit relationship that allows scripts to sort lessons chronologically. For instance, phases/04-computer-vision/04-image-classification/ indicates the fourth lesson within the fourth phase.

Standard Lesson Sub-folders

Every lesson directory contains three standardized sub-folders:

  • code/ – Source code and executable scripts
  • docs/ – Documentation files including en.md for the primary lesson content
  • outputs/ – Generated artifacts and results

Naming Conventions and Patterns

The curriculum enforces strict naming conventions to ensure consistency across the repository.

Zero-padded numbering (<NN>): All directories use two-digit integers (e.g., 01, 12) to maintain alphanumeric sorting order. This prevents 10 from appearing before 2 in file listings.

Kebab-case descriptors: Both phase names and lesson names use lower-case letters with hyphens separating words (e.g., transfer-learning, natural-language-processing). This format ensures URL compatibility and cross-platform path handling.

Repeated phase prefix: Lessons reuse the same <NN> prefix as their parent phase. While this might seem redundant, it explicitly encodes the chronological position and enables glob patterns like 04-* to match both the phase and its lessons.

Programmatically Accessing Phase Content

You can leverage these naming patterns to navigate the curriculum programmatically.

Listing Lessons with Python

Use the zero-padded prefix to locate phase and lesson directories:

from pathlib import Path

def list_lessons(phase_number: int) -> list[Path]:
    """Return absolute paths of all lesson folders for the given phase."""
    phase_prefix = f"{phase_number:02d}-"
    phase_dir = Path("phases").glob(f"{phase_prefix}*")
    # There should be exactly one matching phase folder

    phase_path = next(phase_dir, None)
    if not phase_path:
        raise FileNotFoundError(f"No phase with prefix {phase_prefix}")

    lessons = sorted(p for p in phase_path.iterdir()
                    if p.is_dir() and p.name.startswith(phase_prefix))
    return lessons

print([p.name for p in list_lessons(4)])

# → ['04-image-classification', '05-transfer-learning', ...]

Visualizing Structure with Bash

The tree command quickly displays the hierarchy for any phase:


# Show the full directory tree for phase 04

tree -L 2 phases/04-computer-vision

Output (truncated):


phases/04-computer-vision
├── 04-image-classification
│   ├── code
│   ├── docs
│   └── outputs
├── 05-transfer-learning
│   ├── code
│   ├── docs
│   └── outputs
…

Reading Lesson Content with TypeScript

Access documentation files using predictable path construction:

import { readFile } from "fs/promises";
import { join } from "path";

async function readLessonDoc(phase: number, lessonSlug: string) {
  const base = `phases/${phase.toString().padStart(2, "0")}-${lessonSlug}`;
  const docPath = join(base, "docs", "en.md");
  const content = await readFile(docPath, "utf-8");
  console.log(content.split("\n")[0]); // first line of the doc
}

readLessonDoc(4, "image-classification");

Configuration and Documentation

The phase directory structure is formally documented in two key files within the repository.

According to lines 89‑97 of README.md, the directory tree follows the strict phases/<NN>-<phase-name>/<NN>-<lesson-name>/ pattern described above. Additionally, lines 13‑20 of AGENTS.md restate the layout under the "Repo layout" section, confirming these conventions for contributors and automated agents. These documentation files serve as the authoritative source for the naming patterns implemented throughout the curriculum.

Summary

  • Phase directories follow the pattern phases/<NN>-<phase-name>/ using zero-padded two-digit numbers and kebab-case names.
  • Lesson directories repeat the phase number prefix: phases/<NN>-<phase-name>/<NN>-<lesson-name>/.
  • Standard sub-folders (code/, docs/, outputs/) exist inside every lesson directory.
  • Automated tooling relies on these predictable patterns to locate and sort content chronologically.
  • Documentation in README.md (lines 89‑97) and AGENTS.md (lines 13‑20) defines the official structure.

Frequently Asked Questions

What is the exact format for phase directory names?

Phase directories use the format phases/<NN>-<phase-name>/, where <NN> is a zero-padded two-digit integer (e.g., 04) and <phase-name> is a kebab-case descriptor (e.g., computer-vision). This creates paths like phases/04-computer-vision/.

Why do lessons use the same number prefix as their parent phase?

Lessons repeat the phase number prefix (e.g., 04-image-classification inside 04-computer-vision) to explicitly encode the chronological relationship and enable alphanumeric sorting. This pattern allows scripts to identify which lessons belong to which phase without parsing configuration files.

Where is the phase directory structure documented?

The structure is documented in lines 89‑97 of README.md and lines 13‑20 of AGENTS.md. Both files describe the phases/<NN>-<phase-name>/<NN>-<lesson-name>/ hierarchy and the standard sub-folders (code/, docs/, outputs/).

How can I list all lessons in a specific phase programmatically?

Use glob patterns with the zero-padded phase prefix. In Python, search phases/ for directories starting with the phase number (e.g., 04-*), then iterate the matching phase directory for sub-folders starting with the same prefix. This approach works in any language that supports glob matching or directory listing.

Have a question about this repo?

These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:

Share the following with your agent to get started:
curl -s "https://instagit.com/install.md"

Works with
Claude Codex Cursor VS Code OpenClaw Any MCP Client

Maintain an open-source project? Get it listed too →