How Phase Directories Are Structured in AI Engineering From Scratch: Naming Patterns Explained
Phase directories use a strict zero-padded two-digit numbering system combined with kebab-case descriptors to organize lessons chronologically and enable automated tooling to locate content without hardcoded paths.
The rohitg00/ai-engineering-from-scratch curriculum organizes educational content into sequential phases using a rigid directory structure. Each phase directory follows a predictable naming pattern that allows scripts and site generators to locate lessons automatically. Understanding these conventions is essential for contributing lessons or building custom tooling around the repository.
Phase Directory Hierarchy
The repository uses a three-tier hierarchy under the top-level phases/ folder to separate broad topics from specific lessons and their assets.
Top-Level Phase Folders
Each phase resides in a folder following the pattern phases/<NN>-<phase-name>/, where <NN> represents a zero-padded two-digit integer (e.g., 04) and <phase-name> uses kebab-case (lower-case words separated by hyphens). For example, phases/04-computer-vision/ contains all lessons related to computer vision.
Lesson Subdirectories
Inside each phase folder, individual lessons follow the pattern phases/<NN>-<phase-name>/<NN>-<lesson-name>/. The <NN> prefix matches the parent phase number, creating an explicit relationship that allows scripts to sort lessons chronologically. For instance, phases/04-computer-vision/04-image-classification/ indicates the fourth lesson within the fourth phase.
Standard Lesson Sub-folders
Every lesson directory contains three standardized sub-folders:
code/– Source code and executable scriptsdocs/– Documentation files includingen.mdfor the primary lesson contentoutputs/– Generated artifacts and results
Naming Conventions and Patterns
The curriculum enforces strict naming conventions to ensure consistency across the repository.
Zero-padded numbering (<NN>): All directories use two-digit integers (e.g., 01, 12) to maintain alphanumeric sorting order. This prevents 10 from appearing before 2 in file listings.
Kebab-case descriptors: Both phase names and lesson names use lower-case letters with hyphens separating words (e.g., transfer-learning, natural-language-processing). This format ensures URL compatibility and cross-platform path handling.
Repeated phase prefix: Lessons reuse the same <NN> prefix as their parent phase. While this might seem redundant, it explicitly encodes the chronological position and enables glob patterns like 04-* to match both the phase and its lessons.
Programmatically Accessing Phase Content
You can leverage these naming patterns to navigate the curriculum programmatically.
Listing Lessons with Python
Use the zero-padded prefix to locate phase and lesson directories:
from pathlib import Path
def list_lessons(phase_number: int) -> list[Path]:
"""Return absolute paths of all lesson folders for the given phase."""
phase_prefix = f"{phase_number:02d}-"
phase_dir = Path("phases").glob(f"{phase_prefix}*")
# There should be exactly one matching phase folder
phase_path = next(phase_dir, None)
if not phase_path:
raise FileNotFoundError(f"No phase with prefix {phase_prefix}")
lessons = sorted(p for p in phase_path.iterdir()
if p.is_dir() and p.name.startswith(phase_prefix))
return lessons
print([p.name for p in list_lessons(4)])
# → ['04-image-classification', '05-transfer-learning', ...]
Visualizing Structure with Bash
The tree command quickly displays the hierarchy for any phase:
# Show the full directory tree for phase 04
tree -L 2 phases/04-computer-vision
Output (truncated):
phases/04-computer-vision
├── 04-image-classification
│ ├── code
│ ├── docs
│ └── outputs
├── 05-transfer-learning
│ ├── code
│ ├── docs
│ └── outputs
…
Reading Lesson Content with TypeScript
Access documentation files using predictable path construction:
import { readFile } from "fs/promises";
import { join } from "path";
async function readLessonDoc(phase: number, lessonSlug: string) {
const base = `phases/${phase.toString().padStart(2, "0")}-${lessonSlug}`;
const docPath = join(base, "docs", "en.md");
const content = await readFile(docPath, "utf-8");
console.log(content.split("\n")[0]); // first line of the doc
}
readLessonDoc(4, "image-classification");
Configuration and Documentation
The phase directory structure is formally documented in two key files within the repository.
According to lines 89‑97 of README.md, the directory tree follows the strict phases/<NN>-<phase-name>/<NN>-<lesson-name>/ pattern described above. Additionally, lines 13‑20 of AGENTS.md restate the layout under the "Repo layout" section, confirming these conventions for contributors and automated agents. These documentation files serve as the authoritative source for the naming patterns implemented throughout the curriculum.
Summary
- Phase directories follow the pattern
phases/<NN>-<phase-name>/using zero-padded two-digit numbers and kebab-case names. - Lesson directories repeat the phase number prefix:
phases/<NN>-<phase-name>/<NN>-<lesson-name>/. - Standard sub-folders (
code/,docs/,outputs/) exist inside every lesson directory. - Automated tooling relies on these predictable patterns to locate and sort content chronologically.
- Documentation in
README.md(lines 89‑97) andAGENTS.md(lines 13‑20) defines the official structure.
Frequently Asked Questions
What is the exact format for phase directory names?
Phase directories use the format phases/<NN>-<phase-name>/, where <NN> is a zero-padded two-digit integer (e.g., 04) and <phase-name> is a kebab-case descriptor (e.g., computer-vision). This creates paths like phases/04-computer-vision/.
Why do lessons use the same number prefix as their parent phase?
Lessons repeat the phase number prefix (e.g., 04-image-classification inside 04-computer-vision) to explicitly encode the chronological relationship and enable alphanumeric sorting. This pattern allows scripts to identify which lessons belong to which phase without parsing configuration files.
Where is the phase directory structure documented?
The structure is documented in lines 89‑97 of README.md and lines 13‑20 of AGENTS.md. Both files describe the phases/<NN>-<phase-name>/<NN>-<lesson-name>/ hierarchy and the standard sub-folders (code/, docs/, outputs/).
How can I list all lessons in a specific phase programmatically?
Use glob patterns with the zero-padded phase prefix. In Python, search phases/ for directories starting with the phase number (e.g., 04-*), then iterate the matching phase directory for sub-folders starting with the same prefix. This approach works in any language that supports glob matching or directory listing.
Have a question about this repo?
These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:
curl -s "https://instagit.com/install.md" Maintain an open-source project? Get it listed too →