How to Contribute to the AI Engineering from Scratch Project: A Complete Guide
Fork the repository, create a feature branch, add your lesson following the standardized directory structure, ensure tests pass, and submit a pull request with a clear description.
The AI Engineering from Scratch curriculum is an open-source educational resource hosted at rohitg00/ai-engineering-from-scratch that teaches artificial intelligence by building every component from the ground up. Whether you want to add a new lesson, fix bugs, improve documentation, or provide translations, the contribution workflow is designed to be straightforward and transparent. This guide covers the steps needed to submit high-quality contributions that fit the project's structure of 20 phases and approximately 500 lessons.
Understanding the Repository Structure
The curriculum organizes content into 20 phases containing approximately 500 lessons. Each lesson lives in its own directory under phases/<phase-slug>/<lesson-slug>/ and follows a strict file structure to ensure the automated site generator can process it correctly.
A standard lesson directory contains:
<lesson-slug>/
├── code/ # Runnable implementations (Python, TypeScript, Rust, Julia)
├── docs/
│ └── en.md # Required lesson narrative with front-matter
├── quiz.json # Six-question assessment (pre, three check, two post)
└── outputs/ # Optional prompts, skills, agents, or MCP servers
When you contribute to the ai-engineering-from-scratch project, you must maintain this structure for the site/build.js script to correctly generate the static curriculum website.
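Before opening a pull request, it can help to confirm that a lesson folder actually has this layout. The following is a minimal sketch, not part of the repository's tooling: `missing_entries` and `REQUIRED_ENTRIES` are hypothetical names, and the required set is inferred from the tree above (with `outputs/` treated as optional).

```python
from pathlib import Path

# Required entries inferred from the directory layout above; outputs/ is optional.
# This tuple is an assumption, not the list enforced by the project's own audit.
REQUIRED_ENTRIES = ("code", "docs/en.md", "quiz.json")


def missing_entries(lesson_dir):
    """Return the required paths that are absent from a lesson directory."""
    root = Path(lesson_dir)
    return [entry for entry in REQUIRED_ENTRIES if not (root / entry).exists()]
```

Calling `missing_entries("phases/<phase-slug>/<lesson-slug>")` with real slugs filled in returns an empty list when nothing is missing.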
Contribution Workflow Step-by-Step
Follow these six steps to ensure your contribution meets the repository's standards:
- Fork the repository at https://github.com/rohitg00/ai-engineering-from-scratch/fork.
- Create a feature branch with a descriptive name:
  git checkout -b add-lesson-phase-3-gradient-descent
- Add or modify content following the lesson structure described in CONTRIBUTING.md.
- Run local tests using the unit test discovery command in the lesson's code folder:
  cd phases/<phase-slug>/<lesson-slug>/code
  python -m unittest discover
- Update top-level index files, including README.md, ROADMAP.md, and the glossary/ directory, so the website generator can include your changes.
- Open a pull request with a clear description; the CI pipeline automatically lints, audits, and rebuilds the site data.
The repository enforces a "one lesson per PR" hard rule to maintain clean git history and simplify code reviews.
Creating a New Lesson
When adding a new lesson to the ai-engineering-from-scratch curriculum, you must provide four essential components:
Lesson Documentation
Create docs/en.md using the front-matter template that specifies the title, hook, type, languages, prerequisites, and learning objectives. This markdown file drives the narrative instruction for the lesson.
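A quick self-check for the front-matter might look like the sketch below. It rests on two assumptions that should be verified against CONTRIBUTING.md: that the front-matter is a block delimited by `---` lines containing `key: value` entries, and that the keys are spelled as in `ASSUMED_FIELDS`. Both the key names and the helper functions are hypothetical.

```python
# Field names are guesses based on the fields listed above;
# the real keys are whatever the template in CONTRIBUTING.md defines.
ASSUMED_FIELDS = ("title", "hook", "type", "languages", "prerequisites", "objectives")


def front_matter_keys(markdown_text):
    """Collect top-level keys from a '---' delimited front-matter block."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Skip indented lines, list items, comments, and lines without a key.
        if line[:1].isspace() or line.startswith(("-", "#")) or ":" not in line:
            continue
        keys.add(line.split(":", 1)[0].strip())
    return set()  # No closing delimiter, so treat the block as absent.


def missing_fields(markdown_text, required=ASSUMED_FIELDS):
    """Return the required fields that the front-matter does not declare."""
    present = front_matter_keys(markdown_text)
    return [field for field in required if field not in present]
```

This deliberately avoids a YAML dependency and only looks at key names, so it cannot tell whether the values themselves are well formed.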
Runnable Code Implementations
Provide at least one working implementation in the code/ directory. For Python lessons, include a requirements.txt file; for TypeScript, include a package.json.
Here is an example implementation for a gradient descent lesson:
# phases/03-deep-learning-core/15-gradient-descent/code/main.py
"""
Lesson 15 – Gradient Descent
Build: implement gradient descent from scratch.
Use: compare against PyTorch's optimizer.
"""
import numpy as np


def gradient_descent(f, grad_f, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x = x - lr * grad_f(x)
    return x


# Example: minimize f(x) = (x-3)^2
f = lambda x: (x - 3) ** 2
grad_f = lambda x: 2 * (x - 3)
optimum = gradient_descent(f, grad_f, x0=0.0)
print(f"Optimum: {optimum:.2f}")
Assessment Quiz
Include a quiz.json file containing exactly six questions: one pre-assessment, three checkpoint questions, and two post-assessment questions.
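The six-question split can be checked mechanically. The sketch below assumes, purely for illustration, that quiz.json is a JSON array of question objects and that each object carries a `stage` field with the value `pre`, `check`, or `post`. The actual schema is not described here, so treat the field name and values as placeholders and adjust `stage_key` to match the real file.

```python
import json
from collections import Counter

# From the guide: one pre-assessment, three checkpoint, two post-assessment questions.
# The stage labels themselves are assumed, not taken from the repository.
EXPECTED_COUNTS = {"pre": 1, "check": 3, "post": 2}


def quiz_problems(quiz_json_text, stage_key="stage"):
    """Return a list of problems; an empty list means the counts match."""
    questions = json.loads(quiz_json_text)
    if not isinstance(questions, list):
        return ["expected a JSON array of questions"]
    counts = Counter(
        q.get(stage_key) if isinstance(q, dict) else None for q in questions
    )
    problems = []
    if len(questions) != 6:
        problems.append(f"expected 6 questions, found {len(questions)}")
    for stage, expected in EXPECTED_COUNTS.items():
        if counts[stage] != expected:
            problems.append(
                f"expected {expected} '{stage}' question(s), found {counts[stage]}"
            )
    return problems
```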
Curriculum Index Entry
Add a row to the lesson table in README.md following this pattern:
| 15 | [Gradient Descent](phases/03-deep-learning-core/15-gradient-descent/) | Build | Python |
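To avoid typos in the table row, the pattern above can be generated and checked with a small helper. This is a sketch built only from the single example row: `format_row`, `parse_row`, and `ROW_PATTERN` are hypothetical, the comma separator for multiple languages is an assumption, and the pattern the CI actually enforces is the one defined in CONTRIBUTING.md.

```python
import re

# Mirrors the example row: | number | [title](phases/.../) | type | languages |
# This regex is a guess from one example, not the project's official pattern.
ROW_PATTERN = re.compile(
    r"^\| (\d+) \| \[([^\]]+)\]\((phases/[^)]+/)\) \| ([^|]+) \| ([^|]+) \|$"
)


def format_row(number, title, phase_slug, lesson_slug, lesson_type, languages):
    """Build a README table row for one lesson."""
    path = f"phases/{phase_slug}/{lesson_slug}/"
    # Joining several languages with ", " is an assumption.
    return f"| {number} | [{title}]({path}) | {lesson_type} | {', '.join(languages)} |"


def parse_row(line):
    """Return (number, title, path, type, languages) or None if the row is malformed."""
    match = ROW_PATTERN.match(line.strip())
    return match.groups() if match else None
```

For instance, `format_row(15, "Gradient Descent", "03-deep-learning-core", "15-gradient-descent", "Build", ["Python"])` reproduces the example row shown above.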
Running Tests and Validation
Before submitting your pull request, validate your changes using the project's automated tooling.
Local Testing
Execute unit tests within the lesson's code directory:
python -m unittest discover
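By default, `unittest` discovery only collects files whose names match `test*.py`, so a lesson needs at least one such file for the command to run anything. Here is a minimal sketch of what a test for the gradient descent example could look like. The file name `test_main.py` and the test cases are illustrative, and the function is copied inline only so the sketch runs on its own.

```python
import unittest


def gradient_descent(f, grad_f, x0, lr=0.1, steps=100):
    # Copied from the lesson example so this sketch is self-contained;
    # inside a lesson folder you would import it from main.py instead.
    x = x0
    for _ in range(steps):
        x = x - lr * grad_f(x)
    return x


class GradientDescentTest(unittest.TestCase):
    def test_converges_to_minimum_of_quadratic(self):
        # f(x) = (x - 3)^2 has its minimum at x = 3.
        optimum = gradient_descent(
            lambda x: (x - 3) ** 2, lambda x: 2 * (x - 3), x0=0.0
        )
        self.assertAlmostEqual(optimum, 3.0, places=6)

    def test_zero_steps_returns_start(self):
        # With no iterations the starting point must come back unchanged.
        result = gradient_descent(None, lambda x: 1.0, x0=5.0, steps=0)
        self.assertEqual(result, 5.0)
```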
CI Pipeline Checks
The continuous integration workflow in scripts/audit_lessons.py validates that:
- Lesson tables and phase headers retain the required markdown patterns defined in CONTRIBUTING.md (lines 13-16)
- All code executes with declared dependencies
- The site data (site/data.js) rebuilds cleanly after your changes
The pipeline automatically runs these checks when you open a pull request, blocking merges that fail validation.
Translating Existing Content
If you prefer to translate rather than create new lessons, add a new markdown file under the lesson's docs/ directory while keeping the original en.md intact.
For example, to add a Chinese translation:
# phases/03-deep-learning-core/15-gradient-descent/docs/zh.md
# 梯度下降
> 用一句话概括核心概念。
## 问题
...
Translation contributions follow the same branch-and-PR workflow as new lessons, but typically require fewer test modifications since the underlying code implementations remain unchanged.
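To find lessons that still need a translation, one option is to scan for `docs/en.md` files that have no sibling file for the target language. The helper below is a hypothetical sketch. It assumes translations are named `<lang>.md` as in the `zh.md` example and that lessons sit exactly two levels below `phases/`.

```python
from pathlib import Path


def lessons_missing_translation(repo_root, lang):
    """List lesson directories that have docs/en.md but no docs/<lang>.md."""
    missing = []
    for english in sorted(Path(repo_root).glob("phases/*/*/docs/en.md")):
        if not (english.parent / f"{lang}.md").exists():
            # english.parent is docs/, so its parent is the lesson directory.
            lesson_dir = english.parent.parent
            missing.append(lesson_dir.relative_to(repo_root).as_posix())
    return missing
```

Running `lessons_missing_translation(".", "zh")` from the repository root would list candidate lessons for a Chinese translation.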
Summary
- Fork and branch before making any changes to the rohitg00/ai-engineering-from-scratch repository.
- Follow the directory structure: phases/<phase-slug>/<lesson-slug>/ with code/, docs/en.md, and quiz.json.
- Commit one lesson per PR to maintain clean git history.
- Update README.md tables to ensure the site generator picks up your content.
- Run python -m unittest discover locally before pushing to catch execution errors early.
- Respect the CI pipeline enforced by scripts/audit_lessons.py, which validates markdown patterns and dependency resolution.
Frequently Asked Questions
What is the "one lesson per PR" rule?
The one lesson per PR rule is a hard requirement in the ai-engineering-from-scratch contribution guidelines. It ensures that code reviews remain focused and manageable, allows specific lessons to be reverted without affecting others, and maintains a clean git history. If you are contributing multiple lessons, you must open separate pull requests for each one.
How do I know if my lesson passes the automated audits?
The scripts/audit_lessons.py file contains the validation logic used by the CI pipeline. It checks for correct file naming conventions, validates that quiz.json exists and contains the required six questions, verifies that lesson tables in README.md match specific markdown patterns (as defined in lines 13-16 of CONTRIBUTING.md), and ensures all declared dependencies in requirements.txt or package.json resolve correctly. You can run this script locally or wait for the automatic CI check when you open your pull request.
Can I contribute lessons in programming languages other than Python?
Yes. The curriculum supports Python, TypeScript, Rust, and Julia. When adding a lesson in an alternative language, include the appropriate dependency file (package.json for TypeScript, Cargo.toml for Rust, etc.) in the code/ directory and specify the language in the README.md table entry. The site/build.js generator will automatically categorize your lesson by the languages you declare.
Where do I find the templates for lesson documentation?
The front-matter template for docs/en.md and detailed file structure rules are documented in CONTRIBUTING.md at the root of the repository. This file specifies the exact metadata fields required (title, hook, type, languages, prerequisites, learning objectives) and provides examples of properly formatted lesson tables and quiz structures.