How to Contribute to the AI Engineering from Scratch Project: A Complete Guide

Fork the repository, create a feature branch, add your lesson following the standardized directory structure, ensure tests pass, and submit a pull request with a clear description.

The AI Engineering from Scratch curriculum is an open-source educational resource hosted at rohitg00/ai-engineering-from-scratch that teaches artificial intelligence by building every component from the ground up. Whether you want to add a new lesson, fix bugs, improve documentation, or provide translations, the contribution workflow is designed to be straightforward and transparent. This guide covers the steps needed to submit high-quality contributions that fit the project's architecture of 20 phases and approximately 500 lessons.

Understanding the Repository Structure

The curriculum organizes content into 20 phases containing approximately 500 lessons. Each lesson lives in its own directory under phases/<phase-slug>/<lesson-slug>/ and follows a strict file structure to ensure the automated site generator can process it correctly.

A standard lesson directory contains:

<lesson-slug>/
├── code/           # Runnable implementations (Python, TypeScript, Rust, Julia)
├── docs/
│   └── en.md       # Required lesson narrative with front-matter
├── quiz.json       # Six-question assessment (one pre, three check, two post)
└── outputs/        # Optional prompts, skills, agents, or MCP servers

When you contribute to the ai-engineering-from-scratch project, you must maintain this structure for the site/build.js script to correctly generate the static curriculum website.
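Before opening a pull request, the layout above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository's tooling: the helper name `missing_entries` and the `REQUIRED` tuple are hypothetical, and `outputs/` is skipped because it is optional.

```python
from pathlib import Path
import tempfile

# Required entries taken from the layout above; outputs/ is optional and not checked.
REQUIRED = ("code", "docs/en.md", "quiz.json")


def missing_entries(lesson_dir):
    """Return the required paths that are absent from lesson_dir (hypothetical helper)."""
    root = Path(lesson_dir)
    return [rel for rel in REQUIRED if not (root / rel).exists()]


if __name__ == "__main__":
    # Demo against a throwaway directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        lesson = Path(tmp) / "15-gradient-descent"
        (lesson / "code").mkdir(parents=True)
        (lesson / "docs").mkdir()
        (lesson / "docs" / "en.md").write_text("# Gradient Descent\n", encoding="utf-8")
        # quiz.json has not been created, so it is the only entry reported.
        print(missing_entries(lesson))
```

An empty list means every required file and folder is present.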

Contribution Workflow Step-by-Step

Follow these six steps to ensure your contribution meets the repository's standards:

  1. Fork the repository at https://github.com/rohitg00/ai-engineering-from-scratch/fork.

  2. Create a feature branch with a descriptive name:

    git checkout -b add-lesson-phase-3-gradient-descent

  3. Add or modify content following the lesson structure described in CONTRIBUTING.md.

  4. Run local tests using the unit test discovery command in the lesson's code folder:

    cd phases/<phase-slug>/<lesson-slug>/code
    python -m unittest discover

  5. Update top-level index files including README.md, ROADMAP.md, and the glossary/ directory so the website generator can include your changes.

  6. Open a pull request with a clear description; the CI pipeline automatically lints, audits, and rebuilds the site data.

The repository enforces a "one lesson per PR" hard rule to maintain clean git history and simplify code reviews.

Creating a New Lesson

When adding a new lesson to the ai-engineering-from-scratch curriculum, you must provide four essential components:

Lesson Documentation

Create docs/en.md using the front-matter template that specifies the title, hook, type, languages, prerequisites, and learning objectives. This markdown file drives the narrative instruction for the lesson.
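A quick pre-flight check for the front-matter might look like the sketch below. It assumes the front-matter is a `---` delimited block of `key: value` lines, and the key spellings in `REQUIRED_FIELDS` (especially `objectives`) are guesses based on the field list above. CONTRIBUTING.md remains the authority on the real template.

```python
# Field names mirror the list above; the exact key spellings are assumptions.
REQUIRED_FIELDS = ("title", "hook", "type", "languages", "prerequisites", "objectives")


def front_matter_keys(markdown):
    """Collect top-level keys from a '---' delimited front-matter block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Top-level keys start in column 0; indented lines and list items are values.
        if line and not line[0].isspace() and not line.startswith("-") and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return keys


def missing_fields(markdown):
    """Return the assumed required fields that the front-matter does not declare."""
    keys = front_matter_keys(markdown)
    return [field for field in REQUIRED_FIELDS if field not in keys]


# Hypothetical lesson file that forgets to declare its learning objectives.
SAMPLE = """---
title: Gradient Descent
hook: Walk downhill until the slope is flat.
type: Build
languages: [Python]
prerequisites: []
---
# Gradient Descent
"""

print(missing_fields(SAMPLE))
```

The parser is deliberately naive and dependency-free; a real check would more likely use a YAML library.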

Runnable Code Implementations

Provide at least one working implementation in the code/ directory. For Python lessons, include a requirements.txt file; for TypeScript, include a package.json.

Here is an example implementation for a gradient descent lesson:


# phases/03-deep-learning-core/15-gradient-descent/code/main.py
"""
Lesson 15 – Gradient Descent
Build: implement gradient descent from scratch.
Use: compare against PyTorch's optimizer.
"""


def gradient_descent(f, grad_f, x0, lr=0.1, steps=100):
    """Take `steps` fixed-size steps against the gradient, starting from x0.

    `f` is unused by the update rule; only its gradient is needed.
    """
    x = x0
    for _ in range(steps):
        x = x - lr * grad_f(x)  # move downhill along the negative gradient
    return x


# Example: minimize f(x) = (x - 3)^2, whose minimum is at x = 3.
def f(x):
    return (x - 3) ** 2


def grad_f(x):
    return 2 * (x - 3)


optimum = gradient_descent(f, grad_f, x0=0.0)
print(f"Optimum: {optimum:.2f}")
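Since the workflow relies on `python -m unittest discover`, a lesson needs at least one test module for discovery to find. The sketch below shows what such a test could look like for the `gradient_descent` function above. The file name `test_main.py` is an assumption (discovery matches `test*.py` by default), and the function is repeated inline only to keep the sketch self-contained; a real test would import it from `main.py`.

```python
# Hypothetical file: code/test_main.py
import unittest


def gradient_descent(f, grad_f, x0, lr=0.1, steps=100):
    # Copied from the lesson code so this sketch runs on its own.
    x = x0
    for _ in range(steps):
        x = x - lr * grad_f(x)
    return x


def quadratic(x):
    return (x - 3) ** 2


def quadratic_grad(x):
    return 2 * (x - 3)


class GradientDescentTest(unittest.TestCase):
    def test_converges_to_minimum_of_quadratic(self):
        result = gradient_descent(quadratic, quadratic_grad, x0=0.0)
        self.assertAlmostEqual(result, 3.0, places=6)

    def test_zero_steps_returns_start(self):
        result = gradient_descent(quadratic, quadratic_grad, x0=5.0, steps=0)
        self.assertEqual(result, 5.0)

    def test_large_learning_rate_diverges(self):
        # With lr=1.1 the error is multiplied by (1 - 2 * 1.1) = -1.2 each step.
        result = gradient_descent(quadratic, quadratic_grad, x0=0.0, lr=1.1, steps=50)
        self.assertGreater(abs(result - 3.0), 3.0)


if __name__ == "__main__":
    unittest.main()
```

Covering both a converging and a diverging learning rate documents the behavior a learner is expected to observe, not just the happy path.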

Assessment Quiz

Include a quiz.json file containing exactly six questions: one pre-assessment, three checkpoint questions, and two post-assessment questions.
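The one/three/two split can be verified mechanically. The sketch below assumes quiz.json is a JSON array of question objects, each carrying a `stage` key with the value `pre`, `check`, or `post`. That schema is an illustration only; the real field names are defined by the repository, so adjust accordingly.

```python
import json
from collections import Counter

# Counts come from the text above; the "stage" key and its values are assumptions.
EXPECTED = {"pre": 1, "check": 3, "post": 2}


def quiz_problems(raw_json):
    """Return a list of human-readable problems; an empty list means the quiz looks fine."""
    questions = json.loads(raw_json)
    counts = Counter(question.get("stage") for question in questions)
    problems = []
    if len(questions) != sum(EXPECTED.values()):
        problems.append(f"expected 6 questions, found {len(questions)}")
    for stage, want in EXPECTED.items():
        if counts[stage] != want:
            problems.append(f"expected {want} '{stage}' question(s), found {counts[stage]}")
    return problems


# Hypothetical quiz that is missing one post-assessment question.
stages = ["pre", "check", "check", "check", "post"]
sample = json.dumps([{"stage": stage, "question": "..."} for stage in stages])

for problem in quiz_problems(sample):
    print(problem)
```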

Curriculum Index Entry

Add a row to the lesson table in README.md following this pattern:

| 15 | [Gradient Descent](phases/03-deep-learning-core/15-gradient-descent/) | Build | Python |
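Because the audit is strict about table patterns, it can help to generate the row instead of typing it by hand. The sketch below derives a regular expression from the single example row above; it is not the pattern the audit script actually uses, and the comma-separated form for multiple languages is an assumption.

```python
import re

# Derived from the example row above, not taken from scripts/audit_lessons.py.
ROW_PATTERN = re.compile(
    r"^\| (?P<number>\d+) \| \[(?P<title>[^\]]+)\]\((?P<path>phases/[^)]+/)\) "
    r"\| (?P<type>[^|]+) \| (?P<languages>[^|]+) \|$"
)


def format_row(number, title, path, lesson_type, languages):
    """Build a README table row; joining languages with ', ' is an assumption."""
    return f"| {number} | [{title}]({path}) | {lesson_type} | {', '.join(languages)} |"


def parse_row(line):
    """Return the row's fields as a dict, or None if the line does not match."""
    match = ROW_PATTERN.match(line.strip())
    return match.groupdict() if match else None


row = format_row(
    15,
    "Gradient Descent",
    "phases/03-deep-learning-core/15-gradient-descent/",
    "Build",
    ["Python"],
)
print(row)
print(parse_row(row))
```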

Running Tests and Validation

Before submitting your pull request, validate your changes using the project's automated tooling.

Local Testing

Execute unit tests within the lesson's code directory:

python -m unittest discover

CI Pipeline Checks

The continuous integration workflow uses the validation logic in scripts/audit_lessons.py to check that:

  • Lesson tables and phase headers retain the required markdown patterns defined in CONTRIBUTING.md (lines 13-16)
  • All code executes with declared dependencies
  • The site data (site/data.js) rebuilds cleanly after your changes

The pipeline automatically runs these checks when you open a pull request, blocking merges that fail validation.

Translating Existing Content

If you prefer to translate rather than create new lessons, add a new markdown file under the lesson's docs/ directory while keeping the original en.md intact.

For example, to add a Chinese translation:


# phases/03-deep-learning-core/15-gradient-descent/docs/zh.md

# 梯度下降

> 用一句话概括核心概念。

## 问题

...

Translation contributions follow the same branch-and-PR workflow as new lessons, but typically require fewer test modifications since the underlying code implementations remain unchanged.
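To find lessons that still need a translation, a contributor could scan the tree for directories that have docs/en.md but no file for the target language. The helper below is a hypothetical sketch built only on the path layout described in this guide; it is not a script shipped with the repository.

```python
from pathlib import Path
import tempfile


def untranslated_lessons(repo_root, lang):
    """List lesson directories that have docs/en.md but no docs/<lang>.md."""
    root = Path(repo_root)
    missing = []
    for en_file in sorted(root.glob("phases/*/*/docs/en.md")):
        if not (en_file.parent / f"{lang}.md").exists():
            lesson_dir = en_file.parent.parent
            missing.append(lesson_dir.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Demo against a throwaway tree with one translated and one untranslated lesson.
    with tempfile.TemporaryDirectory() as tmp:
        phase = Path(tmp) / "phases" / "03-deep-learning-core"
        for slug in ("14-backpropagation", "15-gradient-descent"):
            docs = phase / slug / "docs"
            docs.mkdir(parents=True)
            (docs / "en.md").write_text("# Lesson\n", encoding="utf-8")
        (phase / "15-gradient-descent" / "docs" / "zh.md").write_text("# 梯度下降\n", encoding="utf-8")
        print(untranslated_lessons(tmp, "zh"))
```

The lesson slug `14-backpropagation` is invented for the demo.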

Summary

  • Fork and branch before making any changes to the rohitg00/ai-engineering-from-scratch repository.
  • Follow the directory structure: phases/<phase-slug>/<lesson-slug>/ with code/, docs/en.md, and quiz.json.
  • Commit one lesson per PR to maintain clean git history.
  • Update README.md tables to ensure the site generator picks up your content.
  • Run python -m unittest discover locally before pushing to catch execution errors early.
  • Respect the CI pipeline checks in scripts/audit_lessons.py, which validate markdown patterns and dependency resolution.

Frequently Asked Questions

What is the "one lesson per PR" rule?

The one lesson per PR rule is a hard requirement in the ai-engineering-from-scratch contribution guidelines. It ensures that code reviews remain focused and manageable, allows specific lessons to be reverted without affecting others, and maintains a clean git history. If you are contributing multiple lessons, you must open separate pull requests for each one.

How do I know if my lesson passes the automated audits?

The scripts/audit_lessons.py file contains the validation logic used by the CI pipeline. It checks for correct file naming conventions, validates that quiz.json exists and contains the required six questions, verifies that lesson tables in README.md match specific markdown patterns (as defined in lines 13-16 of CONTRIBUTING.md), and ensures all declared dependencies in requirements.txt or package.json resolve correctly. You can run this script locally or wait for the automatic CI check when you open your pull request.

Can I contribute lessons in programming languages other than Python?

Yes. The curriculum supports Python, TypeScript, Rust, and Julia. When adding a lesson in an alternative language, include the appropriate dependency file (package.json for TypeScript, Cargo.toml for Rust, etc.) in the code/ directory and specify the language in the README.md table entry. The site/build.js generator will automatically categorize your lesson by the languages you declare.

Where do I find the templates for lesson documentation?

The front-matter template for docs/en.md and detailed file structure rules are documented in CONTRIBUTING.md at the root of the repository. This file specifies the exact metadata fields required (title, hook, type, languages, prerequisites, learning objectives) and provides examples of properly formatted lesson tables and quiz structures.
