How to Set Up the rohitg00/ai-engineering-from-scratch Project: Complete Installation Guide

Clone the repository, create a Python virtual environment, install dependencies from requirements.txt, and run phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py to verify the curriculum environment is ready.

The rohitg00/ai-engineering-from-scratch repository is a 503-lesson curriculum organized into 20 phases that teaches AI fundamentals by building every algorithm from raw mathematics before introducing frameworks. Setting it up takes two things: an isolated Python environment, and a check that a lesson from the phase-based structure runs. The lessons themselves follow the Build-It / Use-It pattern described later in this guide.

Step 1: Clone the Repository and Explore the Structure

Start by cloning the codebase from GitHub. The repository contains no large binary assets; all lesson content and artifacts are generated on the fly.

git clone https://github.com/rohitg00/ai-engineering-from-scratch.git
cd ai-engineering-from-scratch

The root directory contains the README.md entry point, requirements.txt for Python dependencies, and the phases/ directory where all curriculum phases live. Each lesson follows the path phases/<NN>-<phase-name>/<NN>-<lesson>/ and contains three sub-folders: code/ for runnable implementations, docs/en.md for narrative documentation, and outputs/ for generated artifacts.
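The layout described above can be checked with a short script. This is a minimal sketch, not the repository's own tooling: the function names missing_parts and scan_lessons are hypothetical, and the only assumption is that each lesson sits two directory levels under phases/ with the three entries named in the text.

```python
from pathlib import Path

# Entries each lesson folder is expected to contain, per the layout above.
EXPECTED = ("code", "docs/en.md", "outputs")


def missing_parts(lesson_dir: Path) -> list[str]:
    """Return the expected entries that are absent from one lesson folder."""
    return [rel for rel in EXPECTED if not (lesson_dir / rel).exists()]


def scan_lessons(repo_root: Path) -> dict[str, list[str]]:
    """Map each lesson path (relative to repo_root) to its missing entries."""
    report = {}
    # Two levels under phases/: first the phase folder, then the lesson folder.
    for lesson_dir in sorted((repo_root / "phases").glob("*/*")):
        if lesson_dir.is_dir():
            key = lesson_dir.relative_to(repo_root).as_posix()
            report[key] = missing_parts(lesson_dir)
    return report


if __name__ == "__main__":
    # Run from the repository root; prints only lessons with missing entries.
    for lesson, missing in scan_lessons(Path(".")).items():
        if missing:
            print(f"{lesson}: missing {', '.join(missing)}")
```

Run it from the repository root after cloning. An empty result means every lesson folder it found has all three entries. Some outputs/ folders may be legitimately absent until a lesson has been run, so treat a report as a hint and not an error.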

Step 2: Configure the Python Environment

The curriculum is designed to be stdlib-first, but later phases require optional heavy libraries like PyTorch. Create a virtual environment to isolate these dependencies and prevent conflicts with your global Python installation.

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

The requirements.txt file contains a minimal, curated list of allowed packages including numpy, torch, and h5py. This ensures all 503 lessons have access to necessary libraries while maintaining a clean development environment.
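To see which of these packages the active environment can actually find, a small check along the following lines may help. It is a sketch that uses only the standard library; the package tuple repeats the three names mentioned above and is not a full copy of requirements.txt, and the function name availability is hypothetical.

```python
import importlib.util

# Names taken from the text above; requirements.txt may list more packages.
PACKAGES = ("numpy", "torch", "h5py")


def availability(names=PACKAGES) -> dict[str, bool]:
    """Report which top-level modules can be located, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in availability().items():
        print(f"{name:<8} {'ok' if found else 'MISSING'}")
```

Because find_spec only locates a module and does not import it, the check stays fast even for a heavy package such as torch.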

Step 3: Verify Your Installation

Confirm your environment works by running a foundational lesson from Phase 01. Execute the vector implementation to validate the math foundations layer:

python phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py

Alternatively, run the lesson's unit tests to ensure full compatibility with the CI pipeline defined in .github/workflows/curriculum.yml:

python -m unittest discover -s phases/01-math-foundations/01-linear-algebra-intuition/code/tests -v

Successful execution (silent exit with status 0, or passing tests) confirms your setup is complete.
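The same check can be scripted if you want a single pass/fail result. The sketch below runs a command with the interpreter of the active environment and reports whether it exited with status 0; the helper name run_ok is hypothetical, and the script path is the one given above.

```python
import subprocess
import sys


def run_ok(args: list[str]) -> bool:
    """Run the current interpreter with the given arguments; True on exit status 0."""
    result = subprocess.run([sys.executable, *args], capture_output=True, text=True)
    if result.returncode != 0:
        # Show the error output so a failing lesson is easy to diagnose.
        print(result.stderr.strip())
    return result.returncode == 0


if __name__ == "__main__":
    lesson = "phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py"
    print("setup ok" if run_ok([lesson]) else "setup check failed")
```

Using sys.executable keeps the check inside the virtual environment, so it does not fall back to whichever python happens to be first on PATH.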

Step 4: Install Generated Artifacts (Optional)

Each lesson writes reusable artifacts (skills, prompts, agents, or MCP servers) to its local outputs/ folder. To consolidate these into a single directory for use with Claude, Cursor, or Codex, run the installation script:

python scripts/install_skills.py ./my-llm-skills

This creates ./my-llm-skills/manifest.json and organizes each artifact with a SKILL.md front-matter file compatible with major LLM systems.

Advanced Installation Options

The scripts/install_skills.py utility supports several filtering and layout options:

  • --type {skill,prompt,agent,all}: Choose which artifact class to copy (e.g., --type agent)
  • --phase N: Restrict to specific phase number (e.g., --phase 14)
  • --tag TAG: Filter by custom tags in artifact front-matter (e.g., --tag vision)
  • --layout {flat,by-phase,skills}: Choose directory structure (e.g., --layout by-phase)
  • --dry-run: Preview actions without writing files
  • --force: Overwrite existing files

Example command for advanced filtering:

python scripts/install_skills.py ./my-llm-skills \
  --type all \
  --phase 14 \
  --layout by-phase \
  --dry-run

When not in dry-run mode, the script writes a manifest.json consumable by downstream tooling and agent runtimes.
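The manifest's schema is not documented here, so the following sketch deliberately assumes nothing about its fields: it loads the file and reports only its top-level shape. The function name summarize_manifest is hypothetical, and the path matches the example output directory used above.

```python
import json
from pathlib import Path


def summarize_manifest(path: Path) -> str:
    """Describe the top-level shape of a JSON file without assuming a schema."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        return f"array with {len(data)} entries"
    return f"scalar of type {type(data).__name__}"


if __name__ == "__main__":
    manifest = Path("my-llm-skills") / "manifest.json"
    if manifest.exists():
        print(summarize_manifest(manifest))
    else:
        print(f"{manifest} not found; run the install script without --dry-run first")
```

Inspecting the keys first is a reasonable way to learn what downstream tooling can rely on before writing code against specific fields.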

Understanding the Build-It/Use-It Pattern

Every lesson in the repository has a dual structure: first you build the algorithm from scratch using only the standard library or raw math, then you run the same logic through a production library to compare implementations. For example, in phases/07-transformers-deep-dive/02-self-attention-from-scratch/, you would run:

python code/main.py       # Build-it implementation

python code/main.py --use # Use-it implementation with production library

This pattern ensures you understand both low-level mathematics and high-level framework behavior. Contributors should consult AGENTS.md for the strict "one-commit-per-lesson" workflow rules before modifying any lesson code.
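As an illustration of the pattern only, and not the code of any actual lesson, a script with a --use switch might look like the sketch below. The dot-product task, the function names dot_build and dot_use, and the choice of NumPy as the production library are all assumptions made for the example.

```python
import argparse


def dot_build(a, b):
    """Build-it: dot product from raw arithmetic, no libraries."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def dot_use(a, b):
    """Use-it: the same computation delegated to NumPy."""
    import numpy as np  # imported lazily so the build-it path needs no installs

    return float(np.dot(a, b))


def main(argv=None):
    parser = argparse.ArgumentParser(description="Build-It / Use-It sketch")
    parser.add_argument("--use", action="store_true", help="run the library version")
    args = parser.parse_args(argv)

    a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    result = dot_use(a, b) if args.use else dot_build(a, b)
    print(result)  # 1*4 + 2*5 + 3*6 = 32.0 on either path
    return result


if __name__ == "__main__":
    main()
```

Keeping the library import inside the use-it function means the from-scratch path still runs in a bare environment, which matches the stdlib-first design described in Step 2.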

Summary

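To successfully set up the rohitg00/ai-engineering-from-scratch project:

  • Clone the repository and change into the ai-engineering-from-scratch directory.
  • Create and activate a virtual environment, then install the dependencies from requirements.txt.
  • Run phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py, or the lesson's unit tests, to verify the environment.
  • Optionally, run scripts/install_skills.py to consolidate the generated artifacts into a single directory.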

Frequently Asked Questions

Do I need to install all Python dependencies at once?

No. While requirements.txt contains all dependencies, the curriculum is designed to be stdlib-first. You can install packages incrementally as you progress through phases, though installing everything upfront ensures you won't encounter missing module errors in later transformer or agent-engineering lessons that rely on torch or h5py.

What Python version is required for this project?

The repository requires Python 3. The requirements.txt specifies compatible versions of numpy, torch, and h5py that work with modern Python 3.x releases. Always work inside the virtual environment you created in Step 2 to avoid version conflicts with system packages and to maintain the isolated environment expected by the curriculum scripts.
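To confirm which interpreter is active and whether it is running inside a virtual environment, a check like the following can be used. It is a sketch that relies only on the standard library; the function names are hypothetical, and the venv test rests on the fact that sys.prefix differs from sys.base_prefix inside an environment created with python3 -m venv.

```python
import sys


def in_virtualenv() -> bool:
    """True inside a venv, where sys.prefix points at the environment directory."""
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Return a one-line summary of the running Python version and venv status."""
    v = sys.version_info
    status = "venv active" if in_virtualenv() else "no venv"
    return f"Python {v.major}.{v.minor}.{v.micro} ({status})"


if __name__ == "__main__":
    print(describe_interpreter())
```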

How do I know if the installation succeeded?

Run the unit test discovery command: python -m unittest discover -s phases/01-math-foundations/01-linear-algebra-intuition/code/tests -v. Passing tests indicate your environment matches the CI pipeline configuration. Additionally, running any lesson's code/main.py should execute without import errors or crashes, and the scripts/audit_lessons.py utility can lint lesson structure to verify installation completeness.

Where are the generated skills stored after running the install script?

scripts/install_skills.py writes to the output directory you pass on the command line (e.g., ./my-llm-skills), creating a manifest.json index and subdirectories containing SKILL.md files. These artifacts are compatible with Claude, Cursor, Codex, and other LLM systems that consume markdown-based skill definitions. Use the --layout option to choose between the flat, by-phase, and skills directory structures.
