Core AI Concepts Covered in AI Engineering from Scratch: Complete 20‑Phase Curriculum
AI Engineering from Scratch is a comprehensive open-source curriculum spanning mathematical foundations, deep learning, generative AI, and autonomous agent systems across 20 structured phases.
The rohitg00/ai-engineering-from-scratch repository organizes modern AI education into progressive modules that move from vector mathematics to production‑grade LLM deployment. Each phase combines theoretical explanations with executable code implementations, covering every major pillar of contemporary artificial intelligence.
Mathematical Foundations and Classical ML
Linear Algebra and Calculus (Phase 1)
The curriculum begins with the mathematical primitives that power all modern AI. Phase 1 covers vectors, matrices, eigen‑decomposition, gradients, and optimization theory.
In phases/01-math-foundations/01-linear-algebra-intuition/docs/en.md, the repository implements core operations like dot products that later enable attention mechanisms and similarity search:
class Vector:
    def __init__(self, components):
        self.c = list(components)

    def dot(self, other):
        # Sum of element-wise products of the two component lists.
        return sum(a * b for a, b in zip(self.c, other.c))

a = Vector([1, 2, 3])
b = Vector([4, 5, 6])
print("a·b =", a.dot(b))  # → a·b = 32
Machine Learning Fundamentals (Phase 2)
Phase 2 transitions from mathematics to classical algorithms. The material covers linear regression, logistic regression, decision trees, SVMs, K‑means clustering, and end‑to‑end ML pipelines.
The lesson at phases/02-ml-fundamentals/02-linear-regression/docs/en.md implements linear regression from scratch without frameworks, establishing how loss functions and gradient descent optimize model parameters.
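To make the link between a loss function and gradient descent concrete, here is a minimal sketch that fits a line by batch gradient descent on mean squared error. It is not taken from the repository; the function name `fit_linear`, the learning rate, and the toy data are assumptions chosen for illustration.

```python
def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals of the current prediction.
        errs = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(err^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errs, xs))
        grad_b = (2.0 / n) * sum(errs)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1
w, b = fit_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

On this noise-free toy data the fitted parameters should land very close to the generating slope and intercept.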
Deep Learning and Neural Networks
Deep Learning Core (Phase 3)
Neural network fundamentals appear in Phase 3, covering perceptrons, backpropagation, activation functions, loss landscapes, and optimizers like SGD and Adam.
The file phases/03-deep-learning-core/03-backpropagation/docs/en.md contains the complete backpropagation algorithm, the mechanism that propagates gradients through arbitrary computational graphs via the chain rule. This phase bridges the gap between shallow classical ML and deep representation learning.
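As a rough illustration of how backpropagation applies the chain rule, the sketch below runs a forward and backward pass through a two-weight network and checks one gradient against a finite difference. The network shape, the name `forward_backward`, and the input values are assumptions for this example, not the repository's code.

```python
import math

def forward_backward(x, y, w1, w2):
    # Forward pass: x -> h = tanh(w1 * x) -> y_hat = w2 * h -> loss = (y_hat - y)^2
    z = w1 * x
    h = math.tanh(z)
    y_hat = w2 * h
    loss = (y_hat - y) ** 2
    # Backward pass: chain rule from the loss back to each weight.
    dloss_dyhat = 2.0 * (y_hat - y)
    dloss_dw2 = dloss_dyhat * h
    dloss_dh = dloss_dyhat * w2
    dloss_dz = dloss_dh * (1.0 - h * h)  # d/dz tanh(z) = 1 - tanh(z)^2
    dloss_dw1 = dloss_dz * x
    return loss, dloss_dw1, dloss_dw2

loss, g1, g2 = forward_backward(x=0.5, y=1.0, w1=0.8, w2=-0.3)

# Sanity check: compare the analytic gradient for w1 with a central finite difference.
eps = 1e-6
loss_plus, _, _ = forward_backward(0.5, 1.0, 0.8 + eps, -0.3)
loss_minus, _, _ = forward_backward(0.5, 1.0, 0.8 - eps, -0.3)
numeric_g1 = (loss_plus - loss_minus) / (2 * eps)
print(abs(g1 - numeric_g1) < 1e-6)
```

The finite-difference comparison is a common way to validate a hand-written backward pass before trusting it in training.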
Computer Vision (Phase 4)
Phase 4 applies neural networks to image data through convolutional neural networks (CNNs), Vision Transformers (ViT), YOLO object detection, diffusion models for image generation, and 3‑D computer vision.
The Vision Transformers lesson at phases/04-computer-vision/14-vision-transformers/docs/en.md demonstrates how transformer architectures originally designed for NLP transfer effectively to image classification tasks.
Speech and Audio (Phase 6)
Audio AI receives dedicated coverage in Phase 6, including spectrograms, automatic speech recognition (ASR), Whisper architecture and fine‑tuning, text‑to‑speech (TTS), and neural audio codecs.
The Whisper implementation at phases/06-speech-and-audio/05-whisper-architecture-finetuning/docs/en.md teaches sequence‑to‑sequence modeling for audio transcription.
Natural Language Processing and Transformers
NLP Foundations (Phase 5)
Text processing begins with tokenization, word embeddings (Word2Vec, GloVe), and sequence architectures for text (RNNs and CNNs). The curriculum then introduces the attention mechanism that revolutionized the field.
The breakthrough moment appears in phases/05-nlp-foundations-to-advanced/10-attention-mechanism/docs/en.md, which explains how attention weights allow models to focus on relevant context regardless of sequence distance.
Transformers Deep Dive (Phase 7)
Phase 7 provides the complete transformer architecture: self‑attention, multi‑head attention, positional encoding, BERT and GPT variants, Mixture of Experts (MoE), and KV‑cache optimization for inference.
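Of the components listed above, positional encoding is easy to show in isolation. The sketch below assumes the fixed sinusoidal scheme from the original Transformer design; the lesson may use a different variant (for example learned or rotary embeddings), and the function name `sinusoidal_positions` is hypothetical.

```python
import math

def sinusoidal_positions(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    # PE[pos][2k]   = sin(pos / 10000^(2k / d_model))
    # PE[pos][2k+1] = cos(pos / 10000^(2k / d_model))
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):  # i plays the role of 2k
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

pe = sinusoidal_positions(seq_len=4, d_model=8)
print(pe[0])  # position 0: every sine term is 0.0 and every cosine term is 1.0
```

These vectors are added to the token embeddings so that otherwise order-agnostic attention layers can distinguish positions.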
The file phases/07-transformers-deep-dive/02-self-attention-from-scratch/docs/en.md implements scaled dot-product attention. A compact PyTorch version looks like this:
import math
import torch

def scaled_dot_attn(Q, K, V):
    # Scale by sqrt(d_k) so the softmax inputs do not grow with the key dimension.
    d_k = Q.shape[-1]
    scores = torch.matmul(Q, K.transpose(-2, -1)) / math.sqrt(d_k)
    attn = torch.softmax(scores, dim=-1)  # each row of weights sums to 1
    return torch.matmul(attn, V)

batch, seq, dim = 2, 5, 64
Q = torch.randn(batch, seq, dim)
K = torch.randn(batch, seq, dim)
V = torch.randn(batch, seq, dim)
out = scaled_dot_attn(Q, K, V)
print("Attention output shape:", out.shape)  # → torch.Size([2, 5, 64])
Generative AI and Large Language Models
Generative Models (Phase 8)
Phase 8 covers diffusion models (DDPM, Stable Diffusion), GANs, latent diffusion, ControlNet, and flow‑matching for image, video, and audio generation.
The DDPM implementation at phases/08-generative-ai/06-diffusion-ddpm-from-scratch/docs/en.md walks through the forward and reverse diffusion processes, teaching how neural networks learn to denoise random noise into coherent outputs.
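The forward (noising) half of the process can be sketched in a few lines. This example assumes the linear beta schedule commonly used with DDPM (1e-4 to 0.02 over 1000 steps) and uses hypothetical helper names `make_alpha_bars` and `q_sample`; it does not reproduce the lesson's code, and the learned reverse process is omitted.

```python
import math
import random

def make_alpha_bars(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products alpha_bar_t = prod_{s<=t} (1 - beta_s) for a linear schedule."""
    alpha_bars, prod = [], 1.0
    for t in range(T):
        beta = beta_start + (beta_end - beta_start) * t / (T - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0: x_t = sqrt(a_bar) * x_0 + sqrt(1 - a_bar) * noise."""
    a_bar = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * n for x, n in zip(x0, noise)]
    return xt, noise

alpha_bars = make_alpha_bars()
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]
xt_early, _ = q_sample(x0, 0, alpha_bars, rng)      # almost unchanged
xt_late, _ = q_sample(x0, 999, alpha_bars, rng)     # almost pure noise
print(alpha_bars[0], alpha_bars[-1])
```

A denoising network is then trained to predict `noise` from `xt` and `t`, which is what makes the reverse process learnable.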
LLMs from Scratch (Phase 10)
This phase teaches how to build large language models end‑to‑end: tokenizers (BPE, WordPiece, SentencePiece), pre‑training at scale, distributed training with FSDP/DeepSpeed, instruction tuning, RLHF, and quantization (GPTQ, AWQ, GGUF).
The repository includes a miniature GPT implementation at phases/10-llms-from-scratch/04-pre-training-mini-gpt/docs/en.md that links the mathematical foundations to practical transformer training.
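Among the tokenizers mentioned in this phase, BPE is simple enough to sketch: count adjacent symbol pairs across a corpus, then merge the most frequent pair into a new symbol. The toy word counts and function names below are assumptions for illustration, and the sketch performs only a single merge step rather than a full training loop.

```python
from collections import Counter

def most_frequent_pair(words):
    """words maps a tuple of symbols to its corpus frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

# Toy corpus: each word starts as a sequence of single characters.
words = {tuple("hug"): 10, tuple("pug"): 5, tuple("pun"): 12, tuple("bun"): 4, tuple("hugs"): 5}
pair = most_frequent_pair(words)
words = merge_pair(words, pair)
print(pair, words)
```

Repeating these two steps until the vocabulary reaches a target size yields the merge table that a BPE tokenizer applies at inference time.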
LLM Engineering (Phase 11)
Production deployment receives equal attention. Phase 11 covers prompt engineering, RAG (Retrieval‑Augmented Generation), LoRA fine‑tuning, function calling, guardrails, and KV‑cache optimization.
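The retrieval step of RAG can be sketched without any external services: score documents against the query, keep the best match, and place it in the prompt. The bag-of-words `embed` function here is a deliberate stand-in (an assumption for illustration); a real pipeline would use a learned embedding model and a vector store.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in "embedding": bag-of-words counts. Real systems use learned embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "LoRA adds low-rank adapter matrices to frozen weights",
    "KV-cache stores attention keys and values during decoding",
    "Guardrails filter unsafe model outputs",
]
query = "how does the KV-cache speed up decoding"
context = retrieve(query, docs, k=1)
# The retrieved text is prepended to the question before it is sent to the model.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
print(prompt)
```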
Phase 11 opens with the prompt engineering lesson at phases/11-llm-engineering/01-prompt-engineering/docs/en.md. The phase's LoRA (Low-Rank Adaptation) material covers parameter-efficient fine-tuning, which adds a low-rank update to an existing weight matrix:
import torch
W = torch.randn(4096, 4096) # Original weight matrix
A = torch.randn(4096, 16) # Low‑rank factor
B = torch.randn(16, 4096) # Low‑rank factor
alpha = 1.0
W_lora = W + alpha * torch.matmul(A, B)
print("Original size:", W.numel(), "LoRA size:", (A.numel()+B.numel()))
# → Original size: 16777216 LoRA size: 131072
Reinforcement Learning and Advanced Systems
Reinforcement Learning (Phase 9)
Phase 9 introduces decision‑making agents through MDPs (Markov Decision Processes), Q‑learning, DQN, PPO (Proximal Policy Optimization), RLHF (Reinforcement Learning from Human Feedback), and multi‑agent systems.
The PPO implementation at phases/09-reinforcement-learning/08-ppo/docs/en.md teaches policy gradient methods that later underpin RLHF, where a language model policy is optimized against a learned reward model.
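PPO itself needs a neural policy and is too long to show here, so the sketch below illustrates the simpler tabular Q-learning update that this phase also lists. The state names, reward, and hyperparameters are assumptions chosen so the arithmetic is easy to follow.

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    td_target = r + gamma * best_next
    Q[s][a] += alpha * (td_target - Q[s][a])
    return Q[s][a]

# Toy Q-table with two states and two actions each (hypothetical values).
Q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 2.0},
}
# The agent took "right" in s0, received reward 1.0, and landed in s1.
new_value = q_update(Q, "s0", "right", r=1.0, s_next="s1")
print(round(new_value, 6))  # 0.5 * (1.0 + 0.9 * 2.0 - 0.0) = 1.4
```

DQN replaces the table with a neural network, while PPO instead updates the policy directly with a clipped objective.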
Multimodal AI and Agent Systems
Multimodal AI (Phase 12)
Phase 12 fuses vision, language, and audio through CLIP, BLIP-2, LLaVA, and other vision-language models. The curriculum extends to multimodal RAG and multimodal agent architectures.
The CLIP contrastive pre‑training lesson at phases/12-multimodal-ai/02-clip-contrastive-pretraining/docs/en.md demonstrates how dual encoders learn joint embeddings across modalities.
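The contrastive objective behind dual encoders can be sketched as a symmetric cross-entropy over an image-text similarity matrix. The example assumes embeddings are already L2-normalized and uses a fixed temperature of 0.07; both the function names and the toy embeddings are hypothetical, and this is not the lesson's implementation.

```python
import math

def log_softmax(row):
    m = max(row)  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]

def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair i of images and texts is the positive match."""
    # logits[i][j] = <image_i, text_j> / temperature
    logits = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in text_embs]
              for img in image_embs]
    n = len(logits)
    # Image-to-text direction: row i should select column i.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: column j should select row j.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

Minimizing this loss pulls matching image and text embeddings together and pushes mismatched pairs apart within each batch.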
Tools and Protocols (Phase 13)
Modern AI interoperability appears in Phase 13 through the Model Context Protocol (MCP). This covers server/client architecture, async task handling, security considerations, and routing.
phases/13-tools-and-protocols/06-mcp-fundamentals/docs/en.md establishes the protocol fundamentals that enable LLMs to securely invoke external tools and APIs.
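MCP messages are JSON-RPC 2.0, and a tool invocation is a request with the method `tools/call`. The sketch below is a toy in-process handler meant only to show the message shape; it is not an MCP SDK or a compliant server, and the `add` tool and `handle` function are hypothetical.

```python
import json

# Hypothetical tool registry; "add" is an illustrative tool, not part of the protocol.
TOOLS = {"add": lambda args: args["a"] + args["b"]}

def handle(raw_request):
    """Dispatch a JSON-RPC 2.0 'tools/call' request to a registered tool."""
    req = json.loads(raw_request)
    if req.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "error": error})
    params = req.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        error = {"code": -32602, "message": "Unknown tool"}
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "error": error})
    value = tool(params.get("arguments", {}))
    # Tool results come back as a list of content items.
    result = {"content": [{"type": "text", "text": str(value)}]}
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
})
response = json.loads(handle(request))
print(response)
```

A real deployment adds the pieces this sketch leaves out, such as capability negotiation, transport handling, and input validation.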
Agent Engineering (Phase 14)
Autonomous agents receive comprehensive treatment in Phase 14: agent loops (ReAct), memory systems, planning algorithms (HTN, tree-of-thoughts), LangGraph, CrewAI, and agent evaluation frameworks.
The file phases/14-agent-engineering/01-the-agent-loop/docs/en.md implements the ReAct‑style loop that powers autonomous decision‑making, combining reasoning traces with tool execution.
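The shape of a ReAct-style loop (think, act, observe, repeat) can be sketched with a scripted policy standing in for the LLM. Everything named here, including `scripted_policy`, `run_agent`, and the toy `calculator` tool, is a hypothetical placeholder rather than the lesson's code.

```python
def calculator(expression):
    # Toy tool: only handles "a + b" with integers.
    left, right = expression.split("+")
    return str(int(left) + int(right))

TOOLS = {"calculator": calculator}

def scripted_policy(history):
    # Stand-in for an LLM call: choose the next step from the transcript so far.
    if not any(kind == "observation" for kind, _ in history):
        return {"thought": "I need to add the numbers.",
                "action": "calculator", "input": "2 + 3"}
    return {"thought": "I have the result.", "final": history[-1][1]}

def run_agent(question, policy, tools, max_steps=5):
    """Alternate reasoning and tool calls until the policy returns a final answer."""
    history = [("question", question)]
    for _ in range(max_steps):
        step = policy(history)
        history.append(("thought", step["thought"]))
        if "final" in step:
            return step["final"], history
        observation = tools[step["action"]](step["input"])
        history.append(("action", step["action"]))
        history.append(("observation", observation))
    return None, history  # step budget exhausted without an answer

answer, trace = run_agent("What is 2 + 3?", scripted_policy, TOOLS)
print(answer)
```

The `max_steps` cap is one simple safeguard against an agent that never decides to stop.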
Capstone Projects (Phase 19)
The curriculum culminates in end‑to‑end system building. This phase covers safety gates, constitutional rules engines, and reusable agent skills.
The end‑to‑end safety gate at phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md demonstrates how to integrate guardrails, observability, and fail‑safes into production AI systems.
Summary
- AI Engineering from Scratch spans 20 phases from linear algebra to autonomous agents, with each concept implemented in runnable code.
- The curriculum progresses through mathematical foundations, classical ML, neural networks, transformers, generative AI, and production LLM engineering.
- Key implementation files include phases/07-transformers-deep-dive/02-self-attention-from-scratch/docs/en.md for transformer internals and phases/14-agent-engineering/01-the-agent-loop/docs/en.md for autonomous agent architecture.
- Production considerations like LoRA fine-tuning, quantization, and MCP tool integration receive equal weight alongside theoretical foundations.
Frequently Asked Questions
Does the curriculum require prior AI knowledge?
No. Phase 0 covers development environment setup (Git, Docker, Jupyter), and Phase 1 begins with linear algebra fundamentals. The repository assumes only basic Python programming competency and builds all AI concepts from first principles with mathematical derivations and code implementations.
How does this differ from other AI courses?
Unlike framework-centric tutorials, this repository implements algorithms from scratch (e.g., backpropagation, self-attention, DDPM) before introducing libraries. The curriculum also covers modern production techniques such as RLHF, MCP, and agent orchestration alongside theoretical foundations, bridging the gap between research and engineering.
Are the code examples production‑ready?
The implementations prioritize educational clarity over optimization, but Phases 10 and 11 specifically address production concerns: distributed training with FSDP/DeepSpeed, quantization strategies, LoRA for efficient fine-tuning, KV-cache management, and safety guardrails. The capstone projects demonstrate full-stack deployment patterns.
What hardware requirements are needed for the exercises?
Early phases (0‑7) run comfortably on CPU. Later phases involving LLM pre‑training (Phase 10) and diffusion models (Phase 8) reference cloud GPU usage (A100/H100) and distributed training configurations, though the repository provides lightweight variants that demonstrate concepts on consumer hardware where possible.