# Core AI Concepts Covered in AI Engineering from Scratch: Complete 20‑Phase Curriculum

> Explore core AI concepts in AI Engineering from Scratch. This 20-phase open-source curriculum covers math, deep learning, generative AI, and autonomous agents.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: getting-started
- Published: 2026-06-06

---

**AI Engineering from Scratch** is a comprehensive open‑source curriculum spanning mathematical foundations, deep learning, generative AI, and autonomous agent systems across 20 structured phases (Phase 0 through Phase 19).

The rohitg00/ai-engineering-from-scratch repository organizes modern AI education into progressive modules that move from vector mathematics to production‑grade LLM deployment. Each phase combines theoretical explanations with executable code implementations, covering every major pillar of contemporary artificial intelligence.

## Mathematical Foundations and Classical ML

### Linear Algebra and Calculus (Phase 1)

The curriculum begins with the mathematical primitives that power all modern AI. Phase 1 covers vectors, matrices, eigen‑decomposition, gradients, and optimization theory.

In [`phases/01-math-foundations/01-linear-algebra-intuition/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/01-math-foundations/01-linear-algebra-intuition/docs/en.md), the repository implements core operations like dot products that later enable attention mechanisms and similarity search:

```python
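class Vector:
    """Minimal vector wrapper around a list of numeric components."""

    def __init__(self, components):
        self.c = list(components)

    def dot(self, other):
        # Sum of element-wise products; zip stops at the shorter vector.
        return sum(a * b for a, b in zip(self.c, other.c))

a = Vector([1, 2, 3])
b = Vector([4, 5, 6])
print("a·b =", a.dot(b))  # → 32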
```

### Machine Learning Fundamentals (Phase 2)

Phase 2 transitions from mathematics to classical algorithms. The material covers **linear regression**, **logistic regression**, **decision trees**, **SVMs**, **K‑means clustering**, and end‑to‑end ML pipelines.

The lesson at [`phases/02-ml-fundamentals/02-linear-regression/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/02-ml-fundamentals/02-linear-regression/docs/en.md) implements these algorithms from scratch without frameworks, establishing how loss functions and gradient descent optimize model parameters.
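To make the link between a loss function and gradient descent concrete, here is a minimal sketch of one-variable linear regression in plain Python. It is not taken from the repository; the function name `fit_linear`, the learning rate, and the toy data are assumptions chosen for illustration.

```python
def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ≈ w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient to reduce the loss.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1
w, b = fit_linear(xs, ys)
print(round(w, 3), round(b, 3))  # close to 2.0 and 1.0
```

With noise-free data the fitted parameters should approach the slope and intercept used to generate the points; a learning rate that is too large for the data scale would make the updates diverge instead.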

## Deep Learning and Neural Networks

### Deep Learning Core (Phase 3)

Neural network fundamentals appear in Phase 3, covering perceptrons, **backpropagation**, activation functions, loss landscapes, and optimizers like SGD and Adam.

The file [`phases/03-deep-learning-core/03-backpropagation/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/03-deep-learning-core/03-backpropagation/docs/en.md) contains the complete backpropagation algorithm, which applies the chain rule backward through a computational graph to obtain the gradient of the loss with respect to every parameter. This phase bridges the gap between shallow classical ML and deep representation learning.
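As an illustration of the chain rule at work, the following sketch backpropagates by hand through a two-parameter network and checks the result against finite differences. The network shape (`y = w2 * tanh(w1 * x)`), the function names, and the numeric values are assumptions for this example, not the repository's code.

```python
import math

def forward(w1, w2, x, t):
    h = math.tanh(w1 * x)   # hidden activation
    y = w2 * h              # network output
    loss = (y - t) ** 2     # squared error against target t
    return h, y, loss

def backward(w1, w2, x, t):
    """Apply the chain rule from the loss back to each weight."""
    h, y, _ = forward(w1, w2, x, t)
    dloss_dy = 2 * (y - t)
    dloss_dw2 = dloss_dy * h
    dloss_dh = dloss_dy * w2
    dloss_dw1 = dloss_dh * (1 - h ** 2) * x  # tanh'(z) = 1 - tanh(z)^2
    return dloss_dw1, dloss_dw2

def numeric_grad(f, v, eps=1e-6):
    # Central finite difference, used only to sanity-check the analytic gradient.
    return (f(v + eps) - f(v - eps)) / (2 * eps)

w1, w2, x, t = 0.5, -1.5, 2.0, 1.0
g1, g2 = backward(w1, w2, x, t)
n1 = numeric_grad(lambda v: forward(v, w2, x, t)[2], w1)
n2 = numeric_grad(lambda v: forward(w1, v, x, t)[2], w2)
print(abs(g1 - n1) < 1e-5, abs(g2 - n2) < 1e-5)  # → True True
```

Gradient checking of this kind is a common way to validate a hand-written backward pass before trusting it in training.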

### Computer Vision (Phase 4)

Phase 4 applies neural networks to image data through **convolutional neural networks (CNNs)**, **Vision Transformers (ViT)**, **YOLO** object detection, diffusion models for image generation, and 3‑D computer vision.

The Vision Transformers lesson at [`phases/04-computer-vision/14-vision-transformers/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/04-computer-vision/14-vision-transformers/docs/en.md) demonstrates how transformer architectures originally designed for NLP transfer effectively to image classification tasks.
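The step that lets a transformer consume an image is splitting it into fixed-size patches that play the role of tokens. The sketch below shows that step on a toy 4×4 grid in plain Python; `patchify` is a hypothetical helper, and a real ViT would follow it with a learned linear projection and positional embeddings.

```python
def patchify(image, patch):
    """Split an H x W grid (list of rows) into flattened patch x patch tokens, row-major."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must divide evenly into patches"
    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            # Flatten one patch into a single token vector.
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens

image = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
tokens = patchify(image, 2)
print(len(tokens), tokens[0])  # → 4 [0, 1, 4, 5]
```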

### Speech and Audio (Phase 6)

Audio AI receives dedicated coverage in Phase 6, including **spectrograms**, automatic speech recognition (**ASR**), **Whisper** architecture and fine‑tuning, text‑to‑speech (**TTS**), and neural audio codecs.

The Whisper implementation at [`phases/06-speech-and-audio/05-whisper-architecture-finetuning/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/06-speech-and-audio/05-whisper-architecture-finetuning/docs/en.md) teaches sequence‑to‑sequence modeling for audio transcription.
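Spectrograms, the first topic in this phase, can be sketched as a short-time Fourier transform: slice the signal into frames and take the magnitude of each frame's spectrum. The example below is a deliberately naive version (an O(n²) DFT with no window function); the helper names, sample rate, frame length, and hop size are assumptions for illustration.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of the DFT for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectrogram(signal, frame_len, hop):
    """One row of magnitudes per frame; frames start every `hop` samples."""
    return [dft_magnitudes(signal[s:s + frame_len])
            for s in range(0, len(signal) - frame_len + 1, hop)]

sample_rate = 64  # Hz, toy value
signal = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(128)]  # 8 Hz tone
spec = spectrogram(signal, frame_len=32, hop=16)

# Convert the strongest bin of the first frame back to a frequency in Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), peak_bin * sample_rate / 32)  # → 7 8.0
```

Production pipelines typically use an FFT, a window such as Hann, and a mel filterbank on top of this; the sketch only shows the time-frequency layout.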

## Natural Language Processing and Transformers

### NLP Foundations (Phase 5)

Text processing begins with **tokenization**, **word embeddings** (Word2Vec, GloVe), and early neural architectures for text (RNNs and CNNs). The curriculum then introduces the attention mechanism that revolutionized the field.

The breakthrough moment appears in [`phases/05-nlp-foundations-to-advanced/10-attention-mechanism/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/05-nlp-foundations-to-advanced/10-attention-mechanism/docs/en.md), which explains how attention weights allow models to focus on relevant context regardless of sequence distance.

### Transformers Deep Dive (Phase 7)

Phase 7 provides the complete transformer architecture: **self‑attention**, **multi‑head attention**, **positional encoding**, **BERT** and **GPT** variants, **Mixture of Experts (MoE)**, and **KV‑cache** optimization for inference.

The file [`phases/07-transformers-deep-dive/02-self-attention-from-scratch/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/07-transformers-deep-dive/02-self-attention-from-scratch/docs/en.md) implements scaled dot‑product attention; the version below uses PyTorch tensors:

```python
import torch, math

def scaled_dot_attn(Q, K, V):
    d_k = Q.shape[-1]
    scores = torch.matmul(Q, K.transpose(-2, -1)) / math.sqrt(d_k)
    attn = torch.softmax(scores, dim=-1)
    return torch.matmul(attn, V)

batch, seq, dim = 2, 5, 64
Q = torch.randn(batch, seq, dim)
K = torch.randn(batch, seq, dim)
V = torch.randn(batch, seq, dim)
out = scaled_dot_attn(Q, K, V)
print("Attention output shape:", out.shape)  # → torch.Size([2, 5, 64])
```

## Generative AI and Large Language Models

### Generative Models (Phase 8)

Phase 8 covers **diffusion models** (DDPM, Stable Diffusion), **GANs**, **latent diffusion**, **ControlNet**, and **flow‑matching** for image, video, and audio generation.

The DDPM implementation at [`phases/08-generative-ai/06-diffusion-ddpm-from-scratch/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/08-generative-ai/06-diffusion-ddpm-from-scratch/docs/en.md) walks through the forward and reverse diffusion processes, teaching how neural networks learn to denoise random noise into coherent outputs.
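The forward process has a convenient closed form: a noisy sample at step t can be drawn directly from the clean input as x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, where ᾱ_t is the running product of (1 − β_s). The sketch below illustrates this in plain Python, assuming a linear β schedule from 1e-4 to 0.02 over 1000 steps (the defaults from the original DDPM paper); the helper names are hypothetical and not taken from the repository.

```python
import math
import random

def linear_betas(T, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_0 .. beta_{T-1}."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]

def alpha_bars(betas):
    """Cumulative products of (1 - beta): how much signal survives up to each step."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out

def q_sample(x0, t, abar, rng):
    """Draw x_t ~ q(x_t | x_0) in one step using the closed form."""
    a = abar[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]

T = 1000
abar = alpha_bars(linear_betas(T))
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]
x_mid = q_sample(x0, 500, abar, rng)
print(abar[0], abar[-1])  # near 1 at the start, near 0 at the end
```

The reverse process trains a network to predict the noise ε from x_t and t, which is what allows sampling to run the chain backward from pure noise.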

### LLMs from Scratch (Phase 10)

This phase teaches how to build **large language models** end‑to‑end: **tokenizers** (BPE, WordPiece, SentencePiece), **pre‑training** at scale, distributed training with **FSDP/DeepSpeed**, **instruction tuning**, **RLHF**, and **quantization** (GPTQ, AWQ, GGUF).

The repository includes a miniature GPT implementation at [`phases/10-llms-from-scratch/04-pre-training-mini-gpt/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/10-llms-from-scratch/04-pre-training-mini-gpt/docs/en.md) that links the mathematical foundations to practical transformer training.
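Of the tokenizers listed above, byte-pair encoding (BPE) is the easiest to sketch: repeatedly find the most frequent adjacent symbol pair and merge it into a new symbol. The following is a minimal training loop over a toy word-frequency table; the corpus and helper names are assumptions, and ties are broken by first occurrence, which real tokenizers may handle differently.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}  # toy word counts
words = {tuple(word): freq for word, freq in corpus.items()}
merges = []
for _ in range(3):
    pair = most_frequent_pair(words)
    merges.append(pair)
    words = merge_pair(words, pair)
print(merges)
```

The learned merge list is the tokenizer's vocabulary recipe: applying the merges in order to new text reproduces the same segmentation.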

### LLM Engineering (Phase 11)

Production deployment receives equal attention. Phase 11 covers **prompt engineering**, **RAG** (Retrieval‑Augmented Generation), **LoRA** fine‑tuning, **function calling**, **guardrails**, and **KV‑cache** optimization.

The Phase 11 lessons begin at [`phases/11-llm-engineering/01-prompt-engineering/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/11-llm-engineering/01-prompt-engineering/docs/en.md). The idea behind LoRA (Low‑Rank Adaptation), the phase's parameter‑efficient fine‑tuning technique, is to learn a small low‑rank update instead of retraining the full weight matrix:

```python
import torch

d, r = 4096, 16               # model dimension and LoRA rank
W = torch.randn(d, d)         # original (frozen) weight matrix
A = torch.randn(d, r)         # low‑rank factor
B = torch.randn(r, d)         # low‑rank factor (random here for illustration)

alpha = 1.0                   # scaling applied to the low‑rank update
W_lora = W + alpha * torch.matmul(A, B)
print("Original size:", W.numel(), "LoRA size:", A.numel() + B.numel())
# → Original size: 16777216 LoRA size: 131072
```
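RAG, also listed among the Phase 11 topics, can be sketched in the same spirit: retrieve the most relevant document for a query, then place it in the prompt. The example below uses a toy bag-of-words similarity in place of learned embeddings; the documents, helper names, and prompt format are all assumptions for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (real systems use learned dense vectors)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "LoRA adds low-rank adapters to frozen weights",
    "KV-cache stores past keys and values during decoding",
    "Guardrails validate model inputs and outputs",
]
question = "how does the kv-cache speed up decoding"
context = retrieve(question, docs)[0]
prompt = f"Context: {context}\nQuestion: {question}"
print(prompt)
```

The generation step would pass `prompt` to a language model; it is omitted here because it depends on the model API in use.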

## Reinforcement Learning and Advanced Systems

### Reinforcement Learning (Phase 9)

Phase 9 introduces decision‑making agents through **MDPs** (Markov Decision Processes), **Q‑learning**, **DQN**, **PPO** (Proximal Policy Optimization), **RLHF** (Reinforcement Learning from Human Feedback), and multi‑agent systems.

The PPO implementation at [`phases/09-reinforcement-learning/08-ppo/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/09-reinforcement-learning/08-ppo/docs/en.md) teaches the clipped policy‑gradient method that RLHF later uses to optimize a language model against a learned reward model.
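The core of PPO is its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. A minimal sketch in plain Python follows, assuming per-sample log-probabilities and advantages are already available; the function name and the example numbers are hypothetical.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Mean of min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A) over samples."""
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)                      # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Taking the minimum removes the incentive to push the ratio outside the clip range.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)

logp_old = [0.0, 0.0, 0.0]
logp_new = [math.log(1.5), math.log(0.5), 0.0]  # probability ratios 1.5, 0.5, 1.0
advantages = [1.0, -1.0, 2.0]
objective = ppo_clip_objective(logp_new, logp_old, advantages)
print(round(objective, 4))  # → 0.8
```

In training this quantity is maximized (or its negative minimized), usually together with a value-function loss and an entropy bonus that are left out of this sketch.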

## Multimodal AI and Agent Systems

### Multimodal AI (Phase 12)

Phase 12 fuses vision, language, and audio through **CLIP**, **BLIP‑2**, **LLaVA**, and vision‑language models. The curriculum extends to **multimodal RAG** and multimodal agent architectures.

The CLIP contrastive pre‑training lesson at [`phases/12-multimodal-ai/02-clip-contrastive-pretraining/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/12-multimodal-ai/02-clip-contrastive-pretraining/docs/en.md) demonstrates how dual encoders learn joint embeddings across modalities.
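The joint embedding is learned with a symmetric contrastive loss: within a batch, each image should score highest against its own caption, and each caption against its own image. The sketch below computes that loss on tiny hand-written vectors in plain Python; the helper names are hypothetical, and the temperature of 0.07 is an assumed fixed value, whereas CLIP learns this parameter during training.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross_entropy_rows(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    loss = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        loss += log_z - row[i]
    return loss / len(logits)

def clip_loss(image_embs, text_embs, temperature=0.07):
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts] for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_rows(logits) + cross_entropy_rows(logits_t))

aligned = clip_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
shuffled = clip_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
print(aligned < shuffled)  # → True
```

Matching pairs on the diagonal give a loss near zero, while mismatched pairs are penalized heavily, which is the pressure that pulls the two encoders into a shared space.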

### Tools and Protocols (Phase 13)

Modern AI interoperability appears in Phase 13 through the **Model Context Protocol (MCP)**. This covers server/client architecture, async task handling, security considerations, and routing.

[`phases/13-tools-and-protocols/06-mcp-fundamentals/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/13-tools-and-protocols/06-mcp-fundamentals/docs/en.md) establishes the protocol fundamentals that enable LLMs to securely invoke external tools and APIs.
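MCP messages are JSON-RPC 2.0, and a tool invocation is a request with the method `tools/call`. The sketch below builds such a request and answers it with a toy in-process dispatcher; it is an illustration of the message shape only, not an MCP SDK, and the `add` tool, the helper names, and the exact result fields should be checked against the current specification.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking a server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

def handle_request(request, tools):
    """Toy server-side dispatcher (hypothetical; a real server also handles
    initialization, tool listing, and transport)."""
    params = request["params"]
    fn = tools.get(params["name"])
    if fn is None:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32602, "message": "Unknown tool"}}
    text = str(fn(**params["arguments"]))
    return {"jsonrpc": "2.0", "id": request["id"],
            "result": {"content": [{"type": "text", "text": text}], "isError": False}}

tools = {"add": lambda a, b: a + b}  # hypothetical tool registry
request = make_tool_call(1, "add", {"a": 2, "b": 3})
response = handle_request(request, tools)
print(json.dumps(response))
```

Keeping the tool registry explicit on the server side is also where access control naturally lives: a tool that is not registered cannot be invoked.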

### Agent Engineering (Phase 14)

Autonomous agents receive comprehensive treatment in Phase 14: **agent loops** (ReAct), **memory systems**, **planning algorithms** (HTN, tree‑of‑thoughts), **LangGraph**, **CrewAI**, and agent evaluation frameworks.

The file [`phases/14-agent-engineering/01-the-agent-loop/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/docs/en.md) implements the ReAct‑style loop that powers autonomous decision‑making, combining reasoning traces with tool execution.
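The loop structure can be sketched without any model at all by substituting a scripted policy for the LLM. In the example below, `scripted_model` stands in for a model call and `calculator` is a hypothetical tool; both are assumptions for illustration rather than the repository's implementation.

```python
def calculator(expression):
    """Hypothetical tool: evaluates a simple 'a + b' expression."""
    a, _, b = expression.partition("+")
    return str(int(a) + int(b))

TOOLS = {"calculator": calculator}

def scripted_model(history):
    """Stand-in for an LLM call: returns (thought, action, action_input)."""
    observations = [content for kind, content in history if kind == "observation"]
    if not observations:
        return ("I need to add the numbers.", "calculator", "2 + 3")
    return ("I have the result.", "finish", observations[-1])

def run_agent(question, model, tools, max_steps=5):
    """Alternate reasoning and tool use until the model finishes or steps run out."""
    history = [("question", question)]
    for _ in range(max_steps):
        thought, action, action_input = model(history)
        history.append(("thought", thought))
        if action == "finish":
            return action_input, history
        observation = tools[action](action_input)  # execute the chosen tool
        history.append(("observation", observation))
    return None, history  # step budget exhausted without an answer

answer, trace = run_agent("What is 2 + 3?", scripted_model, TOOLS)
print(answer)  # → 5
```

The `max_steps` cap matters in practice: without it, a model that never emits a finishing action would loop indefinitely.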

### Capstone Projects (Phase 19)

The curriculum culminates in end‑to‑end system building. This phase covers safety gates, constitutional rules engines, and reusable agent skills.

The end‑to‑end safety gate at [`phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/19-capstone-projects/87-end-to-end-safety-gate/docs/en.md) demonstrates how to integrate guardrails, observability, and fail‑safes into production AI systems.
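One way to picture such a gate is as a wrapper that checks the input, calls the model, checks the output, and falls back to a safe reply whenever any stage fails. The following is a minimal sketch under that assumption; the check functions, the fallback message, and the metadata format are hypothetical and do not come from the capstone itself.

```python
def safety_gate(prompt, model, input_checks, output_checks, fallback="Request declined."):
    """Run input checks, the model, then output checks; fail safe at every stage."""
    for check in input_checks:
        if not check(prompt):
            return fallback, {"blocked_at": "input", "check": check.__name__}
    try:
        reply = model(prompt)
    except Exception as exc:
        # A model failure should degrade to the fallback rather than crash the caller.
        return fallback, {"blocked_at": "model", "error": repr(exc)}
    for check in output_checks:
        if not check(reply):
            return fallback, {"blocked_at": "output", "check": check.__name__}
    return reply, {"blocked_at": None}

def no_secrets(text):
    return "password" not in text.lower()

def short_enough(text):
    return len(text) <= 200

def echo_model(prompt):
    """Stand-in for a real model call."""
    return f"You said: {prompt}"

ok, ok_meta = safety_gate("hello", echo_model, [no_secrets], [short_enough])
blocked, blocked_meta = safety_gate("my password is hunter2", echo_model,
                                    [no_secrets], [short_enough])
print(ok, ok_meta)
print(blocked, blocked_meta)
```

Returning metadata alongside the reply gives an observability hook: each decision records which stage and which check blocked the request.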

## Summary

- **AI Engineering from Scratch** spans 20 phases from linear algebra to autonomous agents, with each concept implemented in runnable code.
- The curriculum progresses through mathematical foundations, classical ML, neural networks, transformers, generative AI, and production LLM engineering.
- Key implementation files include [`phases/07-transformers-deep-dive/02-self-attention-from-scratch/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/07-transformers-deep-dive/02-self-attention-from-scratch/docs/en.md) for transformer internals and [`phases/14-agent-engineering/01-the-agent-loop/docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/docs/en.md) for autonomous agent architecture.
- Production considerations like LoRA fine‑tuning, quantization, and MCP tool integration receive equal weight alongside theoretical foundations.

## Frequently Asked Questions

### Does the curriculum require prior AI knowledge?

No. Phase 0 covers development environment setup (Git, Docker, Jupyter), and Phase 1 begins with linear algebra fundamentals. The repository assumes only basic Python programming competency and builds all AI concepts from first principles with mathematical derivations and code implementations.

### How does this differ from other AI courses?

Unlike framework‑centric tutorials, this repository implements algorithms from scratch (e.g., backpropagation, self‑attention, DDPM diffusion) before introducing libraries. The curriculum also covers modern production techniques such as **RLHF**, **MCP**, and **agent orchestration** alongside theoretical foundations, bridging the gap between research and engineering.

### Are the code examples production‑ready?

The implementations prioritize educational clarity over optimization, but Phases 10, 11, and 14 specifically address production concerns: distributed training with **FSDP/DeepSpeed** and quantization strategies (Phase 10), **LoRA** for efficient fine‑tuning, **KV‑cache** management, and safety guardrails (Phase 11), and agent evaluation (Phase 14). The capstone projects demonstrate full‑stack deployment patterns.

### What hardware requirements are needed for the exercises?

Early phases (0‑7) run comfortably on CPU. Later phases involving LLM pre‑training (Phase 10) and diffusion models (Phase 8) reference cloud GPU usage (A100/H100) and distributed training configurations, though the repository provides lightweight variants that demonstrate concepts on consumer hardware where possible.