# AI Engineering Best Practices Demonstrated in the ai-engineering-from-scratch Repository

> Explore AI engineering best practices in the ai-engineering-from-scratch repository. Discover modular design, dual implementation, zero-dependency policies, and automated artifact generation for production-grade AI.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: best-practices
- Published: 2026-06-06

---

**This repository encodes ten production-grade best practices for AI engineering, including modular lesson architecture, dual implementation strategies, zero-dependency policies, and automated artifact generation.**

The [`ai-engineering-from-scratch`](https://github.com/rohitg00/ai-engineering-from-scratch) repository by Rohit Ghumare serves as both a comprehensive curriculum and a reference implementation of professional AI engineering standards. Unlike typical educational resources that prioritize theory over practice, this codebase demonstrates how to build, test, document, and ship AI systems using reproducible, auditable workflows. Below is a detailed breakdown of the architectural decisions, conventions, and automation strategies that make this repository a blueprint for scalable AI development.

## Modular Lesson Architecture with Strict Directory Conventions

Every concept in the curriculum lives in a self-contained folder following a rigid three-part structure: `code/`, `docs/`, and `outputs/`. As documented in the [README.md at lines 84-92](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md#L84-L92), this layout guarantees that every lesson is reproducible, discoverable, and portable.

The `code/` directory contains the algorithmic implementations, `docs/` stores the explanatory material including front-matter metadata, and `outputs/` houses reusable artifacts such as prompts, skills, and MCP servers. This separation of concerns allows automated tooling to parse the curriculum programmatically while keeping human-readable documentation adjacent to executable code.
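The three-part layout described above is simple enough to check mechanically. The following is a minimal sketch of such a check, not the repository's actual audit script; the function name `missing_lesson_dirs` and the example lesson folder are hypothetical.

```python
import tempfile
from pathlib import Path

# The three subdirectories named in the article; everything else is illustrative.
REQUIRED_DIRS = ("code", "docs", "outputs")


def missing_lesson_dirs(lesson_dir: Path) -> list[str]:
    """Return the required subdirectories that are absent from a lesson folder."""
    return [name for name in REQUIRED_DIRS if not (lesson_dir / name).is_dir()]


# Demo on a throwaway lesson folder that lacks outputs/.
with tempfile.TemporaryDirectory() as tmp:
    lesson = Path(tmp) / "01-example-lesson"  # hypothetical lesson name
    (lesson / "code").mkdir(parents=True)
    (lesson / "docs").mkdir()
    demo_missing = missing_lesson_dirs(lesson)

print(demo_missing)  # ['outputs']
```

A check like this could run over every lesson folder and fail fast when a directory is missing, which is the kind of programmatic parsing the fixed layout makes possible.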

## The "Build-It / Use-It" Dual Implementation Strategy

A core pedagogical principle in this repository requires every algorithm to be implemented twice: first from raw mathematical foundations, then using production-grade libraries. This pattern, illustrated in the [six-phase pipeline diagram](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md#L95-L106), forces deep understanding of underlying mechanics before developers rely on black-box frameworks.

For example, a lesson on backpropagation would include a pure NumPy implementation in [`phases/03-deep-learning-core/03-backpropagation/code/backpropagation.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/03-deep-learning-core/03-backpropagation/code/backpropagation.py) alongside a PyTorch equivalent. This approach ensures that engineers understand gradient flow at the tensor level while still being proficient with optimized industrial tools.

## Zero-Dependency First Principles and Explicit Allowlists

The repository enforces a **zero-dependency** policy wherever possible, permitting only standard library modules or explicitly allowed packages documented in [`AGENTS.md` at lines 67-73](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md#L67-L73). This constraint keeps examples lightweight, portable across environments, and easy to audit for security vulnerabilities.

When external dependencies are necessary, they must be declared in the lesson metadata and justified pedagogically. This practice mirrors production environments where dependency minimization reduces supply chain attack surfaces and simplifies deployment to containerized or edge environments.

## Rigorous Per-Lesson Testing Contracts

Functional correctness is enforced through a mandatory testing structure where each `code/` folder ships with a `tests/` suite that must exit with code 0. The [testing contract in AGENTS.md](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md#L33-L42) specifies that every lesson must include unit tests validating both the manual implementation and the library-based version.

To run validation locally, navigate to any lesson directory and execute the appropriate test runner:

```bash
cd phases/03-deep-learning-core/03-backpropagation
python -m unittest discover -v
```

This requirement serves as living documentation while preventing regressions as the curriculum evolves or dependencies update.
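A test module satisfying a contract like this might look roughly as follows. This is a sketch under assumptions: the `sigmoid` function and the test names are hypothetical, and `math.tanh` stands in for the library-based reference that a real lesson would provide.

```python
import math
import unittest


def sigmoid(x: float) -> float:
    """From-scratch implementation under test (hypothetical lesson code)."""
    return 1.0 / (1.0 + math.exp(-x))


class SigmoidTest(unittest.TestCase):
    def test_midpoint(self):
        self.assertAlmostEqual(sigmoid(0.0), 0.5)

    def test_matches_library_reference(self):
        # Identity: sigmoid(x) == 0.5 * (1 + tanh(x / 2)); math.tanh is the reference.
        for x in (-3.0, 0.5, 4.0):
            self.assertAlmostEqual(sigmoid(x), 0.5 * (1.0 + math.tanh(x / 2.0)))


# Run the suite programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SigmoidTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

When the same tests are run through `python -m unittest`, the process exits with a nonzero status if any test fails, which is what lets a CI job treat exit code 0 as the pass condition.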

## Automated CI Validation and Curriculum Integrity

The repository maintains quality through GitHub Actions defined in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml), which triggers on every pull request. This pipeline executes [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) to validate lesson structure, metadata compliance, and test coverage, while [`scripts/check_readme_counts.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/check_readme_counts.py) automatically synchronizes the README badge with the actual lesson count.

This automation prevents "drift" between documentation and code, enforces contribution standards, and ensures the public-facing site remains synchronized with the repository state. The CI pipeline also rebuilds the static site using [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js), which parses the declarative metadata from each lesson's [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) front-matter.
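The drift check on the lesson count can be sketched in a few lines. The code below is not the contents of `scripts/check_readme_counts.py`; the badge format, the `phases/<phase>/<lesson>/` counting rule, and the function names are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical badge format, e.g. ".../badge/lessons-3-blue"; the real README may differ.
BADGE_PATTERN = re.compile(r"lessons-(\d+)-")


def count_lessons(phases_dir: Path) -> int:
    """Count lesson folders, assumed here to be phases/<phase>/<lesson>/ directories."""
    return sum(1 for p in phases_dir.glob("*/*") if p.is_dir())


def badge_matches(readme_text: str, phases_dir: Path) -> bool:
    """True when the count in the README badge equals the number of lesson folders."""
    match = BADGE_PATTERN.search(readme_text)
    return match is not None and int(match.group(1)) == count_lessons(phases_dir)


# Demo on a throwaway tree with three lesson folders.
with tempfile.TemporaryDirectory() as tmp:
    phases = Path(tmp) / "phases"
    for lesson in ("01-math/01-vectors", "01-math/02-matrices", "02-ml/01-regression"):
        (phases / lesson).mkdir(parents=True)
    in_sync = badge_matches("![lessons](https://img.shields.io/badge/lessons-3-blue)", phases)
    drifted = badge_matches("![lessons](https://img.shields.io/badge/lessons-5-blue)", phases)

print(in_sync, drifted)  # True False
```

A CI step built on a check like this would fail the pull request whenever the badge and the directory tree disagree.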

## Conventional Commits and Atomic Lesson Boundaries

Version control hygiene is enforced through strict commit conventions documented in [`AGENTS.md` at lines 12-16](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md#L12-L16). The repository requires **one commit per lesson** with a concise, structured subject line following the pattern:

```bash
git add phases/15-agent-engineering/99-new-lesson/
git commit -m "feat(phase-15/99): add new-lesson"
git push origin my-branch
```

This atomic commit strategy makes history readable, supports automated changelog generation, and enables `git bisect` debugging when lessons introduce breaking changes.
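A subject line in this shape is easy to validate with a regular expression, for example in a commit hook. The pattern below is inferred from the single example shown above; the accepted type list (`feat`, `fix`, `docs`) and the function name are assumptions, not rules quoted from `AGENTS.md`.

```python
import re

# Derived from the single example in the article; the type list is an assumption.
SUBJECT_PATTERN = re.compile(
    r"^(?P<type>feat|fix|docs)\(phase-(?P<phase>\d+)/(?P<lesson>\d+)\): (?P<summary>\S.*)$"
)


def parse_subject(subject: str):
    """Return the parsed fields of a commit subject, or None if it does not match."""
    match = SUBJECT_PATTERN.match(subject)
    return match.groupdict() if match else None


print(parse_subject("feat(phase-15/99): add new-lesson"))
# {'type': 'feat', 'phase': '15', 'lesson': '99', 'summary': 'add new-lesson'}
print(parse_subject("added a new lesson"))  # None
```

Because the phase and lesson numbers come back as named groups, the same parse could feed a changelog generator that groups entries by phase.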

## Declarative Metadata for Tooling Integration

Each lesson includes structured front-matter in [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) that declares the lesson type, programming language, prerequisites, and learning objectives. According to the [lesson front-matter specification in AGENTS.md](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md#L22-L30), this metadata enables the [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) generator to construct a navigable curriculum automatically.

This declarative approach allows the repository to function as a queryable knowledge graph, where learners can filter lessons by prerequisite knowledge or technology stack without manually browsing directories.
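Filtering lessons by metadata could look roughly like the sketch below. It uses a deliberately minimal front-matter parser rather than a YAML library, and the field names (`type`, `language`, `prerequisites`) and lesson documents are hypothetical examples, not the schema from `AGENTS.md`.

```python
def parse_front_matter(text: str) -> dict[str, str]:
    """Parse a leading '---' delimited block of simple 'key: value' lines.

    Deliberately minimal: no nesting, lists, or quoting, unlike a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter, so treat the block as absent


# Hypothetical lesson documents with made-up field names and values.
lessons = {
    "03-backpropagation": "---\ntype: build\nlanguage: python\nprerequisites: chain-rule, matrices\n---\n# Backpropagation\n",
    "01-tokenizers": "---\ntype: build\nlanguage: rust\nprerequisites: strings\n---\n# Tokenizers\n",
}

python_lessons = [
    name for name, doc in lessons.items()
    if parse_front_matter(doc).get("language") == "python"
]
print(python_lessons)  # ['03-backpropagation']
```

The same lookup works for any declared field, so a site generator or a learner-facing filter can query lessons without walking the directory tree by hand.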

## Reusable Artifacts and Skill Installation

A distinguishing feature of this repository is its treatment of learning outcomes as production assets. Each lesson produces artifacts—prompts, skills, agents, or MCP servers—stored in the `outputs/` directory. These can be installed into development environments using the repository-wide installation script:

```bash
python3 scripts/install_skills.py
```

This script, located at [`scripts/install_skills.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/install_skills.py), parses all `outputs/*.md` files and copies them into user-accessible directories for tools like Claude, Cursor, or other AI coding assistants. This bridges the gap between educational content and practical utility, turning theoretical knowledge into immediately deployable components.
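The copy step can be pictured with a short sketch. This is not the logic of `scripts/install_skills.py`, which is not reproduced here; the glob pattern, the naming scheme for installed files, and the function name `install_outputs` are assumptions, and the target directory is a temporary folder rather than any real tool's configuration path.

```python
import shutil
import tempfile
from pathlib import Path


def install_outputs(repo_root: Path, target_dir: Path) -> list[str]:
    """Copy every phases/*/*/outputs/*.md file into target_dir; return installed names."""
    target_dir.mkdir(parents=True, exist_ok=True)
    installed = []
    for src in sorted(repo_root.glob("phases/*/*/outputs/*.md")):
        # Prefix with the lesson folder name so same-named files do not collide.
        dest_name = f"{src.parent.parent.name}--{src.name}"
        shutil.copy2(src, target_dir / dest_name)
        installed.append(dest_name)
    return installed


# Demo on a throwaway repository tree with one lesson artifact.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "repo"
    outputs = root / "phases" / "15-agents" / "01-demo" / "outputs"
    outputs.mkdir(parents=True)
    (outputs / "skill.md").write_text("# Demo skill\n", encoding="utf-8")

    target = Path(tmp) / "installed-skills"  # hypothetical install location
    installed_names = install_outputs(root, target)
    copied_ok = (target / "01-demo--skill.md").read_text(encoding="utf-8") == "# Demo skill\n"

print(installed_names)  # ['01-demo--skill.md']
```

Prefixing each file with its lesson folder is one way to avoid collisions when several lessons ship an artifact with the same file name; the real script may resolve this differently.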

## Transparent Versioning and Roadmap Communication

Curriculum development is tracked transparently through [`ROADMAP.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/ROADMAP.md), which documents phase status and lesson completion. The README features a live badge showing the current lesson count, maintained automatically by the CI pipeline.

This visibility provides a single source of truth for contributors and learners regarding curriculum maturity, planned content, and phase completion status. It demonstrates best practices for open-source project management, where roadmap alignment precedes code implementation.

## Community Infrastructure and Legal Protection

The repository includes comprehensive community governance files: an MIT license, [`CONTRIBUTING.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/CONTRIBUTING.md), and [`CODE_OF_CONDUCT.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/CODE_OF_CONDUCT.md). These documents establish contribution guidelines, define the dependency allowlist policy, and provide legal protection for both maintainers and contributors.

This infrastructure encourages community participation while maintaining quality standards, mirroring the governance models of major open-source AI projects like Hugging Face Transformers or PyTorch.

## Practical Workflow Examples

To execute a single lesson implementation directly:

```bash
git clone https://github.com/rohitg00/ai-engineering-from-scratch.git
cd ai-engineering-from-scratch
python phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py
```

This pattern, documented in the [README quick-start](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md#L117-L124), demonstrates the repository's emphasis on immediate executability without complex environment setup.

## Summary

The `ai-engineering-from-scratch` repository demonstrates that educational resources can simultaneously serve as production-grade reference implementations. Key takeaways include:

- **Modular architecture** with strict `code/docs/outputs` separation ensures reproducibility
- **Dual implementation** (raw math + production libraries) builds deep understanding
- **Zero-dependency** policies and explicit allowlists minimize attack surfaces
- **Mandatory testing** per lesson enforces functional correctness
- **Automated CI** validation prevents documentation drift and enforces standards
- **Atomic commits** with conventional formatting maintain readable history
- **Declarative metadata** enables automated curriculum generation
- **Reusable artifacts** bridge learning and production via [`scripts/install_skills.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/install_skills.py)
- **Transparent roadmapping** aligns community expectations with development progress
- **Complete governance** infrastructure supports sustainable open-source growth

These conventions scale from single lessons to 500-plus lesson curricula while maintaining auditability and portability across environments.

## Frequently Asked Questions

### What makes the ai-engineering-from-scratch repository different from other AI courses?

Unlike traditional courses that provide only high-level explanations, this repository requires you to implement algorithms from mathematical foundations before using production libraries. According to the source code structure in `phases/*/*/code/`, every concept includes both a "raw" implementation and a library-based version, ensuring you understand the underlying mechanics rather than just API calls.

### How does the repository ensure code quality across hundreds of lessons?

Quality is enforced through automated validation in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) and the [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) tool. Every lesson must include a `tests/` directory that passes with exit code 0, and the CI pipeline verifies metadata compliance, structure integrity, and lesson count accuracy on every pull request, as specified in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md).

### Can I use the lesson outputs in my own projects?

Yes. The repository treats learning outcomes as reusable assets. Each lesson produces artifacts in its `outputs/` directory, and you can install these skills, prompts, or MCP servers into your development environment using `python3 scripts/install_skills.py`. This script parses the markdown outputs and makes them available to AI coding assistants like Claude or Cursor.

### What are the contribution requirements for adding new lessons?

Contributions must follow the conventions in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md): one atomic commit per lesson with a conventional commit message (e.g., `feat(phase-15/99): add new-lesson`), inclusion of a `tests/` suite that passes validation, and adherence to the zero-dependency policy where possible. The lesson must also include declarative front-matter in [`docs/en.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/docs/en.md) describing prerequisites and learning objectives.