# What the curriculum.yml GitHub Workflow Does in AI Engineering from Scratch: 4 Automated Jobs Explained

> Explore the curriculum.yml GitHub workflow in AI Engineering from Scratch. Learn how it automatically audits, syncs READMEs, rebuilds data, and prevents drift on every push or pull request.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: internals
- Published: 2026-06-07

---

**The [`curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) GitHub workflow in the *AI Engineering from Scratch* repository audits lesson structure, synchronizes README statistics, rebuilds site data, and warns contributors about README drift. It runs on pushes and pull requests to `main` that touch curriculum files.**

The `curriculum.yml` GitHub workflow is the automation backbone that keeps the curriculum coherent and publish-ready without manual intervention. Defined in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml), it runs four specialized jobs that enforce invariants and self-heal documentation. This article breaks down what each job does, when it triggers, and how you can run the same checks locally.

## Workflow Triggers and Scope

The workflow fires on every **push** to `main` and every **pull request** targeting `main`, but only when curriculum-related files change. This includes lessons, scripts, documentation, and the site generator. By filtering path changes, the workflow avoids unnecessary runs on unrelated updates.
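To predict locally whether a change would trigger the workflow, a path filter can be approximated in shell. This is a minimal sketch: `touches_curriculum` is a hypothetical helper, and the patterns (`phases/`, `scripts/`, `site/`, `docs/`, `README.md`) are assumptions standing in for the real filter list in the workflow file.

```bash
# Hypothetical path filter. The authoritative list is the workflow's own
# path configuration; the patterns below are illustrative assumptions.
# Reads one file path per line on stdin and succeeds on the first match.
touches_curriculum() {
  while IFS= read -r path; do
    case "$path" in
      phases/* | scripts/* | site/* | docs/* | README.md) return 0 ;;
    esac
  done
  return 1
}

# Usage (assumes a remote named origin):
# git diff --name-only origin/main...HEAD | touches_curriculum \
#   && echo "curriculum workflow would run"
```

Reading paths from stdin keeps the helper independent of how the file list is produced, so the same function works with `git diff --name-only` or a hand-written list.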

## The Four Jobs in curriculum.yml

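The workflow’s responsibilities are split into four jobs. On pushes to `main`, `audit`, `readme-counts-sync`, and `site-rebuild` run, with `site-rebuild` waiting for `readme-counts-sync` to finish. On pull requests, `audit` runs alongside the advisory `readme-counts-drift` job.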

### audit: Validating the Curriculum Contract

The **`audit`** job runs on both pushes and pull requests. It executes [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) to verify that every lesson follows the strict curriculum contract.

This check validates metadata, tests, and file layout. If any rule is broken, the script exits with a non-zero status and blocks the workflow.

```bash
python3 scripts/audit_lessons.py
```

Run this locally to lint all lesson directories before committing.
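The gating behavior described above depends only on the script's exit status. As a rough sketch, a wrapper like the following shows how a non-zero status stops whatever comes next. `run_gate` is a hypothetical helper and is not part of the repository.

```bash
# Hypothetical wrapper: run a check command and report its exit status the
# way a CI step would see it. Returns the command's own status on failure.
run_gate() {
  if "$@"; then
    echo "gate passed"
  else
    status=$?
    echo "gate failed with exit status $status"
    return "$status"
  fi
}

# Usage from the repository root: only commit when the audit succeeds.
# run_gate python3 scripts/audit_lessons.py && git commit
```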

### readme-counts-sync: Auto-Fixing README Statistics

The **`readme-counts-sync`** job runs only on pushes to `main`. It re-generates the lesson catalog and runs `scripts/check_readme_counts.py --fix` to rewrite the lesson-count tables in [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md).

After fixing the tables, the job commits the changes back to the branch. This prevents manual drift in the top-level README statistics. The push logic includes a safe retry loop with rebase handling, and it deliberately skips commits when the most recent message already indicates a bot-generated change.

To replicate this fix locally:

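```bash
# Regenerate the lesson catalog, then rewrite the count tables in README.md
python3 scripts/build_catalog.py
python3 scripts/check_readme_counts.py --fix
# Review the change before committing it
git diff README.md
git add README.md
git commit -m "chore(readme): sync counts"
git push
```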
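The retry-with-rebase push and the bot-commit guard used in CI can be sketched as two small shell functions. Both `is_bot_commit` and `push_with_retry` are hypothetical names. The message prefixes are the ones used in this article's local examples, and the remote name `origin` is an assumption. The workflow's actual logic may differ.

```bash
# Hypothetical guard: succeeds when a commit subject looks bot-generated.
# The prefixes are assumptions taken from this article's local examples.
is_bot_commit() {
  case "$1" in
    "chore(readme): sync counts"* | "chore(site): rebuild data.js"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical retry loop: try to push up to $1 times, rebasing onto the
# remote branch between attempts so a concurrent push does not reject ours.
push_with_retry() {
  local max_attempts="$1" attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if git push origin HEAD:main; then
      return 0
    fi
    echo "push attempt $attempt failed; rebasing before retrying" >&2
    # Give up if the rebase itself fails (for example, on a conflict).
    git pull --rebase origin main || return 1
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage: skip the sync when the latest commit is already a bot commit.
# if ! is_bot_commit "$(git log -1 --pretty=%s)"; then
#   git commit -am "chore(readme): sync counts" && push_with_retry 3
# fi
```

Checking the latest commit subject first is one way to keep a bot-authored commit from triggering another bot-authored commit.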

### site-rebuild: Keeping Site Data in Sync

The **`site-rebuild`** job runs on pushes to `main` after `readme-counts-sync` completes. It executes `node site/build.js` to rebuild [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) from the current catalog.

The rebuilt data file is then committed back to the repository. This ensures the public website always reflects the latest curriculum state.

The local equivalent runs the same build command, followed by a manual commit:

```bash
# Rebuild site/data.js from the current catalog
node site/build.js
# Review the change before committing it
git diff site/data.js
git add site/data.js
git commit -m "chore(site): rebuild data.js"
git push
```
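Committing the rebuilt file is only worthwhile when the build actually changed it. A minimal sketch of such a guard follows. `commit_if_changed` is a hypothetical helper, and whether the CI job performs an equivalent check is an assumption.

```bash
# Hypothetical guard: commit a tracked file only when it has unstaged changes.
# Note: `git diff --quiet` ignores untracked files, so the path must already
# be tracked by the repository.
commit_if_changed() {
  local path="$1" message="$2"
  if git diff --quiet -- "$path"; then
    echo "no changes in $path; nothing to commit"
    return 0
  fi
  git add -- "$path"
  git commit -m "$message"
}

# Usage after a rebuild:
# node site/build.js && commit_if_changed site/data.js "chore(site): rebuild data.js"
```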

### readme-counts-drift: Advisory Checks for Pull Requests

The **`readme-counts-drift`** job runs only on pull requests. It builds the catalog and checks README counts, but instead of committing fixes, it emits a `::warning::` annotation if the README is out of sync.

This gives contributors early feedback that `main` will self-heal the counts on merge, avoiding unnecessary manual edits in the PR.

Simulate the advisory check locally with:

```bash
python3 scripts/build_catalog.py
if ! python3 scripts/check_readme_counts.py; then
  echo "README counts out of sync – main branch will self-heal on merge"
fi
```
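Inside GitHub Actions, a line printed in the `::warning::` workflow-command format becomes an annotation on the run. The sketch below shows an advisory wrapper that annotates but never fails. `advise_on_drift` is a hypothetical helper, and the message text and the `file=README.md` parameter are assumptions rather than the workflow's actual output.

```bash
# Hypothetical advisory wrapper: run a check, emit a warning annotation on
# failure, and always return success so the job stays non-blocking.
advise_on_drift() {
  if "$@"; then
    echo "README counts are in sync"
  else
    # GitHub Actions turns lines of this shape into warning annotations.
    echo "::warning file=README.md::README counts out of sync; main will self-heal on merge"
  fi
  return 0
}

# Usage from the repository root:
# python3 scripts/build_catalog.py
# advise_on_drift python3 scripts/check_readme_counts.py
```

Returning success unconditionally is what makes the check advisory: the drift is surfaced to the PR author without failing the job.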

## Key Files Behind the Automation

Several source files work together to power the `curriculum.yml` GitHub workflow:

- **[`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml)** — Defines the CI orchestration, job dependencies, and trigger conditions.
- **[`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py)** — Validates lesson structure and metadata against the curriculum contract.
- **[`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py)** — Generates the temporary lesson catalog consumed by count checks and site rebuilds.
- **[`scripts/check_readme_counts.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/check_readme_counts.py)** — Checks and optionally fixes the lesson-count tables in [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md).
- **[`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js)** — Re-creates [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) for the public website.
- **[`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md)** — The human-readable overview that the workflow auto-maintains.

## Summary

- The **[`curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) GitHub workflow** triggers on pushes and PRs to `main` when curriculum files change.
- The **`audit`** job enforces lesson metadata and layout rules via [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py).
- The **`readme-counts-sync`** job auto-fixes and commits README count tables on every push to `main`.
- The **`site-rebuild`** job regenerates [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) via `node site/build.js` and commits the result.
- The **`readme-counts-drift`** job warns PR authors if README counts are out of sync without blocking merges.
- All scripts can be run locally to verify curriculum integrity before pushing.

## Frequently Asked Questions

### What triggers the curriculum.yml GitHub workflow in AI Engineering from Scratch?

The workflow triggers on every push to `main` and every pull request targeting `main`, but only when files affecting the curriculum are modified. This includes lesson content, scripts, documentation, and the site generator.

### Why does the README update automatically instead of requiring manual edits?

The `readme-counts-sync` job runs `scripts/check_readme_counts.py --fix` on every push to `main`. This prevents human error and drift by automatically regenerating the lesson-count tables and committing the corrected [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) back to the repository.

### Can I run the same checks locally that the GitHub workflow runs?

Yes. You can run `python3 scripts/audit_lessons.py` to validate lessons, `python3 scripts/build_catalog.py` followed by `python3 scripts/check_readme_counts.py --fix` to sync README counts, and `node site/build.js` to rebuild site data. These commands mirror the CI jobs in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml).

### What happens if the audit job fails during a pull request?

If [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) finds a curriculum contract violation, it exits with a non-zero status and fails the `audit` job. This blocks the workflow run and signals the contributor to fix the lesson structure, metadata, or file layout before merging.