What the curriculum.yml GitHub Workflow Does in AI Engineering from Scratch: 4 Automated Jobs Explained

The curriculum.yml GitHub workflow in the AI Engineering from Scratch repository audits lesson structure, synchronizes README statistics, rebuilds site data, and warns contributors about README drift. It runs on pushes and pull requests to main that touch curriculum files.

The curriculum.yml GitHub workflow is the automation backbone that keeps the curriculum coherent and publish-ready without manual intervention. Defined in .github/workflows/curriculum.yml, it runs four specialized jobs that enforce invariants and self-heal documentation. This article breaks down exactly what each job does, when it triggers, and how you can run the same checks locally.

Workflow Triggers and Scope

The workflow fires on every push to main and every pull request targeting main, but only when curriculum-related files change. This includes lessons, scripts, documentation, and the site generator. By filtering path changes, the workflow avoids unnecessary runs on unrelated updates.
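As a rough illustration of what path filtering means, the sketch below decides whether a set of changed files should trigger a run. The glob patterns are hypothetical stand-ins, not the actual filters from curriculum.yml, and Python's fnmatch only approximates GitHub's own glob matching.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns; the real list lives in .github/workflows/curriculum.yml.
CURRICULUM_PATTERNS = ("lessons/*", "scripts/*", "docs/*", "site/*", "README.md")


def touches_curriculum(changed_files, patterns=CURRICULUM_PATTERNS):
    """Return True if any changed file matches any curriculum pattern.

    In fnmatch, "*" also matches "/", so "scripts/*" covers nested paths.
    """
    return any(fnmatchcase(path, pattern)
               for path in changed_files
               for pattern in patterns)
```

A change that only touches an unrelated file such as LICENSE would return False here, which corresponds to the workflow being skipped.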

The Four Jobs in curriculum.yml

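The workflow's responsibilities are split into four jobs. The audit job runs on both pushes and pull requests, readme-counts-sync and site-rebuild run in sequence on pushes to main, and readme-counts-drift gives advisory feedback on pull requests.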

audit: Validating the Curriculum Contract

The audit job runs on both pushes and pull requests. It executes scripts/audit_lessons.py to verify that every lesson follows the strict curriculum contract.

This check validates metadata, tests, and file layout. If any rule is broken, the script exits with a non-zero status and blocks the workflow.

python3 scripts/audit_lessons.py

Run this locally to lint all lesson directories before committing.
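To make the idea of a curriculum contract concrete, here is a minimal sketch of a layout check that reports violations and maps them to an exit status. It is not the code of scripts/audit_lessons.py: the required entries (lesson.md and tests) and the function names are assumptions chosen for illustration.

```python
import sys
from pathlib import Path

# Hypothetical contract: every lesson directory must contain these entries.
REQUIRED_ENTRIES = ("lesson.md", "tests")


def audit_lesson(lesson_dir: Path, required=REQUIRED_ENTRIES) -> list[str]:
    """Return one message per required entry missing from lesson_dir."""
    return [f"{lesson_dir.name}: missing {name}"
            for name in required
            if not (lesson_dir / name).exists()]


def audit_all(root: Path) -> int:
    """Audit every lesson directory under root; return a process exit code."""
    problems = []
    for lesson_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        problems.extend(audit_lesson(lesson_dir))
    for message in problems:
        print(message, file=sys.stderr)
    # Non-zero signals failure to the CI runner, which fails the job.
    return 1 if problems else 0
```

A script built this way would end with something like sys.exit(audit_all(Path("lessons"))), where the lessons directory name is again an assumption.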

readme-counts-sync: Auto-Fixing README Statistics

The readme-counts-sync job runs only on pushes to main. It re-generates the lesson catalog and runs scripts/check_readme_counts.py --fix to rewrite the lesson-count tables in README.md.
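The check-then-fix idea can be sketched as a pure function that compares table rows against actual counts and rewrites the stale ones. The row format, the function name, and the shape of the counts mapping are all assumptions; the real scripts/check_readme_counts.py may work quite differently.

```python
import re

# Hypothetical README row format: "| <section name> | <lesson count> |"
ROW = re.compile(r"^\|\s*(?P<name>[^|]+?)\s*\|\s*(?P<count>\d+)\s*\|$")


def sync_counts(readme_text, actual_counts):
    """Return (new_text, drifted_names) for the given README text.

    Rows whose count differs from actual_counts are rewritten; everything
    else, including header and separator rows, passes through unchanged.
    A trailing newline in the input is not preserved by this sketch.
    """
    out, drifted = [], []
    for line in readme_text.splitlines():
        match = ROW.match(line)
        if match and match["name"] in actual_counts:
            actual = actual_counts[match["name"]]
            if int(match["count"]) != actual:
                drifted.append(match["name"])
                line = f"| {match['name']} | {actual} |"
        out.append(line)
    return "\n".join(out), drifted
```

With a function like this, a check-only mode would fail when the drifted list is non-empty, while a --fix mode would write the new text back to README.md.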

After fixing the tables, the job commits the changes back to main. This prevents manual drift in the top-level README statistics. The push logic includes a safe retry loop with rebase handling, and it deliberately skips the commit when the most recent commit message already indicates a bot-generated change.

To replicate this fix locally:

python3 scripts/build_catalog.py
python3 scripts/check_readme_counts.py --fix
git diff README.md
git add README.md
git commit -m "chore(readme): sync counts"
git push
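In CI, the push step has to cope with main moving while the job runs. The sketch below shows one way a retry-with-rebase loop and a bot-commit guard could be structured. The marker string, the attempt limit, and the function names are assumptions, and the command runner is injected so the logic can be exercised without a real repository.

```python
from typing import Callable, Sequence

# Hypothetical marker; the workflow's actual bot commit message may differ.
BOT_MARKER = "chore(readme): sync counts"


def is_bot_commit(last_message: str, marker: str = BOT_MARKER) -> bool:
    """True if the latest commit looks bot-generated, so no new commit is needed."""
    return last_message.strip().startswith(marker)


def push_with_retry(run: Callable[[Sequence[str]], int], max_attempts: int = 3) -> bool:
    """Try to push; on rejection, rebase onto the remote tip and try again.

    run executes a command and returns its exit code (0 means success).
    """
    for _ in range(max_attempts):
        if run(["git", "push"]) == 0:
            return True
        # The push was rejected, likely because main moved: replay our commit.
        if run(["git", "pull", "--rebase"]) != 0:
            run(["git", "rebase", "--abort"])  # leave the checkout clean
            return False
    return False
```

In a real job, run could wrap subprocess.run(cmd).returncode. The guard keeps the bot from reacting to its own commits, and the bounded loop avoids retrying forever.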

site-rebuild: Keeping Site Data in Sync

The site-rebuild job runs on pushes to main after readme-counts-sync completes. It executes node site/build.js to rebuild site/data.js from the current catalog.

The rebuilt data file is then committed back to the repository. This ensures the public website always reflects the latest curriculum state.

Local usage mirrors the CI step exactly:

node site/build.js
git diff site/data.js
git add site/data.js
git commit -m "chore(site): rebuild data.js"
git push

readme-counts-drift: Advisory Checks for Pull Requests

The readme-counts-drift job runs only on pull requests. It builds the catalog and checks README counts, but instead of committing fixes, it emits a ::warning:: annotation if the README is out of sync.

This gives contributors early feedback that main will self-heal the counts on merge, avoiding unnecessary manual edits in the PR.

Simulate the advisory check locally with:

python3 scripts/build_catalog.py
if ! python3 scripts/check_readme_counts.py; then
  echo "README counts out of sync – main branch will self-heal on merge"
fi
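The advisory behaviour comes down to turning a failing check into a warning annotation while still exiting successfully. A minimal sketch follows, assuming the checker signals drift through a non-zero exit code. The function name and message wording are illustrative, while the ::warning file=...:: prefix follows GitHub's workflow command syntax.

```python
def advisory_result(check_exit_code, file="README.md"):
    """Map the checker's exit code to (annotation_or_None, job_exit_code).

    The job exit code is always 0, so drift never blocks the pull request.
    """
    if check_exit_code == 0:
        return None, 0
    message = "README counts out of sync; main will self-heal on merge"
    # Printing this line to stdout in a job step creates a warning annotation.
    return f"::warning file={file}::{message}", 0
```

Printing the returned annotation, when there is one, is what surfaces the warning on the pull request. The zero exit code is what keeps the check advisory rather than blocking.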

Key Files Behind the Automation

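Several source files work together to power the curriculum.yml GitHub workflow:

  • .github/workflows/curriculum.yml defines the triggers and the four jobs.
  • scripts/audit_lessons.py validates lesson metadata, tests, and file layout.
  • scripts/build_catalog.py regenerates the lesson catalog.
  • scripts/check_readme_counts.py checks the lesson-count tables in README.md and rewrites them when run with --fix.
  • site/build.js rebuilds site/data.js from the current catalog.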

Summary

  • The curriculum.yml GitHub workflow triggers on pushes and PRs to main when curriculum files change.
  • The audit job enforces lesson metadata and layout rules via scripts/audit_lessons.py.
  • The readme-counts-sync job auto-fixes and commits README count tables on every push to main.
  • The site-rebuild job regenerates site/data.js via node site/build.js and commits the result.
  • The readme-counts-drift job warns PR authors if README counts are out of sync without blocking merges.
  • All scripts can be run locally to verify curriculum integrity before pushing.

Frequently Asked Questions

What triggers the curriculum.yml GitHub workflow in AI Engineering from Scratch?

The workflow triggers on every push to main and every pull request targeting main, but only when files affecting the curriculum are modified. This includes lesson content, scripts, documentation, and the site generator.

Why does the README update automatically instead of requiring manual edits?

The readme-counts-sync job runs scripts/check_readme_counts.py --fix on every push to main. This prevents human error and drift by automatically regenerating the lesson-count tables and committing the corrected README.md back to the repository.

Can I run the same checks locally that the GitHub workflow runs?

Yes. You can run python3 scripts/audit_lessons.py to validate lessons, python3 scripts/build_catalog.py followed by python3 scripts/check_readme_counts.py --fix to sync README counts, and node site/build.js to rebuild site data. These commands mirror the CI jobs in .github/workflows/curriculum.yml.

What happens if the audit job fails during a pull request?

If scripts/audit_lessons.py finds a curriculum contract violation, it exits with a non-zero status and fails the audit job. This blocks the workflow run and signals the contributor to fix the lesson structure, metadata, or file layout before merging.
