CI/CD Workflows Defined in curriculum.yml: Automated Testing for AI Engineering Curriculum
The curriculum.yml workflow defines four automated jobs (audit, readme-counts-sync, site-rebuild, and readme-counts-drift) that validate lesson structure, synchronize documentation, and rebuild the public site in response to pushes and pull requests.
The repository rohitg00/ai-engineering-from-scratch maintains a structured curriculum for learning AI engineering and uses GitHub Actions to keep it consistent. The CI/CD workflows defined in curriculum.yml enforce invariant rules across lesson files, automatically repair documentation drift, and ensure the public website reflects the latest content. These workflows run on Ubuntu runners using Python 3.12 (and Node.js for site builds) to provide continuous integration for the educational codebase.
The Four CI/CD Jobs in curriculum.yml
The workflow file .github/workflows/curriculum.yml defines four distinct jobs that trigger under specific conditions to maintain curriculum integrity.
audit: Structural Validation on Push and Pull Request
The audit job runs on every push and pull request that modifies lesson files, scripts, or the workflow itself. It executes scripts/audit_lessons.py to perform comprehensive invariant checking across the curriculum.
This validation script scans every lesson directory under phases/, confirming that each lesson contains required components including docs/en.md, code/ directories, quiz.json files, and corresponding unit tests. It also validates naming conventions and ensures no disallowed dependencies are introduced into the codebase.
- name: run scripts/audit_lessons.py
  run: python3 scripts/audit_lessons.py
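To make the structural check concrete, here is a minimal Python sketch of the kind of per-lesson validation described above. It is not the actual contents of scripts/audit_lessons.py. The function names, the phases/<phase>/<lesson>/ layout, and the placement of quiz.json at the lesson root are all assumptions made for illustration, and the sketch skips the naming and dependency rules.

```python
from pathlib import Path

# Required entries, taken from the components named in the text.
# Their exact locations inside a lesson directory are assumptions.
REQUIRED_FILES = ("docs/en.md", "quiz.json")
REQUIRED_DIRS = ("code",)


def audit_lesson(lesson_dir: Path) -> list[str]:
    """Return a list of human-readable problems for one lesson directory."""
    problems = []
    for rel in REQUIRED_FILES:
        if not (lesson_dir / rel).is_file():
            problems.append(f"{lesson_dir.name}: missing file {rel}")
    for rel in REQUIRED_DIRS:
        if not (lesson_dir / rel).is_dir():
            problems.append(f"{lesson_dir.name}: missing directory {rel}/")
    return problems


def audit_curriculum(phases_root: Path) -> list[str]:
    """Audit every lesson, assuming the layout phases/<phase>/<lesson>/."""
    problems = []
    lesson_dirs = sorted(p for p in phases_root.glob("*/*") if p.is_dir())
    for lesson_dir in lesson_dirs:
        problems.extend(audit_lesson(lesson_dir))
    return problems
```

A CI entry point built on this sketch would print the problems and exit with a non-zero status when the list is non-empty, which is what causes a job step to fail.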
readme-counts-sync: Automatic README Repair on Main
The readme-counts-sync job triggers exclusively on pushes to the main branch. It regenerates the lesson catalog using scripts/build_catalog.py, then executes scripts/check_readme_counts.py --fix to verify that the lesson count tables in README.md match the actual lesson directories.
If the script detects discrepancies between the documented counts and the actual curriculum structure, it automatically commits the corrected README.md to the repository, ensuring documentation stays synchronized with the codebase.
- name: build ephemeral catalog
  run: python3 scripts/build_catalog.py
- name: sync README counts
  run: python3 scripts/check_readme_counts.py --fix
site-rebuild: Static Site Regeneration
The site-rebuild job executes after readme-counts-sync completes on pushes to main. It runs node site/build.js to regenerate site/data.js, which powers the static website that lists all available lessons.
If the generated file contains changes, the job commits the new version, ensuring the public site always reflects the current curriculum structure and lesson availability.
- name: rebuild site/data.js
  run: node site/build.js
readme-counts-drift: Advisory Checks on Pull Requests
The readme-counts-drift job runs on pull requests targeting main to provide early warning of documentation inconsistencies. It builds a temporary catalog and runs scripts/check_readme_counts.py without the --fix flag.
If the README counts are out of sync, the job emits a GitHub warning annotation that alerts reviewers to the drift, informing them that the main branch will automatically heal the documentation upon merge.
- name: check README counts
  run: |
    if ! python3 scripts/check_readme_counts.py; then
      echo "::warning::README.md counts drift detected. Main will self-heal on merge."
    fi
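The step above is advisory: a failing check produces a warning annotation, but the step itself still succeeds. The same pattern can be sketched in Python as follows. The helper name advisory_check is hypothetical and is not part of the repository. The only GitHub-specific piece is the ::warning:: workflow command, which is the same one the shell step uses.

```python
import subprocess
import sys
from typing import Optional


def advisory_check(cmd: list[str], message: str) -> Optional[str]:
    """Run cmd. On failure, print a GitHub warning annotation instead of failing.

    Returns the annotation that was printed, or None if the command succeeded.
    """
    result = subprocess.run(cmd, check=False)
    if result.returncode == 0:
        return None
    annotation = f"::warning::{message}"
    print(annotation)
    return annotation


if __name__ == "__main__":
    # Mirrors the shell step: warn on drift, but always exit 0.
    advisory_check(
        [sys.executable, "scripts/check_readme_counts.py"],
        "README.md counts drift detected. Main will self-heal on merge.",
    )
    sys.exit(0)
```

Because the process always exits with status 0, the pull request is not blocked, and reviewers see the drift only as an annotation.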
How the Testing Scripts Validate Curriculum Integrity
The CI/CD workflows rely on several Python scripts and a Node.js build tool to enforce curriculum standards.
Invariant Checking with audit_lessons.py
The scripts/audit_lessons.py script performs structural and rule-based validation of every lesson in the repository. It confirms that lesson metadata conforms to the expected schema, verifies that code examples have corresponding test coverage, and checks for compliance with curriculum design rules.
README Synchronization with check_readme_counts.py
The scripts/check_readme_counts.py script compares the lesson count tables in README.md against the catalog generated by scripts/build_catalog.py. When invoked with the --fix flag (as in the readme-counts-sync job), it automatically updates the README to match the actual lesson counts. Without the flag (as in the readme-counts-drift job), it reports discrepancies without modifying files.
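The difference between the two modes can be illustrated with a small Python sketch. It is not the real scripts/check_readme_counts.py: the table row format, the function name check_counts, and the shape of the catalog (a mapping from phase name to lesson count) are assumptions chosen to keep the example short.

```python
import re

# Hypothetical row format: "| <phase> | <count> |". The real README layout may differ.
ROW = re.compile(r"^\|\s*(?P<phase>[^|]+?)\s*\|\s*(?P<count>\d+)\s*\|$")


def check_counts(readme: str, actual: dict[str, int], fix: bool = False) -> tuple[str, list[str]]:
    """Compare documented counts with actual counts.

    Returns (readme_text, drift_messages). The text is rewritten only when fix=True.
    """
    drift = []
    out_lines = []
    for line in readme.splitlines():
        match = ROW.match(line)
        if match and match["phase"] in actual:
            phase = match["phase"]
            documented = int(match["count"])
            if documented != actual[phase]:
                drift.append(
                    f"{phase}: README says {documented}, catalog says {actual[phase]}"
                )
                if fix:
                    line = f"| {phase} | {actual[phase]} |"
        out_lines.append(line)
    return "\n".join(out_lines), drift
```

In this sketch, report-only mode returns the README text unchanged along with the drift messages, while fix mode returns corrected text that a caller could write back to README.md and commit.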
Site Data Generation with build.js
The site/build.js script reads the JSON catalog produced by the build process and writes site/data.js, which serves as the data source for the front-end lesson index. This ensures the public website always displays the current curriculum organization and lesson metadata.
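The "regenerate, then commit only if something changed" behavior can be sketched as below. The real build step is a Node.js script. This sketch uses Python only to stay consistent with the other examples, and the output shape (a single window.CURRICULUM assignment) and both function names are hypothetical.

```python
import json
from pathlib import Path


def render_data_js(catalog: dict) -> str:
    """Render a catalog as a JavaScript data file.

    The global name and layout are assumptions; the real site/data.js may differ.
    Sorted keys keep the output stable, so unchanged input yields identical text.
    """
    return "window.CURRICULUM = " + json.dumps(catalog, indent=2, sort_keys=True) + ";\n"


def write_if_changed(path: Path, content: str) -> bool:
    """Write content to path only if it differs. Returns True when the file changed."""
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    path.write_text(content, encoding="utf-8")
    return True
```

Deterministic output is what makes the commit step safe to run on every push: when the curriculum has not changed, the regenerated file is byte-identical and there is nothing to commit.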
Summary
The CI/CD workflows defined in curriculum.yml provide comprehensive automation for the ai-engineering-from-scratch repository:
- Four automated jobs handle validation, documentation sync, site rebuilding, and drift detection
- Invariant checking via scripts/audit_lessons.py enforces lesson structure and code compliance on every push and pull request
- Automatic repair via scripts/check_readme_counts.py --fix keeps README.md synchronized with actual lesson counts on the main branch
- Static site regeneration via site/build.js ensures the public curriculum site reflects the latest content
- Advisory warnings on pull requests alert reviewers to documentation drift before merging
Frequently Asked Questions
What triggers the CI/CD workflows in curriculum.yml?
The workflows trigger based on Git events and file changes. The audit job runs on every push and pull request that modifies lesson files, scripts, or the workflow itself. The readme-counts-sync and site-rebuild jobs execute only on pushes to the main branch, while readme-counts-drift runs specifically on pull requests targeting main.
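The trigger rules can be summarized as a small decision function. This is a sketch of the rules as described here, not a reading of the workflow file. The function name jobs_for is hypothetical, and the sketch assumes the changed files already match the audit job's path filter.

```python
def jobs_for(event: str, branch: str) -> list[str]:
    """Return the jobs expected to run for a Git event.

    branch is the pushed branch for "push" events and the target branch
    for "pull_request" events. Path filtering is deliberately ignored.
    """
    jobs = []
    if event in ("push", "pull_request"):
        jobs.append("audit")
    if event == "push" and branch == "main":
        # site-rebuild runs after readme-counts-sync completes.
        jobs += ["readme-counts-sync", "site-rebuild"]
    if event == "pull_request" and branch == "main":
        jobs.append("readme-counts-drift")
    return jobs
```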
How does the audit job validate lesson structure?
The audit job runs scripts/audit_lessons.py to scan every lesson directory under phases/. It validates that each lesson contains required files including docs/en.md, code/ directories, quiz.json, and unit tests. The script also enforces naming conventions and prevents disallowed dependencies from entering the curriculum.
What is the difference between readme-counts-sync and readme-counts-drift?
The readme-counts-sync job runs on the main branch and automatically fixes documentation by running scripts/check_readme_counts.py --fix, committing any changes to README.md. The readme-counts-drift job runs on pull requests without the --fix flag, emitting a GitHub warning if counts are out of sync without modifying files, alerting reviewers that the main branch will self-heal upon merge.
Which programming languages do the workflow scripts use?
The workflow primarily uses Python 3.12 for validation and synchronization tasks, specifically in scripts/audit_lessons.py, scripts/build_catalog.py, and scripts/check_readme_counts.py. The site-rebuild job uses Node.js to execute site/build.js for generating the static site data file.