# CI/CD Workflows Defined in curriculum.yml: Automated Testing for AI Engineering Curriculum

> The `curriculum.yml` workflow runs four jobs (`audit`, `readme-counts-sync`, `site-rebuild`, and `readme-counts-drift`) that automate testing for the AI Engineering curriculum and keep lesson structure and documentation accurate.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: how-to-guide
- Published: 2026-06-09

---

**The [`curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) workflow defines four automated jobs (`audit`, `readme-counts-sync`, `site-rebuild`, and `readme-counts-drift`) that validate lesson structure on pushes and pull requests, synchronize documentation and rebuild the public site on pushes to `main`, and flag README drift on pull requests.**

The repository `rohitg00/ai-engineering-from-scratch` maintains a structured curriculum for learning AI engineering and uses GitHub Actions to keep that curriculum consistent. The **CI/CD workflows defined in [`curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml)** enforce invariant rules across lesson files, automatically repair documentation drift, and keep the public website in step with the latest content. The jobs run on Ubuntu runners using Python 3.12 (and Node.js for site builds).

## The Four CI/CD Jobs in curriculum.yml

The workflow file [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) defines four distinct jobs that trigger under specific conditions to maintain curriculum integrity.

### audit: Structural Validation on Push and Pull Request

The **`audit`** job runs on every push and pull request that modifies lesson files, scripts, or the workflow itself. It executes [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) to perform comprehensive invariant checking across the curriculum.

This validation script scans every lesson directory under `phases/`, confirming that each lesson contains the required components: a `docs/en.md` file, a `code/` directory, a `quiz.json` file, and corresponding unit tests. It also validates naming conventions and ensures no disallowed dependencies are introduced into the codebase.

```yaml
- name: run scripts/audit_lessons.py
  run: python3 scripts/audit_lessons.py
```

### readme-counts-sync: Automatic README Repair on Main

The **`readme-counts-sync`** job triggers exclusively on pushes to the `main` branch. It regenerates the lesson catalog using [`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py), then executes `scripts/check_readme_counts.py --fix` to compare the lesson count tables in [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) against the actual lesson directories and rewrite any counts that no longer match.

If the script detects discrepancies between the documented counts and the actual curriculum structure, it automatically commits the corrected [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) to the repository, ensuring documentation stays synchronized with the codebase.

```yaml
- name: build ephemeral catalog
  run: python3 scripts/build_catalog.py
- name: sync README counts
  run: python3 scripts/check_readme_counts.py --fix
```

### site-rebuild: Static Site Regeneration

The **`site-rebuild`** job executes after `readme-counts-sync` completes on pushes to `main`. It runs `node site/build.js` to regenerate [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js), which powers the static website that lists all available lessons.

If the generated file contains changes, the job commits the new version, ensuring the public site always reflects the current curriculum structure and lesson availability.
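The "commit only when something changed" idea can be illustrated with a small Python sketch. This is not the workflow's actual mechanism (which this article does not show, and which may well rely on `git diff` instead). The helper name `write_if_changed` is hypothetical.

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write `content` to `path` only if it differs from what is on disk.

    Returns True when the file was created or updated, which is the signal
    a CI job could use to decide whether a commit is needed.
    Hypothetical helper, not taken from the repository.
    """
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False  # nothing to commit
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Skipping the write (and therefore the commit) when the output is identical avoids empty commits on every push to `main`.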

```yaml
- name: rebuild site/data.js
  run: node site/build.js
```

### readme-counts-drift: Advisory Checks on Pull Requests

The **`readme-counts-drift`** job runs on pull requests targeting `main` to provide early warning of documentation inconsistencies. It builds a temporary catalog and runs [`scripts/check_readme_counts.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/check_readme_counts.py) without the `--fix` flag.

If the README counts are out of sync, the job emits a GitHub warning annotation that alerts reviewers to the drift, informing them that the `main` branch will automatically heal the documentation upon merge.

```yaml
- name: check README counts
  run: |
    if ! python3 scripts/check_readme_counts.py; then
      echo "::warning::README.md counts drift detected. Main will self-heal on merge."
    fi
```

## How the Testing Scripts Validate Curriculum Integrity

The CI/CD workflows rely on three Python scripts and a Node.js build script to enforce curriculum standards.

### Invariant Checking with audit_lessons.py

The [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) script performs structural and rule-based validation of every lesson in the repository. It confirms that lesson metadata conforms to the expected schema, verifies that code examples have corresponding test coverage, and checks for compliance with curriculum design rules.
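To make the structural part of this check concrete, here is a minimal sketch of how a per-lesson audit could look. It is not the contents of `scripts/audit_lessons.py`. The `phases/<phase>/<lesson>/` nesting, the `REQUIRED` tuple, and both function names are assumptions for illustration, and the unit-test, naming, and dependency checks are left out because their rules are not described here.

```python
from pathlib import Path

# Assumed required entries per lesson, taken from the components this
# article names; the real script may check more (or different) things.
REQUIRED = ("docs/en.md", "code", "quiz.json")


def audit_lesson(lesson_dir: Path) -> list[str]:
    """Return one message per required entry missing from a lesson."""
    return [
        f"{lesson_dir.name}: missing {rel}"
        for rel in REQUIRED
        if not (lesson_dir / rel).exists()
    ]


def audit_curriculum(root: Path) -> list[str]:
    """Audit every lesson under root/phases/<phase>/<lesson>/ (assumed layout)."""
    problems: list[str] = []
    lesson_dirs = sorted(p for p in root.glob("phases/*/*") if p.is_dir())
    for lesson_dir in lesson_dirs:
        problems.extend(audit_lesson(lesson_dir))
    return problems
```

A CI entry point would typically print the returned messages and exit with a non-zero status when the list is non-empty, so the job fails.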

### README Synchronization with check_readme_counts.py

The [`scripts/check_readme_counts.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/check_readme_counts.py) script compares the lesson count tables in [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) against the catalog generated by [`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py). When invoked with the `--fix` flag (as in the `readme-counts-sync` job), it automatically updates the README to match the actual lesson counts. Without the flag (as in the `readme-counts-drift` job), it reports discrepancies without modifying files.
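The two modes can be sketched as a single pure function that returns both the corrected text and the list of drifted rows. This is an illustrative sketch only: the `| name | count |` row format, the `sync_counts` name, and the shape of the `actual` mapping are assumptions, not the real README layout or the script's API.

```python
import re

# Assumed table row shape: "| <phase name> | <lesson count> |".
ROW = re.compile(r"^\|\s*(?P<phase>[^|]+?)\s*\|\s*(?P<count>\d+)\s*\|\s*$")


def sync_counts(readme: str, actual: dict[str, int]) -> tuple[str, list[str]]:
    """Return (corrected README text, names of phases whose count drifted)."""
    drifted: list[str] = []
    lines: list[str] = []
    for line in readme.splitlines():
        match = ROW.match(line)
        if match and match["phase"] in actual:
            expected = actual[match["phase"]]
            if int(match["count"]) != expected:
                drifted.append(match["phase"])
                line = f"| {match['phase']} | {expected} |"
        lines.append(line)
    # Preserve the trailing newline that splitlines() drops.
    trailing = "\n" if readme.endswith("\n") else ""
    return "\n".join(lines) + trailing, drifted
```

Under these assumptions, a `--fix` run would write the returned text back to `README.md`. A run without the flag would leave the file alone and exit non-zero when the drift list is non-empty, which is the exit status the `readme-counts-drift` shell step tests for.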

### Site Data Generation with build.js

The [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) script reads the JSON catalog produced by the build process and writes [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js), which serves as the data source for the front-end lesson index. This ensures the public website always displays the current curriculum organization and lesson metadata.

## Summary

The **CI/CD workflows defined in [`curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml)** automate four maintenance tasks for the `ai-engineering-from-scratch` repository:

- **Four automated jobs** handle validation, documentation sync, site rebuilding, and drift detection
- **Invariant checking** via [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) enforces lesson structure and code compliance on every push and pull request
- **Automatic repair** via `scripts/check_readme_counts.py --fix` keeps README.md synchronized with actual lesson counts on the main branch
- **Static site regeneration** via [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) ensures the public curriculum site reflects the latest content
- **Advisory warnings** on pull requests alert reviewers to documentation drift before merging

## Frequently Asked Questions

### What triggers the CI/CD workflows in curriculum.yml?

The workflows trigger based on Git events and file changes. The `audit` job runs on every push and pull request that modifies lesson files, scripts, or the workflow itself. The `readme-counts-sync` and `site-rebuild` jobs execute only on pushes to the `main` branch, while `readme-counts-drift` runs specifically on pull requests targeting `main`.
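The trigger rules above can be restated as a small decision function. This encodes the description in this article rather than the workflow file itself. The function name `jobs_for` and the `touches_audited_paths` flag (standing in for "modifies lesson files, scripts, or the workflow") are hypothetical.

```python
def jobs_for(event: str, branch: str, touches_audited_paths: bool) -> list[str]:
    """List the jobs expected to run for a Git event, per the rules above.

    event: "push" or "pull_request".
    branch: the pushed branch, or the pull request's target branch.
    touches_audited_paths: whether the change modifies lesson files,
        scripts, or the workflow itself (an assumed boolean summary).
    """
    jobs: list[str] = []
    if event in ("push", "pull_request") and touches_audited_paths:
        jobs.append("audit")
    if event == "push" and branch == "main":
        jobs += ["readme-counts-sync", "site-rebuild"]
    if event == "pull_request" and branch == "main":
        jobs.append("readme-counts-drift")
    return jobs
```

For example, a push to `main` that edits a lesson would be expected to run `audit`, `readme-counts-sync`, and `site-rebuild`, while a pull request against `main` that touches none of the audited paths would run only `readme-counts-drift`.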

### How does the audit job validate lesson structure?

The `audit` job runs [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) to scan every lesson directory under `phases/`. It validates that each lesson contains the required components: a `docs/en.md` file, a `code/` directory, a `quiz.json` file, and unit tests. The script also enforces naming conventions and prevents disallowed dependencies from entering the curriculum.

### What is the difference between readme-counts-sync and readme-counts-drift?

The `readme-counts-sync` job runs on pushes to the `main` branch and automatically fixes documentation by running `scripts/check_readme_counts.py --fix`, committing any changes to README.md. The `readme-counts-drift` job runs on pull requests without the `--fix` flag and modifies no files. If the counts are out of sync, it emits a GitHub warning that tells reviewers the `main` branch will self-heal upon merge.

### Which programming languages do the workflow scripts use?

The workflow primarily uses Python 3.12 for validation and synchronization tasks, specifically in [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py), [`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py), and [`scripts/check_readme_counts.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/check_readme_counts.py). The `site-rebuild` job uses Node.js to execute [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) for generating the static site data file.