# CI Workflow Gates in curriculum.yml: Four Automated Validation Gates Explained

> Discover the four CI workflow gates in curriculum.yml: audit, readme-counts-sync, site-rebuild, and readme-counts-drift. Learn how they automate validation for your AI engineering curriculum.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: how-to-guide
- Published: 2026-06-08

---

**The [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) file in the rohitg00/ai-engineering-from-scratch repository defines four distinct CI gates (`audit`, `readme-counts-sync`, `site-rebuild`, and `readme-counts-drift`) that validate lesson invariants, synchronize documentation counts, and regenerate site assets across pushes to `main` and pull requests.**

This automated pipeline ensures the AI Engineering curriculum maintains structural integrity without manual intervention. Each gate targets a specific validation layer, from enforcing front-matter schemas to keeping the generated website data in lockstep with lesson content.

## Overview of the CI Pipeline

The workflow triggers on two events: pushes to the `main` branch and pull requests affecting curriculum files. According to the source in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml), the pipeline separates concerns into **validation**, **repair**, and **regeneration** phases. The `audit` gate runs on both events, while `readme-counts-sync` and `site-rebuild` execute only after changes land in `main`. A fourth gate, `readme-counts-drift`, runs only on pull requests and provides non-blocking feedback during review.
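
The event-to-gate mapping described above can be summarized in a few lines of Python. This is only an illustrative model built from the conditions stated in this article; `gates_for_event` is a hypothetical helper and does not exist in the repository.

```python
# Hypothetical model of which gates run for each trigger event.
# Gate names and conditions come from the article; the function is illustrative.

def gates_for_event(event_name: str) -> list[str]:
    """Return the gates expected to run, in order, for a GitHub event name."""
    gates = ["audit"]  # validation runs on both events
    if event_name == "push":
        # Repair and regeneration happen only after changes land in main.
        gates += ["readme-counts-sync", "site-rebuild"]
    elif event_name == "pull_request":
        # Advisory only: warns but does not block the merge.
        gates.append("readme-counts-drift")
    return gates


print(gates_for_event("push"))
print(gates_for_event("pull_request"))
```

Reading the pipeline this way makes the split visible: pull requests get validation plus advice, while pushes to `main` get validation plus the two jobs that write back to the repository.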

## The Four CI Workflow Gates

### 1. audit — Lesson Invariant Validation

The **`audit`** gate acts as the primary quality checkpoint. It executes [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) to enforce contract rules governing lesson front-matter, file layout, and test counts.

This job triggers on every push and pull request. It checks out the repository, configures Python 3.12, and runs the invariant checker. If any lesson violates the defined schema—such as missing required metadata or incorrect directory structure—the workflow fails immediately, blocking the merge.

```yaml
audit:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - name: Run invariant checks
      run: python3 scripts/audit_lessons.py
```
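
To make the idea of an invariant check concrete, here is a minimal sketch of a front-matter validator. It is not the logic of `scripts/audit_lessons.py`: the required keys (`title`, `phase`), the sample paths, and both function names are assumptions chosen for illustration, and the parser only handles simple top-level `key: value` lines.

```python
# Illustrative only: the real contract is defined by scripts/audit_lessons.py.
REQUIRED_KEYS = {"title", "phase"}  # hypothetical required front-matter keys


def front_matter_keys(text: str) -> set[str]:
    """Collect top-level keys from a leading '---' delimited front-matter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Ignore nested (indented) entries and comments.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            keys.add(line.split(":", 1)[0].strip())
    return set()  # the block was never closed, so treat it as missing


def audit(lessons: dict[str, str]) -> list[str]:
    """Return one violation message per lesson that lacks a required key."""
    violations = []
    for path, text in sorted(lessons.items()):
        missing = REQUIRED_KEYS - front_matter_keys(text)
        if missing:
            violations.append(f"{path}: missing {', '.join(sorted(missing))}")
    return violations


sample = {
    "lessons/ok.md": "---\ntitle: Intro\nphase: 1\n---\nBody",
    "lessons/bad.md": "---\ntitle: No phase\n---\nBody",
}
problems = audit(sample)
for problem in problems:
    print(problem)
# A real checker would pass this value to sys.exit(); a non-zero exit code
# is what makes the CI step, and therefore the job, fail.
exit_code = 1 if problems else 0
print("exit code:", exit_code)
```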

### 2. readme-counts-sync — Automated README Correction

The **`readme-counts-sync`** gate runs exclusively on pushes to `main`. It ensures the statistical counts in **[`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md)**—including phase totals, lesson counts, and quiz question tallies—match the actual curriculum structure.

The process first builds a temporary catalog using [`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py), then executes `scripts/check_readme_counts.py --fix` to auto-correct discrepancies. When changes are detected, the workflow commits and pushes the updated [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) directly to the repository, eliminating manual tally maintenance.

```yaml
readme-counts-sync:
  if: github.event_name == 'push'
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - run: python3 scripts/build_catalog.py
    - run: python3 scripts/check_readme_counts.py --fix
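    - name: Commit changes
      run: |
        # Skip the commit when --fix left README.md untouched.
        if git diff --quiet -- README.md; then exit 0; fi
        # A fresh runner has no committer identity (values are illustrative).
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
        git add README.md
        git commit -m "chore(readme): sync counts"
        git push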
```
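
The check-then-fix behaviour can be sketched as a single function that reports drift and, optionally, rewrites the numbers. This is an assumption-laden illustration, not the code of `scripts/check_readme_counts.py`: the `<number> <label>` phrasing it searches for, the catalog shape, and the name `sync_counts` are all hypothetical.

```python
import re


def sync_counts(readme: str, catalog: dict[str, int], fix: bool = False) -> tuple[str, list[str]]:
    """Compare '<number> <label>' phrases in the README text with catalog counts.

    Returns the (possibly corrected) text and a list of drift messages.
    """
    drift = []
    for label, actual in catalog.items():
        pattern = re.compile(rf"(\d+)(\s+{re.escape(label)})\b")
        for match in pattern.finditer(readme):
            found = int(match.group(1))
            if found != actual:
                drift.append(f"{label}: README says {found}, catalog says {actual}")
        if fix:
            # Keep the original spacing and label, replace only the number.
            readme = pattern.sub(lambda m: f"{actual}{m.group(2)}", readme)
    return readme, drift


readme_text = "This course has 10 phases and 98 lessons."
counts = {"phases": 10, "lessons": 100}  # stand-in for the generated catalog

_, report = sync_counts(readme_text, counts)            # check mode
fixed_text, _ = sync_counts(readme_text, counts, True)  # fix mode
print(report)
print(fixed_text)
```

In check mode the text is returned unchanged and a non-empty report would map to a failing exit code; in fix mode the same comparison runs, but the stale numbers are overwritten.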

### 3. site-rebuild — Documentation Site Regeneration

The **`site-rebuild`** gate depends on the successful completion of `readme-counts-sync`. It regenerates **[`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js)**, the data file powering the static documentation website.

Using `node site/build.js`, this gate transforms the curriculum structure into a JavaScript data export. Like the README sync, it commits and pushes changes only when the generated file differs from the repository version. Because of the `needs` dependency, the rebuild starts only after the README correction has finished, so the site is generated from the corrected state.

```yaml
site-rebuild:
  needs: readme-counts-sync
  if: github.event_name == 'push'
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: node site/build.js
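    - name: Commit changes
      run: |
        # Skip the commit when the rebuild produced an identical file.
        if git diff --quiet -- site/data.js; then exit 0; fi
        # A fresh runner has no committer identity (values are illustrative).
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
        git add site/data.js
        git commit -m "chore(site): rebuild data.js"
        git push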
```
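
The "commit only when the file differs" rule depends on the generator producing stable output. The sketch below shows that idea in Python purely for illustration; it is not a port of `site/build.js`, and the constant name `DATA`, the catalog shape, and both function names are assumptions.

```python
import json


def render_data_js(catalog: dict) -> str:
    """Serialize the catalog as a JavaScript constant (the name DATA is hypothetical)."""
    # sort_keys and a fixed indent keep the output byte-stable across runs,
    # so an unchanged curriculum yields an unchanged file.
    body = json.dumps(catalog, indent=2, sort_keys=True)
    return f"const DATA = {body};\n"


def needs_commit(existing: str, catalog: dict) -> bool:
    """True when regenerating would change the committed file."""
    return render_data_js(catalog) != existing


committed = render_data_js({"phases": 1, "lessons": 2})
print(needs_commit(committed, {"lessons": 2, "phases": 1}))  # same data, different key order
print(needs_commit(committed, {"lessons": 3, "phases": 1}))  # a lesson was added
```

Deterministic serialization matters here: if key order or whitespace varied between runs, every push would produce a spurious rebuild commit.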

### 4. readme-counts-drift — Pull Request Advisory Check

The **`readme-counts-drift`** gate provides early warning during code review without blocking merges. It operates only on pull requests, running the same catalog build and count check as `readme-counts-sync` but **without** the `--fix` flag.

When drift is detected between the PR's [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) and the actual lesson catalog, the workflow emits a GitHub Actions warning annotation. The actual repair occurs after merge via `readme-counts-sync`, maintaining a clean separation between advisory feedback and automated repair.

```yaml
readme-counts-drift:
  if: github.event_name == 'pull_request'
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - run: python3 scripts/build_catalog.py
    - run: |
        if ! python3 scripts/check_readme_counts.py; then
          echo "::warning::README.md counts drift detected. Main will self-heal on merge."
        fi
```
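
The advisory pattern in the shell step above (run a check, downgrade its failure to a warning) can also be expressed as a small Python wrapper. This is a sketch under the assumption that the check signals drift through a non-zero exit code; `advisory` is a hypothetical helper, and the command used in the demo is a stand-in for the real count check.

```python
import subprocess
import sys


def advisory(cmd: list[str], message: str) -> int:
    """Run cmd; on failure print a warning annotation but still report success."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        # GitHub Actions turns stdout lines of this form into warning annotations.
        print(f"::warning::{message}")
    return 0  # an advisory check never fails the job


# Stand-in for a failing count check: a child process that exits with status 1.
failing_check = [sys.executable, "-c", "raise SystemExit(1)"]
code = advisory(failing_check, "README.md counts drift detected.")
print("exit code:", code)
```

Because the wrapper always returns 0, the job stays green and the pull request remains mergeable, while the annotation still surfaces the drift to reviewers.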

## Key Implementation Files

The gates rely on the workflow file itself plus scripts in the `scripts/` and `site/` directories:

- **[`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml)** — Orchestrates the four gates and manages execution order via `needs` dependencies.
- **[`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py)** — Implements the invariant validation logic for lesson structure and metadata.
- **[`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py)** — Generates a temporary JSON catalog of all lessons, phases, and quizzes used by count synchronization logic.
- **[`scripts/check_readme_counts.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/check_readme_counts.py)** — Compares live README statistics against the generated catalog; accepts `--fix` to overwrite mismatches.
- **[`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js)** — Node.js script that compiles [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) from the curriculum catalog for the documentation site.

## Summary

- The **`audit`** gate enforces structural invariants on every push and pull request using [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py).
- The **`readme-counts-sync`** gate auto-corrects README statistics after merges to `main` via `check_readme_counts.py --fix`.
- The **`site-rebuild`** gate regenerates [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) after README synchronization completes, keeping the documentation site current.
- The **`readme-counts-drift`** gate provides non-blocking warnings on pull requests when README counts diverge from the curriculum catalog.
- All gates execute on `ubuntu-latest` runners. Three of them set up Python 3.12; `site-rebuild` instead uses Node.js to run [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js).

## Frequently Asked Questions

### What triggers the CI workflow gates in curriculum.yml?

The workflow triggers on two GitHub events: **pushes to the `main` branch** and **pull requests** affecting curriculum files. The `audit` gate runs on both events, `readme-counts-drift` runs only on pull requests, and `readme-counts-sync` and `site-rebuild` execute only after code merges into `main`.

### Why does the site-rebuild gate depend on readme-counts-sync?

The `site-rebuild` gate uses `needs: readme-counts-sync` to ensure [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md) counts are accurate before regenerating [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js). Since the site data often references the same statistical totals displayed in the README, this sequential execution prevents generating documentation from stale or incorrect count metadata.

### What happens if the audit gate fails?

When [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) detects a lesson contract violation—such as malformed front-matter or missing test files—the **audit job fails and blocks the merge**. Contributors must fix the underlying curriculum structure before the pull request can proceed, ensuring only valid lessons enter the `main` branch.

### Can contributors override the automated README fixes?

No manual override is necessary, because `readme-counts-sync` runs only on pushes to `main`, after the merge. Contributors see advisory warnings from `readme-counts-drift` during review, and the actual synchronization happens automatically post-merge. A contributor who prefers manual control must update the counts in their PR so that they exactly match the catalog produced by [`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py).