CI Workflow Gates in curriculum.yml: Four Automated Validation Gates Explained

The .github/workflows/curriculum.yml file in the rohitg00/ai-engineering-from-scratch repository defines four CI gates: audit, readme-counts-sync, site-rebuild, and readme-counts-drift. Together they validate lesson invariants on pushes to main and on pull requests, warn about README count drift during review, and synchronize README counts and regenerate site assets once changes land on main.

This automated pipeline ensures the AI Engineering curriculum maintains structural integrity without manual intervention. Each gate targets a specific validation layer, from enforcing front-matter schemas to keeping the generated website data in lockstep with lesson content.

Overview of the CI Pipeline

The workflow triggers on two events: pushes to the main branch and pull requests affecting curriculum files. According to the source code in .github/workflows/curriculum.yml, the pipeline separates concerns into validation, repair, and regeneration phases. The audit gate runs on both events, while readme-counts-sync and site-rebuild execute only after changes land in main. A fourth gate, readme-counts-drift, runs only on pull requests and provides non-blocking feedback during review.
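
The event-to-gate routing described above can be modeled in a few lines of Python. This is an illustrative sketch only: `GATE_EVENTS` and `gates_for_event` are hypothetical names that do not exist in the repository, and the mapping simply restates the `if:` conditions quoted in the job excerpts in this article.

```python
# Hypothetical model of which gates run for each GitHub event.
# The table restates the conditions described in this article;
# it is not read from the actual workflow file.
GATE_EVENTS = {
    "audit": {"push", "pull_request"},
    "readme-counts-sync": {"push"},
    "site-rebuild": {"push"},
    "readme-counts-drift": {"pull_request"},
}


def gates_for_event(event_name: str) -> list[str]:
    """Return the gates that would run for an event, in declaration order."""
    return [gate for gate, events in GATE_EVENTS.items() if event_name in events]


if __name__ == "__main__":
    for event in ("push", "pull_request"):
        print(event, "->", ", ".join(gates_for_event(event)))
```

Keeping the routing in one table makes it easy to see that audit is the only gate shared by both events.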

The Four CI Workflow Gates

1. audit — Lesson Invariant Validation

The audit gate acts as the primary quality checkpoint. It executes scripts/audit_lessons.py to enforce contract rules governing lesson front-matter, file layout, and test counts.

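This job runs on every push and pull request that triggers the workflow. It checks out the repository, configures Python 3.12, and runs the invariant checker. If any lesson violates the defined schema, for example through missing required metadata or an incorrect directory structure, the script exits with an error and the job fails. On a pull request, that failure blocks the merge when the check is marked as required in the repository's branch protection settings.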

audit:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - name: Run invariant checks
      run: python3 scripts/audit_lessons.py

2. readme-counts-sync — Automated README Correction

The readme-counts-sync gate runs exclusively on pushes to main. It ensures the statistical counts in README.md—including phase totals, lesson counts, and quiz question tallies—match the actual curriculum structure.

The process first builds a temporary catalog using scripts/build_catalog.py, then executes scripts/check_readme_counts.py --fix to auto-correct discrepancies. When changes are detected, the workflow commits and pushes the updated README.md directly to the repository, eliminating manual tally maintenance.

readme-counts-sync:
  if: github.event_name == 'push'
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - run: python3 scripts/build_catalog.py
    - run: python3 scripts/check_readme_counts.py --fix
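    - name: Commit changes
      run: |
        # Commit and push only when the --fix step actually changed README.md;
        # a bare `git commit` exits non-zero when there is nothing to commit.
        if ! git diff --quiet -- README.md; then
          # Bot identity is an assumption; runners have no git identity by default.
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add README.md
          git commit -m "chore(readme): sync counts"
          git push
        fi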
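
The check-then-fix behavior can be sketched as a single function with a `fix` switch. This is not the implementation of scripts/check_readme_counts.py: it assumes, purely for illustration, that the README states counts as phrases such as "10 lessons", and the `sync_counts` name and its signature are hypothetical.

```python
import re


def sync_counts(readme: str, counts: dict[str, int], fix: bool = False) -> tuple[str, list[str]]:
    """Compare `<number> <label>` phrases in the README against catalog counts.

    Returns the (possibly corrected) README text and a list of drift messages.
    The text is only modified when `fix` is True.
    """
    drift: list[str] = []
    for label, expected in counts.items():
        pattern = re.compile(rf"\b(\d+)(\s+{re.escape(label)})\b")
        for match in pattern.finditer(readme):
            found = int(match.group(1))
            if found != expected:
                drift.append(f"{label}: README says {found}, catalog says {expected}")
        if fix:
            # Replace the number and keep the original spacing and label.
            readme = pattern.sub(lambda m, n=expected: f"{n}{m.group(2)}", readme)
    return readme, drift


if __name__ == "__main__":
    text = "This course has 10 lessons across 3 phases."
    fixed, problems = sync_counts(text, {"lessons": 12, "phases": 3}, fix=True)
    print(problems)
    print(fixed)
```

Sharing one code path between the checking and fixing modes means the advisory check and the repair step cannot disagree about what counts as drift.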

3. site-rebuild — Documentation Site Regeneration

The site-rebuild gate depends on the successful completion of readme-counts-sync. It regenerates site/data.js, the data file powering the static documentation website.

Using node site/build.js, this gate transforms the curriculum structure into a JavaScript data export. Like the README sync, it commits and pushes changes only when the generated file differs from the repository version. This dependency chain ensures the website reflects the corrected README state before rebuilding.

site-rebuild:
  needs: readme-counts-sync
  if: github.event_name == 'push'
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: node site/build.js
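    - name: Commit changes
      run: |
        # Commit and push only when the rebuild actually changed site/data.js;
        # a bare `git commit` exits non-zero when there is nothing to commit.
        if ! git diff --quiet -- site/data.js; then
          # Bot identity is an assumption; runners have no git identity by default.
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add site/data.js
          git commit -m "chore(site): rebuild data.js"
          git push
        fi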
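
The "regenerate, then commit only if different" pattern can be illustrated with a small sketch. The actual generator is site/build.js, which is JavaScript and is not reproduced here. The Python below only demonstrates the idea, and the `window.CURRICULUM` output shape, `render_data_js`, and `write_if_changed` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def render_data_js(catalog: dict) -> str:
    """Serialize the catalog as a JavaScript assignment (assumed output shape)."""
    # sort_keys keeps the output stable so identical input never looks changed.
    return "window.CURRICULUM = " + json.dumps(catalog, indent=2, sort_keys=True) + ";\n"


def write_if_changed(path: Path, content: str) -> bool:
    """Write `content` to `path` only when it differs; return True if written."""
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "site" / "data.js"
        print(write_if_changed(target, render_data_js({"lessons": 2})))
        print(write_if_changed(target, render_data_js({"lessons": 2})))
```

Deterministic output matters here: if the generator emitted keys in a varying order, every push would produce a spurious commit.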

4. readme-counts-drift — Pull Request Advisory Check

The readme-counts-drift gate provides early warning during code review without blocking merges. It operates only on pull requests, running the same catalog build and count check as readme-counts-sync but without the --fix flag.

When drift is detected between the PR's README.md and the actual lesson catalog, the workflow emits a GitHub Actions warning annotation. The actual repair occurs after merge via readme-counts-sync, maintaining a clean separation between advisory feedback and automated repair.

readme-counts-drift:
  if: github.event_name == 'pull_request'
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: "3.12"
    - run: python3 scripts/build_catalog.py
    - run: |
        if ! python3 scripts/check_readme_counts.py; then
          echo "::warning::README.md counts drift detected. Main will self-heal on merge."
        fi

Key Implementation Files

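The gates rely on the following files, all of which are referenced in the job definitions above:

  • .github/workflows/curriculum.yml: the workflow that defines the four gates.
  • scripts/audit_lessons.py: the invariant checker run by the audit gate.
  • scripts/build_catalog.py: builds the temporary lesson catalog used by both README count gates.
  • scripts/check_readme_counts.py: compares README.md counts against the catalog and repairs them when run with --fix.
  • site/build.js: regenerates site/data.js for the static documentation site.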

Summary

  • The audit gate enforces structural invariants on every push and pull request using scripts/audit_lessons.py.
  • The readme-counts-sync gate auto-corrects README statistics after merges to main via check_readme_counts.py --fix.
  • The site-rebuild gate regenerates site/data.js after README synchronization completes, keeping the documentation site current.
  • The readme-counts-drift gate provides non-blocking warnings on pull requests when README counts diverge from the curriculum catalog.
  • All gates run on ubuntu-latest runners. The three Python-based gates set up Python 3.12, while site-rebuild runs site/build.js with Node.js.

Frequently Asked Questions

What triggers the CI workflow gates in curriculum.yml?

The workflow triggers on two GitHub events: pushes to the main branch and pull requests targeting curriculum files. The audit and readme-counts-drift gates run during pull request reviews, while readme-counts-sync and site-rebuild execute only after code merges into main.

Why does the site-rebuild gate depend on readme-counts-sync?

The site-rebuild gate declares needs: readme-counts-sync so that README.md counts are corrected before site/data.js is regenerated. To the extent the site data draws on the same totals shown in the README, this ordering avoids generating documentation from stale counts. The needs: relationship also means that site-rebuild is skipped by default if readme-counts-sync fails.

What happens if the audit gate fails?

When scripts/audit_lessons.py detects a lesson contract violation, such as malformed front-matter or missing test files, the audit job fails. On a pull request, a failed audit blocks the merge when the check is required by branch protection. Contributors must fix the underlying curriculum structure before the pull request can proceed, so that only valid lessons enter the main branch.

Can contributors override the automated README fixes?

The workflow does not define an override, and one is rarely needed. During review, contributors see only advisory warnings from readme-counts-drift. The actual correction happens automatically after merge, when readme-counts-sync runs on the push to main. Contributors who prefer to control the README themselves can update the counts in their pull request so they match the catalog produced by build_catalog.py exactly, which leaves the post-merge sync with nothing to change.
