CI/CD Pipelines in AI Engineering From Scratch: GitHub Actions Workflow Explained

The AI Engineering From Scratch repository implements a CI/CD pipeline with GitHub Actions. It validates curriculum integrity on every push and pull request, and on every push to main it synchronizes the README counts and rebuilds the project website data.

The rohitg00/ai-engineering-from-scratch project maintains educational content quality through automated continuous integration and delivery (CI/CD) processes. According to the source code, the workflow defined in .github/workflows/curriculum.yml orchestrates four distinct jobs that run on Ubuntu runners, ensuring that every code change meets curriculum invariants before deployment.

CI/CD Pipeline Architecture

The repository uses a single workflow file located at .github/workflows/curriculum.yml to manage all automation. This pipeline triggers on every push to the main branch and on every pull request, providing immediate feedback to contributors.
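A trigger block matching this description could look like the sketch below. The article does not quote the actual `on:` section of curriculum.yml, so the workflow name and the branch filter are assumptions.

```yaml
# Sketch only: the real `on:` block in curriculum.yml may differ.
name: curriculum            # hypothetical workflow name

on:
  push:
    branches: [main]        # run on pushes to main
  pull_request:             # run on every pull request
```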

The infrastructure relies on:

  • Ubuntu-latest runners for all jobs
  • Python 3.12 for curriculum validation scripts
  • Node.js for static site generation
  • GITHUB_TOKEN for authenticated commits back to the repository

The Four Automated Jobs

The CI/CD pipeline splits responsibilities across four specialized jobs, each targeting specific maintenance tasks.

Audit Job: Enforcing Curriculum Invariants

The audit job runs on every push and pull request to validate the curriculum structure. It executes scripts/audit_lessons.py to verify that all lessons meet predefined structural requirements.

jobs:
  audit:
    name: invariant checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: run scripts/audit_lessons.py
        run: python3 scripts/audit_lessons.py

This job acts as a gatekeeper: a failing audit on a pull request flags malformed curriculum data before it is merged into the main branch.

README Counts Sync: Automated Documentation Updates

The readme-counts-sync job activates only on pushes to main. It runs scripts/check_readme_counts.py --fix to regenerate the lesson catalog via scripts/build_catalog.py and automatically commits any changes to README.md.

The workflow includes safeguards against infinite loops and handles push conflicts through a retry-and-rebase strategy:

- name: sync README counts
  run: python3 scripts/check_readme_counts.py --fix
- name: commit + push if README changed
  env:
    BOT_COMMIT_PREFIX: "chore(readme): sync counts"
  run: |
    if git diff --quiet README.md; then exit 0; fi
    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add README.md
    git commit -m "$BOT_COMMIT_PREFIX"
    # push with retry logic …
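The push logic is elided in the excerpt above. As an illustration of what a retry-and-rebase strategy generally looks like, here is a minimal standalone step. The attempt count, the back-off interval, and the step name are assumptions, not values taken from the repository.

```yaml
# Hypothetical step: illustrates retry-and-rebase in general,
# not the repository's actual push logic.
- name: push with retry and rebase (sketch)
  run: |
    for attempt in 1 2 3; do
      # Stop as soon as one push succeeds.
      if git push origin HEAD:main; then
        exit 0
      fi
      # Another commit landed first: replay the bot commit on top of it.
      git pull --rebase origin main
      sleep $((attempt * 2))
    done
    echo "push failed after 3 attempts" >&2
    exit 1
```

Rebasing before each retry keeps the bot commit on top of whatever landed in the meantime, so the push can fast-forward instead of being rejected again.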

Site Rebuild: Continuous Deployment of Static Assets

The site-rebuild job handles continuous deployment for the curriculum website. Triggered exclusively on main branch pushes, it executes node site/build.js to regenerate site/data.js and automatically commits the updated file.

This ensures the rendered curriculum website at /site always reflects the latest lesson structure without manual intervention.
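Put together, a job with this behavior might be shaped like the following sketch. Only the job name, `node site/build.js`, and `site/data.js` come from the description above. The action versions, Node.js version, permissions block, and commit message are illustrative assumptions.

```yaml
# Sketch under assumptions; not copied from curriculum.yml.
jobs:
  site-rebuild:
    # Run only for pushes to main.
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    permissions:
      contents: write            # assumed: needed to push the regenerated file
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"     # assumed version
      - name: rebuild site data
        run: node site/build.js
      - name: commit + push if site/data.js changed
        run: |
          # Exit early when the build produced no changes.
          if git diff --quiet site/data.js; then exit 0; fi
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add site/data.js
          git commit -m "chore(site): rebuild data.js"   # hypothetical message
          git push origin HEAD:main
```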

README Counts Drift Detection

The readme-counts-drift job runs only on pull requests and checks for documentation count discrepancies. Unlike the sync job, which fixes them, this job emits warnings when counts are out of sync, allowing reviewers to catch documentation errors before merging.
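One way to build a warn-only check is sketched below. It assumes that scripts/check_readme_counts.py, run without --fix, exits with a non-zero status when counts have drifted. How the real job surfaces its warning is not shown in the article.

```yaml
# Sketch only: the warning mechanism is an assumption.
jobs:
  readme-counts-drift:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: check README counts (warn only)
        run: |
          # Assumes a non-zero exit code means the counts are out of sync.
          if ! python3 scripts/check_readme_counts.py; then
            echo "::warning file=README.md::README counts are out of sync with the catalog"
          fi
```

Because the step swallows the failure and prints a `::warning` workflow command, the pull request stays green while the annotation still appears in the run summary.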

Key Implementation Details

The CI/CD configuration demonstrates several production-quality practices:

  • Conflict Resolution: The workflow implements retry logic with rebasing to handle concurrent modifications
  • Loop Prevention: Checks ensure that bot commits do not re-trigger the workflow infinitely
  • Minimal Credentials: The audit job sets persist-credentials: false during checkout, so the token is not left in the local git configuration for later steps
  • Targeted Triggers: Jobs use conditional logic to run only on specific events (push to main vs. pull_request)
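These practices can be combined with a least-privilege token configuration, sketched below. Whether curriculum.yml declares `permissions` this way is an assumption. The sketch only shows the general pattern of a read-only default with write access granted per job.

```yaml
# Sketch: read-only by default, write access only where a job pushes.
permissions:
  contents: read

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false   # do not keep the token in .git/config
  readme-counts-sync:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    permissions:
      contents: write                  # this job pushes a commit
    steps:
      - uses: actions/checkout@v4
```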

Summary

  • GitHub Actions powers the entire CI/CD infrastructure for rohitg00/ai-engineering-from-scratch
  • The .github/workflows/curriculum.yml file defines four automated jobs: audit, readme-counts-sync, site-rebuild, and readme-counts-drift
  • Python 3.12 validates curriculum structure through scripts/audit_lessons.py
  • Node.js rebuilds the static site data via site/build.js on every push to main
  • The pipeline automatically commits documentation fixes using the default GITHUB_TOKEN with safeguards against infinite loops

Frequently Asked Questions

What triggers the CI/CD pipelines in this repository?

The pipelines trigger on every push to the main branch and on every pull request. Specific jobs like readme-counts-sync and site-rebuild only execute on main branch pushes, while the audit job runs on both push and pull request events.

How does the workflow prevent infinite commit loops?

The implementation checks for actual file changes before committing, using git diff --quiet to exit early if no modifications exist. Additionally, the workflow design ensures that commits made by the github-actions[bot] account do not re-trigger the same automation jobs.
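GitHub itself does not start new workflow runs for pushes made with the default GITHUB_TOKEN, which already breaks most loops. An explicit guard on the commit message is a common second layer. The sketch below reuses the "chore(readme): sync counts" prefix from the workflow. The `if:` expression is a hypothetical illustration, not the repository's actual check.

```yaml
# Hypothetical guard: skip the job when the triggering commit is the bot's own.
jobs:
  readme-counts-sync:
    if: >-
      github.event_name == 'push' &&
      github.ref == 'refs/heads/main' &&
      !startsWith(github.event.head_commit.message, 'chore(readme): sync counts')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
```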

What is the purpose of the scripts/audit_lessons.py file?

This Python script enforces curriculum invariants by validating the structure and integrity of lesson files. It runs as part of the audit job in the CI pipeline, acting as a quality gate that prevents malformed or incomplete curriculum data from being merged into the main branch.

Does the CI/CD pipeline deploy to an external hosting service?

The pipeline handles continuous deployment to the repository itself rather than external hosts. The site-rebuild job regenerates site/data.js and commits the updated file back to the repository.