# CI/CD Pipelines in AI Engineering From Scratch: GitHub Actions Workflow Explained

> Explore CI/CD pipelines in AI engineering with GitHub Actions. This repo automates curriculum validation, doc syncing, and website rebuilds for seamless project updates.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: tutorial
- Published: 2026-06-06

---

**Yes, the AI Engineering From Scratch repository implements a complete CI/CD pipeline using GitHub Actions that automatically validates curriculum integrity, synchronizes documentation counts, and rebuilds the project website on every push to main.**

The rohitg00/ai-engineering-from-scratch project maintains educational content quality through automated continuous integration and delivery (CI/CD) processes. According to the source code, the workflow defined in [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) orchestrates four distinct jobs that run on Ubuntu runners, ensuring that every code change meets curriculum invariants before deployment.

## CI/CD Pipeline Architecture

The repository uses a single workflow file located at [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) to manage all automation. This pipeline triggers on every **push** to the `main` branch and on every **pull request**, providing immediate feedback to contributors.

The infrastructure relies on:
- **`ubuntu-latest`** runners for all jobs
- **Python 3.12** for curriculum validation scripts
- **Node.js** for static site generation
- **GITHUB_TOKEN** for authenticated commits back to the repository
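
The trigger behavior described above could be expressed with a block like the following. This is a minimal sketch, not a copy of the repository's file: the workflow name is an assumption, and the real `curriculum.yml` may add branch or path filters.

```yaml
# Illustrative sketch only; the real curriculum.yml may differ.
name: curriculum            # assumed workflow name
on:
  push:
    branches: [main]        # run on pushes to main
  pull_request:             # run on every pull request
```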

## The Four Automated Jobs

The CI/CD pipeline splits responsibilities across four specialized jobs, each targeting specific maintenance tasks.

### Audit Job: Enforcing Curriculum Invariants

The **audit** job runs on every push and pull request to validate the curriculum structure. It executes [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) to verify that all lessons meet predefined structural requirements.

```yaml
jobs:
  audit:
    name: invariant checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: run scripts/audit_lessons.py
        run: python3 scripts/audit_lessons.py
```

This job acts as a gatekeeper: a failing audit on a pull request flags malformed curriculum data before it is merged into `main`.

### README Counts Sync: Automated Documentation Updates

The **readme-counts-sync** job activates only on pushes to `main`. It runs `scripts/check_readme_counts.py --fix` to regenerate the lesson catalog via [`scripts/build_catalog.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/build_catalog.py) and automatically commits any changes to [`README.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/README.md).

The workflow includes safeguards against infinite loops and handles push conflicts through a retry-and-rebase strategy:

```yaml
- name: sync README counts
  run: python3 scripts/check_readme_counts.py --fix
- name: commit + push if README changed
  env:
    BOT_COMMIT_PREFIX: "chore(readme): sync counts"
  run: |
    if git diff --quiet README.md; then exit 0; fi
    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add README.md
    git commit -m "$BOT_COMMIT_PREFIX"
    # push with retry logic …
```
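
The excerpt above elides the retry code, and this article does not reproduce it. As a generic illustration of the retry-and-rebase pattern, a push step might look like the sketch below. The attempt count, the sleep interval, and the step name are all assumptions rather than the repository's values.

```yaml
# Generic retry-and-rebase pattern; not the repository's actual step.
- name: push with retry (illustrative)
  run: |
    # Try a small, fixed number of times (3 is an arbitrary choice).
    for attempt in 1 2 3; do
      if git push origin HEAD:main; then exit 0; fi
      # The push was rejected, likely because main moved:
      # replay the bot commit on top of the new tip and try again.
      git pull --rebase origin main
      sleep $((attempt * 2))
    done
    echo "push failed after 3 attempts" >&2
    exit 1
```

Rebasing rather than merging keeps the bot commit as a single linear change on top of whatever landed in the meantime.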

### Site Rebuild: Continuous Deployment of Static Assets

The **site-rebuild** job handles continuous deployment for the curriculum website. Triggered exclusively on `main` branch pushes, it executes `node site/build.js` to regenerate [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) and automatically commits the updated file.

This ensures the rendered curriculum website at `/site` always reflects the latest lesson structure without manual intervention.
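
A job with this behavior might be laid out roughly as follows. This is a hedged sketch: the Node.js version, the `permissions` block, and the commit message are assumptions, and the actual job may handle the push differently.

```yaml
# Illustrative sketch, not the repository's actual job definition.
jobs:
  site-rebuild:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    permissions:
      contents: write            # assumed: required to push the regenerated file
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"     # assumed; the source only says "Node.js"
      - name: rebuild site data
        run: node site/build.js
      - name: commit site/data.js if it changed
        run: |
          if git diff --quiet site/data.js; then exit 0; fi
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add site/data.js
          git commit -m "chore(site): rebuild data.js"   # hypothetical message
          git push
```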

### README Counts Drift Detection

The **readme-counts-drift** job runs only on pull requests and checks for documentation count discrepancies. Unlike the sync job, which fixes issues, this job emits warnings when counts are out of sync, allowing reviewers to catch documentation errors before merging.
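
A warn-only check of this kind could be sketched as below. It assumes that `scripts/check_readme_counts.py`, run without `--fix`, exits non-zero when the counts are stale; that exit-code behavior and the warning text are assumptions, not confirmed details of the script.

```yaml
# Illustrative sketch; the exit-code behavior of the script is assumed.
jobs:
  readme-counts-drift:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: warn on README count drift
        run: |
          # Emit an annotation instead of failing the job.
          if ! python3 scripts/check_readme_counts.py; then
            echo "::warning file=README.md::README counts are out of sync"
          fi
```

The `::warning` workflow command surfaces the message as an annotation on the pull request while leaving the job green.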

## Key Implementation Details

The CI/CD configuration demonstrates several production-quality practices:

- **Conflict Resolution**: The workflow implements retry logic with rebasing to handle concurrent modifications
- **Loop Prevention**: Checks ensure that bot commits do not re-trigger the workflow infinitely
- **Credential Handling**: The `audit` job sets `persist-credentials: false` during checkout so the token is not stored in the local git configuration
- **Targeted Triggers**: Jobs use conditional logic to run only on specific events (`push` to main vs. `pull_request`)

## Summary

- **GitHub Actions** powers the entire CI/CD infrastructure for rohitg00/ai-engineering-from-scratch
- The [`.github/workflows/curriculum.yml`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/.github/workflows/curriculum.yml) file defines four automated jobs: **audit**, **readme-counts-sync**, **site-rebuild**, and **readme-counts-drift**
- **Python 3.12** validates curriculum structure through [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py)
- **Node.js** rebuilds the static site via [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) on every push to `main`
- The pipeline automatically commits documentation fixes using the default `GITHUB_TOKEN` with safeguards against infinite loops

## Frequently Asked Questions

### What triggers the CI/CD pipelines in this repository?

The pipelines trigger on every push to the `main` branch and on every pull request. Specific jobs like `readme-counts-sync` and `site-rebuild` only execute on `main` branch pushes, while the `audit` job runs on both push and pull request events.

### How does the workflow prevent infinite commit loops?

The implementation checks for actual file changes before committing, using `git diff --quiet` to exit early if no modifications exist. In addition, GitHub does not start new workflow runs for pushes made with the default `GITHUB_TOKEN`, so commits from the `github-actions[bot]` account do not re-trigger the same automation jobs.
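
One way a workflow can make this guard explicit is to skip runs whose head commit carries the bot's message prefix. The sketch below reuses the `chore(readme): sync counts` prefix shown earlier, but whether the repository uses it in a job-level condition like this is an assumption.

```yaml
# Illustrative guard; how the real workflow uses the prefix is not confirmed.
jobs:
  readme-counts-sync:
    # Run only for pushes to main that were not created by the bot commit.
    if: >-
      github.event_name == 'push' &&
      github.ref == 'refs/heads/main' &&
      !startsWith(github.event.head_commit.message, 'chore(readme): sync counts')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: sync README counts
        run: python3 scripts/check_readme_counts.py --fix
```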

### What is the purpose of the [`scripts/audit_lessons.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/audit_lessons.py) file?

This Python script enforces curriculum invariants by validating the structure and integrity of lesson files. It runs as part of the **audit** job in the CI pipeline, acting as a quality gate that prevents malformed or incomplete curriculum data from being merged into the `main` branch.

### Does the CI/CD pipeline deploy to an external hosting service?

The pipeline handles continuous deployment to the repository itself rather than external hosts. The **site-rebuild** job regenerates [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) and commits