How to Contribute Glossary Terms to the AI Engineering From-Scratch Repository
To contribute a glossary term, fork the repository, append a structured definition to glossary/terms.md, run node site/build.js to validate the index, and submit a pull request following the conventional commit format feat(glossary): add <term>.
The rohitg00/ai-engineering-from-scratch repository maintains a canonical glossary of AI Engineering terminology that powers the curriculum's cross-referencing system. Contributing a new term follows a lightweight, automated workflow designed to keep the 277-term lexicon consistent and searchable. Whether you're clarifying jargon like "Prompt Engineering" or "RAG," the process centers on editing a single markdown file and passing automated CI validation.
Step-by-Step Contribution Workflow
Fork and Branch the Repository
Start by forking the repository and creating a feature branch using the naming convention add-glossary-<term>. Clone your fork locally and switch to the new branch to isolate your changes.
git clone https://github.com/<your-username>/ai-engineering-from-scratch.git
cd ai-engineering-from-scratch
git checkout -b add-glossary-<term>
Append the Term to glossary/terms.md
Open glossary/terms.md and add your term using the established three-part structure. Each term requires an H3 heading followed by specific bullet points explaining common usage, technical definition, and etymology.
### Prompt-Engineering
- **What people say:** "Crafting the best prompt for the model"
- **What it actually means:** The systematic design of input text (system prompt, few-shot examples, chain-of-thought instructions, etc.) to steer an LLM toward a desired behaviour while minimizing hallucination or bias.
- **Why it's called that:** The term borrows from software engineering—just as engineers write code, we "engineer" prompts to obtain reliable outputs.
Place the block at the end of the file, or in its alphabetical position if the surrounding entries are sorted, so the file stays consistent.
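Before opening a PR, it can help to sanity-check a new entry against the three-part structure described above. The following is a minimal sketch, not part of the repository: `checkGlossaryEntry` and `REQUIRED_LABELS` are hypothetical names, and the exact bullet prefix (`- **Label:**`) is assumed from the example entry shown earlier.

```javascript
// Hypothetical helper for illustration; the repository does not ship this function.
// The labels are taken from the example entry; the exact prefix format is an assumption.
const REQUIRED_LABELS = [
  "What people say",
  "What it actually means",
  "Why it's called that",
];

// Returns a list of problems; an empty list means the block looks well-formed.
function checkGlossaryEntry(block) {
  const lines = block
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l !== "");
  const problems = [];

  // The entry must open with an H3 heading such as "### Term".
  if (lines.length === 0 || !/^### \S/.test(lines[0])) {
    problems.push("first line must be an H3 heading (### Term)");
  }

  // Each required bullet must be present and non-empty.
  for (const label of REQUIRED_LABELS) {
    const prefix = `- **${label}:**`;
    const line = lines.find((l) => l.startsWith(prefix));
    if (!line) {
      problems.push(`missing bullet: ${label}`);
    } else if (line.slice(prefix.length).trim() === "") {
      problems.push(`empty bullet: ${label}`);
    }
  }
  return problems;
}

// Sample entry that deliberately omits the third bullet.
const sample = [
  "### Prompt-Engineering",
  '- **What people say:** "Crafting the best prompt for the model"',
  "- **What it actually means:** The systematic design of input text to steer an LLM.",
].join("\n");

console.log(checkGlossaryEntry(sample));
```

For the sample above, the only reported problem is the missing "Why it's called that" bullet. The authoritative check remains the repository's own CI.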
Update Cross-References in Lesson Documentation
According to the automation contract in AGENTS.md, you should reference the new term in any relevant lesson files (typically a lesson's docs/en.md) using wiki-style links like [[TermName]]. This keeps the curriculum internally consistent and lets lessons rely on the glossary for definitions.
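To catch links that point at terms missing from the glossary, a small check along these lines may be useful. This is a sketch under assumptions: `glossaryTerms` and `unresolvedLinks` are hypothetical helpers, and it assumes the text inside `[[...]]` matches the H3 heading exactly, which the repository may or may not require.

```javascript
// Hypothetical helpers for illustration; not taken from the repository.

// Collect every H3 heading ("### Term") from the glossary markdown.
function glossaryTerms(glossaryMarkdown) {
  const terms = new Set();
  for (const line of glossaryMarkdown.split("\n")) {
    const match = /^###\s+(.+?)\s*$/.exec(line);
    if (match) terms.add(match[1]);
  }
  return terms;
}

// Return the [[wiki-style]] link targets in a lesson that have no glossary heading.
// Assumption: link text must equal the heading text exactly.
function unresolvedLinks(lessonMarkdown, terms) {
  const missing = new Set();
  for (const match of lessonMarkdown.matchAll(/\[\[([^\[\]]+)\]\]/g)) {
    const name = match[1].trim();
    if (!terms.has(name)) missing.add(name);
  }
  return [...missing];
}

const glossary = "### RAG\n- **What people say:** ...\n\n### Prompt-Engineering\n- **What people say:** ...";
const lesson = "We combine [[RAG]] with [[Prompt-Engineering]] and [[Fine-Tuning]].";

console.log(unresolvedLinks(lesson, glossaryTerms(glossary)));
```

In this sample only `Fine-Tuning` is reported, because it has no matching heading in the glossary text.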
Validate with the Site Builder
Before committing, run the static site generator to verify your term is indexed correctly. The site/build.js script parses glossary/terms.md and regenerates site/data.js, which serves as the JSON data source for the web UI.
node site/build.js
Check that site/data.js updates without errors. Never manually edit site/data.js, as CI regenerates it automatically after each merge.
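To make the parsing step more concrete, here is a rough sketch of how glossary markdown could be turned into structured data. It is not the actual site/build.js implementation, whose internals and output shape are not described here: `parseGlossary` and the `{ term, fields }` object shape are assumptions chosen for illustration.

```javascript
// Illustrative sketch only; the real site/build.js may work differently.
// Output shape ({ term, fields }) is an assumption, not the format of site/data.js.
function parseGlossary(markdown) {
  const entries = [];
  let current = null;

  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();

    // An H3 heading starts a new entry.
    const heading = /^###\s+(.+)$/.exec(line);
    if (heading) {
      current = { term: heading[1].trim(), fields: {} };
      entries.push(current);
      continue;
    }

    // A "- **Label:** text" bullet is attached to the most recent entry.
    const bullet = /^- \*\*(.+?):\*\*\s*(.*)$/.exec(line);
    if (bullet && current) {
      current.fields[bullet[1]] = bullet[2];
    }
  }
  return entries;
}

const markdown = [
  "### Prompt-Engineering",
  '- **What people say:** "Crafting the best prompt for the model"',
  "- **What it actually means:** The systematic design of input text.",
  "- **Why it's called that:** Borrowed from software engineering.",
].join("\n");

console.log(JSON.stringify(parseGlossary(markdown), null, 2));
```

A parser of this kind silently skips bullets that do not follow the expected prefix, which is one reason malformed entries can go unnoticed until the build output is inspected.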
Commit and Submit Your Pull Request
Follow the one-lesson-per-commit policy (a glossary update counts as one logical change). Use the conventional commit message format feat(glossary): add <term>. Push your branch and open a PR that includes a concise description of the term and any lesson files touched.
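The branch and commit conventions can be generated and checked mechanically. The sketch below is illustrative: `slugify`, `branchName`, `commitMessage`, and `isGlossaryCommit` are hypothetical helpers, and the lowercase-with-hyphens slug rule is an assumption, since the guide only specifies the patterns add-glossary-<term> and feat(glossary): add <term>.

```javascript
// Hypothetical helpers for illustration; not part of the repository.

// Assumption: branch slugs are lowercase with hyphens. The guide only gives "add-glossary-<term>".
function slugify(term) {
  return term
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function branchName(term) {
  return `add-glossary-${slugify(term)}`;
}

function commitMessage(term) {
  return `feat(glossary): add ${term.trim()}`;
}

// Checks only the subject line (first line) of a commit message.
function isGlossaryCommit(message) {
  const subject = message.split("\n")[0];
  return /^feat\(glossary\): add \S.*$/.test(subject);
}

console.log(branchName("Prompt Engineering"));
console.log(commitMessage("Prompt Engineering"));
console.log(isGlossaryCommit("fix: typo"));
```

With the input "Prompt Engineering", this yields the branch `add-glossary-prompt-engineering` and the subject line `feat(glossary): add Prompt Engineering`, while `fix: typo` is rejected.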
CI Automation and Quality Gates
Once you open the pull request, three automated checks must pass before merge:
- audit: Validates lesson structure and markdown syntax
- readme-counts-sync: Ensures documentation statistics are current
- site-rebuild: Regenerates site/data.js to confirm the glossary index is valid
These checks are defined in .github/workflows/curriculum.yml.
Key Files in the Contribution Pipeline
- glossary/terms.md: The central canonical store for all 277+ term definitions.
- site/build.js: The Node.js parser that transforms glossary markdown into structured JSON.
- AGENTS.md: The automation contract specifying that cross-lesson terms must be added to the glossary surface.
- CONTRIBUTING.md: The comprehensive guide to fork, branch, and PR workflows.
- site/data.js: The auto-generated JSON output consumed by the site frontend.
Summary
- Fork the repository and create a branch named add-glossary-<term>
- Append new terms to glossary/terms.md using the "### Term" format with three descriptive bullets
- Run node site/build.js locally to validate the index generation
- Commit using feat(glossary): add <term> and open a PR
- Ensure CI passes the audit, readme-counts-sync, and site-rebuild checks
Frequently Asked Questions
What file should I edit to add a new glossary term?
You must edit glossary/terms.md directly. This file serves as the single source of truth for the curriculum's canonical glossary. Do not edit site/data.js, as it is auto-generated by the CI pipeline.
How do I format a new glossary entry to match the existing style?
Use an H3 heading for the term name, followed by three bullet points: "What people say," "What it actually means," and "Why it's called that." This structure ensures consistency across all 277 terms and enables proper parsing by site/build.js.
What CI checks validate my glossary contribution?
Your pull request must pass the audit, readme-counts-sync, and site-rebuild jobs defined in .github/workflows/curriculum.yml. The site-rebuild check specifically verifies that node site/build.js executes successfully and that the generated site/data.js contains your new term.
Should I manually update site/data.js when adding a term?
No. The site/data.js file is automatically regenerated by the site-rebuild CI job after your PR is merged. Manually editing it will cause conflicts and fail the build validation.