# How `glossary/terms.md` Powers Consistent Terminology Across Lessons in AI Engineering from Scratch

> Learn how glossary/terms.md ensures consistent terminology in AI Engineering from Scratch. Discover how site/build.js creates a GLOSSARY constant for unified lesson definitions.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: internals
- Published: 2026-06-07

---

**[`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) serves as the single source of truth for all canonical definitions in the `rohitg00/ai-engineering-from-scratch` curriculum, parsed at build time by [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) into a `GLOSSARY` constant that every lesson page consumes for unified terminology.**

In the [`rohitg00/ai-engineering-from-scratch`](https://github.com/rohitg00/ai-engineering-from-scratch) repository, [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) is the centralized dictionary that synchronizes technical language across 435 lessons. Rather than rewriting explanations in every module, authors reference terms from this file, ensuring concepts like **Agent** and **Attention** carry identical, version-controlled definitions wherever they appear. This architecture is maintained through automated build pipelines and strict contribution rules that prevent drift.

## The Canonical Source in [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md)

The file at [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) contains a markdown list of every canonical term used throughout the curriculum.
Each entry follows a strict three-section template that separates colloquial usage from precise meaning:

```markdown
### <Term Name>
- **What people say:** "<informal description>"
- **What it actually means:** <precise definition>
- **Why it's called that:** <origin of the name>
```

Early entries in the file define foundational concepts such as **Agent**, **Attention**, and **Adam** using this exact structure, making [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) the authoritative starting point for how terminology is introduced to learners.

## Build-Time Extraction in [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js)

When the site is generated, the build script at [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) reads [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) and executes a function named `parseGlossary`. This parser walks the file line by line and converts each term into a structured JavaScript object with three fields:

```js
{
  term: "Agent",
  says: "An autonomous AI that thinks and acts on its own",
  means: "A while loop where an LLM decides what tool to call next, executes it, sees the result, and repeats"
}
```

The resulting array is stored as the `GLOSSARY` constant inside the auto-generated file [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js). Because this happens at build time, the front-end receives a static, queryable data structure rather than parsing raw markdown in the browser.

## Cross-Lesson Consistency and Policy Enforcement

Lesson authors do not rewrite definitions. Instead, they reference a canonical term by name, and the build process guarantees that [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) remains the only source of truth.
This design delivers two critical outcomes:

- **Uniform wording:** Every lesson that mentions a canonical term shares the exact same explanation.
- **Instant propagation:** Updating a definition in [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) and rebuilding automatically refreshes tooltips, links, and glossary page entries across all 435 lessons.

The repository’s contribution guidelines in [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) explicitly codify this workflow:

> *“When introducing a term used by more than one lesson, add it to [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md).”*

This policy prevents duplicated or diverging explanations and keeps the curriculum coherent as it scales.

## Runtime Discovery and [`llms.txt`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/llms.txt) Generation

Beyond powering the static site, the `GLOSSARY` constant in [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) is consumed by the `writeLlms` script to generate [`llms.txt`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/llms.txt). That script embeds a count of glossary terms, creating a machine-readable summary that external agents can read to discover what the curriculum covers. Thus, [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) supports both human readers and automated tooling through a single parsed output.
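To make the flow from markdown entry to `GLOSSARY` array to term count concrete, here is a minimal sketch. It is not the repository's code: `parseGlossarySketch`, `glossarySummaryLine`, the regular expressions, and the `Glossary terms: N` output line are all assumptions made for illustration, and the actual `parseGlossary` and `writeLlms` in `site/build.js` may work differently.

```js
// Illustrative sketch only: NOT the repository's parseGlossary or writeLlms.

// Remove one pair of surrounding double quotes, if present.
function stripQuotes(text) {
  const t = text.trim();
  return t.length >= 2 && t.startsWith('"') && t.endsWith('"')
    ? t.slice(1, -1)
    : t;
}

// Hypothetical parser: walks the markdown line by line and collects
// { term, says, means } objects, mirroring the shape shown in the article.
function parseGlossarySketch(markdown) {
  const entries = [];
  let current = null;

  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();

    // An H3 heading ("### Term") starts a new entry.
    const heading = line.match(/^###\s+(.+)$/);
    if (heading) {
      current = { term: heading[1].trim(), says: "", means: "" };
      entries.push(current);
      continue;
    }
    if (!current) continue; // skip anything before the first term

    // Template bullets look like: - **Label:** value
    const bullet = line.match(/^-\s+\*\*(.+?):\*\*\s*(.*)$/);
    if (!bullet) continue;

    const [, label, value] = bullet;
    if (label === "What people say") current.says = stripQuotes(value);
    else if (label === "What it actually means") current.means = stripQuotes(value);
    // "Why it's called that" is ignored here because the object shown in
    // the article carries only term, says, and means.
  }
  return entries;
}

// Hypothetical summary line of the kind an llms.txt generator might embed.
function glossarySummaryLine(glossary) {
  return `Glossary terms: ${glossary.length}`;
}

// Sample input assembled from the entries quoted in this article.
const sampleMarkdown = `
### Agent
- **What people say:** "An autonomous AI that thinks and acts on its own"
- **What it actually means:** "A while loop where an LLM decides what tool to call next, executes it, sees the result, and repeats"

### KV Cache
- **What people say:** "Makes inference faster"
- **What it actually means:** "During autoregressive generation, caching the key and value matrices from previous tokens so you don't recompute them at each step."
- **Why it's called that:** "The cache stores the K (key) and V (value) tensors for reuse."
`;

const glossarySketch = parseGlossarySketch(sampleMarkdown);
console.log(glossarySketch.map((entry) => entry.term)); // the two term names
console.log(glossarySummaryLine(glossarySketch)); // Glossary terms: 2
```

Keeping the parser line-oriented means a malformed bullet is simply skipped rather than aborting the build. Whether the real script is that lenient is not something this article establishes.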
## Working with the Glossary: Code Examples

### Adding a New Term

To introduce a concept such as **KV Cache**, an author appends the standard template directly to [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md):

```markdown
### KV Cache
- **What people say:** "Makes inference faster"
- **What it actually means:** "During autoregressive generation, caching the key and value matrices from previous tokens so you don't recompute them at each step."
- **Why it's called that:** "The cache stores the K (key) and V (value) tensors for reuse."
```

Committing this change and running `npm run build` (or waiting for CI) triggers `parseGlossary` in [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) and regenerates [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) with the new entry.

### Accessing Parsed Data in a Client Script

Any front-end module can import the compiled definitions from the generated data file:

```js
// Assume site/data.js has been generated by the build and exports GLOSSARY
import { GLOSSARY } from './data.js';

// Find the definition for "Agent"
const agentEntry = GLOSSARY.find(t => t.term === 'Agent');

// Optional chaining avoids a TypeError if the term is missing
console.log(agentEntry?.means);
// → "A while loop where an LLM decides what tool to call next, executes it, sees the result, and repeats"
```

### Rendering a Tooltip Inside a Lesson

A lesson page can annotate a term with a `data-term` attribute and hydrate a tooltip from the canonical store:

```html
<span data-term="Agent">Agent</span>
```

When the page loads, the script queries `GLOSSARY` and renders a two-line tooltip that remains synchronized with the master definition in [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md).
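The hydration step described above could look roughly like the sketch below. The function names (`buildTermIndex`, `tooltipText`, `hydrateTooltips`), the case-insensitive lookup, the tooltip wording, and the use of the native `title` attribute are all assumptions for illustration; the site's actual tooltip script is not shown in this article and may be implemented differently.

```js
// Illustrative sketch only: the real site script may differ.

// Hypothetical sample standing in for the generated GLOSSARY array.
const GLOSSARY_SAMPLE = [
  {
    term: "Agent",
    says: "An autonomous AI that thinks and acts on its own",
    means: "A while loop where an LLM decides what tool to call next, executes it, sees the result, and repeats",
  },
];

// Build a lookup once so each annotated element is resolved without
// rescanning the array. Lowercasing the key is a design choice of this sketch.
function buildTermIndex(glossary) {
  const index = new Map();
  for (const entry of glossary) {
    index.set(entry.term.toLowerCase(), entry);
  }
  return index;
}

// Return two lines of tooltip text, or null when the term is unknown.
function tooltipText(index, termName) {
  const entry = index.get(String(termName).trim().toLowerCase());
  if (!entry) return null;
  return `People say: ${entry.says}\nActually means: ${entry.means}`;
}

// Browser-only step: copy the text into the title of every element
// that carries a data-term attribute.
function hydrateTooltips(root, index) {
  for (const el of root.querySelectorAll("[data-term]")) {
    const text = tooltipText(index, el.dataset.term);
    if (text) el.title = text;
  }
}

const termIndex = buildTermIndex(GLOSSARY_SAMPLE);
if (typeof document !== "undefined") {
  hydrateTooltips(document, termIndex); // running in a browser
} else {
  console.log(tooltipText(termIndex, "agent")); // running under Node.js
}
```

Separating the pure `tooltipText` function from the DOM step keeps the lookup logic checkable outside a browser, and unknown terms are left untouched rather than rendered with an empty tooltip.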
## Key Files in the Glossary Pipeline

Several files cooperate to turn the markdown master list into a curriculum-wide terminology layer:

- **[`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md)**: The master list of all curriculum terms in strict markdown format.
- **[`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js)**: Executes `parseGlossary` to read [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) and write the compiled dataset.
- **[`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js)** *(generated)*: Exports the `GLOSSARY` array consumed by the front-end and by `writeLlms`.
- **[`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md)**: Contribution guidelines that mandate adding shared terms to [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md).
- **[`site/glossary.html`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/glossary.html)**: Renders the searchable glossary page using the imported `GLOSSARY` constant.

## Summary

- [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) stores every canonical definition for the AI Engineering from Scratch curriculum using a rigid three-part markdown template.
- At build time, [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) runs `parseGlossary` to convert the markdown into the `GLOSSARY` JavaScript array inside [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js).
- Lesson pages consume this constant for tooltips, links, and the searchable [`site/glossary.html`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/glossary.html) page, ensuring identical wording across all 435 lessons.
- The [`AGENTS.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/AGENTS.md) contribution policy requires authors to centralize new terms, preventing explanation drift.
- The same `GLOSSARY` structure feeds the `writeLlms` script for machine-readable [`llms.txt`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/llms.txt) generation.

## Frequently Asked Questions

### What is the purpose of [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) in AI Engineering from Scratch?

[`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) is the single source of truth for technical definitions used across the entire curriculum. It stores each term in a standardized three-section format so that concepts like **Agent** or **KV Cache** are explained identically in every lesson.

### How does updating [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md) affect existing lessons?

Because [`site/build.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/build.js) reparses the file into the `GLOSSARY` constant during each build, any edit to a definition automatically propagates to all lesson tooltips, glossary links, and the searchable glossary page. Authors never need to manually update individual lesson files.

### What is the required format for new glossary entries?
Every term must follow the template established in [`glossary/terms.md`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/glossary/terms.md): an H3 heading for the term name, followed by three bullet lines labeled **What people say**, **What it actually means**, and **Why it's called that**. This strict structure allows `parseGlossary` to split the entry into machine-readable fields.

### Where is the parsed glossary data consumed besides the website?

In addition to rendering the glossary page and lesson tooltips, the generated `GLOSSARY` array in [`site/data.js`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/site/data.js) is used by the `writeLlms` script to populate [`llms.txt`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/llms.txt) with term counts and metadata, enabling external agents to discover curriculum concepts programmatically.