
How Reusable AI Artifacts Are Structured in ai-engineering-from-scratch

June 10, 2026 rohitg00/ai-engineering-from-scratch

The ai-engineering-from-scratch repository stores prompts, agents, and MCP server definitions as static markdown files under the top-level outputs/ directory, keeps skills in lesson-level outputs/ folders, and uses a centralized outputs/index.json catalog for machine-readable discovery.

All reusable AI artifacts in this curriculum are treated as version-controlled plaintext outputs. The repository at rohitg00/ai-engineering-from-scratch organizes these resources into a strict directory hierarchy that separates concerns while maintaining discoverability through a generated index. This architecture ensures that AI components remain diffable, importable, and reproducible across different lessons and projects.

Centralized Storage Architecture

The Outputs Directory Convention

The repository maintains a top-level outputs/ folder that acts as the single source of truth for shared static artifacts, with one subfolder per artifact type. Lesson-specific artifacts live in an outputs/ folder inside each lesson directory. According to the source code structure, the layout is:

outputs/prompts/: Stores prompt templates as .md files containing exact model instructions
outputs/agents/: Contains agent definitions describing roles, tools, and workflows
outputs/mcp-servers/: Holds Model Context Protocol (MCP) server specifications
<lesson-path>/outputs/: Lesson-specific directories housing skills and generated traces

Each folder includes .gitkeep files to preserve directory structure in version control when empty.
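The directory convention above can be explored with a few lines of standard-library Python. This is a minimal sketch, not code from the repository: the function name list_artifacts and the ARTIFACT_DIRS labels are hypothetical, and it assumes every artifact in these folders is a .md file.

```python
from pathlib import Path

# Folder names come from the layout described above; the dictionary keys
# are illustrative labels, not identifiers taken from the repository.
ARTIFACT_DIRS = {
    "prompt": "prompts",
    "agent": "agents",
    "mcp-server": "mcp-servers",
}


def list_artifacts(repo_root: Path) -> dict[str, list[Path]]:
    """Return the markdown files found under each outputs/<type>/ folder."""
    found: dict[str, list[Path]] = {}
    for kind, folder in ARTIFACT_DIRS.items():
        directory = repo_root / "outputs" / folder
        # A missing folder, or one holding only .gitkeep, yields an empty list.
        found[kind] = sorted(directory.glob("*.md")) if directory.is_dir() else []
    return found
```

Calling list_artifacts(Path(".")) from the repository root would return one sorted list of paths per artifact type.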

Catalog-Driven Discovery

The outputs/index.json file serves as the machine-readable registry for all artifacts. As implemented in the repository, this catalog is generated by the CI step site/build.js and contains structured entries mapping artifact types, slugs, file paths, and descriptions. This design allows the site generator to render the "Artifacts" view and enables programmatic lookup without scanning the filesystem directly.
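To make the shape of a catalog entry concrete, the sketch below models the four fields named above (type, slug, path, description) and reports which ones an entry lacks. The exact key names and the CatalogEntry and missing_keys identifiers are assumptions for illustration; the real schema is whatever site/build.js emits.

```python
from typing import TypedDict


class CatalogEntry(TypedDict):
    """Assumed shape of one outputs/index.json entry (key names are a guess)."""
    type: str          # e.g. "prompt", "skill", "agent"
    slug: str          # short identifier for the artifact
    path: str          # location of the markdown file
    description: str   # human-readable summary


REQUIRED_KEYS = ("type", "slug", "path", "description")


def missing_keys(entry: dict) -> list[str]:
    """List the assumed required keys that are absent from an entry."""
    return [key for key in REQUIRED_KEYS if key not in entry]
```

A check like this is useful before relying on the catalog in scripts, since an entry with a missing path would otherwise fail later with a less helpful KeyError.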

Types of Reusable AI Artifacts

Prompts

Prompts are plain-text markdown files stored in outputs/prompts/. These files contain the exact wording sent to language models, often including placeholder syntax for dynamic content injection. The format remains human-editable while being directly loadable by downstream Python scripts.
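The article does not specify the placeholder syntax, so the following is only a sketch under the assumption that placeholders are written as {{name}}. The render_prompt helper is hypothetical and would need its pattern adjusted to match whatever the prompt files actually use.

```python
import re

# Assumption: placeholders look like {{name}}; the repository may differ.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, values: dict[str, str]) -> str:
    """Substitute every {{name}} placeholder, failing loudly on missing values."""

    def replace(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder '{key}'")
        return values[key]

    return PLACEHOLDER.sub(replace, template)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps a half-filled template from being sent to a model unnoticed.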

Skills

Skills represent reusable capabilities defined at the lesson level. Each skill resides in <lesson-dir>/outputs/skill-<slug>.md and combines a human-readable description with an optional JSON payload. For instance, the end-to-end safety gate lesson generates phases/19-capstone-projects/87-end-to-end-safety-gate/outputs/skill-end-to-end-safety-gate.md alongside structured trace data in gate_trace.json, allowing consumers to import both the prompt logic and execution metadata.

Agents and MCP Servers

Higher-level constructs reside in outputs/agents/ and outputs/mcp-servers/. Agent definitions describe autonomous entities with specified roles, available toolsets, and orchestration workflows. MCP server files specify Model Context Protocol (MCP) server configurations used for registry operations and checkpoint demonstrations. Both follow the same plaintext markdown convention for consistency.

Loading Artifacts in Python

Reading Prompts from Disk

To load a prompt template directly from the outputs/prompts/ directory:

import pathlib

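# Assumes this script sits two directory levels below the repository root;
# resolve() makes the lookup safe when __file__ is a relative path.
REPO_ROOT = pathlib.Path(__file__).resolve().parents[2]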

def load_prompt(name: str) -> str:
    """Read a prompt markdown file from `outputs/prompts/`."""
    prompt_path = REPO_ROOT / "outputs" / "prompts" / f"{name}.md"
    return prompt_path.read_text(encoding="utf-8")

system_prompt = load_prompt("assistant-intro")

Importing Skills with Metadata

Skills often include accompanying JSON traces. Load both components using the skill slug:

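import json
import pathlib

# Assumes this script sits two directory levels below the repository root.
REPO_ROOT = pathlib.Path(__file__).resolve().parents[2]

def load_skill(lesson_dir: str, skill_slug: str, trace_name: str = "gate_trace.json"):
    """Return a skill's markdown description and its JSON trace (None if absent)."""
    outputs = REPO_ROOT / lesson_dir / "outputs"

    description = (outputs / f"skill-{skill_slug}.md").read_text(encoding="utf-8")

    # The JSON payload is optional, so tolerate a missing trace file.
    trace_path = outputs / trace_name
    trace = json.loads(trace_path.read_text(encoding="utf-8")) if trace_path.exists() else None

    return description, trace

desc, trace = load_skill(
    "phases/19-capstone-projects/87-end-to-end-safety-gate",
    "end-to-end-safety-gate",
)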
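Because skills are scattered across lesson folders, it can help to discover them all at once instead of naming one lesson at a time. The sketch below is an illustration, not repository code: discover_skills is a hypothetical helper, and it assumes every lesson sits somewhere below phases/ and follows the skill-<slug>.md naming convention.

```python
from pathlib import Path


def discover_skills(repo_root: Path) -> dict[str, Path]:
    """Map each skill slug to its markdown file under phases/**/outputs/."""
    skills: dict[str, Path] = {}
    # Assumption: lessons live below phases/ at any depth. If two lessons
    # reuse a slug, the path that sorts last wins in this simple version.
    for path in sorted(repo_root.glob("phases/**/outputs/skill-*.md")):
        slug = path.stem[len("skill-"):]
        skills[slug] = path
    return skills
```

Given the lesson mentioned above, the returned mapping would contain the key end-to-end-safety-gate pointing at that lesson's skill file.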

Querying the Artifact Catalog

Enumerate available resources by parsing the centralized index:

import json
import pathlib

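# Assumes this script sits two directory levels below the repository root.
catalog_path = pathlib.Path(__file__).resolve().parents[2] / "outputs" / "index.json"
# Assumes the catalog is a JSON array of entry objects.
catalog = json.loads(catalog_path.read_text(encoding="utf-8"))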

prompts = [a for a in catalog if a["type"] == "prompt"]
for p in prompts:
    print(f"- {p['slug']} at {p['path']}")
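Once the catalog is parsed, a common next step is to look up a single artifact and read its file. The following sketch assumes the catalog is a list of dictionaries with type, slug, and path keys, and that each path is relative to the repository root; neither assumption is confirmed by the source, and find_artifact and read_artifact are hypothetical names.

```python
from pathlib import Path
from typing import Optional


def find_artifact(catalog: list[dict], kind: str, slug: str) -> Optional[dict]:
    """Return the first entry matching both type and slug, or None."""
    for entry in catalog:
        if entry.get("type") == kind and entry.get("slug") == slug:
            return entry
    return None


def read_artifact(repo_root: Path, entry: dict) -> str:
    """Read the markdown file an entry points to.

    Assumption: entry["path"] is relative to the repository root.
    """
    return (repo_root / entry["path"]).read_text(encoding="utf-8")
```

Matching on both type and slug avoids picking up, say, an agent that happens to share a slug with a prompt.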

Design Principles

The artifact structure follows four core constraints defined in AGENTS.md:

Plain-text storage — Markdown files ensure version control compatibility and human readability
Lesson-level ownership — Each lesson writes exclusively to its own outputs/ subdirectory, preventing cross-contamination
Separation of concerns — Distinct namespaces (prompts/, agents/, mcp-servers/) prevent naming collisions
Static generation — Artifacts are committed as static files rather than generated at runtime, guaranteeing reproducibility
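One way to read the separation-of-concerns constraint is that a slug only has to be unique within its artifact type. The sketch below checks a parsed catalog for that property. It is an illustration under the assumption that entries carry type and slug keys; duplicate_slugs is a hypothetical helper, not part of the repository's CI.

```python
from collections import Counter


def duplicate_slugs(catalog: list[dict]) -> list[tuple[str, str]]:
    """Return (type, slug) pairs that occur more than once in the catalog."""
    counts = Counter((entry["type"], entry["slug"]) for entry in catalog)
    # The same slug under two different types is not counted as a collision.
    return sorted(pair for pair, count in counts.items() if count > 1)
```

An empty result means every artifact can be addressed unambiguously by its type and slug.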

Summary

Reusable AI artifacts in this repository are stored as static markdown files, with prompts, agents, and MCP servers under the top-level outputs/ directory
The outputs/index.json catalog provides machine-readable metadata for all prompts, skills, agents, and servers
Skills combine markdown descriptions with JSON payloads in lesson-specific output folders like phases/19-capstone-projects/87-end-to-end-safety-gate/outputs/
Plain-text storage ensures artifacts remain diffable, version-controlled, and accessible via standard file I/O
The architecture separates artifact types into dedicated subdirectories while maintaining a unified discovery interface through the CI-generated catalog

Frequently Asked Questions

What file format does the repository use for prompts and agents?

Prompts, agents, MCP servers, and skill descriptions all use markdown (.md) files. This format stores exact prompt text, agent role descriptions, and server configurations as human-readable plaintext that remains diffable in version control and can be read by Python code with ordinary file I/O, without a dedicated parser.

How does the repository handle artifact discovery programmatically?

Consumers query the outputs/index.json file, which serves as a centralized catalog. Generated by the CI pipeline via site/build.js, this JSON file maps artifact slugs to their filesystem paths and metadata types, eliminating the need for filesystem crawling or complex import logic.

What distinguishes a skill from a standard prompt in this structure?

While prompts in outputs/prompts/ contain static text templates, skills reside in lesson-specific outputs/ directories as skill-<slug>.md files and often include accompanying JSON payloads such as gate_trace.json. Skills represent executable capabilities produced by specific curriculum lessons rather than generic reusable templates.

Where can I find the specification for how artifacts should be organized?

The repository's organizational policy is documented in AGENTS.md at the repository root. This file defines the directory conventions, naming schemes for skill files, and the CI process that aggregates individual lesson outputs into the global outputs/index.json catalog.
