# How install_skills.py Manages Artifact Versioning and Tagging

> Discover how install_skills.py manages artifact versioning and tagging by leveraging YAML front-matter and generated manifests for efficient version tracking and filtering.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: how-to-guide
- Published: 2026-06-09

---

**The [`install_skills.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/install_skills.py) script does not implement its own version-control system; instead, it propagates `version` and `tags` metadata declared in YAML front-matter of artifact files, enabling tag-based filtering and version tracking through generated manifests.**

The [`install_skills.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/install_skills.py) script in the `rohitg00/ai-engineering-from-scratch` repository handles artifact versioning and tagging by extracting metadata from YAML front-matter rather than implementing complex version-control logic. By scanning markdown files in `phases/**/outputs/`, the script captures version strings and tag lists declared by authors, enabling precise filtering and cataloging of skills, prompts, and agents.

## How Metadata Discovery Drives Artifact Versioning

The implementation spans three critical components:

- **[`scripts/install_skills.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/install_skills.py)**: The main driver that discovers artifacts, extracts `version` and `tags`, filters by criteria, copies files, and writes the manifest.
- **[`scripts/_lib.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/_lib.py)**: Contains the `parse_frontmatter` helper that extracts YAML metadata from markdown files.
- **`phases/**/outputs/*-*.md`**: The source markdown artifacts that declare `version:` and `tags:` in their front-matter.

### Parsing YAML Front-Matter from Artifact Files

While scanning `phases/**/outputs/*-*.md`, the script reads each file and extracts the front-matter via `parse_frontmatter`. The `version` field is captured as a plain string using `str(meta.get("version", ""))` on lines 124-126. For the `tags` field, the script ensures list semantics with `list(tags_raw) if isinstance(tags_raw, list) else []`, defaulting to an empty list when the metadata is absent or malformed.

### The Artifact Dataclass Structure

Each discovered artifact is represented by an `Artifact` dataclass defined in [`scripts/install_skills.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/scripts/install_skills.py) (lines 47-56). This data model stores the extracted `version` and `tags` alongside the artifact's type, name, phase, lesson, description, and source path. By encapsulating metadata in a structured object, the script maintains type safety throughout the installation pipeline.

## Tag-Based Filtering and Selection

### Implementing the --tag Filter

The `filter_artifacts` function (lines 149-152) implements the CLI's `--tag TAG` option by checking if the supplied `tag_filter` exists within each artifact's `tags` list. This allows users to narrow installations to specific categories such as "foundation" without manually curating file lists.

### Version Propagation During Copy Operations

When artifacts pass the filtering stage, the script copies them to the target directory while preserving their metadata. The `version` field remains immutable throughout this process, ensuring that installed artifacts carry their declared version identifiers regardless of the installation timestamp or target environment.

## Manifest Generation and Version Tracking

After copying selected artifacts, the `write_manifest` function (lines 58-68) generates a [`manifest.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/manifest.json) file in the target directory. Each entry includes the artifact's `version` and `tags` via `Artifact.to_dict`, creating a permanent record of exactly which versions were installed. This enables downstream tools to audit deployments and manage dependencies accurately.

## Practical Usage Examples

The following commands demonstrate how to leverage artifact versioning and tagging in practice:

```bash

# Install only skill artifacts tagged as "foundation"

python3 scripts/install_skills.py ./my_skills \
    --type skill \
    --tag foundation \
    --layout flat

```

This command filters the discovered artifacts to those containing the "foundation" tag before copying them to `./my_skills`.

```bash

# Dry-run to preview which artifacts would be copied, preserving their versions

python3 scripts/install_skills.py ./tmp \
    --type all \
    --dry-run

```

Use `--dry-run` to validate your tag and type filters without modifying the filesystem.

```bash

# Force-overwrite existing files and generate a versioned manifest

python3 scripts/install_skills.py ./dest \
    --layout by-phase \
    --force

```

The `--force` flag ensures that existing files are replaced, while the generated manifest captures the precise version metadata for each installed artifact.

## Summary

- **Metadata-driven approach**: The script reads `version` and `tags` from YAML front-matter in `phases/**/outputs/*-*.md` files rather than implementing internal version control.
- **Structured data model**: The `Artifact` dataclass encapsulates version and tag information alongside file metadata.
- **Flexible filtering**: The `--tag` CLI option leverages `filter_artifacts` to select only artifacts matching specific tags.
- **Audit trail**: The `write_manifest` function generates [`manifest.json`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/manifest.json) containing exact version strings and tag lists for every installed artifact.

## Frequently Asked Questions

### Does install_skills.py enforce semantic versioning?

No, the script treats the `version` field as an opaque string through `str(meta.get("version", ""))`. It does not parse or validate semantic versioning schemes; it simply records whatever version identifier the author declared in the YAML front-matter.

### What happens if an artifact file lacks a tags field?

If the `tags` field is missing or not a list, the script defaults to an empty list using `list(tags_raw) if isinstance(tags_raw, list) else []`. This ensures that filtering operations still function correctly, though the artifact will not match any specific `--tag` queries.

### How does the manifest.json support downstream tooling?

The `write_manifest` function creates a JSON file where each entry includes the artifact's `version` and `tags` via `Artifact.to_dict`. This allows CI/CD pipelines or other automation tools to verify which specific versions were installed and track dependencies across different environments.

### Can I filter by multiple tags simultaneously?

The current implementation in `filter_artifacts` (lines 149-152) supports filtering by a single tag via the `--tag` parameter. To filter by multiple tags, you would need to run the script separately for each tag or modify the filtering logic to accept multiple tag arguments.