# How verify-pipeline.mjs Performs Health Checks on the Career-Ops Job Tracker

> Learn how verify-pipeline.mjs performs health checks on the Career-Ops job tracker. This script validates job data integrity, ensuring accurate reporting and application status.

- Repository: [Santiago Fernández de Valderrama/career-ops](https://github.com/santifer/career-ops)
- Tags: how-to-guide
- Published: 2026-06-07

---

**`verify-pipeline.mjs` parses the [`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md) job tracker, runs seven deterministic validations ranging from canonical status verification to report-link existence, and exits with code `1` when any error is detected.**

The `santifer/career-ops` repository relies on `verify-pipeline.mjs` to safeguard the integrity of its job-tracking pipeline. This Node.js script reads the tracker file and related auxiliary resources, then applies a strict set of health checks to catch duplicates, malformed rows, and broken references before they propagate downstream.

## File Discovery and Initial Parsing

Before any validation runs, `verify-pipeline.mjs` resolves the path to the tracker file (lines 22‑28). It first checks for [`data/applications.md`](https://github.com/santifer/career-ops/blob/main/data/applications.md), falls back to a root-level [`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md), and respects the `CAREER_OPS_TRACKER` environment variable if set.
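The lookup described above can be sketched as a small pure function. This is an illustration, not the script's actual code: `resolveTrackerPath` is a hypothetical name, the `exists` predicate is injected so the sketch needs no imports, and the assumption that the environment override wins over both default locations is mine, since the text does not state the precedence.

```javascript
// Hypothetical helper: choose the tracker path from the candidate locations.
// `exists` is any predicate such as fs.existsSync, passed in by the caller.
function resolveTrackerPath(root, env, exists) {
  // Assumption: an explicit override takes precedence over the defaults.
  if (env.CAREER_OPS_TRACKER) return env.CAREER_OPS_TRACKER;

  const candidates = [`${root}/data/applications.md`, `${root}/applications.md`];
  // Return the first candidate that exists, or null when neither is present.
  return candidates.find((p) => exists(p)) ?? null;
}
```

In a real script the call would look like `resolveTrackerPath(process.cwd(), process.env, fs.existsSync)`.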

Once located, the script reads the tracker as a UTF‑8 string, splits it into lines, and parses every line beginning with `|` into a structured record object (lines 71‑83). Each record exposes the following fields: `num`, `date`, `company`, `role`, `score`, `status`, `pdf`, `report`, and `notes`. Header and divider lines are automatically skipped during this phase.
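A minimal sketch of this parsing step follows. It assumes the columns appear in the same order as the field list above and that every data row starts with an integer `num`; that second assumption is how the sketch skips the header and divider rows, and the real script may detect them differently. `parseTracker` is a hypothetical name.

```javascript
// Field names as listed in the article; the column order is assumed to match.
const FIELDS = ['num', 'date', 'company', 'role', 'score', 'status', 'pdf', 'report', 'notes'];

function parseTracker(text) {
  const records = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (!line.startsWith('|')) continue; // not a table row

    // Remove the outer pipes, then split the remaining text into trimmed cells.
    const cells = line.replace(/^\|/, '').replace(/\|$/, '').split('|').map((c) => c.trim());

    // Assumption: data rows begin with an integer, so header and divider rows are skipped.
    if (!/^\d+$/.test(cells[0])) continue;

    // Missing trailing cells become empty strings.
    records.push(Object.fromEntries(FIELDS.map((name, i) => [name, cells[i] ?? ''])));
  }
  return records;
}
```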

## The Seven Core Health Checks

After parsing, the script runs its checks in sequence. Errors are logged with a red ❌ prefix, warnings with ⚠️, and passing checks with ✅. The counters for `errors` and `warnings` are incremented accordingly.

### Canonical Status Validation

The first check ensures every entry’s `status` column contains an allowed canonical value or a known alias defined in [`templates/states.yml`](https://github.com/santifer/career-ops/blob/main/templates/states.yml) (lines 87‑112). It also forbids markdown bold syntax and embedded dates inside the status field, keeping the column clean for downstream processing.
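The three rules in this check (no bold, no embedded date, known value) might be expressed as below. The allowed values are hypothetical placeholders, since the real list lives in `templates/states.yml` and is not reproduced in this article; the case-insensitive match and the ISO date pattern are also assumptions.

```javascript
// Hypothetical canonical states and aliases, lowercased for comparison.
// The real values are defined in templates/states.yml.
const ALLOWED_STATUSES = new Set(['evaluated', 'applied', 'interview', 'offer', 'rejected', 'sent']);

// Returns null when the status is acceptable, otherwise a short reason.
function checkStatus(status) {
  if (status.includes('**')) return 'contains markdown bold';
  // Assumption: embedded dates look like YYYY-MM-DD.
  if (/\d{4}-\d{2}-\d{2}/.test(status)) return 'contains a date';
  return ALLOWED_STATUSES.has(status.trim().toLowerCase()) ? null : 'non-canonical status';
}
```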

### Duplicate Detection

To prevent redundant entries, the script builds a lookup map keyed by `company` plus `role` (lines 113‑129). Both values are lowercased and stripped to alphanumeric characters before they are combined into the key. Any key with more than one matching row is flagged as a possible duplicate.
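The grouping logic can be illustrated with a `Map`. The function names and the `::` separator inside the key are illustrative choices, not taken from the script.

```javascript
// Lowercase the value and keep only ASCII letters and digits.
function normalize(value) {
  return value.toLowerCase().replace(/[^a-z0-9]/g, '');
}

// Group records by normalized company + role and return groups with 2+ rows.
function findDuplicates(records) {
  const groups = new Map();
  for (const record of records) {
    const key = `${normalize(record.company)}::${normalize(record.role)}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(record);
  }
  return [...groups.values()].filter((group) => group.length > 1);
}
```

Because punctuation and case are discarded, `Acme Corp` and `ACME corp.` land in the same group, which is why the result is reported as a possible duplicate and not a certain one.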

### Report-Link Sanity Checks

For each row, the script extracts the markdown hyperlink from the `report` column and verifies that the referenced file exists (lines 130‑146). The lookup checks the path relative to the tracker directory and also from the repository root, accommodating legacy link structures.
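A rough sketch of the two-location lookup is shown below. `checkReportLink` is a hypothetical name, the `exists` predicate is injected (for example `fs.existsSync`), and paths are joined with plain string concatenation for brevity where real code would more likely use `path.join`. Treating a cell without a link as nothing to check is an assumption.

```javascript
// Returns null when the report is found (or there is no link), else a message.
function checkReportLink(reportCell, trackerDir, repoRoot, exists) {
  // Capture the target of a markdown link of the form [text](target).
  const match = reportCell.match(/\[[^\]]*\]\(([^)]+)\)/);
  if (!match) return null; // assumption: cells without a link are skipped

  const target = match[1];
  // Try the path relative to the tracker directory, then the repository root.
  const candidates = [`${trackerDir}/${target}`, `${repoRoot}/${target}`];
  return candidates.some((p) => exists(p)) ? null : `Report not found: ${target}`;
}
```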

### Score Format Enforcement

The `score` column must match the pattern `X/5`, where `X` is a number (for example, `4.2/5`), or use one of the special tokens `N/A` and `DUP` (lines 148‑156). Any other format is flagged, which keeps scoring semantics consistent across the tracker.
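One way to express this rule is a regular expression plus a small set of special tokens. The exact pattern used by the script is not shown in this article, so the one below is an assumption; note that it checks the format only and would accept an out-of-range value such as `9/5`.

```javascript
// Assumed pattern: an integer or decimal number followed by "/5".
const SCORE_PATTERN = /^\d+(\.\d+)?\/5$/;
const SPECIAL_SCORES = new Set(['N/A', 'DUP']);

function isValidScore(score) {
  const value = score.trim();
  return SPECIAL_SCORES.has(value) || SCORE_PATTERN.test(value);
}
```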

### Row-Structure Integrity

This structural guard ensures every data row begins with `|` and contains at least nine pipe-delimited columns (lines 158‑168). Rows that fail this test are reported as malformed, protecting the parser from shifted or truncated data.
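As an illustration, the structural test can be reduced to counting cells between the outer pipes. How the script itself counts columns is not shown here, so treat the counting rule and the `isWellFormedRow` name as assumptions.

```javascript
// True when the line starts with "|" and has at least `minColumns` cells.
function isWellFormedRow(line, minColumns = 9) {
  const trimmed = line.trim();
  if (!trimmed.startsWith('|')) return false;
  // Remove the outer pipes so they do not produce empty leading/trailing cells.
  const cells = trimmed.replace(/^\|/, '').replace(/\|$/, '').split('|');
  return cells.length >= minColumns;
}
```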

### Pending TSV Detection

The script inspects the `batch/tracker-additions/` directory for any `.tsv` files that have not yet been merged into the main tracker (lines 172‑180). If unmerged files exist, a warning prompts the user to run the merge step so that staged data does not diverge from the canonical file.
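A sketch of this check is a simple filter over the directory listing. It assumes that any `.tsv` file still present in the directory counts as unmerged, which the text implies but does not spell out; `findPendingTsvs` is a hypothetical name.

```javascript
// Given the file names in batch/tracker-additions/, return the pending TSVs.
function findPendingTsvs(fileNames) {
  return fileNames.filter((name) => name.toLowerCase().endsWith('.tsv'));
}
```

The listing itself could come from `fs.readdirSync('batch/tracker-additions')`, guarded by an existence check for the directory.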

### Bold-in-Score Warnings

Finally, the validator scans the `score` column for stray markdown bold markers (`**`) (lines 183‑190). Because bold formatting violates the tracker’s data contract, any instance is surfaced as a warning to preserve plain-text consistency.

## Summary Reporting and Exit Codes

At the end of the run, the script aggregates results and prints a concise summary such as `📊 Pipeline Health: 0 errors, 2 warnings` (lines 194‑204). A colour-coded status line follows: green when clean, yellow for warnings only, and red when errors exist. The process exits with `process.exit(errors > 0 ? 1 : 0)`, allowing CI pipelines to fail automatically when integrity is compromised.
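The final tally could be modelled as a pure function that maps the two counters to a summary line, a colour level, and an exit code. The `summarize` name and the `level` labels are illustrative; the summary format and the exit-code expression follow the description above.

```javascript
// Map the error and warning counters to the summary shown at the end of a run.
function summarize(errors, warnings) {
  const summary = `📊 Pipeline Health: ${errors} errors, ${warnings} warnings`;
  let level = 'green'; // clean run
  if (errors > 0) level = 'red';
  else if (warnings > 0) level = 'yellow';
  // Only errors affect the exit code; warnings alone still exit with 0.
  return { summary, level, exitCode: errors > 0 ? 1 : 0 };
}
```

A caller would print `summary` and then pass `exitCode` to `process.exit`.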

## Running the Script Locally

Execute the validator from the repository root with the following command:

```bash
node verify-pipeline.mjs
```

Typical output for a healthy tracker looks like this:

```text
📊 Checking 23 entries in applications.md

✅ All statuses are canonical
✅ No exact duplicates found
✅ All report links valid
✅ All scores valid
✅ All rows properly formatted
✅ No pending TSVs
✅ No bold in scores

--------------------------------------------------
📊 Pipeline Health: 0 errors, 0 warnings
🟢 Pipeline is clean!
```

When issues exist, the script prints the number of the offending entry (the `#N` prefix) and a short description:

```text
❌ #5: Non-canonical status "**Applied**"
⚠️ Possible duplicates: #12, #18 (Acme Corp — Senior Engineer)
❌ #7: Report not found: reports/007-acme-2023-05-01.md
```

## CI Integration for Career-Ops

You can integrate `verify-pipeline.mjs` into package scripts or GitHub Actions to block corrupted data before it reaches the main branch.

Add a convenience script to [`package.json`](https://github.com/santifer/career-ops/blob/main/package.json):

```json
{
  "scripts": {
    "verify": "node verify-pipeline.mjs"
  }
}
```

Then reference it in a GitHub Action workflow:

```yaml
- name: Verify tracker health
  run: npm run verify
```

Because the script exits with a non-zero status on error, the workflow step fails automatically. When that check is required by branch protection, the failure blocks the merge of invalid tracker data.

## Summary

- **`verify-pipeline.mjs`** resolves the tracker file via [`data/applications.md`](https://github.com/santifer/career-ops/blob/main/data/applications.md), a root-level fallback, or the `CAREER_OPS_TRACKER` environment variable.
- The script parses every row into a structured record with fields such as `company`, `role`, `status`, `score`, and `report`.
- **Seven core checks** safeguard the pipeline: canonical status validation, duplicate detection, report-link existence, score format enforcement, row-structure integrity, pending TSV detection, and bold-in-score warnings.
- Errors and warnings are tallied throughout execution and summarized at the end (lines 194‑204).
- The process exits with code `1` if any errors are found, making it ideal for CI gates and pre-commit hooks.

## Frequently Asked Questions

### What file does verify-pipeline.mjs check?

The script targets the job tracker, which defaults to [`data/applications.md`](https://github.com/santifer/career-ops/blob/main/data/applications.md) inside the `santifer/career-ops` repository. If that path is missing, it falls back to a root-level [`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md), or it respects an override via the `CAREER_OPS_TRACKER` environment variable.

### How does the script detect duplicate job applications?

It concatenates the `company` and `role` values for each row, normalizes them to lowercase, and strips non-alphanumeric characters to create a lookup key (lines 113‑129). If two or more rows share the same key, the script flags them as possible duplicates.

### What happens when a report link points to a missing file?

During the report-link sanity check (lines 130‑146), the script extracts the markdown hyperlink from the `report` column and verifies the file exists relative to the tracker directory or the repository root. If the file is not found, the script logs an error and ultimately exits with code `1`.

### Can I use verify-pipeline.mjs inside a GitHub Action?

Yes. The script is designed for automation because it returns a non-zero exit code whenever errors are detected. You can invoke it with `npm run verify` or `node verify-pipeline.mjs` as a step in any CI workflow, and the step will fail if any check reports an error. Warnings alone do not change the exit code.