
How to Normalize Canonical Statuses Across Tracker Entries in Career-Ops

June 9, 2026 santifer/career-ops

The normalize-statuses.mjs script enforces data integrity by scanning applications.md and remapping any non-canonical status values to one of eight defined states listed in templates/states.yml.

In the santifer/career-ops repository, tracker entries for job applications rely on a strict taxonomy of canonical statuses to ensure consistent reporting. The system uses a Node.js normalization script to automatically detect and correct malformed or language-specific status values, converting them to a standardized English set defined in a YAML configuration file. This process prevents downstream analytics errors and maintains a single source of truth across localized entries.

The Canonical Status Schema

According to the santifer/career-ops source code, the authoritative list of valid states resides in templates/states.yml. This file defines eight canonical labels, each with a set of accepted aliases (mostly Spanish) that map back to it.

The canonical states are:

Evaluated – Maps from evaluada.
Applied – Maps from aplicado, enviada, aplicada, or sent.
Responded – Maps from respondido.
Interview – Maps from entrevista.
Offer – Maps from oferta.
Rejected – Maps from rechazado or rechazada.
Discarded – Maps from descartado, descartada, cerrada, or cancelada.
SKIP – Maps from no_aplicar, no aplicar, skip, or monitor.

Each entry in the tracker file applications.md (or data/applications.md) must conform to one of these eight values to be considered valid by downstream scripts like merge-tracker.mjs.
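The alias list above can be pictured as a lookup table. The following is a minimal sketch, assuming the aliases are exactly those listed in this article; the names ALIASES, LOOKUP, and toCanonical are hypothetical, and the real templates/states.yml may be structured differently.

```javascript
// Alias table transcribed from the list in this article (an assumption:
// the actual templates/states.yml may contain more or different entries).
const ALIASES = {
  Evaluated: ['evaluada'],
  Applied: ['aplicado', 'enviada', 'aplicada', 'sent'],
  Responded: ['respondido'],
  Interview: ['entrevista'],
  Offer: ['oferta'],
  Rejected: ['rechazado', 'rechazada'],
  Discarded: ['descartado', 'descartada', 'cerrada', 'cancelada'],
  SKIP: ['no_aplicar', 'no aplicar', 'skip', 'monitor'],
};

// Build a lowercase lookup so every alias and every canonical label
// resolves to its canonical spelling.
const LOOKUP = new Map();
for (const [canonical, aliases] of Object.entries(ALIASES)) {
  LOOKUP.set(canonical.toLowerCase(), canonical);
  for (const alias of aliases) LOOKUP.set(alias.toLowerCase(), canonical);
}

// Hypothetical helper: returns the canonical label, or null when the
// value is not recognized.
function toCanonical(value) {
  return LOOKUP.get(String(value).trim().toLowerCase()) ?? null;
}

console.log(toCanonical('Enviada'));
console.log(toCanonical('pending'));
```

Returning null for unknown values, rather than guessing, leaves unrecognized statuses visible for manual review.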

How Statuses Are Normalized Across Tracker Entries

The normalize-statuses.mjs script performs an automated sweep of the tracker file to enforce the canonical schema. The process executes in eight distinct steps:

Load the source file – The script selects data/applications.md if present, otherwise falling back to the legacy applications.md path (lines 19‑22).
Parse the status column – For each markdown table row, the script isolates the status field (parts[6]).
Sanitize formatting – The script removes markdown bold syntax (**) using s.replace(/\*\*/g, '') to ensure clean string matching.
Map to canonical values – The normalizeStatus(raw) function (lines 28‑86) handles the core logic:
- Detects specific markers like DUPLICADO, CERRADA, or RECHAZADO and maps them to Discarded or Rejected.
- Strips date suffixes (e.g., aplicado 2023) to return Applied.
- Translates Spanish aliases (evaluada, entrevista) to English equivalents (Evaluated, Interview).
- Validates against the canonical list (Evaluated, Applied, Responded, Interview, Offer, Rejected, Discarded, SKIP) using case‑insensitive matching.
Preserve auxiliary data – If a status like DUPLICADO contains extra text, that content is moved to the notes column (lines 26‑34).
Rewrite the row – The script reconstructs the table line with the new canonical status and cleans the score column of bold formatting (lines 36‑40).
Log changes – Every substitution is reported to the console in the format #${num}: "old" → "new".
Write with backup – Unless the --dry-run flag is set, the original file is backed up as applications.md.bak before the normalized content is written (lines 58‑62).
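The per-row steps above (parse, sanitize, map, preserve notes, rewrite, log) can be sketched as follows. This is an illustration under stated assumptions, not the repository's actual code: the alias subset, the date-suffix regex, the rule that DUPLICADO maps to Discarded, the notes column sitting at parts[9], and the helper name normalizeRow are all assumptions made for the example.

```javascript
// Illustrative alias table based on the states listed in this article.
const ALIASES = new Map(Object.entries({
  evaluada: 'Evaluated', aplicado: 'Applied', aplicada: 'Applied',
  enviada: 'Applied', sent: 'Applied', respondido: 'Responded',
  entrevista: 'Interview', oferta: 'Offer', rechazado: 'Rejected',
  rechazada: 'Rejected', descartado: 'Discarded', descartada: 'Discarded',
  cerrada: 'Discarded', cancelada: 'Discarded', no_aplicar: 'SKIP',
  'no aplicar': 'SKIP', skip: 'SKIP', monitor: 'SKIP',
}));
const CANONICAL = ['Evaluated', 'Applied', 'Responded', 'Interview',
  'Offer', 'Rejected', 'Discarded', 'SKIP'];

// Returns { status, moveToNotes }, or null when the value is unrecognized.
function normalizeStatus(raw) {
  const clean = raw.replace(/\*\*/g, '').trim(); // strip markdown bold
  const lower = clean.toLowerCase();

  // Assumed marker rule: "DUPLICADO <extra>" becomes Discarded and the
  // original text is kept so it can be moved to the notes column.
  if (lower.startsWith('duplicado')) {
    return { status: 'Discarded', moveToNotes: clean };
  }

  // Assumed date-suffix rule: drop a trailing year or ISO date.
  const base = lower.replace(/\s+\d{4}(-\d{2}){0,2}$/, '');

  const canonical = CANONICAL.find((c) => c.toLowerCase() === base);
  const status = canonical ?? ALIASES.get(base);
  return status ? { status, moveToNotes: null } : null;
}

// Hypothetical helper: normalizes one markdown table row.
function normalizeRow(line) {
  const parts = line.split('|').map((p) => p.trim());
  // parts[0] is the empty string before the leading pipe. Rows without a
  // numeric id in parts[1] (header, separator) are returned untouched.
  if (parts.length < 11 || !/^\d+$/.test(parts[1])) return { line, log: null };

  const oldStatus = parts[6];
  const result = normalizeStatus(oldStatus);
  if (!result) return { line, log: null }; // unknown: leave for manual review

  parts[5] = parts[5].replace(/\*\*/g, ''); // clean bold from the score
  parts[6] = result.status;
  if (result.moveToNotes) {
    parts[9] = parts[9] ? `${result.moveToNotes}. ${parts[9]}` : result.moveToNotes;
  }
  const rebuilt = `| ${parts.slice(1, -1).join(' | ')} |`;
  const log = oldStatus === result.status
    ? null
    : `#${parts[1]}: "${oldStatus}" → "${result.status}"`;
  return { line: rebuilt, log };
}

const row = '| 12 | 2024-04-01 | Acme Corp | Senior Engineer | 4.5/5 | **Aplicado 2024** | ✅ | [12](reports/012-acme-2024-04-01.md) | |';
const out = normalizeRow(row);
console.log(out.line);
if (out.log) console.log(out.log);
```

A Map is used for the alias table so that inputs such as "constructor" cannot accidentally match inherited object properties.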

Practical Usage and Code Examples

You can execute the normalizer directly from the command line. Use the --dry-run flag to preview changes without modifying data.

Preview changes safely:

node normalize-statuses.mjs --dry-run

Apply normalization and create a backup:

node normalize-statuses.mjs

Input Transformation Example

Consider a tracker row containing a non‑canonical Spanish status with formatting artifacts:

| 12 | 2024-04-01 | Acme Corp | Senior Engineer | 4.5/5 | **Aplicado 2024** | ✅ | [12](reports/012-acme-2024-04-01.md) | |

After processing, the script produces a clean, canonical entry:

| 12 | 2024-04-01 | Acme Corp | Senior Engineer | 4.5/5 | Applied | ✅ | [12](reports/012-acme-2024-04-01.md) | |

In this example, the date suffix is stripped, the Spanish alias Aplicado is normalized to Applied, and the markdown bold markers are removed from the status column.

Safety Features and Data Integrity

The normalization workflow includes safeguards against data loss. Before writing changes, the script saves a backup copy named applications.md.bak, which allows a manual rollback if needed. The --dry-run mode logs every proposed substitution without altering the source file, so changes can be reviewed before they are applied.
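The path fallback, dry-run check, and backup-before-write behavior described in this article can be sketched as below. This is a hedged illustration, not the script's actual code: resolveTrackerPath and writeNormalized are hypothetical names, the backup path is assumed to be the tracker path plus .bak, and the io parameter stands in for Node's fs module so the logic can be exercised without touching real files.

```javascript
// `io` is any object exposing existsSync, readFileSync, copyFileSync and
// writeFileSync with the same signatures as Node's fs module. In a real
// script you would pass fs itself: import fs from 'node:fs'.

// Prefer data/applications.md, fall back to the legacy root-level path.
function resolveTrackerPath(io) {
  return io.existsSync('data/applications.md')
    ? 'data/applications.md'
    : 'applications.md';
}

// Applies `transform` to the tracker contents. Nothing is written when
// --dry-run is present or when the transform changes nothing.
function writeNormalized(io, argv, transform) {
  const dryRun = argv.includes('--dry-run');
  const path = resolveTrackerPath(io);
  const original = io.readFileSync(path, 'utf8');
  const updated = transform(original);

  if (dryRun || updated === original) {
    return { path, written: false };
  }
  io.copyFileSync(path, `${path}.bak`); // back up before overwriting
  io.writeFileSync(path, updated);
  return { path, written: true };
}

// Real-world usage (assumption): writeNormalized(fs, process.argv.slice(2), fn)
```

Creating the backup only on the write path keeps dry runs free of side effects, which matches the behavior the article describes for --dry-run.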

Maintaining canonical statuses matters for downstream consumption. Scripts such as merge-tracker.mjs assume the status column holds one of the eight canonical values; an unrecognized string can break grouping logic and skew any reporting built on the tracker. By centralizing the canonical definition in templates/states.yml and enforcing it via normalize-statuses.mjs, the repository lets localized entries (Spanish variants) flow into an English-standardized reporting pipeline.
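To see why non-canonical values matter downstream, a small pre-flight check can list the rows that would not match any canonical state. This is a sketch only: findNonCanonical is a hypothetical helper, not part of the repository, and it assumes the status sits in parts[6] as described earlier and that matching is exact and case-sensitive.

```javascript
const CANONICAL = new Set(['Evaluated', 'Applied', 'Responded', 'Interview',
  'Offer', 'Rejected', 'Discarded', 'SKIP']);

// Returns [{ num, status }] for every data row whose status column does
// not exactly match one of the eight canonical labels.
function findNonCanonical(markdown) {
  const offenders = [];
  for (const line of markdown.split('\n')) {
    const parts = line.split('|').map((p) => p.trim());
    // Skip headers, separator rows, and prose: data rows have a numeric id.
    if (!/^\d+$/.test(parts[1] ?? '')) continue;
    const status = parts[6] ?? '';
    if (!CANONICAL.has(status)) offenders.push({ num: parts[1], status });
  }
  return offenders;
}

const sample = [
  '| 1 | 2024-04-01 | Acme Corp | Engineer | 4.5/5 | Applied | ✅ | r1 | |',
  '| 2 | 2024-04-02 | Globex | Engineer | 3.9/5 | aplicado | ✅ | r2 | |',
].join('\n');
console.log(findNonCanonical(sample));
```

An empty result suggests the tracker is ready for tools that group by status; any listed rows are candidates for a normalization pass.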

Summary

Canonical statuses are defined in templates/states.yml and include eight states: Evaluated, Applied, Responded, Interview, Offer, Rejected, Discarded, and SKIP.
The normalize-statuses.mjs script scans applications.md (or data/applications.md) and remaps any alias or malformed status to its canonical English equivalent.
The normalizeStatus() function handles Spanish translations, date suffixes, and specific markers like DUPLICADO or CERRADA.
Safety mechanisms include a --dry-run preview mode and an automatic .bak file creation before overwriting.
Downstream tools like merge-tracker.mjs rely on this normalization to ensure data integrity and accurate reporting.

Frequently Asked Questions

Where are the canonical status definitions stored?

The canonical status definitions are stored at templates/states.yml in the santifer/career-ops repository. This YAML file lists the eight valid states (such as Applied, Interview, and Rejected) along with their accepted aliases in other languages.

How does the normalizer handle Spanish status values?

The normalizeStatus() function in normalize-statuses.mjs includes a mapping layer that converts Spanish terms like evaluada, aplicado, and rechazado to their English canonical equivalents (Evaluated, Applied, Rejected). It performs case‑insensitive matching and strips formatting characters to ensure robust detection.

Is it possible to run the normalization without changing my data?

Yes. You can run the script with the --dry-run flag to preview all proposed changes in the console. This mode parses the file and logs every substitution (e.g., #12: "Aplicado" → "Applied") but does not write to applications.md or create a backup file.

What does the script do if a status includes a date or extra notes?

The parser automatically strips date suffixes (e.g., aplicado 2023 becomes Applied). If specific markers like DUPLICADO contain additional text, that content is moved to the notes column before the status is reset to its canonical value, ensuring no auxiliary information is lost during normalization.
