# How to Normalize Canonical Statuses Across Tracker Entries in Career-Ops

> Learn how the normalize statuses script in santifer/career-ops ensures data integrity by standardizing tracker entry statuses to eight defined states. Maintain consistent career operations tracking.

- Repository: [Santiago Fernández de Valderrama/career-ops](https://github.com/santifer/career-ops)
- Tags: how-to-guide
- Published: 2026-06-09

---

**The `normalize-statuses.mjs` script enforces data integrity by scanning [`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md) and remapping any non-canonical status values to one of eight defined states listed in [`templates/states.yml`](https://github.com/santifer/career-ops/blob/main/templates/states.yml).**

In the `santifer/career-ops` repository, tracker entries for job applications rely on a strict taxonomy of **canonical statuses** to ensure consistent reporting. The system uses a Node.js normalization script to automatically detect and correct malformed or language-specific status values, converting them to a standardized English set defined in a YAML configuration file. This process prevents downstream analytics errors and maintains a single source of truth across localized entries.

## The Canonical Status Schema

According to the `santifer/career-ops` source code, the authoritative list of valid states resides in **[`templates/states.yml`](https://github.com/santifer/career-ops/blob/main/templates/states.yml)**. This file defines eight canonical labels, each mapping to specific language aliases to support internationalization.

The canonical states are:

- **Evaluated** – Maps from `evaluada`.
- **Applied** – Maps from `aplicado`, `enviada`, `aplicada`, or `sent`.
- **Responded** – Maps from `respondido`.
- **Interview** – Maps from `entrevista`.
- **Offer** – Maps from `oferta`.
- **Rejected** – Maps from `rechazado` or `rechazada`.
- **Discarded** – Maps from `descartado`, `descartada`, `cerrada`, or `cancelada`.
- **SKIP** – Maps from `no_aplicar`, `no aplicar`, `skip`, or `monitor`.

Each entry in the tracker file **[`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md)** (or [`data/applications.md`](https://github.com/santifer/career-ops/blob/main/data/applications.md)) must conform to one of these eight values to be considered valid by downstream scripts like `merge-tracker.mjs`.
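The alias list above can be pictured as a lookup table. The following is a minimal sketch, not the repository's code: the `ALIASES` object, the `LOOKUP` map, and the `toCanonical` helper are hypothetical names, and the actual layout of `templates/states.yml` may differ.

```javascript
// Illustrative alias table transcribed from the list above; the source of
// truth is templates/states.yml, whose exact layout is not reproduced here.
const ALIASES = {
  Evaluated: ['evaluada'],
  Applied: ['aplicado', 'enviada', 'aplicada', 'sent'],
  Responded: ['respondido'],
  Interview: ['entrevista'],
  Offer: ['oferta'],
  Rejected: ['rechazado', 'rechazada'],
  Discarded: ['descartado', 'descartada', 'cerrada', 'cancelada'],
  SKIP: ['no_aplicar', 'no aplicar', 'skip', 'monitor'],
};

// Flatten into a lowercase lookup: alias or canonical label -> canonical label.
const LOOKUP = new Map();
for (const [canonical, aliases] of Object.entries(ALIASES)) {
  LOOKUP.set(canonical.toLowerCase(), canonical);
  for (const alias of aliases) LOOKUP.set(alias, canonical);
}

// Hypothetical helper: returns the canonical label, or null if the value is unknown.
function toCanonical(status) {
  return LOOKUP.get(status.trim().toLowerCase()) ?? null;
}

console.log(toCanonical('Enviada')); // Applied
console.log(toCanonical('SKIP'));    // SKIP
console.log(toCanonical('pending')); // null
```

A `null` result marks a value that none of the eight states or their aliases cover, which is the kind of entry a downstream script would treat as invalid.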

## How Statuses Are Normalized Across Tracker Entries

The **`normalize-statuses.mjs`** script performs an automated sweep of the tracker file to enforce the canonical schema. The process executes in eight distinct steps:

1. **Load the source file** – The script selects [`data/applications.md`](https://github.com/santifer/career-ops/blob/main/data/applications.md) if present, otherwise falling back to the legacy [`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md) path (lines 19‑22).
2. **Parse the status column** – For each markdown table row, the script isolates the status field (`parts[6]`).
3. **Sanitize formatting** – The script removes markdown bold syntax (`**`) using `s.replace(/\*\*/g, '')` to ensure clean string matching.
4. **Map to canonical values** – The `normalizeStatus(raw)` function (lines 28‑86) handles the core logic:
   - Detects specific markers like **DUPLICADO**, **CERRADA**, or **RECHAZADO** and maps them to `Discarded` or `Rejected`.
   - Strips date suffixes (e.g., `aplicado 2023`) to return `Applied`.
   - Translates Spanish aliases (`evaluada`, `entrevista`) to English equivalents (`Evaluated`, `Interview`).
   - Validates against the canonical list (`Evaluated`, `Applied`, `Responded`, `Interview`, `Offer`, `Rejected`, `Discarded`, `SKIP`) using case‑insensitive matching.
5. **Preserve auxiliary data** – If a status like `DUPLICADO` contains extra text, that content is moved to the *notes* column (lines 26‑34).
6. **Rewrite the row** – The script reconstructs the table line with the new canonical status and cleans the *score* column of bold formatting (lines 36‑40).
7. **Log changes** – Every substitution is reported to the console in the format `#${num}: "old" → "new"`.
8. **Write with backup** – Unless the `--dry-run` flag is set, the original file is backed up as `applications.md.bak` before the normalized content is written (lines 58‑62).
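The mapping logic in steps 3 and 4 can be approximated as follows. This is a sketch under stated assumptions rather than the script's actual `normalizeStatus(raw)`: the function name `normalizeStatusSketch`, the date pattern, the choice to map `DUPLICADO` to `Discarded`, and the `null` return for unknown values are all illustrative guesses.

```javascript
const CANONICAL = ['Evaluated', 'Applied', 'Responded', 'Interview',
  'Offer', 'Rejected', 'Discarded', 'SKIP'];

// Lowercase alias -> canonical label, transcribed from the alias list in this guide.
const ALIASES = new Map([
  ['evaluada', 'Evaluated'],
  ['aplicado', 'Applied'], ['aplicada', 'Applied'],
  ['enviada', 'Applied'], ['sent', 'Applied'],
  ['respondido', 'Responded'],
  ['entrevista', 'Interview'],
  ['oferta', 'Offer'],
  ['rechazado', 'Rejected'], ['rechazada', 'Rejected'],
  ['descartado', 'Discarded'], ['descartada', 'Discarded'],
  ['cerrada', 'Discarded'], ['cancelada', 'Discarded'],
  ['no_aplicar', 'SKIP'], ['no aplicar', 'SKIP'],
  ['skip', 'SKIP'], ['monitor', 'SKIP'],
]);

// Hypothetical approximation of the described pipeline; returns null if nothing matches.
function normalizeStatusSketch(raw) {
  // Step 3: drop markdown bold markers, then trim and lowercase for matching.
  const lower = raw.replace(/\*\*/g, '').trim().toLowerCase();

  // Step 4a: markers that may carry trailing text (DUPLICADO -> Discarded is assumed).
  if (lower.startsWith('duplicado') || lower.startsWith('cerrada')) return 'Discarded';
  if (lower.startsWith('rechazado')) return 'Rejected';

  // Step 4b: strip a trailing date such as "2023" or "2024-04-01" (assumed shapes).
  const noDate = lower.replace(/\s+\d{4}(-\d{2}(-\d{2})?)?$/, '');

  // Step 4c: translate a known alias.
  if (ALIASES.has(noDate)) return ALIASES.get(noDate);

  // Step 4d: case-insensitive match against the canonical list.
  return CANONICAL.find((c) => c.toLowerCase() === noDate) ?? null;
}

console.log(normalizeStatusSketch('**Aplicado 2024**')); // Applied
console.log(normalizeStatusSketch('entrevista'));        // Interview
```

Checking markers before aliases means a value such as `DUPLICADO` followed by free text still resolves, even though the full string is not an exact alias.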

## Practical Usage and Code Examples

You can execute the normalizer directly from the command line. Use the `--dry-run` flag to preview changes without modifying data.

Preview changes safely:

```bash
node normalize-statuses.mjs --dry-run
```

Apply normalization and create a backup:

```bash
node normalize-statuses.mjs
```

### Input Transformation Example

Consider a tracker row containing a non‑canonical Spanish status with formatting artifacts:

```markdown
| 12 | 2024-04-01 | Acme Corp | Senior Engineer | 4.5/5 | **Aplicado 2024** | ✅ | [12](reports/012-acme-2024-04-01.md) | |
```

After processing, the script produces a clean, canonical entry:

```markdown
| 12 | 2024-04-01 | Acme Corp | Senior Engineer | 4.5/5 | Applied | ✅ | [12](reports/012-acme-2024-04-01.md) | |
```

In this example, the date suffix is stripped, the Spanish alias `Aplicado` is normalized to `Applied`, and the markdown bold markers are removed from the status column.
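The row-level transformation can be sketched as a split on the pipe character, an edit of `parts[6]`, and a re-join. The `rewriteRow` helper below is hypothetical, and its `normalizeStatus` is a stand-in that handles only this one example; whether the real log prints the old value with or without bold markers is also an assumption.

```javascript
// Stand-in for the script's normalizeStatus(); it only covers this example row.
function normalizeStatus(raw) {
  const s = raw.replace(/\*\*/g, '').trim().replace(/\s+\d{4}$/, '');
  return s.toLowerCase() === 'aplicado' ? 'Applied' : s;
}

// Hypothetical helper: rewrites one markdown table row and reports the change.
function rewriteRow(line) {
  const parts = line.split('|'); // parts[0] is the empty string before the first pipe
  const num = parts[1].trim();
  const oldStatus = parts[6].trim();
  const newStatus = normalizeStatus(oldStatus);

  parts[5] = parts[5].replace(/\*\*/g, ''); // score column: strip bold markers
  parts[6] = ` ${newStatus} `;

  const log = oldStatus === newStatus
    ? null
    : `#${num}: "${oldStatus}" → "${newStatus}"`;
  return { line: parts.join('|'), log };
}

const input = '| 12 | 2024-04-01 | Acme Corp | Senior Engineer | 4.5/5 | **Aplicado 2024** | ✅ | [12](reports/012-acme-2024-04-01.md) | |';
const result = rewriteRow(input);
console.log(result.line);
console.log(result.log);
```

Because the line starts with a pipe, the split yields an empty first element, which is why the status sits at index 6 rather than 5.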

## Safety Features and Data Integrity

The normalization workflow includes safeguards to prevent data loss. Before writing changes, the script creates a backup file named `applications.md.bak`, allowing for manual rollback if necessary. The `--dry-run` mode logs every proposed substitution without altering the source file, so changes can be validated before they are applied.
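The backup-then-write behavior could look roughly like the sketch below. The `writeWithBackup` helper and its signature are hypothetical, as is the plain `process.argv` scan for the flag. The file-system module is passed in as a parameter purely so the logic can be exercised against a stub.

```javascript
// Hypothetical helper. `fs` is injected (Node's built-in file-system module
// in real use) so the control flow can be tested without touching disk.
function writeWithBackup(fs, trackerPath, normalizedText, { dryRun = false } = {}) {
  if (dryRun) {
    console.log('(dry run: no files written)');
    return false;
  }
  // Copy the original aside first, so a bad run can be rolled back by hand.
  fs.copyFileSync(trackerPath, `${trackerPath}.bak`);
  fs.writeFileSync(trackerPath, normalizedText);
  return true;
}

// Assumed flag detection: a simple scan of the command-line arguments.
const dryRun = process.argv.includes('--dry-run');
console.log(`dry run requested: ${dryRun}`);
```

In real use the first argument would be Node's `node:fs` module, whose `copyFileSync` and `writeFileSync` functions match the calls made here.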

Maintaining **canonical statuses** is critical for downstream consumption. Scripts such as `merge-tracker.mjs` assume the status column conforms to the eight authorized values; mismatched strings would break grouping logic and corrupt analytics dashboards. By centralizing the canonical definition in [`templates/states.yml`](https://github.com/santifer/career-ops/blob/main/templates/states.yml) and enforcing it via `normalize-statuses.mjs`, the repository ensures that internationalized entries (Spanish variants) integrate seamlessly into an English‑standardized reporting pipeline.

## Summary

- **Canonical statuses** are defined in [`templates/states.yml`](https://github.com/santifer/career-ops/blob/main/templates/states.yml) and include eight states: `Evaluated`, `Applied`, `Responded`, `Interview`, `Offer`, `Rejected`, `Discarded`, and `SKIP`.
- The **`normalize-statuses.mjs`** script scans [`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md) (or [`data/applications.md`](https://github.com/santifer/career-ops/blob/main/data/applications.md)) and remaps any alias or malformed status to its canonical English equivalent.
- The `normalizeStatus()` function handles Spanish translations, date suffixes, and specific markers like `DUPLICADO` or `CERRADA`.
- Safety mechanisms include a `--dry-run` preview mode and an automatic `.bak` file creation before overwriting.
- Downstream tools like `merge-tracker.mjs` rely on this normalization to ensure data integrity and accurate reporting.

## Frequently Asked Questions

### Where are the canonical status definitions stored?

The canonical status definitions are stored in **[`templates/states.yml`](https://github.com/santifer/career-ops/blob/main/templates/states.yml)**, inside the `templates/` directory of the `santifer/career-ops` repository. This YAML file lists the eight valid states, such as `Applied`, `Interview`, and `Rejected`, along with their accepted aliases in other languages.

### How does the normalizer handle Spanish status values?

The `normalizeStatus()` function in `normalize-statuses.mjs` includes a mapping layer that converts Spanish terms like `evaluada`, `aplicado`, and `rechazado` to their English canonical equivalents (`Evaluated`, `Applied`, `Rejected`). It performs case‑insensitive matching and strips formatting characters to ensure robust detection.

### Is it possible to run the normalization without changing my data?

Yes. You can run the script with the **`--dry-run`** flag to preview all proposed changes in the console. This mode parses the file and logs every substitution (e.g., `#12: "Aplicado" → "Applied"`) but does not write to [`applications.md`](https://github.com/santifer/career-ops/blob/main/applications.md) or create a backup file.

### What does the script do if a status includes a date or extra notes?

The parser automatically strips date suffixes (e.g., `aplicado 2023` becomes `Applied`). If specific markers like `DUPLICADO` contain additional text, that content is moved to the *notes* column before the status is reset to its canonical value, ensuring no auxiliary information is lost during normalization.
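The note-preserving behavior can be illustrated with a small sketch. The `splitMarker` helper is hypothetical. Mapping `DUPLICADO` to `Discarded` and joining notes with `; ` are assumptions made for illustration, not confirmed details of the script.

```javascript
// Hypothetical helper: separates a marker status from any trailing text and
// appends that text to the existing notes so nothing is discarded.
function splitMarker(rawStatus, existingNotes) {
  const clean = rawStatus.replace(/\*\*/g, '').trim();
  const match = clean.match(/^duplicado\b\s*(.*)$/i);
  if (!match) return { status: clean, notes: existingNotes };

  const extra = match[1].trim();
  // Keep whatever was already in the notes column and add the leftover text.
  const notes = [existingNotes.trim(), extra].filter(Boolean).join('; ');
  return { status: 'Discarded', notes }; // assumed canonical target
}

console.log(splitMarker('**DUPLICADO de #7**', ''));
console.log(splitMarker('DUPLICADO de #7', 'remote role'));
```

Filtering out empty strings before joining avoids a dangling separator when either the notes column or the leftover text is blank.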