How merge-tracker.mjs Handles Column Reordering When Merging Tracker Entries
The merge-tracker.mjs script detects swapped status/score columns using regex heuristics in the parseTsvContent function, then normalizes entries to a canonical order before appending them to applications.md.
The merge-tracker.mjs script in the santifer/career-ops repository merges tab-separated value (TSV) tracker rows into the main applications.md table. Working alongside tracker-links.mjs and verify-pipeline.mjs, this utility handles column reordering when TSV producers place the status and score fields in different orders, ensuring consistent data formatting regardless of input variations.
Detecting Column Order in parseTsvContent
The detection logic resides in the parseTsvContent function (lines 26‑48 of merge-tracker.mjs).
Splitting and Inspecting TSV Lines
When processing a TSV file, the script first splits the content on tab characters:
const parts = content.split('\t');
It then examines columns 4 and 5 (zero‑based indices, i.e. the fifth and sixth fields), each of which holds either a status string (e.g., "Applied") or a score string (e.g., "4.2/5").
Heuristic Pattern Matching
The script applies two distinct heuristics to identify which column contains which data:
- Score detection: A column matches the score pattern if it conforms to ^\d+\.?\d*\/5$ or equals the special tokens N/A or DUP.
- Status detection: A column matches the status pattern if it contains canonical status words or aliases recognized by the validateStatus function (e.g., "evaluated", "aplicado", "skip").
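The two heuristics can be sketched as a pair of small predicates. This is an illustrative sketch only: the regex and the N/A / DUP tokens come from the description above, but the helper names looksLikeScore and looksLikeStatus and the STATUS_WORDS list are assumptions, not the script's actual identifiers or its full alias list.

```javascript
// Score pattern quoted above: digits, an optional decimal part, then "/5".
const SCORE_PATTERN = /^\d+\.?\d*\/5$/;

// Hypothetical status vocabulary for illustration; the real list is whatever
// validateStatus recognizes in merge-tracker.mjs.
const STATUS_WORDS = ['evaluated', 'applied', 'aplicado', 'rejected', 'skip'];

// True when the value is a "/5" score or one of the special tokens.
function looksLikeScore(value) {
  const v = value.trim();
  return SCORE_PATTERN.test(v) || v === 'N/A' || v === 'DUP';
}

// True when the value contains one of the known status words (case-insensitive).
function looksLikeStatus(value) {
  const v = value.trim().toLowerCase();
  return STATUS_WORDS.some((word) => v.includes(word));
}

console.log(looksLikeScore('4.2/5'), looksLikeStatus('Applied'));
```

Keeping the two checks separate makes it possible to ask both questions of the same column, which is what the ordering decision relies on.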
Determining Standard vs Swapped Order
Based on these checks, the script determines the column orientation using the following logic:
- Standard order (status → score): If column 4 matches a status pattern and column 5 does not look like a score, the file follows the normal layout.
- Swapped order (score → status): If column 4 matches the score pattern and column 5 matches a status pattern, the columns are reversed.
- Fallback: If neither condition is met unambiguously, the script defaults to treating column 4 as status and column 5 as score.
This decision sets the local variables statusCol and scoreCol accordingly.
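As a rough sketch of that decision, assuming the same score pattern and a hypothetical status vocabulary, the orientation check might look like the following. The function name resolveColumns and the swapped flag are illustrative additions. Because the standard case and the fallback yield the same assignment, only the swapped condition needs an explicit branch here.

```javascript
const SCORE_PATTERN = /^\d+\.?\d*\/5$/;
// Hypothetical status vocabulary, for illustration only.
const STATUS_WORDS = ['evaluated', 'applied', 'aplicado', 'rejected', 'skip'];

const looksLikeScore = (v) => SCORE_PATTERN.test(v) || v === 'N/A' || v === 'DUP';
const looksLikeStatus = (v) => STATUS_WORDS.some((w) => v.toLowerCase().includes(w));

// Decide which of columns 4 and 5 holds the status and which holds the score.
function resolveColumns(parts) {
  const col4 = (parts[4] ?? '').trim();
  const col5 = (parts[5] ?? '').trim();

  // Swapped order: score in column 4, status in column 5.
  if (looksLikeScore(col4) && looksLikeStatus(col5)) {
    return { statusCol: col5, scoreCol: col4, swapped: true };
  }

  // Standard order, and also the fallback when the values are ambiguous.
  return { statusCol: col4, scoreCol: col5, swapped: false };
}

console.log(resolveColumns(['43', '2024-05-02', 'Beta Ltd', 'Backend Engineer', '4.2/5', 'Rejected']));
```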
Normalizing Entries to Canonical Format
Once the column order is identified, the script constructs a uniform data structure regardless of the original layout.
Building the Addition Object
Lines 50‑60 create a normalized addition object:
const addition = {
status: validateStatus(statusCol),
score: scoreCol,
// ... other fields
};
The validateStatus function ensures the status field always contains the canonical form, while the score field preserves the raw string (e.g., "4.5/5" or "N/A").
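To make the normalization step concrete, here is a minimal sketch of how a status alias could be mapped to a canonical label while the score passes through untouched. The alias table, the canonical spellings, the pass-through behavior for unknown values, and the names validateStatusSketch and buildAddition are all assumptions made for illustration; the real validateStatus may behave differently.

```javascript
// Hypothetical alias table: lowercase input -> canonical label.
const STATUS_ALIASES = new Map([
  ['applied', 'Applied'],
  ['aplicado', 'Applied'],
  ['evaluated', 'Evaluated'],
  ['rejected', 'Rejected'],
]);

// Stand-in for validateStatus. Assumption: unknown values pass through trimmed.
function validateStatusSketch(raw) {
  const key = raw.trim().toLowerCase();
  return STATUS_ALIASES.get(key) ?? raw.trim();
}

// Build the normalized object from the split fields and the resolved columns.
function buildAddition(parts, statusCol, scoreCol) {
  return {
    id: parts[0],
    date: parts[1],
    company: parts[2],
    role: parts[3],
    status: validateStatusSketch(statusCol), // canonicalized
    score: scoreCol, // raw string, e.g. "4.5/5" or "N/A"
  };
}

const parts = ['43', '2024-05-02', 'Beta Ltd', 'Backend Engineer', '4.2/5', 'aplicado'];
console.log(buildAddition(parts, parts[5], parts[4]));
```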
Writing Consistent Rows to applications.md
When appending to applications.md, the script enforces a canonical column order. Lines 99‑100 construct the output line placing the score before the status:
const line = `| ${addition.id} | ${addition.date} | ${addition.company} | ${addition.role} | ${addition.score} | ${addition.status} | ...`;
Notice that score precedes status in the final output, ensuring the tracker table maintains a consistent schema even when processing TSV files with reversed columns.
Implementation Examples
Consider these two TSV inputs demonstrating the reordering logic:
Standard TSV (status before score):
42 2024-05-01 Acme Corp Senior Engineer Applied 4.5/5 ✅ [42](reports/42-acme-2024-05-01.md) First interview
The script recognizes "Applied" as a status and "4.5/5" as a score, mapping them directly to statusCol and scoreCol.
Swapped TSV (score before status):
43 2024-05-02 Beta Ltd Backend Engineer 4.2/5 Rejected ❌ [43](reports/43-beta-2024-05-02.md) No response
Here, column 4 ("4.2/5") matches the score regex while column 5 ("Rejected") matches a status alias. The heuristic flips the assignment, producing statusCol = "Rejected" and scoreCol = "4.2/5".
Resulting canonical entries in applications.md:
| 42 | 2024-05-01 | Acme Corp | Senior Engineer | 4.5/5 | Applied | ✅ | [42](reports/42-acme-2024-05-01.md) | First interview |
| 43 | 2024-05-02 | Beta Ltd | Backend Engineer | 4.2/5 | Rejected | ❌ | [43](reports/43-beta-2024-05-02.md) | No response |
Both entries now follow the identical column layout: score appears in the fifth column, status in the sixth.
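The two examples can be reproduced end to end with a short sketch. It assumes tab-separated input with the nine fields shown above, uses a hypothetical status vocabulary, and skips status canonicalization for brevity; tsvLineToRow is an illustrative name, not a function from merge-tracker.mjs.

```javascript
const SCORE_PATTERN = /^\d+\.?\d*\/5$/;
// Hypothetical status vocabulary, for illustration only.
const STATUS_WORDS = ['applied', 'rejected', 'evaluated', 'aplicado', 'skip'];

const isScore = (v) => SCORE_PATTERN.test(v) || v === 'N/A' || v === 'DUP';
const isStatus = (v) => STATUS_WORDS.some((w) => v.toLowerCase().includes(w));

// Convert one TSV line into a table row with score before status.
function tsvLineToRow(line) {
  const parts = line.split('\t').map((p) => p.trim());
  const swapped = isScore(parts[4]) && isStatus(parts[5]);
  const status = swapped ? parts[5] : parts[4];
  const score = swapped ? parts[4] : parts[5];
  // Canonical order: score, then status; trailing fields pass through unchanged.
  const cells = [...parts.slice(0, 4), score, status, ...parts.slice(6)];
  return `| ${cells.join(' | ')} |`;
}

const standardLine = ['42', '2024-05-01', 'Acme Corp', 'Senior Engineer', 'Applied', '4.5/5',
  '✅', '[42](reports/42-acme-2024-05-01.md)', 'First interview'].join('\t');
const swappedLine = ['43', '2024-05-02', 'Beta Ltd', 'Backend Engineer', '4.2/5', 'Rejected',
  '❌', '[43](reports/43-beta-2024-05-02.md)', 'No response'].join('\t');

console.log(tsvLineToRow(standardLine));
console.log(tsvLineToRow(swappedLine));
```

Both calls print rows with the score in the fifth column and the status in the sixth, matching the canonical entries listed above.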
Summary
- The parseTsvContent function (lines 26‑48) in merge-tracker.mjs detects column order by applying regex heuristics to columns 4 and 5.
- Score patterns match ^\d+\.?\d*\/5$ or the literals N/A / DUP, while status patterns match canonical words recognized by validateStatus.
- When column 4 contains a score and column 5 contains a status, the script swaps the assignment to maintain logical consistency.
- The script writes all entries to applications.md using a canonical order (score followed by status) regardless of the input TSV structure.
Frequently Asked Questions
What happens if a TSV file contains ambiguous column values?
If neither column 4 nor column 5 clearly matches the expected patterns, the script falls back to treating column 4 as the status and column 5 as the score. This keeps the merge running, but it assumes the standard layout whenever the heuristics fail: if the input was in fact swapped and simply not recognized, the values end up in the wrong columns.
Can I customize the status aliases that trigger column detection?
Yes. The validateStatus function recognizes specific canonical status words and their aliases. To add new aliases, modify the validation logic in merge-tracker.mjs that compares column values against known status strings.
Why does the script specifically check columns 4 and 5?
The santifer/career-ops tracker schema defines ID, date, company, and role as the first four columns (indices 0‑3). The variable fields—status and score—always occupy positions 4 and 5, though their order may vary depending on the TSV producer. This fixed positioning allows the heuristic to focus only on the potentially swapped fields.
Does the reordering affect how scores are validated?
No. The script preserves the raw score string (e.g., "4.5/5" or "N/A") during normalization. Only the column position changes; the score text is written to applications.md exactly as it appeared in the input TSV.