# ATS PDF Generation Pipeline: From HTML Template to Space Grotesk/DM Sans Output

> Discover the ATS PDF generation pipeline in the career-ops repo. Learn how Node.js transforms HTML templates into ATS-compatible PDFs with embedded Space Grotesk and DM Sans fonts using Playwright.

- Repository: [Santiago Fernández de Valderrama/career-ops](https://github.com/santifer/career-ops)
- Tags: how-to-guide
- Published: 2026-06-07

---

**The career-ops repository converts styled HTML CV templates into ATS-compatible PDFs using a Node.js pipeline that rewrites font paths, sanitizes Unicode text, and renders the final document via Playwright with embedded Space Grotesk and DM Sans fonts.**

The santifer/career-ops project provides a complete **ATS PDF generation pipeline** that transforms data-driven HTML templates into printer-ready résumés. Unlike simple HTML-to-PDF converters, this implementation handles font embedding, path resolution, and text normalization specifically designed to pass modern applicant tracking systems while maintaining precise typographic control.

## Step 1 — HTML CV Template with @font-face Rules

The process begins in [`templates/cv-template.html`](https://github.com/santifer/career-ops/blob/main/templates/cv-template.html), which defines the visual structure and embeds typography via CSS `@font-face` declarations.

- **Space Grotesk** serves as the display font for headings.
- **DM Sans** provides the body text rendering.

The template references local font files using relative paths (`./fonts/...`) and contains placeholder tokens such as `{{NAME}}` and `{{EXPERIENCE}}` that are substituted with user data prior to rendering. The font assets reside in the repository’s `fonts/` directory, including specific files like `space-grotesk-latin.woff2` and `dm-sans-latin.woff2`.
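The source does not show how the placeholder substitution is performed, so the following is only a minimal sketch of the idea. The helper name `fillTemplate` and the sample data are hypothetical; only the `{{NAME}}` and `{{EXPERIENCE}}` token style comes from the template description above.

```javascript
// Hypothetical helper: fill {{TOKEN}} placeholders from a plain object.
// Tokens with no matching key are left in place so missing data is easy to spot.
function fillTemplate(template, data) {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match, key) => {
    const value = data[key];
    return value === undefined ? match : String(value);
  });
}

const sample = '<h1>{{NAME}}</h1><section>{{EXPERIENCE}}</section><p>{{SUMMARY}}</p>';
console.log(
  fillTemplate(sample, {
    NAME: 'Ada Lovelace',
    EXPERIENCE: '<ul><li>Analyst</li></ul>',
  })
);
```

Leaving unknown tokens untouched, rather than replacing them with an empty string, makes an unfilled field visible in the rendered PDF instead of silently dropping it.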

## Step 2 — Font Resolution and Path Rewriting

Before Playwright processes the HTML, `generate-pdf.mjs` (lines 26–34) rewrites relative font URLs to absolute `file://` paths. This transformation ensures the headless browser can locate self-hosted fonts regardless of the temporary working directory.

The script uses a regex replacement to transform `url('./fonts/...')` references into absolute file system URLs:

```javascript
// Point every ./fonts/ reference at the absolute fonts directory.
const fontsDir = resolve(__dirname, 'fonts');
html = html.replace(/url\(['"]?\.\/fonts\//g, `url('file://${fontsDir}/`);
```

This step prevents font requests from failing when the HTML is loaded from memory or a temporary location, where relative paths have no directory to resolve against.
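As a variation on the rewrite above, the sketch below keeps whatever quote style the template used, so a double-quoted `url("./fonts/...")` does not end up with mismatched quotes. The helper name `rewriteFontUrls` and the example base URL are hypothetical. In a Node.js script the base could be derived with `pathToFileURL(fontsDir).href` from `node:url`, which also percent-encodes spaces in the path.

```javascript
// Hypothetical helper: rewrite relative ./fonts/ URLs to an absolute base URL,
// preserving the opening quote (single, double, or none) found in the CSS.
function rewriteFontUrls(html, fontsBaseUrl) {
  const base = fontsBaseUrl.endsWith('/') ? fontsBaseUrl : `${fontsBaseUrl}/`;
  // A replacer function is used so "$" sequences in the path are never
  // interpreted as replacement patterns.
  return html.replace(
    /url\((['"]?)\.\/fonts\//g,
    (_match, quote) => `url(${quote}${base}`
  );
}

const css = 'src: url("./fonts/dm-sans-latin.woff2") format("woff2");';
const rewritten = rewriteFontUrls(css, 'file:///opt/career-ops/fonts');
console.log(rewritten);
```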

## Step 3 — ATS Text Normalization

The `normalizeTextForATS` helper function (lines 34–89 in `generate-pdf.mjs`) sanitizes the HTML content to prevent ATS parsing failures. The function replaces or strips problematic Unicode characters, including:

- Smart quotes (curly apostrophes)
- Em and en dashes
- Zero-width spaces and joiners
- Other non-ASCII glyphs that confuse legacy parsers

All replacements are logged to stderr for debugging purposes, ensuring you can verify exactly which characters were converted to their ASCII-safe equivalents.
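The normalization described above can be sketched with a small replacement table. This is not the repository's implementation: the table contents, the `normalizeSketch` name, and the log message are assumptions, and the sketch works on a plain string rather than distinguishing HTML text from tags and attributes.

```javascript
// Illustrative replacement table; the actual list in generate-pdf.mjs may differ.
const ATS_REPLACEMENTS = [
  [/[\u2018\u2019]/g, "'"], // curly single quotes and apostrophes
  [/[\u201C\u201D]/g, '"'], // curly double quotes
  [/[\u2013\u2014]/g, '-'], // en and em dashes
  [/\u00A0/g, ' '], // non-breaking space
  [/[\u200B\u200C\u200D\uFEFF]/g, ''], // zero-width space, non-joiner, joiner, BOM
];

// Hypothetical helper: apply each rule and count how many characters changed.
function normalizeSketch(text) {
  let total = 0;
  let out = text;
  for (const [pattern, replacement] of ATS_REPLACEMENTS) {
    out = out.replace(pattern, () => {
      total += 1;
      return replacement;
    });
  }
  // Report on stderr so stdout stays clean for other output.
  if (total > 0) console.error(`normalized ${total} character(s)`);
  return { text: out, total };
}

const result = normalizeSketch('\u201CLed\u201D a 3\u2013person team \u2014 shipped\u200B it');
console.log(result.text, result.total);
```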

## Step 4 — Playwright Rendering

The final stage uses Playwright to convert the processed HTML into a binary PDF buffer.

### Browser Launch and Content Loading

The script launches a headless Chromium instance and loads the normalized HTML using `page.setContent`, passing a `baseURL` value of `file://${dirname(htmlPath)}/`. The base URL is intended to let relative assets (such as profile images) resolve against the template directory.

### Font Loading Verification

Before generating the PDF, the script explicitly waits for web fonts to fully load:

```javascript
await page.evaluate(() => document.fonts.ready);
```

This ensures that **Space Grotesk** and **DM Sans** have finished loading before the page is printed, avoiding fallback font substitution or layout shifts in the final output.

### PDF Generation

The `page.pdf` method produces the output with ATS-optimized settings:

- **Format**: Configurable as `a4` or `letter`
- **Print background**: Enabled (`printBackground: true`) to preserve CSS background colors
- **Margins**: Set to 0.6 inches on all sides for safe printing boundaries

```javascript
const pdf = await page.pdf({
  format: 'a4',
  printBackground: true,
  margin: { top: '0.6in', right: '0.6in', bottom: '0.6in', left: '0.6in' }
});
```

The resulting buffer is written to the output path, producing a PDF that embeds subsets of both font families and contains only the clean, normalized text that ATS parsers extract.
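Put together, the rendering stage might look like the sketch below. The function name `renderPdf` is hypothetical, and Playwright's `chromium` object is passed in as a parameter rather than imported, purely so the flow can be exercised without a browser installed. The `try`/`finally` around the page work is an addition for illustration, not something the source describes.

```javascript
// Sketch of the render stage. `chromium` is Playwright's browser type,
// e.g. obtained via `import { chromium } from 'playwright'` in the caller.
async function renderPdf(chromium, html, { format = 'a4' } = {}) {
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.setContent(html, { waitUntil: 'load' });
    // Wait until the fonts used by the document have finished loading.
    await page.evaluate(() => document.fonts.ready);
    return await page.pdf({
      format,
      printBackground: true,
      margin: { top: '0.6in', right: '0.6in', bottom: '0.6in', left: '0.6in' },
    });
  } finally {
    // Close the browser even if loading or printing throws.
    await browser.close();
  }
}
```

With Playwright installed, a caller would pass the real `chromium` object and write the returned buffer to disk, for example with `writeFile` from `node:fs/promises`.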

## Command-Line Usage

Execute the pipeline from the terminal using Node.js:

```bash
# Generate A4 PDF (default)
node generate-pdf.mjs templates/cv-template.html output/cv.pdf

# Generate US Letter format
node generate-pdf.mjs templates/cv-template.html output/cv-letter.pdf --format=letter
```

Each run outputs a confirmation report displaying the file path, page count, and file size.
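The argument handling implied by the usage above could be sketched as follows. `parseCliArgs` is a hypothetical name and the error messages are illustrative; the source only confirms the two positional paths, the `--format` flag, and the `a4` default.

```javascript
// Hypothetical argument parser mirroring the usage shown above:
//   node generate-pdf.mjs <input.html> <output.pdf> [--format=a4|letter]
function parseCliArgs(argv) {
  const positional = [];
  let format = 'a4'; // default, per the usage example
  for (const arg of argv) {
    if (arg.startsWith('--format=')) {
      format = arg.slice('--format='.length).toLowerCase();
    } else if (!arg.startsWith('--')) {
      positional.push(arg);
    }
  }
  if (!['a4', 'letter'].includes(format)) {
    throw new Error(`Unsupported format "${format}" (expected a4 or letter)`);
  }
  const [inputPath, outputPath] = positional;
  if (!inputPath || !outputPath) {
    throw new Error('Usage: node generate-pdf.mjs <input.html> <output.pdf> [--format=a4|letter]');
  }
  return { inputPath, outputPath, format };
}

console.log(
  parseCliArgs(['templates/cv-template.html', 'output/cv-letter.pdf', '--format=letter'])
);
```

In a real script the input would be `process.argv.slice(2)`, which drops the Node.js executable and script path.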

## Summary

- The **ATS PDF generation pipeline** in santifer/career-ops runs as a four-stage process: template preparation, font path resolution, Unicode normalization, and Playwright rendering.
- **Space Grotesk** and **DM Sans** fonts are self-hosted and referenced via absolute `file://` URLs to ensure reliable embedding.
- The `normalizeTextForATS` function replaces smart quotes and dashes with ASCII equivalents and strips zero-width characters and other Unicode hazards that break legacy ATS parsers.
- Playwright waits for `document.fonts.ready` before PDF generation to guarantee typographic accuracy.
- Output supports both **A4** and **Letter** formats, with fixed 0.6-inch margins and preserved background styling.

## Frequently Asked Questions

### Why does the pipeline rewrite font paths to absolute file:// URLs?

Content loaded through `page.setContent` has no file location of its own, so a relative URL such as `./fonts/...` cannot be relied on to resolve to the template's directory. By rewriting `url('./fonts/...')` to `file://${absolutePath}` in `generate-pdf.mjs` (lines 26–34), the script ensures the browser always finds the Space Grotesk and DM Sans font files regardless of where the Node.js process executes.

### What Unicode characters does normalizeTextForATS remove?

The `normalizeTextForATS` function targets smart quotes (curly apostrophes), em-dashes, en-dashes, zero-width spaces, zero-width joiners, and other non-ASCII glyphs that cause extraction failures in rigid ATS parsers. It replaces these with standard ASCII equivalents (straight quotes, hyphens, and spaces) while logging each transformation for audit purposes.

### How does the pipeline ensure fonts are fully loaded before PDF generation?

After calling `page.setContent`, the script executes `await page.evaluate(() => document.fonts.ready)`. `document.fonts.ready` is a Promise that resolves once the document has finished loading the fonts it uses, including the Space Grotesk and DM Sans variants. This prevents the PDF from rendering with fallback system fonts.

### Can I customize the page format and margins?

Yes. The `generate-pdf.mjs` script accepts a `--format` flag supporting `a4` or `letter` values. Internally, these map to Playwright’s PDF options. Margins are hardcoded to 0.6 inches in the current implementation, though you can modify the `margin` object passed to `page.pdf` in the source code to adjust top, right, bottom, and left spacing.