ATS PDF Generation Pipeline: From HTML Template to Space Grotesk/DM Sans Output
The career-ops repository converts styled HTML CV templates into ATS-compatible PDFs using a Node.js pipeline that rewrites font paths, sanitizes Unicode text, and renders the final document via Playwright with embedded Space Grotesk and DM Sans fonts.
The santifer/career-ops project provides a complete ATS PDF generation pipeline that transforms data-driven HTML templates into printer-ready résumés. Unlike simple HTML-to-PDF converters, this implementation handles font embedding, path resolution, and text normalization, all aimed at getting the document through modern applicant tracking systems while keeping precise typographic control.
Step 1 — HTML CV Template with @font-face Rules
The process begins in templates/cv-template.html, which defines the visual structure and embeds typography via CSS @font-face declarations.
- Space Grotesk serves as the display font for headings.
- DM Sans provides the body text rendering.
The template references local font files using relative paths (./fonts/...) and contains placeholder tokens such as {{NAME}} and {{EXPERIENCE}} that are substituted with user data prior to rendering. The font assets reside in the repository’s fonts/ directory, including specific files like space-grotesk-latin.woff2 and dm-sans-latin.woff2.
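The article does not show how the placeholder tokens are filled, so the following is only a minimal sketch of one way it could work. The helper name fillTemplate and the shape of the data object are hypothetical, and values are inserted verbatim on the assumption that fields such as EXPERIENCE already contain prepared HTML.

```javascript
// Hypothetical helper (not taken from the repository): replaces {{TOKEN}}
// placeholders with values from a plain object.
function fillTemplate(template, data) {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (token, key) =>
    // Unknown tokens are left in place so missing data is easy to spot.
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : token
  );
}

const sample = '<h1>{{NAME}}</h1><section>{{EXPERIENCE}}</section>';
console.log(fillTemplate(sample, { NAME: 'Ada Lovelace' }));
```

Leaving unmatched tokens untouched, rather than replacing them with an empty string, makes an incomplete data file visible in the rendered page.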
Step 2 — Font Resolution and Path Rewriting
Before Playwright processes the HTML, generate-pdf.mjs (lines 26–34) rewrites relative font URLs to absolute file:// paths. This transformation ensures the headless browser can locate self-hosted fonts regardless of the temporary working directory.
The script uses a regex replacement to transform url('./fonts/...') references into absolute file system URLs:
const fontsDir = resolve(__dirname, 'fonts');
html = html.replace(/url\(['"]?\.\/fonts\//g, `url('file://${fontsDir}/`);
This step prevents font-loading failures when the HTML is loaded from memory or a temporary location, where relative ./fonts/ paths would no longer point at the repository.
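As an illustration of the same rewrite, the sketch below wraps it in a standalone function that keeps whichever quote character the template used, so a double-quoted url("./fonts/...") does not end up with mismatched quotes. The function name rewriteFontUrls is an assumption, as is the POSIX-style absolute path; Windows paths would need extra handling.

```javascript
// Hypothetical variant of the rewrite step; not the repository's exact code.
// fontsDir is assumed to be an absolute POSIX path such as /opt/cv/fonts.
function rewriteFontUrls(html, fontsDir) {
  // encodeURI escapes spaces and similar characters in the directory name.
  const base = `file://${encodeURI(fontsDir)}/`;
  // Capture the optional opening quote and reuse it in the replacement.
  return html.replace(/url\((['"]?)\.\/fonts\//g, (_match, quote) => `url(${quote}${base}`);
}

const css = `src: url("./fonts/dm-sans-latin.woff2") format("woff2");`;
console.log(rewriteFontUrls(css, '/opt/cv/fonts'));
```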
Step 3 — ATS Text Normalization
The normalizeTextForATS helper function (lines 34–89 in generate-pdf.mjs) sanitizes the HTML content to prevent ATS parsing failures. The function strips problematic Unicode characters including:
- Smart (curly) quotes and apostrophes
- Em and en dashes
- Zero-width spaces and joiners
- Other non-ASCII glyphs that confuse legacy parsers
All replacements are logged to stderr for debugging purposes, ensuring you can verify exactly which characters were converted to their ASCII-safe equivalents.
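The function body is not reproduced in this article, so the following is a rough sketch of the idea rather than the repository's code. The name normalizeSketch, the exact character table, and the log wording are all assumptions. It also works on plain text, whereas the real helper operates on HTML content.

```javascript
// Assumed replacement table: [pattern, ASCII replacement, label for the log].
const REPLACEMENTS = [
  [/[\u2018\u2019]/g, "'", 'curly single quote'],
  [/[\u201C\u201D]/g, '"', 'curly double quote'],
  [/[\u2013\u2014]/g, '-', 'en/em dash'],
  [/[\u200B\u200C\u200D\uFEFF]/g, '', 'zero-width character'],
];

// Hypothetical stand-in for normalizeTextForATS.
function normalizeSketch(text) {
  const counts = {};
  let out = text;
  for (const [pattern, replacement, label] of REPLACEMENTS) {
    out = out.replace(pattern, () => {
      counts[label] = (counts[label] || 0) + 1; // tally every substitution
      return replacement;
    });
  }
  // Report to stderr so stdout stays free for the normal run output.
  for (const [label, n] of Object.entries(counts)) {
    console.error(`normalized ${n} x ${label}`);
  }
  return { text: out, counts };
}

console.log(normalizeSketch('\u201CLead\u201D \u2014 it\u2019s done').text);
```

Keeping the table as data makes it easy to add or remove a character class without touching the loop.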
Step 4 — Playwright Rendering
The final stage uses Playwright to convert the processed HTML into a binary PDF buffer.
Browser Launch and Content Loading
The script launches a headless Chromium instance and loads the normalized HTML using page.setContent with a baseURL parameter set to file://${dirname(htmlPath)}/. This base URL ensures any relative assets (such as profile images) resolve correctly against the template directory.
Font Loading Verification
Before generating the PDF, the script explicitly waits for web fonts to fully load:
await page.evaluate(() => document.fonts.ready);
This ensures that Space Grotesk and DM Sans have finished loading before the PDF is generated, preventing fallback font substitution or layout shifts in the final output.
PDF Generation
The page.pdf method produces the output with ATS-optimized settings:
- Format: Configurable as a4 or letter
- Print background: Enabled (printBackground: true) to preserve CSS background colors
- Margins: Set to 0.6 inches on all sides for safe printing boundaries
const pdf = await page.pdf({
  format: 'a4',
  printBackground: true,
  margin: { top: '0.6in', right: '0.6in', bottom: '0.6in', left: '0.6in' }
});
The resulting buffer is written to the target path, producing a PDF that embeds subsets of both font families and contains the clean, normalized text that ATS parsers extract.
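The script also accepts a --format flag with the values a4 or letter. How it parses the flag is not shown here, so the sketch below is one plausible approach. The helper name buildPdfOptions and the choice to throw on unsupported values are assumptions; the option values themselves mirror the page.pdf call above.

```javascript
const SUPPORTED_FORMATS = ['a4', 'letter'];

// Hypothetical helper: turns CLI arguments into the options object for page.pdf.
function buildPdfOptions(argv) {
  const flag = argv.find((arg) => arg.startsWith('--format='));
  // Default to A4 when the flag is absent.
  const format = flag ? flag.slice('--format='.length).toLowerCase() : 'a4';
  if (!SUPPORTED_FORMATS.includes(format)) {
    throw new Error(`Unsupported format "${format}"; expected a4 or letter`);
  }
  return {
    format,
    printBackground: true,
    margin: { top: '0.6in', right: '0.6in', bottom: '0.6in', left: '0.6in' },
  };
}

console.log(buildPdfOptions(['templates/cv-template.html', 'output/cv.pdf', '--format=letter']));
```

In a script, the argument list would typically come from process.argv.slice(2).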
Command-Line Usage
Execute the pipeline from the terminal using Node.js:
# Generate A4 PDF (default)
node generate-pdf.mjs templates/cv-template.html output/cv.pdf
# Generate US Letter format
node generate-pdf.mjs templates/cv-template.html output/cv-letter.pdf --format=letter
Each run outputs a confirmation report displaying the file path, page count, and file size.
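The exact wording of that report is not shown in this article. As a rough sketch, the page count can be estimated by scanning the PDF bytes for page objects, and the size taken from the buffer length. The function name describePdf and the message format are assumptions, and the page count is a heuristic that can undercount if page objects are stored inside compressed object streams.

```javascript
// Hypothetical report helper; pdfBuffer is the Buffer returned by page.pdf.
function describePdf(outputPath, pdfBuffer) {
  const raw = pdfBuffer.toString('latin1'); // one character per byte
  // Match "/Type /Page" but not "/Type /Pages" (the page tree node).
  const pages = (raw.match(/\/Type\s*\/Page(?![a-zA-Z])/g) || []).length;
  const sizeKb = (pdfBuffer.length / 1024).toFixed(1);
  return `PDF written: ${outputPath} (${pages} page(s), ${sizeKb} KB)`;
}

// Tiny fake PDF fragment, used only to exercise the counter.
const fake = Buffer.from('<< /Type /Pages /Count 1 >> << /Type /Page >>', 'latin1');
console.log(describePdf('output/cv.pdf', fake));
```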
Summary
- The ATS PDF generation pipeline in santifer/career-ops runs as a four-stage process: template preparation, font path resolution, Unicode normalization, and Playwright rendering.
- Space Grotesk and DM Sans fonts are self-hosted and referenced via absolute file:// URLs to ensure reliable embedding.
- The normalizeTextForATS function removes smart quotes, zero-width characters, and other Unicode hazards that break legacy ATS parsers.
- Playwright waits for document.fonts.ready before PDF generation to guarantee typographic accuracy.
- Output supports both A4 and Letter formats with fixed 0.6-inch margins and preserved background styling.
Frequently Asked Questions
Why does the pipeline rewrite font paths to absolute file:// URLs?
Playwright loads HTML content in a headless Chromium instance that resolves relative URLs against the current working directory, which may differ from the template location. By rewriting url('./fonts/...') to file://${absolutePath} in generate-pdf.mjs (lines 26–34), the script ensures the browser always finds the Space Grotesk and DM Sans font files regardless of where the Node.js process executes.
What Unicode characters does normalizeTextForATS remove?
The normalizeTextForATS function targets smart quotes (curly apostrophes), em-dashes, en-dashes, zero-width spaces, zero-width joiners, and other non-ASCII glyphs that cause extraction failures in rigid ATS parsers. It replaces these with standard ASCII equivalents (straight quotes, hyphens, and spaces) while logging each transformation for audit purposes.
How does the pipeline ensure fonts are fully loaded before PDF generation?
After calling page.setContent, the script executes await page.evaluate(() => document.fonts.ready), which returns a Promise that resolves only when all @font-face resources—including Space Grotesk and DM Sans variants—have finished loading and are active in the document. This prevents the PDF from rendering with fallback system fonts.
Can I customize the page format and margins?
Yes. The generate-pdf.mjs script accepts a --format flag supporting a4 or letter values. Internally, these map to Playwright’s PDF options. Margins are hardcoded to 0.6 inches in the current implementation, though you can modify the margin object passed to page.pdf in the source code to adjust top, right, bottom, and left spacing.