How the PDF Generation Pipeline Creates ATS-Optimized CVs Using Playwright and HTML Templates

The Career-Ops pipeline converts Mustache-templated HTML into ATS-friendly PDFs by normalizing Unicode characters, rendering through headless Chromium, and exporting with fixed margin settings so that the result keeps its visual fidelity and stays compatible with ATS parsers.

The santifer/career-ops repository implements a deterministic workflow that bridges modern web design with applicant tracking system (ATS) requirements. By combining HTML templating with a controlled browser environment, the pipeline produces CVs with consistent typography whose underlying text remains machine-readable.

HTML Template Structure and Placeholder System

Every generated CV starts from templates/cv-template.html. This base layout uses Mustache-style placeholders such as {{NAME}}, {{EXPERIENCE}}, {{PHONE}}, and {{EMAIL}}, which external scripts populate before PDF conversion.

The template includes CSS page-break controls to ensure professional formatting:

<div class="header avoid-break">
  <h1>{{NAME}}</h1>
  <div class="header-gradient"></div>
  <div class="contact-row">
    <span>{{PHONE}}</span> | <span>{{EMAIL}}</span> | 
    <a href="{{LINKEDIN_URL}}">{{LINKEDIN_DISPLAY}}</a> | 
    <span>{{LOCATION}}</span>
  </div>
</div>
...
<div class="section">
  <div class="section-title">{{SECTION_EXPERIENCE}}</div>
  {{EXPERIENCE}}
</div>

The avoid-break class marks blocks, such as the header, that should not be split across a page boundary, so the printed output keeps related content together.
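As a rough illustration of how an external script might populate the placeholders shown above, the sketch below substitutes {{KEY}} tokens from a plain object. The fillTemplate name, the uppercase-key pattern, and the choice to leave unknown placeholders untouched are assumptions made for this example, not the repository's implementation.

```javascript
// Hypothetical helper: replaces {{KEY}} tokens with values from an object.
// Values are inserted as-is (no HTML escaping), since a placeholder such as
// {{EXPERIENCE}} appears to receive a ready-made HTML fragment.
function fillTemplate(template, values) {
  return template.replace(/\{\{([A-Z0-9_]+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : match // leave unknown placeholders visible so they are easy to spot
  );
}

const filled = fillTemplate(
  '<h1>{{NAME}}</h1><span>{{PHONE}}</span>',
  { NAME: 'Ada Lovelace' }
);
console.log(filled);
```

Leaving unmatched tokens in place makes a missing value obvious in the rendered PDF; a stricter script could throw instead.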

Pre-Processing for ATS Compliance

Before rendering, generate-pdf.mjs performs two critical normalization steps on the raw HTML content.

Resolving Font Assets with Absolute file:// URLs

Lines 31-34 of generate-pdf.mjs rewrite relative font references to absolute file:// paths. This conversion ensures the headless Chromium instance can load local font assets from the templates/fonts/ directory without requiring a local HTTP server.
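A minimal sketch of this kind of rewrite is shown below. It assumes the template references fonts as url('./fonts/...') and that the caller already knows the absolute file:// URL of the fonts directory; the absolutizeFontUrls name and the regular expression are illustrative and may differ from what generate-pdf.mjs actually does.

```javascript
// Hypothetical helper: rewrites url('./fonts/<file>') references so they
// point at an absolute file:// location. fontsDirUrl must end with a slash.
function absolutizeFontUrls(html, fontsDirUrl) {
  // Matches url(...) with optional single or double quotes around the path.
  const pattern = /url\((['"]?)\.\/fonts\/([^'")]+)\1\)/g;
  return html.replace(pattern, (_match, _quote, fileName) => {
    // The global URL class resolves the file name against the directory URL
    // and percent-encodes characters such as spaces.
    const absolute = new URL(fileName, fontsDirUrl).href;
    return `url('${absolute}')`;
  });
}

const css = "src: url('./fonts/Inter.woff2') format('woff2');";
console.log(absolutizeFontUrls(css, 'file:///srv/career-ops/templates/fonts/'));
```

The directory URL in the usage line is a made-up path; in a real script it could be derived from the script's own location.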

Unicode Normalization via normalizeTextForATS

The normalizeTextForATS function (lines 34-45) sanitizes content by replacing Unicode characters that can trip up ATS parsers. The routine targets smart quotes, em-dashes, en-dashes, zero-width spaces, non-breaking spaces, bullet glyphs, and currency symbols, converting them to plain ASCII equivalents that downstream parsers interpret reliably.
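To make the idea concrete, here is a small sketch of a replacement table in the same spirit. The function name normalizeForAtsSketch and the exact mapping are assumptions; the real normalizeTextForATS may cover more characters (currency symbols are omitted here) or choose different ASCII substitutes.

```javascript
// Hypothetical replacement table; each entry is [pattern, ASCII substitute].
const ATS_REPLACEMENTS = [
  [/[\u2018\u2019]/g, "'"],       // curly single quotes -> straight quote
  [/[\u201C\u201D]/g, '"'],       // curly double quotes -> straight quote
  [/[\u2013\u2014]/g, '-'],       // en dash and em dash -> hyphen
  [/[\u200B\uFEFF]/g, ''],        // zero-width space and BOM -> removed
  [/\u00A0/g, ' '],               // non-breaking space -> regular space
  [/[\u2022\u25CF\u25AA]/g, '-'], // common bullet glyphs -> hyphen
];

// Applies every replacement in order and returns the normalized string.
function normalizeForAtsSketch(text) {
  return ATS_REPLACEMENTS.reduce(
    (acc, [pattern, substitute]) => acc.replace(pattern, substitute),
    text
  );
}

const sample = '\u201CLead\u201D \u2014 5\u00A0yrs\u200B';
console.log(normalizeForAtsSketch(sample));
```

Keeping the mapping in a table makes it straightforward to add or remove characters as parser problems are discovered.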

Browser Rendering with Playwright

The pipeline leverages Playwright's headless browser to achieve pixel-perfect rendering while maintaining cross-platform consistency.

Launching the Headless Instance

Line 50 initializes the rendering engine:

const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();

This starts a fresh Chromium instance without the user's browser profile or extensions, which reduces the chance of local settings altering the layout.

Content Injection and Font Loading

The pipeline feeds prepared HTML to page.setContent() with an explicit baseURL parameter pointing to the source file directory. This ensures relative resources (images, stylesheets) resolve correctly. Lines 60-61 explicitly await document.fonts.ready to guarantee all web fonts are fully loaded before capture, eliminating flash-of-unstyled-text artifacts in the final output.
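The launch, inject, wait, and capture sequence described above can be sketched as a single function. This is an illustration under stated assumptions rather than the script's actual code: renderHtmlToPdf is a hypothetical name, browserType is expected to be the chromium object exported by the playwright package (passed in so the sketch carries no import of its own), and the base-URL handling is left out on the assumption that font URLs are already absolute.

```javascript
// Hypothetical helper: renders an HTML string to a PDF buffer.
// browserType is assumed to be Playwright's `chromium` export.
async function renderHtmlToPdf(browserType, html, pdfOptions) {
  const browser = await browserType.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait for the load event so stylesheets and images are fetched.
    await page.setContent(html, { waitUntil: 'load' });
    // Block until the page reports that all fonts have finished loading.
    await page.evaluate(async () => {
      await document.fonts.ready;
    });
    return await page.pdf(pdfOptions);
  } finally {
    // Always release the browser process, even if rendering fails.
    await browser.close();
  }
}
```

With Playwright installed, a call might look like renderHtmlToPdf(chromium, html, { format: 'A4', printBackground: true }) after importing chromium from 'playwright'. Closing the browser in a finally block avoids leaving orphaned Chromium processes when a render throws.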

PDF Generation and Output

Once rendering completes, the pipeline exports the document using Playwright's native PDF capabilities.

Configuring PDF Export Parameters

Lines 63-72 configure the page.pdf() call with ATS-friendly settings:

const pdfBuffer = await page.pdf({
  format: 'a4', // or 'letter' via CLI flag
  printBackground: true,
  margin: {
    top: '0.6in',
    bottom: '0.6in',
    left: '0.6in',
    right: '0.6in'
  }
});

The printBackground: true option preserves CSS background colors and the gradient header, which are otherwise omitted from printed output, while the uniform 0.6in margins keep content clear of the page edge whether the CV is printed or viewed on screen.
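One way to derive these options from a format string is sketched below. The buildPdfOptions helper and its validation behavior are assumptions for illustration; the article only states that the format comes from a CLI flag and that margins are 0.6in on every side.

```javascript
// Hypothetical mapping from CLI format strings to Playwright paper names.
const PAPER_FORMATS = new Map([
  ['a4', 'A4'],
  ['letter', 'Letter'],
]);

// Builds the options object for page.pdf(); throws on unknown formats.
function buildPdfOptions(format = 'a4') {
  const paper = PAPER_FORMATS.get(String(format).toLowerCase());
  if (!paper) {
    throw new Error(`Unsupported format: ${format}`);
  }
  const margin = '0.6in'; // same margin on all four sides
  return {
    format: paper,
    printBackground: true, // keep CSS backgrounds and gradients
    margin: { top: margin, bottom: margin, left: margin, right: margin },
  };
}

console.log(buildPdfOptions('letter'));
```

Rejecting unknown formats early gives a clearer error than letting the value reach the browser.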

Writing Output and Page Count Validation

The resulting binary is written to disk via writeFile (lines 77-78). A heuristic in lines 80-83 reads the PDF buffer to calculate and log the page count, providing immediate feedback on whether the CV fits target length constraints.
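The article does not spell out the heuristic, so the sketch below shows one common approach rather than the script's actual logic: counting /Type /Page object markers in the raw bytes. The estimatePageCount name is hypothetical, and the count can come out low when a PDF stores its page objects inside compressed object streams.

```javascript
// Hypothetical heuristic: counts "/Type /Page" markers in a PDF buffer.
// The negative lookahead skips "/Type /Pages", which is the page tree node.
function estimatePageCount(pdfBuffer) {
  // latin1 maps each byte to one character, so binary data decodes safely.
  const text = pdfBuffer.toString('latin1');
  const matches = text.match(/\/Type\s*\/Page(?![a-zA-Z])/g);
  return matches ? matches.length : 0;
}

const fakePdf = Buffer.from(
  '<< /Type /Pages /Count 2 >> << /Type /Page >> << /Type/Page /Parent 1 0 R >>',
  'latin1'
);
console.log(estimatePageCount(fakePdf));
```

Because this is only an estimate, it suits a log message better than a hard validation step.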

Running the Pipeline

Execute the generator from the command line:

node career-ops/generate-pdf.mjs output/cv.html output/cv.pdf --format=a4

The script accepts the input HTML path, output PDF path, and optional format specification (defaults to A4).
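Argument handling of this shape can be sketched in a few lines. The parseCliArgs name, the usage message, and the decision to ignore unrecognized flags are assumptions for this example; only the two positional paths, the --format flag, and the A4 default come from the description above.

```javascript
// Hypothetical parser for: <input.html> <output.pdf> [--format=a4|letter]
function parseCliArgs(argv) {
  const positional = [];
  let format = 'a4'; // default when no --format flag is given
  for (const arg of argv) {
    if (arg.startsWith('--format=')) {
      format = arg.slice('--format='.length).toLowerCase();
    } else if (!arg.startsWith('--')) {
      positional.push(arg);
    }
    // Other flags are ignored in this sketch.
  }
  const [inputPath, outputPath] = positional;
  if (!inputPath || !outputPath) {
    throw new Error('Usage: generate-pdf.mjs <input.html> <output.pdf> [--format=a4|letter]');
  }
  return { inputPath, outputPath, format };
}

console.log(parseCliArgs(['output/cv.html', 'output/cv.pdf', '--format=letter']));
```

In a script, the argument list would typically be process.argv.slice(2), which drops the node binary and script path.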

Summary

  • HTML templates in templates/cv-template.html provide the structural foundation with Mustache placeholders for dynamic content.
  • ATS normalization via normalizeTextForATS (lines 34-45) strips problematic Unicode characters before rendering.
  • Playwright's headless Chromium ensures browser-accurate rendering with explicit font loading waits at document.fonts.ready (lines 60-61).
  • PDF export preserves visual styling through printBackground: true while keeping the text layer parsable by ATS software.
  • The pipeline supports both A4 and Letter formats via CLI flags passed to page.pdf().

Frequently Asked Questions

Why does the pipeline convert font URLs to file:// paths?

Relative font paths fail to resolve in headless Chromium when loading local HTML files directly. Converting to absolute file:// URLs in lines 31-34 ensures font assets load correctly without requiring a local HTTP server, guaranteeing consistent typography across different deployment environments.

What specific Unicode characters does normalizeTextForATS target?

The function replaces smart quotes (curly quotes), em-dashes, en-dashes, zero-width spaces, non-breaking spaces, various bullet glyphs, and currency symbols. According to the implementation in lines 34-45, these characters are converted to ASCII equivalents like straight quotes and hyphens that ATS parsers interpret reliably.

How does the pipeline prevent fonts from rendering incorrectly in the exported PDF?

After injecting HTML via page.setContent(), the script explicitly awaits document.fonts.ready (lines 60-61) before capturing the PDF. This ensures all @font-face resources referenced in templates/cv-template.html are fully decoded and applied, preventing fallback font substitution in the final output.

Can I generate multiple page sizes without modifying the source code?

Yes. The CLI accepts a --format parameter (e.g., --format=letter) that passes directly to Playwright's page.pdf() method at lines 63-72. Valid options include standard paper sizes like a4 and letter, allowing dynamic adaptation to regional application requirements without code changes.
