How the PDF Generation Pipeline Creates ATS-Optimized CVs Using Playwright and HTML Templates
The Career-Ops pipeline converts Mustache-templated HTML into ATS-friendly PDFs by normalizing Unicode characters, rendering through headless Chromium, and exporting with fixed margin settings, so the output keeps its visual design while remaining readable by parsers.
The santifer/career-ops repository implements a deterministic workflow that bridges modern web design with applicant tracking system (ATS) requirements. By combining HTML templating with a controlled browser environment, the pipeline produces CVs with consistent typography whose underlying text remains machine-readable.
HTML Template Structure and Placeholder System
The foundation of every generated CV rests in templates/cv-template.html. This base layout uses Mustache-style placeholders such as {{NAME}}, {{EXPERIENCE}}, {{PHONE}}, and {{EMAIL}} that external scripts populate before PDF conversion.
The template marks blocks that should not be split across pages with an avoid-break class, which the stylesheet's page-break rules target:
<div class="header avoid-break">
  <h1>{{NAME}}</h1>
  <div class="header-gradient"></div>
  <div class="contact-row">
    <span>{{PHONE}}</span> | <span>{{EMAIL}}</span> |
    <a href="{{LINKEDIN_URL}}">{{LINKEDIN_DISPLAY}}</a> |
    <span>{{LOCATION}}</span>
  </div>
</div>
...
<div class="section">
  <div class="section-title">{{SECTION_EXPERIENCE}}</div>
  {{EXPERIENCE}}
</div>
The avoid-break class keeps marked blocks, such as the header, from being split across a page boundary, so the printed output respects document flow.
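The placeholder substitution itself happens outside the template. As a minimal sketch of what such a step could look like, assuming a plain string replace over `{{KEY}}` tokens (the `fillTemplate` helper and the sample values are hypothetical, not taken from the repository):

```javascript
// Hypothetical helper, not part of career-ops: replaces {{KEY}} tokens
// with values from a plain object.
function fillTemplate(template, values) {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match, key) =>
    // Leave unknown placeholders untouched so missing data is easy to spot.
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
  );
}

const snippet = '<h1>{{NAME}}</h1><span>{{EMAIL}}</span><span>{{LOCATION}}</span>';
const filled = fillTemplate(snippet, {
  NAME: 'Ada Lovelace',
  EMAIL: 'ada@example.com',
});

console.log(filled);
// <h1>Ada Lovelace</h1><span>ada@example.com</span><span>{{LOCATION}}</span>
```

Leaving unmatched tokens in place is a design choice for this sketch: a stray `{{LOCATION}}` in the rendered PDF is easier to notice than a silently empty field.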
Pre-Processing for ATS Compliance
Before rendering, generate-pdf.mjs performs two critical normalization steps on the raw HTML content.
Resolving Font Assets with Absolute file:// URLs
Lines 31-34 of generate-pdf.mjs rewrite relative font references to absolute file:// paths. This conversion ensures the headless Chromium instance can load local font assets from the templates/fonts/ directory without requiring a local HTTP server.
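One way to picture this rewrite is a regex pass over the `url(...)` references. The sketch below is an assumption about the general technique, not the repository's code: the function name, the regex, and the example base path are all hypothetical.

```javascript
// Hypothetical sketch: rewrites url('./fonts/...') references to absolute
// file:// URLs. baseUrl must be a file:// URL ending in a slash.
function absolutizeFontUrls(html, baseUrl) {
  // Matches url('./fonts/x'), url("fonts/x") and url(fonts/x).
  const fontRef = /url\((['"]?)(?:\.\/)?(fonts\/[^'")]+)\1\)/g;
  return html.replace(fontRef, (_match, quote, relPath) =>
    // The global URL class resolves relPath against the base directory.
    `url(${quote}${new URL(relPath, baseUrl).href}${quote})`
  );
}

const css = "@font-face { src: url('./fonts/Inter.woff2'); }";
const rewritten = absolutizeFontUrls(css, 'file:///opt/career-ops/templates/');

console.log(rewritten);
// @font-face { src: url('file:///opt/career-ops/templates/fonts/Inter.woff2'); }
```

In a real script the base URL would more likely be derived from the template directory with `pathToFileURL` from `node:url`, taking care that the resulting URL ends with a trailing slash so the last path segment is not dropped during resolution.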
Unicode Normalization via normalizeTextForATS
The normalizeTextForATS function (lines 34-45) sanitizes content by replacing Unicode characters that break ATS parsers. The routine targets smart quotes, em-dashes, zero-width spaces, non-breaking spaces, bullet glyphs, and currency symbols, converting them to plain ASCII equivalents that downstream parsers reliably interpret.
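To make the idea concrete, here is an illustrative sketch of this kind of normalization. The replacement table is an assumption: the real normalizeTextForATS may cover more or different characters (currency symbols are left out here because their ASCII mapping is not described), and `normalizeSketch` is a hypothetical name.

```javascript
// Illustrative mapping only; not the repository's actual table.
const ATS_REPLACEMENTS = [
  [/[\u2018\u2019]/g, "'"], // curly single quotes
  [/[\u201C\u201D]/g, '"'], // curly double quotes
  [/[\u2013\u2014]/g, '-'], // en and em dashes
  [/\u00A0/g, ' '],         // non-breaking space
  [/[\u200B\uFEFF]/g, ''],  // zero-width space and byte-order mark
  [/\u2022/g, '-'],         // bullet glyph
];

// Hypothetical stand-in for normalizeTextForATS: applies each rule in order.
function normalizeSketch(text) {
  return ATS_REPLACEMENTS.reduce(
    (acc, [pattern, ascii]) => acc.replace(pattern, ascii),
    text
  );
}

const raw = '\u201CLead\u201D \u2014 2019\u20132024\u00A0\u2022 R\u200BD';
console.log(normalizeSketch(raw));
// "Lead" - 2019-2024 - RD
```

Keeping the rules in a table makes it straightforward to audit which characters are rewritten and to add new ones when a parser trips over something unexpected.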
Browser Rendering with Playwright
The pipeline uses Playwright's headless Chromium so that the PDF matches what the browser's print engine would produce, with consistent results across platforms.
Launching the Headless Instance
Line 50 initializes the rendering engine:
const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();
This starts a fresh Chromium instance with no user profile or browser extensions that might alter layout.
Content Injection and Font Loading
The pipeline feeds prepared HTML to page.setContent() with an explicit baseURL parameter pointing to the source file directory. This ensures relative resources (images, stylesheets) resolve correctly. Lines 60-61 explicitly await document.fonts.ready to guarantee all web fonts are fully loaded before capture, eliminating flash-of-unstyled-text artifacts in the final output.
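The wait-for-fonts step can be sketched as a small helper. This is an assumption about the shape of the logic rather than the script's actual code: `prepareForCapture` is a hypothetical name, `page` is assumed to be a Playwright Page, and the base-URL handling described above is omitted so that only the waiting logic is shown.

```javascript
// Hypothetical helper: injects HTML and waits until fonts have settled.
// `page` is assumed to be a Playwright Page object.
async function prepareForCapture(page, html) {
  // Resolve once the document's load event has fired.
  await page.setContent(html, { waitUntil: 'load' });

  // Runs inside the browser context: document.fonts.ready resolves when
  // pending font loads and the resulting layout work have finished.
  await page.evaluate(async () => {
    await document.fonts.ready;
  });
}
```

It would be called between `browser.newPage()` and `page.pdf()`. Awaiting inside the evaluated function, rather than returning the promise's value, avoids sending a FontFaceSet object back across the browser boundary.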
PDF Generation and Output
Once rendering completes, the pipeline exports the document using Playwright's native PDF capabilities.
Configuring PDF Export Parameters
Lines 63-72 configure the page.pdf() call with ATS-friendly settings:
const pdfBuffer = await page.pdf({
  format: 'a4', // or 'letter' via CLI flag
  printBackground: true,
  margin: {
    top: '0.6in',
    bottom: '0.6in',
    left: '0.6in',
    right: '0.6in'
  }
});
The printBackground: true option preserves CSS color definitions and gradient headers, while generous margins ensure content remains clear when printed or viewed digitally.
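Because the format comes from the command line, it can help to validate it before it reaches page.pdf(). The sketch below shows one way to build the options object; `buildPdfOptions` and its allow-list are hypothetical, and the 0.6in margins simply mirror the values shown above.

```javascript
// Hypothetical helper: builds the options object for page.pdf().
// Maps user input to the paper format names used in Playwright's docs.
const PAPER_FORMATS = { a4: 'A4', letter: 'Letter' };

function buildPdfOptions(format = 'a4') {
  const key = String(format).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(PAPER_FORMATS, key)) {
    throw new Error(`Unsupported format: ${format}`);
  }
  const margin = '0.6in'; // same margin on all four sides
  return {
    format: PAPER_FORMATS[key],
    printBackground: true, // keep CSS backgrounds and gradients
    margin: { top: margin, bottom: margin, left: margin, right: margin },
  };
}

console.log(buildPdfOptions('letter').format);
// Letter
```

Rejecting unknown formats early gives a clearer error than letting an arbitrary string travel all the way to the browser.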
Writing Output and Page Count Validation
The resulting binary is written to disk via writeFile (lines 77-78). A heuristic in lines 80-83 reads the PDF buffer to calculate and log the page count, providing immediate feedback on whether the CV fits target length constraints.
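A page-count heuristic of this kind usually scans the raw PDF bytes for page objects. The following is a sketch of that general approach, not the repository's implementation: `estimatePageCount` is a hypothetical name, and the method will undercount if page objects are stored inside compressed object streams.

```javascript
// Hypothetical heuristic: counts page objects in a PDF buffer.
function estimatePageCount(pdfBuffer) {
  // latin1 maps every byte to one character, so binary data decodes safely.
  const text = pdfBuffer.toString('latin1');
  // Match "/Type /Page" but not "/Type /Pages" (the page-tree node).
  const matches = text.match(/\/Type\s*\/Page(?!\w)/g);
  return matches ? matches.length : 0;
}

// Tiny hand-written fragment standing in for real PDF bytes.
const fakePdf = Buffer.from(
  '1 0 obj << /Type /Pages /Count 2 >>\n' +
  '2 0 obj << /Type /Page >>\n' +
  '3 0 obj << /Type/Page >>\n'
);

console.log(estimatePageCount(fakePdf));
// 2
```

For a CV, where the goal is typically to confirm a one- or two-page result, a rough count like this is enough to flag overflow without pulling in a PDF parsing library.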
Running the Pipeline
Execute the generator from the command line:
node career-ops/generate-pdf.mjs output/cv.html output/cv.pdf --format=a4
The script accepts the input HTML path, output PDF path, and optional format specification (defaults to A4).
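The argument handling described here can be sketched in a few lines. This is an assumed shape, not the script's actual parser: `parseCliArgs` and its usage message are hypothetical.

```javascript
// Hypothetical parser for: <input.html> <output.pdf> [--format=a4|letter]
function parseCliArgs(argv) {
  const positional = argv.filter((arg) => !arg.startsWith('--'));
  const formatFlag = argv.find((arg) => arg.startsWith('--format='));
  const [inputPath, outputPath] = positional;

  if (!inputPath || !outputPath) {
    throw new Error('Usage: generate-pdf.mjs <input.html> <output.pdf> [--format=a4|letter]');
  }
  return {
    inputPath,
    outputPath,
    // Fall back to A4 when no --format flag is given.
    format: formatFlag ? formatFlag.slice('--format='.length).toLowerCase() : 'a4',
  };
}

// In a script, the real arguments would come from process.argv.slice(2).
console.log(parseCliArgs(['output/cv.html', 'output/cv.pdf', '--format=letter']));
```

Filtering flags out of the positional list means `--format` can appear before, between, or after the two paths.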
Summary
- HTML templates in templates/cv-template.html provide the structural foundation with Mustache placeholders for dynamic content.
- ATS normalization via normalizeTextForATS (lines 34-45) strips problematic Unicode characters before rendering.
- Playwright's headless Chromium ensures browser-accurate rendering with explicit font loading waits at document.fonts.ready (lines 60-61).
- PDF export preserves visual styling through printBackground: true while maintaining plain-text parsability for ATS systems.
- The pipeline supports both A4 and Letter formats via CLI flags passed to page.pdf().
Frequently Asked Questions
Why does the pipeline convert font URLs to file:// paths?
Relative font paths fail to resolve in headless Chromium when loading local HTML files directly. Converting to absolute file:// URLs in lines 31-34 ensures font assets load correctly without requiring a local HTTP server, guaranteeing consistent typography across different deployment environments.
What specific Unicode characters does normalizeTextForATS target?
The function replaces smart quotes (curly quotes), em-dashes, en-dashes, zero-width spaces, non-breaking spaces, various bullet glyphs, and currency symbols. According to the implementation in lines 34-45, these characters are converted to ASCII equivalents like straight quotes and hyphens that ATS parsers interpret reliably.
How does the pipeline prevent fonts from rendering incorrectly in the exported PDF?
After injecting HTML via page.setContent(), the script explicitly awaits document.fonts.ready (lines 60-61) before capturing the PDF. This ensures all @font-face resources referenced in templates/cv-template.html are fully decoded and applied, preventing fallback font substitution in the final output.
Can I generate multiple page sizes without modifying the source code?
Yes. The CLI accepts a --format parameter (e.g., --format=letter) that passes directly to Playwright's page.pdf() method at lines 63-72. Valid options include standard paper sizes like a4 and letter, allowing dynamic adaptation to regional application requirements without code changes.