How Greenhouse, Lever, and Ashby Provider Scripts Implement the fetch() Interface in Career Ops

The greenhouse.mjs, lever.mjs, and ashby.mjs modules each implement a standardized asynchronous fetch(entry, ctx) method. Each method contacts its provider's ATS API, normalizes the response, and returns an array of Job objects. All three rely on the shared HTTP utilities in providers/_http.mjs for consistent timeout and redirect handling.

The santifer/career-ops repository employs a modular provider architecture to unify job board scraping across different Applicant Tracking Systems. Each provider adheres to a strict Provider contract defined in providers/_types.js, exporting a default object with a standardized fetch() interface that enables scan.mjs to retrieve normalized job data regardless of the underlying platform.

The Provider Contract Structure

According to the type definitions in providers/_types.js, every provider module must export an object containing three key properties:

  • id: A unique string identifier (e.g., "greenhouse", "lever", or "ashby")
  • detect(entry): An optional function that accepts a PortalEntry and returns a hint object when the entry matches the provider's URL patterns
  • async fetch(entry, ctx): The core method that accepts a PortalEntry and an HTTP context, then returns a Promise resolving to an array of Job objects with the shape {title, url, company, location}
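The contract above can be sketched as a small runtime shape check. This is a minimal sketch, not code from the repository: the JSDoc typedefs are a guess at what providers/_types.js describes, and isProvider and sampleProvider are hypothetical names introduced here.

```javascript
/**
 * Illustrative typedefs only; the real definitions live in providers/_types.js
 * and may carry more fields than shown here.
 * @typedef {{name: string, careers_url?: string, api?: string}} PortalEntry
 * @typedef {{title: string, url: string, company: string, location: string}} Job
 */

// Hypothetical helper: returns true when an object looks like a provider
// (non-empty id string, optional detect function, required fetch function).
function isProvider(candidate) {
  if (candidate === null || typeof candidate !== 'object') return false;
  if (typeof candidate.id !== 'string' || candidate.id.length === 0) return false;
  if (candidate.detect !== undefined && typeof candidate.detect !== 'function') return false;
  return typeof candidate.fetch === 'function';
}

// Hypothetical provider used only to exercise the check.
const sampleProvider = {
  id: 'example',
  detect: (entry) => (entry.careers_url?.includes('example.com') ? { matched: true } : null),
  async fetch(entry, ctx) {
    return []; // a real provider would return Job objects here
  },
};

console.log(isProvider(sampleProvider));
```

A check like this could run once at load time so that a malformed module fails early rather than in the middle of a scan.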

Shared HTTP Context and Dependencies

All provider implementations rely on the ctx object created by makeHttpCtx() in providers/_http.mjs. This context provides:

  • fetchJson(url, options): Wraps fetch() with AbortController-based timeouts, default User-Agent headers, and JSON parsing
  • fetchText(url, options): Similar wrapper for text responses
  • Automatic timeout handling: Prevents hung connections through configurable AbortController signals
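To make the timeout behavior concrete, here is a minimal sketch of how such a context could be built around AbortController. The name makeHttpCtxSketch, the 10-second default, the User-Agent string, and the injectable fetchImpl parameter are all assumptions for illustration; the real makeHttpCtx() in providers/_http.mjs may use different options and defaults.

```javascript
// Hypothetical sketch of a context factory; not the repository's makeHttpCtx().
function makeHttpCtxSketch({
  timeoutMs = 10000,               // assumed default, not taken from the repo
  userAgent = 'career-ops-sketch', // assumed User-Agent value
  fetchImpl = globalThis.fetch,    // injectable so the sketch can be tested offline
} = {}) {
  async function request(url, options, parse) {
    const controller = new AbortController();
    // Abort the request (including body parsing) once the timeout elapses.
    const timer = setTimeout(() => controller.abort(), options.timeout ?? timeoutMs);
    try {
      const res = await fetchImpl(url, {
        redirect: options.redirect ?? 'follow',
        headers: { 'User-Agent': userAgent, ...options.headers },
        signal: controller.signal,
      });
      if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
      return await parse(res);
    } finally {
      clearTimeout(timer); // always release the timer
    }
  }

  return {
    fetchJson: (url, options = {}) => request(url, options, (res) => res.json()),
    fetchText: (url, options = {}) => request(url, options, (res) => res.text()),
  };
}
```

Keeping the parse step inside the try block means the timeout also covers reading the response body, not just receiving the headers.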

The fetch() implementations follow a consistent five-step pattern: derive the API endpoint using resolveApiUrl, optionally validate the URL (Greenhouse only), perform the HTTP request via ctx, transform the raw payload into normalized Job objects, and return the array or an empty array on error.
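The five steps can be sketched as one generic function. This is an illustration under stated assumptions: fetchSketch, its validate and toJob hooks, and the local resolveApiUrl stand-in are hypothetical, since the article does not show the real resolveApiUrl signature.

```javascript
// Hypothetical stand-in for the repo's resolveApiUrl; the real one may differ.
function resolveApiUrl(entry) {
  return entry.api ?? entry.careers_url ?? null;
}

// Generic skeleton of the five-step pattern, with provider-specific
// behavior supplied through the (hypothetical) validate and toJob hooks.
async function fetchSketch(entry, ctx, { validate, toJob }) {
  const apiUrl = resolveApiUrl(entry);               // 1. derive the endpoint
  if (!apiUrl) return [];
  if (validate && !validate(apiUrl)) return [];      // 2. optional URL validation
  try {
    const payload = await ctx.fetchJson(apiUrl);     // 3. HTTP request via ctx
    const raw = Array.isArray(payload) ? payload : payload?.jobs ?? [];
    return raw.map((item) => toJob(item, entry));    // 4. normalize to Job objects
  } catch {
    return [];                                       // 5. empty array on error
  }
}
```

Swallowing errors into an empty array keeps one failing board from aborting a whole scan, at the cost of hiding the cause unless the provider also logs it.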

Greenhouse Implementation (greenhouse.mjs)

In providers/greenhouse.mjs, the Greenhouse provider implements strict security measures before fetching data. The fetch() method first validates the URL against an allow-list of trusted hostnames to prevent SSRF attacks. It then invokes ctx.fetchJson with redirect: 'error' explicitly set to block malicious redirects, extracts the jobs array from the JSON response, and maps each entry to the normalized Job format.

// Conceptual implementation pattern from greenhouse.mjs
const response = await ctx.fetchJson(apiUrl, { redirect: 'error' });
// Guard against a missing or malformed jobs array before mapping
const rawJobs = Array.isArray(response?.jobs) ? response.jobs : [];
const jobs = rawJobs.map(job => ({
  title: job.title,
  url: job.absolute_url,
  company: entry.name,
  location: job.location?.name || 'Remote'
}));
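The allow-list check that precedes the request could look like the sketch below. The TRUSTED_HOSTS contents and the isTrustedUrl name are assumptions; this article does not list the hostnames that greenhouse.mjs actually trusts.

```javascript
// Assumed allow-list for illustration; the real list in greenhouse.mjs may differ.
const TRUSTED_HOSTS = new Set(['boards-api.greenhouse.io']);

// Hypothetical helper: accept only well-formed HTTPS URLs whose exact
// hostname is on the allow-list.
function isTrustedUrl(rawUrl, trustedHosts = TRUSTED_HOSTS) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  return parsed.protocol === 'https:' && trustedHosts.has(parsed.hostname);
}

console.log(isTrustedUrl('https://boards-api.greenhouse.io/v1/boards/example/jobs'));
console.log(isTrustedUrl('https://boards-api.greenhouse.io.evil.test/jobs'));
```

Comparing the parsed hostname exactly, rather than testing the raw string with includes() or startsWith(), avoids look-alike hosts and userinfo tricks such as https://boards-api.greenhouse.io@evil.test/.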

Lever Implementation (lever.mjs)

The Lever provider in providers/lever.mjs takes a streamlined approach suitable for Lever's stable API. It constructs the endpoint URL from the careers_url or api property of the PortalEntry, calls ctx.fetchJson without restrictive redirect policies, expects a direct array response (or an object containing postings), and maps fields like text and hostedUrl to the standardized Job schema.

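// Example mapping from lever.mjs
const payload = await ctx.fetchJson(apiUrl);
// Accept either a bare array or an object that wraps the array in `postings`
const postings = Array.isArray(payload) ? payload : (payload?.postings ?? []);
return postings.map(posting => ({
  title: posting.text,
  url: posting.hostedUrl,
  company: entry.name,
  location: posting.categories?.location || 'Remote'
}));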
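Deriving the endpoint from careers_url or api might look like the following sketch. The leverApiUrl name is hypothetical, and the mapping from a jobs.lever.co board URL to Lever's public postings endpoint is an assumption about what the provider does; the repository's own URL resolution may handle more cases.

```javascript
// Hypothetical helper: prefer an explicit api value, otherwise derive the
// public postings endpoint from a jobs.lever.co board URL.
function leverApiUrl(entry) {
  if (entry.api) return entry.api;
  let parsed;
  try {
    parsed = new URL(entry.careers_url);
  } catch {
    return null; // missing or malformed careers_url
  }
  if (parsed.hostname !== 'jobs.lever.co') return null;
  // The first path segment is the company slug, e.g. /example or /example/123.
  const slug = parsed.pathname.split('/').filter(Boolean)[0];
  if (!slug) return null;
  return `https://api.lever.co/v0/postings/${encodeURIComponent(slug)}?mode=json`;
}

console.log(leverApiUrl({ name: 'Example Corp', careers_url: 'https://jobs.lever.co/example' }));
```

Returning null for anything that is not a recognizable Lever board lets the caller fall back to an empty result instead of requesting an arbitrary URL.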

Ashby Implementation (ashby.mjs)

The Ashby provider in providers/ashby.mjs handles a slower, rate-limited public API through aggressive resilience patterns. The fetch() implementation configures a 30-second timeout (longer than the default) and wraps the ctx.fetchJson call in a retry loop with exponential back-off and jitter. This compensates for Ashby's strict rate limits while still delivering normalized Job arrays identical in shape to those from Greenhouse and Lever.

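// Ashby's resilience pattern (simplified)
const MAX_RETRIES = 3;
// Small promise-based sleep helper used between attempts
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
  try {
    const data = await ctx.fetchJson(url, { timeout: 30000 });
    return data.jobs.map(/* normalization logic */);
  } catch (err) {
    // Skip the wait after the final attempt
    if (attempt === MAX_RETRIES) break;
    // Exponential back-off (2s, then 4s) plus up to 1s of random jitter
    await delay(Math.pow(2, attempt) * 1000 + Math.random() * 1000);
  }
}
return []; // every attempt failed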
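The same back-off idea can be factored into a reusable helper whose sleep and random sources are injectable, which makes the schedule testable without real waiting. This is a sketch under assumptions: backoffDelay, withRetry, and their option names are hypothetical and do not come from ashby.mjs.

```javascript
// Delay before the next try: baseMs * 2^attempt plus up to jitterMs of jitter.
function backoffDelay(attempt, { baseMs = 1000, jitterMs = 1000, random = Math.random } = {}) {
  return baseMs * 2 ** attempt + random() * jitterMs;
}

// Hypothetical helper: run task up to maxRetries times, sleeping between
// failed attempts and rethrowing the last error if every attempt fails.
async function withRetry(task, {
  maxRetries = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  ...backoff
} = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      // No point waiting after the final attempt.
      if (attempt < maxRetries) await sleep(backoffDelay(attempt, backoff));
    }
  }
  throw lastError;
}
```

A provider could call it as withRetry(() => ctx.fetchJson(url, { timeout: 30000 })) and convert a final rejection into an empty array. Jitter spreads retries from concurrent scans so they do not hit a rate-limited endpoint at the same moment.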

Orchestration in scan.mjs

The scan.mjs entry point dynamically loads all *.mjs files from the providers/ directory (excluding files prefixed with _), iterates through the loaded modules to find a provider whose detect() method matches the PortalEntry, and then invokes that provider's fetch(entry, ctx) method to retrieve the final job list.

import { makeHttpCtx } from './providers/_http.mjs';

async function scanEntry(entry, providers) {
  const ctx = makeHttpCtx();
  
  // Auto-detect appropriate provider
  const provider = providers.find(p => p.detect?.(entry));
  if (!provider) throw new Error('No provider matched entry');
  
  // Execute standardized fetch
  const jobs = await provider.fetch(entry, ctx);
  return jobs;
}
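The dynamic loading step described above is not shown in the snippet, so here is one way it could work. The names isProviderFile and loadProviders are hypothetical, and readdir and importModule are passed in as parameters purely so the sketch can run without touching the file system; scan.mjs itself may load modules differently.

```javascript
// Hypothetical filter: provider modules are *.mjs files without a leading underscore.
function isProviderFile(fileName) {
  return fileName.endsWith('.mjs') && !fileName.startsWith('_');
}

// Hypothetical loader. `readdir` lists file names in `dir`; `importModule`
// resolves a path to a module namespace object (for example via import()).
async function loadProviders(dir, readdir, importModule) {
  const files = (await readdir(dir)).filter(isProviderFile).sort();
  const modules = await Promise.all(files.map((file) => importModule(`${dir}/${file}`)));
  return modules
    .map((mod) => mod.default)
    // Keep only default exports that expose the required fetch() method.
    .filter((provider) => provider && typeof provider.fetch === 'function');
}
```

In a real script the injected functions would likely be readdir from node:fs/promises and a wrapper around import(); note that import() resolves relative paths against the importing module, so an absolute path or file URL is the safer input.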

Direct Provider Usage Examples

You can invoke any provider directly without the scanner orchestration:

import greenhouseProvider from './providers/greenhouse.mjs';
import leverProvider from './providers/lever.mjs';
import ashbyProvider from './providers/ashby.mjs';
import { makeHttpCtx } from './providers/_http.mjs';

const entry = {
  name: 'Example Corp',
  careers_url: 'https://jobs.lever.co/example'
};

const ctx = makeHttpCtx();

// Lever fetch
const leverJobs = await leverProvider.fetch(entry, ctx);
console.log('Lever:', leverJobs);

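// Ashby fetch with built-in retry logic.
// In practice, pass an entry whose careers_url or api points at an Ashby board;
// the Lever entry above is reused here only to show the call shape.
const ashbyJobs = await ashbyProvider.fetch(entry, ctx);

// Greenhouse fetch with SSRF protection.
// A non-Greenhouse URL like the one above would be expected to fail the
// allow-list check and produce an empty array.
const greenhouseJobs = await greenhouseProvider.fetch(entry, ctx);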

Summary

  • All three providers implement the fetch(entry, ctx) interface defined in providers/_types.js, accepting a PortalEntry and returning normalized Job arrays
  • Greenhouse prioritizes security with URL allow-lists and redirect: 'error' policies to prevent SSRF attacks
  • Lever uses straightforward HTTP requests with direct array mapping for reliable, fast APIs
  • Ashby implements 30-second timeouts and exponential back-off retry loops to handle rate-limited public endpoints
  • Each provider depends on makeHttpCtx() from providers/_http.mjs for consistent fetchJson(), timeout handling, and User-Agent configuration
  • scan.mjs orchestrates provider selection through the detect() method and executes fetch() to unify job data retrieval

Frequently Asked Questions

What is the exact signature of the fetch() method in these provider scripts?

The method signature is async fetch(entry, ctx), where entry is a PortalEntry object containing properties like careers_url or api, and ctx is the HTTP context created by makeHttpCtx() in providers/_http.mjs. It returns a Promise that resolves to an array of Job objects shaped as {title, url, company, location}, or an empty array if the request fails.

How does error handling differ between the Greenhouse and Ashby implementations?

Greenhouse validates URLs against an allow-list and uses redirect: 'error' to fail fast on suspicious redirects, returning an empty array on validation or network errors. Ashby, conversely, implements a retry loop with exponential back-off and jitter specifically within its fetch() method to handle transient failures and rate limits from the slower Ashby API before eventually returning the job data or an empty array.

Can I implement a custom provider using the same fetch() interface?

Yes. A new provider module must export a default object that follows the contract in providers/_types.js: an id string, an optional detect(entry) function for auto-discovery, and the async fetch(entry, ctx) method. Use ctx.fetchJson() or ctx.fetchText() from the shared HTTP utilities so that timeout handling and the default User-Agent header stay consistent across the codebase.
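As a rough illustration, a custom provider for a made-up ATS could look like this. Everything specific here is an assumption: the jobs.example.com host, the shape of the hint object returned by detect(), and the positions, name, link, and city payload fields are invented for the example. In a real module the object would be the file's default export.

```javascript
// Hypothetical provider for a fictional ATS hosted at jobs.example.com.
const exampleAtsProvider = {
  id: 'example-ats',

  // Return a hint object when the entry's URL belongs to this ATS, else null.
  detect(entry) {
    try {
      const { hostname } = new URL(entry.careers_url);
      return hostname === 'jobs.example.com' ? { provider: 'example-ats' } : null;
    } catch {
      return null; // missing or malformed careers_url
    }
  },

  // Fetch the board and normalize it to {title, url, company, location}.
  async fetch(entry, ctx) {
    try {
      const data = await ctx.fetchJson(entry.api ?? entry.careers_url);
      const positions = Array.isArray(data?.positions) ? data.positions : [];
      return positions.map((p) => ({
        title: p.name,
        url: p.link,
        company: entry.name,
        location: p.city ?? 'Unspecified',
      }));
    } catch {
      return []; // follow the contract: empty array on failure
    }
  },
};
```

Because the provider only touches the network through ctx, it can be exercised in tests with a stub context that returns canned JSON.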

Why does the Ashby provider require a longer timeout than Lever or Greenhouse?

Ashby's public API is inherently slower and heavily rate-limited compared to Greenhouse and Lever, which offer more responsive endpoints. The Ashby implementation configures a 30-second timeout (as opposed to the default used by other providers) to accommodate these latency characteristics, combined with a retry mechanism to ensure reliable data retrieval without overwhelming the remote server.
