How to Configure Custom Search Queries in `portals.yml` for the Career-Ops Scanner

You configure custom search queries in portals.yml by editing the search_queries block to add, modify, or disable YAML entries with name, query, and enabled fields, then re-running node scan.mjs to apply the changes.

When you configure custom search queries in portals.yml, you control exactly which roles, sites, and locations the santifer/career-ops scanner monitors. The repository uses this file as the central driver for its job-scanning pipeline, making it the primary interface for tuning your search strategy.

Where portals.yml Fits in the Career-Ops Pipeline

Before you edit, it helps to understand how the configuration flows through the system. The scanner script scan.mjs reads portals.yml from the project root, iterates over each entry under search_queries, and executes a WebSearch for every item with enabled: true. It then extracts the title, URL, and company from the results and writes them to data/pipeline.md for later evaluation in the modes/oferta.md flow.

The Template File vs. Your Active Config

The repository provides a reference template at templates/portals.example.yml. The first time you run the system, this template is copied to portals.yml in the project root. After that copy exists, the scanner reads only portals.yml, so all edits must be made to the active file, not the template.

Steps to Configure Custom Search Queries in portals.yml

Locate the search_queries: block in portals.yml (it starts around line 46 in the template). Each query is a YAML list item with three fields:

  • name — A descriptive label for the query.
  • query — The actual WebSearch string, usually containing site: and keyword filters.
  • enabled — A boolean toggle (true or false) that controls whether the scanner runs this query.
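The three-field structure can be checked mechanically before a scan. The following is a minimal sketch, not code from the repository: validateQueryEntry is a hypothetical helper, and it assumes the YAML has already been parsed into plain JavaScript objects.

```javascript
// Hypothetical helper (not part of career-ops): checks that one parsed
// search_queries entry has the three fields described above.
function validateQueryEntry(entry) {
  const problems = [];
  if (entry === null || typeof entry !== "object") {
    return ["entry is not a mapping"];
  }
  if (typeof entry.name !== "string" || entry.name.trim() === "") {
    problems.push("name must be a non-empty string");
  }
  if (typeof entry.query !== "string" || entry.query.trim() === "") {
    problems.push("query must be a non-empty string");
  }
  if (typeof entry.enabled !== "boolean") {
    problems.push("enabled must be true or false");
  }
  return problems;
}

// Example entry, as it might look after YAML parsing (name is illustrative).
const sampleEntry = {
  name: "Ashby AI PM",
  query: 'site:jobs.ashbyhq.com "AI Product Manager" remote',
  enabled: true,
};
console.log(validateQueryEntry(sampleEntry)); // an empty array means no problems found
```

A quoted value such as enabled: "yes" would be reported here, which is a common slip when editing YAML by hand.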

Modify an Existing Query

To refine an existing search, update the query string or toggle the enabled flag. For example, if you want to target AI product-manager roles on Ashby boards, your entry would look like this:

- name: Ashby — AI PM
  query: 'site:jobs.ashbyhq.com "AI Product Manager" OR "Senior Product Manager AI" remote'
  enabled: true

Adjust the keywords, add OR clauses, or insert location filters to match your target titles.

Add a New Query Block

Expanding coverage to additional job boards requires appending a new YAML block that follows the same three-field structure. For instance, to add an Indeed search for remote Data Engineer positions:

- name: Indeed — Data Engineer
  query: 'site:indeed.com "Data Engineer" "remote" "Python"'
  enabled: true

Give the block a unique name, wrap the query in single quotes to preserve special characters, and set enabled: true so scan.mjs picks it up on the next run.
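Since each block needs a unique name, a quick duplicate check can catch copy-paste mistakes. This is a sketch under assumptions: findDuplicateNames is a hypothetical helper that works on already-parsed entries, and nothing in the source indicates that scan.mjs performs this check itself.

```javascript
// Hypothetical helper (not part of career-ops): returns every name that
// appears more than once in a parsed search_queries list.
function findDuplicateNames(queries) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { name } of queries) {
    if (seen.has(name)) {
      duplicates.add(name);
    } else {
      seen.add(name);
    }
  }
  return [...duplicates];
}

// Illustrative entries; the third one reuses the first entry's name.
const parsedQueries = [
  { name: "Indeed Data Engineer", query: 'site:indeed.com "Data Engineer" "remote" "Python"', enabled: true },
  { name: "Ashby AI PM", query: 'site:jobs.ashbyhq.com "AI Product Manager" remote', enabled: true },
  { name: "Indeed Data Engineer", query: 'site:indeed.com "Data Engineer" "Go"', enabled: false },
];
console.log(findDuplicateNames(parsedQueries)); // lists names used more than once
```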

Disable a Query Temporarily

Instead of deleting a query you no longer need, set enabled: false. This keeps the definition intact for later use while telling the scanner to skip it:

- name: Workable — Old Role
  query: 'site:apply.workable.com "Legacy Role"'
  enabled: false

Practical portals.yml Query Examples

These examples demonstrate common customizations you can paste directly into your search_queries block.

Narrow Searches by Location

Adding a location restriction while still allowing remote postings helps focus results on a specific market:

- name: Greenhouse — AI Engineer
  query: 'site:boards.greenhouse.io "AI Engineer" "San Francisco" remote'
  enabled: true

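The added "San Francisco" term steers results toward postings that mention that city. Most search engines treat space-separated terms as an implicit AND, so if you want postings that match either the city or remote, join the two terms with OR.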

Target Additional Job Boards

You can point the scanner at LinkedIn or any other publicly indexed job site by changing the value of the site: operator:

- name: LinkedIn — Machine Learning Engineer
  query: 'site:linkedin.com/jobs "Machine Learning Engineer" "remote" "Python"'
  enabled: true

According to the santifer/career-ops source code, these strings are passed directly into the WebSearch loop inside scan.mjs, so any operator supported by the underlying search engine—such as inurl: or intitle:—will work.
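Composing operators by hand becomes error-prone once a query has several phrases. The sketch below shows one way to assemble such a string; buildQuery and its option names are hypothetical and not part of the repository, and whether a given operator is honored depends on the search engine behind WebSearch.

```javascript
// Hypothetical helper (not part of career-ops): assembles a search string
// from a site filter, alternative title phrases, and extra terms.
function buildQuery({ site, anyOf = [], extra = [] }) {
  const parts = [];
  if (site) {
    parts.push(`site:${site}`);
  }
  if (anyOf.length > 0) {
    // Quote each phrase so it is matched exactly, then join with OR.
    parts.push(anyOf.map((phrase) => `"${phrase}"`).join(" OR "));
  }
  // Extra terms (plain keywords or operators such as intitle:) pass through unchanged.
  parts.push(...extra);
  return parts.join(" ");
}

const builtQuery = buildQuery({
  site: "jobs.ashbyhq.com",
  anyOf: ["AI Product Manager", "Senior Product Manager AI"],
  extra: ["remote"],
});
console.log(builtQuery);
// site:jobs.ashbyhq.com "AI Product Manager" OR "Senior Product Manager AI" remote
```

The output reproduces the Ashby query from the earlier example and can be pasted into a query field inside single quotes.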

How scan.mjs Reads Your Configuration

Inside scan.mjs, the scanner loads portals.yml and loops through the search_queries array. For each entry where enabled equals true, it performs the WebSearch, extracts the title, URL, and company from each result, and feeds the structured data into the downstream pipeline (data/pipeline.md). Because there is no caching layer for the YAML itself, saving the file and re-running node scan.mjs is sufficient to apply changes immediately.
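The read, filter, and search loop described above can be sketched as follows. This is an illustration rather than the actual contents of scan.mjs: runEnabledQueries is a hypothetical name, the search function is injected as a stub, and the configuration is supplied as an already-parsed object instead of being loaded from disk.

```javascript
// Illustrative sketch (not the real scan.mjs): run every enabled query
// through a search function and collect the hits per query name.
async function runEnabledQueries(config, search) {
  const queries = config.search_queries ?? [];
  const results = [];
  for (const entry of queries) {
    if (entry.enabled !== true) {
      continue; // skipped: enabled is false or missing
    }
    const hits = await search(entry.query);
    results.push({ name: entry.name, hits });
  }
  return results;
}

// Stub standing in for the real WebSearch call; it returns one fake hit.
async function fakeSearch(query) {
  return [{ title: `Result for ${query}`, url: "https://example.com/job" }];
}

// Parsed configuration with one enabled and one disabled entry (names are illustrative).
const sampleConfig = {
  search_queries: [
    { name: "Greenhouse AI Engineer", query: 'site:boards.greenhouse.io "AI Engineer"', enabled: true },
    { name: "Workable Old Role", query: 'site:apply.workable.com "Legacy Role"', enabled: false },
  ],
};

const scanDone = runEnabledQueries(sampleConfig, fakeSearch).then((results) => {
  for (const { name, hits } of results) {
    console.log(`${name}: ${hits.length} hit(s)`);
  }
  return results;
});
```

Injecting the search function keeps the loop testable without network access; only the enabled entry reaches the stub, which mirrors how a disabled query is skipped.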

Summary

  • portals.yml is the active configuration file that scan.mjs reads at runtime; the reference template lives at templates/portals.example.yml.
  • The search_queries block controls which WebSearches the scanner executes.
  • Each query requires name, query, and enabled fields.
  • Set enabled: false to temporarily disable a query without deleting it.
  • After editing, run node scan.mjs to verify that your custom queries return results.

Frequently Asked Questions

Where does portals.yml come from?

The file is copied automatically from templates/portals.example.yml the first time you run the scanner. Once the copy exists in the project root, edits must be made to portals.yml because scan.mjs does not read the template file.

What fields are required for each search query?

Every entry in the search_queries block must include name (a descriptive label), query (the search string), and enabled (a boolean toggle). The scanner ignores any entry with enabled: false.

How do I verify my custom queries are working?

Save portals.yml and run node scan.mjs. Watch the console output, or pipe it through grep with your query name, to confirm the query is being processed. If a query yields too many or too few hits, refine the keywords or add site-specific filters such as inurl:.

Can I use advanced search operators in the query string?

Yes. Because scan.mjs passes the query value directly to the WebSearch implementation, you can use any supported operator—including site:, OR, inurl:, and intitle:—to improve precision.
