How to Configure Custom Search Queries in `portals.yml` for the Career-Ops Scanner
You configure custom search queries in portals.yml by editing the search_queries block to add, modify, or disable YAML entries with name, query, and enabled fields, then re-running node scan.mjs to apply the changes.
When you configure custom search queries in portals.yml, you control exactly which roles, sites, and locations the santifer/career-ops scanner monitors. The repository uses this file as the central driver for its job-scanning pipeline, making it the primary interface for tuning your search strategy.
Where portals.yml Fits in the Career-Ops Pipeline
Before you edit, it helps to understand how the configuration flows through the system. The scanner script scan.mjs reads portals.yml from the project root, iterates over each entry under search_queries, and executes a WebSearch for every item with enabled: true. It then extracts the title, URL, and company from the results and writes them to data/pipeline.md for later evaluation in the modes/oferta.md flow.
The Template File vs. Your Active Config
The repository provides a reference template at templates/portals.example.yml. The first time you run the system, this template is copied to portals.yml in the project root. After that copy exists, the scanner reads only portals.yml, so all edits must be made to the active file, not the template.
Steps to Configure Custom Search Queries in portals.yml
Locate the search_queries: block in portals.yml (it starts around line 46 in the template). Each query is a YAML list item with three fields:
- `name`: A descriptive label for the query.
- `query`: The actual WebSearch string, usually containing `site:` and keyword filters.
- `enabled`: A boolean toggle (`true` or `false`) that controls whether the scanner runs this query.
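The three-field structure can be checked before a scan with a small helper. This is a minimal sketch, not code from the career-ops repository: the function name validateQueryEntry and its error messages are hypothetical, and it assumes the YAML has already been parsed into plain JavaScript objects.

```javascript
// Hypothetical helper (not part of career-ops): checks one parsed
// search_queries entry for the three fields described above.
function validateQueryEntry(entry) {
  const problems = [];
  if (typeof entry.name !== "string" || entry.name.trim() === "") {
    problems.push("name must be a non-empty string");
  }
  if (typeof entry.query !== "string" || entry.query.trim() === "") {
    problems.push("query must be a non-empty string");
  }
  if (typeof entry.enabled !== "boolean") {
    problems.push("enabled must be true or false");
  }
  return problems;
}

// Example entries; the names and queries are illustrative only.
const completeEntry = {
  name: "Ashby AI PM",
  query: 'site:jobs.ashbyhq.com "AI Product Manager" remote',
  enabled: true,
};
const incompleteEntry = { name: "Indeed Data Engineer", enabled: "yes" };

console.log(validateQueryEntry(completeEntry)); // []
console.log(validateQueryEntry(incompleteEntry));
// [ 'query must be a non-empty string', 'enabled must be true or false' ]
```

An empty array means the entry has all three fields in the expected shape. Whether the real scanner is this strict about types is an assumption.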
Modify an Existing Query
To refine an existing search, update the query string or toggle the enabled flag. For example, if you want to target AI product-manager roles on Ashby boards, your entry would look like this:
- name: Ashby — AI PM
  query: 'site:jobs.ashbyhq.com "AI Product Manager" OR "Senior Product Manager AI" remote'
  enabled: true
Adjust the keywords, add OR clauses, or insert location filters to match your target titles.
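If you maintain many similar queries, composing the string from its parts can reduce quoting mistakes. The sketch below is illustrative only: buildQuery is a hypothetical helper, not something career-ops ships, and it simply reproduces the Ashby query string shown above.

```javascript
// Hypothetical helper for illustration; career-ops does not provide it.
// Joins quoted job titles with OR and appends any extra keywords.
function buildQuery({ site, titles, extras = [] }) {
  const quotedTitles = titles.map((title) => `"${title}"`).join(" OR ");
  return [`site:${site}`, quotedTitles, ...extras].join(" ");
}

const ashbyQuery = buildQuery({
  site: "jobs.ashbyhq.com",
  titles: ["AI Product Manager", "Senior Product Manager AI"],
  extras: ["remote"],
});

console.log(ashbyQuery);
// site:jobs.ashbyhq.com "AI Product Manager" OR "Senior Product Manager AI" remote
```

The printed string can be pasted into the query field of an entry, wrapped in single quotes.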
Add a New Query Block
Expanding coverage to additional job boards requires appending a new YAML block that follows the same three-field structure. For instance, to add an Indeed search for remote Data Engineer positions:
- name: Indeed — Data Engineer
  query: 'site:indeed.com "Data Engineer" "remote" "Python"'
  enabled: true
Give the block a unique name, wrap the query in single quotes to preserve special characters, and set enabled: true so scan.mjs picks it up on the next run.
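Because each block should carry a unique name, a quick duplicate check can catch copy-paste slips. This is a sketch under the assumption that the search_queries list is already parsed into an array of objects; findDuplicateNames is a hypothetical helper, not part of the repository.

```javascript
// Hypothetical helper: returns every name that appears more than once
// in a parsed search_queries array.
function findDuplicateNames(queries) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { name } of queries) {
    if (seen.has(name)) {
      duplicates.add(name);
    }
    seen.add(name);
  }
  return [...duplicates];
}

// Illustrative entries only; the second one was copied without renaming.
const parsedQueries = [
  { name: "Indeed Data Engineer", query: 'site:indeed.com "Data Engineer"', enabled: true },
  { name: "Indeed Data Engineer", query: 'site:indeed.com "Data Engineer" "Python"', enabled: true },
  { name: "Ashby AI PM", query: 'site:jobs.ashbyhq.com "AI Product Manager"', enabled: true },
];

console.log(findDuplicateNames(parsedQueries)); // [ 'Indeed Data Engineer' ]
```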
Disable a Query Temporarily
Instead of deleting a query you no longer need, set enabled: false. This keeps the definition intact for later use while telling the scanner to skip it:
- name: Workable — Old Role
  query: 'site:apply.workable.com "Legacy Role"'
  enabled: false
Practical portals.yml Query Examples
These examples demonstrate common customizations you can paste directly into your search_queries block.
Narrow Searches by Location
Adding a location restriction while still allowing remote postings helps focus results on a specific market:
- name: Greenhouse — AI Engineer
  query: 'site:boards.greenhouse.io "AI Engineer" "San Francisco" remote'
  enabled: true
The quoted "San Francisco" term focuses results on postings that mention that city, while the remote keyword keeps remote-friendly listings in scope.
Target Additional Job Boards
You can point the scanner at LinkedIn or any other public index by changing the site: operator:
- name: LinkedIn — Machine Learning Engineer
  query: 'site:linkedin.com/jobs "Machine Learning Engineer" "remote" "Python"'
  enabled: true
According to the santifer/career-ops source code, these strings are passed directly into the WebSearch loop inside scan.mjs, so any operator supported by the underlying search engine, such as inurl: or intitle:, should work.
How scan.mjs Reads Your Configuration
Inside scan.mjs, the scanner loads portals.yml and loops through the search_queries array. For each entry where enabled equals true, it performs the WebSearch, scrapes the resulting page titles and URLs, and feeds the structured data into the downstream pipeline. Because there is no caching layer for the YAML itself, saving the file and re-running node scan.mjs is sufficient to apply changes immediately.
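The loop described above can be approximated as follows. This is a minimal sketch, not the actual scan.mjs code: runEnabledQueries and the search parameter are hypothetical names, the config is an already-parsed object rather than YAML read from disk, and a stub stands in for the real WebSearch call so the example runs offline.

```javascript
// Sketch only: `search` stands in for whatever WebSearch call scan.mjs makes.
async function runEnabledQueries(config, search) {
  const results = [];
  for (const entry of config.search_queries ?? []) {
    if (entry.enabled !== true) continue; // skip disabled entries
    const hits = await search(entry.query);
    for (const hit of hits) {
      // Keep the query name so each result can be traced back to its source.
      results.push({ source: entry.name, title: hit.title, url: hit.url });
    }
  }
  return results;
}

// Stub search function (an assumption) that returns one fake hit per query.
const fakeSearch = async (query) => [
  { title: `Result for ${query}`, url: "https://example.com/job/1" },
];

const exampleConfig = {
  search_queries: [
    { name: "Greenhouse AI Engineer", query: 'site:boards.greenhouse.io "AI Engineer" remote', enabled: true },
    { name: "Workable Old Role", query: 'site:apply.workable.com "Legacy Role"', enabled: false },
  ],
};

runEnabledQueries(exampleConfig, fakeSearch).then((results) => {
  // Only the enabled query contributes a result.
  console.log(results.map((result) => result.source)); // [ 'Greenhouse AI Engineer' ]
});
```

Passing the search function in as a parameter keeps the sketch testable without network access; the real scanner may structure this differently.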
Summary
- `portals.yml` is the active configuration file that `scan.mjs` reads at runtime; the reference template lives at `templates/portals.example.yml`.
- The `search_queries` block controls which WebSearches the scanner executes.
- Each query requires `name`, `query`, and `enabled` fields.
- Set `enabled: false` to temporarily disable a query without deleting it.
- After editing, run `node scan.mjs` to verify that your custom queries return results.
Frequently Asked Questions
Where does portals.yml come from?
The file is copied automatically from templates/portals.example.yml the first time you run the scanner. Once the copy exists in the project root, edits must be made to portals.yml because scan.mjs does not read the template file.
What fields are required for each search query?
Every entry in the search_queries block must include name (a descriptive label), query (the search string), and enabled (a boolean toggle). The scanner ignores any entry with enabled: false.
How do I verify my custom queries are working?
Save portals.yml and run node scan.mjs. Watch the console output and grep for your query name to confirm it is being processed. If a query yields too many or too few hits, refine the keywords or add site-specific filters such as inurl:.
Can I use advanced search operators in the query string?
Yes. Because scan.mjs passes the query value directly to the WebSearch implementation, you can use any supported operator—including site:, OR, inurl:, and intitle:—to improve precision.