How Dexter's `read_filings` Tool Works: LLM-Powered SEC Filing Analysis
Dexter's read_filings tool is a two-step, LLM-orchestrated workflow that converts natural-language questions about SEC filings into targeted extraction of specific document sections without downloading entire files.
The read_filings tool in the virattt/dexter repository enables AI agents to answer complex financial questions by intelligently querying 10-K, 10-Q, and 8-K documents. Instead of retrieving massive PDFs, the tool uses structured planning and parallel API calls to fetch only the relevant sections needed to answer a user's query.
The Two-Step Architecture
The read_filings implementation in src/tools/finance/read-filings.ts follows a distinct planning-and-execution pattern that separates search strategy from data retrieval.
Step 1: Planning the Search
When a user submits a query, the tool first invokes an LLM to create a structured search plan. The buildPlanPrompt() function generates a system prompt that instructs the model to return a JSON object matching the FilingPlanSchema.
The plan must resolve:
- ticker: Converting company names (e.g., "Microsoft") to stock symbols (e.g., MSFT)
- filing_types: Selecting from 10-K, 10-Q, or 8-K based on query intent
- limit: Capping results (defaulting to 10 filings)
This validation step ensures downstream tools receive well-formed parameters before any external API calls are made.
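The validation step can be illustrated with a minimal sketch. The field names (ticker, filing_types, limit) and the default of 10 come from the description above; the `parseFilingPlan` helper, its error messages, and the upper-casing of the ticker are hypothetical and stand in for whatever schema library the repository actually uses for FilingPlanSchema.

```typescript
// Hypothetical stand-in for FilingPlanSchema. Field names follow the article;
// the hand-rolled validation logic is an illustrative assumption.
type FilingType = "10-K" | "10-Q" | "8-K";

interface FilingPlan {
  ticker: string;
  filing_types: FilingType[];
  limit: number;
}

const ALLOWED_TYPES: readonly string[] = ["10-K", "10-Q", "8-K"];

function isFilingType(value: unknown): value is FilingType {
  return typeof value === "string" && ALLOWED_TYPES.indexOf(value) !== -1;
}

// Validate the JSON text returned by the planning call.
// Throws on malformed input so no API request is made with bad parameters.
function parseFilingPlan(rawJson: string): FilingPlan {
  const parsed: unknown = JSON.parse(rawJson);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Plan must be a JSON object");
  }
  const plan = parsed as Record<string, unknown>;

  const ticker: unknown = plan["ticker"];
  if (typeof ticker !== "string" || ticker.trim() === "") {
    throw new Error("Plan is missing a ticker");
  }

  const rawTypes: unknown = plan["filing_types"];
  if (!Array.isArray(rawTypes) || rawTypes.length === 0) {
    throw new Error("filing_types must be a non-empty array");
  }
  const filingTypes = rawTypes.filter(isFilingType);
  if (filingTypes.length !== rawTypes.length) {
    throw new Error("filing_types may only contain 10-K, 10-Q, or 8-K");
  }

  // Fall back to 10 filings when the model omits a limit.
  const rawLimit: unknown = plan["limit"];
  const limit = rawLimit === undefined ? 10 : rawLimit;
  if (typeof limit !== "number" || !isFinite(limit) || Math.floor(limit) !== limit || limit < 1) {
    throw new Error("limit must be a positive integer");
  }

  // Upper-casing the ticker is an assumption made for this sketch.
  return { ticker: ticker.trim().toUpperCase(), filing_types: filingTypes, limit };
}

// Usage: a well-formed plan passes, and the missing limit is defaulted.
console.log(parseFilingPlan('{"ticker":"msft","filing_types":["10-Q"]}'));
```

Rejecting a malformed plan at this point keeps bad parameters from ever reaching the metadata or item endpoints.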
Step 2: Fetching Metadata
With a validated plan, the tool executes two parallel operations defined in src/tools/finance/filings.ts:
- get_filings: Retrieves recent filing metadata (accession numbers, URLs, filing dates) for the specified ticker and form types
- getFilingItemTypes: Fetches the canonical list of item names (e.g., "Item-1A", "Part-1,Item-2") from the Financial Datasets API
If no filings match the criteria, the tool returns a formatted error result immediately, avoiding unnecessary processing.
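The parallel fetch and the early exit can be sketched as follows. This is an illustration only: the `FilingMeta` and `ItemCatalog` shapes, the `fetchFilingContext` name, and the error text are assumptions, and the two fetchers are injected as plain functions so the example does not depend on the real get_filings or getFilingItemTypes signatures.

```typescript
// Hypothetical shapes: the article only mentions accession numbers, URLs,
// and filing dates; the exact field names here are assumptions.
interface FilingMeta {
  accession_number: string;
  filing_type: string;
  filing_date: string;
  url: string;
}

// Form type -> list of canonical item names.
type ItemCatalog = Record<string, string[]>;

interface FilingContext {
  ok: boolean;
  filings: FilingMeta[];
  itemCatalog: ItemCatalog;
  error?: string;
}

// Run both lookups concurrently and return early when nothing matches.
async function fetchFilingContext(
  fetchFilings: () => Promise<FilingMeta[]>,
  fetchItemTypes: () => Promise<ItemCatalog>
): Promise<FilingContext> {
  const [filings, itemCatalog] = await Promise.all([fetchFilings(), fetchItemTypes()]);
  if (filings.length === 0) {
    return {
      ok: false,
      filings: [],
      itemCatalog,
      error: "No filings found for the requested ticker and form types",
    };
  }
  return { ok: true, filings, itemCatalog };
}

// Usage with stubbed fetchers; the accession number and URL are placeholders.
const demoFilings: FilingMeta[] = [
  {
    accession_number: "0000000000-00-000001",
    filing_type: "10-Q",
    filing_date: "2024-01-01",
    url: "https://example.com/filing-1",
  },
];
fetchFilingContext(
  () => Promise.resolve(demoFilings),
  () => Promise.resolve({ "10-Q": ["Part-1,Item-2"] })
).then((ctx) => console.log(ctx.ok, ctx.filings.length));
```

Because the two requests do not depend on each other, running them with `Promise.all` means the step takes roughly as long as the slower of the two rather than their sum.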
Intelligent Item Selection and Retrieval
The second LLM invocation determines exactly which document sections contain the answer.
Selecting Specific Items
The buildStep2Prompt() function creates a context-rich prompt containing:
- The original user query
- The list of available filings from Step 2
- The complete catalog of item names for 10-K and 10-Q forms
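A prompt assembled from those three inputs might look like the sketch below. It is not the repository's buildStep2Prompt(): the function name `buildItemSelectionPrompt`, the `FilingSummary` shape, and the wording of the instructions are all hypothetical. The date parameter reflects the article's note that prompts are time-aware.

```typescript
// Hypothetical filing summary; field names are assumptions for this sketch.
interface FilingSummary {
  accession_number: string;
  filing_type: string;
  filing_date: string;
}

// Assemble the query, the candidate filings, and the item catalog into one prompt.
function buildItemSelectionPrompt(
  query: string,
  filings: FilingSummary[],
  itemCatalog: Record<string, string[]>,
  today: string
): string {
  const filingLines = filings.map(
    (f) => `- ${f.filing_type} filed ${f.filing_date} (accession ${f.accession_number})`
  );
  // Item names can themselves contain commas (e.g. "Part-1,Item-2"),
  // so they are joined with "; " to keep them distinguishable.
  const catalogLines = Object.keys(itemCatalog).map((formType) => {
    const items = itemCatalog[formType] || [];
    return `- ${formType}: ${items.join("; ")}`;
  });
  return [
    `Current date: ${today}`,
    `User query: ${query}`,
    "Available filings:",
    ...filingLines,
    "Available items by form type:",
    ...catalogLines,
    "Select at most three filings and only the items needed to answer the query.",
  ].join("\n");
}

// Usage with placeholder data.
console.log(
  buildItemSelectionPrompt(
    "Summarize the latest MD&A",
    [{ accession_number: "0000000000-00-000001", filing_type: "10-Q", filing_date: "2024-01-01" }],
    { "10-Q": ["Part-1,Item-2", "Part-2,Item-1A"] },
    "2024-02-01"
  )
);
```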
The LLM acts as a router, selecting up to three specific filings and identifying the exact items (e.g., "Item 7. Management’s Discussion and Analysis") needed to answer the query. The model responds with a tool-call message invoking one of three retrieval functions stored in STEP2_TOOL_MAP:
- get_10K_filing_items
- get_10Q_filing_items
- get_8K_filing_items
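The routing from a tool-call name to a retrieval function can be sketched with a simple lookup table. The three keys match the tool names above, but `TOOL_MAP_SKETCH`, the stub implementations, and the argument shape are assumptions; the actual STEP2_TOOL_MAP may be structured differently.

```typescript
// Hypothetical argument shape, based on the ticker / accession number / items
// inputs the article describes.
interface ItemToolArgs {
  ticker: string;
  accession_number: string;
  items: string[];
}

type ItemTool = (args: ItemToolArgs) => Promise<string>;

interface ToolCall {
  name: string;
  args: ItemToolArgs;
}

// Stub tools standing in for the real retrieval functions.
const makeStub = (label: string): ItemTool => async (args) =>
  `${label}:${args.accession_number}:${args.items.join("|")}`;

const TOOL_MAP_SKETCH: Record<string, ItemTool> = {
  get_10K_filing_items: makeStub("10-K"),
  get_10Q_filing_items: makeStub("10-Q"),
  get_8K_filing_items: makeStub("8-K"),
};

// Look up the tool named by the model and run it; reject unknown names.
function dispatchToolCall(call: ToolCall): Promise<string> {
  // hasOwnProperty guards against inherited keys such as "toString".
  const known = Object.prototype.hasOwnProperty.call(TOOL_MAP_SKETCH, call.name);
  const tool = TOOL_MAP_SKETCH[call.name];
  if (!known || !tool) {
    return Promise.reject(new Error(`Unknown tool: ${call.name}`));
  }
  return tool(call.args);
}

// Usage with a placeholder accession number.
dispatchToolCall({
  name: "get_10Q_filing_items",
  args: { ticker: "MSFT", accession_number: "0000000000-00-000001", items: ["Part-1,Item-2"] },
}).then((text) => console.log(text));
```

Rejecting unknown names explicitly matters here because the name comes from model output, which is not guaranteed to match one of the three expected tools.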
Executing Item Calls
Dexter executes the selected item-tool calls in parallel (capped at three concurrent requests). Each call:
- Sends the ticker, accession number, and items array to the Financial Datasets API endpoint /filings/items/
- Returns a JSON payload containing the extracted text sections
- Wraps the result using formatToolResult with associated source URLs
The tool aggregates results by accession number, collecting any failures under an _errors field while merging all source URLs into the final response.
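The execution and aggregation behavior described here can be sketched as below. Several details are assumptions made for illustration: the cap is interpreted as "at most three calls in total", duplicate source URLs are dropped when merging, and the `ItemCall` shape and `runItemCalls` name are hypothetical.

```typescript
// Hypothetical shape of one item-retrieval call selected by the model.
interface ItemCall {
  accession_number: string;
  run: () => Promise<{ data: unknown; sourceUrls: string[] }>;
}

interface AggregatedResult {
  data: Record<string, unknown>;
  sourceUrls: string[];
}

type Outcome =
  | { accession: string; ok: true; data: unknown; sourceUrls: string[] }
  | { accession: string; ok: false; error: string };

// Assumption: the cap of three applies to the total number of calls.
const MAX_ITEM_CALLS = 3;

async function runItemCalls(calls: ItemCall[]): Promise<AggregatedResult> {
  const selected = calls.slice(0, MAX_ITEM_CALLS);

  // Turn each rejection into a value so one failure does not discard the rest.
  const outcomes = await Promise.all(
    selected.map((call) =>
      call.run().then<Outcome, Outcome>(
        (res) => ({ accession: call.accession_number, ok: true, data: res.data, sourceUrls: res.sourceUrls }),
        (err: unknown) => ({
          accession: call.accession_number,
          ok: false,
          error: err instanceof Error ? err.message : String(err),
        })
      )
    )
  );

  const data: Record<string, unknown> = {};
  const errors: string[] = [];
  const sourceUrls: string[] = [];
  for (const outcome of outcomes) {
    if (outcome.ok) {
      data[outcome.accession] = outcome.data;
      for (const url of outcome.sourceUrls) {
        // De-duplicating URLs is an assumption of this sketch.
        if (sourceUrls.indexOf(url) === -1) sourceUrls.push(url);
      }
    } else {
      errors.push(`${outcome.accession}: ${outcome.error}`);
    }
  }
  if (errors.length > 0) data["_errors"] = errors;
  return { data, sourceUrls };
}

// Usage: one call succeeds, one fails; both are reflected in the output.
runItemCalls([
  { accession_number: "A", run: () => Promise.resolve({ data: "risk factors", sourceUrls: ["https://example.com/a"] }) },
  { accession_number: "B", run: () => Promise.reject(new Error("request failed")) },
]).then((out) => console.log(JSON.stringify(out)));
```

Keeping failures under a separate `_errors` key lets the agent use whatever sections were retrieved while still seeing which requests did not succeed.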
Output Formatting and Integration
The final step standardizes the output for Dexter's agent loop. The formatToolResult function (invoked at line 284 in read-filings.ts) structures the response as {data, sourceUrls}, allowing the agent to incorporate extracted sections into its scratchpad for downstream reasoning and answer generation.
Implementation Examples
Direct Invocation from a Custom Script
import { createReadFilings } from '@/tools/finance/read-filings';

// Choose the model you want Dexter to use (e.g., "gpt-5.2")
const readFilingsTool = createReadFilings('gpt-5.2');

// Example query
const input = { query: 'What were Apple’s risk factors in its 2023 10-K?' };

readFilingsTool
  .invoke(input)
  .then(result => {
    console.log('Fetched sections:', result.data);
    console.log('Source URLs:', result.sourceUrls);
  })
  .catch(err => console.error('Tool error:', err));
Using the Tool from the CLI (Dexter’s Built-in REPL)
> /tool read_filings
Enter your query: Summarize Microsoft’s MD&A section from the most recent 10‑Q.
Dexter will:
- Resolve "Microsoft" → MSFT
- Plan a 10-Q search
- Retrieve the latest 10-Q metadata for MSFT
- Ask the LLM which MD&A items to fetch (e.g., Part-1,Item-2)
- Call get_10Q_filing_items and return the extracted text for downstream reasoning
Key Source Files
| File | Role | Link |
|---|---|---|
| src/tools/finance/read-filings.ts | Main implementation of the read_filings DynamicStructuredTool; orchestrates planning, metadata fetch, item selection, and result aggregation. | read-filings.ts |
| src/tools/finance/filings.ts | Provides low-level tools (get_filings, get_10K_filing_items, get_10Q_filing_items, get_8K_filing_items) and the canonical filing-item type catalog. | filings.ts |
| src/model/llm.ts | Helper callLlm that abstracts model invocation with system prompts and output schema validation (used by read_filings). | llm.ts |
| src/agent/prompts.ts | Supplies getCurrentDate() for time-aware prompts used in both planning and item-selection stages. | prompts.ts |
These files together enable Dexter to turn a free‑form question into a precise, low‑latency SEC filing extraction workflow.
Summary
- Two-stage LLM orchestration: read_filings first plans the search strategy, then selects specific document items to retrieve.
- Parallel metadata fetching: The tool simultaneously queries filing metadata and item type catalogs from the Financial Datasets API.
- Intelligent item routing: A second LLM call determines exactly which sections (e.g., "Item 1A", "Part 1, Item 2") contain the answer.
- Structured output: Results are standardized via formatToolResult into {data, sourceUrls} for seamless agent integration.
- Source locations: Core logic resides in src/tools/finance/read-filings.ts with helper utilities in src/tools/finance/filings.ts.
Frequently Asked Questions
How does read_filings handle company name resolution?
The tool relies on the LLM planning stage to resolve natural language company names (e.g., "Apple" or "Microsoft") into stock tickers (e.g., AAPL or MSFT). During the buildPlanPrompt() execution, the model extracts the ticker symbol and includes it in the validated FilingPlanSchema output, which subsequent API calls use to fetch the correct filings.
What filing types does the read_filings tool support?
According to the source code in src/tools/finance/read-filings.ts, the tool supports three primary SEC filing types: 10-K (annual reports), 10-Q (quarterly reports), and 8-K (current reports). The LLM planner selects appropriate types based on query intent—for example, choosing 10-K for annual risk assessments or 10-Q for recent quarterly management discussions.
How does the tool avoid downloading entire SEC filings?
Instead of retrieving full PDF or HTML documents, read_filings uses a targeted extraction strategy. After fetching metadata via get_filings, the second LLM call (via buildStep2Prompt()) identifies specific items (e.g., "Item 7. Management’s Discussion and Analysis"). The tool then calls specialized retrieval functions like get_10K_filing_items or get_10Q_filing_items to fetch only those sections via the Financial Datasets API, significantly reducing latency and token usage.
Can I use read_filings outside of Dexter's agent loop?
Yes, the tool is exported as a DynamicStructuredTool factory function createReadFilings() that accepts a model name parameter (e.g., "gpt-5.2"). You can import and invoke it directly in TypeScript applications, passing a {query: string} input object. The tool returns a Promise resolving to {data, sourceUrls}, making it suitable for standalone scripts, custom financial analysis pipelines, or integration into other AI frameworks.