How to Use WebSearchTool and FileSearchTool for Information Retrieval in openai-agents-python

Use WebSearchTool for live web searches and FileSearchTool for querying vector stores: instantiate either class with its configuration parameters and add the instance to an agent's tools list.

The openai-agents-python SDK provides two built-in hosted retrieval tools that enable agents to fetch external information without implementing custom search logic. These tools integrate directly with the OpenAI Responses API, allowing the LLM to invoke web_search or file_search functions during execution.

Understanding the Hosted Retrieval Tools

Both WebSearchTool and FileSearchTool are hosted implementations located in src/agents/tool.py. You configure them via dataclass instances, but the actual search execution happens server-side when the OpenAI API processes the tool call.

  • WebSearchTool (defined at src/agents/tool.py#L558-L583) performs live or cached web searches via the Responses API
  • FileSearchTool (defined at src/agents/tool.py#L332-L357) queries user-provided vector stores for document fragments

The runtime (src/agents/run_internal/turn_resolution.py) detects when the LLM outputs a ResponseFunctionWebSearch or ResponseFileSearchToolCall item and routes the call to the appropriate server-side handler.

Configuring WebSearchTool

WebSearchTool supports several parameters to control search behavior and context retrieval.

Location and Context Settings

The user_location parameter accepts a dictionary that biases results toward a geographic location, while search_context_size controls how much retrieved search context the server places in the LLM context (one of "low", "medium", or "high").

from agents import WebSearchTool

web_tool = WebSearchTool(
    user_location={"type": "approximate", "city": "New York"},
    search_context_size="high"
)
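Because search_context_size only accepts three values, it can help to validate settings before they reach the tool. The following is a minimal sketch in plain Python, not part of the SDK: build_web_search_kwargs and ALLOWED_CONTEXT_SIZES are hypothetical names, and the "medium" default is simply this helper's own choice.

```python
from typing import Any, Optional

# Values the article lists for search_context_size.
ALLOWED_CONTEXT_SIZES = ("low", "medium", "high")


def build_web_search_kwargs(
    city: Optional[str] = None,
    context_size: str = "medium",
) -> dict[str, Any]:
    """Hypothetical helper: validate settings and return kwargs for WebSearchTool."""
    if context_size not in ALLOWED_CONTEXT_SIZES:
        raise ValueError(
            f"search_context_size must be one of {ALLOWED_CONTEXT_SIZES}, "
            f"got {context_size!r}"
        )
    kwargs: dict[str, Any] = {"search_context_size": context_size}
    if city:
        # Same dictionary shape as the user_location example above.
        kwargs["user_location"] = {"type": "approximate", "city": city}
    return kwargs


kwargs = build_web_search_kwargs(city="New York", context_size="high")
print(kwargs)
```

With the SDK installed, the resulting dictionary could be passed along as WebSearchTool(**kwargs).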

Search Filters and Access Control

Use external_web_access to toggle between live internet fetches (True) and cached/indexed results only (False). The filters parameter accepts a WebSearchToolFilters object for restricting results, for example to specific domains.

import asyncio
from agents import Agent, Runner, WebSearchTool, trace

async def main():
    agent = Agent(
        name="Web searcher",
        instructions="You are a helpful agent.",
        tools=[WebSearchTool(
            user_location={"type": "approximate", "city": "New York"},
            search_context_size="high",
            external_web_access=True,
        )],
    )

    with trace("Web-search demo"):
        result = await Runner.run(
            agent,
            "What major tech conference is happening in New York next week?",
        )
        print("Final answer:", result.final_output)
        
        for item in result.new_items:
            print("Raw tool item:", item.raw_item)

if __name__ == "__main__":
    asyncio.run(main())
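To show sources next to the answer, you may want to pull citations out of the run items. The sketch below assumes the Responses API payload shape in which URL citations appear as url_citation annotations on the text parts of message items. The helper name collect_url_citations is hypothetical, and the SimpleNamespace objects are stand-ins rather than real SDK types, so verify the shape against an actual run before relying on it.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_url_citations(new_items: Iterable[Any]) -> list[tuple[str, str]]:
    """Return (title, url) pairs from url_citation annotations on message items."""
    citations: list[tuple[str, str]] = []
    for item in new_items:
        raw = getattr(item, "raw_item", None)
        if getattr(raw, "type", None) != "message":
            continue  # skip tool calls and other non-message items
        for part in getattr(raw, "content", None) or []:
            for ann in getattr(part, "annotations", None) or []:
                if getattr(ann, "type", None) == "url_citation":
                    citations.append((ann.title, ann.url))
    return citations


# Stand-in objects that mimic the assumed payload shape (not real SDK types).
fake_message = SimpleNamespace(
    raw_item=SimpleNamespace(
        type="message",
        content=[
            SimpleNamespace(
                type="output_text",
                text="Example answer.",
                annotations=[
                    SimpleNamespace(
                        type="url_citation",
                        title="Example page",
                        url="https://example.com",
                    )
                ],
            )
        ],
    )
)
fake_tool_call = SimpleNamespace(raw_item=SimpleNamespace(type="web_search_call"))

print(collect_url_citations([fake_tool_call, fake_message]))
```

In a real run you would pass result.new_items instead of the stand-in list.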

Configuring FileSearchTool

FileSearchTool queries vector stores that you have created through the OpenAI API and returns relevant document fragments.

Vector Store Integration

Required parameters include vector_store_ids, a list of one or more store IDs. Optional parameters control result volume and ranking behavior.

from agents import FileSearchTool

file_tool = FileSearchTool(
    vector_store_ids=["vs_12345"],
    max_num_results=5,
    include_search_results=True
)

Result Ranking and Filtering

The ranking_options parameter accepts a RankingOptions object for custom score thresholds, while filters supports file-level restrictions by MIME type or metadata.

import asyncio
from agents import Agent, FileSearchTool, Runner, trace

async def main():
    # Replace with the ID of a vector store that already contains your files.
    vector_store_id = "vs_12345"

    agent = Agent(
        name="File searcher",
        instructions="Answer using only the knowledge in the vector store.",
        tools=[FileSearchTool(
            vector_store_ids=[vector_store_id],
            max_num_results=5,
            include_search_results=True,  # return the search results in the run output
        )],
    )

    with trace("File-search demo"):
        result = await Runner.run(
            agent,
            "Give me a concise summary of the main ideas in the Dune excerpt.",
        )
        print("Final answer:", result.final_output)

        # Inspect the raw items produced during the run, including the file-search call.
        for item in result.new_items:
            print("Raw file-search item:", item.raw_item)

if __name__ == "__main__":
    asyncio.run(main())
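Printing raw items is useful for debugging, but an application usually wants just the matched fragments. The sketch below assumes that a file-search call item has type "file_search_call" and carries a results list whose entries expose filename, score, and text. The helper collect_file_hits and its min_score parameter are hypothetical, and the SimpleNamespace objects stand in for real SDK items.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_file_hits(
    new_items: Iterable[Any], min_score: float = 0.0
) -> list[dict[str, Any]]:
    """Return matched fragments from file-search calls, best score first."""
    hits: list[dict[str, Any]] = []
    for item in new_items:
        raw = getattr(item, "raw_item", None)
        if getattr(raw, "type", None) != "file_search_call":
            continue
        # results may be absent if search results were not requested.
        for result in getattr(raw, "results", None) or []:
            score = getattr(result, "score", None)
            if score is not None and score < min_score:
                continue
            hits.append(
                {
                    "filename": getattr(result, "filename", None),
                    "score": score,
                    "text": getattr(result, "text", None),
                }
            )
    hits.sort(key=lambda h: h["score"] if h["score"] is not None else 0.0, reverse=True)
    return hits


# Stand-in item that mimics the assumed payload shape (not a real SDK type).
fake_call = SimpleNamespace(
    raw_item=SimpleNamespace(
        type="file_search_call",
        results=[
            SimpleNamespace(filename="notes.txt", score=0.4, text="Weak match."),
            SimpleNamespace(filename="dune.txt", score=0.9, text="Strong match."),
        ],
    )
)

for hit in collect_file_hits([fake_call], min_score=0.5):
    print(hit["filename"], hit["score"])
```

The min_score cutoff here is applied client-side after the run; it is separate from any threshold configured through ranking_options.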

Runtime Execution Flow

When an agent invokes a retrieval tool, the following sequence occurs:

  1. Tool Registration: The Agent stores tool instances in self.tools (defined in src/agents/agent.py)
  2. LLM Invocation: The runtime serializes tool metadata into the request payload via src/agents/models/openai_responses.py
  3. Tool Selection: The LLM outputs a ResponseFunctionWebSearch or ResponseFileSearchToolCall item (defined in src/agents/items.py)
  4. Server Execution: The runtime (src/agents/run_internal/turn_resolution.py) forwards the call to the OpenAI Responses API, which executes the search server-side
  5. Result Handling: The server returns structured results that the SDK deserializes into the corresponding response items, available in RunResult.new_items
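One way to confirm that this flow actually ran is to count the hosted tool calls in the run items. The sketch below assumes the raw items carry the Responses API type strings "web_search_call" and "file_search_call". The names HOSTED_CALL_TYPES and summarize_hosted_calls are hypothetical, and the SimpleNamespace objects stand in for real run items.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable

# Assumed raw item type strings for the two hosted retrieval tools.
HOSTED_CALL_TYPES = {
    "web_search_call": "web search",
    "file_search_call": "file search",
}


def summarize_hosted_calls(new_items: Iterable[Any]) -> Counter:
    """Count hosted retrieval tool calls by kind, ignoring all other items."""
    counts: Counter = Counter()
    for item in new_items:
        raw_type = getattr(getattr(item, "raw_item", None), "type", None)
        label = HOSTED_CALL_TYPES.get(raw_type)
        if label is not None:
            counts[label] += 1
    return counts


# Stand-in items (not real SDK types): two web searches, one file search, one message.
fake_items = [
    SimpleNamespace(raw_item=SimpleNamespace(type="web_search_call")),
    SimpleNamespace(raw_item=SimpleNamespace(type="web_search_call")),
    SimpleNamespace(raw_item=SimpleNamespace(type="file_search_call")),
    SimpleNamespace(raw_item=SimpleNamespace(type="message")),
]

summary = summarize_hosted_calls(fake_items)
print(dict(summary))
```

A count of zero for a tool you expected to be used is often a hint to revisit the agent instructions or the prompt.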

Summary

  • WebSearchTool (src/agents/tool.py#L558-L583) enables live web searches via the OpenAI Responses API, configurable with location bias, context size, and access controls
  • FileSearchTool (src/agents/tool.py#L332-L357) queries vector stores for document fragments, supporting custom ranking and filtering options
  • Both tools are hosted, meaning the SDK only handles configuration and result serialization while the OpenAI server performs the actual retrieval
  • Add tool instances to Agent.tools and inspect RunResult.new_items to access raw search results alongside the LLM's generated response

Frequently Asked Questions

Can I use both WebSearchTool and FileSearchTool in the same agent?

Yes. You can include both tools in the same Agent.tools list. The LLM will select the appropriate tool based on the query context. For example, it might use FileSearchTool for questions about internal documentation and WebSearchTool for current events.
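A minimal sketch of such a combined agent follows. It requires the openai-agents package, uses only the parameters shown earlier in this article, and does not call the API when it is constructed. The helper name build_research_agent, the agent name, and the instructions text are illustrative choices, and "vs_12345" is the same placeholder ID used above.

```python
from agents import Agent, FileSearchTool, WebSearchTool


def build_research_agent(vector_store_id: str) -> Agent:
    """Hypothetical helper: one agent with both hosted retrieval tools attached."""
    return Agent(
        name="Research assistant",
        instructions=(
            "Use file search for questions about internal documentation "
            "and web search for current events."
        ),
        tools=[
            WebSearchTool(search_context_size="medium"),
            FileSearchTool(vector_store_ids=[vector_store_id], max_num_results=5),
        ],
    )


agent = build_research_agent("vs_12345")  # placeholder vector store ID
print([type(tool).__name__ for tool in agent.tools])
```

Spelling out in the instructions when each tool applies tends to make the model's choice more predictable than leaving it implicit.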

How do I handle the raw search results returned by these tools?

The RunResult.new_items list contains the items generated during the run, including ToolCallItem instances with a raw_item attribute. For WebSearchTool, this contains the search snippets and citations. For FileSearchTool, it includes the matching document fragments with relevance scores. Access these to display citations or perform additional processing.

Do I need to implement my own search logic for these tools?

No. Both WebSearchTool and FileSearchTool are hosted tools managed by the OpenAI API. You only need to configure the tool parameters (such as vector_store_ids for file search or search_context_size for web search). The actual search execution happens server-side when the LLM invokes the tool during a run.
