Tool Input and Output Guardrails vs Regular Guardrails in openai-agents-python

Regular guardrails validate an agent's complete input and final output with boolean tripwires, while tool-specific guardrails wrap individual function calls to provide granular behavioral controls including content rejection without halting the agent.

The openai-agents-python library provides two distinct validation systems to enforce safety policies and business constraints at different architectural layers. While regular guardrails act as gatekeepers for the entire agent conversation lifecycle, tool input and output guardrails isolate validation logic to individual function tool executions. Understanding these architectural differences is essential for implementing defense-in-depth strategies that protect both the conversation boundary and internal tool operations.

Scope and Execution Timing

Understanding when and where these guardrails execute reveals their fundamental architectural separation.

Regular Guardrails

Regular guardrails wrap the complete agent execution as implemented in src/agents/guardrail.py. The InputGuardrail class (lines 71-104) executes before or in parallel with the agent to validate incoming user messages or TResponseInputItem objects. The OutputGuardrail class (lines 133-166) executes after the agent produces its final response, validating the content returned to the caller before the run completes.
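The "in parallel with the agent" behavior can be pictured with plain asyncio. The following is a minimal sketch, not the library's runner: `CheckResult`, `TripwireTriggered`, `keyword_guardrail`, `fake_agent`, and `run_with_parallel_guardrail` are all hypothetical names invented for illustration. It starts the check and the agent together and cancels the agent if the check trips.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Stand-in for a guardrail result with a boolean tripwire."""
    tripwire_triggered: bool
    info: str = ""


class TripwireTriggered(Exception):
    """Hypothetical exception used only in this sketch."""


async def keyword_guardrail(user_input: str) -> CheckResult:
    # Pretend to do async work, such as calling a small classifier.
    await asyncio.sleep(0)
    blocked = "hack" in user_input.lower()
    return CheckResult(tripwire_triggered=blocked, info="keyword check")


async def fake_agent(user_input: str) -> str:
    # Pretend the model call is slower than the guardrail check.
    await asyncio.sleep(0.05)
    return f"answer to: {user_input}"


async def run_with_parallel_guardrail(user_input: str) -> str:
    # Start both tasks so the check does not add latency to the happy path.
    agent_task = asyncio.create_task(fake_agent(user_input))
    guard_task = asyncio.create_task(keyword_guardrail(user_input))

    result = await guard_task
    if result.tripwire_triggered:
        # Stop the agent early and surface the failure to the caller.
        agent_task.cancel()
        try:
            await agent_task
        except asyncio.CancelledError:
            pass
        raise TripwireTriggered(result.info)
    return await agent_task
```

The trade-off in this toy model is that the agent may already have done some work by the time the check fails, which is the cost of not waiting for the check before starting.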

Tool-Level Guardrails

In contrast, tool guardrails defined in src/agents/tool_guardrails.py scope validation to individual tool executions within a single turn. ToolInputGuardrail (lines 51-78) runs immediately before the tool function is invoked to validate arguments. ToolOutputGuardrail (lines 81-108) executes right after the tool returns but before the result is fed back into the agent's context window, allowing for output sanitization or rejection.

Data Structures and Behavioral Control

The two systems differ significantly in their function signatures, input data, and control mechanisms.

Input Parameters

Regular guardrails receive discrete parameters: RunContextWrapper, the Agent instance, and the raw input (string or list) for input guardrails, or the final agent output (Any) for output guardrails.

Tool guardrails receive a single specialized data container. ToolInputGuardrailData bundles the ToolContext (tool-specific execution state, which carries the tool call's arguments) and the owning agent. ToolOutputGuardrailData extends this structure with an output field containing the tool's return value.
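A tool input guardrail usually has to decode the call's arguments before it can inspect them. The sketch below assumes the arguments arrive as a JSON-encoded string (to my knowledge recent releases expose this as `tool_arguments` on the tool context, but check your installed version). The helper name `parse_tool_arguments` is hypothetical and not part of the library.

```python
import json
from typing import Any, Dict, Optional


def parse_tool_arguments(raw: Optional[str]) -> Dict[str, Any]:
    """Decode a JSON-encoded argument string into a dict.

    Returns an empty dict for missing, malformed, or non-object payloads so
    the calling guardrail can decide how strict it wants to be.
    """
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # Tool arguments are expected to be a JSON object; ignore anything else.
    return parsed if isinstance(parsed, dict) else {}
```

Returning an empty dict rather than raising keeps the guardrail in control: it can still choose to reject the call when a required key is absent.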

Result Types and Execution Flow

Regular guardrails return GuardrailFunctionOutput (defined in src/agents/guardrail.py lines 19-33), which carries an optional output_info payload and a boolean tripwire_triggered flag. When the flag is True, the run halts immediately: the runner raises InputGuardrailTripwireTriggered or OutputGuardrailTripwireTriggered.

Tool guardrails return ToolGuardrailFunctionOutput (lines 59-78 in src/agents/tool_guardrails.py), which provides a behavior enum with three distinct options:

  • allow – Continues normal execution and passes the tool result to the agent (default)
  • reject_content – Discards the tool result and sends a custom message back to the model without halting the agent run
  • raise_exception – Raises ToolInputGuardrailTripwireTriggered or ToolOutputGuardrailTripwireTriggered (depending on which guardrail fired) to halt the entire run
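The three behaviors can be modeled in a few lines of plain Python. This is a simplified sketch of the control flow described above, not the library's execution pipeline: `Behavior`, `Decision`, `ToolTripwire`, `call_tool_with_guardrails`, and the example checks are hypothetical stand-ins.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List, Optional


class Behavior(Enum):
    ALLOW = "allow"
    REJECT_CONTENT = "reject_content"
    RAISE_EXCEPTION = "raise_exception"


@dataclass
class Decision:
    """Stand-in for a tool guardrail result."""
    behavior: Behavior = Behavior.ALLOW
    message: Optional[str] = None


class ToolTripwire(Exception):
    """Hypothetical exception; the library raises its own tripwire types."""


def _apply(decision: Decision) -> Optional[str]:
    """Return a replacement message, None to continue, or raise to halt."""
    if decision.behavior is Behavior.RAISE_EXCEPTION:
        raise ToolTripwire(decision.message or "tool guardrail tripwire")
    if decision.behavior is Behavior.REJECT_CONTENT:
        return decision.message or ""
    return None


def call_tool_with_guardrails(
    tool: Callable[..., Any],
    args: Dict[str, Any],
    input_checks: List[Callable[[Dict[str, Any]], Decision]],
    output_checks: List[Callable[[Any], Decision]],
) -> Any:
    for check in input_checks:
        replacement = _apply(check(args))
        if replacement is not None:
            return replacement  # the tool is never invoked
    result = tool(**args)
    for check in output_checks:
        replacement = _apply(check(result))
        if replacement is not None:
            return replacement  # the real result is discarded
    return result


def non_negative(args: Dict[str, Any]) -> Decision:
    if args["x"] < 0:
        return Decision(Behavior.REJECT_CONTENT, "Only non-negative numbers allowed")
    return Decision()


def forbid_42(result: Any) -> Decision:
    return Decision(Behavior.RAISE_EXCEPTION) if result == 42 else Decision()


def increment(x: int) -> int:
    return x + 1
```

In this model, a rejected call returns the guardrail's message in place of the tool result, so the caller (the model, in the real system) can react and try again, whereas the exception path ends the run.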

Configuration and Wiring

Attaching these guardrails requires different configuration patterns depending on their scope.

Attaching Regular Guardrails

Regular guardrails attach through the input_guardrails and output_guardrails parameters, which can be set on an individual Agent or, for run-wide checks, on RunConfig. The runner aggregates these results in RunState and serializes them for tracing, as implemented in src/agents/run.py (lines 688-713). The guardrail results persist in RunState (lines 246-258 in src/agents/run_state.py) for observability.

Attaching Tool Guardrails

Tool guardrails are configured directly on function tools. The FunctionTool class in src/agents/tool.py (lines 321-340) accepts them through its tool_input_guardrails and tool_output_guardrails fields. The execution pipeline in src/agents/run_internal/tool_execution.py (lines 1670-1700) calls _execute_tool_input_guardrails before tool invocation and _execute_tool_output_guardrails after completion, interpreting the behavior field to determine whether to continue, replace the result, or abort.

Code Examples

Regular Input Guardrail

This example blocks the entire agent run if disallowed content appears in user input:

from typing import Union

from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, TResponseInputItem, input_guardrail

@input_guardrail
def disallowed_topic_guardrail(
    ctx: RunContextWrapper, agent: Agent, user_input: Union[str, list[TResponseInputItem]]
) -> GuardrailFunctionOutput:
    # The input may be a plain string or a list of input items; flatten it crudely to text.
    text = user_input if isinstance(user_input, str) else str(user_input)
    if "hack" in text.lower():
        return GuardrailFunctionOutput(output_info="blocked word 'hack'", tripwire_triggered=True)
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)

Implemented in src/agents/guardrail.py (lines 71-100).

Regular Output Guardrail

Validate the final agent response before it reaches the caller:

from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, output_guardrail

@output_guardrail
def profanity_filter(
    ctx: RunContextWrapper, agent: Agent, agent_output: str
) -> GuardrailFunctionOutput:
    if "badword" in agent_output.lower():
        return GuardrailFunctionOutput(
            output_info="profanity detected", tripwire_triggered=True
        )
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)

Implemented in src/agents/guardrail.py (lines 133-161).

Tool Input Guardrail

Validate tool arguments before execution using reject_content to allow the conversation to continue:

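import json
from urllib.parse import urlparse

from agents import ToolGuardrailFunctionOutput, ToolInputGuardrailData, tool_input_guardrail

ALLOWED_HOSTS = {"example.com", "api.myservice.com"}

@tool_input_guardrail
def url_whitelist_guardrail(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    # Assumes the raw arguments are exposed as a JSON string on the tool context.
    args = json.loads(data.context.tool_arguments or "{}")
    url = str(args.get("url", ""))
    parsed = urlparse(url)
    # Compare the parsed host so "https://example.com.evil.org" is not accepted.
    if parsed.scheme != "https" or parsed.hostname not in ALLOWED_HOSTS:
        return ToolGuardrailFunctionOutput.reject_content(
            message="URL not allowed", output_info={"url": url}
        )
    return ToolGuardrailFunctionOutput.allow()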

Implemented in src/agents/tool_guardrails.py (lines 20-38).

Tool Output Guardrail

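Sanitize tool results before they reach the agent. Because allow passes the original result through unchanged, this example uses reject_content to substitute the redacted text for the raw output:

import re

from agents import ToolGuardrailFunctionOutput, ToolOutputGuardrailData, tool_output_guardrail

# Illustrative pattern for phone numbers such as 555-123-4567.
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

@tool_output_guardrail
def pii_sanitizer(data: ToolOutputGuardrailData) -> ToolGuardrailFunctionOutput:
    result = str(data.output)
    sanitized = PHONE_PATTERN.sub("[REDACTED]", result)
    if sanitized != result:
        # Send the redacted text to the model instead of the raw tool result.
        return ToolGuardrailFunctionOutput.reject_content(
            message=sanitized, output_info={"sanitized": True}
        )
    return ToolGuardrailFunctionOutput.allow()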

Implemented in src/agents/tool_guardrails.py (lines 59-78).

Wiring Tool Guardrails to a Function Tool

Attach guardrails to a specific tool by setting them on the FunctionTool that function_tool produces:

import json

from agents import ToolGuardrailFunctionOutput, function_tool, tool_input_guardrail, tool_output_guardrail

@tool_input_guardrail
def check_non_negative(data):
    # Assumes the raw arguments are exposed as a JSON string on the tool context.
    args = json.loads(data.context.tool_arguments or "{}")
    if args.get("x", 0) < 0:
        return ToolGuardrailFunctionOutput.reject_content(
            message="Only non-negative numbers allowed"
        )
    return ToolGuardrailFunctionOutput.allow()

@tool_output_guardrail
def log_output(data):
    print("Tool returned:", data.output)
    return ToolGuardrailFunctionOutput.allow()

@function_tool
def increment(x: int) -> int:
    """Return x plus one."""
    return x + 1

# function_tool returns a FunctionTool; attach the guardrails to it.
increment.tool_input_guardrails = [check_non_negative]
increment.tool_output_guardrails = [log_output]

FunctionTool is defined in src/agents/tool.py (lines 321-340).

Summary

  • Regular guardrails wrap the entire agent execution and return GuardrailFunctionOutput with a boolean tripwire_triggered that halts the entire run when true.
  • Tool guardrails isolate specific function tool calls and return ToolGuardrailFunctionOutput with granular behavior options (allow, reject_content, raise_exception) that can discard individual results without stopping the agent.
  • Regular guardrails attach via Agent or RunConfig and execute in src/agents/run.py, while tool guardrails attach to function tools (FunctionTool in src/agents/tool.py) and execute in src/agents/run_internal/tool_execution.py.
  • Tool guardrails receive specialized data containers (ToolInputGuardrailData, ToolOutputGuardrailData) containing tool-specific context, while regular guardrails receive the agent context and raw input/output directly.

Frequently Asked Questions

Can I use the same function for both regular and tool guardrails?

No, the function signatures and return types are incompatible. Regular guardrails accept multiple parameters (RunContextWrapper, Agent, and the input or output) and must return GuardrailFunctionOutput. Tool guardrails accept a single data parameter (ToolInputGuardrailData or ToolOutputGuardrailData) and must return ToolGuardrailFunctionOutput with a behavior. To share logic, move the check into a plain helper function and write a thin wrapper for each interface.
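One way to share logic across both interfaces is to keep the actual check in a pure function and adapt it twice. The sketch below uses stand-in result classes (`RegularResult`, `ToolResult`) instead of the library's types so it runs on its own; every name in it is hypothetical. In real code the two adapters would be decorated with the library's guardrail decorators and return its result types.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional

BLOCKED_TERMS = ("password", "api_key")


def find_blocked_term(text: str, blocked: Iterable[str] = BLOCKED_TERMS) -> Optional[str]:
    """Return the first blocked term found in text (case-insensitive), else None."""
    lowered = text.lower()
    for term in blocked:
        if term.lower() in lowered:
            return term
    return None


@dataclass
class RegularResult:
    """Stand-in for the regular guardrail result (boolean tripwire)."""
    tripwire_triggered: bool
    output_info: Any = None


@dataclass
class ToolResult:
    """Stand-in for the tool guardrail result (behavior plus optional message)."""
    behavior: str = "allow"
    message: Optional[str] = None


def regular_adapter(ctx: Any, agent: Any, agent_output: Any) -> RegularResult:
    # Regular interface: three parameters in, boolean tripwire out.
    term = find_blocked_term(str(agent_output))
    return RegularResult(tripwire_triggered=term is not None, output_info=term)


def tool_adapter(data: Any) -> ToolResult:
    # Tool interface: one data object in, behavior out.
    term = find_blocked_term(str(data.output))
    if term is None:
        return ToolResult()
    return ToolResult(behavior="reject_content", message=f"Result withheld: contains '{term}'")
```

Keeping the check free of library types also makes it straightforward to unit test without running an agent.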

What happens when a tool output guardrail rejects content?

When a tool guardrail returns ToolGuardrailFunctionOutput.reject_content(message="..."), the actual tool result is discarded and replaced with the custom message provided. This message is sent back to the language model as the tool result, allowing the conversation to continue rather than raising an exception. This contrasts with raise_exception, which raises a tool guardrail tripwire exception (ToolInputGuardrailTripwireTriggered or ToolOutputGuardrailTripwireTriggered) and halts the entire agent run.

Do regular guardrails execute when agents hand off to each other?

Only at the edges of the run. Input guardrails execute once, at the start of the run, against the first agent; output guardrails execute once, against the final output returned to the caller after all handoffs complete. An agent that receives control through a handoff does not trigger input guardrails again unless it is executed as a separate run with its own guardrail configuration.

Where are the behavior enums defined for tool guardrails?

The ToolGuardrailFunctionOutput class and its behavior options are defined in src/agents/tool_guardrails.py (lines 59-78) according to the openai-agents-python source. This file also contains the ToolInputGuardrailData and ToolOutputGuardrailData container classes, alongside the @tool_input_guardrail and @tool_output_guardrail decorators.
