# How to Implement Input and Output Guardrails for Content Filtering and Validation in openai-agents-python

> Learn how to implement input and output guardrails in openai-agents-python for robust content filtering and validation. Secure your AI applications with custom rules.

- Repository: [OpenAI/openai-agents-python](https://github.com/openai/openai-agents-python)
- Tags: how-to-guide
- Published: 2026-04-17

---

**The openai-agents-python SDK provides `InputGuardrail` and `OutputGuardrail` classes for validating user inputs and model outputs, plus specialized `ToolInputGuardrail` and `ToolOutputGuardrail` classes for function tool validation. Each is typically created by decorating a function that returns a `GuardrailFunctionOutput` or `ToolGuardrailFunctionOutput` object.**

The openai-agents-python framework includes a guardrail system for content filtering and validation that intercepts data at four points: before the agent processes user input, after the agent produces its final output, immediately before a function tool executes, and immediately after it returns. According to the source code in [`src/agents/guardrail.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/guardrail.py), agent-level guardrails are Python callables that receive contextual data and return a `GuardrailFunctionOutput` whose `tripwire_triggered` boolean determines whether execution aborts or continues. Tool guardrails return a `ToolGuardrailFunctionOutput`, which carries a behavior (allow, reject content, or raise an exception) instead of a boolean.

## Understanding the Guardrail Architecture

The guardrail system distinguishes four guardrail types, each serving a specific validation purpose in the agent lifecycle:

- **`InputGuardrail`** – Validates raw user input (strings or `TResponseInputItem` lists) before the agent processes the request. Defined in [`src/agents/guardrail.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/guardrail.py), it receives the `RunContextWrapper`, `Agent` instance, and user input.
- **`OutputGuardrail`** – Validates the final output object before it returns to the caller, running after the model completes generation but before the result leaves the agent boundary.
- **`ToolInputGuardrail`** – Executes immediately before a function tool is invoked in [`src/agents/run_internal/tool_execution.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/tool_execution.py), validating pre-parsed arguments.
- **`ToolOutputGuardrail`** – Executes after a function tool returns, allowing post-processing or redaction of sensitive data before the result reaches the model.

All guardrail types support both synchronous and asynchronous implementations. When multiple guardrails are configured, the runner in [`src/agents/run_internal/guardrails.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/guardrails.py) executes sequential guardrails first, then launches parallel guardrails concurrently.
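The sync/async support described above can be pictured with a small stdlib-only sketch. It is not the SDK's implementation: `call_guardrail`, `sync_check`, `async_check`, and `run_checks` are hypothetical names, and the sketch assumes only that a runner awaits a guardrail's return value when that value is awaitable.

```python
import asyncio
import inspect
from typing import Any, Callable


async def call_guardrail(func: Callable[..., Any], *args: Any) -> Any:
    """Call a guardrail-like function, awaiting the result only if needed."""
    result = func(*args)
    if inspect.isawaitable(result):
        result = await result
    return result


def sync_check(text: str) -> bool:
    """Synchronous check: True means the input should be blocked."""
    return "forbidden" in text


async def async_check(text: str) -> bool:
    """Asynchronous check, e.g. one that would call a moderation service."""
    await asyncio.sleep(0)  # placeholder for real I/O
    return len(text) > 20


async def run_checks(text: str) -> list[bool]:
    """Run both checks through the same code path."""
    return [await call_guardrail(check, text) for check in (sync_check, async_check)]
```

With this shape, guardrail authors can write whichever style suits the check, and the calling code does not need to know which one it received.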

## Implementing Input Guardrails for Content Filtering

Input guardrails act as the first line of defense against malicious or inappropriate user content. The following synchronous implementation demonstrates a profanity filter using the `@input_guardrail` decorator factory:

```python
from agents import GuardrailFunctionOutput, input_guardrail

PROFANITY = {"badword", "offensive"}

@input_guardrail
def profanity_filter(
    context,            # RunContextWrapper[TContext]
    agent,              # Agent[Any]
    user_input: str | list,
) -> GuardrailFunctionOutput:
    """Reject any input containing a profane token."""
    # List inputs (TResponseInputItem items) are flattened crudely via str().
    text = user_input if isinstance(user_input, str) else " ".join(str(i) for i in user_input)

    # Report the words actually found rather than a hardcoded value.
    offending = sorted(PROFANITY.intersection(text.lower().split()))
    if offending:
        return GuardrailFunctionOutput(
            output_info={"offending_words": offending},
            tripwire_triggered=True,
        )
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)
```
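The matching logic is easier to test when it lives in a plain function that the guardrail calls. The sketch below is one way to do that under stated assumptions: `find_profanity` and `BLOCKLIST` are hypothetical names, and tokenizing with a regular expression (so that `badword!` still matches) is a design choice of this sketch, not something the SDK prescribes.

```python
import re

BLOCKLIST = frozenset({"badword", "offensive"})  # hypothetical word list

_WORD = re.compile(r"[a-z0-9']+")


def find_profanity(text: str, blocklist: frozenset[str] = BLOCKLIST) -> list[str]:
    """Return the blocked words present in `text`, sorted and de-duplicated."""
    tokens = set(_WORD.findall(text.lower()))
    return sorted(tokens & blocklist)
```

A guardrail body could then set `tripwire_triggered=bool(find_profanity(text))` and pass the returned list as `output_info`. Note that whole-token matching deliberately ignores substrings, so `badwords` would not match `badword`.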

Attach the guardrail to an agent via the `input_guardrails` parameter in [`src/agents/agent.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/agent.py):

```python
from agents import Agent

my_agent = Agent(
    name="chatbot",
    model=my_model,  # placeholder: a model name or Model instance defined elsewhere
    input_guardrails=[profanity_filter],
)
```

When a guardrail returns `tripwire_triggered=True`, the runner raises `InputGuardrailTripwireTriggered` and aborts the run. The guardrail span created by the tracing package in [`src/agents/tracing/`](https://github.com/openai/openai-agents-python/tree/main/src/agents/tracing) records `triggered=True` for observability.
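Because a tripwire surfaces as an exception, calling code usually wraps the run in `try`/`except`. The sketch below shows the pattern with stdlib stand-ins so it runs without the SDK or an API key. `SketchOutput`, `SketchTripwire`, `guarded_run`, and `handle_request` are all hypothetical, and the length check merely stands in for a real guardrail. In an application, the `except` clause would name `InputGuardrailTripwireTriggered` instead.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SketchOutput:
    """Stand-in for a guardrail result: extra info plus the tripwire flag."""
    output_info: Any
    tripwire_triggered: bool


class SketchTripwire(Exception):
    """Stand-in for a tripwire exception that carries the triggering result."""

    def __init__(self, output: SketchOutput) -> None:
        super().__init__("guardrail tripwire triggered")
        self.output = output


def guarded_run(user_input: str) -> str:
    """Pretend runner: check the input, raise on a tripwire, else 'answer'."""
    result = SketchOutput(
        output_info={"length": len(user_input)},
        tripwire_triggered=len(user_input) > 50,
    )
    if result.tripwire_triggered:
        raise SketchTripwire(result)
    return f"echo: {user_input}"


def handle_request(user_input: str) -> str:
    """Application boundary: turn a tripwire into a safe fallback reply."""
    try:
        return guarded_run(user_input)
    except SketchTripwire as exc:
        return f"Request blocked ({exc.output.output_info})"
```

Catching the exception at the application boundary keeps the fallback message in one place and leaves the guardrail itself free of presentation concerns.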

## Implementing Output Guardrails for Validation

Output guardrails validate the agent's final output before it reaches the user. The following asynchronous example uses the `@output_guardrail` decorator to enforce a length limit, counting whitespace-separated words as a rough proxy for tokens:

```python
from agents import GuardrailFunctionOutput, output_guardrail

# Budget in whitespace-separated words, a rough stand-in for model tokens.
MAX_TOKENS = 300

@output_guardrail
async def length_limiter(
    context,            # RunContextWrapper[TContext]
    agent,              # Agent[Any]
    output,             # the agent's final output
) -> GuardrailFunctionOutput:
    """Abort if the generated text exceeds the word budget."""
    # str() guards against structured (non-string) outputs.
    token_count = len(str(output).split())
    if token_count > MAX_TOKENS:
        return GuardrailFunctionOutput(
            output_info={"token_count": token_count},
            tripwire_triggered=True,
        )
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)
```
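Length is only one property an output guardrail might validate. As a hedged illustration of shape validation, the sketch below checks a structured output against a small schema using only the standard library. `REQUIRED_FIELDS`, `MAX_SUMMARY_WORDS`, and `validation_problems` are hypothetical names, and the dict-shaped output is an assumption of this sketch rather than anything the SDK requires.

```python
from typing import Any

REQUIRED_FIELDS = {"title": str, "summary": str}  # hypothetical output schema
MAX_SUMMARY_WORDS = 300


def validation_problems(output: Any) -> list[str]:
    """Return human-readable problems; an empty list means the output passes."""
    if not isinstance(output, dict):
        return [f"expected a dict, got {type(output).__name__}"]
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in output:
            problems.append(f"missing field: {field}")
        elif not isinstance(output[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    summary = output.get("summary")
    if isinstance(summary, str) and len(summary.split()) > MAX_SUMMARY_WORDS:
        problems.append("summary exceeds word budget")
    return problems
```

An output guardrail could call this helper, trigger the tripwire when the list is non-empty, and attach the list as `output_info` so the caller can see why the output was rejected.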

Register this guardrail on the agent:

```python
my_agent = Agent(
    name="summarizer",
    model=my_model,
    output_guardrails=[length_limiter],
)

```

The `run_output_guardrails` function in [`src/agents/run_internal/guardrails.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/guardrails.py) awaits completion of all output guardrails. If any return `tripwire_triggered=True`, the runner raises `OutputGuardrailTripwireTriggered` and halts execution.

## Tool-Level Guardrails for Pre and Post Execution Validation

Tool guardrails provide granular control over function tool execution, with distinct behaviors defined in [`src/agents/tool_guardrails.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/tool_guardrails.py): **allow**, **reject_content** (replace output with a message), or **raise_exception** (abort the run).
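The three behaviors can be modeled in a few lines to make the difference concrete. This is a stdlib-only sketch, not the SDK's code: `SketchToolVerdict`, `SketchToolTripwire`, and `apply_verdict` are hypothetical stand-ins that mimic what the model ends up seeing under each behavior.

```python
from dataclasses import dataclass
from typing import Any, Literal, Optional

Behavior = Literal["allow", "reject_content", "raise_exception"]


@dataclass
class SketchToolVerdict:
    """Stand-in for a tool guardrail result (not the SDK class)."""
    behavior: Behavior
    message: Optional[str] = None


class SketchToolTripwire(Exception):
    """Stand-in for the exception that aborts the run."""


def apply_verdict(verdict: SketchToolVerdict, tool_output: Any) -> Any:
    """Decide what the model sees, given a verdict and the real tool output."""
    if verdict.behavior == "allow":
        return tool_output       # pass the result through unchanged
    if verdict.behavior == "reject_content":
        return verdict.message   # the model sees the message instead
    raise SketchToolTripwire("tool guardrail requested an abort")
```

Only the third branch ends the run; the first two let the conversation continue, which is the main difference from agent-level tripwires.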

### Tool Input Guardrails

The following example validates JSON arguments before tool execution:

```python
import json

from agents import ToolGuardrailFunctionOutput, tool_input_guardrail

@tool_input_guardrail
def validate_json(data):
    """Reject malformed JSON payloads for a function tool."""
    try:
        # The tool context exposes the raw argument string as `tool_arguments`.
        json.loads(data.context.tool_arguments)
    except (TypeError, ValueError) as exc:  # JSONDecodeError subclasses ValueError
        return ToolGuardrailFunctionOutput.reject_content(
            message="The tool arguments must be valid JSON.",
            output_info={"error": str(exc)},
        )
    return ToolGuardrailFunctionOutput.allow()
```
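Syntax is rarely the only thing worth checking in tool arguments. The following stdlib-only sketch extends the idea to required keys. `argument_error` is a hypothetical helper, and treating missing arguments as an empty JSON object is an assumption of this sketch.

```python
import json
from typing import Optional


def argument_error(raw_arguments: Optional[str], required: set[str]) -> Optional[str]:
    """Return an error message for bad tool arguments, or None if they pass."""
    try:
        parsed = json.loads(raw_arguments or "{}")
    except json.JSONDecodeError as exc:
        return f"arguments are not valid JSON: {exc.msg}"
    if not isinstance(parsed, dict):
        return "arguments must be a JSON object"
    missing = sorted(required - set(parsed))
    if missing:
        return "missing required arguments: " + ", ".join(missing)
    return None
```

A tool input guardrail could pass the returned string to `ToolGuardrailFunctionOutput.reject_content(...)` when it is not `None`, which gives the model a specific hint about what to fix.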

Attach to a `FunctionTool`:

```python
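from agents import FunctionTool

my_tool = FunctionTool(
    name="search",
    description="Search a knowledge base.",
    params_json_schema={...},        # JSON schema describing the tool arguments
    on_invoke_tool=my_search_impl,   # async callable receiving (tool_context, args_json)
    tool_input_guardrails=[validate_json],
)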
```

### Tool Output Guardrails

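Post-execution guardrails inspect a tool's result before it reaches the model. Returning `allow()` passes the original output through unchanged, so the example below uses `reject_content` to substitute a scrubbed copy when sensitive fields are present:

```python
import json

from agents import ToolGuardrailFunctionOutput, tool_output_guardrail

SENSITIVE_KEYS = {"ssn", "credit_card"}

@tool_output_guardrail
def scrub_sensitive(data):
    """Replace a tool's JSON output with a copy that omits sensitive fields."""
    try:
        result = json.loads(data.output)
    except (TypeError, ValueError):
        return ToolGuardrailFunctionOutput.allow()  # not JSON text; nothing to scrub
    if not isinstance(result, dict) or SENSITIVE_KEYS.isdisjoint(result):
        return ToolGuardrailFunctionOutput.allow()
    sanitized = json.dumps({k: v for k, v in result.items() if k not in SENSITIVE_KEYS})
    # The scrubbed JSON is sent to the model in place of the original output.
    return ToolGuardrailFunctionOutput.reject_content(
        message=sanitized,
        output_info={"sanitized": True},
    )
```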

The `_execute_tool_output_guardrails` function in [`src/agents/run_internal/tool_execution.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/tool_execution.py) processes these guardrails. Unlike agent-level guardrails, tool guardrails can modify content via `reject_content` without necessarily aborting the entire run.
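Tool results are often nested, in which case removing only top-level keys leaves sensitive values behind. The sketch below redacts recursively using only the standard library. `redact`, `redact_json`, and `SENSITIVE` are hypothetical names, and the sketch assumes the tool output is JSON text.

```python
import json
from typing import Any

SENSITIVE = frozenset({"ssn", "credit_card"})


def redact(value: Any, sensitive: frozenset[str] = SENSITIVE) -> Any:
    """Return a copy of `value` with sensitive keys removed at every depth."""
    if isinstance(value, dict):
        return {k: redact(v, sensitive) for k, v in value.items() if k not in sensitive}
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    return value


def redact_json(text: str) -> str:
    """Parse JSON text, redact it, and serialize it again."""
    return json.dumps(redact(json.loads(text)), sort_keys=True)
```

Because `redact` builds new containers instead of mutating its argument, the original tool result stays intact for logging or auditing.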

## Execution Flow and Error Handling

The guardrail execution pipeline follows a strict orchestration pattern defined in [`src/agents/run_internal/guardrails.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/guardrails.py):

1. **Collection** – The runner gathers guardrails from the agent definition and any runtime configuration.
2. **Sequential Execution** – Guardrails marked as sequential run first, in order.
3. **Parallel Execution** – Remaining guardrails execute concurrently via `asyncio.gather`.
4. **Tripwire Evaluation** – If an agent-level guardrail returns `tripwire_triggered=True`, the runner raises `InputGuardrailTripwireTriggered` or `OutputGuardrailTripwireTriggered`. A tool guardrail that returns the `raise_exception` behavior raises `ToolInputGuardrailTripwireTriggered` or `ToolOutputGuardrailTripwireTriggered`.
5. **Tracing** – Each guardrail execution creates a span via the tracing package in [`src/agents/tracing/`](https://github.com/openai/openai-agents-python/tree/main/src/agents/tracing), annotated with `triggered=True` when violations occur.

For streaming scenarios, `run_input_guardrails_with_queue` manages guardrail execution against queued input items.
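The numbered steps earlier in this section can be sketched with `asyncio` alone. This is a simplified model, not the SDK's runner: `SketchResult`, `SketchTripwireError`, `make_check`, and `run_pipeline` are hypothetical, and the sketch waits for every parallel check to finish before evaluating tripwires, which a real runner need not do.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class SketchResult:
    name: str
    tripwire_triggered: bool


class SketchTripwireError(Exception):
    """Stand-in for the tripwire exceptions named above."""


Check = Callable[[str], Awaitable[SketchResult]]


def make_check(name: str, banned: str) -> Check:
    """Build a toy async check that trips when `banned` appears in the input."""
    async def check(text: str) -> SketchResult:
        await asyncio.sleep(0)  # placeholder for real work
        return SketchResult(name, banned in text)
    return check


async def run_pipeline(text: str, sequential: list[Check], parallel: list[Check]) -> list[str]:
    """Run sequential checks in order, then the rest concurrently."""
    passed: list[str] = []
    for check in sequential:               # sequential execution, in order
        result = await check(text)
        if result.tripwire_triggered:      # stop at the first violation
            raise SketchTripwireError(result.name)
        passed.append(result.name)
    # Parallel execution: gather returns results in the order checks were given.
    results = await asyncio.gather(*(check(text) for check in parallel))
    for result in results:
        if result.tripwire_triggered:
            raise SketchTripwireError(result.name)
        passed.append(result.name)
    return passed
```

Running cheap, decisive checks sequentially first means an obvious violation can stop the pipeline before the concurrent checks are even started.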

## Summary

- **Input and output guardrails** in openai-agents-python validate content at the agent boundary using `InputGuardrail` and `OutputGuardrail` classes from [`src/agents/guardrail.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/guardrail.py).
- **Tool guardrails** provide pre- and post-execution validation via `ToolInputGuardrail` and `ToolOutputGuardrail` in [`src/agents/tool_guardrails.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/tool_guardrails.py), and can replace content without aborting the run.
- **Tripwire mechanism** – Returning a `GuardrailFunctionOutput` with `tripwire_triggered=True` aborts execution and raises a guardrail-specific exception; the trigger is also recorded in traces.
- **Async support** – All guardrail types support both sync and async implementations, with parallel execution handled by [`src/agents/run_internal/guardrails.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/guardrails.py).
- **Registration** – Create guardrails with the decorators (`@input_guardrail`, `@output_guardrail`, `@tool_input_guardrail`, `@tool_output_guardrail`) and attach them via list parameters on `Agent` or `FunctionTool` instances.

## Frequently Asked Questions

### What is the difference between InputGuardrail and ToolInputGuardrail?

**`InputGuardrail`** validates the raw user input before the agent begins processing, operating at the conversation level in [`src/agents/run.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run.py). **`ToolInputGuardrail`** executes immediately before a specific function tool is called, validating pre-parsed arguments in [`src/agents/run_internal/tool_execution.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/tool_execution.py). Tool guardrails return `ToolGuardrailFunctionOutput` which supports **content rejection** (replacing output with a message) in addition to the binary allow/block behavior of standard guardrails.

### How do I handle asynchronous guardrail functions?

Decorate async functions with the same decorators (`@input_guardrail`, `@output_guardrail`, `@tool_input_guardrail`, `@tool_output_guardrail`). The runner in [`src/agents/run_internal/guardrails.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/run_internal/guardrails.py) automatically detects coroutines and awaits them using `asyncio.gather` for parallel guardrails. Both sync and async implementations return the same `GuardrailFunctionOutput` or `ToolGuardrailFunctionOutput` objects.

### What happens when a guardrail triggers a tripwire?

When an agent-level guardrail returns `tripwire_triggered=True` in its output object, the runner raises `InputGuardrailTripwireTriggered` or `OutputGuardrailTripwireTriggered`. Tool guardrails abort the run by returning the `raise_exception` behavior, which surfaces as `ToolInputGuardrailTripwireTriggered` or `ToolOutputGuardrailTripwireTriggered`. In each case the agent stops execution, and the trace system records a guardrail span with `triggered=True` for observability, as implemented in the tracing package at [`src/agents/tracing/`](https://github.com/openai/openai-agents-python/tree/main/src/agents/tracing).

### Can guardrails modify content instead of blocking it?

**Standard input and output guardrails** operate on a binary allow/block model and cannot modify content. However, **tool guardrails** support content modification via `ToolGuardrailFunctionOutput.reject_content()`, which replaces the tool output with a custom message that gets sent back to the model instead of the original result. This allows the conversation to continue with corrected or sanitized information without aborting the entire agent run.