# How to Use the Responses API for Reasoning Models with Reasoning Items

> Discover how to use the Responses API for reasoning models and items. Persist chain-of-thought tokens with reasoning IDs across turns for enhanced AI interactions.

- Repository: [OpenAI/openai-cookbook](https://github.com/openai/openai-cookbook)
- Tags: how-to-guide
- Published: 2026-03-02

---

**The Responses API returns assistant messages and reasoning items as separate elements of the `output` array, so you can carry the model's reasoning across turns either by passing `previous_response_id` or by including the reasoning items (optionally with encrypted content) in the next request's input.**

The OpenAI Cookbook repository provides comprehensive examples for working with reasoning models like **o3** and **o4-mini** through the Responses API. Unlike traditional chat completions, this API exposes the model's internal reasoning process as discrete items that can be cached, encrypted, or manually threaded through multi-turn conversations to improve intelligence, reduce latency, and lower costs.

## Understanding the Responses API Structure for Reasoning

When you invoke a reasoning model through `client.responses.create`, the returned payload contains an `output` array with distinct element types. This architecture enables precise control over how chain-of-thought tokens flow through your application.

### The Output Array Anatomy

According to the implementation in `examples/responses_api/reasoning_items.ipynb`, the `output` array contains:

- A `reasoning` element exposing the model's internal deliberation (via an ID or encrypted payload), identified by prefixes like `rs_…`
- One or more `message` elements containing the user-facing `output_text`

This separation lets you inspect or store the reasoning item's ID without handling the raw chain-of-thought tokens, which the API does not return. When the reasoning item is passed back in a later turn, the model can build on it instead of starting its reasoning from scratch.
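
As a minimal sketch of that separation, the helper below groups output items by their `type` field. The `split_output` name and the mock IDs are hypothetical; `SimpleNamespace` objects stand in for a real `response.output` so the snippet runs without an API call.

```python
from types import SimpleNamespace


def split_output(output):
    """Group output items into reasoning items and message items by `type`."""
    reasoning = [item for item in output if item.type == "reasoning"]
    messages = [item for item in output if item.type == "message"]
    return reasoning, messages


# Stand-in for `response.output`; the IDs are made up for illustration.
mock_output = [
    SimpleNamespace(type="reasoning", id="rs_example"),
    SimpleNamespace(type="message", id="msg_example"),
]

reasoning_items, message_items = split_output(mock_output)
print([item.id for item in reasoning_items])  # ['rs_example']
```

Filtering by `type` avoids relying on the reasoning item always being at a fixed position in the array.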

### Stateful Conversation Management

The Responses API is inherently stateful. Once a reasoning item is generated, you can persist it across subsequent calls using two distinct patterns:

1. **Automatic persistence**: Pass `previous_response_id` referencing the prior response, and the API automatically includes all reasoning items from that turn
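2. **Manual threading**: Explicitly include the prior `output` array (including the reasoning element) in your next `input` list, after the earlier input items; this is the pattern to use when responses are not stored (`store=False`), and a common choice when inserting function results or other intermediate steps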
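
The two patterns above can be sketched as small builders that assemble the keyword arguments for `client.responses.create(**kwargs)`. The function names are hypothetical, and the sketch assumes `previous_response_id` receives the prior response's own `id`, not the ID of a reasoning item.

```python
def followup_with_previous_id(model, previous_response_id, user_text):
    """Automatic persistence: the API restores prior items from the stored response."""
    return {
        "model": model,
        "previous_response_id": previous_response_id,
        "input": [{"role": "user", "content": user_text}],
    }


def followup_with_manual_context(model, history, prior_output, user_text):
    """Manual threading: resend the earlier input plus the full prior output."""
    return {
        "model": model,
        "input": [*history, *prior_output, {"role": "user", "content": user_text}],
    }


history = [{"role": "user", "content": "Tell me a joke"}]
prior_output = [{"type": "reasoning", "id": "rs_example", "summary": []}]  # made-up item

auto_kwargs = followup_with_previous_id("o4-mini", "resp_example", "Another one")
manual_kwargs = followup_with_manual_context("o4-mini", history, prior_output, "Another one")
print(len(manual_kwargs["input"]))  # 3
```

The first builder keeps requests small because the server holds the state; the second keeps the full context in your own hands.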

## Basic Implementation with Reasoning Items

To retrieve a reasoning item from **o4-mini**, initialize the client and inspect the structured output:

```python
from openai import OpenAI
import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.responses.create(
    model="o4-mini",
    input="Tell me a joke"
)

print(response.output)        # List containing reasoning and message items
print(response.output[0].id)  # Reasoning item ID, e.g., rs_…
```

As shown in lines 71-78 of `reasoning_items.ipynb`, the first element of `response.output` is the reasoning item, while subsequent elements contain the actual response content. The reasoning item's ID (`rs_…`) identifies that item within the output; it is distinct from `response.id`, which is the value `previous_response_id` expects in a follow-up call.

## Advanced Patterns for Production Use

For complex workflows involving tool usage or compliance-sensitive data, you must explicitly manage how reasoning items traverse the conversation state.

### Function Calling with Reasoning Persistence

When using function calling with reasoning models, always forward the reasoning item alongside the function output. This prevents the model from regenerating its chain-of-thought and ensures continuity in its reasoning process:

```python
import json

tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get current temperature for provided coordinates.",
    "parameters": {
        "type": "object",
        "properties": {
            "latitude": {"type": "number"},
            "longitude": {"type": "number"}
        },
        "required": ["latitude", "longitude"],
        "additionalProperties": False
    },
    "strict": True
}]

# First turn: the model reasons and decides to call the function
context = [{"role": "user", "content": "What's the weather in Paris?"}]

resp = client.responses.create(
    model="o4-mini",
    input=context,
    tools=tools,
)

# Keep the full output, including the reasoning item, after the user message
context += resp.output

# Find the function_call item by type rather than by position
tool_call = next(item for item in resp.output if item.type == "function_call")

# Execute the function locally; get_weather is your own implementation
args = json.loads(tool_call.arguments)
weather = get_weather(args["latitude"], args["longitude"])

# Append the function result to the context (the reasoning item stays in place)
context.append({
    "type": "function_call_output",
    "call_id": tool_call.call_id,
    "output": str(weather)
})

# Second turn: send the reasoning item and function result back to the model
resp2 = client.responses.create(
    model="o4-mini",
    input=context,
    tools=tools,
)

print(resp2.output_text)  # Final answer using persisted reasoning
```

This pattern, documented in lines 96-129 of `reasoning_items.ipynb`, ensures the model retains its original reasoning context when processing tool outputs, eliminating redundant token generation.
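
When a response may contain several function calls, the lookup-and-append step can be generalized. The following is a sketch only: `run_tool_calls`, the `registry` mapping, and `fake_weather` are hypothetical names, and the mock item mirrors the `name`, `arguments`, and `call_id` fields used in the example above.

```python
import json
from types import SimpleNamespace


def run_tool_calls(output, registry):
    """Return one function_call_output item per function_call found in `output`."""
    results = []
    for item in output:
        if item.type != "function_call":
            continue  # reasoning and message items are left untouched
        handler = registry[item.name]
        args = json.loads(item.arguments)
        results.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": str(handler(**args)),
        })
    return results


def fake_weather(latitude, longitude):
    """Placeholder for a real weather lookup."""
    return 21.5


mock_output = [
    SimpleNamespace(type="reasoning", id="rs_example"),
    SimpleNamespace(
        type="function_call",
        name="get_weather",
        call_id="call_example",
        arguments='{"latitude": 48.85, "longitude": 2.35}',
    ),
]

tool_outputs = run_tool_calls(mock_output, {"get_weather": fake_weather})
print(tool_outputs[0]["output"])  # 21.5
```

The returned items can be appended to the context after the original output, so the reasoning item and each call's result travel together in the next request.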

### Zero-Data-Retention with Encrypted Reasoning

For compliance-sensitive workloads requiring ZDR (Zero Data Retention), the API supports **encrypted reasoning items**: add `reasoning.encrypted_content` to the `include` parameter and set `store=False`. The reasoning tokens are then not persisted by OpenAI; they are returned to you in encrypted form and are decrypted only in memory when you send them back on the next request:

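```python
import json

context = [{"role": "user", "content": "Weather in Paris?"}]

resp = client.responses.create(
    model="o3",
    input=context,
    tools=tools,
    store=False,                              # Nothing is stored server-side
    include=["reasoning.encrypted_content"],  # Return the reasoning encrypted
)

# The reasoning item now carries an encrypted payload
reasoning_item = next(item for item in resp.output if item.type == "reasoning")
print(reasoning_item.encrypted_content)  # Encrypted chain-of-thought

# Forward the full output, including the encrypted reasoning item
context += resp.output

# If the model requested a tool call, answer it before the next request
for item in resp.output:
    if item.type == "function_call":
        args = json.loads(item.arguments)
        context.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": str(get_weather(args["latitude"], args["longitude"])),
        })

# Re-use the encrypted content in the next call
resp2 = client.responses.create(
    model="o3",
    input=context,
    tools=tools,
    store=False,
    include=["reasoning.encrypted_content"],
)
```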

As detailed in the cookbook's safeguard guides and demonstrated in lines 126-136 of `reasoning_items.ipynb`, encrypted reasoning items enable sensitive use-cases while maintaining the performance benefits of cached reasoning across conversation turns.
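
Because nothing is stored server-side in this mode, a forwarded reasoning item is only useful if it actually carries its encrypted payload. A small pre-flight check can catch a forgotten `include` entry. This is a sketch under assumptions: `missing_encrypted_reasoning` is a hypothetical helper, and the mock objects stand in for real output items.

```python
from types import SimpleNamespace


def missing_encrypted_reasoning(output):
    """Return the IDs of reasoning items that carry no encrypted payload."""
    return [
        item.id
        for item in output
        if item.type == "reasoning" and not getattr(item, "encrypted_content", None)
    ]


# Made-up items: one with an encrypted payload, one without.
mock_output = [
    SimpleNamespace(type="reasoning", id="rs_ok", encrypted_content="gAAAA-example"),
    SimpleNamespace(type="reasoning", id="rs_missing", encrypted_content=None),
    SimpleNamespace(type="message", id="msg_example"),
]

print(missing_encrypted_reasoning(mock_output))  # ['rs_missing']
```

An empty result suggests the output is safe to forward as context in a `store=False` conversation.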

## Summary

- The Responses API returns reasoning items separately from message content in the `output` array, accessible via IDs or encrypted payloads
- **Stateful persistence** via `previous_response_id` automatically maintains reasoning context; manual input threading gives explicit control in function-calling workflows and is the approach to use when `store=False`
- **Function calling** requires appending outputs to the original `output` array (including the reasoning item) to preserve the model's chain-of-thought
- **Encrypted reasoning items** with `store=False` enable ZDR compliance while still allowing reasoning reuse across turns
- Implementation examples are available in `examples/responses_api/reasoning_items.ipynb` within the openai/openai-cookbook repository

## Frequently Asked Questions

### What is the difference between reasoning items and message items?

**Reasoning items** contain the model's internal chain-of-thought tokens, exposed only via an ID or encrypted content for privacy and efficiency. **Message items** contain the user-facing `output_text` that answers the query. The Responses API returns these as separate elements in the `output` array to enable granular control over context management.

### How do I maintain context across multiple turns?

You can maintain context by either setting `previous_response_id` to the prior response's ID (automatic persistence) or explicitly including the full `output` array (including reasoning items) in the next request's `input` parameter. The latter gives you explicit control when you insert function results or modify the conversation flow between turns, and it is the option available when responses are not stored (`store=False`).

### When should I use encrypted reasoning items?

Use encrypted reasoning items when processing **compliance-sensitive data** that requires Zero Data Retention (ZDR). Set `store=False` and request `reasoning.encrypted_content` in the `include` parameter. This ensures OpenAI never stores the reasoning tokens, which remain encrypted throughout the round-trip and reside only in your application's memory.

### Does using reasoning items reduce API costs?

**It can.** When reasoning items are carried across turns, the model can build on reasoning it has already produced around a function call instead of redoing it, and a stable conversation prefix makes prompt caching more likely to hit. Cached input tokens are typically billed at a lower rate than uncached ones, so persisting reasoning items can reduce both cost and latency in multi-turn conversations.
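
One way to check whether this is paying off is to look at the token accounting on each response. The sketch below assumes the usage object exposes `input_tokens`, `input_tokens_details.cached_tokens`, and `output_tokens_details.reasoning_tokens`; verify these field names against your SDK version. `summarize_usage` is a hypothetical helper and the numbers are invented.

```python
from types import SimpleNamespace


def summarize_usage(usage):
    """Report cached input tokens, reasoning tokens, and the cache hit ratio."""
    cached = usage.input_tokens_details.cached_tokens
    reasoning = usage.output_tokens_details.reasoning_tokens
    ratio = cached / usage.input_tokens if usage.input_tokens else 0.0
    return {
        "cached_input_tokens": cached,
        "reasoning_tokens": reasoning,
        "cache_hit_ratio": ratio,
    }


# Stand-in for `response.usage` with made-up values.
mock_usage = SimpleNamespace(
    input_tokens=1000,
    input_tokens_details=SimpleNamespace(cached_tokens=800),
    output_tokens=300,
    output_tokens_details=SimpleNamespace(reasoning_tokens=256),
)

summary = summarize_usage(mock_usage)
print(summary["cache_hit_ratio"])  # 0.8
```

Comparing the ratio across turns, with and without forwarded reasoning items, gives a rough indication of the savings for your own workload.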