# How to Implement Streaming Responses with Real-Time Agent Updates in the Microsoft Agent Framework

> Learn to implement streaming responses and real-time agent updates with the Microsoft Agent Framework. Discover its three-layer architecture for efficient AI content delivery.

- Repository: [Microsoft/agent-framework](https://github.com/microsoft/agent-framework)
- Tags: how-to-guide
- Published: 2026-04-05

---

**The Microsoft Agent Framework enables real-time streaming of AI-generated content through a three-layer architecture comprising HTTP transport, event conversion, and client orchestration, allowing applications to receive tokens, tool calls, and state updates as soon as they are produced.**

The Microsoft Agent Framework provides native support for streaming AI-generated content, so clients can consume partial results incrementally instead of waiting for a complete response. This capability is essential for responsive applications that display real-time agent updates, tool execution status, and intermediate reasoning. In the microsoft/agent-framework source code, the `AGUIChatClient` and `ResponseStream` classes handle most of the plumbing, so a streaming client needs little boilerplate.

## Architecture Overview

The streaming implementation rests on three layers that together carry real-time updates from server to client.

### HTTP Transport Layer

The **HTTP Transport Layer** delivers continuous event streams from the server to the client using Server-Sent Events (SSE). In [`python/packages/ag-ui/agent_framework_ag_ui/_http_service.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_http_service.py), the `AGUIHttpService` class manages low-level async HTTP requests through its `post_run` method, which returns an async generator yielding raw server events.

Server implementations include:

- **Azure Functions**: The `/api/agent/stream/{thread_id}` endpoint responds with the `text/event-stream` content type
- **FastMCP**: The [`python/scripts/local_mcp_streamable_http_server.py`](https://github.com/microsoft/agent-framework/blob/main/python/scripts/local_mcp_streamable_http_server.py) script exposes streamable HTTP endpoints for local development
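To make the transport layer concrete, here is a minimal sketch of how SSE lines can be grouped into JSON events. It is not the `AGUIHttpService` implementation: `iter_sse_events`, `fake_lines`, and `collect` are hypothetical helpers, the sample payloads are invented, and the sketch assumes every event carries one JSON object in its `data:` field.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Iterable


async def iter_sse_events(lines: AsyncIterator[str]) -> AsyncIterator[dict[str, Any]]:
    """Group SSE `data:` lines into events and decode each payload as JSON."""
    data_parts: list[str] = []
    async for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive; ignore it.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    # An event that is not terminated by a blank line is dropped.


async def fake_lines(chunks: Iterable[str]) -> AsyncIterator[str]:
    """Stand-in for a network response that yields one line at a time."""
    for chunk in chunks:
        yield chunk


async def collect(chunks: Iterable[str]) -> list[dict[str, Any]]:
    return [event async for event in iter_sse_events(fake_lines(chunks))]


if __name__ == "__main__":
    sample = ['data: {"type": "run_started"}\n', "\n"]
    print(asyncio.run(collect(sample)))
```

In a real client the line source would be the streaming body of the HTTP response rather than `fake_lines`.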

### Event Conversion Layer

The **Event Conversion Layer** transforms raw server-side JSON events into strongly-typed framework objects. The `AGUIEventConverter` class in [`python/packages/ag-ui/agent_framework_ag_ui/_event_converters.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_event_converters.py) parses each SSE event and maps its `type` field to specific `ChatResponseUpdate` subclasses including `MessageStartUpdate`, `MessageContentUpdate`, `ToolCallStartUpdate`, `ToolCallEndUpdate`, and `StateSnapshotUpdate`.
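The mapping from an event's `type` field to a typed object can be sketched as follows. The dataclasses below are simplified stand-ins, not the framework's `ChatResponseUpdate` classes, and the payload field names (`text`, `id`, `name`, `result`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass
class TextDelta:
    text: str


@dataclass
class ToolCallStarted:
    call_id: str
    name: str


@dataclass
class ToolCallEnded:
    call_id: str
    result: Any


Update = Union[TextDelta, ToolCallStarted, ToolCallEnded]


def convert_event(event: dict[str, Any]) -> Optional[Update]:
    """Map a raw JSON event to a typed update, or None if it carries no update."""
    kind = event.get("type")
    if kind == "message":
        return TextDelta(text=event.get("text", ""))
    if kind == "tool_call_start":
        return ToolCallStarted(call_id=event["id"], name=event["name"])
    if kind == "tool_call_end":
        return ToolCallEnded(call_id=event["id"], result=event.get("result"))
    # Lifecycle events such as run_started, and unknown types, yield nothing.
    return None
```

Returning `None` for events without a typed counterpart lets the caller filter them out with a simple `if update is not None` check.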

### Client Orchestration Layer

The **Client Orchestration Layer** exposes the streaming API through the `BaseChatClient` contract. The `AGUIChatClient` in [`python/packages/ag-ui/agent_framework_ag_ui/_client.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_client.py) implements `_inner_get_response` to return a `ResponseStream` object when `stream=True` is passed. This layer handles tool-call unwrapping via `_apply_server_function_call_unwrap` and manages the distinction between client-side and server-side tool execution.

## Server-Side Streaming Implementation

Azure Functions and the FastMCP script both stream JSON events to the client: the former over Server-Sent Events, the latter over streamable HTTP.

### Azure Functions SSE Endpoint

The Azure Functions workflow orchestrator yields partial results that the HTTP function emits as SSE messages. Each event contains a JSON object with event types such as `run_started`, `message`, `tool_call_start`, `tool_call_end`, and `run_finished`.
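As a rough illustration of the wire format, the sketch below serializes JSON events into SSE frames. `format_sse` and `sse_body` are hypothetical helpers rather than functions from the Azure Functions package, and the event payloads are invented examples.

```python
import json
from typing import Any, Iterable, Iterator


def format_sse(event: dict[str, Any]) -> str:
    """Serialize one event as an SSE frame: a `data:` line plus a blank line."""
    # Compact JSON never contains a raw newline, so a single data line suffices.
    payload = json.dumps(event, separators=(",", ":"))
    return f"data: {payload}\n\n"


def sse_body(events: Iterable[dict[str, Any]]) -> Iterator[str]:
    """Yield frames one at a time so the server can flush each as it is produced."""
    for event in events:
        yield format_sse(event)


if __name__ == "__main__":
    for frame in sse_body([{"type": "run_started"}, {"type": "run_finished"}]):
        print(frame, end="")
```

The blank line at the end of each frame is what tells the client that an event is complete.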

Integration tests in [`python/packages/azurefunctions/tests/integration_tests/test_03_reliable_streaming.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/azurefunctions/tests/integration_tests/test_03_reliable_streaming.py) demonstrate the client request format:

```python
# Open the SSE stream for an existing thread (excerpt from the integration test).
stream_response = requests.get(
    f"{self.stream_url}/{thread_id}",
    headers={"Accept": "text/event-stream"},
    timeout=30,
)
```

### FastMCP Streamable HTTP Server

For local development and testing, the [`python/scripts/local_mcp_streamable_http_server.py`](https://github.com/microsoft/agent-framework/blob/main/python/scripts/local_mcp_streamable_http_server.py) script creates a `FastMCP` instance with the `streamable_http_path` parameter:

```python
# `...` stands for arguments elided in this excerpt.
server = FastMCP(..., streamable_http_path=mount_path, ...)
# Serve over the streamable HTTP transport.
server.run(transport="streamable-http")
```

This server exposes POST and GET endpoints that follow the same JSON event schema as the Azure Functions implementation, making it suitable for CI pipelines and local debugging.

## Client-Side Streaming Implementation

Consuming a streaming response requires initializing `AGUIChatClient` with the appropriate endpoint and calling `get_response` with `stream=True`.

### The Streaming Implementation Method

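The `AGUIChatClient._streaming_impl` method in [`_client.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_client.py) (lines 82-90) builds the request and iterates over the HTTP service's async generator, converting each raw event and yielding only those that map to an update: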

```python
async for event in self._http_service.post_run(
    thread_id=thread_id,
    run_id=run_id,
    messages=agui_messages,
    state=state,
    tools=agui_tools,
    ...
):
    logger.debug(f"[AGUIChatClient] Raw AG-UI event: {event}")
    update = converter.convert_event(event)
    if update is not None:
        yield update
```

### Response Stream and Finalization

When streaming is enabled, `_inner_get_response` returns a `ResponseStream` object that wraps the async generator with a finalizer:

```python
if stream:
    return ResponseStream(
        self._streaming_impl(messages=messages, options=options, **kwargs),
        finalizer=ChatResponse.from_updates,
    )
```

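The `ResponseStream` class yields `ChatResponseUpdate` objects incrementally and, once the stream completes, passes the collected updates to the finalizer (`ChatResponse.from_updates`) to build the final `ChatResponse`. The transform hook applied by `_apply_server_function_call_unwrap` strips server-side tool-call wrappers from the updates before they reach the application.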
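The finalizer pattern can be illustrated with a small stand-alone wrapper. `FinalizingStream` and its `final` method are hypothetical names for this sketch and do not reproduce the framework's `ResponseStream` API; plain strings stand in for updates.

```python
import asyncio
from typing import AsyncIterator, Callable, Generic, TypeVar

U = TypeVar("U")
R = TypeVar("R")


class FinalizingStream(Generic[U, R]):
    """Yield updates as they arrive and build a final result from all of them."""

    def __init__(self, source: AsyncIterator[U], finalizer: Callable[[list[U]], R]) -> None:
        self._source = source
        self._finalizer = finalizer
        self._seen: list[U] = []

    def __aiter__(self) -> AsyncIterator[U]:
        return self._iterate()

    async def _iterate(self) -> AsyncIterator[U]:
        async for update in self._source:
            self._seen.append(update)  # remember every update for the finalizer
            yield update

    async def final(self) -> R:
        # Drain whatever the caller has not consumed yet, then aggregate.
        async for update in self._source:
            self._seen.append(update)
        return self._finalizer(self._seen)


async def tokens() -> AsyncIterator[str]:
    for token in ["Hel", "lo"]:
        yield token


async def main() -> None:
    stream = FinalizingStream(tokens(), "".join)
    async for token in stream:
        print(token, end="", flush=True)
    print("\nfinal:", await stream.final())


if __name__ == "__main__":
    asyncio.run(main())
```

Keeping aggregation in a separate finalizer means the same stream serves callers that want incremental updates and callers that only want the complete response.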

## Handling Real-Time Agent Updates

The framework distinguishes between different update types to provide granular visibility into agent execution.

### Event Types and State Changes

The `AGUIEventConverter` processes several event types:

- **MessageContentUpdate**: Contains individual tokens as they are generated by the LLM
- **ToolCallStartUpdate**: Signals the beginning of a function invocation
- **ToolCallEndUpdate**: Delivers the result of a completed tool call
- **StateSnapshotUpdate**: Provides intermediate workflow state for debugging
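One way an application might fold these updates into UI state is sketched below. Updates are modeled as plain dictionaries with a hypothetical `kind` key instead of the framework's update classes, and `UiState` and `apply_update` are illustrative names only.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class UiState:
    text: str = ""
    running_tools: dict[str, str] = field(default_factory=dict)  # call id -> tool name
    tool_results: dict[str, Any] = field(default_factory=dict)   # call id -> result
    snapshot: Optional[dict[str, Any]] = None


def apply_update(state: UiState, update: dict[str, Any]) -> UiState:
    """Fold a single update into the UI state and return the state."""
    kind = update["kind"]
    if kind == "message_content":
        state.text += update["text"]  # append the newly generated token
    elif kind == "tool_call_start":
        state.running_tools[update["call_id"]] = update["name"]
    elif kind == "tool_call_end":
        state.running_tools.pop(update["call_id"], None)
        state.tool_results[update["call_id"]] = update["result"]
    elif kind == "state_snapshot":
        state.snapshot = update["state"]
    return state


if __name__ == "__main__":
    state = UiState()
    for upd in [
        {"kind": "message_content", "text": "Checking the time"},
        {"kind": "tool_call_start", "call_id": "c1", "name": "get_utc_time"},
        {"kind": "tool_call_end", "call_id": "c1", "result": "12:00"},
    ]:
        apply_update(state, upd)
    print(state)
```

A UI could show a progress indicator while `running_tools` is non-empty and re-render the text on every message update.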

### Client-Side vs Server-Side Tool Execution

The client handles tool calls differently based on their origin:

- **Client-side tools**: Executed locally by the `FunctionInvocationLayer` when the function name matches a registered local tool, allowing real-time feedback during execution
- **Server-side tools**: Wrapped with a `server_function_call` placeholder that `_unwrap_server_function_call_contents` removes before yielding to the application, ensuring middleware receives clean objects
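The routing decision described above can be sketched with plain dictionaries. Everything here is an assumption for illustration: the content shapes, the `function_call` key inside the placeholder, and the `LOCAL_TOOLS`, `register`, and `route` names do not come from the framework.

```python
from typing import Any, Callable

# Hypothetical registry of tools that run in the client process.
LOCAL_TOOLS: dict[str, Callable[..., Any]] = {}


def register(fn: Callable[..., Any]) -> Callable[..., Any]:
    LOCAL_TOOLS[fn.__name__] = fn
    return fn


@register
def add(a: int, b: int) -> int:
    return a + b


def route(content: dict[str, Any]) -> dict[str, Any]:
    """Execute local tool calls, unwrap server-side placeholders, pass the rest."""
    kind = content.get("type")
    if kind == "server_function_call":
        # The server already ran this tool: hand back the inner content only.
        return content["function_call"]
    if kind == "function_call" and content["name"] in LOCAL_TOOLS:
        result = LOCAL_TOOLS[content["name"]](**content["arguments"])
        return {"type": "function_result", "call_id": content["call_id"], "result": result}
    return content


if __name__ == "__main__":
    call = {"type": "function_call", "call_id": "c1", "name": "add", "arguments": {"a": 2, "b": 3}}
    print(route(call))
```

Matching on the registered name keeps the decision local to the client: calls it does not recognize pass through untouched.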

## Complete Implementation Examples

### Basic Streaming Client

The following example demonstrates consuming token-by-token text generation:

```python
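import asyncio

from agent_framework import Message
from agent_framework.ag_ui import AGUIChatClient


async def demo() -> None:
    # The endpoint is a placeholder for a locally running AG-UI server.
    async with AGUIChatClient(endpoint="http://localhost:8888/") as client:
        async for upd in client.get_response(
            [Message(role="user", contents=["Write a haiku about cats."])],
            stream=True,
        ):
            # Print text content as it arrives; other content types are skipped.
            for c in upd.contents:
                if getattr(c, "type", None) == "text":
                    print(c.text, end="", flush=True)
        print("\n--- done ---")


if __name__ == "__main__":
    asyncio.run(demo())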
```

### Streaming with Client-Side Tools

To include tools that execute locally while streaming:

```python
import asyncio
from datetime import datetime, timezone

from agent_framework import Agent, tool
from agent_framework.ag_ui import AGUIChatClient


@tool(description="Return the current UTC time as a string.")
def get_utc_time() -> str:
    return datetime.now(timezone.utc).isoformat()


agent = Agent(name="TimeBot", tools=[get_utc_time])


async def run() -> None:
    async with AGUIChatClient(endpoint="http://localhost:8888/") as client:
        async for upd in agent.run(
            "What time is it now?",
            client=client,
            stream=True,
        ):
            # Print text content as it arrives; tool activity is not shown here.
            for c in upd.contents:
                if getattr(c, "type", None) == "text":
                    print(c.text, end="", flush=True)


if __name__ == "__main__":
    asyncio.run(run())
```

### Starting a Local MCP Server for Testing

For development and integration testing, use the standalone MCP server:

```bash
python -m python.scripts.local_mcp_streamable_http_server \
    --host 127.0.0.1 --port 8011 --mount-path /mcp
```

This exposes `/mcp/run` and `/mcp/stream/{run_id}` endpoints that stream JSON events over HTTP, compatible with the `AGUIChatClient`.

## Key Source Files and Components

Understanding the following files is essential for customizing or debugging streaming behavior:

- **[`python/packages/ag-ui/agent_framework_ag_ui/_client.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_client.py)**: Contains `AGUIChatClient` with `_streaming_impl` and `ResponseStream` wrapper logic
- **[`python/packages/ag-ui/agent_framework_ag_ui/_event_converters.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_event_converters.py)**: Implements `AGUIEventConverter` for mapping raw events to `ChatResponseUpdate` objects
- **[`python/packages/ag-ui/agent_framework_ag_ui/_http_service.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_http_service.py)**: Provides `AGUIHttpService` with `post_run` method for async HTTP communication
- **[`python/packages/ag-ui/agent_framework_ag_ui/_utils.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_utils.py)**: Helper functions for translating Agent-Framework tools to AG-UI JSON schema
- **[`python/packages/azurefunctions/tests/integration_tests/test_03_reliable_streaming.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/azurefunctions/tests/integration_tests/test_03_reliable_streaming.py)**: Integration test validating SSE endpoint reliability
- **[`python/scripts/local_mcp_streamable_http_server.py`](https://github.com/microsoft/agent-framework/blob/main/python/scripts/local_mcp_streamable_http_server.py)**: Deterministic MCP server for local streaming development
- **[`python/packages/ag-ui/tests/ag_ui/event_stream.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/tests/ag_ui/event_stream.py)**: Test utilities for validating event stream ordering

## Summary

- The Microsoft Agent Framework implements streaming through three coordinated layers: HTTP transport, event conversion, and client orchestration
- Servers emit JSON events via SSE or streamable HTTP, containing event types like `message`, `tool_call_start`, and `tool_call_end`
- The `AGUIChatClient` consumes these events through `AGUIHttpService.post_run` and converts them via `AGUIEventConverter`
- Applications receive `ChatResponseUpdate` objects through `ResponseStream`, enabling real-time display of tokens and tool progress
- Client-side tools execute locally while server-side tools are stripped of wrappers via `_unwrap_server_function_call_contents` before reaching application code

## Frequently Asked Questions

### What is the difference between ChatResponse and ResponseStream in the Agent Framework?

`ChatResponse` represents a complete, non-streaming response containing all content and tool results, while `ResponseStream` is an async iterator that yields `ChatResponseUpdate` objects incrementally as the server generates them. According to the source code in [`_client.py`](https://github.com/microsoft/agent-framework/blob/main/python/packages/ag-ui/agent_framework_ag_ui/_client.py), `ResponseStream` accepts a finalizer function (typically `ChatResponse.from_updates`) that aggregates all updates into a complete response when the stream ends.

### How does the framework handle tool calls during streaming?

When the server triggers a tool call, it emits a `tool_call_start` event that `AGUIEventConverter` transforms into a `ToolCallStartUpdate`. If the tool is client-side (registered locally), the `FunctionInvocationLayer` executes it immediately and injects the result back into the stream as a `tool_call_end` event. Server-side tools are handled remotely, with the client receiving only the final result wrapped in a placeholder that `_unwrap_server_function_call_contents` removes before yielding to the application.

### Can I use streaming with Azure Functions and local MCP servers simultaneously?

Yes, both implementations follow the same JSON event schema. The `AGUIChatClient` consumes both identically, as demonstrated by the shared test infrastructure and the [`local_mcp_streamable_http_server.py`](https://github.com/microsoft/agent-framework/blob/main/python/scripts/local_mcp_streamable_http_server.py) script that mimics the Azure Functions streaming contract. You can develop against the local MCP server and deploy to Azure Functions without changing client code.

### What event types should my application handle for real-time UI updates?

At minimum, handle `MessageContentUpdate` for token-by-token text display and `ToolCallStartUpdate`/`ToolCallEndUpdate` for showing progress indicators during function execution. The `StateSnapshotUpdate` type provides additional metadata for debugging complex workflows, though most applications only need the message and tool call events to maintain responsive interfaces.