How to Implement Streaming Responses with Real-Time Agent Updates in the Microsoft Agent Framework
The Microsoft Agent Framework enables real-time streaming of AI-generated content through a three-layer architecture comprising HTTP transport, event conversion, and client orchestration, allowing applications to receive tokens, tool calls, and state updates as soon as they are produced.
The Microsoft Agent Framework provides native support for streaming AI-generated content, enabling clients to consume partial results incrementally rather than waiting for complete responses. This capability is essential for building responsive applications that display real-time agent updates, tool execution status, and intermediate reasoning. The AGUIChatClient and ResponseStream classes in the microsoft/agent-framework source code let developers implement streaming with minimal boilerplate.
Architecture Overview
The streaming implementation rests on three tightly integrated layers that work together to deliver real-time updates from server to client.
HTTP Transport Layer
The HTTP Transport Layer delivers continuous event streams from the server to the client using Server-Sent Events (SSE). In python/packages/ag-ui/agent_framework_ag_ui/_http_service.py, the AGUIHttpService class manages low-level async HTTP requests through its post_run method, which returns an async generator yielding raw server events.
Server implementations include:
- Azure Functions: The /api/agent/stream/{thread_id} endpoint returns text/event-stream content types
- FastMCP: The local_mcp_streamable_http_server.py script exposes streamable HTTP endpoints for local development
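To make the transport concrete, here is a minimal sketch of how a text/event-stream body can be split into JSON events. It is not the AGUIHttpService implementation: parse_sse_events is a hypothetical helper, and the delta field in the sample payload is an assumed name. The sketch only handles data: lines and comment lines, which is the subset of SSE the examples in this article rely on.

```python
import json
from typing import Iterable, Iterator, List


def parse_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event (hypothetical helper).

    An SSE event is a run of `data:` lines terminated by a blank line.
    """
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the buffered event, if there is one.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


# Sample stream; the "delta" key is an assumed payload field.
sample = [
    'data: {"type": "run_started"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"type": "message", "delta": "Hi"}\n',
    "\n",
]
events = list(parse_sse_events(sample))
print([e["type"] for e in events])
```

In a real client the lines would come from an async HTTP response rather than a list, but the framing rules stay the same.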
Event Conversion Layer
The Event Conversion Layer transforms raw server-side JSON events into strongly-typed framework objects. The AGUIEventConverter class in python/packages/ag-ui/agent_framework_ag_ui/_event_converters.py parses each SSE event and maps its type field to specific ChatResponseUpdate subclasses including MessageStartUpdate, MessageContentUpdate, ToolCallStartUpdate, ToolCallEndUpdate, and StateSnapshotUpdate.
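The mapping from a raw event's type field to a typed object might look roughly like the sketch below. The dataclasses are hypothetical stand-ins, not the framework's ChatResponseUpdate subclasses, and the id, name, delta, and result keys are assumed payload fields. Returning None for events that carry no user-visible update mirrors the "if update is not None" check in the client's streaming loop.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TextDelta:
    """Hypothetical stand-in for a message-content update."""
    text: str


@dataclass
class ToolCallStarted:
    """Hypothetical stand-in for a tool-call-start update."""
    call_id: str
    name: str


@dataclass
class ToolCallFinished:
    """Hypothetical stand-in for a tool-call-end update."""
    call_id: str
    result: str


Update = Union[TextDelta, ToolCallStarted, ToolCallFinished]


def convert_event(event: dict) -> Optional[Update]:
    """Map one raw JSON event to a typed update, or None if it has none."""
    kind = event.get("type")
    if kind == "message":
        return TextDelta(text=event.get("delta", ""))
    if kind == "tool_call_start":
        return ToolCallStarted(call_id=event["id"], name=event["name"])
    if kind == "tool_call_end":
        return ToolCallFinished(call_id=event["id"], result=str(event.get("result", "")))
    # Lifecycle events (run_started, run_finished) and unknown types
    # produce no update in this sketch.
    return None


print(convert_event({"type": "tool_call_start", "id": "c1", "name": "get_utc_time"}))
```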
Client Orchestration Layer
The Client Orchestration Layer exposes the streaming API through the BaseChatClient contract. The AGUIChatClient in python/packages/ag-ui/agent_framework_ag_ui/_client.py implements _inner_get_response to return a ResponseStream object when stream=True is passed. This layer handles tool-call unwrapping via _apply_server_function_call_unwrap and manages the distinction between client-side and server-side tool execution.
Server-Side Streaming Implementation
Azure Functions and FastMCP servers both implement the same SSE protocol for delivering real-time updates.
Azure Functions SSE Endpoint
The Azure Functions workflow orchestrator yields partial results that the HTTP function emits as SSE messages. Each event contains a JSON object with event types such as run_started, message, tool_call_start, tool_call_end, and run_finished.
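As an illustration of the wire format, the sketch below serializes a sequence of events into SSE frames. It is not the Azure Functions handler itself: to_sse_frames is a hypothetical helper, and the delta key is an assumed payload field. Each frame is a data: line holding one JSON object, followed by the blank line that terminates an SSE event.

```python
import json
from typing import Iterable, Iterator


def to_sse_frames(events: Iterable[dict]) -> Iterator[str]:
    """Serialize each event as one SSE frame (hypothetical helper)."""
    for event in events:
        # json.dumps never emits raw newlines, so one data: line suffices.
        yield f"data: {json.dumps(event)}\n\n"


# A plausible event order for a run without tool calls.
run = [
    {"type": "run_started"},
    {"type": "message", "delta": "Hello"},
    {"type": "run_finished"},
]
frames = list(to_sse_frames(run))
print("".join(frames), end="")
```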
Integration tests in python/packages/azurefunctions/tests/integration_tests/test_03_reliable_streaming.py demonstrate the client request format:
stream_response = requests.get(
    f"{self.stream_url}/{thread_id}",
    headers={"Accept": "text/event-stream"},
    timeout=30,
)
FastMCP Streamable HTTP Server
For local development and testing, the python/scripts/local_mcp_streamable_http_server.py script creates a FastMCP instance with the streamable_http_path parameter:
server = FastMCP(..., streamable_http_path=mount_path, ...)
server.run(transport="streamable-http")
This server exposes POST and GET endpoints that follow the same JSON event schema as the Azure Functions implementation, making it suitable for CI pipelines and local debugging.
Client-Side Streaming Implementation
Consuming a streaming response requires initializing AGUIChatClient with the appropriate endpoint and calling get_response with stream=True.
The Streaming Implementation Method
The AGUIChatClient._streaming_impl method in _client.py (lines 82-90) builds the request and iterates over the HTTP service's async generator:
async for event in self._http_service.post_run(
    thread_id=thread_id,
    run_id=run_id,
    messages=agui_messages,
    state=state,
    tools=agui_tools,
    ...
):
    logger.debug(f"[AGUIChatClient] Raw AG-UI event: {event}")
    update = converter.convert_event(event)
    if update is not None:
        yield update
Response Stream and Finalization
When streaming is enabled, _inner_get_response returns a ResponseStream object that wraps the async generator with a finalizer:
if stream:
    return ResponseStream(
        self._streaming_impl(messages=messages, options=options, **kwargs),
        finalizer=ChatResponse.from_updates,
    )
The ResponseStream class yields ChatResponseUpdate objects incrementally and, once the stream completes, aggregates them into a final ChatResponse through the finalizer. The transform hook applied by _apply_server_function_call_unwrap unwraps server-side function calls along the way.
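The wrap-and-finalize pattern can be sketched with the standard library alone. CollectingStream below is a hypothetical stand-in and does not reproduce the real ResponseStream API; the final method name and the use of "".join as a finalizer are assumptions made for the demo. The idea it shows is that the wrapper passes each update through to the caller while remembering it, so a finalizer can build the complete response afterwards.

```python
import asyncio
from typing import AsyncIterator, Callable, Generic, List, TypeVar

U = TypeVar("U")
R = TypeVar("R")


class CollectingStream(Generic[U, R]):
    """Hypothetical stand-in: yields updates and builds a final result."""

    def __init__(self, source: AsyncIterator[U], finalizer: Callable[[List[U]], R]) -> None:
        self._source = source
        self._finalizer = finalizer
        self._seen: List[U] = []

    def __aiter__(self) -> "CollectingStream[U, R]":
        return self

    async def __anext__(self) -> U:
        # StopAsyncIteration from the source propagates and ends iteration.
        update = await self._source.__anext__()
        self._seen.append(update)
        return update

    async def final(self) -> R:
        # Drain anything the caller has not consumed yet, then aggregate.
        async for _ in self:
            pass
        return self._finalizer(self._seen)


async def fake_updates() -> AsyncIterator[str]:
    """Simulated token stream standing in for server updates."""
    for token in ["Hel", "lo", "!"]:
        yield token


async def main() -> str:
    stream = CollectingStream(fake_updates(), finalizer="".join)
    async for token in stream:
        print(token, end="", flush=True)  # incremental display
    print()
    return await stream.final()  # aggregated result


result = asyncio.run(main())
```

Keeping aggregation in a separate finalizer means callers that only want the complete response can skip the loop and await the final result directly.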
Handling Real-Time Agent Updates
The framework distinguishes between different update types to provide granular visibility into agent execution.
Event Types and State Changes
The AGUIEventConverter processes several event types:
- MessageContentUpdate: Contains individual tokens as they are generated by the LLM
- ToolCallStartUpdate: Signals the beginning of a function invocation
- ToolCallEndUpdate: Delivers the result of a completed tool call
- StateSnapshotUpdate: Provides intermediate workflow state for debugging
Client-Side vs Server-Side Tool Execution
The client handles tool calls differently based on their origin:
- Client-side tools: Executed locally by the FunctionInvocationLayer when the function name matches a registered local tool, allowing real-time feedback during execution
- Server-side tools: Wrapped with a server_function_call placeholder that _unwrap_server_function_call_contents removes before yielding to the application, ensuring middleware receives clean objects
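The routing decision described above can be sketched as a small dispatcher. Everything here is illustrative: LOCAL_TOOLS, handle_tool_call, and the dictionary shapes (including the function_call key inside the wrapper) are assumptions, not the framework's internal representation.

```python
from typing import Any, Callable, Dict

# Hypothetical registry of tools that run in the client process.
LOCAL_TOOLS: Dict[str, Callable[..., Any]] = {
    "get_utc_time": lambda: "2024-01-01T00:00:00+00:00",  # fixed value for the demo
}


def handle_tool_call(call: dict) -> dict:
    """Route one tool call based on where it is meant to run."""
    if call.get("type") == "server_function_call":
        # Server-side tool: strip the wrapper and pass the inner call on.
        return {"origin": "server", "call": call["function_call"]}
    name = call["name"]
    if name in LOCAL_TOOLS:
        # Client-side tool: execute it locally with the supplied arguments.
        result = LOCAL_TOOLS[name](**call.get("arguments", {}))
        return {"origin": "client", "name": name, "result": result}
    return {"origin": "unknown", "name": name}


print(handle_tool_call({"name": "get_utc_time", "arguments": {}}))
print(handle_tool_call({"type": "server_function_call",
                        "function_call": {"name": "search"}}))
```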
Complete Implementation Examples
Basic Streaming Client
The following example demonstrates consuming token-by-token text generation:
import asyncio

from agent_framework import Message
from agent_framework.ag_ui import AGUIChatClient


async def demo():
    async with AGUIChatClient(endpoint="http://localhost:8888/") as client:
        # stream=True yields updates as the server produces them
        async for upd in client.get_response(
            [Message(role="user", contents=["Write a haiku about cats."])],
            stream=True,
        ):
            for c in upd.contents:
                # Print only text content; skip tool calls and other types
                if getattr(c, "type", None) == "text":
                    print(c.text, end="", flush=True)
    print("\n--- done ---")


asyncio.run(demo())
Streaming with Client-Side Tools
To include tools that execute locally while streaming:
import asyncio
from datetime import datetime, timezone

from agent_framework import tool, Agent
from agent_framework.ag_ui import AGUIChatClient


@tool(description="Return the current UTC time as a string.")
def get_utc_time() -> str:
    return datetime.now(timezone.utc).isoformat()


agent = Agent(name="TimeBot", tools=[get_utc_time])


async def run():
    async with AGUIChatClient(endpoint="http://localhost:8888/") as client:
        async for upd in agent.run(
            "What time is it now?",
            client=client,
            stream=True,
        ):
            for c in upd.contents:
                # Print only text content as it arrives
                if getattr(c, "type", None) == "text":
                    print(c.text, end="", flush=True)


asyncio.run(run())
Starting a Local MCP Server for Testing
For development and integration testing, use the standalone MCP server:
python -m python.scripts.local_mcp_streamable_http_server \
--host 127.0.0.1 --port 8011 --mount-path /mcp
This exposes /mcp/run and /mcp/stream/{run_id} endpoints that stream JSON events over HTTP, compatible with the AGUIChatClient.
Key Source Files and Components
Understanding the following files is essential for customizing or debugging streaming behavior:
- python/packages/ag-ui/agent_framework_ag_ui/_client.py: Contains AGUIChatClient with _streaming_impl and ResponseStream wrapper logic
- python/packages/ag-ui/agent_framework_ag_ui/_event_converters.py: Implements AGUIEventConverter for mapping raw events to ChatResponseUpdate objects
- python/packages/ag-ui/agent_framework_ag_ui/_http_service.py: Provides AGUIHttpService with post_run method for async HTTP communication
- python/packages/ag-ui/agent_framework_ag_ui/_utils.py: Helper functions for translating Agent-Framework tools to AG-UI JSON schema
- python/packages/azurefunctions/tests/integration_tests/test_03_reliable_streaming.py: Integration test validating SSE endpoint reliability
- python/scripts/local_mcp_streamable_http_server.py: Deterministic MCP server for local streaming development
- python/packages/ag-ui/tests/ag_ui/event_stream.py: Test utilities for validating event stream ordering
Summary
- The Microsoft Agent Framework implements streaming through three coordinated layers: HTTP transport, event conversion, and client orchestration
- Servers emit JSON events via SSE or streamable HTTP, containing event types like message, tool_call_start, and tool_call_end
- The AGUIChatClient consumes these events through AGUIHttpService.post_run and converts them via AGUIEventConverter
- Applications receive ChatResponseUpdate objects through ResponseStream, enabling real-time display of tokens and tool progress
- Client-side tools execute locally while server-side tools are stripped of wrappers via _unwrap_server_function_call_contents before reaching application code
Frequently Asked Questions
What is the difference between ChatResponse and ResponseStream in the Agent Framework?
ChatResponse represents a complete, non-streaming response containing all content and tool results, while ResponseStream is an async iterator that yields ChatResponseUpdate objects incrementally as the server generates them. According to the source code in _client.py, ResponseStream accepts a finalizer function (typically ChatResponse.from_updates) that aggregates all updates into a complete response when the stream ends.
How does the framework handle tool calls during streaming?
When the server triggers a tool call, it emits a tool_call_start event that AGUIEventConverter transforms into a ToolCallStartUpdate. If the tool is client-side (registered locally), the FunctionInvocationLayer executes it immediately and injects the result back into the stream as a tool_call_end event. Server-side tools are handled remotely, with the client receiving only the final result wrapped in a placeholder that _unwrap_server_function_call_contents removes before yielding to the application.
Can I use streaming with Azure Functions and local MCP servers simultaneously?
Yes, both implementations follow the same SSE JSON event schema. The AGUIChatClient consumes both identically, as demonstrated by the shared test infrastructure and the local_mcp_streamable_http_server.py script that mimics the Azure Functions streaming contract. You can develop against the local MCP server and deploy to Azure Functions without changing client code.
What event types should my application handle for real-time UI updates?
At minimum, handle MessageContentUpdate for token-by-token text display and ToolCallStartUpdate/ToolCallEndUpdate for showing progress indicators during function execution. The StateSnapshotUpdate type provides additional metadata for debugging complex workflows, though most applications only need the message and tool call events to maintain responsive interfaces.
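One way to organize such a handler is a small state object that each update is applied to, sketched below. UiState and the dictionary-shaped updates (with kind, text, call_id, and name keys) are hypothetical and stand in for the framework's typed update objects; a real application would branch on the update classes instead.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UiState:
    """Hypothetical view model fed by streaming updates."""
    text: str = ""
    running_tools: Dict[str, str] = field(default_factory=dict)  # call_id -> tool name
    log: List[str] = field(default_factory=list)

    def apply(self, update: dict) -> None:
        kind = update.get("kind")
        if kind == "message_content":
            self.text += update["text"]  # append the new token(s)
        elif kind == "tool_call_start":
            self.running_tools[update["call_id"]] = update["name"]
            self.log.append(f"running {update['name']}...")
        elif kind == "tool_call_end":
            name = self.running_tools.pop(update["call_id"], "unknown tool")
            self.log.append(f"{name} finished")
        # Anything else (for example state snapshots) is ignored here.


state = UiState()
for upd in [
    {"kind": "tool_call_start", "call_id": "c1", "name": "get_utc_time"},
    {"kind": "tool_call_end", "call_id": "c1"},
    {"kind": "message_content", "text": "It is "},
    {"kind": "message_content", "text": "noon."},
    {"kind": "state_snapshot", "state": {}},
]:
    state.apply(upd)

print(state.text)
print(state.log)
```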