Migrating from Legacy Runner to the New Architecture in Agent-Lightning: A Complete Guide
Replace LegacyAgentRunner with LitAgentRunner and switch from HTTP client polling to the LightningStore abstraction to unlock async-native execution, automatic heartbeats, and unified storage.
This guide covers the migration path from the deprecated LegacyAgentRunner to the modern LitAgentRunner architecture in the microsoft/agent-lightning repository. The new store-based design replaces the legacy v0.1 client-server API with an async-first execution model that powers all current examples and the Trainer implementation.
Architectural Differences: LegacyAgentRunner vs LitAgentRunner
The LegacyAgentRunner (defined in agentlightning/runner/legacy.py) relies on direct polling of an AgentLightningClient via client.poll_next_task. It requires a manual heartbeat implementation, uses the private _trace_context_sync method for tracing, and converts results into RolloutLegacy objects via _to_rollout_object.
The modern LitAgentRunner (implemented in agentlightning/runner/agent.py) pulls tasks from a LightningStore using store.dequeue_rollout. It provides built-in heartbeat loops through _start_heartbeat_thread_loop, supports both async trace_context and sync _trace_context_sync tracing, and normalizes results via _post_process_rollout_result before writing to the store.
Key architectural shifts include:
- Entry point migration: From AgentLightningClient polling to LightningStore.dequeue_rollout as defined in agentlightning/store/base.py
- Heartbeat automation: LitAgentRunner automatically snapshots system state at configurable intervals (default 10 seconds) instead of requiring manual client pings
- Worker initialization: init_worker now receives a LightningStore instance and registers the worker, rather than simply storing a worker_id
- Result standardization: Returns are processed through _post_process_rollout_result to handle float rewards, Span objects, and SpanCoreFields before storage
- Hook pipeline: Rich async-aware hooks (on_trace_start, on_trace_end, on_rollout_start, on_rollout_end) with isolated error handling replace the limited legacy hook system
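To make the shift from client polling to store dequeuing concrete, here is a minimal self-contained sketch. ToyStore, ToyRollout, and run_worker are hypothetical stand-ins written for this illustration, not agent-lightning classes. Only the method name dequeue_rollout mirrors the source, and its real signature may differ.

```python
import asyncio
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class ToyRollout:
    """Hypothetical stand-in for a queued rollout."""
    rollout_id: str
    payload: dict


class ToyStore:
    """Hypothetical in-memory store; only the method name mirrors LightningStore."""

    def __init__(self) -> None:
        self._queue: Deque[ToyRollout] = deque()

    async def enqueue_rollout(self, rollout: ToyRollout) -> None:
        self._queue.append(rollout)

    async def dequeue_rollout(self) -> Optional[ToyRollout]:
        # Return the next queued rollout, or None when the queue is empty.
        return self._queue.popleft() if self._queue else None


async def run_worker(store: ToyStore, max_rollouts: int) -> List[str]:
    """Pull rollouts from the store until it is empty or the limit is reached."""
    processed: List[str] = []
    while len(processed) < max_rollouts:
        rollout = await store.dequeue_rollout()
        if rollout is None:
            break  # a real runner would likely wait and try again instead
        processed.append(rollout.rollout_id)
    return processed


async def demo() -> List[str]:
    store = ToyStore()
    for i in range(3):
        await store.enqueue_rollout(ToyRollout(rollout_id=f"r{i}", payload={"x": i}))
    return await run_worker(store, max_rollouts=2)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The worker owns no networking code in this sketch: it only awaits the store, which is the property that lets the same loop run against an in-memory or a remote backend.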
Step-by-Step Migration Checklist
Follow these specific actions to migrate your codebase, referencing the exact source locations in the repository:
- Update imports in agentlightning/runner/__init__.py: replace LegacyAgentRunner with LitAgentRunner in your import statements.
- Refactor trainer instantiation: modify custom trainer code in agentlightning/trainer/trainer.py to remove manual LegacyAgentRunner construction. The Trainer class now automatically instantiates LitAgentRunner via instantiate_component.
- Switch to store-based task retrieval: replace AgentLightningClient polling logic with a LightningStore implementation such as InMemoryLightningStore or a remote store backend from agentlightning/store/base.py.
- Migrate result handling: convert custom result processing to conform to _post_process_rollout_result in agentlightning/runner/agent.py. Return values must be a float, a list of Span/SpanCoreFields objects, or None.
- Enable automatic heartbeats: remove manual heartbeat logic and configure the heartbeat_interval argument (default 10 seconds) when constructing LitAgentRunner. The runner automatically invokes _start_heartbeat_thread_loop.
- Update hook signatures: adjust on_rollout_start and on_rollout_end hooks that expect RolloutLegacy models to accept the new Rollout model defined in agentlightning/types/core.py.
- Update references: change documentation, examples, and tests to reference LitAgentRunner instead of the legacy class. See the reference implementation in examples/unsloth/sft_rollout_runners.py.
- Validate migration: execute the full test suite using uv run pytest -v in the project root tests/ folder, specifically reviewing tests/runner/test_agent_runner.py for async and sync path compatibility.
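One way to track progress through the checklist above is to scan your own code for the legacy names this guide mentions. The helper below is a hypothetical stdlib-only sketch and not part of agent-lightning; the list of names is taken from this guide and may be incomplete for your codebase.

```python
import re
from pathlib import Path
from typing import Dict, List, Tuple

# Legacy identifiers named in this guide; extend the list for your own project.
LEGACY_PATTERN = re.compile(
    r"\b(LegacyAgentRunner|RolloutLegacy|TrainerLegacy|poll_next_task|fit_v0)\b"
)


def find_legacy_references(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, matched_name) pairs for each legacy name in text."""
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LEGACY_PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def scan_tree(root: Path) -> Dict[str, List[Tuple[int, str]]]:
    """Map each .py file under root to its legacy references (clean files are omitted)."""
    report: Dict[str, List[Tuple[int, str]]] = {}
    for path in sorted(root.rglob("*.py")):
        hits = find_legacy_references(path.read_text(encoding="utf-8", errors="ignore"))
        if hits:
            report[str(path)] = hits
    return report
```

Calling scan_tree(Path("src")) on your project (the directory name is only an example) returns an empty dict once no legacy names remain.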
Code Migration Examples
Minimal LitAgentRunner Setup
This example demonstrates the modern initialization pattern using the store-based architecture:
import asyncio

from agentlightning import LitAgentRunner, InMemoryLightningStore, AgentOpsTracer
from agentlightning.litagent import LitAgent

# MyAgent stands for your own LitAgent subclass, defined elsewhere

async def main():
    # Create a store (in-memory for demo)
    store = InMemoryLightningStore()
    # Instantiate a tracer (AgentOpsTracer is optional)
    tracer = AgentOpsTracer()
    # Build the runner; the store is injected later via init_worker
    runner = LitAgentRunner[dict](tracer=tracer, max_rollouts=100)
    # Initialize the runner with an agent (your custom LitAgent subclass)
    runner.init(agent=MyAgent())
    # Register the worker (worker_id = 0) and the store
    runner.init_worker(worker_id=0, store=store)
    # Run the async iteration loop
    await runner.iter()

# Start an asyncio event loop and run the coroutine to completion
asyncio.run(main())
Source: examples/unsloth/sft_rollout_runners.py
Converting Trainer Initialization
Legacy approach (v0.1):
trainer = TrainerLegacy(...)
trainer.fit_v0(agent, train_data="http://localhost:8000")
Modern store-based approach:
from agentlightning.trainer import Trainer
trainer = Trainer(...)
# The trainer automatically instantiates LitAgentRunner via instantiate_component
trainer.fit(agent=MyAgent(), train_data=my_dataset)
Source: Trainer.fit_v0 (legacy) vs Trainer.fit in agentlightning/trainer/trainer.py
Updating Custom Hooks
The new runner supports an expanded async hook pipeline:
class MyHook:
    async def on_trace_start(self, *, agent, runner, tracer, rollout):
        print(f"Starting trace for rollout {rollout.rollout_id}")

    async def on_rollout_end(self, *, agent, runner, rollout, spans):
        # spans is a list of Span objects already stored
        print(f"Rollout {rollout.rollout_id} finished with {len(spans)} spans")

runner = LitAgentRunner[dict](tracer=tracer, max_rollouts=50)
runner.init(agent=my_agent, hooks=[MyHook()])
Source: Hook triggering in LitAgentRunner._trigger_hooks (lines 46-68 of agentlightning/runner/agent.py)
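The "isolated error handling" for hooks can be pictured as a dispatcher that catches each hook's exception separately. The sketch below is an assumption about the general pattern, not a copy of LitAgentRunner._trigger_hooks. trigger_hooks, GoodHook, and BrokenHook are hypothetical names, and the keyword arguments are simplified to a single rollout_id.

```python
import asyncio
import logging
from typing import Any, List, Sequence, Tuple

logger = logging.getLogger("hook_sketch")


async def trigger_hooks(hooks: Sequence[Any], hook_name: str, **kwargs: Any) -> List[str]:
    """Call hook_name on every hook that defines it; return class names of hooks that failed."""
    failed: List[str] = []
    for hook in hooks:
        method = getattr(hook, hook_name, None)
        if method is None:
            continue  # this hook does not implement the event
        try:
            await method(**kwargs)
        except Exception:
            # Log and continue so one faulty hook cannot break the runner loop.
            logger.exception("hook %s.%s failed", type(hook).__name__, hook_name)
            failed.append(type(hook).__name__)
    return failed


class GoodHook:
    def __init__(self) -> None:
        self.seen: List[str] = []

    async def on_rollout_start(self, *, rollout_id: str) -> None:
        self.seen.append(rollout_id)


class BrokenHook:
    async def on_rollout_start(self, *, rollout_id: str) -> None:
        raise RuntimeError("boom")


async def demo() -> Tuple[List[str], List[str]]:
    good = GoodHook()
    failed = await trigger_hooks([BrokenHook(), good], "on_rollout_start", rollout_id="r1")
    return failed, good.seen
```

In this sketch the broken hook runs first and raises, yet the second hook still receives the event, which is the behavior the isolation is meant to guarantee.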
Key Source Files and Implementation Details
Understanding these core files ensures accurate migration:
- agentlightning/runner/base.py: Defines the abstract Runner contract specifying init, init_worker, and async iter/step methods
- agentlightning/runner/agent.py: Contains the LitAgentRunner implementation, including heartbeat logic (_start_heartbeat_thread_loop), result processing (_post_process_rollout_result), and the hook pipeline
- agentlightning/runner/legacy.py: Houses LegacyAgentRunner for backward compatibility only; marked for removal
- agentlightning/store/base.py: Specifies the LightningStore interface replacing the legacy HTTP client/server stack
- agentlightning/trainer/trainer.py: High-level Trainer class that now creates LitAgentRunner instances via instantiate_component
- tests/runner/test_agent_runner.py: Test suite validating async and sync execution paths for the new runner
Summary
Migrating from the legacy architecture to LitAgentRunner provides immediate benefits:
- Unified storage API: All components share the LightningStore contract, simplifying scaling and persistence across distributed workers
- Native async support: asyncio-first iter and step methods enable high-throughput streaming workloads without blocking
- Automatic health monitoring: Built-in heartbeat loops with configurable intervals (default 10 seconds) maintain worker state visibility
- Improved hook safety: Async-aware hooks are isolated from core runner loops with comprehensive error handling
- Future compatibility: The legacy client-server path is slated for removal; new features like APO and AgentOps integration target only the store-based runner
Frequently Asked Questions
What happened to the legacy HTTP client polling?
The AgentLightningClient polling mechanism (client.poll_next_task) used by LegacyAgentRunner has been replaced by the LightningStore abstraction. According to the agent-lightning source code, the store provides dequeue_rollout for task retrieval and handles persistence transparently, eliminating the need for manual HTTP polling loops.
How do I handle custom result processing in the new runner?
Migrate logic from _to_rollout_object to _post_process_rollout_result. As implemented in agentlightning/runner/agent.py, the new method accepts rollout results and returns either a float reward, a list of Span/SpanCoreFields objects, or None. These values are automatically written to the LightningStore, standardizing result handling across the framework.
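Before returning a value from your agent, it can help to check it against the three shapes listed above. The validator below is a hypothetical sketch, not the logic of _post_process_rollout_result. ToySpan stands in for Span/SpanCoreFields, and coercing integers to float is a choice made for this illustration only.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ToySpan:
    """Hypothetical stand-in for a Span or SpanCoreFields object."""
    name: str


RolloutResult = Union[float, List[ToySpan], None]


def check_rollout_result(result: object) -> RolloutResult:
    """Accept only the three return shapes named in this guide; reject anything else."""
    if result is None:
        return None
    if isinstance(result, bool):
        # bool is a subclass of int in Python; treat it as a likely mistake.
        raise TypeError("bool is not a valid reward")
    if isinstance(result, (int, float)):
        return float(result)  # sketch-only choice: integers are coerced to float
    if isinstance(result, list) and all(isinstance(item, ToySpan) for item in result):
        return result
    raise TypeError(f"unsupported rollout result: {type(result).__name__}")
```

Running a check like this inside your own tests surfaces a wrong return type early, instead of at the point where the runner writes to the store.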
Is the legacy runner still supported?
LegacyAgentRunner remains available in agentlightning/runner/legacy.py for backward compatibility but is marked for removal. All new examples, tests, and the Trainer implementation in agentlightning/trainer/trainer.py exclusively use LitAgentRunner. Microsoft recommends immediate migration to ensure compatibility with upcoming features.
How does the heartbeat mechanism work in LitAgentRunner?
LitAgentRunner automatically initializes a heartbeat loop via _start_heartbeat_thread_loop (line 44 of agentlightning/runner/agent.py) when init_worker is called. The runner snapshots system state and updates the LightningStore at intervals specified by heartbeat_interval (defaulting to 10 seconds), eliminating the need for manual heartbeat implementations required by the legacy architecture.
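The general shape of an interval-driven heartbeat thread can be sketched with the standard library alone. This is an assumption about the pattern, not the body of _start_heartbeat_thread_loop. start_heartbeat and the beat callback are hypothetical names, and a real runner would report to the LightningStore rather than append to a list.

```python
import threading
import time
from typing import Callable, List


def start_heartbeat(beat: Callable[[], None], interval: float) -> Callable[[], None]:
    """Run beat() every `interval` seconds on a daemon thread; return a stop function."""
    stop_event = threading.Event()

    def loop() -> None:
        # Event.wait returns True once stop is requested, which ends the loop.
        while not stop_event.wait(interval):
            beat()

    thread = threading.Thread(target=loop, name="heartbeat-sketch", daemon=True)
    thread.start()

    def stop() -> None:
        stop_event.set()
        thread.join()

    return stop


def demo() -> int:
    beats: List[float] = []
    # A short interval keeps the demo fast; the guide's default is 10 seconds.
    stop = start_heartbeat(lambda: beats.append(time.monotonic()), interval=0.01)
    time.sleep(0.2)
    stop()
    return len(beats)
```

Waiting on an Event instead of calling time.sleep lets the stop function interrupt the loop immediately rather than after a full interval.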