Implementing Custom Reward Functions in Agent Lightning: A Complete Guide
Call emit_reward() from agentlightning.emitter.reward to emit OpenTelemetry spans encoding scalar or multi-dimensional rewards, then retrieve them using utility functions like find_final_reward().
Agent Lightning (microsoft/agent-lightning) provides a telemetry-first framework for LLM-driven agents that captures performance signals as OpenTelemetry spans. The reward system records numeric feedback—whether scalar values or complex metrics—under the agentlightning.reward namespace, enabling fine-grained analysis of agent behavior across rollouts. This guide covers the complete workflow for creating, emitting, and consuming custom reward signals using the core APIs in agentlightning/emitter/reward.py.
Understanding the Reward Architecture
Agent Lightning implements rewards as reward spans, which are specialized OpenTelemetry spans where attributes encode numeric or multi-dimensional values. The system stores these dimensions under the LightningSpanAttributes.REWARD prefix, creating structured keys such as agentlightning.reward.0.name and agentlightning.reward.0.value.
The architecture separates emission from consumption:
- Emission: The emit_reward() function in agentlightning/emitter/reward.py (lines 48-82) constructs RewardDimension objects and forwards them to the generic annotation emitter.
- Storage: Attributes follow semantic conventions defined in agentlightning/semconv.py (lines 69-73), ensuring consistent naming across the telemetry pipeline.
- Retrieval: Utility functions like get_reward_value() and find_final_reward() parse span attributes to extract values for downstream analysis.
If the optional AgentOps client is configured, the legacy @reward decorator routes calls through AgentOps operations; otherwise, it falls back to the built-in telemetry emitter.
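The key layout described in this section can be read back with a few lines of plain Python. The following is a minimal sketch that does not import Agent Lightning: `parse_reward_attributes` and the hard-coded `REWARD_PREFIX` are hypothetical stand-ins that assume the `agentlightning.reward.{index}.name` / `.value` layout, and the library's own `get_rewards_from_span()` may behave differently.

```python
import re
from typing import Any, Dict, List, Tuple

# Assumption: mirrors the prefix described in this guide; not imported from the library.
REWARD_PREFIX = "agentlightning.reward"

_KEY_PATTERN = re.compile(rf"^{re.escape(REWARD_PREFIX)}\.(\d+)\.(name|value)$")


def parse_reward_attributes(attributes: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Collect (name, value) pairs from indexed reward keys, ordered by index."""
    by_index: Dict[int, Dict[str, Any]] = {}
    for key, raw in attributes.items():
        match = _KEY_PATTERN.match(key)
        if match is None:
            continue  # Not a reward attribute; ignore it.
        index, field = int(match.group(1)), match.group(2)
        by_index.setdefault(index, {})[field] = raw

    dimensions: List[Tuple[str, float]] = []
    for index in sorted(by_index):
        entry = by_index[index]
        if "value" not in entry:
            continue  # Skip incomplete entries rather than guessing a value.
        # Assumption: an unnamed dimension falls back to a placeholder name.
        name = str(entry.get("name", f"dim_{index}"))
        dimensions.append((name, float(entry["value"])))
    return dimensions


example = {
    "agentlightning.reward.0.name": "task_completion",
    "agentlightning.reward.0.value": 1.0,
    "agentlightning.reward.1.name": "efficiency",
    "agentlightning.reward.1.value": 0.78,
    "gen_ai.response.id": "resp-42",  # Unrelated attribute, ignored by the parser.
}
print(parse_reward_attributes(example))
```

Keeping the parser tolerant of unrelated keys matters because reward spans may carry linking attributes alongside the reward dimensions.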
Emitting Custom Rewards
The primary API for recording rewards is emit_reward(), which supports scalar floats, dictionaries of metrics, or custom attributes for trace linking.
Scalar Rewards
For simple feedback signals, pass a numeric value directly to emit_reward():
from agentlightning.emitter.reward import emit_reward
# Inside a step or evaluation function
reward_value = 0.85
emit_reward(reward_value) # Creates a span with primary reward = 0.85
This creates a span where the value is stored as the primary reward dimension, accessible via agentlightning.reward.0.value.
Multi-Dimensional Rewards
When agents require granular evaluation across multiple criteria, pass a dictionary mapping dimension names to values:
from agentlightning.emitter.reward import emit_reward
reward_dict = {"task_completion": 1.0, "efficiency": 0.78}
emit_reward(reward_dict, primary_key="task_completion")
This produces two indexed attributes:
- agentlightning.reward.0.name: "task_completion", agentlightning.reward.0.value: 1.0 (marked as primary)
- agentlightning.reward.1.name: "efficiency", agentlightning.reward.1.value: 0.78
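The indexed layout can be illustrated with a small, library-free sketch. `flatten_rewards` is a hypothetical helper, not part of Agent Lightning. It assumes the primary dimension is written at index 0 and the remaining dimensions follow in dictionary order, which matches the example above but has not been checked against the library's implementation.

```python
from typing import Dict, Union

REWARD_PREFIX = "agentlightning.reward"  # Assumption: prefix as described in this guide.


def flatten_rewards(
    rewards: Dict[str, float], primary_key: str
) -> Dict[str, Union[str, float]]:
    """Flatten named reward dimensions into indexed attribute keys, primary first."""
    if primary_key not in rewards:
        raise ValueError(f"primary_key {primary_key!r} is not a reward dimension")

    # Put the primary dimension first, then keep the remaining insertion order.
    ordered = [primary_key] + [name for name in rewards if name != primary_key]

    attributes: Dict[str, Union[str, float]] = {}
    for index, name in enumerate(ordered):
        attributes[f"{REWARD_PREFIX}.{index}.name"] = name
        attributes[f"{REWARD_PREFIX}.{index}.value"] = float(rewards[name])
    return attributes


attrs = flatten_rewards(
    {"efficiency": 0.78, "task_completion": 1.0},
    primary_key="task_completion",
)
for key, value in attrs.items():
    print(key, "=", value)
```

Rejecting an unknown `primary_key` early is a design choice of this sketch; it keeps a typo in the key name from silently promoting the wrong dimension.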
Adding Context with Linking Attributes
To correlate rewards with specific responses or operations, inject additional attributes using helpers from agentlightning.utils.otel:
from agentlightning.emitter.reward import emit_reward
from agentlightning.utils.otel import make_link_attributes
link_attrs = make_link_attributes({"gen_ai.response.id": "resp-42"})
emit_reward(0.5, attributes=link_attrs)
The attributes parameter accepts any dictionary of OpenTelemetry-compatible attributes, enabling precise trace correlation without modifying the reward value itself.
Retrieving Rewards from Traces
After recording rewards during agent execution, extract them using the parsing utilities in agentlightning/emitter/reward.py:
from agentlightning.emitter.reward import find_reward_spans, find_final_reward
# `spans` is the list returned by a store or tracer
reward_spans = find_reward_spans(spans)
print("All rewards:", [s.attributes for s in reward_spans])
final_reward = find_final_reward(spans)
print("Final reward of rollout:", final_reward)
Key retrieval functions include:
- get_reward_value() (lines 13-28): Extracts the primary reward value from a single span.
- get_rewards_from_span(): Returns the full list of RewardDimension objects for multi-dimensional analysis.
- find_reward_spans(): Filters a span list to return only those containing reward attributes.
- find_final_reward() (lines 107-121): Identifies the last reward in a rollout sequence, useful for episode-level evaluation.
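The idea behind picking the last reward in a rollout can be sketched without the library. Everything below is illustrative: `FakeSpan` and `last_primary_reward` are hypothetical names, the span list is assumed to be in chronological order, and the primary value is assumed to live at `agentlightning.reward.0.value`. The real `find_final_reward()` may apply additional rules.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Assumption: primary reward key, based on the layout described in this guide.
PRIMARY_VALUE_KEY = "agentlightning.reward.0.value"


@dataclass
class FakeSpan:
    """Hypothetical stand-in for a span; real spans carry many more fields."""

    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def last_primary_reward(spans: List[FakeSpan]) -> Optional[float]:
    """Return the primary reward of the last reward-bearing span, or None."""
    for span in reversed(spans):
        value = span.attributes.get(PRIMARY_VALUE_KEY)
        if value is not None:
            return float(value)
    return None  # No span in the rollout carried a reward.


rollout = [
    FakeSpan("llm_call", {"gen_ai.response.id": "resp-41"}),
    FakeSpan("reward", {PRIMARY_VALUE_KEY: 0.2}),
    FakeSpan("llm_call", {"gen_ai.response.id": "resp-42"}),
    FakeSpan("reward", {PRIMARY_VALUE_KEY: 0.85}),
]
print(last_primary_reward(rollout))
```

Returning `None` for a rollout without rewards lets callers distinguish "no feedback recorded" from a genuine reward of 0.0.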
Legacy API and Migration
The top-level module agentlightning/reward.py re-exports the emitter API but emits a deprecation warning on import. New implementations should import directly from agentlightning.emitter.reward.
For existing code using the decorator pattern, the @reward decorator remains functional:
from agentlightning.emitter.reward import reward
@reward # Emits the return value as a reward span
def compute_quality_score() -> float:
    # Complex evaluation logic...
    return 1.23
This decorator automatically emits the function's return value and is compatible with both the built-in telemetry and AgentOps backends.
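To show the decorator pattern in isolation, here is a minimal sketch of a wrapper that emits a function's return value and passes it through unchanged. It is not the library's implementation: `reward_sketch`, `record_reward`, and the `emitted` list are hypothetical names, and the in-memory list stands in for whichever telemetry backend is configured.

```python
import functools
from typing import Callable, List

emitted: List[float] = []  # Hypothetical sink standing in for a telemetry backend.


def record_reward(value: float) -> None:
    """Hypothetical emitter: stores the value instead of creating a span."""
    emitted.append(float(value))


def reward_sketch(fn: Callable[..., float]) -> Callable[..., float]:
    """Wrap fn so its return value is emitted as a reward, then returned unchanged."""

    @functools.wraps(fn)  # Preserve the wrapped function's name and docstring.
    def wrapper(*args, **kwargs) -> float:
        result = fn(*args, **kwargs)
        record_reward(result)
        return result

    return wrapper


@reward_sketch
def compute_quality_score(answer: str, expected: str) -> float:
    # Toy scoring rule for illustration: exact match after normalization.
    return 1.0 if answer.strip().lower() == expected.strip().lower() else 0.0


score = compute_quality_score(" Paris", "paris")
print(score, emitted)
```

Because the wrapper returns the original value, decorated functions can still be used in ordinary control flow while the reward is recorded as a side effect.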
Summary
- Import emit_reward from agentlightning.emitter.reward (not the deprecated top-level module) to create reward spans.
- Pass scalar floats for simple rewards or dictionaries for multi-dimensional feedback, specifying primary_key to designate the main metric.
- Add trace correlation via the attributes parameter using helpers like make_link_attributes.
- Extract recorded rewards using find_final_reward() for rollouts or find_reward_spans() for full trace analysis.
- Reference agentlightning/semconv.py for semantic conventions and tests/emitter/test_reward.py for production usage patterns.
Frequently Asked Questions
What is the difference between emit_reward() and the @reward decorator?
emit_reward() is the imperative API for emitting rewards at arbitrary points in your code, offering full control over timing and context. The @reward decorator is a legacy convenience wrapper that automatically emits the return value of a function as a reward span. According to the source in agentlightning/emitter/reward.py, new code should prefer emit_reward() for explicit control, while the decorator remains available for backward compatibility with AgentOps integrations.
How do I handle multi-dimensional rewards in Agent Lightning?
Pass a dictionary to emit_reward() where keys are dimension names and values are numeric scores. Use the primary_key parameter to designate which dimension represents the overall reward (stored at index 0). The system automatically creates indexed attributes like agentlightning.reward.0.name and agentlightning.reward.0.value for each dimension, as defined in LightningSpanAttributes.REWARD.
Where are reward attributes stored in the OpenTelemetry span?
Reward data lives under the agentlightning.reward namespace. Scalar rewards appear as agentlightning.reward.0.value, while multi-dimensional rewards populate agentlightning.reward.{index}.name and agentlightning.reward.{index}.value for each dimension. The constant LightningSpanAttributes.REWARD in agentlightning/semconv.py defines this prefix, ensuring consistency across the telemetry pipeline.
How do I migrate from the old reward module to the new emitter API?
Replace imports from agentlightning.reward with agentlightning.emitter.reward. The old module merely re-exports the same functions but triggers a deprecation warning. If using the @reward decorator, update the import path to agentlightning.emitter.reward while keeping the decorator syntax unchanged, as it remains supported in the new location for AgentOps compatibility.