Implementing Custom Reward Functions in Agent Lightning: A Complete Guide

Call emit_reward() from agentlightning.emitter.reward to emit OpenTelemetry spans encoding scalar or multi-dimensional rewards, then retrieve them using utility functions like find_final_reward().

Agent Lightning (microsoft/agent-lightning) provides a telemetry-first framework for LLM-driven agents that captures performance signals as OpenTelemetry spans. The reward system records numeric feedback, either a single scalar or several named dimensions, under the agentlightning.reward namespace, enabling fine-grained analysis of agent behavior across rollouts. This guide covers the complete workflow for creating, emitting, and consuming custom reward signals using the core APIs in agentlightning/emitter/reward.py.

Understanding the Reward Architecture

Agent Lightning implements rewards as reward spans, which are specialized OpenTelemetry spans where attributes encode numeric or multi-dimensional values. The system stores these dimensions under the LightningSpanAttributes.REWARD prefix, creating structured keys such as agentlightning.reward.0.name and agentlightning.reward.0.value.

The architecture separates emission from consumption:

  • Emission: The emit_reward() function in agentlightning/emitter/reward.py (lines 48-82) constructs RewardDimension objects and forwards them to the generic annotation emitter.
  • Storage: Attributes follow semantic conventions defined in agentlightning/semconv.py (lines 69-73), ensuring consistent naming across the telemetry pipeline.
  • Retrieval: Utility functions like get_reward_value() and find_final_reward() parse span attributes to extract values for downstream analysis.

If the optional AgentOps client is configured, the legacy @reward decorator routes calls through AgentOps operations; otherwise, it falls back to the built-in telemetry emitter.

Emitting Custom Rewards

The primary API for recording rewards is emit_reward(). It accepts either a scalar float or a dictionary of named metrics, plus optional span attributes for trace linking.

Scalar Rewards

For simple feedback signals, pass a numeric value directly to emit_reward():

from agentlightning.emitter.reward import emit_reward

# Inside a step or evaluation function
reward_value = 0.85
emit_reward(reward_value)  # Creates a span with primary reward = 0.85

This creates a span where the value is stored as the primary reward dimension, accessible via agentlightning.reward.0.value.
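The scalar itself usually comes from a scoring function you write. The following is a minimal sketch of one such function, assuming an exact-match criterion; exact_match_reward and score_and_emit are hypothetical names, not part of Agent Lightning. The emitter is passed in as a callable so the sketch runs without the library installed; in real code you would pass emit_reward instead of the stand-in.

```python
from typing import Callable, List


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def exact_match_reward(prediction: str, reference: str) -> float:
    """Hypothetical scoring rule: 1.0 on a normalized exact match, else 0.0."""
    return 1.0 if _normalize(prediction) == _normalize(reference) else 0.0


def score_and_emit(
    prediction: str, reference: str, emit: Callable[[float], object]
) -> float:
    """Compute the scalar reward and hand it to whatever emitter is supplied."""
    value = exact_match_reward(prediction, reference)
    emit(value)
    return value


# Stand-in emitter (an assumption for this sketch): collect values in a list.
recorded: List[float] = []
score_and_emit("  Paris ", "paris", recorded.append)
score_and_emit("Lyon", "paris", recorded.append)
print(recorded)
```

Keeping the scoring logic separate from the emission call makes the reward function easy to unit-test on its own.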

Multi-Dimensional Rewards

When agents require granular evaluation across multiple criteria, pass a dictionary mapping dimension names to values:

from agentlightning.emitter.reward import emit_reward

reward_dict = {"task_completion": 1.0, "efficiency": 0.78}
emit_reward(reward_dict, primary_key="task_completion")

This produces two indexed attributes:

  • agentlightning.reward.0.name: "task_completion", agentlightning.reward.0.value: 1.0 (marked as primary)
  • agentlightning.reward.1.name: "efficiency", agentlightning.reward.1.value: 0.78
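To make the indexed layout concrete, here is a small sketch of how a reward dictionary could be flattened into attributes of that shape. It illustrates the key layout described above and is not the library's implementation: flatten_rewards is a hypothetical helper, and placing the primary dimension at index 0 follows the description in this article.

```python
from typing import Dict, Optional, Union

REWARD_PREFIX = "agentlightning.reward"  # namespace named in this guide


def flatten_rewards(
    rewards: Dict[str, float], primary_key: Optional[str] = None
) -> Dict[str, Union[str, float]]:
    """Hypothetical helper: map named rewards to indexed span attributes."""
    if primary_key is not None and primary_key not in rewards:
        raise ValueError(f"primary_key {primary_key!r} is not a reward dimension")

    names = list(rewards)
    if primary_key is not None:
        # Move the primary dimension to index 0, keeping the rest in order.
        names.remove(primary_key)
        names.insert(0, primary_key)

    attributes: Dict[str, Union[str, float]] = {}
    for index, name in enumerate(names):
        attributes[f"{REWARD_PREFIX}.{index}.name"] = name
        attributes[f"{REWARD_PREFIX}.{index}.value"] = float(rewards[name])
    return attributes


attrs = flatten_rewards(
    {"efficiency": 0.78, "task_completion": 1.0}, primary_key="task_completion"
)
for key, value in attrs.items():
    print(key, "=", value)
```

Flat, indexed keys are used because OpenTelemetry attribute values are limited to primitives and homogeneous arrays of primitives, so a nested dictionary cannot be stored directly.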

Adding Context with Linking Attributes

To correlate rewards with specific responses or operations, inject additional attributes using helpers from agentlightning.utils.otel:

from agentlightning.emitter.reward import emit_reward
from agentlightning.utils.otel import make_link_attributes

link_attrs = make_link_attributes({"gen_ai.response.id": "resp-42"})
emit_reward(0.5, attributes=link_attrs)

The attributes parameter accepts any dictionary of OpenTelemetry-compatible attributes, enabling precise trace correlation without modifying the reward value itself.

Retrieving Rewards from Traces

After recording rewards during agent execution, extract them using the parsing utilities in agentlightning/emitter/reward.py:

from agentlightning.emitter.reward import find_reward_spans, find_final_reward

# `spans` is the list returned by a store or tracer
reward_spans = find_reward_spans(spans)
print("All rewards:", [s.attributes for s in reward_spans])

final_reward = find_final_reward(spans)
print("Final reward of rollout:", final_reward)

Key retrieval functions include:

  • get_reward_value() (lines 13-28): Extracts the primary reward value from a single span.
  • get_rewards_from_span(): Returns the full list of RewardDimension objects for multi-dimensional analysis.
  • find_reward_spans(): Filters a span list to return only those containing reward attributes.
  • find_final_reward() (lines 107-121): Identifies the last reward in a rollout sequence, useful for episode-level evaluation.
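The retrieval logic can be pictured with a self-contained sketch. It assumes each span is represented as a plain dictionary of attributes and that the list is in chronological order; primary_reward and final_reward are hypothetical stand-ins that mimic the idea behind get_reward_value() and find_final_reward(), not the library's own code.

```python
from typing import Any, Dict, List, Optional

REWARD_PREFIX = "agentlightning.reward"  # namespace named in this guide


def primary_reward(attributes: Dict[str, Any]) -> Optional[float]:
    """Return the value at index 0 if the span carries a numeric reward."""
    value = attributes.get(f"{REWARD_PREFIX}.0.value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return float(value)


def final_reward(spans: List[Dict[str, Any]]) -> Optional[float]:
    """Scan from the end and return the last primary reward, if any."""
    for attributes in reversed(spans):
        value = primary_reward(attributes)
        if value is not None:
            return value
    return None


# Assumed toy trace: two non-reward spans interleaved with two reward spans.
spans = [
    {"gen_ai.response.id": "resp-1"},
    {f"{REWARD_PREFIX}.0.value": 0.2},
    {"gen_ai.response.id": "resp-2"},
    {f"{REWARD_PREFIX}.0.name": "task_completion", f"{REWARD_PREFIX}.0.value": 1.0},
]
print(final_reward(spans))
```

Returning None rather than 0.0 when no reward is present keeps "no feedback recorded" distinguishable from "a reward of zero".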

Legacy API and Migration

The top-level module agentlightning/reward.py re-exports the emitter API but emits a deprecation warning on import. New implementations should import directly from agentlightning.emitter.reward.
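One way to locate lingering imports of the deprecated module is to capture deprecation warnings in a test run. The sketch below uses only the standard warnings module; legacy_import_stub is a hypothetical stand-in for the deprecated import, and its message text is invented for illustration.

```python
import warnings


def legacy_import_stub() -> None:
    """Stand-in for importing a deprecated module (message text is made up)."""
    warnings.warn(
        "agentlightning.reward is deprecated", DeprecationWarning, stacklevel=2
    )


# Record every warning raised inside the block instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_import_stub()

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
for warning in deprecations:
    print(f"{warning.category.__name__}: {warning.message}")
```

Running the interpreter with -W error::DeprecationWarning is a stricter alternative that turns such warnings into exceptions.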

For existing code using the decorator pattern, the @reward decorator remains functional:

from agentlightning.emitter.reward import reward

@reward  # Emits the return value as a reward span
def compute_quality_score() -> float:
    # Complex evaluation logic...
    return 1.23

This decorator automatically emits the function's return value and is compatible with both the built-in telemetry and AgentOps backends.
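The general pattern behind such a decorator can be sketched as follows. This is an illustration of the technique only, assuming a synchronous function and a list as the emission sink; reward_sketch and emitted are hypothetical names, and the real @reward decorator's backend routing is not reproduced here.

```python
import functools
from typing import Callable, List

# Stand-in sink (an assumption): real code would emit a reward span instead.
emitted: List[float] = []


def reward_sketch(fn: Callable[..., float]) -> Callable[..., float]:
    """Wrap fn so that its return value is also recorded as a reward."""

    @functools.wraps(fn)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs) -> float:
        value = fn(*args, **kwargs)
        emitted.append(float(value))
        return value

    return wrapper


@reward_sketch
def compute_quality_score() -> float:
    return 1.23


result = compute_quality_score()
print(result, emitted)
```

The caller still receives the original return value, so adding the decorator does not change how the function is used elsewhere.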

Summary

  • Import emit_reward from agentlightning.emitter.reward (not the deprecated top-level module) to create reward spans.
  • Pass scalar floats for simple rewards or dictionaries for multi-dimensional feedback, specifying primary_key to designate the main metric.
  • Add trace correlation via the attributes parameter using helpers like make_link_attributes.
  • Extract recorded rewards using find_final_reward() for rollouts or find_reward_spans() for full trace analysis.
  • Reference agentlightning/semconv.py for semantic conventions and tests/emitter/test_reward.py for worked usage examples.

Frequently Asked Questions

What is the difference between emit_reward() and the @reward decorator?

emit_reward() is the imperative API for emitting rewards at arbitrary points in your code, offering full control over timing and context. The @reward decorator is a legacy convenience wrapper that automatically emits the return value of a function as a reward span. According to the source in agentlightning/emitter/reward.py, new code should prefer emit_reward() for explicit control, while the decorator remains available for backward compatibility with AgentOps integrations.

How do I handle multi-dimensional rewards in Agent Lightning?

Pass a dictionary to emit_reward() where keys are dimension names and values are numeric scores. Use the primary_key parameter to designate which dimension represents the overall reward (stored at index 0). The system automatically creates indexed attributes like agentlightning.reward.0.name and agentlightning.reward.0.value for each dimension, as defined in LightningSpanAttributes.REWARD.

Where are reward attributes stored in the OpenTelemetry span?

Reward data lives under the agentlightning.reward namespace. Scalar rewards appear as agentlightning.reward.0.value, while multi-dimensional rewards populate agentlightning.reward.{index}.name and agentlightning.reward.{index}.value for each dimension. The constant LightningSpanAttributes.REWARD in agentlightning/semconv.py defines this prefix, ensuring consistency across the telemetry pipeline.

How do I migrate from the old reward module to the new emitter API?

Replace imports from agentlightning.reward with agentlightning.emitter.reward. The old module merely re-exports the same functions but triggers a deprecation warning. If using the @reward decorator, update the import path to agentlightning.emitter.reward while keeping the decorator syntax unchanged, as it remains supported in the new location for AgentOps compatibility.
