# How Token Tracking and Cost Calculation Work in Pi-AI: A Complete Technical Guide

> Explore how Pi-AI tracks tokens and calculates costs. Discover the technical details of usage monitoring and pricing in this complete guide to Pi-AI's cost management.

- Repository: [badlogic/pi-mono](https://github.com/badlogic/pi-mono)
- Tags: deep-dive
- Published: 2026-02-16

---

**Pi-AI tracks input, output, and cache tokens through a unified Usage interface, then calculates costs using per-model pricing rates defined in [`packages/ai/src/models.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/models.ts).**

The `badlogic/pi-mono` repository normalizes provider-specific token data into one standard format before it computes costs. This lets Pi-AI attribute costs consistently across providers such as OpenAI, Anthropic, and Bedrock, and gives the Pi-AI runtime and UI components a single interface to read from.

## The Unified Usage Interface

At the core of Pi-AI's token tracking system is the `Usage` interface defined in [`packages/ai/src/types.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/types.ts). This contract ensures every assistant message carries standardized token metadata regardless of which provider generated the response.

```typescript
// packages/ai/src/types.ts
export interface Usage {
  /** raw token counts */
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  totalTokens: number;            // convenience sum
  /** derived monetary amounts */
  cost: {
    input: number;
    output: number;
    cacheRead: number;
    cacheWrite: number;
    total: number;
  };
}
```

Every `AssistantMessage` in the same file includes a `usage: Usage` field, ensuring token data travels together with the conversation transcript through the entire pipeline.
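
As a minimal sketch of how this shape can be used, the snippet below re-declares the interface locally so it runs without the repository, then builds a zeroed `Usage` value. The helper names `emptyUsage` and `withTotal` are hypothetical, and treating `totalTokens` as the sum of all four counters is an assumption based on the "convenience sum" comment.

```typescript
// Local copy of the Usage shape shown above, so the sketch is self-contained.
interface Usage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  totalTokens: number;
  cost: { input: number; output: number; cacheRead: number; cacheWrite: number; total: number };
}

// Hypothetical helper: a Usage with every counter and cost set to zero.
function emptyUsage(): Usage {
  return {
    input: 0, output: 0, cacheRead: 0, cacheWrite: 0, totalTokens: 0,
    cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 },
  };
}

// Hypothetical helper: recompute totalTokens, ASSUMING it sums all four counters.
function withTotal(usage: Usage): Usage {
  const totalTokens = usage.input + usage.output + usage.cacheRead + usage.cacheWrite;
  return { ...usage, totalTokens };
}

const usage = withTotal({ ...emptyUsage(), input: 1200, output: 300, cacheRead: 500 });
console.log(usage.totalTokens); // 2000
```

Starting from a zeroed value keeps every field a number, so downstream code can add to the counters without checking for `undefined`.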

## Normalizing Provider Token Data

Provider adapters in `packages/ai/src/providers/*.ts` map raw API responses onto the unified `Usage` interface. Each adapter extracts provider-specific fields and normalizes them into the standard format before cost calculation occurs.

For example, the OpenAI Responses adapter handles field mapping as follows:

```typescript
// packages/ai/src/providers/openai-responses.ts (excerpt)
output.usage.input  = response.usage.input_tokens ?? 0;
output.usage.output = response.usage.output_tokens ?? 0;
output.usage.cacheRead = response.usage.input_tokens_details?.cached_tokens ?? 0;
// ... (remaining fields elided in this excerpt)
// After the raw values are set, the shared logic below adds cost.
```

Similar mapping exists for Anthropic, Bedrock, Google, and other providers. After populating the raw token counts, each adapter calls the shared cost calculation logic to populate the monetary fields.
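
The mapping step can be sketched in isolation as follows. The `RawUsage` shape is hypothetical and only modeled on the field names in the excerpt above; `TokenCounts` and `normalizeUsage` are illustrative names, not identifiers from the repository.

```typescript
// Token counters only; cost fields are omitted to keep the sketch short.
interface TokenCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// HYPOTHETICAL raw response shape, modeled on the field names in the excerpt above.
interface RawUsage {
  input_tokens?: number;
  output_tokens?: number;
  input_tokens_details?: { cached_tokens?: number };
}

// Map provider-specific fields onto the normalized counters, defaulting to 0.
function normalizeUsage(raw: RawUsage | undefined): TokenCounts {
  return {
    input: raw?.input_tokens ?? 0,
    output: raw?.output_tokens ?? 0,
    cacheRead: raw?.input_tokens_details?.cached_tokens ?? 0,
    cacheWrite: 0, // assumption: this hypothetical provider reports no cache writes
  };
}

const counts = normalizeUsage({ input_tokens: 900, output_tokens: 120 });
console.log(counts); // { input: 900, output: 120, cacheRead: 0, cacheWrite: 0 }
```

Defaulting missing fields to zero means a response without usage data still yields a valid set of counters, at the price of being indistinguishable from a response that really used zero tokens.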

## Converting Tokens to Dollars

The `calculateCost()` function in [`packages/ai/src/models.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/models.ts) transforms token counts into monetary values using per-model pricing rates. These rates are defined as dollars-per-million-tokens in the generated model catalog.

```typescript
// packages/ai/src/models.ts
export function calculateCost<TApi extends Api>(model: Model<TApi>, usage: Usage): Usage["cost"] {
  // model.cost.* are $/million‑tokens values defined in the generated model catalog
  usage.cost.input      = (model.cost.input      / 1_000_000) * usage.input;
  usage.cost.output     = (model.cost.output     / 1_000_000) * usage.output;
  usage.cost.cacheRead  = (model.cost.cacheRead  / 1_000_000) * usage.cacheRead;
  usage.cost.cacheWrite = (model.cost.cacheWrite / 1_000_000) * usage.cacheWrite;
  usage.cost.total      = usage.cost.input + usage.cost.output +
                          usage.cost.cacheRead + usage.cost.cacheWrite;
  return usage.cost;
}
```

When a provider finishes building a `Usage` object, Pi-AI calls `calculateCost()` to populate the monetary fields. The function writes its results into `usage.cost` in place and returns that same object. The resulting `usage.cost.total` is stored on the message and later summed across the entire conversation.
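
To make the arithmetic concrete, here is a worked example using the same per-million formula. The rates are hypothetical numbers chosen for easy math, not real pricing, and `costFor` is an illustrative non-mutating variant that returns a new object rather than writing into `usage.cost`.

```typescript
// Per-million-token rates; the numbers used below are HYPOTHETICAL, not real pricing.
interface Rates { input: number; output: number; cacheRead: number; cacheWrite: number }
interface Counts { input: number; output: number; cacheRead: number; cacheWrite: number }

// Non-mutating variant of the formula above: returns a fresh cost breakdown.
function costFor(rates: Rates, counts: Counts) {
  const perToken = (ratePerMillion: number) => ratePerMillion / 1_000_000;
  const input = perToken(rates.input) * counts.input;
  const output = perToken(rates.output) * counts.output;
  const cacheRead = perToken(rates.cacheRead) * counts.cacheRead;
  const cacheWrite = perToken(rates.cacheWrite) * counts.cacheWrite;
  return { input, output, cacheRead, cacheWrite, total: input + output + cacheRead + cacheWrite };
}

const rates: Rates = { input: 0.15, output: 0.6, cacheRead: 0.075, cacheWrite: 0 };
const counts: Counts = { input: 12_000, output: 800, cacheRead: 4_000, cacheWrite: 0 };
const cost = costFor(rates, counts);

// 12,000 * $0.15/M = $0.00180
//    800 * $0.60/M = $0.00048
//  4,000 * $0.075/M = $0.00030
console.log(cost.total.toFixed(5)); // "0.00258"
```

Because the results are floating-point numbers, comparisons against expected totals should use a small tolerance rather than strict equality.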

## Runtime Aggregation and UI Display

The agent runtime aggregates token and cost fields across all messages in a run, exposing totals for both CLI tools and web interfaces.

In [`packages/mom/src/agent.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/mom/src/agent.ts), the runtime accumulates usage statistics:

```typescript
// packages/mom/src/agent.ts (excerpt)
runState.totalUsage.input      += assistantMsg.usage.input;
runState.totalUsage.output     += assistantMsg.usage.output;
runState.totalUsage.cacheRead  += assistantMsg.usage.cacheRead;
runState.totalUsage.cacheWrite += assistantMsg.usage.cacheWrite;
runState.totalUsage.cost.total += assistantMsg.usage.cost.total;
```

The coding-agent UI footer displays these aggregates in real time:

```typescript
// packages/coding-agent/src/modes/interactive/components/footer.ts
let totalCost = 0;
for (const entry of entries) {
  totalCost += entry.message.usage.cost.total;
}
```
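
The two accumulation loops above follow the same pattern, which can be sketched as a single fold over per-message usage. The `MessageUsage` and `UsageTotals` types and the `sumUsage` function are hypothetical names introduced for this example; they are limited to the fields summed in the excerpts.

```typescript
// Running totals, limited to the fields accumulated in the excerpts above.
interface UsageTotals {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  costTotal: number;
}

// Minimal per-message shape (hypothetical, trimmed for the sketch).
interface MessageUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  cost: { total: number };
}

// Fold every message's usage into one totals object without mutating the inputs.
function sumUsage(messages: MessageUsage[]): UsageTotals {
  return messages.reduce<UsageTotals>(
    (acc, u) => ({
      input: acc.input + u.input,
      output: acc.output + u.output,
      cacheRead: acc.cacheRead + u.cacheRead,
      cacheWrite: acc.cacheWrite + u.cacheWrite,
      costTotal: acc.costTotal + u.cost.total,
    }),
    { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, costTotal: 0 },
  );
}

const totals = sumUsage([
  { input: 1000, output: 200, cacheRead: 0, cacheWrite: 0, cost: { total: 0.25 } },
  { input: 1500, output: 300, cacheRead: 800, cacheWrite: 0, cost: { total: 0.5 } },
]);
console.log(totals.input, totals.costTotal); // 2500 0.75
```
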

For human-readable display, [`packages/web-ui/src/utils/format.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/utils/format.ts) provides formatting utilities:

```typescript
// packages/web-ui/src/utils/format.ts
export function formatUsage(usage: Usage) {
  if (!usage) return "";
  const parts = [];
  if (usage.input)  parts.push(`↑${formatTokenCount(usage.input)}`);
  if (usage.output) parts.push(`↓${formatTokenCount(usage.output)}`);
  if (usage.cacheRead) parts.push(`R${formatTokenCount(usage.cacheRead)}`);
  if (usage.cacheWrite) parts.push(`W${formatTokenCount(usage.cacheWrite)}`);
  if (usage.cost?.total) parts.push(formatCost(usage.cost.total));
  return parts.join(" ");
}
```

For each message, the UI prefixes every token count with a marker (↑ for input, ↓ for output, R for cache read, W for cache write) and appends the dollar amount. Counts that are zero are omitted.
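
The excerpt relies on `formatTokenCount` and `formatCost`, whose bodies are not shown here. The sketch below uses hypothetical stand-ins (`approxTokenCount`, `approxCost`) with assumed rounding rules to show what a formatted line could look like; the real helpers may round or abbreviate differently.

```typescript
// HYPOTHETICAL stand-in for formatTokenCount: abbreviates thousands and millions.
function approxTokenCount(n: number): string {
  if (n < 1_000) return String(n);
  if (n < 1_000_000) return `${(n / 1_000).toFixed(1)}k`;
  return `${(n / 1_000_000).toFixed(1)}M`;
}

// HYPOTHETICAL stand-in for formatCost: four decimal places, dollar prefix.
function approxCost(dollars: number): string {
  return `$${dollars.toFixed(4)}`;
}

// Trimmed usage shape for display purposes (illustrative only).
interface DisplayUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  cost?: { total: number };
}

// Same structure as formatUsage above: zero-valued parts are skipped.
function describeUsage(usage: DisplayUsage): string {
  const parts: string[] = [];
  if (usage.input) parts.push(`↑${approxTokenCount(usage.input)}`);
  if (usage.output) parts.push(`↓${approxTokenCount(usage.output)}`);
  if (usage.cacheRead) parts.push(`R${approxTokenCount(usage.cacheRead)}`);
  if (usage.cacheWrite) parts.push(`W${approxTokenCount(usage.cacheWrite)}`);
  if (usage.cost?.total) parts.push(approxCost(usage.cost.total));
  return parts.join(" ");
}

const line = describeUsage({ input: 12_000, output: 800, cacheRead: 0, cacheWrite: 0, cost: { total: 0.5 } });
console.log(line); // ↑12.0k ↓800 $0.5000
```
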

## CLI Cost Reporting

The [`scripts/cost.ts`](https://github.com/badlogic/pi-mono/blob/main/scripts/cost.ts) CLI tool reads session files and generates per-day, per-provider cost summaries:

```typescript
// scripts/cost.ts (excerpt)
stats[day][provider].total   += cost.total || 0;
stats[day][provider].input   += cost.input  || 0;
stats[day][provider].output  += cost.output || 0;
// ...
// Finally prints a table with $-values per provider and day.
```

This script aggregates the `usage.cost` fields stored in conversation history files, enabling users to track spending across different providers and time periods.
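
A minimal sketch of this per-day, per-provider grouping is shown below. The `CostRecord` shape is hypothetical, since the session file format is not described in this article, and bucketing by UTC calendar day is an assumption.

```typescript
// HYPOTHETICAL record shape; the real session file format is not shown in this article.
interface CostRecord {
  timestamp: string; // ISO 8601, e.g. "2026-02-16T09:30:00Z"
  provider: string;
  cost: { input?: number; output?: number; total?: number };
}

type Bucket = { input: number; output: number; total: number };
type Stats = Record<string, Record<string, Bucket>>;

// Group costs by day and provider, treating missing cost fields as 0.
function summarize(records: CostRecord[]): Stats {
  const stats: Stats = {};
  for (const { timestamp, provider, cost } of records) {
    const day = new Date(timestamp).toISOString().slice(0, 10); // UTC calendar day (assumption)
    const byProvider = stats[day] ?? (stats[day] = {});
    const bucket = byProvider[provider] ?? (byProvider[provider] = { input: 0, output: 0, total: 0 });
    bucket.input += cost.input || 0;
    bucket.output += cost.output || 0;
    bucket.total += cost.total || 0;
  }
  return stats;
}

const stats = summarize([
  { timestamp: "2026-02-16T09:30:00Z", provider: "openai", cost: { input: 0.25, output: 0.5, total: 0.75 } },
  { timestamp: "2026-02-16T18:00:00Z", provider: "openai", cost: { total: 0.25 } },
  { timestamp: "2026-02-17T08:00:00Z", provider: "anthropic", cost: { input: 0.5, total: 0.5 } },
]);
console.log(stats["2026-02-16"]?.openai?.total); // 1
```
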

## Handling Token Overflow

Pi-AI also checks token usage against the model's context window to detect silent truncation. The overflow detector in [`packages/ai/src/utils/overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) compares `usage.input` against `model.contextWindow`.

Because the comparison uses the token count reported in `usage` rather than the provider's success status, the UI can warn the user that part of the context was dropped even when the provider truncated the input silently.
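
The comparison described above can be sketched as a small predicate. The function name `exceedsContextWindow` and its signature are hypothetical; the actual detector in `overflow.ts` is not reproduced in this article and may check additional conditions.

```typescript
// HYPOTHETICAL check following the comparison described above.
function exceedsContextWindow(inputTokens: number, contextWindow: number | undefined): boolean {
  // Without a known window size there is nothing to compare against.
  if (contextWindow === undefined) return false;
  return inputTokens > contextWindow;
}

console.log(exceedsContextWindow(130_000, 128_000)); // true
console.log(exceedsContextWindow(90_000, 128_000));  // false
```
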

## Practical Example: Calculating Cost for a Single Response

To calculate the cost of a single assistant message programmatically:

```typescript
// Import paths are written relative to the repository root; adjust them to your setup.
import { calculateCost, getModel } from "packages/ai/src/models.js";
import type { AssistantMessage } from "packages/ai/src/types.js";

// `response` is an AssistantMessage returned from a provider
function logReplyCost(response: AssistantMessage): void {
  const model = getModel("openai", "gpt-4o-mini");
  const { usage } = response;               // token counts already filled by the provider
  const cost = calculateCost(model, usage); // also writes the result into usage.cost
  console.log(`This reply cost $${cost.total.toFixed(4)}`);
}
```

The `usage` object already contains token counts populated by the provider adapter; `calculateCost` adds the monetary values based on the selected model's pricing.

## Summary

- **Unified Interface**: The `Usage` interface in [`packages/ai/src/types.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/types.ts) standardizes input, output, cache read, and cache write tokens across all providers.
- **Provider Normalization**: Adapters in `packages/ai/src/providers/*.ts` map raw API responses to the `Usage` structure before cost calculation.
- **Cost Calculation**: The `calculateCost()` function in [`packages/ai/src/models.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/models.ts) converts token counts to dollars using per-model rates defined as $/million-tokens.
- **Runtime Aggregation**: Agent runtimes in [`packages/mom/src/agent.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/mom/src/agent.ts) and [`packages/coding-agent/src/core/agent-session.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/core/agent-session.ts) sum usage across all messages to provide session totals.
- **Human-Readable Output**: The [`scripts/cost.ts`](https://github.com/badlogic/pi-mono/blob/main/scripts/cost.ts) CLI and [`packages/web-ui/src/utils/format.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/utils/format.ts) format token counts and costs for display, including per-day cost breakdowns.

## Frequently Asked Questions

### How does Pi-AI handle different token counting methods across providers?

Pi-AI uses provider-specific adapters located in `packages/ai/src/providers/*.ts` to normalize raw API responses into the unified `Usage` interface. Each adapter extracts provider-specific fields—such as OpenAI's `input_tokens` or Anthropic's cache read/write statistics—and maps them to the standard `input`, `output`, `cacheRead`, and `cacheWrite` properties before cost calculation occurs.

### Where are the per-model pricing rates defined in the codebase?

The pricing rates are defined as dollars-per-million-tokens values within the generated model catalog accessed via [`packages/ai/src/models.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/models.ts). The `calculateCost()` function retrieves these rates from the `model.cost` object—which includes `input`, `output`, `cacheRead`, and `cacheWrite` pricing—and applies them to the token counts stored in the `Usage` object.

### How can I view total token costs for a conversation session?

You can view aggregated costs through multiple interfaces: the [`scripts/cost.ts`](https://github.com/badlogic/pi-mono/blob/main/scripts/cost.ts) CLI tool reads session files and prints per-day, per-provider cost tables; the web UI displays formatted costs via [`packages/web-ui/src/utils/format.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/utils/format.ts) next to each message; and the coding-agent interactive mode shows running totals in the footer component located at [`packages/coding-agent/src/modes/interactive/components/footer.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/modes/interactive/components/footer.ts).

### Does Pi-AI detect when input exceeds the model's context window?

Yes, Pi-AI monitors token overflow through [`packages/ai/src/utils/overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts), which compares the `usage.input` count against the `model.contextWindow` limit. This detection runs independently of the provider's truncation behavior, ensuring the UI can warn users when silent input truncation occurs, even if the API response indicates success.