# Summarize Output Formats: Text, Markdown, and JSON Comparison

> Compare output formats text markdown and JSON for the summarize CLI tool. Understand raw text, colored Markdown, and structured JSON payloads for your summarization needs.

- Repository: [Peter Steinberger/summarize](https://github.com/steipete/summarize)
- Tags: deep-dive
- Published: 2026-02-19

---

**The summarize CLI provides three distinct output modes: `--format text` (default) streams raw content directly to stdout, `--format md` extracts Markdown and renders it with ANSI colors in supported terminals, and `--json` emits a single structured payload containing input metadata, extracted content, LLM prompts, and performance metrics.**

The `steipete/summarize` repository implements these formats to support both human-readable terminal workflows and machine-readable integrations. Each format follows a specific code path through the TypeScript source, with `--json` disabling streaming entirely to produce a parseable document.

## Plain Text Output (Default Behavior)

When you run the CLI without specifying a format, it defaults to **plain text streaming**. The `parseExtractFormat` function in [`src/flags.ts`](https://github.com/steipete/summarize/blob/main/src/flags.ts) normalizes the input and returns `"text"` when no `--format` flag is provided, or when explicitly set with `--format text`, `--format txt`, or `--format plain`.

```ts
// src/flags.ts
export type ExtractFormat = "text" | "markdown";

export function parseExtractFormat(raw: string): ExtractFormat {
  const normalized = raw.trim().toLowerCase();
  if (normalized === "text" || normalized === "txt" || normalized === "plain") return "text";
  if (normalized === "md" || normalized === "markdown") return "markdown";
  throw new Error(`Unsupported --format: ${raw}`);
}

```

In [`src/run/runner.ts`](https://github.com/steipete/summarize/blob/main/src/run/runner.ts), the runner checks `flags.format` and writes the extracted content or final summary verbatim to `stdout`. This mode produces no ANSI escape codes, no Markdown rendering, and no JSON wrappers, making it ideal for Unix pipelines and file redirection.

## Markdown Output with ANSI Rendering

The **`--format md`** flag triggers a sophisticated extraction pipeline that converts source content into Markdown before optionally rendering it with terminal colors.

### The Markdown Pipeline

When processing URLs, the CLI first extracts content using Readability, Firecrawl, or LLM-based conversion depending on the `--markdown-mode` setting. The extracted Markdown string is then passed through the output handlers in [`src/run/flows/url/summary.ts`](https://github.com/steipete/summarize/blob/main/src/run/flows/url/summary.ts) or [`src/run/flows/asset/output.ts`](https://github.com/steipete/summarize/blob/main/src/run/flows/asset/output.ts).

### Terminal Detection and markdansi

If the terminal is detected as "rich" (supporting truecolor) and the `--plain` flag is not set, the Markdown is processed through `markdansi` to produce ANSI-colored output with formatted headings, links, and inline slide thumbnails. This happens automatically when `isRichTty` returns true, providing a polished reading experience in modern terminals like kitty or iTerm2.

```bash
summarize "https://example.com" --format md

```

## JSON Machine-Readable Output

The **`--json`** flag fundamentally changes the CLI behavior by suppressing streaming and emitting a single structured document. Unlike `--format`, which controls content representation, `--json` is a boolean flag that forces the application to build a comprehensive payload and exit after printing it.

### Payload Structure and Schema

Both asset and URL flows construct identical top-level schemas containing seven key fields:

- **`input`**: Request metadata including URL, timeout, format selection, and model parameters
- **`env`**: Diagnostic flags indicating which API keys are configured (`hasXaiKey`, `hasOpenAIKey`, etc.)
- **`extracted`**: The raw extracted content with media type and source labels
- **`prompt`**: The final LLM prompt string when summary generation occurs
- **`llm`**: Diagnostic information about the LLM call
- **`metrics`**: Token usage, latency, and cost data when `--metrics` is enabled
- **`summary`**: The generated summary text or `null` when using `--extract` mode

### Asset vs URL Flow Implementation

In [`src/run/flows/asset/output.ts`](https://github.com/steipete/summarize/blob/main/src/run/flows/asset/output.ts) (lines 65-95), the JSON payload construction handles file and audio inputs:

```ts
if (flags.json) {
  const finishReport = flags.shouldComputeReport ? await hooks.buildReport() : null;
  const payload = {
    input: { kind: "asset-url", url, timeoutMs: flags.timeoutMs, format: flags.format, preprocess: flags.preprocessMode },
    env: { hasXaiKey: Boolean(apiStatus.xaiApiKey), … },
    extracted: { kind: "asset", source: sourceLabel, mediaType: attachment.mediaType, filename: attachment.filename, content: extracted.content },
    prompt: null,
    llm: null,
    metrics: flags.metricsEnabled ? finishReport : null,
    summary: null,
  };
  io.stdout.write(`${JSON.stringify(payload, null, 2)}\n`);
}

```

For URL processing, [`src/run/flows/url/summary.ts`](https://github.com/steipete/summarize/blob/main/src/run/flows/url/summary.ts) (lines 34-70) builds a more detailed input object including YouTube mode, Firecrawl settings, and output language preferences. In both cases, the output is pretty-printed with `JSON.stringify(payload, null, 2)`.

## Streaming and Flag Interactions

**JSON mode disables streaming.** The `parseStreamMode` logic in [`src/run/runner.ts`](https://github.com/steipete/summarize/blob/main/src/run/runner.ts) automatically turns off streaming when `--json` is present, ensuring the payload is emitted as a single contiguous document. Additionally, `--json` forces a non-rich terminal path to prevent ANSI codes from corrupting the JSON structure.

## Practical Usage Examples

### Default Text Streaming

```bash
summarize "https://example.com/article"

```

Streams the summary as plain text directly to stdout.

### Markdown with LLM Conversion

```bash
summarize "https://example.com" --format md --markdown-mode llm

```

Sends raw HTML to the configured LLM for high-quality Markdown conversion before rendering.

### Structured JSON for Debugging

```bash
summarize "https://example.com" --json --verbose --metrics

```

Outputs a complete diagnostic payload including the LLM prompt, token usage, and timing metrics.

### Extraction-Only JSON

```bash
summarize "https://example.com" --extract --format md --json

```

Returns the JSON schema with the `extracted` field populated but `summary` set to `null`, skipping the LLM summarization step entirely.

## Summary

- **Text format** (`--format text` or default) streams raw strings without formatting, optimized for shell pipelines and file redirection.
- **Markdown format** (`--format md`) extracts structured Markdown and renders it with ANSI colors via `markdansi` when running in rich terminals.
- **JSON format** (`--json`) suppresses streaming to emit a machine-readable payload from [`src/run/flows/asset/output.ts`](https://github.com/steipete/summarize/blob/main/src/run/flows/asset/output.ts) or [`src/run/flows/url/summary.ts`](https://github.com/steipete/summarize/blob/main/src/run/flows/url/summary.ts), containing input metadata, environment diagnostics, extracted content, and optional metrics.
- **Implementation** details reside in [`src/flags.ts`](https://github.com/steipete/summarize/blob/main/src/flags.ts) for parsing, [`src/run/runner.ts`](https://github.com/steipete/summarize/blob/main/src/run/runner.ts) for dispatch, and the respective flow files for payload assembly.

## Frequently Asked Questions

### What is the difference between `--format md` and `--json`?

The `--format md` flag controls how content is extracted and displayed, producing human-readable Markdown with optional ANSI coloring. The `--json` flag is a separate boolean that changes the output structure entirely, wrapping the extracted content (whether text or Markdown) in a machine-readable schema containing metadata, environment flags, and diagnostics.

### Does JSON output include the LLM prompt and token usage?

Yes, when you combine `--json` with `--verbose` or `--metrics`, the payload includes the `prompt` field containing the final prompt sent to the LLM, and the `metrics` field containing token counts, latency measurements, and cost estimates. Without these flags, those fields may be `null`.

### Why does my terminal show colors with `--format md` but plain text with `--json`?

The CLI detects "rich" terminal capabilities (truecolor support) when determining whether to apply ANSI colors via `markdansi`. However, the `--json` flag explicitly forces a non-rich path to ensure valid JSON output without escape codes, regardless of terminal capabilities.

### Can I use `--format text` with `--json`?

Yes, but the result differs from standard text mode. When using `--json`, the `format` field inside the JSON payload will indicate `"text"`, and the `extracted.content` field will contain raw text rather than Markdown. However, the outer output is still a JSON document, not streaming text.