# How to Use `initial_prompt` and `condition_on_previous_text` for Context in OpenAI Whisper

- Repository: [OpenAI/whisper](https://github.com/openai/whisper)
- Published: 2026-02-27

---

**Use `initial_prompt` to inject static text at the start of transcription, and enable `condition_on_previous_text` (default: True) to carry decoded output from previous audio windows into subsequent decoding steps for contextual continuity.**

OpenAI Whisper processes long audio files by splitting them into 30-second windows and decoding each one in sequence. The `initial_prompt` and `condition_on_previous_text` parameters in [`whisper/transcribe.py`](https://github.com/openai/whisper/blob/main/whisper/transcribe.py) control how contextual information flows between these windows, allowing you to guide the model with domain-specific vocabulary, speaker names, or formatting hints.

## Understanding Whisper's Windowed Transcription Architecture

Whisper handles long-form audio by processing it in chunks. For each window, the decoder can receive a **prompt**—a list of token IDs that the model treats as preceding text. This mechanism prevents the model from treating every audio segment as an isolated utterance.
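To make the prompt mechanism concrete, the sketch below mirrors how the decoder input appears to be assembled: a start-of-previous marker, the tail of the prompt, then the usual start-of-transcript sequence. `build_decoder_input` is a hypothetical helper and the token IDs are placeholders, not real Whisper vocabulary entries. The `n_ctx // 2 - 1` budget and the 448-token text context are assumptions based on the released checkpoints.

```python
from typing import List

N_TEXT_CTX = 448  # assumed decoder text context of the released checkpoints


def build_decoder_input(
    prompt_tokens: List[int],
    sot_sequence: List[int],
    sot_prev: int,
    n_ctx: int = N_TEXT_CTX,
) -> List[int]:
    """Prepend the most recent prompt tokens, capped at n_ctx // 2 - 1."""
    if not prompt_tokens:
        return list(sot_sequence)
    budget = n_ctx // 2 - 1  # 223 when n_ctx is 448
    return [sot_prev] + prompt_tokens[-budget:] + list(sot_sequence)


# Placeholder IDs for illustration only.
SOT_PREV = -1
SOT_SEQUENCE = [-2, -3, -4]

decoder_input = build_decoder_input(list(range(300)), SOT_SEQUENCE, SOT_PREV)
```

Under these assumptions a 300-token prompt is trimmed to its last 223 tokens, so the oldest context is the first thing to be lost.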

The transcription loop in [`whisper/transcribe.py`](https://github.com/openai/whisper/blob/main/whisper/transcribe.py) manages two distinct prompt sources:
- **Static prompts** provided by the user via `initial_prompt`
- **Dynamic prompts** generated from previously decoded text when `condition_on_previous_text=True`

## How `initial_prompt` Injects Static Context

The `initial_prompt` parameter accepts an optional string that gets tokenized once at the beginning of transcription. According to the source code in [`whisper/transcribe.py`](https://github.com/openai/whisper/blob/main/whisper/transcribe.py) (lines 48-52), these tokens are prepared before the main decoding loop begins.
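The tokenization step can be sketched as follows, assuming the prompt is normalized by stripping surrounding whitespace and adding a single leading space before encoding. `encode_initial_prompt` is a hypothetical helper, and the `encode` argument stands in for the tokenizer's `encode` method.

```python
from typing import Callable, List, Optional


def encode_initial_prompt(
    initial_prompt: Optional[str],
    encode: Callable[[str], List[int]],
) -> List[int]:
    """Return prompt tokens, or an empty list when no prompt is given."""
    if initial_prompt is None:
        return []
    # Assumed normalization: strip, then add one leading space.
    return encode(" " + initial_prompt.strip())


def toy_encode(text: str) -> List[int]:
    """Stand-in encoder (one ID per character), for demonstration only."""
    return [ord(ch) for ch in text]


tokens = encode_initial_prompt("  Dr. Smith \n", toy_encode)
```

With Whisper installed, the `encode` method of the tokenizer returned by `whisper.tokenizer.get_tokenizer` could be passed in place of `toy_encode`.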

### Single Window Context (Default Behavior)

By default, `carry_initial_prompt=False`, meaning the `initial_prompt` tokens are only prepended to the very first audio window (lines 239-244). This is ideal for providing a one-time context such as a language hint or speaker identification that should influence the opening of the transcript but not constrain subsequent segments.

```python
import whisper

model = whisper.load_model("base")
result = model.transcribe(
    "audio.mp3",
    initial_prompt="Speaker: Dr. Smith\nTopic: Cardiology",
    carry_initial_prompt=False,  # Only the first window is prompted with this text
    condition_on_previous_text=True,
)
print(result["text"])
```

### Persistent Context with `carry_initial_prompt`

When you set `carry_initial_prompt=True`, the code prepends the `initial_prompt` tokens to **every** internal `decode()` call (lines 288-291). This ensures that every window receives the same leading context, which helps maintain consistent formatting or domain-specific language modeling throughout the transcription.

```python
# Reuses `model` from the previous example.
result = model.transcribe(
    "audio.mp3",
    initial_prompt="Medical Transcription - Patient ID 12345:",
    carry_initial_prompt=True,  # Prepended to every window
    condition_on_previous_text=True,
)
```

## How `condition_on_previous_text` Maintains Continuity

The `condition_on_previous_text` boolean (default: `True`) controls whether Whisper feeds the decoded output of the previous window into the next one as a prompt. When enabled, the model receives a running context that prevents it from "resetting" between windows (line 503).

This dynamic conditioning is crucial for maintaining consistency in proper nouns, acronyms, and speaking style across long recordings. However, if the model encounters a difficult segment and produces an error, this error can propagate forward because subsequent windows are conditioned on the mistaken text (line 550).
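The flow of context between windows can be simulated without a model. The sketch below assumes that the prompt for each window is the slice of all decoded tokens since the last reset, and that the context is also dropped after a window that needed a fallback temperature above 0.5, which is how the loop in `whisper/transcribe.py` appears to behave. `simulate_prompts` is a hypothetical helper and the token IDs are made up.

```python
from typing import List, Tuple


def simulate_prompts(
    windows: List[Tuple[List[int], float]],
    condition_on_previous_text: bool,
) -> List[List[int]]:
    """Return the prompt each window would see.

    `windows` holds (decoded_tokens, temperature_used) pairs, one per window.
    """
    all_tokens: List[int] = []
    prompt_reset_since = 0
    prompts = []
    for decoded_tokens, temperature in windows:
        prompts.append(all_tokens[prompt_reset_since:])
        all_tokens.extend(decoded_tokens)
        # Assumed rule: drop context when conditioning is off or the window
        # was only decodable at a high fallback temperature.
        if not condition_on_previous_text or temperature > 0.5:
            prompt_reset_since = len(all_tokens)
    return prompts


windows = [([1, 2], 0.0), ([3, 4], 0.8), ([5], 0.0)]
with_context = simulate_prompts(windows, condition_on_previous_text=True)
without_context = simulate_prompts(windows, condition_on_previous_text=False)
```

In this toy run the second window is conditioned on the first, while the third window starts fresh because the second one needed a high temperature. With conditioning disabled, every prompt is empty.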

## Interaction Between Prompt Options

When both `carry_initial_prompt=True` and `condition_on_previous_text=True` are active, the prompt for each window contains **both** the static `initial_prompt` tokens and the dynamic previous-window text. The code constructs this combined prompt in [`whisper/transcribe.py`](https://github.com/openai/whisper/blob/main/whisper/transcribe.py) before passing it via `decode_options["prompt"]` to the model's `decode()` method.

**Critical Trade-off:** The decoder accepts only a limited number of prompt tokens (about half of its text context, roughly 223 tokens for the released models), so a long `initial_prompt` carried into every window crowds out the dynamic context from the previous windows. The source code explicitly warns about this limitation (lines 548-550), noting that carrying the initial prompt may reduce the effectiveness of `condition_on_previous_text`.
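The arithmetic behind this trade-off can be sketched as below. `build_window_prompt` is a hypothetical helper that approximates the prompt assembly under two assumptions: the prompt budget is `n_text_ctx // 2 - 1` tokens, and the list of all decoded tokens starts with the initial prompt tokens.

```python
from typing import List


def build_window_prompt(
    initial_prompt_tokens: List[int],
    all_tokens: List[int],
    prompt_reset_since: int,
    carry_initial_prompt: bool,
    n_text_ctx: int = 448,  # assumed text context of the released checkpoints
) -> List[int]:
    """Approximate the prompt handed to one decode() call."""
    if not carry_initial_prompt:
        # The decoder later keeps only the tail that fits its budget.
        return all_tokens[prompt_reset_since:]
    # Budget left for previous-window text after reserving the static prompt.
    budget = n_text_ctx // 2 - 1 - len(initial_prompt_tokens)
    # Skip the initial prompt copy stored at the start of all_tokens.
    n_ignored = max(len(initial_prompt_tokens), prompt_reset_since)
    remaining = all_tokens[n_ignored:][-budget:] if budget > 0 else []
    return initial_prompt_tokens + remaining


initial = list(range(100, 150))    # 50 static prompt tokens
decoded = list(range(1000, 1400))  # 400 previously decoded tokens
prompt = build_window_prompt(initial, initial + decoded, 0, True)
```

Here the 50-token static prompt leaves 173 of the 223 prompt slots for previous-window text. A 200-token static prompt would leave only 23.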

## Practical Implementation Examples

These patterns demonstrate common configurations using the Python API or command-line interface:

### Isolate Difficult Segments

Disable conditioning when processing noisy or unrelated audio sections where context propagation causes hallucinations:

```python
# Reuses `model` from the earlier example.
result = model.transcribe(
    "noisy_audio.mp3",
    initial_prompt=None,
    condition_on_previous_text=False,  # Each window decoded independently
)
```

### Combine Static and Dynamic Context

Use a short persistent header while maintaining window-to-window continuity:

```bash
whisper audio.mp3 --initial_prompt "Court Proceedings:" --carry_initial_prompt True --condition_on_previous_text True
```

### Domain-Specific Vocabulary Priming

Prime the model with technical terms at the start without consuming context window space throughout:

```python
# Reuses `model` from the earlier example.
result = model.transcribe(
    "tech_talk.mp3",
    initial_prompt="Kubernetes, Docker, microservices",
    carry_initial_prompt=False,
    condition_on_previous_text=True,
)
```

## Key Source Files and Implementation Details

- **[`whisper/transcribe.py`](https://github.com/openai/whisper/blob/main/whisper/transcribe.py)**: Contains the core transcription loop and prompt assembly logic. Relevant sections include parameter definition (lines 48-52), first-window initialization (lines 239-244), `carry_initial_prompt` handling (lines 288-291), and `condition_on_previous_text` application (line 503).
- **[`whisper/tokenizer.py`](https://github.com/openai/whisper/blob/main/whisper/tokenizer.py)**: Handles the `tokenizer.encode` call that converts `initial_prompt` strings into token IDs used by the decoder.
- **[`whisper/__main__.py`](https://github.com/openai/whisper/blob/main/whisper/__main__.py)**: Entry point for `python -m whisper`. It calls the `cli()` function in `whisper/transcribe.py`, which defines CLI arguments including `--initial_prompt`, `--carry_initial_prompt`, and `--condition_on_previous_text`.

## Summary

- **`initial_prompt`** provides static context tokenized at the start of transcription, useful for domain vocabulary or formatting hints.
- **`carry_initial_prompt=True`** prepends the initial prompt to every decode window, but reduces space available for `condition_on_previous_text` context.
- **`condition_on_previous_text`** (default enabled) feeds previous window output forward, maintaining consistency across long audio files.
- These parameters interact in the prompt construction logic in [`whisper/transcribe.py`](https://github.com/openai/whisper/blob/main/whisper/transcribe.py), where the combined token list is passed to `decode_options["prompt"]`.
- Disable `condition_on_previous_text` when the model gets stuck in error loops on difficult audio segments.

## Frequently Asked Questions

### What is the difference between `initial_prompt` and `condition_on_previous_text`?

`initial_prompt` is text you provide; it is applied to the first window only, or to every window when `carry_initial_prompt=True`. `condition_on_previous_text` automatically feeds the model's own output from previous audio windows into subsequent ones. The former provides static guidance you control; the latter provides dynamic continuity the model generates.

### Should I enable `carry_initial_prompt` for long-form transcription?

Only if every audio window requires the same leading context, such as a mandatory header or consistent speaker label. Be aware that in [`whisper/transcribe.py`](https://github.com/openai/whisper/blob/main/whisper/transcribe.py) (lines 548-550), the code warns that persistent initial prompts consume token budget that would otherwise carry previous-window context, potentially harming transcription coherence across window boundaries.

### Why would I disable `condition_on_previous_text`?

Disable this option when the model enters a failure loop—repeatedly hallucinating the same incorrect text across multiple windows—because the error propagates forward via the prompt. Setting `condition_on_previous_text=False` makes each window independent, allowing the model to recover from localized audio corruption or ambiguous speech.
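One way to spot such a loop is to look for runs of identical consecutive segments in the result. The sketch below assumes the result's `segments` list holds dictionaries with a `text` key. `longest_repeat_run` is a hypothetical helper, and any threshold applied to its output is a heuristic rather than something Whisper defines.

```python
from typing import Dict, List


def longest_repeat_run(segments: List[Dict]) -> int:
    """Length of the longest run of consecutive segments with identical text."""
    longest = 0
    run = 0
    previous = None
    for segment in segments:
        text = segment["text"].strip().lower()
        if text and text == previous:
            run += 1
        else:
            run = 1 if text else 0  # empty segments never count as repeats
        previous = text
        longest = max(longest, run)
    return longest


sample_segments = [
    {"text": " Welcome back."},
    {"text": " Thanks for watching."},
    {"text": " Thanks for watching."},
    {"text": " thanks for watching. "},
    {"text": " Goodbye."},
]
repeat_run = longest_repeat_run(sample_segments)
```

If the run length for `result["segments"]` exceeds a threshold you choose (for example 3), re-running the file with `condition_on_previous_text=False` is a reasonable next step.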

### How do I use these options from the command line?

Whisper's CLI exposes `--initial_prompt` as a string argument, while `--carry_initial_prompt` (default `False`) and `--condition_on_previous_text` (default `True`) each take an explicit `True` or `False` value. For example: `whisper audio.mp3 --initial_prompt "Interview transcript:" --carry_initial_prompt True`.
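When the CLI is driven from a script, building the argument list programmatically avoids quoting mistakes. The sketch below uses a hypothetical helper, `whisper_cli_args`, and assumes the boolean options accept the literal strings `True` and `False`, as in the command shown earlier.

```python
import shlex
from typing import List, Optional


def whisper_cli_args(
    audio_path: str,
    initial_prompt: Optional[str] = None,
    carry_initial_prompt: bool = False,
    condition_on_previous_text: bool = True,
) -> List[str]:
    """Build an argument list suitable for subprocess.run()."""
    args = ["whisper", audio_path]
    if initial_prompt is not None:
        args += ["--initial_prompt", initial_prompt]
    # Booleans are passed as explicit True/False strings (assumed CLI format).
    args += ["--carry_initial_prompt", str(carry_initial_prompt)]
    args += ["--condition_on_previous_text", str(condition_on_previous_text)]
    return args


args = whisper_cli_args(
    "audio.mp3",
    initial_prompt="Interview transcript:",
    carry_initial_prompt=True,
)
command_line = shlex.join(args)  # shell-safe string for logging
```

Passing `args` to `subprocess.run(args, check=True)` would launch the transcription, provided the `whisper` executable is on the `PATH`.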