VT Parser System in Windows Terminal: Architecture and Implementation

The Windows Terminal VT parser system is a finite-state machine that converts incoming byte streams into high-level terminal actions through a two-layer architecture separating generic lexing from command dispatch.

The VT parser system in Windows Terminal provides the core translation layer between raw ANSI/VT escape sequences and concrete terminal operations. Implemented in the microsoft/terminal repository, this component processes everything from cursor movements to mouse events by following DEC VT100/VT520 specifications. It serves both the input path (translating VT input sequences received over the PTY into Win32 input events) and the output path (interpreting the VT sequences an application writes and dispatching them as terminal operations).

Architecture Overview

The system splits responsibilities across two distinct layers to maintain separation between parsing logic and command execution.

| Layer | Responsibility | Core Types |
|---|---|---|
| State Machine | Generic lexer that reads characters, recognizes escape-sequence families (CSI, OSC, DCS, SS3, VT52), and builds parameter lists. | Microsoft::Console::VirtualTerminal::StateMachine in src/terminal/parser/stateMachine.hpp and src/terminal/parser/stateMachine.cpp |
| Engine | Command dispatcher implementing concrete actions via the IStateMachineEngine interface. | InputStateMachineEngine and OutputStateMachineEngine in src/terminal/parser/InputStateMachineEngine.hpp and src/terminal/parser/OutputStateMachineEngine.hpp |

The StateMachine acts as the generic lexer, identifying sequence families and accumulating parameters. When a complete sequence is recognized, it calls back into an IStateMachineEngine implementation to perform the actual work. This design allows the same parser core to service both input and output pipelines.
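To make the split concrete, here is a minimal sketch of the same two-layer idea outside the repository. Everything in it (ToyEngine, ToyLexer, RecordingEngine) is a hypothetical stand-in and not the real StateMachine or IStateMachineEngine; it recognizes only simple CSI sequences and does no parameter clamping.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical, pared-down engine interface. The real IStateMachineEngine
// declares many more callbacks than this.
struct ToyEngine
{
    virtual ~ToyEngine() = default;
    virtual void Print(wchar_t ch) = 0;
    virtual void CsiDispatch(wchar_t finalChar, const std::vector<int>& params) = 0;
};

// Hypothetical lexer that understands only "ESC [ <digits and ;> <final>".
class ToyLexer
{
public:
    explicit ToyLexer(std::unique_ptr<ToyEngine> engine) :
        _engine{ std::move(engine) } {}

    void ProcessString(std::wstring_view text)
    {
        for (const wchar_t ch : text)
        {
            _processCharacter(ch);
        }
    }

private:
    enum class State { Ground, Escape, Csi };

    void _processCharacter(wchar_t ch)
    {
        switch (_state)
        {
        case State::Ground:
            if (ch == L'\x1b') { _state = State::Escape; }
            else { _engine->Print(ch); }
            break;
        case State::Escape:
            if (ch == L'[') { _params = { 0 }; _state = State::Csi; }
            else { _state = State::Ground; } // other families are dropped in this sketch
            break;
        case State::Csi:
            if (ch >= L'0' && ch <= L'9') { _params.back() = _params.back() * 10 + (ch - L'0'); }
            else if (ch == L';') { _params.push_back(0); }
            else { _engine->CsiDispatch(ch, _params); _state = State::Ground; }
            break;
        }
    }

    std::unique_ptr<ToyEngine> _engine;
    State _state{ State::Ground };
    std::vector<int> _params;
};

// Engine that simply records the callbacks it receives.
struct RecordingEngine final : ToyEngine
{
    std::wstring printed;
    std::wstring finals;
    std::vector<std::vector<int>> params;

    void Print(wchar_t ch) override { printed.push_back(ch); }
    void CsiDispatch(wchar_t finalChar, const std::vector<int>& p) override
    {
        finals.push_back(finalChar);
        params.push_back(p);
    }
};

// Usage: the lexer owns the engine; a raw pointer is kept only to inspect it.
inline void RunToyLexerDemo()
{
    auto engine = std::make_unique<RecordingEngine>();
    RecordingEngine* seen = engine.get();
    ToyLexer lexer{ std::move(engine) };
    lexer.ProcessString(L"\x1b[31;1mHi");
    assert(seen->printed == L"Hi");
    assert(seen->finals == L"m");
    assert(seen->params.size() == 1 && seen->params[0].size() == 2);
}
```

Swapping RecordingEngine for a different ToyEngine changes what a sequence does without touching the lexer, which is the property the real design relies on.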

How the State Machine Processes Sequences

Construction and Engine Binding

The parser is constructed with a concrete engine implementation. The constructor receives a std::unique_ptr<IStateMachineEngine> and records whether the engine is for input (_isEngineForInput) so that it can treat corner cases, such as a lone ESC key press, differently on the input path.

auto engine = std::make_unique<InputStateMachineEngine>(std::move(dispatch));
StateMachine parser{ std::move(engine) };

Character Processing and State Transitions

The public entry points ProcessCharacter(wchar_t) and ProcessString(std::wstring_view) forward data to internal _Event* methods based on the current state. The state enum (VTStates) mirrors the classic VT100 state diagram, including states like Ground, Escape, CsiEntry, and CsiParam.

Helper functions such as _isC0Code, _isC1ControlCharacter, _isCsiIndicator, and _isOscIndicator perform low-level Unicode range checks required by the specification. These constexpr functions are defined in src/terminal/parser/stateMachine.cpp.
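As a rough illustration of what such range checks look like, the sketch below uses the generic ECMA-48 code ranges. The function names and exact ranges here are assumptions for illustration; the helpers in stateMachine.cpp may exclude individual code points or be defined differently.

```cpp
// Assumed ranges taken from ECMA-48, not copied from the repository.

// C0 controls occupy 0x00-0x1F (this includes ESC, 0x1B).
constexpr bool IsC0(wchar_t ch) noexcept { return ch >= 0x00 && ch <= 0x1F; }

// C1 controls occupy 0x80-0x9F (0x9B is the single-character form of CSI).
constexpr bool IsC1(wchar_t ch) noexcept { return ch >= 0x80 && ch <= 0x9F; }

// After ESC, '[' introduces a CSI sequence and ']' introduces an OSC string.
constexpr bool IsCsiIndicator(wchar_t ch) noexcept { return ch == L'['; }
constexpr bool IsOscIndicator(wchar_t ch) noexcept { return ch == L']'; }

// A CSI sequence ends with a final byte in 0x40-0x7E, for example 'm' or 'H'.
constexpr bool IsCsiFinal(wchar_t ch) noexcept { return ch >= 0x40 && ch <= 0x7E; }

// Because the helpers are constexpr, they can be checked at compile time.
static_assert(IsC0(L'\n'), "LF is a C0 control");
static_assert(!IsC0(L' '), "space is printable");
static_assert(IsCsiFinal(L'm'), "'m' terminates SGR");
```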

Parameter Accumulation

When the parser encounters numeric characters or delimiters (; or :) while in CSI, OSC, or DCS states, it invokes _ActionParam or _ActionSubParam. These functions use _AccumulateTo to convert digit characters into integer values (VTInt), capping values at MAX_PARAMETER_VALUE = 65535.

The system enforces strict limits to prevent malformed input from causing issues:

  • MAX_PARAMETER_COUNT: 32 parameters maximum
  • MAX_SUBPARAMETER_COUNT: 6 subparameters maximum

Command Dispatch

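Once the parser reaches a terminating character (e.g., the final byte m for an SGR sequence, or BEL or ST (ESC \) for an OSC string), it calls the corresponding _Action*Dispatch method. Note that [ is the CSI introducer, not a terminator. The dispatch method forwards the built identifier and collected parameters to the engine: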

_engine->ActionCsiDispatch(_identifier.Finalize(wch), { _parameters, _subParameters, _subParameterRanges });

The _identifier is constructed using VTIDBuilder, which accumulates intermediate characters to form the final command identifier.
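One plausible way to build such an identifier is to pack each intermediate character and the final character into successive bytes of a 64-bit integer, so that a whole sequence can be compared with a single integer test. The sketch below assumes that layout; ToyIdBuilder and MakeId are hypothetical names, and the real VTIDBuilder may differ in detail.

```cpp
#include <cstdint>

// Hypothetical builder: first character in the lowest byte, next one above it.
class ToyIdBuilder
{
public:
    constexpr void AddIntermediate(char ch) noexcept
    {
        // Keep the top byte free for the final character; extras are ignored here.
        if (_shift < 56)
        {
            _value |= Pack(ch, _shift);
            _shift += 8;
        }
    }

    constexpr std::uint64_t Finalize(char finalChar) const noexcept
    {
        return _value | Pack(finalChar, _shift);
    }

private:
    static constexpr std::uint64_t Pack(char ch, unsigned shift) noexcept
    {
        return static_cast<std::uint64_t>(static_cast<unsigned char>(ch)) << shift;
    }

    std::uint64_t _value{ 0 };
    unsigned _shift{ 0 };
};

// Convenience helper for a sequence with one intermediate, such as "? h".
constexpr std::uint64_t MakeId(char intermediate, char finalChar) noexcept
{
    ToyIdBuilder builder;
    builder.AddIntermediate(intermediate);
    return builder.Finalize(finalChar);
}

// '?' is 0x3F and 'h' is 0x68, so "?h" packs to 0x683F.
static_assert(MakeId('?', 'h') == 0x683F, "private-mode set packs to two bytes");
```

With this layout a dispatcher can switch on the packed value, so "m" and "?h" become distinct integer cases rather than string comparisons.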

Reset Injection and Diagnostics

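A hard reset (RIS) also clears modes that ConPTY itself depends on, such as focus event reporting (DECSET 1004) and Win32 input mode (CSI ? 9001 h), so those mode-enabling sequences must be restored afterwards. The InjectSequence(InjectionType) method records the offset in the stream at which such an event occurred so that ConPTY can replay the required sequences automatically at that point.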

For diagnostics, ParserTracing (defined in src/terminal/parser/tracing.hpp) records state transitions and actions when the VT_TRACE environment variable is enabled. This tracing system allows developers to debug complex VT interactions without impacting production performance.

Input vs. Output Engines

The InputStateMachineEngine parses VT input sequences arriving from the connected terminal (such as \x1b[A for the Up arrow, or mouse reports) and translates them into Win32 INPUT_RECORD structures. This engine also handles Windows-specific extensions like Win32 Input Mode (enabled with CSI ? 9001 h) and integrates with the host through VtInputThread.cpp.

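The OutputStateMachineEngine handles the opposite direction of data flow: it parses the VT sequences that applications write to their output (such as \x1b[31m for red text) and forwards each one to a dispatch object that performs the corresponding terminal operation. When a sequence asks to move the cursor or set SGR attributes, this engine decodes the identifier and parameters and invokes the matching dispatch call.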

Both engines inherit from IStateMachineEngine and implement virtual methods including ActionPrint, ActionCsiDispatch, ActionOscDispatch, and ActionExecute.

Implementation Examples

Minimal Custom Engine

The following example demonstrates implementing a custom engine that echoes CSI sequences:

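// EchoEngine.hpp
// Illustrative only: the set of pure-virtual methods and the VTID and
// VTParameters accessors used below are approximations. Check
// IStateMachineEngine.hpp in your checkout for the exact interface.
#pragma once
#include <cstdio>
#include <string_view>
#include "IStateMachineEngine.hpp"

class EchoEngine final : public Microsoft::Console::VirtualTerminal::IStateMachineEngine
{
public:
    // C0 controls such as CR and LF arrive here; this engine ignores them.
    bool ActionExecute(const wchar_t /*wch*/) override { return true; }

    // Printable text may arrive one character at a time or as a whole run,
    // so both callbacks echo what they receive.
    bool ActionPrint(const wchar_t wch) override { wprintf(L"%lc", wch); return true; }
    bool ActionPrintString(std::wstring_view text) override
    {
        wprintf(L"%.*ls", static_cast<int>(text.size()), text.data());
        return true;
    }

    bool ActionCsiDispatch(const VTID id,
                           const VTParameters parameters) override
    {
        // id.Length(), id.Data(), and parameters.Parameters are placeholder
        // accessors; substitute whatever VTID and VTParameters actually expose.
        wprintf(L"CSI: %.*s  params:", (int)id.Length(), id.Data());
        for (auto p : parameters.Parameters)
            wprintf(L" %d", p);
        wprintf(L"\n");
        return true;
    }

    // Stub implementations for other required methods
    bool ActionEscDispatch(const VTID) override { return true; }
    bool ActionOscDispatch(size_t, std::wstring_view) noexcept override { return true; }
    StringHandler ActionDcsDispatch(const VTID, const VTParameters) noexcept override { return nullptr; }
    bool ActionSs3Dispatch(wchar_t, const VTParameters) noexcept override { return true; }
    bool ActionVt52EscDispatch(const VTID, const VTParameters) noexcept override { return true; }
    bool ActionPassThroughString(std::wstring_view) override { return true; }
};

// main.cpp
#include <memory>
#include "EchoEngine.hpp"
#include "stateMachine.hpp"
using namespace Microsoft::Console::VirtualTerminal;

int wmain()
{
    auto engine = std::make_unique<EchoEngine>();
    StateMachine parser{ std::move(engine) };

    // Feed a raw VT string
    parser.ProcessString(L"\x1b[31;1mHello\x1b[0m World\r\n");
    return 0;
}

Illustrative output (CR and LF are swallowed by ActionExecute, so printed text is not followed by a line break):

CSI: m  params: 31 1
HelloCSI: m  params: 0
 World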

ConPTY Input Integration

In src/host/VtInputThread.cpp, the terminal creates an input engine to process PTY data:

auto dispatch = std::make_unique<VtInputDispatch>(/*...*/);
auto engine   = std::make_unique<InputStateMachineEngine>(std::move(dispatch));
_stateMachine = std::make_unique<StateMachine>(std::move(engine));

// Process incoming data
void VtInputThread::_OnDataReceived(const std::wstring_view data)
{
    _stateMachine->ProcessString(data);
}

This setup translates VT mouse reports (\x1b[<0;30;12M), focus events (\x1b[I), and key sequences into Win32 input records.
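For a sense of what decoding a mouse report involves, the sketch below parses an SGR-style report of the form ESC [ < button ; column ; row followed by M (press) or m (release). SgrMouseReport and ParseSgrMouse are hypothetical names, and this is not how InputStateMachineEngine is structured internally; it only illustrates the wire format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type for one decoded SGR mouse report.
struct SgrMouseReport
{
    int button;
    int column; // 1-based, as sent on the wire
    int row;    // 1-based, as sent on the wire
    bool pressed;
};

inline std::optional<SgrMouseReport> ParseSgrMouse(std::wstring_view seq)
{
    constexpr std::wstring_view prefix{ L"\x1b[<" };
    if (seq.size() <= prefix.size() || seq.substr(0, prefix.size()) != prefix)
    {
        return std::nullopt;
    }
    const wchar_t finalChar = seq.back();
    if (finalChar != L'M' && finalChar != L'm')
    {
        return std::nullopt;
    }

    int fields[3] = { 0, 0, 0 };
    std::size_t index = 0;
    bool sawDigit = false;
    for (const wchar_t ch : seq.substr(prefix.size(), seq.size() - prefix.size() - 1))
    {
        if (ch >= L'0' && ch <= L'9')
        {
            // Saturate so that absurdly long digit runs cannot overflow int.
            fields[index] = std::min(fields[index] * 10 + (ch - L'0'), 65535);
            sawDigit = true;
        }
        else if (ch == L';' && sawDigit && index < 2)
        {
            ++index;
            sawDigit = false;
        }
        else
        {
            return std::nullopt;
        }
    }
    if (index != 2 || !sawDigit)
    {
        return std::nullopt; // exactly three numeric fields are required
    }
    return SgrMouseReport{ fields[0], fields[1], fields[2], finalChar == L'M' };
}

inline void RunMouseDemo()
{
    const auto report = ParseSgrMouse(L"\x1b[<0;30;12M");
    assert(report.has_value());
    assert(report->button == 0 && report->column == 30 && report->row == 12);
    assert(report->pressed);
}
```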

Triggering Reset Injections

After a hard reset, the system preserves critical sequences:

// Following a reset (e.g., Ctrl+Shift+R)
_stateMachine->InjectSequence(InjectionType::RIS);

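// Replay injections when reconstructing the PTY
for (auto&& inj : _stateMachine->GetInjections())
{
    // WriteFile needs a byte-count out-parameter when no OVERLAPPED is passed.
    DWORD written = 0;
    WriteFile(ptyInHandle, buffer.data() + inj.offset, static_cast<DWORD>(inj.length), &written, nullptr);
}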

This mechanism ensures that mode-enabling sequences survive parser reconstruction.
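The idea of offset-based injection can be illustrated without any Windows APIs. The sketch below is an assumption-laden toy: ToyInjection and SpliceInjections are hypothetical, and it hard-codes a single replayed sequence (focus event mode, CSI ? 1004 h) that is spliced into a copy of the stream at each recorded offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical record: where in the stream a replay should be inserted.
struct ToyInjection
{
    std::size_t offset;
};

// Copy the stream, inserting the mode-enabling sequence at every recorded
// offset. Injections are assumed to be sorted by ascending offset.
inline std::string SpliceInjections(std::string_view stream,
                                    const std::vector<ToyInjection>& injections)
{
    constexpr std::string_view replay{ "\x1b[?1004h" };
    std::string out;
    std::size_t consumed = 0;
    for (const auto& inj : injections)
    {
        const std::size_t offset = std::min(inj.offset, stream.size());
        if (offset < consumed)
        {
            continue; // out-of-order entry; skipped in this sketch
        }
        out.append(stream.substr(consumed, offset - consumed));
        out.append(replay);
        consumed = offset;
    }
    out.append(stream.substr(consumed));
    return out;
}

inline void RunInjectionDemo()
{
    // "A", then RIS (ESC c), then "B"; the injection is recorded right after RIS.
    const std::string stream = "A\x1b" "cB";
    const auto result = SpliceInjections(stream, { ToyInjection{ 3 } });
    assert(result == std::string("A\x1b" "c\x1b[?1004hB"));
}
```

Recording offsets instead of rewriting the stream in place lets the parser stay a pure lexer while the caller decides what to replay.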

Key Source Files

| File | Role |
|---|---|
| src/terminal/parser/stateMachine.hpp | Declares the StateMachine class, state enums, Injection struct, and VTIDBuilder. |
| src/terminal/parser/stateMachine.cpp | Implements state transitions, character classification, parameter handling, and dispatch logic. |
| src/terminal/parser/IStateMachineEngine.hpp | Abstract interface defining virtual methods for command execution. |
| src/terminal/parser/InputStateMachineEngine.hpp / .cpp | Input engine converting VT sequences to Win32 input events. |
| src/terminal/parser/OutputStateMachineEngine.hpp / .cpp | Output engine interpreting VT output sequences and dispatching terminal operations. |
| src/terminal/parser/tracing.hpp | Diagnostic tracing infrastructure controlled by VT_TRACE. |
| src/host/VtInputThread.cpp | Integration point creating the input engine and feeding PTY data to the parser. |

Summary

  • The VT parser system uses a finite-state machine architecture with strict separation between lexing (StateMachine) and command execution (IStateMachineEngine).
  • Parameter handling enforces hard limits: 32 parameters and 6 subparameters maximum, with values capped at 65535.
  • Two production engines exist: InputStateMachineEngine, which turns VT input sequences into Win32 input events, and OutputStateMachineEngine, which turns the VT sequences written by applications into terminal operations.
  • The injection mechanism preserves critical sequences like RIS across parser resets, enabling reliable ConPTY reconstruction.
  • Developers can trace parser behavior using the VT_TRACE environment variable to monitor state transitions in real time.

Frequently Asked Questions

What is the maximum number of parameters a VT sequence can have in Windows Terminal?

The parser enforces a hard limit of 32 parameters per sequence, with up to 6 subparameters each. Numeric values are capped at 65535 (MAX_PARAMETER_VALUE). These limits protect against malformed input causing buffer overflows or excessive memory consumption.

How does Windows Terminal handle mouse events from VT sequences?

The InputStateMachineEngine in src/terminal/parser/InputStateMachineEngine.cpp parses mouse report sequences (such as \x1b[<0;30;12M for SGR mouse mode) and converts them into Win32 INPUT_RECORD structures. The VtInputThread feeds raw PTY data into the state machine, which dispatches to the engine for translation into mouse button events and coordinates.

What is the difference between the Input and Output state machine engines?

The InputStateMachineEngine translates incoming VT input streams into Win32 console input events, handling keyboard sequences, mouse reports, and focus notifications. The OutputStateMachineEngine works on the other direction of data flow, interpreting the VT escape sequences that applications write in order to update cursor position, text attributes, or screen content. Both implement the IStateMachineEngine interface but serve opposite data flow directions.

How can I debug VT parser behavior in Windows Terminal?

Set the VT_TRACE environment variable to enable ParserTracing diagnostics. When active, the ParserTracing class declared in src/terminal/parser/tracing.hpp logs every state transition and parser action. Tracing does not affect performance when disabled, and it shows how specific byte sequences move through states like Ground, Escape, and CsiEntry.
