Windows Terminal Input Handling: A Complete Guide to Keyboard, Mouse, and VT Processing

Windows Terminal supports a multi-layered input handling system that processes raw Win32 keyboard and mouse events, translates them into VT escape sequences or legacy Win32 records, and parses them through a full VT state machine.

Windows Terminal's input handling architecture serves as a flexible conduit between the Windows console host and modern terminal applications. This open-source terminal emulator from Microsoft implements a modular pipeline that can consume raw Win32 INPUT_RECORD structures, convert them to VT-compatible escape sequences, and process them through a sophisticated state machine parser.

The Input Handling Architecture

Windows Terminal processes input through three distinct layers: raw Win32 event consumption, translation to VT sequences, and state-machine parsing.

From Raw Win32 Events to VT Sequences

The input stack begins with the operating system delivering INPUT_RECORD structures to the terminal. These records contain keyboard states, mouse coordinates, and window events. On the console host side, the VtInputThread class, implemented in src/host/VtInputThread.cpp, reads the resulting VT byte stream from the pseudo-terminal pipe and starts converting it back into input records.

The TerminalInput Class

The TerminalInput class, defined in src/terminal/input/terminalInput.hpp and implemented in src/terminal/input/terminalInput.cpp, serves as the primary translator. It exposes two key methods: HandleKey (line 220) for keyboard events and HandleMouse (line 295) for pointer events. These methods inspect the INPUT_RECORD, apply modifier tracking via _trackControlKeyState, and generate appropriate output sequences.

The InputStateMachineEngine Parser

Once TerminalInput generates VT sequences, the InputStateMachineEngine takes over. Located in src/terminal/parser/InputStateMachineEngine.hpp and src/terminal/parser/InputStateMachineEngine.cpp, this engine implements a full VT parser with states for ground, escape, CSI, DCS, and OSC sequences. Key methods include ActionExecute, ActionCsiDispatch, and ActionOscDispatch (lines 144-389), which ultimately call IInteractDispatch::WriteInput to deliver processed records to the console host.
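The ground, escape, and CSI states described above can be illustrated with a much smaller parser. The following is a minimal sketch, not the InputStateMachineEngine itself: the Token struct, the ParserState enum, and ParseInput are hypothetical names invented for this example, and DCS, OSC, private markers, and intermediates are deliberately left out.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical token type for this sketch: a plain character or a complete CSI sequence.
struct Token
{
    bool isCsi = false;
    char ch = 0;              // valid when isCsi is false
    std::vector<int> params;  // valid when isCsi is true
    char finalByte = 0;       // valid when isCsi is true
};

enum class ParserState { Ground, Escape, Csi };

// Splits a byte stream into plain characters and CSI sequences.
std::vector<Token> ParseInput(std::string_view input)
{
    std::vector<Token> tokens;
    ParserState state = ParserState::Ground;
    Token pending;

    for (const char c : input)
    {
        switch (state)
        {
        case ParserState::Ground:
            if (c == '\x1b')
            {
                state = ParserState::Escape;
            }
            else
            {
                Token t;
                t.ch = c;
                tokens.push_back(t);
            }
            break;

        case ParserState::Escape:
            if (c == '[')
            {
                // CSI introducer: start collecting parameters.
                pending = Token{};
                pending.isCsi = true;
                state = ParserState::Csi;
            }
            else
            {
                // Simplification: any other byte is emitted as a plain character.
                Token t;
                t.ch = c;
                tokens.push_back(t);
                state = ParserState::Ground;
            }
            break;

        case ParserState::Csi:
            if (c >= '0' && c <= '9')
            {
                if (pending.params.empty()) pending.params.push_back(0);
                pending.params.back() = pending.params.back() * 10 + (c - '0');
            }
            else if (c == ';')
            {
                if (pending.params.empty()) pending.params.push_back(0);
                pending.params.push_back(0);
            }
            else if (c >= 0x40 && c <= 0x7e)
            {
                // Final byte: the sequence is complete, dispatch it.
                pending.finalByte = c;
                tokens.push_back(pending);
                state = ParserState::Ground;
            }
            // Other bytes (private markers, intermediates) are ignored in this sketch.
            break;
        }
    }
    return tokens;
}
```

Feeding this sketch "a\x1b[1;5Ab" yields three tokens: the character a, one CSI token with parameters 1 and 5 and final byte A, and the character b. In the real engine the dispatch step would build INPUT_RECORDs rather than tokens.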

Keyboard Input Handling

Windows Terminal supports three distinct keyboard encoding modes, selectable at runtime.

Standard VT Keyboard Encoding

By default, TerminalInput initializes a keyboard mapping table via _initKeyboardMap that translates virtual-key codes to VT escape sequences. When HandleKey processes a KEY_EVENT_RECORD, it checks the current InputMode and generates standard CSI sequences for function keys, arrow keys, and modified Unicode characters.
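A table-driven translation of this kind can be sketched as follows. This is an illustration of the idea only, assuming the conventional xterm encoding for cursor keys; EncodeCursorKey, the kVk constants, and the modifier bits are hypothetical names for this example and do not mirror the layout of _initKeyboardMap.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Win32 virtual-key values, copied here so the sketch builds on any platform.
constexpr std::uint16_t kVkLeft = 0x25;
constexpr std::uint16_t kVkUp = 0x26;
constexpr std::uint16_t kVkRight = 0x27;
constexpr std::uint16_t kVkDown = 0x28;

// Modifier bits for this sketch only (not the dwControlKeyState layout).
constexpr unsigned kShift = 1;
constexpr unsigned kAlt = 2;
constexpr unsigned kCtrl = 4;

// Returns the xterm-style sequence for a cursor key, or nullopt for unmapped keys.
std::optional<std::string> EncodeCursorKey(std::uint16_t vk, unsigned mods)
{
    static const std::unordered_map<std::uint16_t, char> finalBytes{
        { kVkUp, 'A' }, { kVkDown, 'B' }, { kVkRight, 'C' }, { kVkLeft, 'D' },
    };

    const auto it = finalBytes.find(vk);
    if (it == finalBytes.end())
    {
        return std::nullopt;
    }

    std::string seq = "\x1b[";
    if (mods != 0)
    {
        // Modified keys carry "1;<1 + modifier bits>" before the final byte.
        seq += "1;";
        seq += std::to_string(1 + mods);
    }
    seq += it->second;
    return seq;
}
```

With this sketch, an unmodified Up arrow maps to \x1b[A and Ctrl+Right maps to \x1b[1;5C, while keys outside the table return no sequence and would fall through to character handling.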

Kitty Keyboard Protocol Support

For applications requiring extended key reporting, Windows Terminal implements the Kitty keyboard protocol. The SetKittyKeyboardProtocol method enables this mode, which can replace or supplement standard VT output. Internal functions _encodeKitty and _getKittyFunctionalKeyCode generate sequences like \x1b[97u for key events, including repeat counts and modifier states. Unit tests in src/terminal/adapter/ut_adapter/kittyKeyboardProtocol.cpp (lines 281-285) verify this behavior.
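The shape of these sequences follows the published Kitty format CSI key-code ; modifiers : event-type u. The sketch below shows only that formatting step under simplifying assumptions: EncodeKittyKey and KeyEventType are hypothetical names, the modifier bits are local to the example, and alternate key codes and associated text are omitted. It is not the _encodeKitty implementation.

```cpp
#include <string>

// Event types as numbered by the Kitty keyboard protocol.
enum class KeyEventType { Press = 1, Repeat = 2, Release = 3 };

// Formats a key as CSI <code>[;<1 + mods>[:<event>]] u.
// mods is a bitmask (shift = 1, alt = 2, ctrl = 4) local to this sketch.
std::string EncodeKittyKey(char32_t keyCode, unsigned mods, KeyEventType event)
{
    std::string seq = "\x1b[";
    seq += std::to_string(static_cast<unsigned long>(keyCode));

    const bool reportEvent = event != KeyEventType::Press;
    if (mods != 0 || reportEvent)
    {
        // The modifier field is 1 + bitmask, so "1" means no modifiers.
        seq += ';';
        seq += std::to_string(1 + mods);
        if (reportEvent)
        {
            seq += ':';
            seq += std::to_string(static_cast<int>(event));
        }
    }

    seq += 'u';
    return seq;
}
```

A plain press of "a" (code point 97) becomes \x1b[97u, a repeat becomes \x1b[97;1:2u, and Shift+a becomes \x1b[97;2u. In the protocol itself, repeat and release events are reported only when the application has requested event-type reporting.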

Legacy Win32-Input-Mode

When an application needs the full fidelity of Win32 key events, the terminal can switch to win32-input-mode. In this mode, HandleKey in src/terminal/input/terminalInput.cpp checks the InputMode flags and, instead of the usual VT key encoding, serializes every field of the original KEY_EVENT_RECORD into a single CSI sequence of the form \x1b[Vk;Sc;Uc;Kd;Cs;Rc_ so that the receiving side can rebuild the record without loss.
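As a rough illustration of win32-input-mode, the sketch below serializes the fields of a key event into the CSI Vk;Sc;Uc;Kd;Cs;Rc _ form described in the project's win32-input-mode specification. KeyRecord and EncodeWin32InputMode are hypothetical stand-ins so the example builds without the Windows headers; the real code works on KEY_EVENT_RECORD directly.

```cpp
#include <cstdint>
#include <string>

// Stand-in for the fields of a Win32 KEY_EVENT_RECORD (hypothetical type for this sketch).
struct KeyRecord
{
    std::uint16_t virtualKey;      // Vk
    std::uint16_t scanCode;        // Sc
    char16_t unicodeChar;          // Uc
    bool keyDown;                  // Kd
    std::uint32_t controlKeyState; // Cs
    std::uint16_t repeatCount;     // Rc
};

// Serializes every field as a decimal parameter, terminated by '_'.
std::string EncodeWin32InputMode(const KeyRecord& key)
{
    std::string seq = "\x1b[";
    seq += std::to_string(key.virtualKey);
    seq += ';';
    seq += std::to_string(key.scanCode);
    seq += ';';
    seq += std::to_string(static_cast<unsigned>(key.unicodeChar));
    seq += ';';
    seq += key.keyDown ? '1' : '0';
    seq += ';';
    seq += std::to_string(key.controlKeyState);
    seq += ';';
    seq += std::to_string(key.repeatCount);
    seq += '_';
    return seq;
}
```

For a Shift+A key-down (virtual key 65, scan code 30, character 65, SHIFT_PRESSED = 0x10, repeat count 1) the sketch produces \x1b[65;30;65;1;16;1_. Because every field survives the round trip, the receiver can reconstruct the same record that the sender saw.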

Mouse Input Handling

Windows Terminal provides comprehensive mouse support with multiple encoding schemes and scroll detection.

SGR Mouse Encoding

For precise mouse reporting, Windows Terminal supports the SGR extended mouse encoding defined by XTerm. When an application has enabled this encoding, the HandleMouse method (line 295 in terminalInput.cpp) delegates to _GenerateSGRSequence (line 471 in mouseInput.cpp), which formats events as \x1b[<button;column;rowM for presses and \x1b[<button;column;rowm for releases. Because coordinates are written as decimal numbers, this format supports buttons 1-11 and avoids the 223-cell coordinate limit of the legacy single-byte encoding.
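The SGR format is simple enough to sketch in a few lines. The function below is an illustration only: EncodeSgrMouse is a hypothetical name, it takes an already-computed button code, and it assumes zero-based cell coordinates as input, which are converted to the one-based values the protocol reports.

```cpp
#include <string>

// Formats an SGR mouse report: CSI < button ; column ; row, then M (press) or m (release).
// col0 and row0 are zero-based cell coordinates; the wire format is one-based.
std::string EncodeSgrMouse(int button, int col0, int row0, bool pressed)
{
    std::string seq = "\x1b[<";
    seq += std::to_string(button);
    seq += ';';
    seq += std::to_string(col0 + 1);
    seq += ';';
    seq += std::to_string(row0 + 1);
    seq += pressed ? 'M' : 'm';
    return seq;
}
```

A left-button press (button code 0) at zero-based cell (10, 5) is reported as \x1b[<0;11;6M, and the matching release ends in a lowercase m.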

UTF-8 and Default Protocols

For compatibility with older terminal emulators, Windows Terminal also implements UTF-8 and Default (X10/DEC) mouse encodings. The _GenerateUtf8Sequence function encodes coordinates as UTF-8 characters, while _GenerateDefaultSequence produces the legacy single-byte coordinate format used by DEC terminals.
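The single-byte format explains where the legacy coordinate limit comes from. In the sketch below, each value is offset by 32 and written as one byte after CSI M, so a one-based coordinate above 223 no longer fits. EncodeDefaultMouse is a hypothetical name, and returning no sequence for out-of-range input is a choice made for this example, not a statement about what _GenerateDefaultSequence does.

```cpp
#include <optional>
#include <string>

// Formats a legacy mouse report: CSI M Cb Cx Cy, each value offset by 32 in one byte.
// col0 and row0 are zero-based; returns nullopt when a value does not fit in a byte.
std::optional<std::string> EncodeDefaultMouse(int button, int col0, int row0)
{
    constexpr int kOffset = 32;
    const int cb = button + kOffset;
    const int cx = col0 + 1 + kOffset;
    const int cy = row0 + 1 + kOffset;

    if (button < 0 || col0 < 0 || row0 < 0 || cb > 255 || cx > 255 || cy > 255)
    {
        return std::nullopt;
    }

    std::string seq = "\x1b[M";
    seq += static_cast<char>(cb);
    seq += static_cast<char>(cx);
    seq += static_cast<char>(cy);
    return seq;
}
```

A left-button press in the top-left cell becomes \x1b[M followed by a space and two exclamation marks (bytes 32, 33, 33). Zero-based column 222 is the last one that can be encoded, which is the limitation the UTF-8 and SGR encodings were introduced to lift.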

Alternate Scroll Detection

When the alternate screen buffer is active, Windows Terminal supports alternate scroll behavior, converting mouse wheel events into arrow key sequences. The ShouldSendAlternateScroll method checks the current buffer state, and _makeAlternateScrollOutput generates the appropriate VT sequences (typically \x1b[A or \x1b[B for up/down scrolling), allowing mouse wheel navigation in applications like less or vim without explicit mouse support.
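The wheel-to-arrow conversion can be sketched as a single decision on the sign of the wheel delta. MakeAlternateScroll is a hypothetical name, and the sketch assumes normal (non-application) cursor key mode and emits one arrow key per wheel event; it is not the _makeAlternateScrollOutput implementation.

```cpp
#include <string>

// Maps a mouse wheel delta to a cursor key sequence.
// A positive delta means the wheel moved away from the user (scroll up).
std::string MakeAlternateScroll(short wheelDelta)
{
    const char direction = wheelDelta > 0 ? 'A' : 'B'; // A = cursor up, B = cursor down
    std::string seq = "\x1b[";
    seq += direction;
    return seq;
}
```

With this mapping, a pager running in the alternate buffer receives the same bytes it would get from the Up and Down arrow keys, so it needs no mouse support of its own.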

Advanced Input Features

Beyond basic keyboard and mouse, Windows Terminal integrates with system input services and provides clean abstractions for PTY consumers.

Text Services Framework Integration

For complex input scenarios including Input Method Editors (IME), pen, and touch input, Windows Terminal integrates with the Windows Text Services Framework (TSF). The implementation in src/tsf/ handles composition strings, candidate lists, and stylus events, converting them into standard INPUT_RECORD structures that flow through the same TerminalInput pipeline as hardware keyboard events.

The VtInputThread Integration Point

The VtInputThread class in src/host/VtInputThread.cpp serves as the primary integration point between the pseudo-terminal (PTY) pipe and the input processing stack. At line 30, it constructs the processing chain:

auto dispatch = std::make_unique<InteractDispatch>();
auto engine   = std::make_unique<InputStateMachineEngine>(std::move(dispatch));
auto machine  = std::make_unique<StateMachine>(std::move(engine));

The thread reads raw bytes from the PTY, feeds them to machine->ProcessString, and forwards any generated INPUT_RECORD structures back to the console host through the dispatch object. This design allows Windows Terminal to act as a bidirectional bridge between Win32 console APIs and VT-aware applications.

Code Examples

Converting Win32 Key Events to VT Sequences

The following example demonstrates creating a TerminalInput instance and converting a KEY_EVENT_RECORD into a VT escape sequence:

#include "terminal/input/terminalInput.hpp"

int main()
{
    // Create a TerminalInput that simply returns the VT output as a string
    TerminalInput terminalInput;

    // Simulate a Win32 KEY_EVENT for the letter "A" with Shift pressed
    INPUT_RECORD ir = {};
    ir.EventType = KEY_EVENT;
    ir.Event.KeyEvent.bKeyDown = TRUE;
    ir.Event.KeyEvent.wVirtualKeyCode = 'A';
    ir.Event.KeyEvent.uChar.UnicodeChar = L'A';
    ir.Event.KeyEvent.dwControlKeyState = SHIFT_PRESSED;

    // Convert to terminal output (default VT mode)
    const auto output = terminalInput.HandleKey(ir);
    // With the Kitty protocol disabled, a shifted letter is sent as the plain
    // character, so `output` holds "A" (Kitty mode would produce "\x1b[97;2u")
}

Source: src/terminal/input/terminalInput.hpp and src/terminal/input/terminalInput.cpp (line 220).

Enabling Kitty Keyboard Protocol

To enable the Kitty keyboard protocol for extended key reporting:

TerminalInput term;
term.SetKittyKeyboardProtocol(
    TerminalInput::KittyKeyboardProtocolFlags::Enable,   // enable the protocol
    TerminalInput::KittyKeyboardProtocolMode::Replace   // replace normal VT output
);

// `ir` is a key-down INPUT_RECORD for the unmodified "a" key
// First key-down
auto out1 = term.HandleKey(ir);          // -> "\x1b[97u"
// Second key-down without an intervening key-up (repeat)
auto out2 = term.HandleKey(ir);          // -> "\x1b[97;1:2u"

Test reference: src/terminal/adapter/ut_adapter/kittyKeyboardProtocol.cpp lines 281-285.

Generating SGR Mouse Sequences

For applications requiring precise mouse tracking, generate SGR-encoded mouse events:

TerminalInput term;
til::point pos{10, 5};
auto mouseOut = term.HandleMouse(pos, /*button*/ 0, /*modifiers*/ 0, /*delta*/ 0,
                                 MouseButtonState::Pressed);
// mouseOut now contains "\x1b[<0;11;6M" when SGR mouse encoding is enabled
// (the zero-based position {10, 5} is reported one-based as 11;6)

Implementation: src/terminal/input/mouseInput.cpp, _GenerateSGRSequence (line 471).

Parsing VT Input Streams

To parse incoming VT sequences into structured input records:

#include "terminal/parser/InputStateMachineEngine.hpp"
#include "terminal/parser/stateMachine.hpp"

auto dispatch = std::make_unique<TestInteractDispatch>([](auto const& rec){ /* write to host */ });
auto engine   = std::make_unique<InputStateMachineEngine>(std::move(dispatch));
StateMachine machine{ std::move(engine) };

std::wstring vt = L"\x1b[A";   // Up arrow as sent by a VT client
machine.ProcessString(vt);     // Parses the CSI sequence and hands the key to dispatch as INPUT_RECORDs

Key code: StateMachine::ProcessString in stateMachine.cpp, which drives InputStateMachineEngine.

Summary

Windows Terminal's input handling system provides a comprehensive bridge between Win32 console APIs and modern VT-aware applications:

  • Multi-layered processing: Raw INPUT_RECORD structures flow through TerminalInput for translation, then InputStateMachineEngine for parsing.
  • Flexible keyboard encoding: Supports standard VT sequences, the Kitty keyboard protocol for extended key reporting, and legacy Win32-input-mode for backward compatibility.
  • Comprehensive mouse support: Implements SGR, UTF-8, and Default encoding schemes with alternate-scroll detection for application compatibility.
  • System integration: Text Services Framework (TSF) support enables IME, pen, and touch input through the same pipeline.
  • Clean abstractions: The VtInputThread class provides a standardized integration point for PTY consumers, constructing the dispatch chain and managing byte-to-record conversion.

Frequently Asked Questions

What input modes does Windows Terminal support?

Windows Terminal supports three primary input modes: standard VT escape sequences for modern terminal applications, the Kitty keyboard protocol for extended key reporting with modifier and repeat information, and win32-input-mode, which serializes each KEY_EVENT_RECORD losslessly so that legacy console applications still see the original key data. These modes can be toggled at runtime via mode sequences to accommodate different application requirements.

How does Windows Terminal handle mouse input in WSL?

When running WSL or other PTY-based applications, Windows Terminal captures Win32 mouse events and translates them into VT-compatible escape sequences using the HandleMouse method in src/terminal/input/terminalInput.cpp. When the application enables SGR encoding (\x1b[<...), reports support buttons 1-11 and avoid the coordinate limit of the legacy encoding, which keeps them compatible with the mouse protocols used by Linux applications like vim and tmux.

What is the Kitty keyboard protocol and is it supported?

The Kitty keyboard protocol is an extended key reporting format that provides detailed information about key presses, including modifier states, key repeat counts, and distinct press/release events. Windows Terminal fully implements this protocol through the SetKittyKeyboardProtocol method in src/terminal/input/terminalInput.cpp, using internal functions _encodeKitty and _getKittyFunctionalKeyCode to generate sequences like \x1b[97;1:2u for repeated keys.

Can Windows Terminal process legacy Win32 console input?

Yes, Windows Terminal maintains backward compatibility with legacy console applications through win32-input-mode. When enabled, the HandleKey method in src/terminal/input/terminalInput.cpp checks the InputMode flags and, instead of the usual VT key encoding, emits each KEY_EVENT_RECORD in a lossless serialized form that the console host turns back into an INPUT_RECORD. This allows older console applications that depend on traditional Win32 console input APIs to function correctly within the modern terminal emulator.
