How TerminalInput Processes Keyboard and Mouse Input in Windows Terminal

TerminalInput translates raw Windows input events into VT escape sequences, enabling Windows Terminal to support both legacy terminal applications and modern protocols like Kitty Keyboard and SGR mouse tracking.

The microsoft/terminal repository contains the core logic that powers Windows Terminal's input handling. At the heart of this system lies TerminalInput, a class responsible for converting raw Windows INPUT_RECORD structures and mouse messages into standardized VT (Virtual Terminal) sequences that Unix-style shells and applications expect.

Architecture of TerminalInput

Core Components

The input processing pipeline lives in the src/terminal/input directory, with keyboard logic in terminalInput.cpp and mouse handling in mouseInput.cpp.

Input Processing Flow

The flow for keyboard and mouse input follows a consistent four-stage pipeline:

  1. Receive raw input – INPUT_RECORD structures for keyboard events, or mouse messages (WM_* values), arrive from the host.
  2. Sanitize the event – Strip unwanted modifiers, detect AltGr, and preserve timing info in SanitizedKeyEvent.
  3. Select an encoder – Based on current input mode bits (Mode), choose between Win32, ANSI, Kitty, or mouse-specific encoders.
  4. Generate output – Produce an OutputType (std::optional<std::wstring>) containing the exact VT sequence to feed to the console input buffer.

Keyboard Input Processing

Entry Point: HandleKey

The primary entry point for keyboard events is the HandleKey method:

[[nodiscard]] OutputType TerminalInput::HandleKey(const INPUT_RECORD& event);

This method first verifies that the event is a KEY_EVENT. If Win32-input mode is active and no Kitty flags are set, it short-circuits to _makeWin32Output. Otherwise, the event is sanitized into a SanitizedKeyEvent structure containing the virtual key, scan code, Unicode code-point, control state, and repeat flag.

Key code excerpt from terminalInput.cpp (lines 23-44):

SanitizedKeyEvent key{
    .virtualKey = event.Event.KeyEvent.wVirtualKeyCode,
    .scanCode   = event.Event.KeyEvent.wVirtualScanCode,
    .codepoint  = event.Event.KeyEvent.uChar.UnicodeChar,
    .controlKeyState = _trackControlKeyState(event.Event.KeyEvent),
    .keyDown = event.Event.KeyEvent.bKeyDown != FALSE,
};
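
The dispatch order described for HandleKey can be sketched as a small decision function. This is a simplified illustration, not the actual implementation: Encoder, ModeState, and selectEncoder are hypothetical names, and the real code additionally falls back to the regular encoder when the Kitty path produces no output.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not the real TerminalInput API.
enum class Encoder { Win32, Kitty, Regular };

struct ModeState {
    bool win32InputMode = false;   // Win32-input mode bit
    std::uint8_t kittyFlags = 0;   // non-zero means at least one Kitty flag is set
};

// Mirrors the order in the text: Win32 short-circuit first,
// then Kitty when any flag is set, otherwise the classic VT path.
inline Encoder selectEncoder(const ModeState& m) {
    if (m.win32InputMode && m.kittyFlags == 0) {
        return Encoder::Win32;
    }
    if (m.kittyFlags != 0) {
        return Encoder::Kitty;
    }
    return Encoder::Regular;
}
```

Note that a set Kitty flag takes priority even when Win32-input mode is also enabled, which matches the "no Kitty flags are set" condition on the short-circuit.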

Input Mode Management

TerminalInput maintains a bitset _inputMode that stores flags such as Ansi, AutoRepeat, Keypad, Win32, Utf8MouseEncoding, and SgrMouseEncoding. Mode changes are applied via SetInputMode(Mode, bool) and reset with ResetInputModes.

The mode bits influence encoding decisions:

Mode | Effect on encoding
--- | ---
LineFeed | Determines whether Enter produces \r\n or just \r.
BackarrowKey | Controls whether Backspace yields DEL (\x7f) or BS (\b).
CursorKey / Keypad | Choose between CSI versus SS3 style sequences for arrow and keypad keys.
Kitty flags (DisambiguateEscapeCodes, ReportEventTypes, ReportAssociatedText) | Enable the Kitty Keyboard Protocol, producing CSI u sequences with rich modifier information.
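
A minimal sketch of how a bitset of mode flags can drive these decisions is shown below. The Mode enum here is a hypothetical subset, InputModes is an illustrative stand-in for the real class, and reset() simply clears every bit, whereas the real ResetInputModes may restore its own defaults.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical subset of the mode flags named in the text; the real enum differs.
enum class Mode : std::size_t {
    LineFeed, Ansi, AutoRepeat, CursorKey, Keypad, BackarrowKey, Win32, Count
};

class InputModes {
public:
    void set(Mode mode, bool enabled) {
        bits_.set(static_cast<std::size_t>(mode), enabled);
    }
    bool test(Mode mode) const {
        return bits_.test(static_cast<std::size_t>(mode));
    }
    // Assumption: this sketch clears everything on reset.
    void reset() { bits_.reset(); }

private:
    std::bitset<static_cast<std::size_t>(Mode::Count)> bits_;
};

// Enter produces CR LF when LineFeed mode is set, otherwise CR alone.
inline std::wstring enterOutput(const InputModes& m) {
    return m.test(Mode::LineFeed) ? L"\r\n" : L"\r";
}

// Backspace yields BS when BackarrowKey mode is set, otherwise DEL.
inline wchar_t backspaceOutput(const InputModes& m) {
    return m.test(Mode::BackarrowKey) ? L'\b' : L'\x7f';
}
```

Keeping the flags in one bitset means a full reset is a single operation and each encoder only needs read access to the bits it cares about.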

Kitty Keyboard Protocol Support

When any Kitty flag is set, _encodeKitty is invoked (lines 90-150 in terminalInput.cpp). The protocol builds an EncodingHelper containing:

  • csiUnicodeKeyCode – functional key code or Unicode code-point.
  • csiFinal = L'u' for CSI u sequences.
  • Modifier bitmap (csiModifier) – bit values 1 (Shift), 2 (Alt), 4 (Ctrl); the Kitty protocol transmits this field as 1 plus the bitmap, so Ctrl alone is sent as 5.
  • Optional associated text (csiTextAsCodepoint) when ReportAssociatedText is enabled.

The final VT string is assembled in _formatEncodingHelper (lines 116-164), emitting sequences like:


CSI <code>;<modifiers>:<event-type>;<text> u
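
To make the field layout concrete, here is a hedged sketch of a CSI u formatter. formatCsiU is a hypothetical helper, not the real _formatEncodingHelper. It follows the published Kitty protocol rules: the modifier parameter is 1 plus the bitmap, and the event type (1 press, 2 repeat, 3 release) is a colon-separated sub-parameter of the modifier field. As a simplification, it always writes the modifier field explicitly when a later field is needed.

```cpp
#include <string>

// Hypothetical helper for illustration only.
// modifierBits: 1 = Shift, 2 = Alt, 4 = Ctrl.
// eventType:    1 = press, 2 = repeat, 3 = release.
// textCodepoint: associated text as a code point, 0 when absent.
inline std::wstring formatCsiU(unsigned keyCode,
                               unsigned modifierBits,
                               unsigned eventType,
                               unsigned textCodepoint = 0) {
    std::wstring seq = L"\x1b[";
    seq += std::to_wstring(keyCode);

    const unsigned modifierParam = 1 + modifierBits;  // protocol encodes 1 + bitmap
    const bool needEvent = eventType > 1;             // press is the default, so it is omitted
    const bool needText = textCodepoint != 0;

    if (modifierParam > 1 || needEvent || needText) {
        seq += L';';
        seq += std::to_wstring(modifierParam);
        if (needEvent) {
            seq += L':';
            seq += std::to_wstring(eventType);
        }
    }
    if (needText) {
        seq += L';';
        seq += std::to_wstring(textCodepoint);
    }
    seq += L'u';
    return seq;
}
```

For example, formatCsiU(97, 4, 1) describes Ctrl + a as a key press, and formatCsiU(97, 1, 3) describes the release of Shift + a.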

Regular VT Encoding Fallback

If no Kitty output is produced, _encodeRegular builds classic VT sequences (lines 222-460). This path handles:

  • Alphanumeric keys → plain characters (optionally prefixed with ESC for Alt).
  • Navigation and editing keys → CSI sequences with an optional modifier parameter and the appropriate final character (e.g., \x1b[5~ for PageUp).
  • Function keys → mapping via KeyboardHelper::getKittyBaseKey or internal lookup tables.

The EncodingHelper stores either a plain string (plain), an SS3 final (ss3Final), or a CSI final (csiFinal). The formatter selects the correct introducer (_csi or _ss3) and appends the data.
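
The formatter's choice between a plain string, an SS3 sequence, and a CSI sequence can be sketched as follows. EncodingSketch and formatSketch are hypothetical, simplified stand-ins for the structures described above, and the modifier handling follows the common xterm convention (a parameter of 1 means no modifiers) rather than anything confirmed from the source.

```cpp
#include <string>

// Hypothetical, simplified stand-in for the EncodingHelper described above.
struct EncodingSketch {
    std::wstring plain;          // literal output; takes priority when non-empty
    wchar_t ss3Final = 0;        // final character for an SS3 sequence, 0 if unused
    wchar_t csiFinal = 0;        // final character for a CSI sequence, 0 if unused
    unsigned csiParam = 0;       // first CSI parameter (5 for PageUp), 0 if omitted
    unsigned modifierParam = 1;  // xterm-style modifier parameter, 1 means none
};

inline std::wstring formatSketch(const EncodingSketch& e) {
    if (!e.plain.empty()) {
        return e.plain;
    }
    if (e.ss3Final != 0) {
        return std::wstring(L"\x1bO") + e.ss3Final;  // SS3 introducer is ESC O
    }
    if (e.csiFinal == 0) {
        return std::wstring();  // nothing to emit
    }
    std::wstring out = L"\x1b[";  // CSI introducer is ESC [
    const bool hasModifier = e.modifierParam > 1;
    if (e.csiParam != 0 || hasModifier) {
        // A modifier needs a leading parameter, so default it to 1.
        out += std::to_wstring(e.csiParam != 0 ? e.csiParam : 1u);
    }
    if (hasModifier) {
        out += L';';
        out += std::to_wstring(e.modifierParam);
    }
    out += e.csiFinal;
    return out;
}
```

With this layout, PageUp is csiParam 5 with final '~', while Ctrl + Up is final 'A' with modifierParam 5 and an implied leading parameter of 1.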

Keyboard Input Example

The following example demonstrates how HandleKey processes Ctrl + C:

INPUT_RECORD rec = {};
rec.EventType = KEY_EVENT;
rec.Event.KeyEvent.bKeyDown = TRUE;
rec.Event.KeyEvent.wVirtualKeyCode = 'C';
rec.Event.KeyEvent.uChar.UnicodeChar = L'c';
rec.Event.KeyEvent.dwControlKeyState = LEFT_CTRL_PRESSED;

// Process the key. OutputType is std::optional<std::wstring>, so the result is a wide string.
auto out = terminalInput.HandleKey(rec);
// out is expected to hold L"\x03" (ETX), the VT representation of Ctrl-C.
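
The \x03 result follows from the conventional control-character mapping, in which Ctrl clears the upper bits of an ASCII letter. The sketch below illustrates that rule together with the ESC prefix for Alt. encodeAlnum is a hypothetical helper and covers only ASCII letters and the '@' to '_' range, not the full logic of the regular encoder.

```cpp
#include <string>

// Hypothetical helper: classic VT encoding for an ASCII character
// combined with Ctrl and/or Alt.
inline std::wstring encodeAlnum(wchar_t ch, bool ctrl, bool alt) {
    std::wstring out;
    if (alt) {
        out += L'\x1b';  // Alt is conventionally sent as an ESC prefix
    }
    const bool upperRange = ch >= L'@' && ch <= L'_';
    const bool lowerLetter = ch >= L'a' && ch <= L'z';
    if (ctrl && (upperRange || lowerLetter)) {
        // Masking with 0x1f maps 'c' (0x63) and 'C' (0x43) to ETX (0x03).
        ch = static_cast<wchar_t>(ch & 0x1f);
    }
    out += ch;
    return out;
}
```

Under these assumptions, encodeAlnum(L'c', true, false) yields the single character 0x03, and adding Alt prepends ESC to it.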

Mouse Input Processing

Entry Point: HandleMouse

Mouse handling resides in mouseInput.cpp and is driven by TerminalInput::HandleMouse:

[[nodiscard]] OutputType TerminalInput::HandleMouse(
    til::point position,
    unsigned int button,
    short modifierKeyState,
    short delta,
    MouseButtonState state);

The method first checks IsTrackingMouseInput() to determine if any mouse-tracking mode is active. It accumulates wheel deltas until WHEEL_DELTA is reached (lines 97-115) and resolves button states using _isButtonMsg, _isHoverMsg, and s_GetPressedButton.
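
Wheel accumulation matters for high-resolution devices that report deltas smaller than one notch. The following sketch shows one way to accumulate them. WheelAccumulator is a hypothetical class, and the constant 120 is the value of WHEEL_DELTA in the Windows headers. The real code may treat direction changes or leftovers differently.

```cpp
// WHEEL_DELTA is defined as 120 in the Windows headers; it is repeated
// here so the sketch compiles without <windows.h>.
constexpr int kWheelDelta = 120;

// Hypothetical accumulator for illustration.
class WheelAccumulator {
public:
    // Adds a raw delta and returns the number of whole notches to report.
    // Positive values scroll away from the user, negative toward the user.
    int add(int delta) {
        accumulated_ += delta;
        const int notches = accumulated_ / kWheelDelta;  // truncates toward zero
        accumulated_ -= notches * kWheelDelta;           // keep the remainder
        return notches;
    }

private:
    int accumulated_ = 0;
};
```

Three deltas of 40 therefore produce a single notch on the third call, and any remainder carries over to the next event.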

Mouse Tracking Modes

TerminalInput supports several independently configurable modes that determine which events generate sequences and how those sequences are encoded:

Mode | Behavior
--- | ---
DefaultMouseTracking | Reports only button press and release events.
ButtonEventMouseTracking | Reports press/release plus motion events while a button is held.
AnyEventMouseTracking | Reports all motion events, regardless of button state.
Utf8MouseEncoding | Uses X10 default encoding with UTF-8-compatible coordinates (up to 2015).
SgrMouseEncoding | Uses SGR format CSI < button ; x ; y M/m (preferred modern encoding).
AlternateScroll | In the alternate screen buffer, wheel events become cursor-up/down sequences.
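
The three tracking modes form a simple hierarchy, which the sketch below expresses as a predicate. MouseEventKind, TrackingModes, and shouldReport are hypothetical names chosen for illustration; the real class derives the event kind from the Windows message and button state.

```cpp
// Hypothetical classification of an incoming mouse event.
enum class MouseEventKind { ButtonPress, ButtonRelease, MotionWithButton, MotionNoButton };

// Hypothetical view of the three tracking mode bits.
struct TrackingModes {
    bool defaultTracking = false;      // DefaultMouseTracking
    bool buttonEventTracking = false;  // ButtonEventMouseTracking
    bool anyEventTracking = false;     // AnyEventMouseTracking
};

// Returns true when the event should produce a VT sequence.
inline bool shouldReport(const TrackingModes& m, MouseEventKind kind) {
    switch (kind) {
    case MouseEventKind::ButtonPress:
    case MouseEventKind::ButtonRelease:
        return m.defaultTracking || m.buttonEventTracking || m.anyEventTracking;
    case MouseEventKind::MotionWithButton:
        return m.buttonEventTracking || m.anyEventTracking;
    case MouseEventKind::MotionNoButton:
        return m.anyEventTracking;
    }
    return false;
}
```

Each mode reports everything the previous one does plus one more class of motion events, which is why enabling AnyEventMouseTracking alone is enough to receive presses and releases in this sketch.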

Sequence Generation

Based on the active encoding mode, one of three generators produces the final sequence:

  • _GenerateDefaultSequence – Classic X10 encoding (lines 400-416).
  • _GenerateUtf8Sequence – UTF-8 variant permitting coordinates beyond 94 (lines 430-456).
  • _GenerateSGRSequence – Emits CSI < button ; x+1 ; y+1 M for press or m for release (lines 71-77).

All generators rely on _windowsButtonToXEncoding and _windowsButtonToSGREncoding (lines 44-86) to translate Win32 constants (WM_LBUTTONDOWN, WM_MOUSEWHEEL) into VT button codes with modifier bits.
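
As a rough illustration of the SGR path, the sketch below builds a button code and formats the sequence. Button, sgrButtonCode, and makeSgrSequence are hypothetical names; the numeric values follow the xterm mouse-tracking convention (buttons 0 to 2, wheel 64 and 65, plus 4 for Shift, 8 for Alt/Meta, 16 for Ctrl, and 32 for motion), and the input position is assumed to be 0-based.

```cpp
#include <string>

// Hypothetical button identifiers using xterm's base button codes.
enum class Button { Left = 0, Middle = 1, Right = 2, WheelUp = 64, WheelDown = 65 };

// Combines the base button code with modifier and motion bits.
inline int sgrButtonCode(Button button, bool shift, bool alt, bool ctrl, bool motion) {
    int code = static_cast<int>(button);
    if (shift)  code += 4;
    if (alt)    code += 8;
    if (ctrl)   code += 16;
    if (motion) code += 32;
    return code;
}

// Formats CSI < button ; x+1 ; y+1 followed by M (press) or m (release).
inline std::wstring makeSgrSequence(int buttonCode, int x, int y, bool pressed) {
    std::wstring seq = L"\x1b[<";
    seq += std::to_wstring(buttonCode);
    seq += L';';
    seq += std::to_wstring(x + 1);  // convert 0-based column to 1-based
    seq += L';';
    seq += std::to_wstring(y + 1);  // convert 0-based row to 1-based
    seq += pressed ? L'M' : L'm';
    return seq;
}
```

Because the release is signalled by the final character rather than a special button value, the SGR form can say which button was released, something the default encoding cannot express.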

Alternate Scroll Behavior

When mouse tracking is disabled but AlternateScroll is enabled, ShouldSendAlternateScroll (lines 88-94) returns true for wheel events while the alternate screen buffer is active. The method _makeAlternateScrollOutput synthesizes cursor movement sequences using the regular keyboard encoder (lines 100-124), effectively converting scroll actions into arrow key inputs.
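
The conversion from wheel notches to arrow keys might look like the sketch below. alternateScrollOutput is a hypothetical function; emitting one arrow sequence per notch and switching to SS3 when cursor key mode is set are assumptions made for illustration, not confirmed details of _makeAlternateScrollOutput.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: turns whole wheel notches into arrow key sequences.
// Positive notches scroll up (cursor up), negative scroll down (cursor down).
inline std::wstring alternateScrollOutput(int notches, bool cursorKeyMode) {
    // Assumption: application cursor key mode uses SS3, otherwise CSI.
    const std::wstring introducer = cursorKeyMode ? L"\x1bO" : L"\x1b[";
    const wchar_t finalChar = notches > 0 ? L'A' : L'B';

    std::wstring out;
    for (int i = 0; i < std::abs(notches); ++i) {
        out += introducer;
        out += finalChar;
    }
    return out;
}
```

This lets full-screen applications such as pagers respond to the wheel without having to enable mouse tracking themselves.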

Mouse Input Example

The following demonstrates SGR mouse encoding for a left-click at cell (10, 5):

// Enable SGR mouse mode
terminalInput.SetInputMode(TerminalInput::Mode::SgrMouseEncoding, true);
terminalInput.SetInputMode(TerminalInput::Mode::ButtonEventMouseTracking, true);

// Simulate a left-button press
til::point pos{9,4};           // Windows coordinates start at 0
unsigned int btn = WM_LBUTTONDOWN;
short mods = 0;                // No Shift/Alt/Ctrl
short delta = 0;
TerminalInput::MouseButtonState ms{ true, false, false };

auto out = terminalInput.HandleMouse(pos, btn, mods, delta, ms);
// out is expected to be L"\x1b[<0;10;5M"

The sequence CSI < 0 ; 10 ; 5 M indicates mouse button 0 pressed at column 10, row 5. VT coordinates are 1-based, so the 0-based position {9, 4} is reported as (10, 5).

Integration with the Windows Host

In production, the Windows Console Host (conhost) bridges the operating system and TerminalInput. The host reads raw events via ReadConsoleInputW, wraps them into INPUT_RECORD structures, and forwards them to a TerminalInput instance defined in src/host/input.cpp and src/host/inputReadHandleData.cpp. The resulting VT sequences are then fed into the console's input buffer for the child process, so applications like vim, tmux, and Emacs receive the same input sequences they would get from a Unix terminal.

Summary

  • TerminalInput serves as the central translator between Windows native input events and VT escape sequences.
  • The class supports both legacy VT encoding and modern protocols including the Kitty Keyboard Protocol and SGR mouse encoding.
  • Mode bits stored in _inputMode dynamically control encoding behavior for keys, mouse tracking, and scroll events.
  • Implementation spans src/terminal/input/terminalInput.cpp for keyboard logic and src/terminal/input/mouseInput.cpp for pointer handling.
  • Integration with the Windows host occurs through src/host/input.cpp, feeding sanitized INPUT_RECORD data into the VT pipeline.

Frequently Asked Questions

What is the Kitty Keyboard Protocol and why does Windows Terminal support it?

The Kitty Keyboard Protocol is a modern extension to VT encoding that uses CSI u sequences to report key events with rich metadata including specific modifier states, key event types (press, release, repeat), and associated text. Windows Terminal supports this protocol via TerminalInput::_encodeKitty to provide enhanced compatibility with modern terminal applications that require precise key event information beyond what legacy VT sequences can convey.

How does TerminalInput distinguish between Left Ctrl and AltGr?

During the sanitization phase in HandleKey, TerminalInput constructs a SanitizedKeyEvent that tracks control key state with timing information. The _trackControlKeyState helper analyzes the dwControlKeyState field alongside timing data to detect AltGr (which manifests as Left Ctrl + Right Alt on Windows) versus a genuine Left Ctrl press, ensuring the correct VT sequence is generated for each scenario.

What is the difference between SGR and UTF-8 mouse encoding?

SGR encoding (SgrMouseEncoding mode) uses the format CSI < button ; x ; y M for presses and CSI < button ; x ; y m for releases, supporting coordinate values up to 32767. UTF-8 encoding (Utf8MouseEncoding mode) uses the classic X10 format but encodes coordinates as UTF-8 characters, allowing values up to 2015 but limiting compatibility with certain legacy applications. SGR is the preferred modern encoding due to its larger coordinate space and unambiguous button release reporting.
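
The 2015 limit comes from the encoding itself: each coordinate is offset by 32 and must fit in a two-byte UTF-8 character, whose maximum value is 0x7FF (2047). The sketch below shows that arithmetic at the byte level. encodeUtf8Coordinate is a hypothetical helper that takes a 1-based coordinate; the real implementation works with wide characters rather than raw bytes.

```cpp
#include <string>

// Hypothetical helper: UTF-8 bytes for one 1-based mouse coordinate,
// or an empty string when the coordinate cannot be represented.
inline std::string encodeUtf8Coordinate(int coord) {
    const int value = coord + 32;  // same offset as the default encoding
    if (coord < 1 || value > 0x7FF) {
        return std::string();      // beyond the two-byte UTF-8 range
    }
    std::string out;
    if (value < 0x80) {
        out += static_cast<char>(value);                  // single-byte form
    } else {
        out += static_cast<char>(0xC0 | (value >> 6));    // lead byte
        out += static_cast<char>(0x80 | (value & 0x3F));  // continuation byte
    }
    return out;
}
```

Coordinate 2015 maps to 2047 and still fits, while 2016 does not. SGR avoids the problem by writing coordinates as decimal digits.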

Can TerminalInput handle focus events?

Yes, TerminalInput includes support for focus tracking via the FocusEvent mode. When enabled, the class generates CSI I sequences when the terminal gains focus and CSI O sequences when focus is lost. These events are processed through the same OutputType pipeline as keyboard and mouse input, allowing applications to detect when the terminal window becomes active or inactive.
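
A minimal sketch of that behavior, assuming a hypothetical free function focusSequence and using an empty string where the real class would return an empty OutputType:

```cpp
#include <string>

// Hypothetical helper: returns the focus report for the given state,
// or an empty string when focus event mode is disabled.
inline std::wstring focusSequence(bool focusEventMode, bool focused) {
    if (!focusEventMode) {
        return std::wstring();
    }
    return focused ? L"\x1b[I" : L"\x1b[O";  // CSI I on focus in, CSI O on focus out
}
```

Editors commonly use these reports to trigger actions such as reloading changed files when the window regains focus.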
