How to Debug Lock Contention with LiteBox Lock Tracing
LiteBox provides a built-in lock-tracing subsystem that records every lock attempt, acquisition, and release to JSONL files, enabling precise identification of contention hotspots through timestamp analysis.
Debugging lock contention in concurrent Rust applications requires visibility into how synchronization primitives behave at runtime. The microsoft/litebox repository ships with a lock-tracing feature that instruments mutexes and read-write locks to capture detailed timing data, and that compiles away entirely when the feature is disabled. This guide explains how to enable the subsystem, record contention events, and analyze the resulting traces to optimize your application's parallel performance.
Enabling Lock Tracing in LiteBox
Activating the Cargo Feature
To use the tracing infrastructure, enable the lock_tracing feature in your Cargo.toml. This feature gates the entire subsystem to ensure zero runtime overhead when disabled.
[dependencies]
litebox = { git = "https://github.com/microsoft/litebox", features = ["lock_tracing"] }
Initializing the Global LockTracker
The tracing system centers on a singleton LockTracker that maintains per-thread stacks of held locks. According to the source code in litebox/src/litebox.rs, the tracker initializes automatically during LiteBox::new:
// litebox/src/litebox.rs#L62-L66
pub fn new(platform: &dyn Platform) -> Self {
    #[cfg(feature = "lock_tracing")]
    lock_tracing::init();
    // ... remainder of initialization
}
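The per-thread stack idea can be sketched with the standard library alone. This is an illustrative model under stated assumptions, not LiteBox's LockTracker: HELD_LOCKS, on_acquire, on_release, and held_count are hypothetical names, and a lock is identified by a plain integer address.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread stack of the lock addresses currently held.
    static HELD_LOCKS: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

/// Record that the current thread acquired the lock at `addr`.
fn on_acquire(addr: usize) {
    HELD_LOCKS.with(|held| held.borrow_mut().push(addr));
}

/// Record a release. Returns `true` when `addr` was the most recently
/// acquired lock (a "bracketed" unlock) and `false` otherwise.
fn on_release(addr: usize) -> bool {
    HELD_LOCKS.with(|held| {
        let mut held = held.borrow_mut();
        let is_top = held.last() == Some(&addr);
        if is_top {
            held.pop();
        } else {
            // Out-of-order release: drop the matching entry if there is one.
            let pos = held.iter().rposition(|&a| a == addr);
            if let Some(pos) = pos {
                held.remove(pos);
            }
        }
        is_top
    })
}

/// Number of locks the current thread holds right now.
fn held_count() -> usize {
    HELD_LOCKS.with(|held| held.borrow().len())
}
```

A tracker built this way can flag releases that do not mirror the acquisition order, which is the kind of check a strict pairing option would need.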
Recording Lock Events
Starting and Stopping Recording Sessions
The public API exposed in litebox/src/sync/mod.rs provides three core functions to control data capture:
use litebox::sync::{start_recording, stop_recording, flush_to_jsonl};

fn analyze_critical_section() {
    start_recording();
    // Your concurrent code here...
    stop_recording();
    // Write events to stdout or file
    for line in flush_to_jsonl() {
        println!("{}", line);
    }
}
Automatic Recording in Linux Userland
The litebox_runner_linux_userland crate demonstrates a full-process recording strategy. As shown in litebox_runner_linux_userland/src/lib.rs, the runner wraps program execution with automatic trace collection:
// litebox_runner_linux_userland/src/lib.rs#L34-L48
pub fn run_with_tracing<F>(f: F)
where
    F: FnOnce()
{
    start_recording();
    f();
    stop_recording();
    // Automatically writes to /tmp/locks.jsonl
    std::fs::write("/tmp/locks.jsonl", flush_to_jsonl().join("\n")).unwrap();
}
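If you write the trace yourself, you may prefer to surface I/O errors rather than unwrap them. The helper below is a minimal sketch: write_trace is a hypothetical name, and it accepts any slice of strings, such as the lines the article describes flush_to_jsonl as returning.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Write one JSON object per line to `path`, returning any I/O error
/// to the caller instead of panicking.
fn write_trace(path: &Path, lines: &[String]) -> io::Result<()> {
    // Buffer the writes so a large trace does not issue one syscall per line.
    let mut out = BufWriter::new(File::create(path)?);
    for line in lines {
        writeln!(out, "{}", line)?;
    }
    // Flush explicitly so a failed final write is reported, not dropped.
    out.flush()
}
```

Unlike join("\n"), this ends every line with a newline, which is the more common convention for JSON Lines files.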
Detecting Lock Contention
Console Diagnostics for Contended Locks
The tracing subsystem provides compile-time configuration constants in litebox/src/sync/lock_tracing.rs to control console output. When CONFIG_PRINT_CONTENDED_LOCKS is enabled, the system emits "Attempt ... CONTENDED" messages before blocking acquisitions.
The debug_log_println! macro handles these emissions around line 92 of the same file. Additionally, setting CONFIG_PRINT_LOCKS_SLOWER_THAN (default 10ms) triggers "LONG WAIT ..." messages for acquisitions exceeding the threshold.
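To make the two diagnostics concrete, here is a sketch of how a contended attempt and a slow acquisition could be detected around a std::sync::Mutex. It is not LiteBox's implementation: timed_lock and Acquisition are hypothetical names, and the message text only approximates the output described above.

```rust
use std::sync::{Mutex, MutexGuard, TryLockError};
use std::time::{Duration, Instant};

/// Result of one timed acquisition (hypothetical type for this sketch).
struct Acquisition<'a, T> {
    guard: MutexGuard<'a, T>,
    contended: bool,
    waited: Duration,
}

/// Acquire `mutex`, reporting contention and waits above `slow_threshold`.
fn timed_lock<'a, T>(mutex: &'a Mutex<T>, slow_threshold: Duration) -> Acquisition<'a, T> {
    let start = Instant::now();
    let (guard, contended) = match mutex.try_lock() {
        // Fast path: the lock was free, so there was no contention.
        Ok(guard) => (guard, false),
        // Another thread holds the lock: report it, then block.
        Err(TryLockError::WouldBlock) => {
            eprintln!("Attempt CONTENDED");
            (mutex.lock().unwrap_or_else(|e| e.into_inner()), true)
        }
        // A previous holder panicked; recover the guard and carry on.
        Err(TryLockError::Poisoned(e)) => (e.into_inner(), false),
    };
    let waited = start.elapsed();
    if waited > slow_threshold {
        eprintln!("LONG WAIT {:?}", waited);
    }
    Acquisition { guard, contended, waited }
}
```

Calling timed_lock(&m, Duration::from_millis(10)) mirrors the 10ms default mentioned above; the contention message is emitted before the blocking call, so it appears even if the wait never finishes.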
Identifying Slow Lock Acquisitions in JSONL Traces
For quantitative analysis, examine the JSON Lines output produced by flush_to_jsonl(). Each event contains nanosecond-precision timestamps:
{"event_type":"attempt","timestamp_ns":123456789,"lock_addr":"0x7fffd1234abc","lock_type":"Mutex","file":"src/main.rs","line":42}
{"event_type":"acquired","timestamp_ns":123476789,"lock_addr":"0x7fffd1234abc","lock_type":"Mutex","file":"src/main.rs","line":42}
The delta between attempt and acquired timestamps reveals contention duration. In this example, the 20,000 nanosecond (20µs) gap indicates the thread waited for another holder to release the lock.
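The subtraction can be checked mechanically. The sketch below pulls timestamp_ns out of two lines with plain string handling so that it needs no external crates; extract_u64 and wait_ns are hypothetical helpers, and a real JSON parser is the better choice for anything beyond flat lines like the samples.

```rust
/// Pull an unsigned integer field out of a flat JSON object without a JSON
/// library. Adequate for simple lines; it does not handle nesting or
/// whitespace after the colon.
fn extract_u64(line: &str, key: &str) -> Option<u64> {
    let marker = format!("\"{}\":", key);
    let start = line.find(&marker)? + marker.len();
    let digits: String = line[start..]
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

/// Nanoseconds between an `attempt` line and its matching `acquired` line.
/// Returns `None` if a timestamp is missing or the lines are out of order.
fn wait_ns(attempt_line: &str, acquired_line: &str) -> Option<u64> {
    let attempt = extract_u64(attempt_line, "timestamp_ns")?;
    let acquired = extract_u64(acquired_line, "timestamp_ns")?;
    acquired.checked_sub(attempt)
}
```

For the two sample events, wait_ns returns Some(20000), since 123476789 - 123456789 = 20000 ns, or 20µs.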
Analyzing Lock Traces
Understanding JSONL Event Format
The EVENT_RECORDER buffer stores structured events defined in litebox/src/sync/lock_tracing.rs. When flushed, each RecordedEvent serializes to JSON with these fields:
- event_type: Operation classification (attempt, acquired, released, created, destroyed)
- timestamp_ns: Monotonic nanoseconds since tracker initialization
- lock_addr: Memory address identifying the specific lock instance
- lock_type: Synchronization primitive type (Mutex, RwLockRead, RwLockWrite)
- file / line: Source location captured via file!() and line!() macros
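As a way to keep the field list straight, the record can be modeled as a small struct that renders one JSON line. TraceEvent and to_jsonl are hypothetical stand-ins written for this guide, not the RecordedEvent type from LiteBox, and the field types are assumptions based on the sample output.

```rust
/// Hypothetical mirror of the JSONL fields listed above.
struct TraceEvent {
    event_type: &'static str, // "attempt", "acquired", "released", ...
    timestamp_ns: u64,        // monotonic nanoseconds
    lock_addr: u64,           // address of the lock instance
    lock_type: &'static str,  // "Mutex", "RwLockRead", "RwLockWrite"
    file: &'static str,       // source file of the call site
    line: u32,                // source line of the call site
}

impl TraceEvent {
    /// Render the event as a single JSON line. String fields are written
    /// verbatim, so this sketch assumes they need no JSON escaping.
    fn to_jsonl(&self) -> String {
        format!(
            "{{\"event_type\":\"{}\",\"timestamp_ns\":{},\"lock_addr\":\"{:#x}\",\"lock_type\":\"{}\",\"file\":\"{}\",\"line\":{}}}",
            self.event_type,
            self.timestamp_ns,
            self.lock_addr,
            self.lock_type,
            self.file,
            self.line
        )
    }
}
```

Building a TraceEvent with the values from the first sample event reproduces that line exactly, which makes the struct handy for generating test fixtures for an analyzer.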
Calculating Contention from Timestamps
To programmatically identify hotspots, aggregate events by lock_addr and compute wait times:
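use std::collections::HashMap;

// Requires the serde_json crate. Malformed lines panic via unwrap().
fn analyze_contention(jsonl_lines: &[String]) {
    // Most recent unmatched attempt per lock address. The sample events
    // carry no thread ID, so overlapping attempts on the same lock overwrite
    // each other here and the reported wait is an approximation.
    let mut attempts: HashMap<String, u64> = HashMap::new();
    for line in jsonl_lines {
        let event: serde_json::Value = serde_json::from_str(line).unwrap();
        let addr = event["lock_addr"].as_str().unwrap().to_string();
        let ts = event["timestamp_ns"].as_u64().unwrap();
        match event["event_type"].as_str().unwrap() {
            "attempt" => { attempts.insert(addr, ts); }
            "acquired" => {
                if let Some(start) = attempts.remove(&addr) {
                    // saturating_sub guards against out-of-order lines.
                    let wait_ns = ts.saturating_sub(start);
                    if wait_ns > 10_000_000 { // 10ms threshold
                        println!("High contention on {}: {}ms wait", addr, wait_ns / 1_000_000);
                    }
                }
            }
            _ => {}
        }
    }
}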
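A single threshold catches long waits but misses locks that accumulate many short ones. One way to surface those is to total the waits per lock and rank them. The sketch below assumes the (lock_addr, wait_ns) pairs have already been computed; LockStats and rank_by_total_wait are hypothetical names.

```rust
use std::collections::HashMap;

/// Per-lock totals accumulated from individual wait samples.
#[derive(Debug, Default, Clone, PartialEq)]
struct LockStats {
    acquisitions: u64,
    total_wait_ns: u64,
    max_wait_ns: u64,
}

/// Fold (lock_addr, wait_ns) samples into per-lock totals, worst first.
fn rank_by_total_wait(samples: &[(&str, u64)]) -> Vec<(String, LockStats)> {
    let mut per_lock: HashMap<&str, LockStats> = HashMap::new();
    for &(addr, wait_ns) in samples {
        let stats = per_lock.entry(addr).or_default();
        stats.acquisitions += 1;
        stats.total_wait_ns += wait_ns;
        stats.max_wait_ns = stats.max_wait_ns.max(wait_ns);
    }
    let mut ranked: Vec<(String, LockStats)> = per_lock
        .into_iter()
        .map(|(addr, stats)| (addr.to_string(), stats))
        .collect();
    // Largest total wait first; ties are broken by address for stable output.
    ranked.sort_by(|a, b| {
        b.1.total_wait_ns
            .cmp(&a.1.total_wait_ns)
            .then_with(|| a.0.cmp(&b.0))
    });
    ranked
}
```

The first entry of the returned vector is the lock that cost the most total waiting time, which is usually the best place to start restructuring.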
Configuring Trace Verbosity
Compile-Time Configuration Constants
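The tracing behavior is controlled by constants defined in litebox/src/sync/lock_tracing.rs (lines 47-69). Most are on/off flags, while CONFIG_PRINT_LOCKS_SLOWER_THAN holds a threshold. Because they are fixed at compile time, the compiler can drop the code behind any option that is turned off: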
- CONFIG_PRINT_LOCK_ATTEMPTS: Emit console messages for every lock attempt
- CONFIG_PRINT_CONTENDED_LOCKS: Print "CONTENDED" warnings before blocking acquisitions
- CONFIG_PRINT_LOCKS_SLOWER_THAN: Threshold in milliseconds for "LONG WAIT" messages (default 10ms)
- CONFIG_ENABLE_RECORDING: Buffer events in EVENT_RECORDER for JSONL export
- CONFIG_PANIC_ON_NON_BRACKETED_UNLOCK: Enable strict lock/unlock pairing validation
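To customize these values, build against a modified copy of LiteBox: check out the repository, edit the constants in lock_tracing.rs, and point your dependency at that checkout (for example with a path dependency or Cargo's [patch] section). Copying lock_tracing.rs into your own crate is not enough, because a module in a dependent crate does not replace the one compiled into litebox.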
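The general pattern of gating diagnostics on compile-time constants looks roughly like this. The constant names and types below are hypothetical stand-ins chosen for the sketch, not copies of the definitions in lock_tracing.rs.

```rust
// Hypothetical stand-ins for CONFIG_*-style constants.
const PRINT_CONTENDED: bool = true;
const SLOWER_THAN_MS: u64 = 10;

/// Collect the diagnostics that would be printed for one acquisition.
fn diagnostics_for(wait_ms: u64, contended: bool) -> Vec<String> {
    let mut out = Vec::new();
    // With PRINT_CONTENDED set to false this condition is a constant
    // `false`, so an optimizing build can remove the branch entirely.
    if PRINT_CONTENDED && contended {
        out.push("CONTENDED".to_string());
    }
    if wait_ms > SLOWER_THAN_MS {
        out.push(format!("LONG WAIT {}ms", wait_ms));
    }
    out
}
```

Because the flags are constants rather than runtime settings, changing one means recompiling, which is the trade-off for not paying a check on every lock operation.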
Minimal Example: Recording Contention in Rust
The following complete example demonstrates instrumenting a multi-threaded workload to capture contention data:
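# Cargo.toml
[package]
name = "contention_demo"
version = "0.1.0"
edition = "2021"

[dependencies]
litebox = { git = "https://github.com/microsoft/litebox", features = ["lock_tracing"] }
# Needed for LinuxPlatform in src/main.rs; assumed to live in the same repository.
litebox_platform_linux_userland = { git = "https://github.com/microsoft/litebox" }
serde_json = "1.0"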
// src/main.rs
use litebox::{LiteBox, sync::{start_recording, stop_recording, flush_to_jsonl}};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn main() {
    // Initialize platform and LiteBox (creates LockTracker)
    let platform = litebox_platform_linux_userland::LinuxPlatform::new();
    let _lb = LiteBox::new(&platform);

    // Start recording lock events
    start_recording();

    // Create a LiteBox Mutex (instrumented when lock_tracing is enabled)
    let mutex = Arc::new(litebox::sync::Mutex::new(0u64));
    let mut handles = vec![];

    // Spawn threads that contend for the lock
    for _ in 0..4 {
        let mu = Arc::clone(&mutex);
        handles.push(thread::spawn(move || {
            for _ in 0..100 {
                let mut guard = mu.lock();
                *guard += 1;
                // Hold lock briefly to force contention
                thread::sleep(Duration::from_micros(10));
            }
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    // Stop recording and output JSONL
    stop_recording();
    println!("\nLock trace output:");
    for line in flush_to_jsonl() {
        println!("{}", line);
    }
}
When executed with the lock_tracing feature enabled, this program outputs JSON Lines showing every attempt, acquired, and released event. Analyzing the timestamp deltas between attempt and acquired events reveals which threads experienced contention and for precisely how long.
Summary
- LiteBox provides a compile-time optional lock-tracing subsystem activated via the lock_tracing Cargo feature.
- The LockTracker singleton initializes during LiteBox::new and maintains per-thread lock stacks throughout the process lifetime.
- Use start_recording and stop_recording to delimit capture windows, then flush_to_jsonl to export structured event data.
- Contention manifests as gaps between attempt and acquired timestamps in the JSONL output, or as "CONTENDED" console messages when CONFIG_PRINT_CONTENDED_LOCKS is enabled.
- Configure verbosity via compile-time constants in litebox/src/sync/lock_tracing.rs to tailor diagnostic detail against runtime overhead.
Frequently Asked Questions
What is the performance overhead of LiteBox lock tracing?
When the lock_tracing feature is disabled, the subsystem imposes zero runtime cost due to compile-time conditional compilation. When enabled, overhead depends on configuration: buffering events to EVENT_RECORDER adds memory allocation and atomic operations, while console printing via CONFIG_PRINT_LOCK_ATTEMPTS incurs synchronous I/O latency. For production diagnostics, use CONFIG_ENABLE_RECORDING with periodic flush_to_jsonl calls rather than continuous console output to minimize blocking.
How do I enable lock tracing without modifying LiteBox source code?
Enable the feature in your dependent crate's Cargo.toml by specifying features = ["lock_tracing"] for the litebox dependency. That is all you need for the default configuration. The CONFIG_* constants, however, live in litebox/src/sync/lock_tracing.rs and are fixed when litebox is compiled, so changing them requires building against a modified copy of the crate, for example a local checkout referenced through a path dependency or Cargo's [patch] section. A copy of the file placed in your own project does not override the module inside litebox.
Can I use lock tracing in production environments?
Yes, provided you configure it for reliability. Set CONFIG_PANIC_ON_NON_BRACKETED_UNLOCK to false to prevent instrumentation errors from crashing your application. Use CONFIG_ENABLE_RECORDING to buffer events in memory, then periodically call flush_to_jsonl and write the returned lines to disk, ideally from a thread that is not on the hot path. Avoid CONFIG_PRINT_LOCK_ATTEMPTS in high-throughput scenarios because console I/O blocks the calling thread. The Linux userland runner demonstrates the simplest form of this pattern by writing to /tmp/locks.jsonl only after stop_recording.
Where does LiteBox write the lock trace files?
By default, the tracing subsystem returns trace data via flush_to_jsonl(), which yields a Vec<String> of JSON Lines rather than writing directly to disk. The litebox_runner_linux_userland crate demonstrates a typical pattern by explicitly writing to /tmp/locks.jsonl after stopping the recording, as shown in litebox_runner_linux_userland/src/lib.rs. You can direct output to any path by collecting the strings from flush_to_jsonl() and using std::fs::write or your preferred logging infrastructure.