# Debugging System Issues with strace and ltrace: Essential Linux Techniques

> Debug Linux system issues with strace and ltrace. Monitor kernel calls and library interactions. Use flags like -f and -c to effectively diagnose program failures, hangs, or crashes.

- Repository: [Joshua Levy/the-art-of-command-line](https://github.com/jlevy/the-art-of-command-line)
- Tags: how-to-guide
- Published: 2026-02-24

---

**Use `strace` to monitor kernel-level system calls and `ltrace` to inspect library interactions, leveraging flags like `-f` to follow child processes and `-c` to profile call performance when diagnosing why a Linux program fails, hangs, or crashes.**

The `jlevy/the-art-of-command-line` repository recommends strace and ltrace for debugging system issues. Both are lightweight utilities that intercept a process's runtime behavior and show where it breaks, without requiring source code or recompilation. According to the repository's **README.md** at line 336, these tools expose the system call failures and library function errors behind otherwise mysterious crashes or performance degradation.

## Understanding strace vs. ltrace for System Debugging

**strace** intercepts system calls—the interface between a program and the Linux kernel. Use it to inspect **file I/O**, **network sockets**, **process control**, and permission checks.

**ltrace** intercepts calls to dynamically linked libraries. Use it to trace high-level functions like `printf`, `malloc`, or external library APIs without seeing kernel-level noise.

Choose the tool according to the failure type:

- **Kernel-level problems** (missing files, permission denied, network timeouts) → Use `strace`
- **Library-level problems** (memory leaks, library initialization failures) → Use `ltrace`
- **Performance profiling** → Use either with the **`-c`** flag to summarize call frequency and execution time

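Both utilities work on compiled binaries without source code or recompilation. ltrace needs a dynamically linked executable, since it hooks calls into shared libraries, and both tools must be installed on the host and permitted to use `ptrace` before they can help with production debugging.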

## Critical Flags for Effective Debugging

### Follow Child Processes with -f

Many applications create subprocesses with `fork` (or `clone`) and then run a new program in them with `exec`. Without the **`-f`** flag, you will only trace the parent process and miss critical execution paths in spawned children. As documented in the repository's README at line 336, the "trace-child" technique is essential to avoid missing important calls that often contain the root cause of a failure.
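When `-f` is combined with `-o`, strace prefixes each line of the output file with the PID that made the call, so the log itself shows whether children were captured. The following is a minimal sketch of that check; `pids_in_trace` and the `children.strace` file name are illustrative and not part of strace.

```bash
#!/usr/bin/env bash
# List the distinct PIDs that appear in a trace written with `strace -f -o FILE`.
# With -f and -o together, each line begins with the PID of the traced process.
pids_in_trace() {
    awk '$1 ~ /^[0-9]+$/ { print $1 }' "$1" | sort -un
}

# Example (assumes strace is installed and tracing is permitted):
#   strace -f -o children.strace sh -c 'ls /tmp | wc -l'
#   pids_in_trace children.strace
```

More than one PID in the result means child processes were followed. A single PID for a program known to spawn children suggests `-f` was omitted.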

### Attach to Running Processes with -p

The **`-p <pid>`** flag attaches to an already-running process without restarting it, which is valuable for production services that cannot be taken down. Tracing slows the target while attached, so keep sessions short. Specify the process ID and interception begins immediately.
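Whether `-p` can attach depends on the host: the tool has to be installed, and on kernels with the Yama security module the `ptrace_scope` setting can restrict attaching to processes you did not start. A hedged preflight sketch follows; the function name `check_tracers` is made up for this example.

```bash
#!/usr/bin/env bash
# Report whether strace and ltrace are on PATH, and show the Yama ptrace
# restriction level if the kernel exposes one.
check_tracers() {
    local tool path scope_file=/proc/sys/kernel/yama/ptrace_scope
    for tool in strace ltrace; do
        if path=$(command -v "$tool"); then
            echo "$tool: $path"
        else
            echo "$tool: not installed"
        fi
    done
    if [ -r "$scope_file" ]; then
        # 0 = same-user processes may be attached; 1 = only descendants
        # unless privileged; 2 = admin only; 3 = attaching disabled.
        echo "ptrace_scope: $(cat "$scope_file")"
    else
        echo "ptrace_scope: not available"
    fi
}

check_tracers
```

With a value of 1 or 2, attaching to a process that is not a child of the tracer generally requires root or the `CAP_SYS_PTRACE` capability. With 3, attaching is disabled entirely.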

### Profile Execution with -c

Use **`-c`** to generate a statistical summary instead of a full trace. This mode counts how many times each call was made and the total time spent, helping you identify bottlenecks without drowning in log output.

### Filter Specific Calls with -e

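Reduce noise by filtering for specific calls using **`-e <expr>`**. For example, trace only file operations with `strace -e trace=open,openat,read,write`, or monitor memory management with `ltrace -e malloc+free`. Current ltrace releases join function names with `+`; some older releases used a comma-separated list instead.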

## Practical Debugging Scenarios and Code Examples

### Trace a New Command from Launch

Capture every system call from program startup, including timestamps and child processes:

```bash
# Trace every system call made by `ls -l /tmp`
strace -o ls.strace -tt -f ls -l /tmp

# View the detailed log with microsecond timestamps
less ls.strace
```

The **`-tt`** flag prints timestamps with microsecond precision, while **`-f`** ensures you capture any subprocesses spawned by the command.
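Once a log such as `ls.strace` exists, failed system calls are usually the first thing to look for. strace prints a failing call as `= -1` followed by the errno name, so a short filter can pull those lines out. This is a sketch only; `failed_calls` and `errno_counts` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Print the lines of a strace log whose call returned -1 with an errno name,
# for example: openat(...) = -1 ENOENT (No such file or directory)
failed_calls() {
    grep -E '= -1 E[A-Z0-9]+' "$1"
}

# Tally failures by errno name, most frequent first.
errno_counts() {
    failed_calls "$1" | grep -oE '= -1 E[A-Z0-9]+' | awk '{ print $3 }' \
        | sort | uniq -c | sort -rn
}

# Example: failed_calls ls.strace; errno_counts ls.strace
```

Not every failure is a bug: `ENOENT` results while the loader probes library search paths are routine, so read the failing path before drawing conclusions.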

### Diagnose Hanging Production Processes

Attach to a live process to identify where it is stuck:

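```bash
# Find the PID of the problematic process (-o picks the oldest match if several)
pid=$(pgrep -o -f my_app)

# Attach and capture a short performance snapshot (5 seconds)
strace -p "$pid" -c -o snapshot.txt &
sleep 5
kill $!   # stop strace; it detaches and writes the summary
wait      # let strace finish writing before reading the file

cat snapshot.txt
```

The **`-c`** summary shows which system calls dominate the time spent in the kernel, which points to heavy I/O or a loop that keeps issuing the same calls. If the summary comes back nearly empty, the process is probably blocked in a single call or spinning in user space; attaching with plain `strace -p "$pid"` prints the call it is currently waiting in.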

### Isolate Memory Allocation Issues

Use `ltrace` to monitor specific library functions without kernel noise:

```bash
# Watch only memory-allocation functions in a program
# (current ltrace joins names with +; some older releases used commas)
ltrace -e malloc+free -f ./my_program 2> malloc.log
```

This filters the trace to heap allocation calls only, which helps spot allocations that are never freed or allocation-heavy code paths that cause slowdowns.
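A rough first check on such a log is to compare how many `malloc` and `free` calls were recorded. The sketch below assumes `malloc.log` was produced by the command above; `alloc_balance` is an illustrative name. The count ignores `calloc`, `realloc`, and memory that is legitimately still in use at exit, so an imbalance is a hint and not proof of a leak.

```bash
#!/usr/bin/env bash
# Count malloc and free calls recorded in an ltrace log and print the difference.
alloc_balance() {
    local log=$1 mallocs frees
    # Match the function name only when it is not part of a longer identifier.
    mallocs=$(grep -cE '(^|[^[:alnum:]_])malloc\(' "$log")
    frees=$(grep -cE '(^|[^[:alnum:]_])free\(' "$log")
    echo "malloc=$mallocs free=$frees outstanding=$((mallocs - frees))"
}

# Example: alloc_balance malloc.log
```

For a confirmed leak, a dedicated memory checker gives per-allocation detail that a call count cannot.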

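### Run strace and ltrace Side by Side

A process can have only one tracer attached at a time, so the two tools cannot observe the same run. To compare kernel and library behavior, trace two separate runs of the program and write each to its own output:

```bash
# Trace two separate runs of the same program (output to separate files)
strace -ff -o syscalls.log ./my_program
ltrace -f -o libcalls.log ./my_program
```

With strace, **`-ff`** combined with `-o` writes one file per traced process (`syscalls.log.<pid>`), which prevents log interleaving. ltrace follows children with `-f` and writes everything to the single `-o` file. When both views are needed from one run, `ltrace -S` prints system calls alongside library calls.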

## Identifying Performance Bottlenecks

Turn the statistical output into a ranked list using standard Unix text processing:

```bash
# Rank system calls by how often they were made during a 5-second window
timeout 5 strace -c -p "$pid" -o call_stats.txt
awk '$4 ~ /^[0-9]+$/ && $NF != "total" {print $4 "\t" $NF}' call_stats.txt | sort -nr | head
```

This pipeline extracts the `calls` column and the syscall name from the summary, sorts numerically in descending order, and displays the most frequent calls. High counts on `read` or `write` can indicate an I/O-bound process, and heavy `futex` traffic can point to lock contention.

## Summary

- **strace** reveals kernel-level system calls while **ltrace** exposes library function calls, providing complementary visibility into program execution when debugging system issues.
- Critical flags include **`-f`** (follow children), **`-p`** (attach to PID), **`-c`** (profile performance), and **`-e`** (filter specific calls).
- The `jlevy/the-art-of-command-line` README at line 336 documents these techniques as essential for diagnosing crashes, hangs, and I/O failures in production environments.
- Both tools work without source code or recompilation, which suits emergency debugging on any Linux system where they are installed and tracing is permitted.

## Frequently Asked Questions

### What is the difference between strace and ltrace?

**strace** intercepts and records system calls made to the Linux kernel, such as file operations, network requests, and process management. **ltrace** intercepts calls to dynamically linked libraries, such as `malloc`, `printf`, or external shared objects, revealing higher-level program behavior without kernel-level detail.

### How do I debug a program that spawns child processes?

Use the **`-f`** flag with either tool to follow processes created by `fork` or `clone`, including the programs they then `exec`. Without this flag, you will only trace the parent process and miss critical execution paths in spawned children, potentially hiding the source of a crash or hang in a subprocess.

### Can I use strace or ltrace on a process that is already running?

Yes. Use **`-p <pid>`** to attach to an existing process without restarting it. This is essential for debugging production services. You can combine this with **`-c`** to generate a performance summary over a specific time window, then detach without interrupting the service.

### When should I use the -c flag versus full tracing?

Use **`-c`** when you suspect a performance bottleneck and need to identify which calls consume the most time. Use full tracing (without `-c`) when you need to see the exact sequence of calls, arguments, and return values to diagnose logical errors, missing files, or permission failures.