How to Use Process Substitution and Named Pipes in Bash: A Complete Guide

Question

Master bash process substitution and named pipes. Efficiently stream command data without temporary files and boost your script performance now.

Accepted Answer

Bash provides two powerful mechanisms—process substitution (<(…) and >(…)) and named pipes (mkfifo)—that treat command output as file-like streams without creating temporary files, enabling efficient data feeding between commands that expect filename arguments.

The jlevy/the-art-of-command-line repository describes stream-handling techniques of this kind in its README.md. With these features you can compare a remote file against a local one, split a pipeline across independent processes, and avoid the overhead of temporary disk I/O.

Understanding Process Substitution in Bash

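Process substitution runs a command asynchronously and exposes its input or output as a pathname, so other commands can read from or write to that command as if it were a file. Behind the pathname is a pipe rather than a regular file, so the data can be read only once and the consumer cannot seek within it.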

The <(command) Syntax for Input

When you use <(command), Bash replaces the expression with a path such as /dev/fd/63 that the consuming command can open and read. On systems without /dev/fd, Bash falls back to a temporary named pipe. This is ideal for tools that require a filename argument rather than stdin.

As documented in the repository at line 147-150 of README.md, the canonical pattern compares a local file with a remote version without storing the remote copy:

diff /etc/hosts <(ssh somehost cat /etc/hosts)

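This approach avoids creating a temporary file and works with any command that reads its input sequentially from a file path. Commands that need to seek within the file, or to read it more than once, will not work, because the path refers to a pipe.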
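To see what Bash actually hands to the consuming command, the following minimal sketch prints the expanded argument and checks its file type. The variable names are illustrative, and the exact path varies by system, so treat the printed value as an example rather than a guarantee.

```shell
#!/usr/bin/env bash
# Print the argument that <(...) expands to; the exact path is system-dependent.
path=$(echo <(true))
echo "expanded to: $path"

# While the consuming command runs, the path refers to a pipe, not a regular file.
if [ -p <(true) ]; then
    kind="pipe"
else
    kind="not a pipe"
fi
echo "file type: $kind"
```

Because the path names a pipe, a tool that calls lseek on its input or opens the file twice should be given a real temporary file instead.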

The >(command) Syntax for Output

Process substitution also supports write-only endpoints using >(command), which creates a file-like target that feeds its input to the supplied command. This is useful when a program requires an output file argument but you want to process that data immediately through another tool.
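As a small sketch of the output form, tee below is given a "filename" that is really a pipe into sort. The file names are placeholders, and the wait call assumes Bash 4.4 or newer, where $! refers to the most recent process substitution; on older versions the sorted file may not be complete when the next command runs.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/input.txt"

# tee writes its input to the path produced by >(...), which feeds sort's stdin.
tee >(sort > "$workdir/sorted.txt") < "$workdir/input.txt" > /dev/null

# The >(...) process runs asynchronously; wait for it before reading its output.
# Assumption: Bash 4.4+, where $! holds the PID of the last process substitution.
wait "$!"

sorted=$(cat "$workdir/sorted.txt")
echo "$sorted"
rm -rf "$workdir"
```

The explicit wait matters because the command inside >(…) can still be running after the command that writes to it has returned.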

Working with Named Pipes (FIFOs)

Unlike process substitution, a named pipe is a persistent filesystem entry created with mkfifo. Any process with permission can open it by path, and the kernel passes data from writers to readers through an in-memory buffer; nothing is stored on disk.

Creating and Using FIFOs

Create a named pipe using the mkfifo command:

mkfifo mypipe

Opening a FIFO for writing blocks until another process opens it for reading, and vice versa; once both ends are open, a write blocks only when the pipe buffer is full. This decouples producers and consumers, allowing them to start independently:


# Producer (runs in background)

cat large.log > mypipe &

# Consumer

while IFS= read -r line; do
    echo "Processed: $line"
done < mypipe

Because the FIFO lives in the filesystem, you can initiate producers and consumers from different shells or scripts, making it ideal for complex pipeline splitting.
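A self-contained variant of the producer/consumer pattern is sketched below. It assumes mktemp -d is available and uses illustrative names (fifo_dir, fifo, count); the FIFO is created in a private temporary directory and removed by a trap when the script exits.

```shell
#!/usr/bin/env bash
# Create the FIFO in a private temporary directory and remove it on exit.
fifo_dir=$(mktemp -d)
fifo="$fifo_dir/demo.fifo"
trap 'rm -rf "$fifo_dir"' EXIT
mkfifo "$fifo"

# Producer: opening the FIFO for writing blocks until a reader opens it.
printf 'alpha\nbeta\ngamma\n' > "$fifo" &

# Consumer: reads until the producer closes its end (end of file).
count=0
while IFS= read -r line; do
    count=$((count + 1))
    echo "Processed: $line"
done < "$fifo"

wait    # reap the background producer
echo "lines read: $count"
```

Redirecting the loop from the FIFO, rather than piping into it, keeps the loop in the current shell, so count is still set after the loop ends.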

Multiple Producers and Consumers

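Several processes can open the same FIFO at once, but it is still a single stream: each byte is delivered to exactly one reader, so concurrent readers split the data unpredictably rather than each receiving a copy. On the writing side, a write of up to PIPE_BUF bytes is atomic, so short records from multiple writers do not interleave as long as each record is written in a single call. To give every consumer the full stream, duplicate it with tee into one FIFO per consumer.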

Process Substitution vs Named Pipes: When to Use Each

The choice between these mechanisms depends on your use case.

Process Substitution (<(…)) works best for single-use, temporary file views of command output. It requires no cleanup, the stream can be read only once, and it integrates well with commands that accept filename arguments, such as diff, grep, or awk.

Named Pipes (mkfifo) excel when you need a persistent filesystem entry for decoupled producer/consumer relationships. They can be opened by unrelated processes in different shell sessions and require explicit creation and removal. Readers that share one FIFO split the stream between them, so use one FIFO per consumer when each needs all of the data.

Practical Examples from the Source Code

The following patterns demonstrate robust implementations based on the-art-of-command-line source analysis.

Comparing Files Across Systems

The repository highlights this pattern at README.md lines 147-150 for comparing local and remote configurations:

diff /etc/hosts <(ssh remote.example.com cat /etc/hosts)
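The same pattern can be tried locally without a remote host. The sketch below, which uses throwaway files with made-up contents, compares two files after sorting each one, with both inputs supplied by process substitution.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'b\na\nc\n' > "$workdir/left.txt"
printf 'c\nb\nd\n' > "$workdir/right.txt"

# diff exits with status 1 when its inputs differ, so "|| true" keeps a
# strict-mode script from aborting on an expected difference.
changes=$(diff <(sort "$workdir/left.txt") <(sort "$workdir/right.txt") || true)
echo "$changes"

rm -rf "$workdir"
```

Neither sorted copy is written to disk; diff reads both through pipes.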

Processing Command Output with awk

Use process substitution to feed command output into tools that expect files. This example sums the size column (field 5) of ls -l output:

awk '{sum+=$5} END {print sum}' <(ls -l *.txt)
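Process substitution can also feed a shell loop through a redirection. The sketch below contrasts that with a pipe, using illustrative counter names: with a pipe the loop runs in a subshell under Bash's default settings, so variables set inside it are lost, whereas with < <(…) the loop runs in the current shell.

```shell
#!/usr/bin/env bash
# With a pipe, the loop runs in a subshell and the counter update is lost.
piped=0
printf 'x\ny\nz\n' | while IFS= read -r _; do piped=$((piped + 1)); done

# With process substitution, the loop runs in the current shell.
kept=0
while IFS= read -r _; do kept=$((kept + 1)); done < <(printf 'x\ny\nz\n')

echo "piped=$piped kept=$kept"   # prints: piped=0 kept=3
```

Note the space between the two < characters: the first is an ordinary input redirection and the second begins the process substitution.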

Parallel Processing with FIFOs

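A FIFO delivers each byte to only one reader, so two consumers reading the same FIFO would split the data between them. To give every consumer the full stream, create one FIFO per consumer and duplicate the data with tee:

mkfifo count.fifo sum.fifo

# Consumer 1: count lines

wc -l < count.fifo > lines.txt &

# Consumer 2: compute sum

awk '{s+=$1} END {print s}' < sum.fifo > sum.txt &

# Producer: tee copies the data into both FIFOs

seq 1000 | tee count.fifo > sum.fifo

wait   # Wait for both consumers to finish

rm count.fifo sum.fifo   # Clean up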

Splitting Streams with tee and FIFOs

Route output to multiple destinations using a named pipe:

mkfifo logpipe

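# Background reader: copy everything to the log file and extract ERROR lines

tee /var/log/app.log < logpipe | grep ERROR > errors.log &

# Send application output to the FIFO

myapp > logpipe

wait   # Wait for the background reader to finish

rm logpipe   # Clean up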

Robust Scripting Practices

When implementing these techniques, the repository recommends enabling strict error handling. As noted at line 130 of README.md, always include:

set -euo pipefail

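With these options the script exits when a command fails outside a conditional (-e), treats unset variables as errors (-u), and reports a pipeline as failed if any stage fails rather than only the last one (pipefail). Note that a failure inside <(…) or >(…) is not propagated to the calling command, so these options do not catch it.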
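The effect of pipefail, and its limit with process substitution, can be checked with a short sketch. Each case runs in a child bash so that a failure does not end the demo itself; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Without pipefail, a pipeline's status is that of its last command (true).
bash -c 'false | true'
status_default=$?

# With pipefail, the failing first stage makes the whole pipeline fail.
bash -c 'set -o pipefail; false | true'
status_pipefail=$?

# A failure inside <(...) is not seen by set -e or pipefail: cat still
# succeeds on the empty stream, so the script carries on.
procsub_result=$(bash -c 'set -euo pipefail; cat <(false); echo survived')

echo "default=$status_default pipefail=$status_pipefail procsub=$procsub_result"
```

When the exit status of the substituted command matters, a named pipe with an explicit background job and wait gives you a status to check.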

Summary

Process substitution (<(command) and >(command)) creates temporary anonymous file descriptors perfect for single-use comparisons and feeding data to filename-based tools.
Named pipes (mkfifo) provide persistent filesystem entries that let independently started producers and consumers exchange data in decoupled workflows.
Use process substitution when you need inline, temporary file views without cleanup overhead.
Use FIFOs when you require inter-process communication across different shells or scripts; to give several readers the same stream, duplicate it with tee into one FIFO per reader.
Always enable set -euo pipefail for robust pipeline error handling, as documented in the jlevy/the-art-of-command-line repository.

Frequently Asked Questions

What is the difference between process substitution and a named pipe in Bash?

Process substitution creates a temporary anonymous file descriptor (typically under /dev/fd/) that exists only for the duration of a single command, while a named pipe is a persistent filesystem entry created with mkfifo that remains available until explicitly deleted. Process substitution works best for one-off operations within a single command, whereas a named pipe can be opened by unrelated processes in different shell sessions for as long as it exists.

How do I use process substitution with commands that only accept filenames?

Wrap your command using the <(…) syntax where you would normally provide a filename. For example, diff file1 <(command) treats the output of command as the second file for comparison. Bash manages the underlying file descriptor at a path like /dev/fd/63, so the consuming application can open the stream by name and read it sequentially, as it would a file.

Can multiple processes read from the same process substitution?

No. The path produced by process substitution is backed by a single pipe, so the data can be consumed only once. A named pipe does not change this: readers of the same FIFO split the stream rather than each receiving a copy. To deliver the same data to several processes, duplicate it with tee, for example tee >(consumer1) >(consumer2) > /dev/null, or tee into one FIFO per consumer.

When should I use >(command) instead of <(command)?

Use >(command) when a program writes to a file named in its arguments but you want that data to go straight to another command instead. It is the write-side counterpart of <(command): the program opens what looks like an output file, and everything it writes there becomes the standard input of command.