How to Build a Robust Server Monitor Node.js Application for Performance and Error Tracking

February 16, 2026 nodejs/node ↗

The most robust way to monitor a Node.js server is to combine perf_hooks.monitorEventLoopDelay() for performance metrics with process.on('uncaughtException') and diagnostics_channel for error handling, creating a centralized monitoring module that exposes health checks without adding external dependencies.

Setting up a production-grade Node.js server monitor requires more than basic logging. According to the nodejs/node source code, the runtime provides built-in APIs in perf_hooks and process that enable low-overhead performance tracking and reliable error capture. This guide demonstrates how to wire these primitives into a reusable monitor module that tracks event-loop health, garbage collection pauses, and fatal errors while remaining decoupled from external observability stacks.

Core Monitoring Domains

A robust monitoring strategy separates performance telemetry from reliability signals. Node.js exposes distinct APIs for each domain in its core modules.

Performance Monitoring with perf_hooks

The perf_hooks module provides the foundation for low-overhead performance tracking in a server monitor Node.js setup.

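monitorEventLoopDelay(): Returns an IntervalHistogram that samples event-loop delay on a timer at a configurable resolution (10 ms by default) and reports values in nanoseconds. Because the timer runs inside libuv's event loop, any lag in the loop shows up directly in the samples, and the sampling itself costs very little.
PerformanceObserver: Subscribe to the 'gc' and 'function' entry types to capture garbage collection pause times and function execution durations. 'gc' entries need no instrumentation; 'function' entries are emitted only for functions wrapped with performance.timerify().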
Process resource usage: process.memoryUsage(), process.cpuUsage(), and process.resourceUsage() provide heap, external memory, and CPU consumption metrics.
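As a minimal sketch of how these readings might be gathered in one place, the snippet below flattens the histogram and process metrics into plain numbers. The helper name sampleMetrics and its field names are illustrative assumptions, not part of any Node.js API; only the perf_hooks and process calls are built in.

```javascript
import { monitorEventLoopDelay } from 'node:perf_hooks';

const NS_PER_MS = 1e6;
const MIB = 1024 * 1024;

// Histogram values are reported in nanoseconds.
const loopDelay = monitorEventLoopDelay({ resolution: 10 });
loopDelay.enable();

// Hypothetical helper: convert the built-in readings to friendlier units.
export function sampleMetrics() {
  const mem = process.memoryUsage(); // bytes
  const cpu = process.cpuUsage();    // microseconds of CPU time so far
  return {
    loopDelayMaxMs: loopDelay.max / NS_PER_MS,
    loopDelayP99Ms: loopDelay.percentile(99) / NS_PER_MS,
    rssMiB: mem.rss / MIB,
    heapUsedMiB: mem.heapUsed / MIB,
    cpuUserMs: cpu.user / 1000,
    cpuSystemMs: cpu.system / 1000,
    maxRssKiB: process.resourceUsage().maxRSS, // already in kilobytes
  };
}

console.log(sampleMetrics());
```

Converting units at the edge keeps thresholds readable (milliseconds, MiB) while the histogram itself stays in nanoseconds.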

Error Handling with Process Events

Capturing fatal errors requires hooking into process-level events before the application crashes.

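process.on('uncaughtException'): Documented in doc/api/process.md, this event fires when an exception bubbles all the way back to the event loop without being caught. The process is in an undefined state at that point, so use the handler only to log the error, perform synchronous cleanup, and exit.
process.on('unhandledRejection'): Fires when a Promise is rejected and no handler is attached to it within a turn of the event loop. Since Node.js 15, a rejection with no such listener is raised as an uncaught exception by default, so this listener is the place to record failures from async/await code.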
diagnostics_channel: Create custom channels (e.g., 'my-app:errors') to publish error payloads. This decouples error generation from error handling, allowing multiple subscribers (loggers, APM agents) to react independently.
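The publish/subscribe flow described above can be sketched as follows. The channel name 'my-app:errors' comes from the text; the reportError helper and the shape of the payload are assumptions made for illustration.

```javascript
import diagnostics_channel from 'node:diagnostics_channel';

// Keep a reference to the channel so publishing does not repeat the lookup.
const errorChannel = diagnostics_channel.channel('my-app:errors');

// Consumer side: any number of subscribers can attach independently.
const received = [];
function onError(message, name) {
  received.push({ channel: name, type: message.type, text: message.err.message });
}
diagnostics_channel.subscribe('my-app:errors', onError);

// Producer side (hypothetical helper): publish only when someone listens.
export function reportError(type, err) {
  if (errorChannel.hasSubscribers) {
    errorChannel.publish({ type, err });
  }
}

reportError('handled', new Error('database timeout'));
console.log(received);
```

Subscribers run synchronously inside publish(), so a slow subscriber slows the publisher; keeping them small is a reasonable default.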

Building the Centralized Monitor Module

Consolidate these APIs into a single module that initializes once at startup. This pattern, derived from the architecture in nodejs/node, ensures consistent metric collection across your server monitor Node.js application.

// monitor.js – core of the monitoring service
import { monitorEventLoopDelay, PerformanceObserver } from 'node:perf_hooks';
import { createServer } from 'node:http';
import diagnostics_channel from 'node:diagnostics_channel';
import { writeFileSync } from 'node:fs';

// Look up each channel once and reuse it for every publish
const perfChannel = diagnostics_channel.channel('my-app:perf');
const errorChannel = diagnostics_channel.channel('my-app:errors');

// 1️⃣ Event‑loop delay histogram (sampled every 5 ms, values in nanoseconds)
const loopHist = monitorEventLoopDelay({ resolution: 5 });
loopHist.enable();
// max only ever grows, so start a fresh window every minute (an arbitrary
// choice); otherwise one spike would fail the health check until restart.
setInterval(() => loopHist.reset(), 60_000).unref();

// 2️⃣ GC & function observer
// 'gc' entries are delivered to observers only, so keep a small bounded
// buffer of recent pauses for snapshots (100 entries is an arbitrary cap).
const recentGc = [];
const perfObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.entryType === 'gc') {
      // Node.js 16+ reports the GC kind on entry.detail
      const kind = entry.detail?.kind ?? entry.kind;
      recentGc.push({ kind, duration: entry.duration });
      if (recentGc.length > 100) recentGc.shift();
    }
    // Emit to diagnostics channel for external consumers
    perfChannel.publish(entry);
  }
});
perfObserver.observe({ entryTypes: ['gc', 'function'] });

// 3️⃣ Process‑level error handling
process.on('uncaughtException', (err) => {
  errorChannel.publish({ type: 'uncaughtException', err });
  // The process state is unreliable now: snapshot synchronously and exit
  shutdown(1);
});

process.on('unhandledRejection', (reason, promise) => {
  errorChannel.publish({ type: 'unhandledRejection', reason, promise });
});

// Stop collecting, write a final snapshot, and exit with the given code
function shutdown(code) {
  loopHist.disable();
  perfObserver.disconnect();
  try {
    snapshot();
  } catch {
    // Best effort: a failed snapshot must not mask the original error
  }
  process.exit(code);
}

// Helper to dump a snapshot (useful for post‑mortem)
function snapshot() {
  const data = {
    loop: {
      min: loopHist.min,
      max: loopHist.max,
      mean: loopHist.mean,
      p95: loopHist.percentile(95),
    },
    gc: [...recentGc],
    mem: process.memoryUsage(),
    cpu: process.cpuUsage(),
  };
  writeFileSync('monitor-snapshot.json', JSON.stringify(data, null, 2));
}

// 4️⃣ Simple health‑check endpoint
const server = createServer((req, res) => {
  if (req.url === '/healthz') {
    const ok = loopHist.max < 30_000_000 // 30 ms in ns, current window
      && process.memoryUsage().heapUsed < 150 * 1024 * 1024; // 150 MiB
    res.writeHead(ok ? 200 : 503);
    res.end(ok ? 'OK' : 'UNHEALTHY');
  } else {
    res.writeHead(404);
    res.end('Not found');
  }
});
server.listen(3001, () => console.log('Health check on :3001'));

// Export for reuse in the main app
export const monitor = {
  loopHist,
  snapshot,
};

Key Implementation Details

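Low GC impact: The IntervalHistogram records samples in a native histogram (Node.js bundles HdrHistogram for this) rather than allocating a JavaScript object per sample, keeping GC pressure low.
Percentile tracking: Read loopHist.percentile(95) and loopHist.max directly from the histogram to detect tail latency without external aggregation. Both cover everything recorded since the last loopHist.reset(), so reset periodically if you want windowed values.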
Decoupled metrics: diagnostics_channel allows business logic to publish metrics without importing the monitor module directly.
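To illustrate percentile tracking over a window, here is a small sketch that reads a histogram's statistics and then resets it. The drainWindow helper is hypothetical, and createHistogram() stands in for the event-loop histogram so the values are deterministic; both expose the same read-only properties and reset().

```javascript
import { createHistogram } from 'node:perf_hooks';

// Hypothetical helper: read one window's statistics, then start a new window.
export function drainWindow(hist) {
  const stats = {
    count: hist.count,
    max: hist.max,
    mean: hist.mean,
    p95: hist.percentile(95),
  };
  hist.reset();
  return stats;
}

// Stand-in for the event-loop histogram: record the values 1..100.
const demo = createHistogram();
for (let ns = 1; ns <= 100; ns++) {
  demo.record(ns);
}

const first = drainWindow(demo);  // statistics for the 100 recorded values
const second = drainWindow(demo); // empty window after the reset
console.log(first, second);
```

Calling a helper like this on a fixed interval gives per-window tail latency instead of a maximum that covers the whole process lifetime.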

Integration with External Ecosystems

While the core monitor uses only Node.js built-ins, production server monitor Node.js deployments typically integrate with external observability stacks.

| Integration | Implementation Strategy |
| --- | --- |
| Prometheus | Use `prom-client` to register histogram values (`loopHist.mean`, `loopHist.max`) and process metrics (`process_cpu_seconds_total`, `process_resident_memory_bytes`). Expose via `/metrics` endpoint. |
| Structured Logging | Subscribe to diagnostics channels and forward to `pino` or `winston`: `diagnostics_channel.channel('my-app:errors').subscribe((msg) => logger.error(msg));` |
| Process Managers | Handle `SIGTERM` in the monitor to call `loopHist.disable()`, flush snapshots, and exit cleanly for PM2 or systemd. |
| Distributed Tracing | Bind `diagnostics_channel.tracingChannel('my-app:trace')` to OpenTelemetry exporters to correlate custom events with trace spans. |
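The process-manager row can be sketched as below. This is one possible shape, not a prescribed one: handleTermination and its injectable exit parameter are assumptions introduced so the routine can be exercised without killing the process.

```javascript
import { monitorEventLoopDelay } from 'node:perf_hooks';

const loopHist = monitorEventLoopDelay({ resolution: 10 });
loopHist.enable();

// Hypothetical cleanup routine; `exit` defaults to process.exit but can be
// replaced in tests.
export function handleTermination(signal, exit = process.exit) {
  loopHist.disable(); // stop sampling before reading final values
  const summary = { signal, loopMaxMs: loopHist.max / 1e6 };
  console.log(JSON.stringify(summary)); // synchronous "flush" of final state
  exit(0);
  return summary;
}

// PM2 and systemd send SIGTERM by default when stopping a service.
process.once('SIGTERM', () => handleTermination('SIGTERM'));
```

Keeping the handler synchronous avoids losing the final summary if the manager follows up with SIGKILL after its timeout.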

Key Source Files in the Node.js Repository

Understanding the underlying implementation helps debug edge cases in your server monitor Node.js setup.

| File | Purpose |
| --- | --- |
| `doc/api/perf_hooks.md` | Documentation for `monitorEventLoopDelay()`, `PerformanceObserver`, and `eventLoopUtilization` |
| `doc/api/process.md` | Details on `uncaughtException`, `unhandledRejection`, and signal handling |
| `doc/api/diagnostics_channel.md` | API reference for publish/subscribe diagnostic channels |
| `lib/internal/perf_hooks.js` | Core implementation of the performance hooks API |
| `test/parallel/test-event-loop-delay-monitor.js` | Test suite demonstrating event-loop monitor behavior under load |

Summary

Use perf_hooks.monitorEventLoopDelay() to track event-loop latency with minimal overhead; expose percentiles like loopHist.percentile(95) for tail-latency detection.
Capture GC timings via a PerformanceObserver listening to the 'gc' entry type to identify pause sources without manual instrumentation; add the 'function' entry type to time functions wrapped with performance.timerify().
Handle fatal errors by subscribing to process.on('uncaughtException') and process.on('unhandledRejection'), then publishing to diagnostics_channel for decoupled error processing.
Decouple metrics using diagnostics_channel so business logic can emit performance data without importing the monitor module directly.
Expose health checks via a lightweight HTTP endpoint that validates event-loop and memory thresholds for load-balancer integration.

Frequently Asked Questions

How do I monitor event loop lag without adding significant overhead to my Node.js application?

Use the built-in perf_hooks.monitorEventLoopDelay() API. It creates an IntervalHistogram that samples event-loop delay on a timer and records the values, in nanoseconds, in a native histogram, which avoids allocating an object per sample and keeps GC pressure low. Enable it once at startup with loopHist.enable() and read percentiles via loopHist.percentile(95) when needed.

What is the best way to handle uncaught exceptions in a Node.js monitoring setup?

Subscribe to process.on('uncaughtException') and process.on('unhandledRejection') within your monitor module, then publish the error payloads to a diagnostics_channel (e.g., 'my-app:errors'). This decouples error detection from handling, allowing multiple subscribers—such as loggers, alerting systems, or graceful shutdown routines—to react without modifying the core monitor code. Always perform minimal synchronous operations in these handlers to avoid undefined behavior during process termination.

How can I track garbage collection pauses without using external APM agents?

Instantiate a PerformanceObserver from node:perf_hooks and observe the 'gc' entry type. As documented in doc/api/perf_hooks.md, each entry carries the duration of the GC cycle in milliseconds and its kind (minor, major, incremental, or weakcb), which Node.js 16 and later expose as entry.detail.kind. By subscribing to these events and forwarding them via diagnostics_channel, you can aggregate pause times and identify memory pressure without importing native tracing modules.
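A minimal sketch of such an observer follows. The constants come from perf_hooks; the gcKindName and watchGc helpers and the callback payload are illustrative names, not part of the Node.js API.

```javascript
import { PerformanceObserver, constants } from 'node:perf_hooks';

// Map the numeric GC kind to a readable label.
const GC_KIND_NAMES = new Map([
  [constants.NODE_PERFORMANCE_GC_MINOR, 'minor'],
  [constants.NODE_PERFORMANCE_GC_MAJOR, 'major'],
  [constants.NODE_PERFORMANCE_GC_INCREMENTAL, 'incremental'],
  [constants.NODE_PERFORMANCE_GC_WEAKCB, 'weakcb'],
]);

export function gcKindName(kind) {
  return GC_KIND_NAMES.get(kind) ?? `unknown(${kind})`;
}

// Hypothetical helper: calls onPause({ kind, durationMs }) for each GC cycle.
export function watchGc(onPause) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      // Prefer entry.detail (Node.js 16+); fall back for older releases.
      const kind = entry.detail?.kind ?? entry.kind;
      onPause({ kind: gcKindName(kind), durationMs: entry.duration });
    }
  });
  observer.observe({ entryTypes: ['gc'] });
  return observer; // call observer.disconnect() to stop watching
}
```

A caller might pass a function that publishes each pause to a diagnostics channel, or one that adds durationMs to a running total per kind.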

Why should I use diagnostics_channel instead of directly importing a logger in my business logic?

diagnostics_channel provides a publish-subscribe pattern that eliminates tight coupling between your application code and monitoring infrastructure. As documented in doc/api/diagnostics_channel.md, channels act as named buses where producers publish data without knowing the consumers. This allows you to swap logging libraries, add Prometheus exporters, or enable OpenTelemetry tracing without modifying business logic. Publishing also costs very little when no subscribers are registered (check channel.hasSubscribers before building an expensive payload), making it safe to instrument hot paths.
