Introduction
StatsD, originally created at Etsy, solves a simple problem: application code should be able to emit a metric with a single line of code, sent over UDP, at negligible performance cost. The daemon collects those fire-and-forget packets, aggregates counters, timers, gauges, and sets, then flushes summaries to a time-series backend.
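To make the "single line" idea concrete, here is a minimal Python sketch of a fire-and-forget emitter. The helper name `send_metric` and the address constant are hypothetical, chosen for illustration; real client libraries add batching, prefixes, and sampling on top of the same idea.

```python
import socket

# Assumed daemon address; 8125 is the conventional StatsD UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)

# One unconnected UDP socket is reused for every metric.
_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def send_metric(line: str, addr=STATSD_ADDR) -> bool:
    """Send one metric line as a single UDP datagram.

    Returns False instead of raising on socket errors, so a metrics
    failure never propagates into application code.
    """
    try:
        _sock.sendto(line.encode("utf-8"), addr)
        return True
    except OSError:
        return False
```

With that helper in place, instrumenting a code path is one call, for example `send_metric("checkout.completed:1|c")`. No reply is awaited, so the caller does not learn whether the daemon received the packet.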
What StatsD Does
- Listens on a UDP (or TCP) port for lightweight metric packets
- Aggregates counters, timers (with optional histogram buckets), gauges, and sets over a configurable flush interval
- Forwards aggregated data to pluggable backends (Graphite, InfluxDB, Datadog, and more)
- Supports metric namespacing and tagging for organized dashboards
- Handles high-throughput metric streams with minimal resource usage
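The metric packets listed above are plain text lines of the form `name:value|type`, optionally followed by `|@sample_rate`. The parser below is a sketch of that format for the four common type codes (`c`, `ms`, `g`, `s`); the `Metric` tuple and `parse_line` function are illustrative names, not part of StatsD itself, and vendor extensions such as tags are deliberately ignored.

```python
from typing import NamedTuple, Optional

# Type codes used on the wire, mapped to readable names.
TYPES = {"c": "counter", "ms": "timer", "g": "gauge", "s": "set"}


class Metric(NamedTuple):
    name: str
    value: str          # kept as text: set members need not be numeric
    kind: str
    sample_rate: float


def parse_line(line: str) -> Optional[Metric]:
    """Parse one metric line; return None if it is malformed."""
    name, sep, rest = line.strip().partition(":")
    if not sep or not name:
        return None
    fields = rest.split("|")
    if len(fields) < 2 or fields[1] not in TYPES:
        return None
    rate = 1.0
    if len(fields) >= 3 and fields[2].startswith("@"):
        try:
            rate = float(fields[2][1:])
        except ValueError:
            return None
    return Metric(name, fields[0], TYPES[fields[1]], rate)
```

For example, `parse_line("page.views:1|c|@0.1")` yields a counter named `page.views` with value `"1"` and sample rate `0.1`, while a line with no colon returns `None`.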
Architecture Overview
StatsD runs as a single Node.js process. Applications send short text-encoded metric lines over UDP. When the flush timer fires (every 10 seconds by default, and configurable), the daemon computes sums, rates, percentiles, and other aggregates, then writes the results to one or more backend modules. Because the protocol is UDP and fire-and-forget, instrumentation adds near-zero latency to application code.
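The collect-then-flush cycle can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the daemon's actual code: the `Aggregator` class, its method names, and the output key suffixes (`.count`, `.rate`, `.mean`, and so on) are hypothetical, and sets and percentiles are left out for brevity.

```python
from collections import defaultdict


class Aggregator:
    """In-memory model of one flush window (illustrative only)."""

    def __init__(self, flush_interval_s: float = 10.0):
        self.flush_interval_s = flush_interval_s
        self.counters = defaultdict(float)
        self.gauges = {}
        self.timers = defaultdict(list)

    def record(self, name, value, kind, sample_rate=1.0):
        if kind == "c":
            # A counter sampled at rate r stands for 1/r real events.
            self.counters[name] += value / sample_rate
        elif kind == "g":
            # A gauge keeps only the most recent value.
            self.gauges[name] = value
        elif kind == "ms":
            self.timers[name].append(value)

    def flush(self):
        """Compute summaries for the window, then reset counters and timers."""
        summary = {}
        for name, total in self.counters.items():
            summary[name + ".count"] = total
            summary[name + ".rate"] = total / self.flush_interval_s
        for name, value in self.gauges.items():
            summary[name] = value
        for name, samples in self.timers.items():
            summary[name + ".count"] = len(samples)
            summary[name + ".mean"] = sum(samples) / len(samples)
            summary[name + ".upper"] = max(samples)
            summary[name + ".lower"] = min(samples)
        self.counters.clear()
        self.timers.clear()
        # Gauges are kept so they report their last value until updated.
        return summary
```

In a real deployment `flush()` would be driven by a timer and its result handed to each backend; here it simply returns a dictionary so the aggregation logic is easy to inspect.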
Self-Hosting & Configuration
- Requires Node.js; start with a config file (a JavaScript object literal, so plain JSON also works) specifying port and backend
- Default port 8125 (UDP); optionally enable TCP for reliable delivery
- Configure flush interval, percentile thresholds, and key prefix in the config file
- Add backend plugins for Graphite, InfluxDB, Elasticsearch, or cloud services
- Run behind a process manager like systemd or PM2 for production use
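As an illustration of the settings above, the following Python sketch renders a small config file as JSON. The key names (`port`, `flushInterval`, `percentThreshold`, `backends`, `graphiteHost`, `graphitePort`) are taken from memory of the project's example config and should be checked against the version you deploy; the host name is a placeholder, and `render_config` and `write_config` are hypothetical helpers.

```python
import json

# Assumed key names; verify against the example config shipped with StatsD.
config = {
    "port": 8125,                          # UDP listen port
    "flushInterval": 10000,                # milliseconds between flushes
    "percentThreshold": [90, 99],          # percentiles computed for timers
    "backends": ["./backends/graphite"],   # backend modules to load
    "graphiteHost": "graphite.example.internal",  # placeholder host
    "graphitePort": 2003,
}


def render_config(cfg: dict) -> str:
    """Serialize the settings as pretty-printed JSON text."""
    return json.dumps(cfg, indent=2)


def write_config(path: str, cfg: dict = config) -> None:
    """Write the rendered config to the given path."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(render_config(cfg))
        fh.write("\n")
```

Generating the file from code is optional; the same settings can be written by hand. The point of the sketch is only to show which knobs the list above refers to.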
Key Features
- Near-zero overhead metric emission via UDP from any language
- Broad client library ecosystem (Python, Ruby, Go, Java, PHP, and more)
- Pluggable backend architecture for any time-series store
- Built-in percentile calculations for timer metrics
- Battle-tested at Etsy and adopted across the industry
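The percentile feature for timers can be approximated as follows: sort the samples in a flush window, keep the lowest `pct` percent, and report the largest kept value and the mean of the kept values. This Python sketch is modeled on that idea and is an assumption about the general approach, not a copy of the daemon's code; `timer_percentile` and its result keys are hypothetical names.

```python
def timer_percentile(samples, pct=90):
    """Summarize the lowest `pct` percent of timer samples.

    Returns None when there is nothing to summarize.
    """
    if not samples or not 0 < pct <= 100:
        return None
    ordered = sorted(samples)
    # Round half up to decide how many samples fall under the threshold.
    keep = int(len(ordered) * pct / 100 + 0.5)
    if keep == 0:
        return None
    kept = ordered[:keep]
    return {
        "count": keep,
        "upper": kept[-1],           # largest value within the threshold
        "mean": sum(kept) / keep,    # mean with the slow tail excluded
    }
```

Trimming the slowest samples this way keeps one pathological request from dominating the reported latency, which is why percentile summaries are often more useful on dashboards than a raw maximum.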
Comparison with Similar Tools
- Prometheus: pull-based model with richer querying; StatsD is push-based and simpler to instrument
- Telegraf: InfluxData's agent; its StatsD input plugin can stand in for the StatsD daemon, alongside many other inputs
- Datadog Agent: commercial agent with a StatsD-compatible interface and a managed backend
- OpenTelemetry Collector: vendor-neutral telemetry pipeline; broader scope but more complex to set up
FAQ
Q: Why UDP instead of TCP? A: UDP means the application never blocks on metric delivery. If the StatsD daemon is down, packets are silently dropped with no impact on the app.
Q: Can StatsD handle high cardinality metrics? A: StatsD works best with bounded metric names. For high-cardinality tagging, consider pairing it with a backend that supports tags natively.
Q: Is the Node.js implementation a bottleneck? A: For most workloads, a single Node.js process handles thousands of metrics per second. Alternative implementations in Go or C exist for extreme throughput.
Q: How do I visualize StatsD data? A: Forward to Graphite and use Grafana, or send to InfluxDB, Datadog, or any supported backend with its own dashboard.