What Jaeger Does
- Trace collection — receive spans via OTLP, Jaeger protocol, Zipkin
- Storage backends — Elasticsearch, Cassandra, Badger, in-memory; Kafka as an intermediate buffer in front of storage
- Query API — search traces by service, operation, tags, duration
- UI — waterfall view of spans, service dependencies graph
- Sampling — adaptive, probabilistic, rate-limited
- Service Performance Monitoring (SPM) — RED metrics from traces
- Critical Path — highlight bottleneck spans
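As a rough illustration of the Query API bullet above, the sketch below builds a search URL with filters for service, operation, tags, and duration. It assumes the query service's HTTP JSON endpoint `/api/traces` on port 16686, which is the endpoint the Jaeger UI uses; it is an internal API and may change between versions. The parameter names (`service`, `operation`, `tags`, `minDuration`, `maxDuration`, `limit`) and the helper `build_trace_search_url` are assumptions for illustration, so check them against your Jaeger version.

```python
import json
from urllib.parse import urlencode


def build_trace_search_url(base_url, service, operation=None, tags=None,
                           min_duration=None, max_duration=None, limit=20):
    """Build a trace search URL (hypothetical helper, not part of Jaeger)."""
    params = {"service": service, "limit": limit}
    if operation:
        params["operation"] = operation
    if tags:
        # Assumption: tags are sent as one JSON object in a single parameter.
        params["tags"] = json.dumps(tags, sort_keys=True)
    if min_duration:
        params["minDuration"] = min_duration  # e.g. "500ms"
    if max_duration:
        params["maxDuration"] = max_duration  # e.g. "2s"
    # urlencode percent-escapes spaces, slashes, and the JSON punctuation.
    return f"{base_url.rstrip('/')}/api/traces?{urlencode(params)}"


url = build_trace_search_url(
    "http://localhost:16686",
    service="checkout",
    operation="POST /pay",
    tags={"error": "true"},
    min_duration="500ms",
)
```

The resulting URL can be fetched with any HTTP client; the response is JSON containing the matching traces.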
Architecture
Jaeger components:
- Agent (deprecated, use OTLP) — local daemon
- Collector — receives spans, writes to storage
- Query — serves UI and API, reads from storage
- Ingester — for Kafka async pipeline
- All-in-one — dev container bundling everything
- OpenTelemetry Collector — the preferred modern ingestion path
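To make the OTLP ingestion path above more concrete, here is a minimal sketch that builds an OTLP/HTTP JSON trace payload by hand using only the standard library. In practice an OpenTelemetry SDK would produce this for you. The function name `make_otlp_span_payload`, the scope name, and the sample service and span names are hypothetical.

```python
import json
import secrets
import time


def make_otlp_span_payload(service_name, span_name, duration_ms):
    """Return a dict shaped like an OTLP/HTTP JSON trace export request."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    span = {
        "traceId": secrets.token_hex(16),  # 16 random bytes -> 32 hex chars
        "spanId": secrets.token_hex(8),    # 8 random bytes -> 16 hex chars
        "name": span_name,
        "kind": 2,  # SPAN_KIND_SERVER in the OTLP enum
        # 64-bit integers are written as strings in OTLP JSON.
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
    }
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name",
                 "value": {"stringValue": service_name}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "manual-example"},  # hypothetical scope
                "spans": [span],
            }],
        }]
    }


payload = make_otlp_span_payload("checkout", "GET /cart", duration_ms=25)
body = json.dumps(payload)
```

Assuming a collector with the OTLP HTTP receiver enabled on its default port, `body` would be sent as a POST to `http://localhost:4318/v1/traces` with `Content-Type: application/json`.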
Self-Hosting
# Production deployment
components:
- Collector (multi-replica, behind LB)
- Elasticsearch cluster (storage)
- Query service (multi-replica)
- OTel Collector (ingestion)

Kubernetes: use the official Jaeger Operator or Helm charts.
Key Features
- OpenTelemetry native ingestion
- Multiple storage backends
- Service dependency graph
- Adaptive sampling
- Trace search and filtering
- RED metrics (SPM)
- Zipkin compatibility
- gRPC and HTTP APIs
- Kubernetes operator
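The RED metrics listed above (Rate, Errors, Duration) can be illustrated with a small calculation over finished spans. This is only a sketch of the arithmetic, not how SPM is implemented; in a real deployment these metrics are typically derived from spans in the OpenTelemetry Collector and stored in a metrics backend. The `Span` class, its fields, and the nearest-rank p95 are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Span:
    operation: str
    duration_ms: float
    error: bool


def red_metrics(spans, window_seconds):
    """Compute request rate, error ratio, and p95 latency per operation."""
    by_operation = {}
    for span in spans:
        by_operation.setdefault(span.operation, []).append(span)

    result = {}
    for operation, group in by_operation.items():
        durations = sorted(s.duration_ms for s in group)
        # Nearest-rank percentile: smallest value with >= 95% of samples at or below it.
        rank = max(1, math.ceil(0.95 * len(durations)))
        result[operation] = {
            "rate_per_s": len(group) / window_seconds,
            "error_ratio": sum(s.error for s in group) / len(group),
            "p95_ms": durations[rank - 1],
        }
    return result


# Ten spans with durations 10..100 ms; the two slowest are marked as errors.
spans = [Span("GET /cart", d, error=(d >= 90)) for d in range(10, 101, 10)]
metrics = red_metrics(spans, window_seconds=5)
```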
Comparison
| Tracing tool | Storage | OTel support | Metrics |
|---|---|---|---|
| Jaeger | ES, Cassandra, Badger (Kafka as buffer) | Yes | SPM |
| Tempo | Object storage | Yes | Via Grafana |
| Zipkin | ES, MySQL, Cassandra | Yes | Partial |
| Honeycomb | Managed | Yes | Yes |
| Lightstep | Managed | Yes | Yes |
| OpenTelemetry Collector | Any backend | Native | Yes |
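FAQ
Q: Jaeger vs Tempo? A: Jaeger has its own UI and a mature ecosystem. Tempo stores traces in object storage (cheap), is viewed through Grafana, and integrates better with Loki and Prometheus.
Q: Which sampling strategy? A: Do not sample 100% in production (it wastes storage). Use 1% probabilistic sampling plus tag-based force-keep, so that errors and slow requests are always retained.
Q: How does Jaeger relate to OpenTelemetry? A: OTel is the standard (API + SDK + Collector); Jaeger is the backend. New projects should use OTel for collection and Jaeger for storage and querying.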
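The sampling answer in the FAQ above (1% probabilistic plus force-keep for errors and slow requests) can be sketched as a single decision function. This is a minimal illustration, not Jaeger's or the OTel Collector's implementation. The function name `should_keep`, the 1000 ms slow threshold, and the choice of hashing the low 64 bits of the trace ID are all assumptions. Because the decision depends on errors and total duration, it can only be made after the trace has finished, which means tail-based sampling rather than head-based sampling in the SDK.

```python
def should_keep(trace_id_hex, has_error, duration_ms,
                sample_rate=0.01, slow_threshold_ms=1000):
    """Decide whether to retain a finished trace."""
    # Force-keep: errors and slow requests are always retained.
    if has_error or duration_ms >= slow_threshold_ms:
        return True
    # Deterministic probabilistic sampling: map the low 64 bits of the
    # trace ID onto [0, 1] so every component reaches the same decision.
    bucket = int(trace_id_hex[-16:], 16) / 2**64
    return bucket < sample_rate


kept_fast_ok = should_keep("f" * 32, has_error=False, duration_ms=5)
kept_error = should_keep("f" * 32, has_error=True, duration_ms=5)
kept_slow = should_keep("f" * 32, has_error=False, duration_ms=2000)
```

Deriving the decision from the trace ID, instead of calling a random number generator, keeps the outcome consistent if several collectors evaluate the same trace.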
Sources and Credits
- Docs: https://www.jaegertracing.io/docs
- GitHub: https://github.com/jaegertracing/jaeger
- License: Apache 2.0