What is OpenTSDB?

OpenTSDB is a distributed, scalable time series database built on top of Apache HBase, designed for storing and querying billions of data points from infrastructure and application metrics.

Is OpenTSDB free to use?

Yes. OpenTSDB is open-source software, and this listing is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install OpenTSDB?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

OpenTSDB — Scalable Time Series Database on HBase

Introduction

OpenTSDB is a time series database that stores metrics data in Apache HBase, enabling it to scale horizontally to handle billions of data points. It was created at StumbleUpon and is widely used for infrastructure monitoring at organizations that already run Hadoop ecosystems.

What OpenTSDB Does

Stores and retrieves time series data at scale using HBase as the storage backend
Supports high write throughput for collecting millions of data points per second
Provides an HTTP API and built-in web UI for querying and visualizing metrics
Implements downsampling, rate calculation, and aggregation at query time
Tags each data point with arbitrary key-value pairs for flexible filtering
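
As a rough illustration of query-time downsampling, rate calculation, aggregation, and tag filtering, the sketch below builds a JSON body for the HTTP query endpoint (/api/query). The helper name build_query, the metric sys.cpu.user, and the host value web01 are illustrative assumptions. The body is only printed, not sent.

```python
import json


def build_query(metric, start, downsample, aggregator="sum", rate=False, tags=None):
    """Build a JSON body for a POST to /api/query (nothing is sent here)."""
    sub_query = {
        "metric": metric,
        "aggregator": aggregator,  # how matching series are merged into one
        "downsample": downsample,  # "<interval>-<function>", e.g. "5m-avg"
        "rate": rate,              # True turns counter values into per-second rates
        "tags": tags or {},        # restrict the query to matching tag values
    }
    return {"start": start, "queries": [sub_query]}


# Hypothetical metric and tag names, chosen only for the example.
body = build_query("sys.cpu.user", "1h-ago", "5m-avg", rate=True, tags={"host": "web01"})
print(json.dumps(body, indent=2))
```

Because all of these operations are applied when the query runs, the same stored points can be read back at different resolutions by changing only the downsample string.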

Architecture Overview

OpenTSDB runs as a stateless daemon (TSD) that accepts data points via HTTP, Telnet, or a collector framework. Each data point consists of a metric name, a timestamp, a value, and one or more tags. The TSD encodes each write into a compact row key optimized for HBase range scans, with metric names, tag keys, and tag values mapped to short UIDs to save storage. Because TSDs hold no state of their own, multiple instances can run in parallel behind a load balancer, sharing the same HBase cluster.
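
The row key idea can be sketched as follows. This is a simplified model, not the TSD's actual code: it assumes the default 3-byte UID width, no salting, and one row per hour. The UID tables are hypothetical stand-ins for the lookups a real TSD performs against HBase.

```python
import struct

UID_WIDTH = 3            # assumed default UID width in bytes
ROW_SPAN_SECONDS = 3600  # assumed: each row holds one hour of data points

# Hypothetical UID assignments, invented for this example.
METRIC_UIDS = {"sys.cpu.user": 1}
TAGK_UIDS = {"host": 1, "cpu": 2}
TAGV_UIDS = {"web01": 1, "0": 2}


def uid_bytes(uid):
    """Encode a UID as a fixed-width big-endian byte string."""
    return uid.to_bytes(UID_WIDTH, "big")


def row_key(metric, timestamp, tags):
    """Build metric UID + hour-aligned base time + sorted tag key/value UID pairs."""
    base_time = timestamp - (timestamp % ROW_SPAN_SECONDS)
    key = uid_bytes(METRIC_UIDS[metric]) + struct.pack(">I", base_time)
    # Sort pairs by tag key UID so the same tag set always yields the same key.
    pairs = sorted((TAGK_UIDS[k], TAGV_UIDS[v]) for k, v in tags.items())
    for tagk, tagv in pairs:
        key += uid_bytes(tagk) + uid_bytes(tagv)
    return key


example_tags = {"host": "web01", "cpu": "0"}
key = row_key("sys.cpu.user", 1356998400 + 125, example_tags)
print(key.hex())
```

Putting the metric UID first and the hour-aligned timestamp second keeps all rows for one metric adjacent and time-ordered, which is what lets a query for a time range become a single contiguous scan.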

Self-Hosting & Configuration

Requires a running Apache HBase cluster (standalone or distributed)
Run the create_table.sh script to initialize the OpenTSDB tables in HBase
Configure opentsdb.conf with the HBase ZooKeeper quorum (tsd.storage.hbase.zk_quorum), the TSD listening port (tsd.network.port), and other storage settings
Deploy one or more TSD instances behind a load balancer for high availability
Use tcollector or other agents to push metrics to a TSD over its Telnet-style or HTTP interface
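
Once a TSD is running, pushing a data point over HTTP might look like the sketch below. It builds a POST request for the /api/put endpoint using only the Python standard library. The base URL http://localhost:4242, the helper name build_put_request, and the metric and tag values are assumptions for illustration. The request is constructed but not sent, so the snippet runs without a live TSD.

```python
import json
import urllib.request


def build_put_request(base_url, metric, timestamp, value, tags):
    """Build (but do not send) a POST to /api/put carrying one data point."""
    if not tags:
        # Every data point needs at least one tag.
        raise ValueError("a data point requires at least one tag")
    point = {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/put",
        data=json.dumps(point).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed TSD address; use whatever host and port your TSD is configured with.
req = build_put_request(
    "http://localhost:4242", "sys.cpu.user", 1356998400, 42.5, {"host": "web01"}
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would submit the point to a running TSD.
```

Behind a load balancer, base_url would point at the balancer rather than at an individual TSD.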

Key Features

Horizontal scalability via HBase with no single point of failure
Tag-based data model allows flexible, ad-hoc queries across dimensions
Query-time downsampling returns coarser series for long time ranges, while raw data stays stored at full resolution
HTTP JSON API for integration with Grafana and custom dashboards
Supports rate calculations, interpolation, and mathematical expressions in queries

Comparison with Similar Tools

Prometheus — pull-based with local storage; OpenTSDB uses push-based writes and HBase for long-term scale
InfluxDB — standalone time series DB; OpenTSDB leverages existing HBase infrastructure
TimescaleDB — PostgreSQL extension; OpenTSDB is purpose-built for Hadoop ecosystems
VictoriaMetrics — Prometheus-compatible; OpenTSDB predates it and integrates with HBase/HDFS
Graphite — whisper-file storage limits scale; OpenTSDB scales horizontally via HBase

FAQ

Q: Does OpenTSDB require Hadoop? A: OpenTSDB requires HBase, which typically runs on HDFS. For small deployments, HBase standalone mode works without a full Hadoop cluster.

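Q: How does OpenTSDB handle high cardinality? A: OpenTSDB maps metric names, tag keys, and tag values to compact UIDs, which keeps row keys small. The UID space is finite, however, and very high cardinality (millions of unique tag combinations) can degrade query performance because a query may have to scan every matching series.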
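
A back-of-the-envelope check can show how quickly tag combinations multiply. The sketch below assumes the default 3-byte UID width. The tag cardinalities are made-up numbers for illustration.

```python
from math import prod

UID_WIDTH_BYTES = 3  # assumed default UID width
max_uids = 2 ** (8 * UID_WIDTH_BYTES) - 1  # distinct UIDs available per UID type


def series_count(tag_cardinalities):
    """Upper bound on distinct series for one metric if every tag combination occurs."""
    return prod(tag_cardinalities.values())


# Hypothetical number of distinct values per tag key.
tags = {"host": 2000, "cpu": 64, "state": 8}

print("max UIDs per type:", max_uids)
print("worst-case series for one metric:", series_count(tags))
```

With these example numbers a single metric could fan out to roughly a million series, which is the range where the slowdown described above tends to appear.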

Q: Can I use OpenTSDB with Grafana? A: Yes. Grafana includes a built-in OpenTSDB data source plugin for querying and visualizing metrics.

Q: What is the maximum retention period? A: OpenTSDB has no built-in retention limit. Data persists in HBase until explicitly deleted or managed via HBase TTL settings on the tables.

Related Assets

TDengine — High-Performance Time-Series Database for IoT

GreptimeDB — Unified Time-Series Database in Rust

Apache IoTDB — Time-Series Database for Internet of Things

QuestDB — High-Performance Time-Series Database with SQL