Apache IoTDB — Time-Series Database for Internet of Things

Introduction

Apache IoTDB is a time-series database purpose-built for IoT scenarios. Developed at Tsinghua University and donated to the Apache Software Foundation, it handles high-frequency sensor data ingestion while maintaining efficient compression and query performance on resource-constrained devices.

What Apache IoTDB Does

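Ingests high-frequency data from sensors and devices, with throughput that can reach millions of data points per second on suitable hardware
Encodes time-series data with schemes such as Gorilla, RLE, and dictionary encoding, then applies general-purpose compression on top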
Provides SQL-like query language (IoTDB SQL) for aggregation, downsampling, and filtering
Supports both standalone single-node and distributed multi-node cluster deployments
Integrates with Apache Spark, Flink, and Kafka for stream and batch processing pipelines
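
To illustrate why time-series specific encodings shrink sensor data, here is a minimal Python sketch that delta-encodes regularly spaced timestamps and then run-length encodes the result. It is a toy illustration under simplifying assumptions, not IoTDB's actual encoder; the function names and the sample timestamps are hypothetical.

```python
from itertools import groupby


def delta_encode(values):
    """Keep the first value, then store each difference from the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(v, len(list(run))) for v, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into a flat list."""
    out = []
    for v, n in pairs:
        out.extend([v] * n)
    return out


# Hypothetical sensor sampled every 1000 ms: all deltas are identical,
# so run-length encoding collapses ten timestamps into two pairs.
timestamps = [1_700_000_000_000 + 1000 * i for i in range(10)]
encoded = rle_encode(delta_encode(timestamps))
print(encoded)

# The round trip is lossless.
decoded = delta_decode(rle_decode(encoded))
assert decoded == timestamps
```

Irregular sampling or noisy values produce shorter runs, which is one reason a database might offer several encodings and let each series pick one.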

Architecture Overview

IoTDB uses a tree-based metadata model where devices and measurements form a hierarchical path (e.g., root.factory.device1.temperature). Data is stored in TsFiles, a columnar format optimized for time-series with per-column encoding and compression. Writes are buffered in a MemTable before being flushed to disk. The distributed mode uses ConfigNodes to manage cluster metadata and DataNodes to store and query data, with replicas coordinated through a consensus protocol such as Raft.
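
The write path described above can be sketched in a few lines of Python. This is a toy model, not IoTDB code: the class name `ToyMemTable`, the point-count `flush_threshold`, and the in-memory `flushed_chunks` list (standing in for files on disk) are all assumptions made for illustration. A real engine would flush based on memory usage and write an encoded file format.

```python
from collections import defaultdict


class ToyMemTable:
    """In-memory write buffer keyed by full series path (toy model only)."""

    def __init__(self, flush_threshold=4):
        self.flush_threshold = flush_threshold  # hypothetical: counts points, not bytes
        self.buffer = defaultdict(list)         # path -> [(timestamp, value), ...]
        self.point_count = 0
        self.flushed_chunks = []                # stands in for files on disk

    def write(self, device, measurement, timestamp, value):
        """Buffer one point under the hierarchical path device.measurement."""
        path = f"{device}.{measurement}"
        self.buffer[path].append((timestamp, value))
        self.point_count += 1
        if self.point_count >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Sort each series by time and store it as separate time/value columns."""
        chunk = {}
        for path, points in self.buffer.items():
            points.sort(key=lambda p: p[0])
            times, values = zip(*points)
            chunk[path] = {"times": list(times), "values": list(values)}
        if chunk:
            self.flushed_chunks.append(chunk)
        self.buffer.clear()
        self.point_count = 0


table = ToyMemTable(flush_threshold=4)
table.write("root.factory.device1", "temperature", 3000, 21.9)  # arrives out of order
table.write("root.factory.device1", "temperature", 1000, 21.5)
table.write("root.factory.device1", "humidity", 1000, 40.0)
table.write("root.factory.device1", "temperature", 2000, 21.7)  # fourth point triggers a flush

print(table.flushed_chunks[0]["root.factory.device1.temperature"])
```

Buffering lets out-of-order points be sorted once per flush, so each flushed column is already ordered by time for range scans.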

Self-Hosting & Configuration

Deploy via Docker, download a binary release, or build from source with Maven and JDK 11+
Configure iotdb-system.properties for memory allocation and storage directories
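Tune the WAL buffer size and MemTable size threshold (for example wal_buffer_size and memtable_size_threshold; exact parameter names differ between releases) to balance write throughput and memory usage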
Enable compaction strategies (cross-space, inner-space) for long-term storage efficiency
Use ConfigNode and DataNode scripts separately for distributed cluster setup

Key Features

Tree-structured metadata model maps naturally to IoT device hierarchies
Time-series specific encoding can reach roughly 10-30x compression ratios on sensor data, depending on data type and signal regularity
Aligned timeseries feature stores multiple measurements at the same timestamp efficiently
Trigger and continuous query mechanisms for real-time alerting and downsampling
Edge-cloud sync allows lightweight edge instances to replicate data to central clusters
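
As a rough illustration of the aligned timeseries idea from the list above, the Python sketch below compares two in-memory layouts for measurements that always arrive together. It is a simplified model, not IoTDB's storage format; the function names and the GPS-style sample data are hypothetical.

```python
def non_aligned_layout(rows, measurements):
    """Each measurement keeps its own copy of the timestamp column."""
    return {
        m: {"times": [t for t, _ in rows], "values": [vals[m] for _, vals in rows]}
        for m in measurements
    }


def aligned_layout(rows, measurements):
    """All measurements share a single timestamp column."""
    return {
        "times": [t for t, _ in rows],
        "values": {m: [vals[m] for _, vals in rows] for m in measurements},
    }


# Hypothetical device reporting latitude and longitude at the same instants.
rows = [
    (1000, {"latitude": 40.00, "longitude": 116.30}),
    (2000, {"latitude": 40.01, "longitude": 116.31}),
    (3000, {"latitude": 40.02, "longitude": 116.32}),
]
measurements = ["latitude", "longitude"]

separate = non_aligned_layout(rows, measurements)
shared = aligned_layout(rows, measurements)

# Count how many timestamps each layout has to store.
separate_timestamps = sum(len(col["times"]) for col in separate.values())
shared_timestamps = len(shared["times"])
print(separate_timestamps, shared_timestamps)
```

The saving grows with the number of measurements per device, since the shared layout stores the timestamp column once regardless of how many value columns sit beside it.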

Comparison with Similar Tools

InfluxDB: Popular time-series database queried with InfluxQL or Flux; IoTDB aims for better compression on high-cardinality IoT data
TimescaleDB: PostgreSQL extension for time-series; stronger SQL compatibility, but carries the overhead of running PostgreSQL
TDengine: Also targets IoT with clustering and SQL support; IoTDB integrates more broadly with the Apache ecosystem
QuestDB: Optimized for fast SQL analytics on time-series; less focused on modeling IoT device hierarchies
Prometheus: Pull-based metrics collection; designed for monitoring rather than general IoT data storage

FAQ

Q: What query language does IoTDB use? A: IoTDB uses its own SQL-like dialect (IoTDB SQL) that supports time-range filters, aggregation functions, GROUP BY time intervals, and FILL clauses for missing data interpolation.
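
To make the GROUP BY time-interval and FILL behavior concrete, the following Python sketch emulates what such a query computes on a small in-memory series. It is an illustration only and does not call IoTDB; the helper names and sample points are hypothetical, and the SQL shown in the comment is an approximation whose exact syntax depends on the IoTDB version.

```python
# Roughly the kind of query being emulated (syntax may vary by version):
#   SELECT AVG(temperature) FROM root.factory.device1
#   GROUP BY ([0, 40), 10ms) FILL(PREVIOUS)


def group_by_time_avg(points, start, end, interval):
    """Average values per [window_start, window_start + interval) bucket."""
    buckets = {w: [] for w in range(start, end, interval)}
    for ts, value in points:
        if start <= ts < end:
            window = start + ((ts - start) // interval) * interval
            buckets[window].append(value)
    return [(w, sum(vs) / len(vs) if vs else None) for w, vs in buckets.items()]


def fill_previous(rows):
    """Replace empty buckets with the most recent non-empty value, if any."""
    filled, last = [], None
    for window, value in rows:
        if value is None:
            value = last
        else:
            last = value
        filled.append((window, value))
    return filled


# (timestamp in ms, temperature); the 20-30 ms window has no data.
points = [(0, 20.0), (5, 22.0), (12, 23.0), (31, 25.0)]

rows = group_by_time_avg(points, start=0, end=40, interval=10)
print(rows)     # [(0, 21.0), (10, 23.0), (20, None), (30, 25.0)]

filled = fill_previous(rows)
print(filled)   # [(0, 21.0), (10, 23.0), (20, 23.0), (30, 25.0)]
```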

Q: Can IoTDB run on edge devices? A: Yes. The standalone mode has a small memory footprint and can run on ARM-based devices. Edge instances can sync data to a central cloud cluster.

Q: How does IoTDB handle schema? A: IoTDB supports both schema-on-write (explicitly create timeseries) and auto-creation mode where schemas are inferred from incoming data. The tree model organizes measurements hierarchically.
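
The difference between explicit creation and auto-creation can be sketched with a small tree in Python. This is a toy model rather than IoTDB's schema engine: the class `ToySchemaTree`, the `infer_type` mapping, and the type names it returns are assumptions for illustration, and real inference rules are configurable and may differ.

```python
def infer_type(value):
    """Hypothetical mapping from a Python value to a type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "INT64"
    if isinstance(value, float):
        return "DOUBLE"
    return "TEXT"


class ToySchemaTree:
    """Nested-dict tree of path segments; leaves hold a type name."""

    def __init__(self, auto_create=True):
        self.auto_create = auto_create
        self.root = {}

    def create_timeseries(self, path, data_type):
        """Explicitly register a series, creating parent nodes as needed."""
        node = self.root
        *parents, leaf = path.split(".")
        for segment in parents:
            node = node.setdefault(segment, {})
        node[leaf] = data_type

    def lookup(self, path):
        """Return the type of a registered series, or None if it is unknown."""
        node = self.root
        for segment in path.split("."):
            if not isinstance(node, dict) or segment not in node:
                return None
            node = node[segment]
        return node if isinstance(node, str) else None

    def insert(self, path, value):
        """Accept a write, inferring the schema when auto-creation is enabled."""
        if self.lookup(path) is None:
            if not self.auto_create:
                raise KeyError(f"timeseries {path} does not exist")
            self.create_timeseries(path, infer_type(value))
        return self.lookup(path)


tree = ToySchemaTree(auto_create=True)
tree.create_timeseries("root.factory.device1.temperature", "FLOAT")  # explicit
explicit_type = tree.insert("root.factory.device1.temperature", 21.5)
inferred_type = tree.insert("root.factory.device1.running", True)    # auto-created
print(explicit_type, inferred_type)
```

With `auto_create=False`, the same `insert` call on an unknown path raises an error instead, which mirrors the stricter schema-on-write workflow.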

Q: What is the TsFile format? A: TsFile is IoTDB's native columnar file format designed for time-series data. It stores data sorted by time with per-column encoding, enabling efficient range scans and compression.
