# Netdata: Real-Time Infrastructure Monitoring & Observability

> Netdata is an open-source monitoring agent that collects thousands of metrics per second with zero configuration. It offers interactive dashboards, ML-powered alerts, and instant deployment.

## Quick Use

```bash
# One-line install on any Linux: save the installer as a script file, then run it
curl https://get.netdata.cloud/kickstart.sh > /tmp/netdata-kickstart.sh && sh /tmp/netdata-kickstart.sh

# Or run it in Docker
docker run -d --name netdata \
  -p 19999:19999 \
  -v netdata-config:/etc/netdata \
  -v netdata-lib:/var/lib/netdata \
  -v netdata-cache:/var/cache/netdata \
  -v /:/host/root:ro \
  -v /etc/passwd:/host/etc/passwd:ro \
  -v /etc/group:/host/etc/group:ro \
  -v /etc/localtime:/host/etc/localtime:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  --cap-add SYS_PTRACE \
  --security-opt apparmor=unconfined \
  netdata/netdata
```

Open `http://localhost:19999` to see real-time metrics immediately, with no configuration needed.

## Intro

**Netdata** is an open-source, real-time infrastructure monitoring and observability platform. It auto-discovers and collects thousands of metrics per second from systems, containers, databases, and applications with zero configuration, and presents everything in interactive dashboards that update every second.

With 78.4K+ GitHub stars and a GPL-3.0 license, Netdata is among the most-starred monitoring projects on GitHub. It is valued for instant deployment, zero-config auto-discovery, and per-second granularity by default.

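
The dashboard at `http://localhost:19999` is served by the agent's built-in web server, which also exposes an HTTP API, so the same metrics can be pulled from scripts. The sketch below only builds request URLs for the v1 API. It assumes the default port, and the helper names `netdata_base_url` and `netdata_data_url` are hypothetical, not part of Netdata.

```shell
#!/bin/sh
# Sketch: build URLs for a Netdata agent's v1 HTTP API.
# Assumptions: the agent listens on port 19999 (as in the Quick Use example);
# the function names below are illustrative helpers, not Netdata commands.

NETDATA_HOST="${NETDATA_HOST:-localhost}"
NETDATA_PORT="${NETDATA_PORT:-19999}"

# Print the base URL of the agent.
netdata_base_url() {
    printf 'http://%s:%s' "$NETDATA_HOST" "$NETDATA_PORT"
}

# Print the URL that returns recent samples of one chart as JSON.
# $1 = chart id (for example system.cpu)
# $2 = seconds of history to request (defaults to 60)
netdata_data_url() {
    chart="$1"
    seconds="${2:-60}"
    printf '%s/api/v1/data?chart=%s&after=-%s&format=json' \
        "$(netdata_base_url)" "$chart" "$seconds"
}

# Example calls (uncomment on a host where the agent is running):
# curl -fsS "$(netdata_base_url)/api/v1/info"
# curl -fsS "$(netdata_data_url system.cpu 60)"
```

Keeping URL construction separate from the `curl` call makes the helpers easy to check without a running agent. Chart ids such as `system.cpu` depend on what the agent discovered on the host.
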
## What Netdata Does

- **Auto-Discovery**: Automatically detects and monitors the OS, containers, databases, web servers, and 800+ integrations
- **Per-Second Metrics**: Collects metrics every second by default, versus the 15 to 60 second scrape intervals common in Prometheus setups, for real-time visibility
- **Zero Config**: Install and immediately see 2,000+ metrics, with no YAML files to write and no exporters to deploy
- **ML-Powered Alerts**: Machine learning detects anomalies in every metric automatically
- **Interactive Dashboards**: Drill-down dashboards that update in real time
- **Distributed Architecture**: Deploy agents everywhere and view all data in one place via Netdata Cloud
- **Low Overhead**: ~1% CPU and ~100 MB RAM to monitor an entire server with thousands of metrics
- **Long-Term Storage**: Built-in tiered storage with configurable retention

## Architecture

```
┌────────────────────────────────────────────┐
│       Netdata Agent (on each server)       │
│ ┌────────────┐ ┌──────────┐ ┌────────────┐ │
│ │ Collectors │ │ ML Engine│ │ Dashboard  │ │
│ │(800+ auto) │ │(Anomaly) │ │ (Built-in) │ │
│ └────────────┘ └──────────┘ └────────────┘ │
│ ┌────────────┐ ┌──────────┐ ┌────────────┐ │
│ │    TSDB    │ │  Alerts  │ │ Streaming  │ │
│ │(Per-second)│ │(ML+Rules)│ │ (to Cloud) │ │
│ └────────────┘ └──────────┘ └────────────┘ │
└────────────────────────────────────────────┘
```

## What Gets Monitored Automatically

```
System:
├── CPU (per core, per process, by type)
├── Memory (RAM, swap, page faults, NUMA)
├── Disk I/O (per device, latency, utilization)
├── Network (per interface, packets, errors)
├── Processes (count, states, context switches)
└── Sensors (temperature, fans, voltage)

Containers:
├── Docker (per container CPU, memory, I/O, network)
├── Kubernetes (pods, deployments, nodes)
└── LXC/LXD

Databases:
├── MySQL / MariaDB (queries, connections, replication)
├── PostgreSQL (locks, transactions, WAL)
├── Redis (commands, memory, keys)
├── MongoDB (operations, connections, replication)
└── Elasticsearch (indexing, search, cluster health)

Web Servers:
├── Nginx (requests, connections, status)
├── Apache (workers, requests, bandwidth)
├── HAProxy (frontend/backend, sessions)
└── Traefik (entrypoints, routers)

Applications:
├── Node.js, Python, Go, Java (runtime metrics)
├── RabbitMQ, Kafka (queues, messages)
├── DNS servers (queries, cache)
└── 800+ more integrations
```

## Key Features

### ML-Powered Anomaly Detection

Every metric gets a machine learning model trained on its historical patterns:

```
Normal: CPU usage follows daily work pattern
Alert:  CPU anomaly detected, usage 3σ above predicted

Normal: Disk I/O steady at 50 MB/s
Alert:  Disk I/O anomaly, unusual spike to 500 MB/s at 3am
```

No manual threshold configuration is needed: the models learn what is normal for your infrastructure.

### Composite Charts

Drill down from a high-level overview to individual metrics:

```
Server Overview → CPU → Per Core → Per Process → System Calls
```

### Alert Notifications

```yaml
# Built-in notification channels:
- Email (SMTP)
- Slack
- Discord
- PagerDuty
- Opsgenie
- Telegram
- Microsoft Teams
- Custom webhook
```

### Streaming & Centralization

```
┌──────────┐     ┌──────────┐     ┌──────────┐
│ Agent 1  │────▶│          │     │ Netdata  │
│ (Web)    │     │ Parent   │────▶│ Cloud    │
│          │     │ Agent    │     │ (SaaS)   │
└──────────┘     │          │     └──────────┘
┌──────────┐     │          │
│ Agent 2  │────▶│          │
│ (DB)     │     └──────────┘
└──────────┘
```

Stream metrics from child agents to a parent for centralized dashboarding and long-term storage.

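
Child-to-parent streaming is configured in `stream.conf` on both sides: the child names a destination and an API key, and the parent has a section for that key. The sketch below prints minimal fragments for each role. The parent address, the key value, and the function names are placeholders chosen for illustration, and a real deployment may need further options.

```shell
#!/bin/sh
# Sketch: print minimal stream.conf fragments for a child and a parent agent.
# Assumptions: the API key is a UUID you generate yourself (e.g. with uuidgen);
# the addresses and function names here are illustrative placeholders.

# Child side: where to send metrics and which key to present.
# $1 = parent host:port, $2 = API key (UUID)
render_child_stream_conf() {
    cat <<EOF
[stream]
    enabled = yes
    destination = $1
    api key = $2
EOF
}

# Parent side: accept children that present this key.
# $1 = API key (UUID)
render_parent_stream_conf() {
    cat <<EOF
[$1]
    enabled = yes
EOF
}

# Example (prints to stdout for review; nothing is written to disk):
# key="$(uuidgen)"
# render_child_stream_conf "192.0.2.10:19999" "$key"
# render_parent_stream_conf "$key"
```

The printed fragments would go into `stream.conf` in each agent's configuration directory (commonly `/etc/netdata`), followed by an agent restart. Printing rather than writing the file lets you review the result before touching a live node.
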
## Netdata vs Alternatives

| Feature | Netdata | Prometheus + Grafana | Datadog | Zabbix |
|---------|---------|----------------------|---------|--------|
| Setup time | 1 minute | Hours | Minutes | Hours |
| Configuration | Zero-config | Extensive YAML | Agent config | Templates |
| Granularity | Per-second | Configurable (1-minute default, 15 s common) | 15-second | 1-minute |
| ML alerts | Built-in | No (manual rules) | Yes | No |
| Out-of-box metrics | 2,000+ | Need exporters | Agent-based | Templates |
| Resource usage | ~1% CPU, ~100 MB RAM | Varies | ~1% CPU | Varies |
| Dashboard | Built-in real-time | Grafana (separate) | Built-in | Built-in |

## FAQ

**Q: How do I choose between Netdata and Prometheus + Grafana?**
A: Netdata suits fast deployment and real-time monitoring, and it works out of the box. Prometheus + Grafana suits cases that need long-term metric storage, custom queries (PromQL), and customized dashboards. The two can coexist: exporting Netdata metrics to Prometheus is a common architecture.

**Q: Is Netdata Cloud required?**
A: No. Every Netdata agent ships with a complete local dashboard. Cloud is an optional SaaS service that provides a unified view across multiple servers. Self-hosted users can use a parent agent instead.

**Q: Does it have a large impact on server performance?**
A: Very little. In typical scenarios it uses ~1% CPU and ~100-150 MB of memory. Netdata is written in C and is specifically optimized for low-overhead collection.

## Sources & Credits

- GitHub: [netdata/netdata](https://github.com/netdata/netdata) (78.4K+ ⭐, GPL-3.0)
- Website: [netdata.cloud](https://netdata.cloud)

---

Source: https://tokrepo.com/en/workflows/ca4a8158-34bf-11f1-9bc6-00163e2b0d79
Author: Script Depot