# Grafana Loki — Prometheus-Inspired Log Aggregation System

> Loki is a horizontally scalable, multi-tenant log aggregation system by Grafana Labs. Unlike other log systems, Loki indexes metadata about logs, not log content itself.

## Quick Use

Save the following as a script file and run it:

```bash
# Docker run
docker run -d --name loki -p 3100:3100 grafana/loki:latest

# Docker run promtail (log collector)
docker run -d --name promtail \
  -v /var/log:/var/log:ro \
  -v "$(pwd)/promtail-config.yml:/etc/promtail/config.yml" \
  grafana/promtail:latest
```

Then open Grafana at `http://localhost:3000` and add Loki as a data source, using the address where Grafana can reach Loki (port 3100) as the data source URL.

## Intro

**Loki** is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. Built by Grafana Labs, Loki takes a unique approach: instead of indexing the full text of logs, it only indexes labels (metadata), making it much more cost-effective and efficient than traditional log systems like ELK.

With 28K+ GitHub stars and an AGPL-3.0 license, Loki has become a popular open-source log aggregation solution, especially for teams already using Prometheus and Grafana for metrics.

## What Loki Does

- **Log Aggregation**: Collect and store logs from all your services in one place
- **LogQL**: Prometheus-inspired query language for searching and aggregating logs
- **Label-Based Indexing**: Index only metadata (labels), not log content, which keeps the index small and storage cheap (the figure usually quoted is around 10x cheaper)
- **Grafana Integration**: Native integration with Grafana for visualization
- **Multi-Tenancy**: Separate logs per tenant/team/environment
- **Horizontal Scaling**: Scale read/write paths independently
- **Cloud Native**: Designed for Kubernetes and cloud environments
- **Compression**: Gzip/LZ4/Snappy log compression
- **Retention**: Configurable retention periods per label stream

## Architecture

```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Applications│────▶│  Promtail /  │────▶│  Loki        │
│  Containers  │     │  Fluent Bit /│     │  Distributor │
│  Systemd     │     │  Vector /    │     │  Ingester    │
└──────────────┘     │  OTel        │     │  Querier     │
                     └──────────────┘     └──────┬───────┘
                                                 │
                                          ┌──────┴───────┐
                                          │  Object      │
                                          │  Storage     │
                                          │  (S3/GCS/    │
                                          │  MinIO/local)│
                                          └──────────────┘
```
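
The arrow from the collectors into Loki in the diagram above is an HTTP push to `/loki/api/v1/push`, the same endpoint the Promtail config later in this document points at. The following is a minimal sketch of what such a request body looks like, assuming the JSON push format (streams keyed by labels, entries as nanosecond timestamp plus line). The helper names `build_push_payload` and `build_headers` are hypothetical and only build data; they send nothing over the network.

```python
import json
import time


def build_push_payload(labels, lines, ts_ns=None):
    """Build a JSON body for POST /loki/api/v1/push.

    labels: dict of label name -> value identifying one stream.
    lines:  list of log lines; this sketch stamps them all with one timestamp.
    ts_ns:  Unix time in nanoseconds; defaults to the current time.
    """
    if ts_ns is None:
        ts_ns = time.time_ns()
    # Each entry is ["<nanosecond timestamp as a string>", "<log line>"].
    values = [[str(ts_ns), line] for line in lines]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def build_headers(tenant=None):
    """HTTP headers for the push; X-Scope-OrgID selects the tenant in multi-tenant mode."""
    headers = {"Content-Type": "application/json"}
    if tenant is not None:
        headers["X-Scope-OrgID"] = tenant
    return headers


payload = build_push_payload({"job": "demo", "host": "myserver"}, ["hello loki"])
print(json.dumps(payload, indent=2))
print(build_headers(tenant="team-a"))
```

In practice a collector such as Promtail batches and retries these requests for you; the sketch only shows the shape of the data that reaches the distributor.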

## Self-Hosting

### Docker Compose (Simple Setup)

```yaml
services:
  loki:
    image: grafana/loki:latest
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - loki-data:/loki

  promtail:
    image: grafana/promtail:latest
    volumes:
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
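      - ./promtail-config.yml:/etc/promtail/config.yml
      # Required by the docker_sd_configs job in promtail-config.yml
      - /var/run/docker.sock:/var/run/docker.sock:ro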
    command: -config.file=/etc/promtail/config.yml

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_AUTH_ANONYMOUS_ENABLED: "true"
      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin

volumes:
  loki-data:
```

### Promtail Config

```yaml
# promtail-config.yml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          host: myserver
          __path__: /var/log/*log

  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      # Docker reports container names with a leading slash ("/nginx"); strip it
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: container
```
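
To get a feel for which files the `__path__: /var/log/*log` glob in the `system` job would pick up, here is a rough Python approximation. It assumes the simple case where `*` matches within a single directory and does not cross `/`. Promtail's own glob matching supports more than this, so treat `matches_path_glob` (a hypothetical helper) as an illustration only.

```python
import fnmatch
import posixpath


def matches_path_glob(path, pattern="/var/log/*log"):
    """Approximate a single-directory glob: '*' does not cross '/'."""
    pat_dir, pat_name = posixpath.split(pattern)
    file_dir, file_name = posixpath.split(path)
    # The file must sit directly in the pattern's directory and its
    # name must match the final pattern component.
    return file_dir == pat_dir and fnmatch.fnmatchcase(file_name, pat_name)


candidates = [
    "/var/log/syslog",
    "/var/log/auth.log",
    "/var/log/dmesg",
    "/var/log/nginx/access.log",
]
for path in candidates:
    print(path, matches_path_glob(path))
```

Under this approximation, files in subdirectories such as `/var/log/nginx/` are not matched, so they would need their own `__path__` entry.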

## LogQL (Query Language)

### Basic Queries

```logql
# All logs from nginx
{container="nginx"}

# Logs containing "error" (case insensitive)
{container="nginx"} |~ "(?i)error"

# JSON logs with specific field
{job="api"} | json | level="error"

# Exclude healthchecks
{container="nginx"} != "/health"

# Multiple filters
{namespace="production", app="web"} |= "500" != "healthcheck"
```
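
Outside Grafana, the same queries can be sent to Loki's HTTP API. The sketch below builds a URL for the `/loki/api/v1/query_range` endpoint, assuming Loki listens on `localhost:3100` as in the Docker examples. `query_range_url` is a hypothetical helper name, and the code only constructs the URL rather than calling the server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def query_range_url(base_url, logql, start_ns, end_ns, limit=100, direction="backward"):
    """Build a query_range URL; start and end are Unix times in nanoseconds."""
    params = {
        "query": logql,
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
        "direction": direction,
    }
    # urlencode escapes the braces, quotes and pipes that LogQL uses.
    return base_url.rstrip("/") + "/loki/api/v1/query_range?" + urlencode(params)


logql = '{container="nginx"} |~ "(?i)error"'
url = query_range_url(
    "http://localhost:3100", logql, 1700000000000000000, 1700000300000000000
)
print(url)
# Round trip: decoding the URL yields the original query string.
print(parse_qs(urlsplit(url).query)["query"][0])
```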

### Metric Queries

```logql
# Count errors per minute
count_over_time({container="api"} |= "ERROR" [1m])

# Rate of requests
rate({container="nginx"}[5m])

# Error rate percentage
sum(rate({app="api"} |= "ERROR" [5m]))
  / sum(rate({app="api"}[5m])) * 100

# Top 10 hosts by log volume
topk(10, sum(rate({job="varlogs"}[5m])) by (host))
```
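
The metric queries above reduce to simple arithmetic over a time window: `count_over_time` counts matching lines, and `rate` divides that count by the window length in seconds. The following sketch emulates this on a small in-memory list. The sample entries are made up, and the exact window boundary handling (start exclusive, end inclusive) is an assumption of this sketch rather than a statement about Loki's internals.

```python
def count_over_time(entries, needle, range_end, range_seconds):
    """Count lines containing `needle` in the window (range_end - range_seconds, range_end]."""
    start = range_end - range_seconds
    return sum(1 for ts, line in entries if start < ts <= range_end and needle in line)


def rate(entries, needle, range_end, range_seconds):
    """Matching lines per second over the window."""
    return count_over_time(entries, needle, range_end, range_seconds) / range_seconds


def error_rate_percent(entries, range_end, range_seconds):
    """Share of lines containing "ERROR", mirroring the percentage query above."""
    total = rate(entries, "", range_end, range_seconds)  # "" matches every line
    if total == 0:
        return 0.0
    return rate(entries, "ERROR", range_end, range_seconds) / total * 100


# (timestamp in seconds, log line); hypothetical sample data
ENTRIES = [
    (10, "GET / 200"),
    (20, "ERROR db timeout"),
    (70, "GET /x 200"),
    (100, "ERROR boom"),
    (110, "GET / 200"),
    (115, "ERROR again"),
]

print(count_over_time(ENTRIES, "ERROR", 120, 60))
print(rate(ENTRIES, "", 120, 60))
print(error_rate_percent(ENTRIES, 120, 60))
```

With a 60-second window ending at t=120, four lines fall inside the window and two of them contain `ERROR`, so the error share is half of the traffic.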

### Structured Logs

```logql
# Parse JSON and filter
{app="api"}
  | json
  | status >= 500
  | duration > 1000

# Extract labels from logs
{app="web"}
  | regexp `(?P<method>\w+) (?P<path>/\S+)`
  | method="POST"
```
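
The `regexp` stage turns named capture groups into labels that later stages can filter on. A quick way to check such a pattern before using it in a query is to try it locally. The sketch below uses Python's `re` module as a stand-in for the Go regular expression engine that Loki uses, and assumes the pattern is meant to read `\w+` for the method and `/\S+` for the path. The sample lines are invented.

```python
import re

# Same named groups as the LogQL regexp stage: method and path.
PATTERN = re.compile(r"(?P<method>\w+) (?P<path>/\S+)")


def extract_labels(line):
    """Return the extracted labels for a line, or an empty dict if nothing matches."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}


lines = [
    "POST /api/orders 201",
    "GET /health 200",
    "no request here",
]

for line in lines:
    print(repr(line), extract_labels(line))

# Equivalent of the trailing `| method="POST"` filter.
posts = [line for line in lines if extract_labels(line).get("method") == "POST"]
print(posts)
```

The two engines agree on simple patterns like this one, but they differ on advanced features (Go's engine has no backreferences or lookarounds), so keep test patterns basic.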

## Key Features

### Cost Efficiency

```
Elasticsearch indexes every word in logs:
  → 100GB logs → 200-400GB storage
  → High CPU for indexing
  → Expensive RAM requirements

Loki indexes only labels:
  → 100GB logs → 50-100GB storage (compressed)
  → Low CPU for indexing
  → Minimal RAM (only for query time)
```

### Label-Based Sharding

```
Log stream = unique combination of labels
{namespace="prod", app="api", pod="api-abc123"}
{namespace="prod", app="web", pod="web-xyz789"}

Labels become index keys
Log content is only scanned during queries
```
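
Because a stream is defined by its full label set, every new label value creates a new stream. The sketch below illustrates that idea with a hypothetical `stream_key` helper that canonicalizes a label set, then groups log lines by it. It is an illustration of the concept, not Loki's internal hashing or storage layout.

```python
from collections import defaultdict


def stream_key(labels):
    """Canonical text form of a label set; label order does not matter."""
    items = sorted(labels.items())
    return "{" + ", ".join(f'{name}="{value}"' for name, value in items) + "}"


def group_into_streams(entries):
    """Group (labels, line) pairs into streams keyed by their label set."""
    streams = defaultdict(list)
    for labels, line in entries:
        streams[stream_key(labels)].append(line)
    return dict(streams)


entries = [
    ({"namespace": "prod", "app": "api", "pod": "api-abc123"}, "started"),
    ({"pod": "api-abc123", "app": "api", "namespace": "prod"}, "ready"),
    ({"namespace": "prod", "app": "web", "pod": "web-xyz789"}, "started"),
]

for key, lines in group_into_streams(entries).items():
    print(key, lines)
```

The first two entries land in the same stream even though their labels are listed in a different order. Adding a label with many distinct values (a request ID, for example) would create one stream per value, which is why labels are best kept low-cardinality.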

### Integration with Metrics

```
Grafana Dashboard:
├── CPU Usage (Prometheus metric)
├── Error Rate (LogQL count_over_time)
├── Recent Errors (Loki logs)
└── Link errors to trace in Tempo
```

## Loki vs Alternatives

| Feature | Loki | Elasticsearch | Splunk | Graylog |
|---------|------|---------------|--------|---------|
| Open Source | Yes (AGPL-3.0) | Yes (Elastic/AGPL) | No | Yes (SSPL) |
| Indexing | Labels only | Full-text | Full-text | Full-text |
| Storage cost | Low | High | Very high | Medium |
| Query language | LogQL | KQL/Lucene | SPL | Graylog syntax |
| Grafana integration | Native | Plugin | Plugin | Plugin |
| Scale | Horizontal | Horizontal | Horizontal | Horizontal |
| Best for | Label-rich env (K8s) | Full-text search | Enterprise | Mid-size |

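## FAQ

**Q: How do I choose between Loki and Elasticsearch?**
A: If you mainly filter logs by time range and labels (container, namespace, pod), Loki is cheaper and more efficient. If you need complex full-text search and analysis over log content, Elasticsearch is more powerful.

**Q: Why index only labels?**
A: This is Loki's core design. Most log queries amount to "give me the logs of service X during time window Y", which a label index answers well. Content matching is then handled at query time with grep-style filtering. This reduces storage cost by 10x or more.

**Q: What scale is it suited for?**
A: Anything from single-machine logs (a few GB/day) to large production clusters (a few TB/day). A single-instance deployment handles small to medium workloads, and a distributed deployment can scale linearly to the PB level.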

## Sources and Credits

- GitHub: [grafana/loki](https://github.com/grafana/loki), 28K+ ⭐ | AGPL-3.0
- Website: [grafana.com/loki](https://grafana.com/oss/loki)

---
Source: https://tokrepo.com/en/workflows/92fa7c1f-352f-11f1-9bc6-00163e2b0d79
Author: Script Depot