Scripts · April 15, 2026 · 1 min read

Apache SkyWalking — Distributed APM & Observability Platform

Apache-licensed APM platform unifying distributed tracing, metrics, logs, and eBPF profiling for microservices and service meshes.

Introduction

Apache SkyWalking is a top-level Apache Software Foundation project: an observability platform focused on distributed tracing, service-mesh telemetry, and application performance monitoring for cloud-native stacks. It was designed around the complexity of microservices and service meshes, so it bundles tracing, metrics, logs, events, and alerting into a single backend instead of leaving you to integrate separate tools.

What SkyWalking Does

  • Collects distributed traces from Java, .NET, Node.js, Python, Go, PHP, Rust, and Lua agents.
  • Ingests OpenTelemetry, Zipkin, Jaeger, Prometheus, and eBPF-based profiling data out of the box.
  • Builds topology maps of services, instances, endpoints, and external dependencies automatically.
  • Correlates logs and traces via a shared trace/segment ID and a searchable tag query language.
  • Provides alerting, analysis languages (OAL, the Observability Analysis Language, and MAL, the Meter Analysis Language), and dashboards on top of a single OAP backend.
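To make the automatic topology map more concrete, here is a minimal sketch of how caller-to-callee edges could be derived from trace data. It is not SkyWalking's implementation: the `Segment` class, its fields, and the sample service names are hypothetical, and real segments carry far more detail than a single parent-service reference.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    """Hypothetical, simplified stand-in for one trace segment."""
    trace_id: str
    service: str
    parent_service: Optional[str] = None  # None for the entry segment of a trace


def build_topology(segments):
    """Count caller -> callee edges from cross-service segment references."""
    edges = Counter()
    for seg in segments:
        # Only a reference that crosses a service boundary becomes an edge.
        if seg.parent_service is not None and seg.parent_service != seg.service:
            edges[(seg.parent_service, seg.service)] += 1
    return edges


# Hypothetical sample data: two traces through a small call chain.
segments = [
    Segment("t1", "gateway"),
    Segment("t1", "orders", parent_service="gateway"),
    Segment("t1", "payments", parent_service="orders"),
    Segment("t2", "gateway"),
    Segment("t2", "orders", parent_service="gateway"),
]
topology = build_topology(segments)
for (caller, callee), calls in sorted(topology.items()):
    print(f"{caller} -> {callee}: {calls} call(s)")
```

Because the edges come from the traces themselves, a new dependency shows up in the map as soon as traffic flows through it, with no manual service registry to maintain.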

Architecture Overview

The core backend, OAP (Observability Analysis Platform), is written in Java. Agents and collectors push data to OAP over gRPC or HTTP. OAP runs that data through streaming analysis pipelines, derives metrics from traces using the Observability Analysis Language (OAL), and persists the results to a pluggable storage layer (Elasticsearch, OpenSearch, BanyanDB, MySQL/PostgreSQL, TiDB). A web UI (Rocketbot in older releases, Booster UI from 9.x) and a GraphQL query API sit on top. For service meshes, SkyWalking includes Envoy ALS receivers and Rover, an eBPF agent that profiles processes on Kubernetes nodes without any code changes.
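The trace-to-metrics step can be illustrated with a small aggregation sketch. It mirrors the idea behind OAL rules such as `service_resp_time = from(Service.latency).longAvg();`, but it is plain Python rather than OAL, and the `Call` record, the one-minute bucket strings, and the sample values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical flattened record for one inbound call taken from a trace."""
    service: str
    minute: str       # time bucket, e.g. "10:01"
    latency_ms: int
    ok: bool


def aggregate(calls):
    """Return {(service, minute): (avg_latency_ms, success_percent)}."""
    buckets = defaultdict(list)
    for call in calls:
        buckets[(call.service, call.minute)].append(call)
    metrics = {}
    for key, group in buckets.items():
        avg_latency = sum(c.latency_ms for c in group) // len(group)  # integer average
        success = 100.0 * sum(c.ok for c in group) / len(group)
        metrics[key] = (avg_latency, success)
    return metrics


# Hypothetical sample data for a single one-minute window.
calls = [
    Call("orders", "10:01", 100, True),
    Call("orders", "10:01", 200, True),
    Call("orders", "10:01", 300, True),
    Call("orders", "10:01", 400, False),
    Call("payments", "10:01", 50, True),
]
metrics = aggregate(calls)
for (service, minute), (avg_ms, success_pct) in sorted(metrics.items()):
    print(f"{service} {minute}: avg={avg_ms}ms success={success_pct:.1f}%")
```

Aggregating per service and time bucket is what lets the backend keep compact metrics for longer than the raw traces they were derived from.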

Self-Hosting & Configuration

  • Install with the Helm chart: helm install skywalking oci://registry-1.docker.io/apache/skywalking-helm, passing value overrides to select the storage backend.
  • For production, run BanyanDB or Elasticsearch 8.x — avoid H2 beyond evaluation.
  • Scale OAP horizontally behind a headless service; OAP nodes coordinate through a cluster provider such as ZooKeeper or Kubernetes.
  • Tune core.recordDataTTL (retention for records such as traces and logs) and core.metricsDataTTL (retention for aggregated metrics) to bound storage growth.
  • Enable the alarm engine via alarm-settings.yml and plug in webhooks, Slack, DingTalk, or PagerDuty.
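As a rough worked example of how the two TTL settings bound storage, the sketch below estimates steady-state disk usage. It assumes both TTLs are expressed in days and that ingest volume is roughly constant; the daily volumes are hypothetical figures, and index overhead, replicas, and compression are ignored.

```python
def steady_state_gb(daily_gb: float, ttl_days: int) -> float:
    """Approximate data held once anything older than the TTL is being purged."""
    if ttl_days < 1:
        raise ValueError("ttl_days must be at least 1")
    return daily_gb * ttl_days


# Hypothetical ingest volumes; measure your own before sizing a cluster.
record_gb = steady_state_gb(daily_gb=40.0, ttl_days=3)   # traces, logs
metrics_gb = steady_state_gb(daily_gb=5.0, ttl_days=7)   # aggregated metrics
total_gb = record_gb + metrics_gb

print(f"records: {record_gb:.0f} GB, metrics: {metrics_gb:.0f} GB, total: {total_gb:.0f} GB")
```

Record data usually dominates, so shortening the record TTL tends to save more space than trimming the metrics TTL by the same number of days.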

Key Features

  • Native support for both agent-based instrumentation and service-mesh telemetry.
  • Rover-based eBPF profiling (on-CPU, off-CPU, and network) without recompilation.
  • Log/trace correlation and trace-to-metrics conversion via the OAL scripting language.
  • Browser Real User Monitoring (RUM) agent for frontend performance data.
  • BanyanDB: a purpose-built observability database written in Go, shipped by the same project.
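Log/trace correlation rests on every log line carrying the ID of the trace that was active when it was written. The sketch below shows that mechanism with the Python standard library only. It does not use a SkyWalking agent: `current_trace_id`, `TraceIdFilter`, the `TID:` prefix, and the sample trace ID are all hypothetical, and a real agent manages the trace context itself.

```python
import contextvars
import io
import logging

# Hypothetical holder for the active trace ID; a real agent manages this itself.
current_trace_id = contextvars.ContextVar("current_trace_id", default="N/A")


class TraceIdFilter(logging.Filter):
    """Attach the active trace ID to every log record passing through."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


# Write to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[TID:%(trace_id)s] %(levelname)s %(message)s"))
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

token = current_trace_id.set("abc123")   # pretend a traced request starts here
logger.info("payment accepted")
current_trace_id.reset(token)            # request finished, no active trace
logger.info("idle")

print(stream.getvalue(), end="")
```

Once the ID is present in each log line, a backend can index it and jump from a log entry to the matching trace, or the other way around.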

Comparison with Similar Tools

  • Jaeger: strong distributed tracing, but no integrated metrics and logs pipeline.
  • Prometheus + Grafana + Loki + Tempo: a powerful stack, but you assemble and operate four systems.
  • Elastic APM: tightly coupled to Elasticsearch; SkyWalking is storage-agnostic.
  • Datadog APM: SaaS with per-host pricing; SkyWalking is self-hostable and Apache-licensed.
  • SigNoz: a similar all-in-one goal at a smaller scale; SkyWalking has broader agent coverage.

FAQ

Q: Can SkyWalking ingest OpenTelemetry data? A: Yes. The OTel collector can ship OTLP traces, metrics, and logs directly to OAP.

Q: Does the Java agent require code changes? A: No. Attach it via -javaagent and it auto-instruments common frameworks.
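For illustration, the launch command for an agent-attached JVM could be assembled as in the sketch below. The jar paths, service name, and OAP host are hypothetical. The two `-Dskywalking.*` system properties and port 11800 for gRPC reflect the agent's configuration as I understand it, so check them against the documentation for your agent version.

```python
def java_agent_command(agent_jar, app_jar, service_name, oap_address):
    """Build an argv list that starts a JVM with the SkyWalking agent attached."""
    return [
        "java",
        f"-javaagent:{agent_jar}",
        # Assumed property names; verify against your agent's config reference.
        f"-Dskywalking.agent.service_name={service_name}",
        f"-Dskywalking.collector.backend_service={oap_address}",
        "-jar",
        app_jar,
    ]


cmd = java_agent_command(
    agent_jar="/opt/skywalking-agent/skywalking-agent.jar",  # hypothetical path
    app_jar="app.jar",                                       # hypothetical application
    service_name="orders",                                   # hypothetical service name
    oap_address="oap.example.internal:11800",                # hypothetical host, assumed gRPC port
)
print(" ".join(cmd))
```

The application jar is untouched; only the JVM start-up arguments change, which is why no code changes are needed.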

Q: What storage should I choose for production? A: BanyanDB for observability-native workloads, or Elasticsearch/OpenSearch if you already run one.

Q: Is it viable for non-Java stacks? A: Yes. Node.js, Python, Go (via the skywalking-go agent or the older SkyAPM go2sky library), .NET Core, and PHP all have dedicated agents.
