Scripts · May 9, 2026 · 1 min read

async-profiler — Low-Overhead Sampling Profiler for Java

A low-overhead sampling profiler for JVM applications that captures CPU, allocation, and lock profiles without safepoint bias using perf_events and AsyncGetCallTrace.

Introduction

async-profiler is a low-overhead sampling profiler for Java that avoids the safepoint bias plaguing traditional JVM profilers. It uses Linux perf_events and the HotSpot AsyncGetCallTrace API to capture accurate CPU and allocation profiles from production workloads.

What async-profiler Does

  • Captures CPU profiles through perf_events, which also exposes hardware performance counters such as cache misses
  • Tracks heap allocations with TLAB-based sampling to find memory-hungry code paths
  • Records lock contention events to pinpoint threading bottlenecks
  • Generates interactive HTML flame graphs, JFR files, or collapsed stack output
  • Attaches to a running JVM without restarts or special JVM flags
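
The collapsed stack output mentioned above is a plain-text format: each line holds a semicolon-separated call stack (outermost frame first) followed by a space and a sample count. As a rough illustration of how such output could be post-processed, the sketch below sums samples per innermost frame. The class name, method name, and the frame names in the sample data are hypothetical and chosen only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CollapsedStacks {
    /**
     * Sums sample counts per leaf (innermost) frame.
     * Each non-blank line is expected to look like "outer;...;leaf count".
     */
    static Map<String, Long> samplesByLeafFrame(String collapsed) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (String rawLine : collapsed.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            // The count is whatever follows the last space on the line.
            int lastSpace = line.lastIndexOf(' ');
            if (lastSpace < 0) {
                continue; // no count present, skip the line
            }
            long count = Long.parseLong(line.substring(lastSpace + 1));
            String stack = line.substring(0, lastSpace);
            // The leaf frame is the text after the last ';' (or the whole stack).
            String leaf = stack.substring(stack.lastIndexOf(';') + 1);
            totals.merge(leaf, count, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not real profiler output.
        String sample = String.join("\n",
                "Main.main;Service.handle;Json.parse 70",
                "Main.main;Service.handle;Db.query 20",
                "Main.main;Worker.run;Json.parse 10");
        samplesByLeafFrame(sample)
                .forEach((frame, n) -> System.out.println(frame + " " + n));
    }
}
```

Grouping by leaf frame gives a quick "self time" ranking; a flame graph shows the same data with the full call paths preserved.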

Architecture Overview

async-profiler consists of a native agent (shared library) that loads into the target JVM via the attach API. On Linux it hooks into perf_events for CPU sampling and intercepts TLAB allocation slow paths for allocation profiling. Stack unwinding combines Java frames from AsyncGetCallTrace with native frames from frame pointers or DWARF, producing unified mixed-mode stack traces without stopping application threads at safepoints.
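
To make the attach step more concrete, here is a minimal sketch of loading a native agent into a running JVM through the JDK's Attach API (com.sun.tools.attach). It assumes the agent library is the libasyncProfiler shared library from a release archive and that it accepts a comma-separated option string of the form start,event=...,file=...; the class name, helper name, and output file name are hypothetical. The asprof launcher normally handles this step for you, so treat this as an illustration of the mechanism, not as the recommended workflow.

```java
import com.sun.tools.attach.VirtualMachine;

class AttachSketch {
    /** Builds an agent option string such as "start,event=cpu,file=profile.html". */
    static String startOptions(String event, String outputFile) {
        return "start,event=" + event + ",file=" + outputFile;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: AttachSketch <pid> <absolute-path-to-agent-library>");
            return;
        }
        // Attach to the target JVM identified by its process ID.
        VirtualMachine vm = VirtualMachine.attach(args[0]);
        try {
            // Load the native agent by absolute path and pass it the options.
            // "profile.html" is a placeholder output name for this sketch.
            vm.loadAgentPath(args[1], startOptions("cpu", "profile.html"));
        } finally {
            vm.detach();
        }
    }
}
```

This only starts a session. Ending it and writing the output would need a further command sent to the agent; check the project's documentation for the exact option names.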

Self-Hosting & Configuration

  • Download platform-specific binaries from GitHub releases for Linux or macOS
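  • Attach to a live JVM with ./bin/asprof -d 30 -f output.html <pid>, where -d sets the profiling duration in seconds and <pid> is the process ID of the target JVM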
  • Use -e alloc to switch from CPU to allocation profiling mode
  • Use -e lock to profile monitor contention and synchronized blocks
  • Convert output to FlameGraph, JFR, or pprof formats with built-in converters
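
To try the lock profiling mode on something small, a toy target with deliberate monitor contention can help. The sketch below is such a workload, with hypothetical class and thread names: several threads increment one counter inside a synchronized block, so most of them spend their time waiting for the same monitor.

```java
import java.util.ArrayList;
import java.util.List;

class ContendedCounter {
    private final Object lock = new Object();
    private long value;

    /** Every caller competes for the same monitor, which creates contention. */
    void increment() {
        synchronized (lock) {
            value++;
        }
    }

    /** Runs the workload and returns the final counter value. */
    static long run(int threads, int incrementsPerThread) {
        ContendedCounter counter = new ContendedCounter();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            }, "worker-" + i);
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        // join() guarantees the workers' writes are visible here.
        return counter.value;
    }

    public static void main(String[] args) {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        int perThread = args.length > 1 ? Integer.parseInt(args[1]) : 1_000_000;
        System.out.println("total = " + run(threads, perThread));
    }
}
```

With the defaults the program finishes quickly, so pass a much larger increment count as the second argument to keep it alive long enough to attach with -e lock. In the resulting profile, ContendedCounter.increment would be the expected contention site.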

Key Features

  • Safepoint-free sampling delivers accurate profiles under real production load
  • Mixed-mode flame graphs show Java, native, and kernel frames in a single view
  • Allocation profiling reveals object creation hotspots without full heap dumps
  • Wall-clock profiling mode captures off-CPU time including I/O waits
  • Supports JDK 8 through JDK 21+ on Linux and macOS

Comparison with Similar Tools

  • JFR (Java Flight Recorder) — built into the JDK; async-profiler adds native and kernel frames to the Java stacks and can write JFR-format files itself
  • VisualVM — GUI-based profiler with instrumentation; async-profiler is CLI-first and sampling-based
  • perf — Linux system profiler; async-profiler adds Java stack unwinding and JIT symbol resolution
  • FlameGraph — visualization scripts; async-profiler generates flame graphs natively and provides the data source
  • YourKit / JProfiler — commercial profilers with rich GUIs; async-profiler is free and production-safe

FAQ

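Q: Does async-profiler require special JVM flags? A: No. It attaches to a running JVM dynamically. Starting the JVM with -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints is optional but improves frame accuracy, because the JIT compiler then keeps debug information for code between safepoints, which matters most for inlined methods.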

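Q: Can I use it in Docker containers? A: Yes, but the container's default restrictions can block perf_events. You may need --cap-add SYS_PTRACE and --pid=host, or --privileged. If perf_events stays unavailable, the itimer CPU mode (-e itimer) does not depend on it.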

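Q: What is safepoint bias? A: Traditional JVM profilers can only observe a thread once it reaches a safepoint, so samples pile up on code that contains safepoint polls and hot code between polls is under-reported. async-profiler takes its samples from signal handlers at arbitrary points in execution, so the profile reflects where time is actually spent.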
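
The safepoint bias answer above can be illustrated with a toy simulation. This is a deliberately simplified model, not how any real profiler is implemented: a hypothetical timeline spends 900 ticks in hotLoop (assumed to contain no safepoint polls) and 100 ticks in helper (assumed to poll on every tick). A timer fires every 10 ticks; the "direct" sampler records the method running at that moment, while the "safepoint" sampler has to wait until the next tick that has a poll.

```java
import java.util.Map;
import java.util.TreeMap;

class SafepointBiasModel {
    /** Hypothetical timeline: one entry per tick naming the running method. */
    static String[] timeline() {
        String[] ticks = new String[1000];
        for (int i = 0; i < ticks.length; i++) {
            ticks[i] = i < 900 ? "hotLoop" : "helper";
        }
        return ticks;
    }

    /** Model assumption: only "helper" contains safepoint polls. */
    static boolean hasSafepointPoll(String method) {
        return method.equals("helper");
    }

    /** Counts samples per method, optionally deferring each one to the next safepoint. */
    static Map<String, Integer> sample(String[] ticks, int interval, boolean waitForSafepoint) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int t = interval / 2; t < ticks.length; t += interval) {
            int observed = t;
            if (waitForSafepoint) {
                // Slide forward until the running method has a safepoint poll.
                while (observed < ticks.length && !hasSafepointPoll(ticks[observed])) {
                    observed++;
                }
                if (observed == ticks.length) {
                    continue; // no safepoint reached before the timeline ended
                }
            }
            counts.merge(ticks[observed], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] ticks = timeline();
        System.out.println("direct sampling:    " + sample(ticks, 10, false));
        System.out.println("safepoint sampling: " + sample(ticks, 10, true));
    }
}
```

In this model hotLoop receives 90 of the 100 samples when sampled directly and none when every sample waits for a safepoint, even though it accounts for 90% of the timeline. Real JVMs are less extreme, but the direction of the distortion is the same.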

Q: Does it support ARM processors? A: Yes. Builds are available for both x64 and aarch64 on Linux and macOS.
