Introduction
Tracy is a real-time profiler designed for game engines, simulations, and performance-critical C/C++ applications. It captures frame timing, zone durations, memory allocations, lock contention, context switches, and GPU events with nanosecond precision. A companion GUI application connects to the profiled program over a network socket, displaying live timelines, flame charts, and statistical breakdowns while the application runs.
What Tracy Does
- Instruments code zones with near-zero overhead macro annotations
- Captures CPU zone timing, call stacks, memory allocations, and lock contention
- Records GPU events for Vulkan, OpenGL, Direct3D 11/12, and Metal
- Streams profiling data in real time to a GUI viewer over TCP
- Supports Lua and other scripting languages via manual zone API calls
Architecture Overview
Tracy uses a client-server model. The client is a small C++ library compiled into the profiled application: you include its header and add the single TracyClient.cpp source file to your build. Annotated zones and events are written into a lock-free ring buffer with minimal overhead (typically single-digit nanoseconds per zone). A background thread streams this data over a TCP connection to the Tracy GUI, which acts as the server. The server decompresses, indexes, and renders the data as interactive timelines, flame charts, histograms, and statistics tables. The profiler can also capture kernel-level context switch data on Linux and Windows for full system visibility.
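The producer/consumer hand-off described above can be illustrated with a minimal single-producer, single-consumer ring buffer. This is only a sketch of the general technique, not Tracy's actual queue: the names ZoneEvent, SpscRing, tryPush, and tryPop are hypothetical, and the event layout is invented for the example.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical event record; Tracy's real wire format is different.
struct ZoneEvent {
    std::uint64_t timestampNs;
    std::uint32_t zoneId;
};

// Single-producer, single-consumer lock-free ring buffer (illustrative only).
// Capacity must be a power of two so index wrapping is a cheap bit mask.
template <std::size_t Capacity>
class SpscRing {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Called by the instrumented thread. Returns false when the buffer is full.
    bool tryPush(const ZoneEvent& ev) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;  // no free slot
        slots_[head & (Capacity - 1)] = ev;
        // Release publishes the slot contents before the new head is visible.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Called by the background streaming thread. Empty optional means no data.
    std::optional<ZoneEvent> tryPop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail == head) return std::nullopt;  // nothing queued
        const ZoneEvent ev = slots_[tail & (Capacity - 1)];
        // Release hands the slot back to the producer after the copy is done.
        tail_.store(tail + 1, std::memory_order_release);
        return ev;
    }

private:
    std::array<ZoneEvent, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

The instrumented thread only ever writes head_ and the streaming thread only ever writes tail_, which is why neither side needs a lock. The sketch requires C++17 for std::optional.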
Self-Hosting & Configuration
- Clone the repository and include tracy/Tracy.hpp in your project
- Add TracyClient.cpp to your build or use the CMake integration
- Compile with TRACY_ENABLE defined to activate profiling macros
- Build the profiler GUI from the profiler directory using CMake
- Connect the GUI to your running application by entering its IP address and port
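Once the header is on the include path, instrumenting code might look like the sketch below. ZoneScoped, ZoneScopedN, and FrameMark are Tracy macros; the functions simulatePhysics and runFrames are hypothetical stand-ins for application code. The fallback branch is an assumption added so the example also compiles where Tracy is not installed.

```cpp
#include <cstdint>

#if __has_include("tracy/Tracy.hpp")
#  include "tracy/Tracy.hpp"
#else
// Fallback so this sketch builds without Tracy: the macros expand to nothing,
// mirroring what Tracy's header does when TRACY_ENABLE is not defined.
#  define ZoneScoped
#  define ZoneScopedN(name)
#  define FrameMark
#endif

// Hypothetical workload: sums 1..steps inside a named zone.
std::uint64_t simulatePhysics(std::uint32_t steps) {
    ZoneScopedN("simulatePhysics");  // zone lasts until the end of this scope
    std::uint64_t acc = 0;
    for (std::uint32_t i = 1; i <= steps; ++i) acc += i;
    return acc;
}

// Hypothetical main loop: one zone per iteration plus a frame boundary marker.
std::uint64_t runFrames(std::uint32_t frames) {
    std::uint64_t total = 0;
    for (std::uint32_t f = 0; f < frames; ++f) {
        ZoneScoped;  // zone named after the enclosing function
        total += simulatePhysics(100);
        FrameMark;   // tells the viewer that a frame has ended
    }
    return total;
}
```

To collect data, build this together with TracyClient.cpp and define TRACY_ENABLE for the whole project (for example with -DTRACY_ENABLE on GCC or Clang).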
Key Features
- Nanosecond-resolution zone timing with single-digit nanosecond instrumentation overhead
- Live network streaming allows profiling remote or embedded targets
- GPU profiling for Vulkan, OpenGL, Direct3D, and Metal rendering pipelines
- Memory allocation tracking with per-callstack attribution and leak detection
- Lock and mutex contention visualization showing which threads are waiting and where
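As a rough illustration of the memory and lock features, the sketch below reports allocations with TracyAlloc and TracyFree and declares a mutex with TracyLockable so the viewer can show contention. Those macros and LockableBase come from Tracy; the TrackedAllocator class is hypothetical, and the fallback definitions are an assumption so the example builds without Tracy present.

```cpp
#include <cstddef>
#include <cstdlib>
#include <mutex>

#if __has_include("tracy/Tracy.hpp")
#  include "tracy/Tracy.hpp"
#else
// Fallback so this sketch builds without Tracy; these match the no-op forms
// the macros take when TRACY_ENABLE is not defined.
#  define TracyAlloc(ptr, size)
#  define TracyFree(ptr)
#  define TracyLockable(type, varname) type varname
#  define LockableBase(type) type
#endif

// Hypothetical allocator wrapper that reports every allocation to the profiler
// and keeps a live-allocation count behind an instrumented mutex.
class TrackedAllocator {
public:
    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p) {
            TracyAlloc(p, bytes);  // record address and size for the memory view
            std::lock_guard<LockableBase(std::mutex)> guard(mutex_);
            ++live_;
        }
        return p;
    }

    void release(void* p) {
        if (!p) return;
        TracyFree(p);  // must be reported before the address can be reused
        std::free(p);
        std::lock_guard<LockableBase(std::mutex)> guard(mutex_);
        --live_;
    }

    int liveAllocations() {
        std::lock_guard<LockableBase(std::mutex)> guard(mutex_);
        return live_;
    }

private:
    TracyLockable(std::mutex, mutex_);  // plain std::mutex when profiling is off
    int live_ = 0;
};
```

Allocations reported this way that never receive a matching TracyFree are what the viewer can surface as potential leaks.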
Comparison with Similar Tools
- Valgrind / Callgrind — full CPU emulation with high overhead; Tracy instruments natively with near-zero cost
- perf — Linux sampling profiler; Tracy provides deterministic instrumentation with GPU and memory tracking
- Intel VTune — powerful but proprietary and platform-specific; Tracy is free, open-source, and cross-platform
- Superluminal — commercial game profiler for Windows; Tracy offers similar features as open source with Linux and macOS support
- Optick — another open-source game profiler; Tracy adds GPU profiling, lock analysis, and a more mature network streaming architecture
FAQ
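Q: How much overhead does Tracy add? A: Each instrumented zone adds roughly 2-5 nanoseconds on modern hardware while profiling is active. By default the client starts collecting data at program startup, even before a viewer connects; building with TRACY_ON_DEMAND defers collection until a connection is made, so the cost while disconnected is near zero.
Q: Can I profile a release build? A: Yes. Tracy is designed to be left in release builds. Compile with TRACY_ENABLE for profiled builds; without it, the profiling macros expand to nothing, giving a fully stripped production build.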
Q: Does it work on consoles or embedded targets? A: Tracy supports any platform with a TCP stack. It has been used on game consoles, mobile devices, and embedded Linux systems.
Q: Can I save and share profiling sessions? A: Yes. The GUI can save captured sessions to a file and reload them later for offline analysis.
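To put the per-zone overhead figure from the first question in context, the helper below estimates what fraction of a frame's time budget instrumentation would consume. It is a back-of-the-envelope sketch: zoneOverheadFraction is a hypothetical function, and the per-zone cost you pass in is an assumption that should be measured on your own hardware.

```cpp
#include <cstdint>

// Estimated fraction of one frame's time budget spent on zone instrumentation.
// perZoneNs is an assumed cost per zone, not a measured value.
double zoneOverheadFraction(std::uint64_t zonesPerFrame,
                            double perZoneNs,
                            double framesPerSecond) {
    const double frameBudgetNs = 1e9 / framesPerSecond;  // e.g. ~16.7 ms at 60 fps
    return (static_cast<double>(zonesPerFrame) * perZoneNs) / frameBudgetNs;
}
```

For example, 10,000 zones per frame at an assumed 5 ns each is 50 microseconds, which is about 0.3% of a 60 fps frame budget.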