Configs · Jul 3, 2026 · 2 min read

Scalene — High-Performance Python CPU, GPU, and Memory Profiler

A Python profiler that provides detailed CPU, GPU, and memory profiling with line-level granularity and AI-powered optimization suggestions.

Agent ready

Ready-to-run agent install

This asset can be installed after the agent chooses its runtime, checks the plan, and runs the matching command.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: Established
Entrypoint: Scalene
Direct install command: npx -y tokrepo@latest install 2307f8fd-76db-11f1-9bc6-00163e2b0d79 --target codex

Run after dry-run confirms the install plan.

Introduction

Scalene is a Python profiler that simultaneously tracks CPU time, GPU usage, and memory consumption at the line level. Unlike cProfile, which instruments every function call, Scalene samples the running program to keep overhead low, and it can generate AI-powered optimization suggestions.

What Scalene Does

  • Profiles CPU, GPU, and memory usage at individual line granularity
  • Separates Python time from native C/C++ extension time
  • Tracks memory allocation and deallocation patterns per line
  • Detects memory leaks by identifying lines that allocate without freeing
  • Generates AI-driven optimization proposals using an integrated LLM
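
To make "memory per line" concrete, here is a minimal sketch using the standard library's tracemalloc module. It is not Scalene's mechanism (Scalene intercepts allocator calls and samples them); it only illustrates what attributing allocations to individual source lines looks like. The function build_table and its sizes are hypothetical.

```python
import tracemalloc


def build_table(rows):
    squares = [i * i for i in range(rows)]        # large allocation
    labels = [str(i) for i in range(rows // 10)]  # smaller allocation
    return squares, labels


tracemalloc.start()
before = tracemalloc.take_snapshot()
table = build_table(200_000)  # keep a reference so the memory stays live
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the allocation difference by source line, largest change first.
top_lines = after.compare_to(before, "lineno")
for stat in top_lines[:3]:
    frame = stat.traceback[0]
    print(f"{frame.filename}:{frame.lineno}  {stat.size_diff / 1024:+.1f} KiB")
```

tracemalloc records every allocation, so it tends to be slower than a sampling approach; the per-line attribution is what this sketch is meant to show.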

Architecture Overview

Scalene uses a combination of sampling and signal-based profiling to minimize overhead, typically under 15%. CPU profiling uses timer signals, memory profiling intercepts malloc/free calls through a custom allocator, and GPU profiling polls NVIDIA driver stats. Results are aggregated per line of source code and displayed in a web-based viewer or terminal output.
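
The timer-signal idea can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not Scalene's implementation: it is Unix-only, must run in the main thread, uses a hypothetical 1 ms interval, and the names run_with_sampling and busy_loop are invented for the example.

```python
import collections
import signal

SAMPLE_INTERVAL = 0.001  # seconds of CPU time between samples (assumed value)

# Maps (function name, line number) -> number of samples observed there.
line_samples = collections.Counter()


def _record_sample(signum, frame):
    """Signal handler: attribute one sample to the line that was interrupted."""
    if frame is not None:
        line_samples[(frame.f_code.co_name, frame.f_lineno)] += 1


def run_with_sampling(func, *args, **kwargs):
    """Run func while a CPU-time interval timer fires periodic signals."""
    signal.signal(signal.SIGVTALRM, _record_sample)
    signal.setitimer(signal.ITIMER_VIRTUAL, SAMPLE_INTERVAL, SAMPLE_INTERVAL)
    try:
        return func(*args, **kwargs)
    finally:
        # Stop the timer. The handler stays installed so that a signal
        # still pending at this point is handled harmlessly.
        signal.setitimer(signal.ITIMER_VIRTUAL, 0)


def busy_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    run_with_sampling(busy_loop, 2_000_000)
    for (name, lineno), count in line_samples.most_common(5):
        print(f"{name}:{lineno}  {count} samples")
```

Because the program is only interrupted once per interval rather than on every call, the cost stays roughly constant no matter how many functions run, which is the reason sampling keeps overhead low.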

Installation & Configuration

  • Install with pip: pip install scalene (supports Python 3.8+)
  • Run scalene script.py to profile CPU, GPU, and memory together
  • Use --cpu-only to skip memory profiling; some releases also accept --cpu, --gpu, and --memory to select metrics individually (scalene --help lists the flags your version supports)
  • GPU profiling is enabled automatically when CUDA is detected
  • Use --html to produce an HTML report that can be opened in a browser
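
For trying the commands above, a small target script helps. The following is a hypothetical script.py; the function names and sizes are made up, and each function is written to stress a different thing a line-level profiler should distinguish.

```python
"""script.py: a toy workload to profile with `scalene script.py`."""


def python_heavy(n):
    """Pure-Python arithmetic; expected to show up mostly as Python time."""
    total = 0
    for i in range(n):
        total += i % 7
    return total


def native_heavy(n):
    """sorted() and sum() run in C; expected to show up mostly as native time."""
    data = sorted(range(n, 0, -1))
    return sum(data)


def copy_heavy(n):
    """Copies a buffer repeatedly, the kind of line copy detection targets."""
    buf = bytearray(n)
    copies = [bytes(buf) for _ in range(20)]
    return sum(len(c) for c in copies)


def main():
    print(python_heavy(1_000_000))
    print(native_heavy(1_000_000))
    print(copy_heavy(1_000_000))


if __name__ == "__main__":
    main()
```

Running it once with default settings and once with --cpu-only should make the difference in reported columns easy to see.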

Key Features

  • Low overhead profiling (typically under 15%) via statistical sampling
  • Separates Python execution time from C extension time per line
  • Copy-detection identifies lines with excessive data copying
  • Web-based interactive report with sortable columns and flame graphs
  • Built-in AI optimization suggestions using GPT or a local LLM

Comparison with Similar Tools

  • cProfile — Built-in Python profiler; function-level only, no memory profiling
  • Pyinstrument — Statistical profiler focused on call stacks; no memory or GPU tracking
  • py-spy — Sampling CPU profiler; no memory profiling or AI suggestions
  • memory_profiler — Memory-only profiling; Scalene covers CPU, GPU, and memory together
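
The "function-level only" limitation of cProfile is easy to see from its own data structures. In this minimal sketch (the function name is hypothetical), the collected statistics are keyed by function, so the expensive line and the cheap line inside one function are reported together.

```python
import cProfile
import pstats


def slow_line_and_fast_line(n):
    squares = [i * i for i in range(n)]  # the expensive line
    return len(squares)                  # the cheap line


profiler = cProfile.Profile()
profiler.enable()
slow_line_and_fast_line(500_000)
profiler.disable()

stats = pstats.Stats(profiler)
# Each key is (filename, first line of the function, function name):
# one entry per function, with no breakdown of the lines inside it.
function_names = sorted({name for (_file, _line, name) in stats.stats})
print(function_names)
```

A line-level profiler such as Scalene would instead report the two lines of slow_line_and_fast_line separately.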

FAQ

Q: How much overhead does Scalene add? A: Typically under 15%, significantly less than deterministic profilers like cProfile.

Q: Does it work with multithreaded code? A: Yes. Scalene supports Python threads and attributes the time they spend to the source lines responsible.

Q: Can I profile Jupyter notebooks? A: Yes. Load the extension with %load_ext scalene, then use the %%scalene cell magic to profile a whole cell or %scrun to profile a single line.

Q: What does the AI optimization feature do? A: It sends profiling results to an LLM to generate specific code-level optimization suggestions.
