Scripts · Apr 29, 2026 · 3 min read

Numba — JIT Compiler That Makes Python Code Run at C Speed

Numba is an open-source JIT compiler that translates Python and NumPy code into fast machine code using LLVM. It accelerates numerical functions by orders of magnitude with minimal code changes.

Introduction

Numba is a just-in-time compiler for Python developed by Anaconda. By adding a single decorator to a function, Numba compiles it to optimized machine code via LLVM at runtime. It targets numerical and scientific workloads where pure Python loops over arrays would otherwise be too slow.

What Numba Does

  • JIT-compiles Python functions to native machine code using the LLVM backend
  • Accelerates NumPy array operations and Python loops by 10-100x or more
  • Supports automatic parallelization of loops across CPU cores with @njit(parallel=True)
  • Generates CUDA GPU kernels from Python with @cuda.jit for NVIDIA GPUs
  • Provides ahead-of-time compilation for deployment without the JIT warmup cost

Architecture Overview

When a Numba-decorated function is first called, Numba analyzes the Python bytecode and infers types from the arguments. It translates the typed IR to LLVM IR, which the LLVM backend compiles to native machine code for the host CPU. Subsequent calls skip compilation and execute the cached native code directly. For GPU targets, Numba generates PTX code and launches CUDA kernels.

Self-Hosting & Configuration

  • Install with pip install numba or conda install numba (conda recommended for LLVM alignment)
  • Decorate functions with @njit (no-Python mode) for best performance
  • Enable parallel loops with @njit(parallel=True) and use prange instead of range
  • Set NUMBA_NUM_THREADS to control parallelism; defaults to the number of CPU cores
  • Use @cuda.jit for NVIDIA GPU acceleration with CUDA toolkit installed

Key Features

  • Zero-overhead decorator API requires no rewriting of algorithm logic
  • Supports NumPy arrays, dtypes, and many NumPy functions natively
  • Automatic loop parallelization and SIMD vectorization on modern CPUs
  • CUDA GPU support compiles Python directly to GPU kernels
  • Caching compiled functions to disk avoids recompilation across runs

Comparison with Similar Tools

  • Cython — Ahead-of-time compilation with C-like syntax; more setup but supports C library interop
  • PyPy — Alternative Python interpreter with JIT; faster for general code but less NumPy optimization
  • CuPy — GPU-accelerated NumPy replacement; array-level API rather than custom kernel compilation
  • JAX — Functional JIT with autograd and TPU support; better for ML, Numba better for general numerics
  • Taichi — Domain-specific JIT for parallel computing; stronger for spatial simulations and graphics

FAQ

Q: Does Numba work with all Python code? A: No. Numba's nopython mode supports a subset of Python: numeric types, NumPy arrays, tuples, and typed containers. Plain Python dicts and arbitrary classes are not supported, though numba.typed.Dict and the experimental @jitclass cover common cases, and recent releases support basic Unicode string operations.

Q: How much speedup can I expect? A: Numerical loops typically see 10-100x speedup over pure Python. Array-heavy code with NumPy operations may see 2-10x improvement depending on the workload.

Q: Can I use Numba in production? A: Yes. Use ahead-of-time compilation (@cc.export) or rely on the function cache (cache=True) to avoid JIT warmup in production environments.

Q: Does Numba support AMD GPUs? A: Numba has experimental ROCm support via the roc target, but CUDA on NVIDIA GPUs is the mature and recommended GPU path.

