Configs · May 23, 2026 · 1 min read

CuPy — NumPy and SciPy for GPU

Open-source array library accelerated with NVIDIA CUDA, providing a drop-in replacement for NumPy and SciPy on the GPU.

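Agent ready

This asset can be read and installed directly by an agent.

TokRepo provides a universal CLI command, an install contract, metadata JSON, adapter-specific install plans, and raw content links, so that an agent can assess fit, risk, and next steps.

  • Native · 98/100 · Policy: allowed
  • Agent entry point: any MCP/CLI agent
  • Type: Skill
  • Install: Single
  • Trust level: Established
  • Entry: CuPy Overview
  • Universal CLI install command: npx tokrepo install cb150eb6-56e5-11f1-9bc6-00163e2b0d79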

Introduction

CuPy is an open-source Python library that mirrors the NumPy and SciPy APIs while executing operations on NVIDIA GPUs via CUDA. In many cases, changing a single import line lets existing NumPy code run on the GPU with little refactoring. CuPy is maintained by Preferred Networks and is used in scientific computing, deep learning preprocessing, and signal processing workloads.
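As a minimal sketch of the import swap described above, the snippet below uses CuPy when it is importable and a GPU is visible, and falls back to NumPy otherwise. The names `pick_array_module` and `normalize` are illustrative helpers, not part of either library.

```python
import numpy as np


def pick_array_module():
    """Return cupy if it is importable and a CUDA device is visible, else numpy."""
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        # CuPy missing, or no usable driver/device: stay on the CPU.
        pass
    return np


xp = pick_array_module()


def normalize(values):
    """Scale a 1-D sequence to zero mean and unit variance via the xp namespace."""
    a = xp.asarray(values, dtype=xp.float64)
    return (a - a.mean()) / a.std()


result = normalize([1.0, 2.0, 3.0, 4.0])
# float() accepts 0-d arrays from either library.
print(xp.__name__, float(result.mean()), float(result.std()))
```

Because `normalize` only touches the shared array API, the same function body runs unchanged on either backend.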

What CuPy Does

  • Provides GPU-backed ndarray compatible with NumPy array operations
  • Implements hundreds of NumPy and SciPy functions including linear algebra, FFT, and sparse matrices
  • Supports custom CUDA kernels through ElementwiseKernel and RawKernel APIs
  • Integrates with cuDNN, cuBLAS, cuSOLVER, cuSPARSE, and NCCL for optimized routines
  • Interoperates with PyTorch, TensorFlow, and other frameworks through DLPack
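The custom-kernel support listed above can be sketched with `cupy.ElementwiseKernel`, which takes input parameters, output parameters, a per-element CUDA C expression, and a kernel name. The wrapper `squared_diff` and its NumPy fallback are assumptions added so the example also runs on machines without a GPU.

```python
import numpy as np

try:
    import cupy as cp
    HAS_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp = None
    HAS_GPU = False

if HAS_GPU:
    # The operation string is CUDA C, applied independently to each element.
    _squared_diff_gpu = cp.ElementwiseKernel(
        "float32 x, float32 y",   # input parameters
        "float32 z",              # output parameter
        "z = (x - y) * (x - y)",  # per-element operation
        "squared_diff",           # kernel name
    )


def squared_diff(x, y):
    """Return (x - y)**2 element-wise, on the GPU when one is available."""
    if HAS_GPU:
        out = _squared_diff_gpu(cp.asarray(x, dtype=cp.float32),
                                cp.asarray(y, dtype=cp.float32))
        return cp.asnumpy(out)
    # CPU reference path with the same semantics.
    x = np.asarray(x, dtype=np.float32)
    y = np.asarray(y, dtype=np.float32)
    return (x - y) * (x - y)


print(squared_diff([1.0, 2.0, 3.0], [3.0, 2.0, 0.0]))
```

The kernel is compiled on its first call rather than at definition time, so the first invocation is noticeably slower than later ones.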

Architecture Overview

CuPy allocates device memory through a pooled allocator that reduces the overhead of repeated CUDA memory allocation calls. Array operations dispatch to CUDA kernels or call into NVIDIA library routines. Kernels compiled at runtime are cached on disk, so each one is compiled once and then reused in later runs. CuPy also tracks the Python Array API standard, which helps array-agnostic code written for NumPy run on it.

Self-Hosting & Configuration

  • Install the wheel matching your CUDA version: pip install cupy-cuda12x
  • Set CUPY_CACHE_DIR to change where JIT-compiled kernels are cached (the default is ~/.cupy/kernel_cache)
  • Use cupy.cuda.Device(n), either as a context manager or by calling .use(), to select which GPU to target
  • Configure the memory pool with cupy.get_default_memory_pool().set_limit(size=4*1024**3) to cap usage
  • For multi-GPU work, combine CuPy with mpi4py or NCCL communicators
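The settings above could be applied together roughly as follows. This is a sketch under stated assumptions: `configure_cupy` is a hypothetical helper, the cache path is arbitrary, and the function returns False rather than raising when CuPy or a GPU is unavailable.

```python
import os


def configure_cupy(device_id=0, limit_bytes=4 * 1024**3, cache_dir=None):
    """Apply the settings above; return True if CuPy and the GPU were found."""
    if cache_dir is not None:
        # Set before importing CuPy so it applies to every kernel compilation.
        os.environ["CUPY_CACHE_DIR"] = cache_dir
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() <= device_id:
            return False
        cp.cuda.Device(device_id).use()  # make this GPU the current device
        # Cap the default memory pool on the current device (4 GiB by default).
        cp.get_default_memory_pool().set_limit(size=limit_bytes)
        return True
    except Exception:
        # CuPy not installed, or no usable CUDA driver.
        return False


ok = configure_cupy(device_id=0,
                    cache_dir=os.path.join(os.getcwd(), ".cupy_cache"))
print("GPU configured:", ok)
```

The device is selected before the limit is set because the memory pool limit applies per device.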

Key Features

  • Drop-in NumPy replacement requiring only an import change
  • Can reach 10-100x speedups over CPU NumPy on large arrays; small arrays may see little gain because of kernel-launch and transfer overhead
  • Supports CUDA Graphs for reduced kernel-launch overhead
  • Offers experimental support for AMD GPUs through the ROCm (HIP) backend
  • Actively maintained with regular releases tracking CUDA toolkit versions

Comparison with Similar Tools

  • NumPy: CPU-only; CuPy mirrors its API on the GPU
  • JAX: JIT-compiled with a focus on automatic differentiation; CuPy is closer to a direct NumPy port
  • PyTorch Tensors: deep learning-oriented; CuPy targets general scientific computing
  • RAPIDS cuDF: GPU DataFrames for tabular data; interoperates with CuPy arrays
  • Numba: JIT-compiles Python functions, including hand-written CUDA kernels; CuPy provides pre-built array ops

FAQ

Q: Can I use CuPy without NVIDIA hardware? A: CuPy requires a CUDA-capable GPU by default, but an experimental ROCm backend supports AMD GPUs.

Q: Does CuPy work in Jupyter notebooks? A: Yes. Install the appropriate cupy wheel, and GPU arrays display just like NumPy arrays in cells.

Q: How does CuPy handle data transfer between CPU and GPU? A: Use cupy.asarray(np_array) to send data to GPU and cupy.asnumpy(cp_array) to bring it back.
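A minimal round trip using the two calls named in the answer above might look like this. The function `scaled_sum_on_device` is illustrative, and the NumPy branch is an assumption added so the example still runs without a GPU.

```python
import numpy as np

try:
    import cupy as cp
    HAS_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp = None
    HAS_GPU = False


def scaled_sum_on_device(host_array, factor):
    """Copy to the GPU, compute there, and copy the result back to the host."""
    if not HAS_GPU:
        # No GPU: compute on the host so the example still runs.
        return np.asarray(host_array * factor).sum(axis=0)
    device_array = cp.asarray(host_array)                # host -> device copy
    device_result = (device_array * factor).sum(axis=0)  # stays on the GPU
    return cp.asnumpy(device_result)                     # device -> host copy


host = np.arange(6, dtype=np.float64).reshape(2, 3)
back = scaled_sum_on_device(host, 2.0)
print(type(back).__name__, back)
```

Each transfer crosses the PCIe bus, so it usually pays to keep intermediate results on the device and copy back only the final output.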

Q: Is CuPy compatible with the latest CUDA versions? A: CuPy ships wheels for each major CUDA release. Check the installation guide for your CUDA version.
