Skills · Apr 14, 2026 · 3 min read

Ray — Distributed Computing for Python and AI Workloads

Ray is a unified framework for scaling Python and AI applications. From distributed training and hyperparameter search to large-scale data processing and model serving — Ray powers the infrastructure behind ChatGPT, Uber, and Pinterest.

Agent ready

Agent install ready

This asset can be installed after choosing the runtime, checking the plan, and running the appropriate command.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Single
Trust: Established
Entry point: step-1.md
Direct install command:
npx -y tokrepo@latest install b0f2e5e4-37db-11f1-9bc6-00163e2b0d79 --target codex

Run this after confirming the plan in dry-run.

TL;DR
Ray scales Python and AI workloads across clusters with a simple decorator API for distributed training, tuning, and serving.
§01

What it is

Ray is an open-source unified framework for scaling Python and AI applications. It handles distributed training, hyperparameter search, large-scale data processing, and model serving under a single API. You turn any Python function into a distributed task with the @ray.remote decorator.

ML engineers, data scientists, and platform teams use Ray when their workloads outgrow a single machine. Ray's ecosystem includes Ray Train (distributed training), Ray Tune (hyperparameter optimization), Ray Serve (model serving), and Ray Data (distributed data processing).

§02

How it saves time or tokens

Ray abstracts away the complexity of distributed computing. Instead of managing MPI, Dask clusters, or custom job schedulers, you write standard Python and Ray handles scheduling, fault tolerance, and resource allocation. The @ray.remote decorator turns a function into a distributed task: you call it with .remote() and collect results with ray.get(), while the function body stays unchanged.

For LLM workloads, Ray Serve can host multiple models behind a single endpoint with autoscaling, reducing infrastructure management overhead.

§03

How to use

  1. Install Ray:
pip install 'ray[default]'
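  2. Distribute a computation:
import ray

# Start a local Ray instance (or connect to an existing cluster if one is configured)
ray.init()

# @ray.remote turns the function into a task that Ray can schedule on any worker
@ray.remote
def heavy(n):
    return sum(i * i for i in range(n))

# .remote() returns immediately with a future (ObjectRef) for each task
futures = [heavy.remote(10_000_000) for _ in range(8)]
# ray.get() blocks until all results are available
results = ray.get(futures)
print(sum(results))
  3. This runs 8 tasks in parallel across the available CPU cores. On a cluster, Ray distributes them across nodes automatically.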
§04

Example

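Serving a model with Ray Serve (install it with pip install 'ray[serve]'):

from ray import serve
from starlette.requests import Request

@serve.deployment
class Classifier:
    def __init__(self):
        # load_model() is a placeholder for your own model-loading code
        self.model = load_model()

    async def __call__(self, request: Request):
        # Starlette's Request.json() is a coroutine, so it must be awaited
        data = await request.json()
        return self.model.predict(data['input'])

serve.run(Classifier.bind())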

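This deploys the model as an HTTP endpoint, by default on port 8000 of the local machine. Ray Serve health-checks replicas on its own; autoscaling and request batching are available but opt-in, through the deployment's autoscaling_config and the @serve.batch decorator.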

§05

Common pitfalls

  • Ray's object store consumes significant memory. On machines with limited RAM, set object_store_memory in ray.init() to prevent out-of-memory errors.
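  • Serialization errors occur when you pass objects that cannot be pickled (for example locks, open file handles, or database connections) between tasks. Ray serializes arguments and return values with cloudpickle, so convert such objects to serializable formats before passing them, or register a custom serializer with ray.util.register_serializer.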
  • Starting Ray on a cluster requires matching Python and Ray versions across all nodes. Use Docker images or Conda environments to ensure consistency.
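The serialization pitfall can be reproduced without a cluster. The sketch below uses the standard library's `pickle` as a stand-in for Ray's cloudpickle-based serialization, which rejects locks for the same reason; the `Job` class and the `is_picklable` helper are hypothetical names for this example.

```python
import pickle
import threading

class Job:
    """Hypothetical object that carries an unpicklable attribute."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # locks cannot be pickled

def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False

job = Job("resize-images")
job_ok = is_picklable(job)

# Pass only plain data across task boundaries and rebuild the rest remotely.
payload = {"name": job.name}
payload_ok = is_picklable(payload)

print(job_ok, payload_ok)
```

Checking arguments this way before submitting them to a remote function makes the failure show up locally rather than inside a worker.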

Frequently asked questions

How does Ray compare to Dask?

Dask focuses on parallelizing NumPy and Pandas operations with familiar APIs. Ray is more general-purpose, handling arbitrary Python functions, actor-based concurrency, and ML-specific workloads like distributed training and model serving. Many teams use Ray for ML pipelines and Dask for data analytics.

Can Ray run on Kubernetes?

Yes. The KubeRay operator deploys and manages Ray clusters on Kubernetes. It handles autoscaling, fault tolerance, and resource allocation. Ray also supports YARN and Slurm for HPC environments.

What is Ray Serve?

Ray Serve is Ray's model serving framework. It deploys ML models as scalable HTTP endpoints with features like request batching, model composition, A/B testing, and autoscaling. It runs on top of Ray's distributed runtime.

Does Ray support GPU workloads?

Yes. You can request GPU resources in the @ray.remote decorator: `@ray.remote(num_gpus=1)`. Ray's scheduler assigns tasks to nodes with available GPUs and handles multi-GPU distribution for training.

Is Ray suitable for small-scale projects?

Yes. Ray works on a single machine for local parallelism (replacing multiprocessing) and scales to clusters when needed. The overhead of ray.init() on a laptop is minimal, so you can develop locally and deploy to a cluster without code changes.

Sources cited (3)
  • Ray GitHub: Ray is a unified framework for scaling Python and AI applications
  • Ray Documentation: Ray ecosystem includes Train, Tune, Serve, and Data libraries
  • KubeRay GitHub: KubeRay operator for running Ray on Kubernetes
