Cette page est affichée en anglais. Une traduction française est en cours.

SkillsApr 14, 2026·3 min de lecture

Ray — Distributed Computing for Python and AI Workloads

Ray is a unified framework for scaling Python and AI applications. From distributed training and hyperparameter search to large-scale data processing and model serving — Ray powers the infrastructure behind ChatGPT, Uber, and Pinterest.

Script Depot · Community

Prêt pour agents

Installation agent prête

Cet actif peut être installé après choix du runtime, vérification du plan et exécution de la commande adaptée.

Native · 98/100Policy : autoriser

Surface agent

Tout agent MCP/CLI

Type

Skill

Installation

Single

Confiance

Confiance : Established

Point d'entrée

step-1.md

Commande d'installation directe

npx -y tokrepo@latest install b0f2e5e4-37db-11f1-9bc6-00163e2b0d79 --target codex

À exécuter après confirmation du plan en dry-run.

TL;DR

Ray scales Python and AI workloads across clusters with a simple decorator API for distributed training, tuning, and serving.

§01

What it is

Ray is an open-source unified framework for scaling Python and AI applications. It handles distributed training, hyperparameter search, large-scale data processing, and model serving under a single API. You turn any Python function into a distributed task with the @ray.remote decorator.

ML engineers, data scientists, and platform teams use Ray when their workloads outgrow a single machine. Ray's ecosystem includes Ray Train (distributed training), Ray Tune (hyperparameter optimization), Ray Serve (model serving), and Ray Data (distributed data processing).

§02

How it saves time or tokens

Ray abstracts away the complexity of distributed computing. Instead of managing MPI, Dask clusters, or custom job schedulers, you write standard Python and Ray handles scheduling, fault tolerance, and resource allocation. The @ray.remote decorator converts a function call into a distributed task without rewriting your code.

For LLM workloads, Ray Serve can host multiple models behind a single endpoint with autoscaling, reducing infrastructure management overhead.

§03

How to use

Install Ray:

pip install 'ray[default]'

Distribute a computation:

import ray
ray.init()

@ray.remote
def heavy(n):
    return sum(i * i for i in range(n))

futures = [heavy.remote(10_000_000) for _ in range(8)]
results = ray.get(futures)
print(sum(results))

This runs 8 parallel tasks across available CPU cores. On a cluster, it distributes across multiple nodes automatically.

§04

Example

Serving a model with Ray Serve:

from ray import serve

@serve.deployment
class Classifier:
    def __init__(self):
        self.model = load_model()

    def __call__(self, request):
        data = request.json()
        return self.model.predict(data['input'])

serve.run(Classifier.bind())

This deploys a model endpoint with built-in autoscaling, health checks, and request batching.

§05

Related on TokRepo

AI Tools for Coding -- Development tools for building AI applications
Local LLM Providers -- Run LLMs locally with distributed inference via Ray

§06

Common pitfalls

Ray's object store consumes significant memory. On machines with limited RAM, set object_store_memory in ray.init() to prevent out-of-memory errors.
Serialization errors occur when passing non-picklable objects between tasks. Use Ray's built-in serialization or convert objects to serializable formats before passing.
Starting Ray on a cluster requires matching Python and Ray versions across all nodes. Use Docker images or Conda environments to ensure consistency.

Questions fréquentes

How does Ray compare to Dask?+

Dask focuses on parallelizing NumPy and Pandas operations with familiar APIs. Ray is more general-purpose, handling arbitrary Python functions, actor-based concurrency, and ML-specific workloads like distributed training and model serving. Many teams use Ray for ML pipelines and Dask for data analytics.

Can Ray run on Kubernetes?+

Yes. The KubeRay operator deploys and manages Ray clusters on Kubernetes. It handles autoscaling, fault tolerance, and resource allocation. Ray also supports YARN and Slurm for HPC environments.

What is Ray Serve?+

Ray Serve is Ray's model serving framework. It deploys ML models as scalable HTTP endpoints with features like request batching, model composition, A/B testing, and autoscaling. It runs on top of Ray's distributed runtime.

Does Ray support GPU workloads?+

Yes. You can request GPU resources in the @ray.remote decorator: `@ray.remote(num_gpus=1)`. Ray's scheduler assigns tasks to nodes with available GPUs and handles multi-GPU distribution for training.

Is Ray suitable for small-scale projects?+

Yes. Ray works on a single machine for local parallelism (replacing multiprocessing) and scales to clusters when needed. The overhead of ray.init() on a laptop is minimal, so you can develop locally and deploy to a cluster without code changes.

Sources citées (3)

Ray GitHub— Ray is a unified framework for scaling Python and AI applications
Ray Documentation— Ray ecosystem includes Train, Tune, Serve, and Data libraries
KubeRay GitHub— KubeRay operator for running Ray on Kubernetes

En lien sur TokRepo

AI Coding Tools Local LLM Providers Featured Workflows

Fil de discussion

Connectez-vous pour rejoindre la discussion.

Aucun commentaire pour l'instant. Soyez le premier à partager votre avis.

Actifs similaires

Daft — High-Performance Data Engine for AI Workloads

Daft is a distributed DataFrame library written in Rust with Python bindings. It is designed for AI and multimodal workloads, handling images, audio, video, and structured data in a unified API that scales from a laptop to a Ray cluster.

Skills

Script Depot

Hazelcast — Real-Time Distributed Computing Platform

Hazelcast is an open-source distributed computing platform providing in-memory data structures, stream processing, and compute capabilities for low-latency applications at scale.

Skills

AI Open Source

Dask — Parallel Computing and Scalable DataFrames for Python

Dask is a flexible parallel computing library for Python that scales NumPy, pandas, and scikit-learn workflows from a laptop to a distributed cluster with minimal code changes.

Skills

AI Open Source

Hyperopt — Distributed Hyperparameter Optimization in Python

Hyperopt uses Tree of Parzen Estimators and random search to efficiently optimize hyperparameters for machine learning models, with optional distributed execution via MongoDB.

Skills

Script Depot