Scripts · Jul 1, 2026 · 3 min read

Volcano — Kubernetes Batch and HPC Job Scheduler

Volcano is a cloud-native batch scheduling system for Kubernetes that supports machine learning, deep learning, bioinformatics, and high-performance computing workloads with advanced scheduling policies.

Direct install command
npx -y tokrepo@latest install e3acbb23-751f-11f1-9bc6-00163e2b0d79 --target codex

Run this after confirming the plan in a dry run.

Introduction

Volcano is a CNCF incubating project that extends Kubernetes with batch scheduling capabilities. It was created to address the gap between Kubernetes' default scheduler and the demands of compute-intensive workloads such as ML training, genomics pipelines, and scientific simulations.

What Volcano Does

  • Provides gang scheduling so all pods in a job start together or not at all
  • Supports fair-share, priority, and preemption scheduling policies
  • Manages job lifecycles with dependency-aware task ordering
  • Integrates with frameworks like Spark, TensorFlow, PyTorch, and MPI
  • Offers queue-based resource management for multi-tenant clusters
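
To make "dependency-aware task ordering" concrete, here is a minimal sketch in plain Python. It is not Volcano's implementation and does not call any Volcano API; the task names and the `launch_waves` helper are hypothetical. It only shows how a task graph turns into launch waves in which every task starts after the tasks it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task graph: each task maps to the set of tasks it depends on.
depends_on = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "export": {"train"},
}

def launch_waves(graph):
    """Group tasks into waves; a task appears only after all its dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished
    return waves

waves = launch_waves(depends_on)
print(waves)  # [['preprocess'], ['train'], ['evaluate', 'export']]
```

Tasks in the same wave have no ordering constraint between them, so a scheduler is free to start them together.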

Architecture Overview

Volcano consists of three main components: the Volcano Scheduler (a separate scheduler process, not a build of kube-scheduler, that implements the batch scheduling algorithms), the Volcano Controller Manager (which reconciles the custom resources Job, Queue, and PodGroup), and the Volcano Admission Webhook (which validates and mutates those resources on admission). The components run as Deployments in the volcano-system namespace and extend the Kubernetes API with custom resource definitions.

Self-Hosting & Configuration

  • Deploy via Helm chart or YAML manifests from the official repository
  • Configure scheduling policies (actions and plugin tiers) in the scheduler configuration file, which the default install mounts from a ConfigMap
  • Set up Queues with resource quotas for multi-tenant isolation
  • Tune gang scheduling per job via minAvailable in the Job spec, which becomes minMember on the job's PodGroup
  • Monitor through Prometheus metrics exposed by the scheduler
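
As a sketch of the queue and gang-scheduling settings above, the following Python builds a Queue and a Job manifest as plain dicts and prints them as JSON, which kubectl accepts in place of YAML. The names `team-a` and `train-demo`, the image, the resource figures, and the helpers `make_queue` and `make_job` are illustrative assumptions; check field names against the CRD versions installed in your cluster before applying.

```python
import json

def make_queue(name, weight, cpu, memory):
    """Build a Volcano Queue manifest (illustrative values only)."""
    return {
        "apiVersion": "scheduling.volcano.sh/v1beta1",
        "kind": "Queue",
        "metadata": {"name": name},
        "spec": {
            "weight": weight,  # relative share among queues
            "capability": {"cpu": cpu, "memory": memory},  # upper bound for the queue
        },
    }

def make_job(name, queue, replicas, image):
    """Build a Volcano Job whose pods must all be placeable before any starts."""
    return {
        "apiVersion": "batch.volcano.sh/v1alpha1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "schedulerName": "volcano",
            "queue": queue,
            "minAvailable": replicas,  # gang size: all replicas or nothing
            "tasks": [{
                "name": "worker",
                "replicas": replicas,
                "template": {"spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "worker",
                        "image": image,
                        "command": ["sleep", "60"],
                    }],
                }},
            }],
        },
    }

queue = make_queue("team-a", weight=2, cpu="16", memory="64Gi")
job = make_job("train-demo", queue="team-a", replicas=4, image="busybox:1.36")

# Wrap both objects in a v1 List so they can be applied in one call.
manifest = json.dumps(
    {"apiVersion": "v1", "kind": "List", "items": [queue, job]}, indent=2
)
print(manifest)
```

Assuming the script is saved as `manifests.py` (a hypothetical filename), the output can be piped to `kubectl apply -f -`. Setting `minAvailable` below `replicas` would let the job start with a partial gang.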

Key Features

  • Gang scheduling ensures all-or-nothing pod allocation for distributed jobs
  • Multiple scheduling plugins: gang, binpack, DRF (dominant resource fairness, for fair sharing between jobs), and proportion (weighted sharing between queues)
  • Native CRD-based job management with task-level dependency graphs
  • Queue management with hierarchical resource allocation
  • Plugin-based scheduler architecture for custom scheduling logic
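
The DRF entry in the list above refers to dominant resource fairness. The sketch below shows only the core idea, under simplifying assumptions (a single cluster-wide capacity, no weights, no queues), and is not Volcano's plugin code. The job names, the capacities, and the helpers `dominant_share` and `pick_next` are hypothetical.

```python
def dominant_share(allocated, capacity):
    """A job's dominant share is its largest fraction of any single resource."""
    return max(allocated.get(res, 0) / total for res, total in capacity.items())

def pick_next(jobs, capacity):
    """DRF serves the job with the smallest dominant share next."""
    return min(jobs, key=lambda name: dominant_share(jobs[name], capacity))

# Hypothetical cluster capacity and current per-job allocations.
capacity = {"cpu": 100, "gpu": 8}
jobs = {
    "etl": {"cpu": 40, "gpu": 0},      # dominated by CPU: 40/100
    "training": {"cpu": 10, "gpu": 4}, # dominated by GPU: 4/8
}

shares = {name: dominant_share(alloc, capacity) for name, alloc in jobs.items()}
next_job = pick_next(jobs, capacity)
print(shares)    # {'etl': 0.4, 'training': 0.5}
print(next_job)  # etl
```

Although `training` holds far fewer CPUs, it already owns half of the scarcer GPU pool, so `etl` is served first.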

Comparison with Similar Tools

  • Default kube-scheduler — handles general workloads but lacks gang scheduling and job-level awareness
  • Apache YuniKorn — similar batch scheduler with different queue model and resource partitioning
  • Kueue — newer Kubernetes-native job queueing focused on quota management, less scheduling customization
  • Armada — multi-cluster job scheduling at larger scale, more complex setup

FAQ

Q: Does Volcano replace the default Kubernetes scheduler? A: No. Volcano runs alongside the default scheduler. You assign workloads to Volcano by setting schedulerName: volcano in your pod spec.
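
As a small illustration of the answer above, this sketch sets `schedulerName: volcano` on a Pod manifest held as a Python dict. The pod name, image, and the `use_volcano` helper are hypothetical; only the `schedulerName` field and its value come from the text.

```python
import copy
import json

def use_volcano(pod):
    """Return a copy of a Pod manifest that asks for the Volcano scheduler."""
    patched = copy.deepcopy(pod)  # leave the caller's manifest untouched
    patched.setdefault("spec", {})["schedulerName"] = "volcano"
    return patched

# Hypothetical pod; without schedulerName it goes to the default scheduler.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "batch-worker"},
    "spec": {
        "restartPolicy": "Never",
        "containers": [
            {"name": "main", "image": "busybox:1.36", "command": ["sleep", "60"]}
        ],
    },
}

volcano_pod = use_volcano(pod)
print(json.dumps(volcano_pod, indent=2))
```

Pods that omit the field keep using the default scheduler, which is how both schedulers coexist in one cluster.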

Q: Can Volcano schedule GPU workloads? A: Yes. Volcano supports GPU scheduling and can enforce topology-aware placement for multi-GPU training jobs.

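Q: What is gang scheduling and why does it matter? A: Gang scheduling places a group of pods only when there are enough resources for the whole group, or for its configured minimum. Without it, two distributed training jobs can each claim part of the cluster and wait indefinitely for the rest, holding resources that neither can use.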
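
The deadlock described above can be reproduced with a toy model. This is a sketch under strong assumptions (identical pods, one slot per pod, every job needs all of its pods), not a model of Volcano's scheduler. The job names and the three helper functions are hypothetical.

```python
def schedule_pod_by_pod(jobs, slots):
    """Hand out slots one pod at a time, round-robin, with no gang awareness."""
    placed = {name: 0 for name in jobs}
    while slots > 0:
        progressed = False
        for name, needed in jobs.items():
            if slots > 0 and placed[name] < needed:
                placed[name] += 1
                slots -= 1
                progressed = True
        if not progressed:  # every job is fully placed
            break
    return placed

def schedule_gang(jobs, slots):
    """Place a job only if all of its pods fit; otherwise place none of them."""
    placed = {}
    for name, needed in jobs.items():
        if needed <= slots:
            placed[name] = needed
            slots -= needed
        else:
            placed[name] = 0
    return placed

def runnable(jobs, placed):
    """A job can make progress only when every one of its pods is placed."""
    return [name for name, needed in jobs.items() if placed[name] == needed]

jobs = {"job-a": 4, "job-b": 4}  # pods required per job
naive = schedule_pod_by_pod(jobs, slots=6)
gang = schedule_gang(jobs, slots=6)

print(naive, runnable(jobs, naive))  # {'job-a': 3, 'job-b': 3} []
print(gang, runnable(jobs, gang))    # {'job-a': 4, 'job-b': 0} ['job-a']
```

With six slots, the pod-by-pod strategy fills the cluster yet no job can run, while the all-or-nothing strategy runs one job and leaves two slots free for other work.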

Q: Does Volcano work with managed Kubernetes services? A: Yes. Volcano runs on any conformant Kubernetes cluster including EKS, GKE, and AKS.
