CLI Tools · May 11, 2026 · 2 min read

Olive — Optimize Models for Faster Inference

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and the raw content to help agents judge fit, risk, and next actions.

Stage only · 29/100
Agent surface: Any MCP/CLI agent
Type: CLI Tool
Installation: Single
Trust: Established
Entry point: README.md
Universal CLI command:
npx tokrepo install 46ee49fb-a2a1-4d36-af94-e6fb4b7fa220
Introduction

Olive automates model optimization via a CLI so teams can reduce latency and cost (e.g., quantization/ONNX paths) before serving models in apps or agents.

  • Best for: Teams serving models who want a repeatable optimization pipeline (CLI-first, configurable)
  • Works with: Python environments + the Olive CLI (install sketch below); integrates with model download flows and hardware-specific optimization paths
  • Setup time: 30 minutes
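
As a quick setup sketch, assuming a standard Python environment (Olive is published on PyPI as olive-ai; check the README for supported Python versions):

  # Create an isolated environment and install the Olive CLI
  python -m venv .venv && source .venv/bin/activate
  pip install olive-ai
  # Confirm the CLI is on PATH and list the subcommands your version provides
  olive --help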

Practical Notes

  • Setup time ~30 minutes (env + install + one optimize run)
  • Quantitative knob from README: --precision int4 is an explicit, measurable target (example after this list)
  • GitHub stars + forks (verified): see Source & Thanks
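
As an illustration of that knob, a minimal sketch: olive auto-opt and --precision follow the CLI examples in the README, but verify flag names against olive --help for your installed version; the model ID and output path below are placeholders.

  # Hypothetical invocation: request an int4 precision target for a placeholder model
  olive auto-opt \
    --model_name_or_path microsoft/Phi-3-mini-4k-instruct \
    --precision int4 \
    --output_path ./optimized-model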

In agent products, optimization is often the cheapest “quality win”: you can keep the same prompts and tools while reducing latency enough to make multi-step plans feasible.

Practical workflow:

  1. Define a target metric (latency, memory, cost) and hardware target.
  2. Run Olive optimizations from a config or scripted CLI invocation (sketch after this list).
  3. Benchmark the optimized model in your actual agent loop (not only in an isolated benchmark).
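
A minimal scripted version of that loop, assuming a config-driven run: olive run --config matches the README's pattern, while the config filename and the benchmark script are hypothetical placeholders for your own assets.

  # Steps 1-2: the target metric and hardware target live in the versioned config
  olive run --config olive_config.json
  # Step 3: benchmark inside the real agent loop, not only in isolation
  python benchmark_agent_loop.py --model ./optimized-model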

Treat artifacts as build outputs: version them, and attach the exact command/config used so results are reproducible.
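
One way to do that with standard tooling; the paths and manifest name here are illustrative, not part of Olive:

  # Hash the artifacts and store them next to the exact config used
  sha256sum ./optimized-model/* > artifacts.sha256
  mkdir -p release/v1 && cp olive_config.json artifacts.sha256 release/v1/
  git add release/v1 && git commit -m "Olive auto-opt, --precision int4"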

FAQ

Q: Is Olive only for ONNX? A: The README highlights ONNX-related paths, but the project is positioned as a general model optimization toolkit with configurable pipelines.

Q: How do I know optimization helped agents? A: Measure end-to-end agent latency and success rate with the optimized model in the loop.

Q: What should I version-control? A: Your Olive config/commands plus benchmark notes and artifact hashes/paths.

Source & Thanks

Source: https://github.com/microsoft/Olive · License: MIT · GitHub stars: 2,312 · forks: 295
