# Manifest — Intelligent LLM Cost Optimization
## The Problem
Different LLM tasks have different complexity levels. Sending every request to GPT-4o or Claude Opus wastes money — many requests could be handled by cheaper models just as well.
## The Solution
Manifest analyzes each request's complexity and routes it to the cheapest model that meets the quality threshold. Simple tasks go to fast, cheap models. Complex tasks go to powerful ones.
## How It Works
1. **Request** arrives from your application.
2. **Scoring** analyzes complexity across 23 dimensions (under 2 ms).
3. **Model selection** picks the cheapest capable model.
4. **Routing** sends the request to the selected provider.
5. **Fallback** automatically retries with a different model if the first call fails.
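The steps above can be sketched as a small pipeline. This is a minimal illustration, not Manifest's actual implementation: the model names, prices, capability scores, and the toy complexity heuristic are all invented for the example.

```python
# Illustrative sketch of a score -> select -> route -> fallback pipeline.
# All model names, prices, and scoring signals here are hypothetical.
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    price_per_mtok: float   # USD per million tokens (illustrative)
    capability: float       # 0..1; higher handles more complex requests


CATALOG = [
    Model("small-fast", 0.15, 0.4),
    Model("mid-tier", 1.00, 0.7),
    Model("frontier", 10.00, 0.95),
]


def score_complexity(prompt: str) -> float:
    """Toy stand-in for the 23-dimension scorer: longer, more
    structured prompts count as more complex."""
    signals = [
        len(prompt) > 500,
        "```" in prompt,
        "step by step" in prompt.lower(),
    ]
    return 0.3 + 0.2 * sum(signals)


def select_model(complexity: float) -> Model:
    """Pick the cheapest model whose capability meets the score."""
    capable = [m for m in CATALOG if m.capability >= complexity]
    return min(capable, key=lambda m: m.price_per_mtok)


def route(prompt: str, call) -> str:
    """Try capable models cheapest-first; fall back on failure."""
    complexity = score_complexity(prompt)
    candidates = sorted(
        (m for m in CATALOG if m.capability >= complexity),
        key=lambda m: m.price_per_mtok,
    )
    for model in candidates:
        try:
            return call(model.name, prompt)
        except Exception:
            continue  # provider error: retry with the next model
    raise RuntimeError("all capable models failed")
```

The fallback loop is why step 5 costs nothing extra in the common case: the sorted candidate list is only consumed further when a provider actually errors.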
## Key Features
- 300+ models from 13+ providers
- 23-dimension scoring in under 2ms
- Up to 70% cost reduction without quality loss
- Automatic fallbacks when models fail
- Budget controls — set spending limits per model, team, or project
- Transparent decisions — dashboard shows why each request was routed where
- Direct provider access — your API keys, no middleman markup
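The budget-controls feature can be pictured as a small spend ledger checked before each request. This sketch is hypothetical; the class and field names are invented for illustration and do not reflect Manifest's actual configuration schema.

```python
# Hypothetical per-project budget enforcement: track spend against a
# limit and refuse requests once the limit is reached.
from collections import defaultdict


class BudgetGuard:
    def __init__(self, limits: dict[str, float]):
        self.limits = limits             # e.g. {"support-bot": 150.0} USD
        self.spent = defaultdict(float)  # USD spent this period

    def record(self, project: str, cost: float) -> None:
        """Add the cost of a completed request to the ledger."""
        self.spent[project] += cost

    def allows(self, project: str) -> bool:
        """True while the project is under its limit (or has none)."""
        limit = self.limits.get(project)
        return limit is None or self.spent[project] < limit
```

The same shape works for per-model or per-team limits: only the key used to index the ledger changes.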
## Supported Providers
OpenAI, Anthropic (Claude), Google (Gemini), DeepSeek, Mistral, Groq, Together AI, Fireworks, Cerebras, and more.
## Deployment Options
| Option | How to run |
|---|---|
| Cloud | Visit app.manifest.build |
| Local | `openclaw plugins install manifest` |
| Docker | `docker run -p 2099:2099 mnfst/manifest` |
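Once a local instance is listening on port 2099 (the port the Docker command above exposes), your application talks to it over HTTP. The sketch below assumes an OpenAI-compatible chat endpoint and a `"model": "auto"` convention for letting the router choose; both are assumptions for illustration, not documented Manifest behavior.

```python
# Hypothetical client for a locally running router on port 2099.
# The endpoint path and the "auto" model value are assumptions.
import json
import urllib.request


def build_request(prompt: str,
                  base_url: str = "http://localhost:2099") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request (assumed API shape)."""
    body = json.dumps({
        "model": "auto",  # hypothetical: defer model choice to the router
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Requires a running router; not executed when imported.
    with urllib.request.urlopen(build_request("Summarize this ticket")) as resp:
        print(json.load(resp))
```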
## Cost Savings Example
| Scenario | Without Manifest | With Manifest | Savings |
|---|---|---|---|
| Customer support bot | $500/mo (all GPT-4o) | $150/mo (mixed routing) | 70% |
| Code review agent | $800/mo (all Claude Opus) | $320/mo (mixed routing) | 60% |
| Data extraction pipeline | $300/mo (all GPT-4) | $90/mo (mixed routing) | 70% |
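The arithmetic behind figures like those in the table is just a blended cost over the traffic split. The request volume, per-request prices, and 80/20 split below are illustrative numbers, not measured Manifest data.

```python
# Worked example of blended-cost savings under mixed routing.
# All volumes, prices, and the traffic split are illustrative.


def blended_monthly_cost(requests: int, split: dict[float, float]) -> float:
    """split maps price-per-request (USD) -> fraction of traffic."""
    return requests * sum(price * share for price, share in split.items())


# Baseline: every request goes to an expensive frontier model.
baseline = blended_monthly_cost(100_000, {0.005: 1.0})

# Routed: 80% of requests are simple enough for a cheap model.
routed = blended_monthly_cost(100_000, {0.0005: 0.8, 0.005: 0.2})

savings = 1 - routed / baseline  # fraction of the baseline bill saved
```

With these numbers the baseline is $500/mo, the routed bill is $140/mo, and the savings come out to 72% — in the same range as the table, and driven almost entirely by what fraction of traffic the scorer can safely send to cheap models.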
## FAQ
**Q: What is Manifest?** A smart LLM router that scores requests across 23 dimensions and routes them to the cheapest capable model, cutting LLM API costs by up to 70% without quality degradation.

**Q: Is Manifest free?** The core router is open-source under the MIT license. Self-host it for free, or use the cloud version.

**Q: Does Manifest add latency?** The routing decision takes under 2 ms; the added latency is negligible compared to LLM response times.