Quick Use
- Have LiteLLM Proxy running with a Postgres database_url
- Visit http://localhost:4000/ui and log in with UI_USERNAME / UI_PASSWORD
- Generate a team + user with budgets via the API snippets below
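To confirm the proxy is up before opening the UI, hit its health endpoint. A minimal sketch, assuming the sk-master key from the config below:

# Sanity check: should return per-model health once the proxy is running
curl http://localhost:4000/health \
  -H "Authorization: Bearer sk-master"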
Intro
LiteLLM's cost-tracking layer attributes every LLM call to a project, user, and tag, then surfaces the data in a built-in dashboard. Set hard budgets per team: when a budget is hit, the proxy returns HTTP 429 instead of forwarding the request. Best for: any team where "who's burning our LLM budget?" is a recurring question. Works with: LiteLLM Proxy (self-hosted), LiteLLM Cloud (managed). Setup time: 5 minutes (Postgres + Redis + .env).
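Once the proxy is configured as shown below, any OpenAI-compatible request routed through it is cost-attributed automatically. A minimal sketch, assuming a claude-3-5-sonnet model has been added to the proxy and the sk-master key from the config:

# Every call through the proxy is logged and attributed
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-master" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-3-5-sonnet", "messages": [{"role": "user", "content": "hi"}]}'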
Enable in proxy config
# config.yaml
general_settings:
  master_key: sk-master
  database_url: postgresql://litellm:pass@db:5432/litellm
  store_model_in_db: true
  spend_logs_max_age: "90d" # auto-prune

litellm_settings:
  callbacks: ["langfuse", "prometheus"] # optional, ship to your observability stack

docker compose up -d # spins up proxy + Postgres

Generate keys with budgets
# Per-team
curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer sk-master" \
  -H "Content-Type: application/json" \
  -d '{"team_alias": "frontend-team", "max_budget": 1000, "budget_duration": "30d"}'

# Per-user (within a team)
curl -X POST http://localhost:4000/user/new \
  -H "Authorization: Bearer sk-master" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "alice@acme.com", "team_id": "frontend-team", "max_budget": 50}'

# Generate a key for that user
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-master" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "alice@acme.com", "max_budget": 50}'

When alice@acme.com hits $50, all subsequent calls return 429.
Tag every request
from openai import OpenAI

# Point the standard OpenAI client at the LiteLLM proxy
client = OpenAI(base_url="http://localhost:4000", api_key="sk-...")

client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[...],
    extra_body={
        "tags": ["feature:onboarding", "env:prod", "user-tier:enterprise"],
    },
)

The dashboard then shows spend grouped by any tag combination: "How much did onboarding cost last week?" "What's enterprise vs. free spend?"
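The same breakdown is queryable over the API. A sketch assuming the proxy's /spend/tags endpoint and its start_date / end_date filters:

# Spend aggregated by tag for a date range (filter params are assumptions)
curl "http://localhost:4000/spend/tags?start_date=2024-01-01&end_date=2024-01-31" \
  -H "Authorization: Bearer sk-master"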
Built-in dashboard
Visit http://localhost:4000/ui (credentials come from the UI_USERNAME / UI_PASSWORD env vars). Tabs:
- Spend — by team, user, model, tag, date
- Keys — generate, rotate, revoke
- Models — health status, RPM/TPM consumed
- Logs — every call with prompt + response (configurable retention)
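The Logs tab has an API counterpart. A sketch assuming the /spend/logs endpoint and its user_id filter:

# Raw spend logs for one user (the same records the Logs tab shows)
curl "http://localhost:4000/spend/logs?user_id=alice@acme.com" \
  -H "Authorization: Bearer sk-master"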
Export to your warehouse
litellm_settings:
  callbacks: ["s3"]
  s3_callback_params:
    s3_bucket_name: my-llm-logs
    s3_region_name: us-east-1

Each call lands as a JSON line in S3; query it with Athena or DuckDB.
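A sketch of a DuckDB query over that bucket; the model and response_cost field names and the object layout are assumptions about the log schema, so adjust to what actually lands in your bucket:

# Assumes S3 credentials are already configured for DuckDB (e.g. via CREATE SECRET)
duckdb -c "
INSTALL httpfs;
LOAD httpfs;
SELECT model, SUM(response_cost) AS total_cost
FROM read_json_auto('s3://my-llm-logs/**/*.json')
GROUP BY model
ORDER BY total_cost DESC;
"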
FAQ
Q: Does this require LiteLLM Pro?
A: No. Cost tracking, the dashboard, per-team budgets, and S3 export are all in the open-source proxy. Pro adds SOC2 attestation, SSO/SAML, and managed hosting.
Q: How accurate is the cost tracking?
A: LiteLLM uses each provider's official token counts plus a per-model pricing table to calculate cost per call. For models without published prices (e.g. local Ollama), set input_cost_per_token / output_cost_per_token in config.yaml; otherwise those calls are logged at $0.
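A sketch of per-model pricing in config.yaml, assuming a local Ollama model; the alias and the rates are illustrative:

model_list:
  - model_name: local-llama3            # illustrative alias
    litellm_params:
      model: ollama/llama3
      input_cost_per_token: 0.0000002   # illustrative $/token rate
      output_cost_per_token: 0.0000004  # illustrative $/token rate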
Q: Can I block PII before sending to providers?
A: Yes. LiteLLM has a Guardrails layer (regex-based or via Presidio / Lakera) that runs before the request goes upstream. Combined with cost tracking, you can also block specific tags from going to specific providers.
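A sketch of a Presidio-backed pre-call guardrail in config.yaml; the field names follow LiteLLM's guardrails config, but exact support varies by version:

guardrails:
  - guardrail_name: "presidio-pii"   # illustrative name
    litellm_params:
      guardrail: presidio
      mode: "pre_call"   # runs before the request goes upstream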
Source & Thanks
Built by BerriAI. Licensed under MIT.
BerriAI/litellm — Spend Tracking — ⭐ 17,000+