Esta página se muestra en inglés. Una traducción al español está en curso.
KnowledgeMay 7, 2026·4 min de lectura

LiteLLM Cost Tracking — Per-Project LLM Spend Dashboard

LiteLLM ships a built-in cost dashboard. Track LLM spend by project, user, model, tag. Hard budgets that block at the proxy. SOC2 / SSO via Pro tier.

Listo para agents

Staging seguro para este activo

Este activo primero queda en staging. El prompt copiado pide inspeccionar los archivos staged antes de activar scripts, config MCP o config global.

Stage only · 27/100Política: staging
Superficie agent
Cualquier agent MCP/CLI
Tipo
Knowledge
Instalación
Stage only
Confianza
Confianza: Community
Entrada
Asset
Comando de staging seguro
npx -y tokrepo@latest install 72b2e16c-71b4-4702-87ed-f6ea3ba99f69 --target codex

Primero deja archivos en staging; la activación requiere revisar el README y el plan staged.

Introducción

LiteLLM's cost-tracking layer attributes every LLM call to a project, user, and tag, then surfaces it in a built-in dashboard. Set hard budgets per team — when the budget is hit, the proxy returns 429 instead of forwarding the request. Best for: any team where 'who's burning our LLM budget' is a recurring question. Works with: LiteLLM Proxy (self-host), LiteLLM Cloud (managed). Setup time: 5 minutes (Postgres + Redis + .env).


Enable in proxy config

# config.yaml
general_settings:
  master_key: sk-master
  database_url: postgresql://litellm:pass@db:5432/litellm
  store_model_in_db: true
  spend_logs_max_age: "90d"  # auto-prune

litellm_settings:
  callbacks: ["langfuse", "prometheus"]  # optional, ship to your observability
docker compose up -d  # spins up proxy + Postgres

Generate keys with budgets

# Per-team
curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"team_alias": "frontend-team", "max_budget": 1000, "budget_duration": "30d"}'

# Per-user (within a team)
curl -X POST http://localhost:4000/user/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "team_id": "frontend-team", "max_budget": 50}'

# Generate a key for that user
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "max_budget": 50}'

When alice@acme.com hits $50, all subsequent calls return 429.

Tag every request

client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[...],
    extra_body={
        "tags": ["feature:onboarding", "env:prod", "user-tier:enterprise"],
    },
)

The dashboard then shows spend grouped by any tag combination — "how much did onboarding cost last week?" "What's enterprise vs free spend?".

Built-in dashboard

Visit http://localhost:4000/ui (default password from UI_USERNAME / UI_PASSWORD). Tabs:

  • Spend — by team, user, model, tag, date
  • Keys — generate, rotate, revoke
  • Models — health status, RPM/TPM consumed
  • Logs — every call with prompt + response (configurable retention)

Export to your warehouse

litellm_settings:
  callbacks: ["s3"]
  s3_callback_params:
    s3_bucket_name: my-llm-logs
    s3_region_name: us-east-1

Each call lands as a JSON line in S3 — query with Athena / DuckDB.


FAQ

Q: Does this require LiteLLM Pro? A: No — cost tracking, the dashboard, per-team budgets, and S3 export are all in the open-source proxy. Pro adds SOC2 attestation, SSO/SAML, and managed hosting.

Q: How accurate is the cost tracking? A: LiteLLM uses each provider's official token-count + per-model pricing table to calculate cost per call. For models without published prices (e.g. local Ollama), set input_cost_per_token / output_cost_per_token in config.yaml or it logs as $0.

Q: Can I block PII before sending to providers? A: Yes — LiteLLM has a Guardrails layer (regex-based or via Presidio / Lakera) that runs before the request goes upstream. Combined with cost tracking, you can also block specific tags from going to specific providers.


Quick Use

  1. Have LiteLLM Proxy running with a Postgres database_url
  2. Visit http://localhost:4000/ui and log in with UI_USERNAME / UI_PASSWORD
  3. Generate a team + user with budgets via the API snippets below

Intro

LiteLLM's cost-tracking layer attributes every LLM call to a project, user, and tag, then surfaces it in a built-in dashboard. Set hard budgets per team — when the budget is hit, the proxy returns 429 instead of forwarding the request. Best for: any team where 'who's burning our LLM budget' is a recurring question. Works with: LiteLLM Proxy (self-host), LiteLLM Cloud (managed). Setup time: 5 minutes (Postgres + Redis + .env).


Enable in proxy config

# config.yaml
general_settings:
  master_key: sk-master
  database_url: postgresql://litellm:pass@db:5432/litellm
  store_model_in_db: true
  spend_logs_max_age: "90d"  # auto-prune

litellm_settings:
  callbacks: ["langfuse", "prometheus"]  # optional, ship to your observability
docker compose up -d  # spins up proxy + Postgres

Generate keys with budgets

# Per-team
curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"team_alias": "frontend-team", "max_budget": 1000, "budget_duration": "30d"}'

# Per-user (within a team)
curl -X POST http://localhost:4000/user/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "team_id": "frontend-team", "max_budget": 50}'

# Generate a key for that user
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "max_budget": 50}'

When alice@acme.com hits $50, all subsequent calls return 429.

Tag every request

client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[...],
    extra_body={
        "tags": ["feature:onboarding", "env:prod", "user-tier:enterprise"],
    },
)

The dashboard then shows spend grouped by any tag combination — "how much did onboarding cost last week?" "What's enterprise vs free spend?".

Built-in dashboard

Visit http://localhost:4000/ui (default password from UI_USERNAME / UI_PASSWORD). Tabs:

  • Spend — by team, user, model, tag, date
  • Keys — generate, rotate, revoke
  • Models — health status, RPM/TPM consumed
  • Logs — every call with prompt + response (configurable retention)

Export to your warehouse

litellm_settings:
  callbacks: ["s3"]
  s3_callback_params:
    s3_bucket_name: my-llm-logs
    s3_region_name: us-east-1

Each call lands as a JSON line in S3 — query with Athena / DuckDB.


FAQ

Q: Does this require LiteLLM Pro? A: No — cost tracking, the dashboard, per-team budgets, and S3 export are all in the open-source proxy. Pro adds SOC2 attestation, SSO/SAML, and managed hosting.

Q: How accurate is the cost tracking? A: LiteLLM uses each provider's official token-count + per-model pricing table to calculate cost per call. For models without published prices (e.g. local Ollama), set input_cost_per_token / output_cost_per_token in config.yaml or it logs as $0.

Q: Can I block PII before sending to providers? A: Yes — LiteLLM has a Guardrails layer (regex-based or via Presidio / Lakera) that runs before the request goes upstream. Combined with cost tracking, you can also block specific tags from going to specific providers.


Source & Thanks

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — Spend Tracking — ⭐ 17,000+

🙏

Fuente y agradecimientos

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — Spend Tracking — ⭐ 17,000+

Discusión

Inicia sesión para unirte a la discusión.
Aún no hay comentarios. Sé el primero en compartir tus ideas.

Activos relacionados