Prompts · April 8, 2026 · 1 min read

Anthropic Prompt Caching — Cut AI API Costs 90%

Use Anthropic's prompt caching to reduce Claude API costs by up to 90%. Cache system prompts, tool definitions, and long documents across requests for massive savings.

What is Prompt Caching?

Anthropic's prompt caching stores repeated prompt content (system prompts, tool definitions, long documents) server-side so it can be reused across requests. Cache reads are billed at only 10% of the base input-token price, while cache writes cost 25% more than a normal input token — so caching pays off after the second request.

TL;DR: Cache system prompts, tool definitions, and documents with Anthropic prompt caching. Cache reads cost 1/10 of normal input tokens. The 5-minute TTL refreshes on every cache hit. A must-use for production Claude apps — up to 90% savings.
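As a minimal sketch of how this looks in practice: you mark a content block with cache_control, and the API caches everything in the prompt up to and including that block. The model name and prompt below are placeholders, and the payload is built as a plain dict rather than sent, so substitute your own values.

```python
# Sketch: marking a long system prompt as cacheable.
# The "cache_control": {"type": "ephemeral"} field on a content block
# sets a cache breakpoint: everything up to that block gets cached.

LONG_SYSTEM_PROMPT = "You are a support assistant. ... (1024+ tokens of instructions)"

def build_request(user_message: str) -> dict:
    """Build a Messages API payload with a cacheable system prompt."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder model id, use your own
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": LONG_SYSTEM_PROMPT,
                "cache_control": {"type": "ephemeral"},  # cache breakpoint
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }

# With the official SDK this payload would be sent as:
#   client = anthropic.Anthropic()
#   response = client.messages.create(**build_request("How do I reset my password?"))
```

The first request writes the system prompt to the cache; subsequent requests within the TTL read it back at the discounted rate.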

Cacheable Content

  1. System prompts — most common case
  2. Tool definitions — big wins with many tools
  3. RAG documents — multi-turn Q&A over the same doc
  4. Multi-turn conversation prefix — cache early context
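For the tool-definitions case, a single cache breakpoint on the last tool covers the whole tool list, since the cache prefix includes everything before the breakpoint. The helper below is a hypothetical convenience wrapper, not part of the SDK:

```python
# Sketch: caching many tool definitions by marking only the LAST tool.
# The cache prefix extends backward, so one breakpoint covers all tools.

TOOLS = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location.",
        "input_schema": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
    # ... dozens more tool definitions ...
]

def with_tool_caching(tools: list[dict]) -> list[dict]:
    """Return a copy of the tool list with a cache breakpoint on the final tool."""
    tools = [dict(t) for t in tools]  # shallow-copy so the input isn't mutated
    tools[-1]["cache_control"] = {"type": "ephemeral"}
    return tools
```

Pass the result as the tools parameter of a Messages API request; with many tools, the definitions often dominate the prompt, which is where the "big wins" come from.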

Best Practices

  • Cache the longest, most stable content first
  • Cached content must be a contiguous prefix of the prompt — everything before the cache breakpoint
  • Monitor cache_read_input_tokens in the response usage to confirm hits
  • Minimum cacheable prefix is 1024 tokens (2048 on Haiku models) — shorter prefixes are not cached
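The monitoring tip above can be sketched as a small helper. API responses report cache_creation_input_tokens (cache writes) and cache_read_input_tokens (cache hits) alongside input_tokens in the usage block; the example usage values below are illustrative, not real output:

```python
# Sketch: confirming cache hits from a response's usage block.

def cache_hit_ratio(usage: dict) -> float:
    """Fraction of prompt tokens served from cache on this request."""
    read = usage.get("cache_read_input_tokens", 0)
    written = usage.get("cache_creation_input_tokens", 0)
    fresh = usage.get("input_tokens", 0)
    total = read + written + fresh
    return read / total if total else 0.0

# Illustrative usage block from a second request hitting a warm cache:
usage = {
    "input_tokens": 50,                  # only the new user turn
    "cache_creation_input_tokens": 0,    # nothing new written
    "cache_read_input_tokens": 2000,     # system prompt served from cache
}
print(f"cache hit ratio: {cache_hit_ratio(usage):.0%}")
```

A ratio near zero on repeated requests means the cache is being missed — usually because the prefix changed, fell below the minimum token count, or the TTL expired.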

FAQ

Q: Does it affect quality? A: No — the model sees identical input either way.

Q: Does Claude Code use it? A: Yes — CLAUDE.md and tool definitions are cached automatically.
