# Together AI Batch Inference Skill for Claude Code

> Skill that teaches Claude Code Together AI's batch inference API. Run high-volume async inference jobs at up to 50% lower cost with automatic queuing and result retrieval.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

## Quick Use

```bash
npx skills add togethercomputer/skills
```

## What is This Skill?

This skill teaches AI coding agents how to use Together AI's batch inference API for high-volume, asynchronous workloads. Submit thousands of prompts in a single job, pay up to 50% less than real-time inference, and retrieve results when they are ready.

**Answer-Ready**: Together AI Batch Inference Skill for coding agents. High-volume async inference at up to 50% cost savings. Automatic queuing, progress tracking, and result retrieval. Part of the official 12-skill collection.

**Best for**: Teams running large-scale LLM inference jobs.

**Works with**: Claude Code, Cursor, Codex CLI.

## What the Agent Learns

### Submit Batch Job

```python
from together import Together

# Together() reads the API key from the TOGETHER_API_KEY environment variable
client = Together()

# Upload input file (JSONL format)
file = client.files.upload("batch_input.jsonl")

# Create batch job
batch = client.batch.create(
    input_file_id=file.id,
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    endpoint="/v1/chat/completions",
)
print(f"Batch ID: {batch.id}")
```

### Input Format (JSONL)

Each line is one self-contained request. The `custom_id` identifies the request, and `body` holds the request payload.

```json
{"custom_id": "req-1", "body": {"model": "meta-llama/Llama-3.3-70B-Instruct-Turbo", "messages": [{"role": "user", "content": "Summarize this article..."}]}}
{"custom_id": "req-2", "body": {"model": "meta-llama/Llama-3.3-70B-Instruct-Turbo", "messages": [{"role": "user", "content": "Translate to French..."}]}}
```

### Check Status & Retrieve Results

```python
# Reuses `client` and `batch` from the submit step
status = client.batch.retrieve(batch.id)
print(f"Progress: {status.completed}/{status.total}")

# Download results when done
if status.status == "completed":
    results = client.files.content(status.output_file_id)
```

## FAQ

**Q: How much cheaper is batch inference?**
A: Up to 50% cheaper than real-time inference. Exact savings depend on model and volume.

**Q: How long does it take?**
A: Results are typically available within 24 hours. Priority processing is available at standard pricing.

## Source & Thanks

> Part of [togethercomputer/skills](https://github.com/togethercomputer/skills) (MIT licensed).

---

Source: https://tokrepo.com/en/workflows/90286a47-45df-40cf-a8f0-e013e02ecbaf

Author: Script Depot
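The JSONL layout shown under "Input Format (JSONL)" (one object per line, each with a `custom_id` and a `body`) can be generated from a plain list of prompts. Below is a minimal sketch using only the Python standard library. The helper names `build_batch_lines`, `write_batch_file`, and `index_by_custom_id` are hypothetical and not part of the Together SDK. `index_by_custom_id` also assumes that each line of the downloaded results file is a JSON object carrying the `custom_id` of the request it answers, which the source does not confirm.

```python
import json
from pathlib import Path

# Model name copied from the examples in this document
MODEL = "meta-llama/Llama-3.3-70B-Instruct-Turbo"


def build_batch_lines(prompts, model=MODEL, prefix="req"):
    """Return one JSON string per prompt, in the custom_id/body layout."""
    lines = []
    for i, prompt in enumerate(prompts, start=1):
        record = {
            "custom_id": f"{prefix}-{i}",  # req-1, req-2, ...
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        # ensure_ascii=False keeps non-ASCII prompt text readable in the file
        lines.append(json.dumps(record, ensure_ascii=False))
    return lines


def write_batch_file(prompts, path="batch_input.jsonl"):
    """Write the records as JSONL (one object per line) and return the path."""
    out = Path(path)
    out.write_text("\n".join(build_batch_lines(prompts)) + "\n", encoding="utf-8")
    return out


def index_by_custom_id(jsonl_text):
    """Parse JSONL text into a dict keyed by custom_id (assumed to be present)."""
    records = {}
    for line in jsonl_text.splitlines():
        if line.strip():  # skip blank lines
            record = json.loads(line)
            records[record["custom_id"]] = record
    return records
```

Calling `write_batch_file(["Summarize this article...", "Translate to French..."])` produces a `batch_input.jsonl` matching the two-request example. Keying results by `custom_id` is convenient because a batch job is asynchronous, so it is safer not to rely on output lines arriving in input order.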