# Together AI Dedicated Endpoints Skill for Agents

> Skill that teaches Claude Code Together AI's dedicated endpoints API. Deploy single-tenant GPU inference with autoscaling, no rate limits, and custom model configurations.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

## Quick Use

```bash
npx skills add togethercomputer/skills
```

## What is This Skill?

This skill teaches AI coding agents how to provision and manage Together AI dedicated endpoints. Deploy models on single-tenant GPUs with autoscaling, no rate limits, and custom configurations for production workloads.

**Answer-Ready**: Together AI Dedicated Endpoints Skill for coding agents. Single-tenant GPU inference with autoscaling and no rate limits. Custom model configs and production deployment. Part of official 12-skill collection.

**Best for**: Teams deploying models for production inference at scale. **Works with**: Claude Code, Cursor, Codex CLI.

## What the Agent Learns

### Create Endpoint

```python
from together import Together

client = Together()
endpoint = client.endpoints.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    hardware="gpu-h100-80gb",
    min_replicas=1,
    max_replicas=4,
    autoscale=True,
)
print(f"Endpoint URL: {endpoint.url}")
```

### Hardware Options
| GPU | VRAM | Best For |
|-----|------|----------|
| H100 80GB | 80GB | Large models, high throughput |
| H200 | 141GB | Largest models |
| A100 80GB | 80GB | Cost-effective |

### Endpoint Management

```python
# Scale
client.endpoints.update(endpoint.id, min_replicas=2)
# Monitor
status = client.endpoints.retrieve(endpoint.id)
# Delete
client.endpoints.delete(endpoint.id)
```

## FAQ

**Q: How is pricing different from serverless?**
A: Dedicated endpoints charge per GPU-hour, not per token. Better economics at sustained high volume.

## Source & Thanks

> Part of [togethercomputer/skills](https://github.com/togethercomputer/skills) — MIT licensed.

<!-- ZH -->

## 快速使用

```bash
npx skills add togethercomputer/skills
```

## 什么是这个 Skill？

教 AI Agent 管理 Together AI 专属端点，单租户 GPU 推理，自动扩缩容，无速率限制。

**一句话总结**：Together AI 专属端点 Skill，单租户 GPU + 自动扩缩容 + 无限速，官方出品。

## 来源与致谢

> [togethercomputer/skills](https://github.com/togethercomputer/skills) — MIT

---
Source: https://tokrepo.com/en/workflows/cbbd6c05-ac9c-4dd6-9fde-cf5528919ea5
Author: AI Open Source