
Skills · Apr 8, 2026 · 1 min read

Together AI Dedicated Containers Skill for Agents

Skill that teaches Claude Code Together AI's container deployment API. Run custom Docker inference workers on managed GPU infrastructure with full environment control.

Together AI · Community

Agent ready

Agent-ready install

This asset can be installed after you choose the runtime, review the plan, and run the corresponding command.

Native · 98/100 · Policy: allow

Agent surface

Any MCP/CLI agent

Type

Skill

Install

Single

Trust

Trust: Community

Entry

Together AI Dedicated Containers Skill for Agents

Direct install command

npx -y tokrepo@latest install 4d4e267f-143c-4a16-a715-72206e5aad38 --target codex

Run after confirming the plan with a dry run.

TL;DR

A Claude Code skill for deploying custom Docker inference workers on Together AI GPU infrastructure.

§01

What it is

This skill teaches Claude Code how to use Together AI's dedicated container deployment API. It enables AI agents to deploy custom Docker images as inference workers on managed GPU infrastructure with full environment control, scaling configuration, and health monitoring.

The skill targets developers who use Claude Code to manage AI infrastructure and want their agent to handle container deployments on Together AI's GPU cloud.

§02

How it saves time or tokens

Without this skill, deploying containers on Together AI requires reading API docs, constructing JSON payloads, and managing authentication manually. The skill gives Claude Code the exact API patterns, so you describe what you want in natural language and the agent handles the REST calls, environment configuration, and deployment verification.

§03

How to use

Add this skill to your Claude Code project configuration.
Set your Together AI API key as an environment variable.
Ask Claude Code to deploy, scale, or manage your inference containers.
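
Step 2 can be sketched in Python as follows. This is a minimal sketch that assumes the key is exported as TOGETHER_API_KEY; the variable name and the helper name build_auth_headers are illustrative assumptions, not part of the skill itself.

```python
import os


def build_auth_headers(env_var: str = "TOGETHER_API_KEY") -> dict[str, str]:
    """Build Bearer-token headers from an environment variable.

    The default variable name is an assumption; adjust it to your setup.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an empty token
        raise RuntimeError(f"{env_var} is not set; export it before deploying")
    return {"Authorization": f"Bearer {api_key}"}
```

Reading the key at call time keeps it out of source control and lets the agent report a missing key before any request is sent.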

§04

Example

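import os

import requests

# Read the API key from the environment instead of hardcoding it
TOGETHER_API_KEY = os.environ['TOGETHER_API_KEY']

# Deploy a custom inference container
response = requests.post(
    'https://api.together.xyz/v1/dedicated/containers',
    headers={'Authorization': f'Bearer {TOGETHER_API_KEY}'},
    json={
        # The image must be pullable from Together AI's infrastructure
        'image': 'my-registry/my-model:latest',
        'gpu_type': 'NVIDIA_A100_80GB',
        'num_gpus': 1,
        # Environment variables passed to the container
        'env': {
            'MODEL_NAME': 'my-custom-model',
            'MAX_BATCH_SIZE': '32',
        },
    },
    timeout=30,  # avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # raise on 4xx/5xx instead of printing an error body
print(response.json())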

§05

Related on TokRepo

AI Tools for DevOps -- infrastructure deployment and management tools
AI Tools for Automation -- workflow automation for AI infrastructure

§06

Common pitfalls

GPU availability varies by type and region. A100 80GB instances may have queues during peak demand. Check availability before committing to a deployment timeline.
Container images must be accessible from Together AI's infrastructure. Use a public registry or configure registry credentials in the API call.
Dedicated containers have a minimum billing period. Shut down unused containers promptly to avoid unnecessary costs.

Frequently asked questions

What GPU types does Together AI offer for dedicated containers?

Together AI offers NVIDIA A100 (40GB and 80GB), H100, and other GPU types depending on availability. Check the Together AI documentation for the current list and pricing.

Can I use custom Docker images?

Yes. You provide your own Docker image with your model and serving code. Together AI runs it on their GPU infrastructure with the environment variables and ports you specify.

How does scaling work?

You specify the number of GPUs and replicas in the deployment configuration. Together AI manages the infrastructure scaling. You can update replica counts through the API.
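
As a rough illustration of the scaling configuration, the sketch below validates and builds the scaling portion of a request body. The num_gpus field comes from the example in §04; the replicas field name and the helper build_scaling_config are assumptions, so confirm the real field names in the Together AI documentation.

```python
def build_scaling_config(num_gpus: int, replicas: int) -> dict[str, int]:
    """Return the scaling portion of a deployment payload.

    'replicas' is an assumed field name used only for illustration.
    """
    for name, value in (("num_gpus", num_gpus), ("replicas", replicas)):
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return {"num_gpus": num_gpus, "replicas": replicas}


# Scale a hypothetical deployment to 3 replicas of single-GPU workers
update_payload = build_scaling_config(num_gpus=1, replicas=3)
print(update_payload)
```

Validating locally catches typos such as a zero or negative replica count before the request reaches the API.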

Is this skill Claude Code specific?

The skill is designed for Claude Code, but the underlying API knowledge applies to any AI agent or manual workflow. The skill format follows Claude Code's Agent Skills convention (a SKILL.md file with instructions).

How do I monitor deployed containers?

Together AI provides health check endpoints and status APIs. The skill teaches Claude Code how to query container status, check logs, and verify that the deployment is healthy.
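
A deployment check of this kind usually comes down to a polling loop. The sketch below is one possible shape, not the skill's actual implementation: fetch_status is a caller-supplied function (hypothetical) that returns the current status string, and the status values "running", "failed", and "error" are placeholders for whatever the status API really returns.

```python
import time
from typing import Callable


def wait_until_healthy(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    poll_interval_s: float = 10.0,
    healthy: str = "running",
    failed: tuple[str, ...] = ("failed", "error"),
) -> bool:
    """Poll fetch_status until the container is healthy, fails, or times out.

    Returns True when healthy and False on timeout; raises RuntimeError if
    the container reaches one of the (assumed) terminal failure states.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == healthy:
            return True
        if status in failed:
            raise RuntimeError(f"deployment entered terminal state: {status}")
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

In practice fetch_status would wrap a GET request to the container's status endpoint, sent with the same Authorization header as the deployment call.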




Source and acknowledgments

Part of togethercomputer/skills, MIT licensed.

