Together Dedicated Containers

Overview

Run custom Dockerized inference workloads on Together's managed GPU infrastructure. You bring the container — Together handles compute, autoscaling, networking, and observability.

Components:

  • Jig CLI: Build, push, and deploy containers
  • Sprocket SDK: Python SDK for handling inference requests inside containers
  • Container Registry: registry.together.xyz for storing images
  • Queue API: Async job submission with priority and progress tracking

Installation

# Python (recommended)
uv init  # optional, if starting a new project
uv add together
# or with pip
pip install together
# TypeScript / JavaScript
npm install together-ai

Set your API key:

export TOGETHER_API_KEY=<your-api-key>

Workflow

  1. Write inference code using Sprocket SDK (setup() + predict())
  2. Build container with Jig CLI (jig build)
  3. Push to registry (jig push)
  4. Deploy (jig deploy)
  5. Send requests to your deployment

Quick Start

1. Install Jig CLI

pip install together
# Set your API key as an environment variable:
# export TOGETHER_API_KEY=<your-api-key>

2. Create Inference Worker

# worker.py
import sprocket

class MyWorker(sprocket.Sprocket):
    def setup(self):
        """Load model and resources (runs once at startup)."""
        import torch
        self.model = torch.load("model.pt")

    def predict(self, args: dict) -> dict:
        """Handle a single inference request."""
        result = self.model(args["prompt"])
        return {"output": result}
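Before building the container, it can help to sanity-check the predict() logic locally. A minimal sketch under stated assumptions: StubSprocket is a hypothetical stand-in for sprocket.Sprocket (the real SDK is only available from Together's index), and a trivial uppercase function stands in for the torch model, so only the request-handling shape is exercised.

```python
# Hypothetical local smoke test -- StubSprocket stands in for sprocket.Sprocket,
# and a plain function stands in for the real model loaded in setup().
class StubSprocket:
    pass

class MyWorker(StubSprocket):
    def setup(self):
        # The real worker loads model.pt here; a trivial echo model suffices locally.
        self.model = lambda prompt: prompt.upper()

    def predict(self, args: dict) -> dict:
        # Same input/output contract as the real worker: dict in, dict out.
        result = self.model(args["prompt"])
        return {"output": result}

worker = MyWorker()
worker.setup()
print(worker.predict({"prompt": "hello"}))  # {'output': 'HELLO'}
```

Keeping predict() free of container-only dependencies makes this kind of local check possible before a full jig build cycle.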

3. Configure Project

# pyproject.toml
[project]
name = "my-inference-service"
version = "0.1.0"
dependencies = ["sprocket"]

[[tool.uv.index]]
name = "together-pypi"
url = "https://pypi.together.ai/"

[tool.uv.sources]
sprocket = { index = "together-pypi" }

[tool.jig.image]
cmd = "python worker.py --queue"
copy = ["worker.py"]

[tool.jig.deploy]
gpu_type = "h100-80gb"
gpu_count = 1

4. Build, Push, Deploy

jig build                    # Build Docker image
jig push                     # Push to registry.together.xyz
jig deploy                   # Deploy to Together infrastructure
jig status                   # Check deployment status
jig logs                     # View logs

5. Send Requests

Check the health endpoint:

curl https://api.together.ai/v1/deployments/my-inference-service/health \
  -H "Authorization: Bearer $TOGETHER_API_KEY"

Submit a job via the Queue API:

curl -X POST "https://api.together.ai/v1/queue/submit" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-inference-service",
    "payload": {"prompt": "Hello world"},
    "priority": 1
  }'

Response:

{
  "request_id": "req_abc123",
  "status": "pending"
}

Poll for the result:

curl "https://api.together.ai/v1/queue/status?model=my-inference-service&request_id=req_abc123" \
  -H "Authorization: Bearer $TOGETHER_API_KEY"

Response (when complete):

{
  "request_id": "req_abc123",
  "model": "my-inference-service",
  "status": "done",
  "outputs": {"output": "..."}
}

Or use the Python requests library:

import os
import requests

response = requests.post(
    "https://api.together.ai/v1/queue/submit",
    headers={"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"},
    json={
        "model": "my-inference-service",
        "payload": {"prompt": "Hello world"},
        "priority": 1,
    },
)
print(response.json())

Or submit directly via the Jig CLI:

together beta jig submit --payload '{"prompt": "Hello world"}' --watch

Sprocket SDK

The SDK provides the sprocket.Sprocket base class:

  • setup(): Called once at startup — load models, warm up caches
  • predict(args: dict) -> dict: Called per request — process input and return output
  • File handling: Upload/download files within predictions
  • GPU access: Full CUDA access inside the container

Queue API

For async workloads, submit jobs through the Queue API, which provides:

  • Priority-based fair queuing
  • Progress tracking
  • Job status polling
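The submit-then-poll pattern from the Quick Start can be wrapped in a small helper. A sketch, not an official client: poll_until_done is a hypothetical name, and the fetch_status callable is injected so the loop is independent of the HTTP layer (in practice it would issue the GET /v1/queue/status request shown earlier).

```python
import time

def poll_until_done(fetch_status, interval=2.0, timeout=300.0):
    """Poll a job's status until it reports 'done', then return the full record.

    fetch_status: a zero-argument callable returning the parsed status JSON,
    e.g. a function wrapping GET /v1/queue/status?model=...&request_id=...
    (Terminal statuses other than 'done', if the API defines any, would be
    handled the same way; only 'pending' and 'done' appear in this guide.)
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status["status"] == "done":
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish within the timeout")

# Example with a fake fetcher that completes on the third poll:
responses = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "done", "outputs": {"output": "..."}},
])
result = poll_until_done(lambda: next(responses), interval=0.01)
print(result["status"])  # done
```

Injecting the fetcher also makes the backoff logic trivially testable without hitting the live API.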

Key Jig CLI Commands

All commands are subcommands of together beta jig; the lists below abbreviate them as jig. Use --config <path> to specify a custom config file (default: pyproject.toml).

Build and Deploy

  • jig init: Create a starter pyproject.toml with defaults
  • jig dockerfile: Generate a Dockerfile from config (for debugging)
  • jig build: Build container image locally
  • jig build --tag <tag>: Build with a specific image tag
  • jig build --warmup: Build and pre-generate compile caches (requires GPU)
  • jig push: Push image to registry.together.xyz
  • jig deploy: Build, push, and create/update deployment
  • jig deploy --build-only: Build and push only, skip deployment creation
  • jig deploy --image <ref>: Deploy an existing image, skip build and push

Deployment Management

  • jig status: Show deployment status and configuration
  • jig list: List all deployments in your organization
  • jig logs: View deployment logs
  • jig logs --follow: Stream logs in real time
  • jig endpoint: Print the deployment's endpoint URL
  • jig destroy: Delete the deployment

Queue

  • jig submit --payload '<json>': Submit a job to the queue
  • jig submit --prompt '<text>': Submit with a shorthand prompt payload
  • jig submit --watch: Submit and wait for the result
  • jig job_status --request-id <id>: Get the status of a submitted job
  • jig queue_status: Show queue backlog and worker status

Secrets

  • jig secrets set --name <n> --value <v>: Create or update a secret
  • jig secrets list: List all secrets for the deployment
  • jig secrets unset <name>: Remove a secret

Volumes

  • jig volumes create --name <n> --source <path>: Create a volume and upload files
  • jig volumes update --name <n> --source <path>: Update a volume with new files
  • jig volumes describe --name <n>: Show volume details and contents
  • jig volumes list: List all volumes
  • jig volumes delete --name <n>: Delete a volume
