vtex-io-service-runtime

Originally from vtex/skills

Service Runtime & Execution Model

When this skill applies

Use this skill when the main decision is how a VTEX IO backend app runs inside the node builder: how the Service entrypoint is structured, how runtime configuration is declared, and how routes, events, or GraphQL handlers are registered into the service.

  • Creating a new backend app under node/
  • Structuring node/index.ts as the service entrypoint
  • Defining typed Context, State, and params contracts for handlers
  • Configuring service.json for timeout, memory, workers, and replicas
  • Troubleshooting runtime issues caused by service registration or execution model mismatches
  • Registering GraphQL handlers at the runtime level, while keeping schema and resolver design in a separate skill

Do not use this skill for:

  • deciding the app contract in manifest.json
  • designing custom clients or integration transport layers
  • detailed HTTP route handler behavior
  • event-specific business workflows
  • GraphQL schema or resolver modeling beyond runtime registration

Decision rules

  • Treat node/index.ts as the runtime composition root of the backend app.
  • Use the Service definition to register runtime surfaces such as routes, events, and GraphQL handlers, not to hold business logic directly.
  • Keep runtime wiring explicit: context typing, client typing, route registration, and event registration should be visible at the service boundary.
  • Put the runtime parameters the platform uses to execute the service in service.json, not inside handler code: memory, timeout, ttl, minReplicas, maxReplicas, workers, routes, events, and rateLimitPerReplica.
  • Use routes in service.json to expose HTTP entrypoints. Routes are private by default, so set public: true explicitly for routes that must be externally reachable.
  • Use smartcache: true only on idempotent, cacheable routes where the same response can be safely reused across repeated requests. Avoid it on personalized, authenticated, or write-oriented endpoints.
  • Use events in service.json to declare which event sources and handlers are part of the service runtime. Keep event registration in the runtime layer and event-specific business rules in dedicated event modules.
  • Use rateLimitPerReplica to shape throughput per replica for requests and events. Set a global baseline only when the service needs it, then add small explicit overrides only for expensive routes or noisy event sources.
  • Do not use rateLimitPerReplica as a substitute for redesigning expensive routes, queueing work, or moving slow operations to async processing.
  • Keep handlers focused on request or event behavior; keep runtime structure focused on bootstrapping and registration.
  • Model Context, State, and params types deliberately so middlewares and handlers share a stable contract. Apply the same typed Context and State to middlewares so they can safely manipulate ctx.state, ctx.vtex, and params without falling back to any.
  • If a backend app starts mixing runtime wiring, client implementation, and business rules in the same file, split those concerns before expanding the service further.
  • Although some authorization fields such as routes.access or routes.policies may live in service.json, they are primarily authorization concerns and belong in auth or security-focused skills rather than this runtime skill.
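
One way to reason about a global baseline plus small overrides is to compute the effective limit for each route or event. The sketch below is illustrative only: the types model just the fields discussed in these rules, effectiveRateLimit is a hypothetical helper, and the field-by-field fallback is an assumption about precedence, not documented platform behavior.

```typescript
// Hypothetical types modeling only the service.json fields discussed above.
interface RateLimit {
  perMinute?: number
  concurrent?: number
}

interface Surface {
  rateLimitPerReplica?: RateLimit
}

interface ServiceConfig {
  rateLimitPerReplica?: RateLimit
  routes?: Record<string, Surface>
  events?: Record<string, Surface>
}

// Resolve the limit for one route or event. Assumption: fields set on the
// override win, and anything not overridden falls back to the global baseline.
function effectiveRateLimit(
  config: ServiceConfig,
  kind: 'routes' | 'events',
  name: string
): RateLimit {
  const override = config[kind]?.[name]?.rateLimitPerReplica ?? {}
  return { ...config.rateLimitPerReplica, ...override }
}

const exampleConfig: ServiceConfig = {
  rateLimitPerReplica: { perMinute: 120, concurrent: 4 },
  routes: { reviews: {} },
  events: {
    orderCreated: { rateLimitPerReplica: { perMinute: 60, concurrent: 2 } },
  },
}
```

Listing the effective limits this way during review makes it easier to spot a service where almost every surface carries an override, which usually means the baseline itself should change.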

Runtime sizing heuristics:

  • These ranges are intended for partner and account-level apps. Native VTEX core services may legitimately use much higher values such as thousands of MB of memory or hundreds of replicas, but those values should not be used as defaults for custom apps.

Suggested defaults:

  • Start synchronous HTTP services with timeout between 10 and 30 seconds. For UX-facing routes, prefer 5 to 15 seconds.
  • Start memory at 256 MB.
  • Start workers at 1.
  • Use minReplicas: 2 as the default for installed apps, and reserve minReplicas: 1 for linked-app development contexts where the platform allows it.
  • Use maxReplicas: 5 as the lowest practical starting point, since the documented minimum is 5.
  • Use ttl intentionally. In VTEX IO, ttl is measured in minutes, with platform defaults and limits that differ from timeout. For partner apps, start from the default 10 minutes and increase intentionally up to 60 only when reducing cold starts matters more than allowing idle instances to sleep sooner.

Scaling ranges and exceptions:

  • Use 128 to 256 MB for simpler IO-bound services, and move to 512 MB only when there is evidence of OOM, large payload processing, or heavier libraries.
  • Increase workers to 2 to 4 only for high-throughput IO-bound workloads after measuring benefit. Avoid using more than 4 workers per instance as a default.
  • Increase maxReplicas from 5 toward 10 only when public traffic or predictable peaks justify it. Treat values above 10 as exceptions that require explicit justification and monitoring in partner apps.
  • Avoid timeout values above 60 seconds for HTTP routes; if more time is needed, redesign the flow as async work.
  • Remember that ttl has a documented minimum of 10 minutes and maximum of 60 minutes. Use higher values intentionally to reduce cold starts on low-traffic or bursty services, and avoid treating ttl like a per-request timeout.
  • For partner apps, rateLimitPerReplica.perMinute often starts in the 60 to 300 range for normal routes and in the 10 to 60 range for more expensive ones. rateLimitPerReplica.concurrent often starts between 1 and 5.
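
The ranges above can be turned into a small review aid. This is a minimal sketch, not a platform tool: SizingConfig and reviewSizing are hypothetical names, the thresholds are copied from the heuristics in this section, and the function only reports warnings, since values outside these ranges can still be justified.

```typescript
// Hypothetical shape covering only the sizing fields named in this section.
interface SizingConfig {
  memory?: number // MB
  timeout?: number // seconds
  ttl?: number // minutes
  minReplicas?: number
  maxReplicas?: number
  workers?: number
}

// Returns human-readable warnings. An empty array means the config stays
// inside the partner-app ranges suggested above.
function reviewSizing(config: SizingConfig): string[] {
  const warnings: string[] = []
  const { memory, timeout, ttl, minReplicas, maxReplicas, workers } = config

  if (timeout !== undefined && timeout > 60) {
    warnings.push('timeout above 60s: consider redesigning the flow as async work')
  }
  if (ttl !== undefined && (ttl < 10 || ttl > 60)) {
    warnings.push('ttl outside the documented 10 to 60 minute range')
  }
  if (memory !== undefined && memory > 512) {
    warnings.push('memory above 512 MB: needs evidence such as OOM or large payloads')
  }
  if (workers !== undefined && workers > 4) {
    warnings.push('more than 4 workers per instance')
  }
  if (maxReplicas !== undefined && maxReplicas < 5) {
    warnings.push('maxReplicas below the documented minimum of 5')
  }
  if (maxReplicas !== undefined && maxReplicas > 10) {
    warnings.push('maxReplicas above 10: requires explicit justification')
  }
  if (
    minReplicas !== undefined &&
    maxReplicas !== undefined &&
    minReplicas > maxReplicas
  ) {
    warnings.push('minReplicas exceeds maxReplicas')
  }
  return warnings
}
```

A check like this could run against the parsed node/service.json in a pre-commit step or code review script, assuming the team wants these heuristics enforced mechanically.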

Hard constraints

Constraint: The Service entrypoint must stay a runtime composition root

node/index.ts MUST define and export the VTEX IO service runtime structure, not become a catch-all file for business logic, data transformation, or transport implementation.

Why this matters

When the entrypoint mixes registration with business logic, the execution model becomes harder to reason about, handlers become tightly coupled, and changes to routes, events, or GraphQL surfaces become risky.

Detection

If node/index.ts contains large handler bodies, external API calls, complex branching, or data-mapping logic, STOP and move that logic into dedicated modules. Keep the entrypoint focused on typing and registration.

Correct

import type { ClientsConfig, RecorderState, ServiceContext } from '@vtex/api'
import { Service } from '@vtex/api'
import { clients, Clients } from './clients'
import { routes } from './routes'

export interface State extends RecorderState {}

export type Context = ServiceContext<Clients, State>

const clientsConfig: ClientsConfig<Clients> = {
  implementation: clients,
  options: {},
}

export default new Service<Clients, State>({
  clients: clientsConfig,
  routes,
})

Wrong

import { Service } from '@vtex/api'
import axios from 'axios'

export default new Service({
  routes: {
    reviews: async (ctx: any) => {
      const response = await axios.get('https://example.com/data')
      const transformed = response.data.items.map((item: any) => ({
        ...item,
        extra: true,
      }))

      ctx.body = transformed.filter((item: any) => item.active)
    },
  },
})
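
One possible way to untangle the wrong example is to move the data mapping into a pure function and the fetch into a client, leaving the entrypoint with registration only. The sketch below assumes a hypothetical handler module and a hypothetical reviews client with a list() method; ReviewsContext is a simplified stand-in for the typed context, not the type from @vtex/api. In a real module these functions would be exported.

```typescript
// Pure data mapping: easy to unit-test without the service runtime.
function toActiveItems<T extends { active: boolean }>(items: T[]) {
  return items
    .filter((item) => item.active)
    .map((item) => ({ ...item, extra: true }))
}

// Simplified stand-in for the parts of the context this handler touches.
// The `reviews` client and its `list` method are hypothetical.
interface ReviewsContext {
  clients: {
    reviews: { list(): Promise<Array<{ id: string; active: boolean }>> }
  }
  body?: unknown
}

// The handler only coordinates: fetch through a client, map, respond.
async function listReviews(ctx: ReviewsContext): Promise<void> {
  const items = await ctx.clients.reviews.list()
  ctx.body = toActiveItems(items)
}
```

With this split, node/index.ts only needs to reference the handler in its routes map, and the transport details stay behind the client.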

Constraint: Runtime configuration must be expressed in service.json, not improvised in code

Resource and execution settings such as timeout, ttl, memory, workers, and replica behavior MUST be configured in service.json when the app depends on them. service.json resides inside the node/ folder and centralizes runtime parameters such as routes, events, memory, timeout, ttl, workers, replicas, and rate limits for this service.

Why this matters

These settings are part of the service runtime contract with the platform. Hiding them in assumptions or spreading them across code makes behavior harder to predict and can cause timeouts, cold-start churn, underprovisioning, or scaling mismatches. In VTEX IO, ttl is especially important because it is measured in minutes and influences how aggressively service infrastructure can go idle between requests. Using the minimum ttl on low-traffic services can increase cold starts, because the platform is allowed to scale the service down more aggressively between bursts.

Detection

If the app depends on long-running work, concurrency, warm capacity, or specific route exposure behavior, STOP and verify that the relevant service.json settings are present and intentional. If the behavior is only implied in code comments or handler logic, move it into runtime configuration.

Correct

{
  "memory": 256,
  "timeout": 30,
  "ttl": 10,
  "minReplicas": 2,
  "maxReplicas": 10,
  "workers": 4,
  "rateLimitPerReplica": {
    "perMinute": 300,
    "concurrent": 10
  },
  "routes": {
    "reviews": {
      "path": "/_v/api/reviews",
      "public": false
    }
  }
}

Wrong

{
  "routes": {
    "reviews": {
      "path": "/_v/api/reviews"
    }
  }
}

This runtime configuration is incomplete for a service that depends on explicit timeout, concurrency, rate limiting, or replica behavior, and it leaves execution characteristics undefined.

Constraint: Route exposure must be explicit in the runtime contract

Every HTTP route exposed by the service MUST be declared in service.json with an intentional visibility choice. Routes are private by default, but do not rely on that default: set public: false explicitly for private routes and public: true explicitly for routes that must be externally reachable.

Why this matters

Route visibility is part of the runtime contract of the service. If exposure is ambiguous, a route can be published with the wrong accessibility, which creates security risk for private handlers and integration failures for routes expected to be public.

Detection

If a route exists in the service runtime, STOP and verify that it is declared in service.json and that public matches the intended exposure. If the route is consumed only by trusted backoffice or app-to-app flows, default to checking that it is private before expanding access.

Correct

{
  "routes": {
    "status": {
      "path": "/_v/status/health",
      "public": true,
      "smartcache": true
    },
    "reviews": {
      "path": "/_v/api/reviews",
      "public": false
    }
  }
}

Wrong

{
  "routes": {
    "reviews": {
      "path": "/_v/api/reviews"
    }
  }
}

This route leaves visibility implicit, so the runtime contract does not clearly communicate whether the endpoint is meant to be public or protected.
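
The detection step can be sketched as a comparison between the routes declared in service.json and the route names registered in the Service. This is an illustrative helper under stated assumptions: auditRoutes and its types are hypothetical, and the caller is expected to supply the registered names, for example from the keys of the routes object.

```typescript
interface RouteDeclaration {
  path: string
  public?: boolean
}

interface RouteAudit {
  undeclared: string[] // registered in code, missing from service.json
  unregistered: string[] // declared in service.json, no handler in code
  implicitVisibility: string[] // declared without an explicit `public` flag
}

function auditRoutes(
  declared: Record<string, RouteDeclaration>,
  registered: string[]
): RouteAudit {
  const declaredNames = Object.keys(declared)
  return {
    undeclared: registered.filter((name) => !declaredNames.includes(name)),
    unregistered: declaredNames.filter((name) => !registered.includes(name)),
    implicitVisibility: Object.entries(declared)
      .filter(([, route]) => typeof route.public !== 'boolean')
      .map(([name]) => name),
  }
}
```

Any non-empty list in the result points at a mismatch between the runtime contract and the code that should be resolved before the route ships.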

Constraint: Typed context and state must match the handlers registered in the runtime

The service MUST define Context, State, and handler contracts that match the routes, events, or GraphQL handlers it registers.

Why this matters

Untyped or inconsistent runtime contracts make middleware composition fragile and allow handlers to rely on state or params that are never guaranteed to exist.

Detection

If middlewares or handlers use ctx.state, ctx.clients, ctx.vtex, or params fields without a shared typed contract, STOP and introduce or fix the runtime types before adding more handlers.

Correct

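import type { ParamsContext, RecorderState, ServiceContext } from '@vtex/api'
// Adjust the relative path to wherever this handler lives under node/.
import type { Clients } from './clients'

interface State extends RecorderState {
  reviewId?: string
}

type CustomContext = ServiceContext<Clients, State, ParamsContext>

export async function getReview(ctx: CustomContext) {
  // Route params may arrive as string or string[], so normalize before storing.
  const { id } = ctx.vtex.route.params
  ctx.state.reviewId = Array.isArray(id) ? id[0] : id
  ctx.body = { id: ctx.state.reviewId }
}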

Wrong

export async function getReview(ctx: any) {
  ctx.state.reviewId = ctx.params.review
  ctx.body = { id: ctx.state.missingField.value }
}
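
The same contract matters when a middleware writes to state and a later handler reads it. The sketch below shows that hand-off with local stand-in types so it runs on its own; in a real app Context and State would come from ServiceContext and RecorderState in @vtex/api, and the run helper is a hypothetical substitute for the runtime's own middleware composition.

```typescript
// Local stand-ins for illustration only, not the types from '@vtex/api'.
interface State {
  reviewId?: string
}

interface Context {
  state: State
  vtex: { route: { params: Record<string, string | string[] | undefined> } }
  status?: number
  body?: unknown
}

type Next = () => Promise<void>
type Middleware = (ctx: Context, next: Next) => Promise<void>

// Middleware: validates the route param and records it on the typed state.
const validateReviewId: Middleware = async (ctx, next) => {
  const { id } = ctx.vtex.route.params
  if (typeof id !== 'string' || id.length === 0) {
    ctx.status = 400
    ctx.body = { error: 'Missing review id' }
    return
  }
  ctx.state.reviewId = id
  await next()
}

// Handler: reads only what the shared State contract declares.
const getReview: Middleware = async (ctx) => {
  ctx.body = { id: ctx.state.reviewId }
}

// Tiny sequential runner, a hypothetical stand-in for runtime composition.
async function run(ctx: Context, chain: Middleware[]): Promise<void> {
  const dispatch = async (index: number): Promise<void> => {
    const current = chain[index]
    if (current) await current(ctx, () => dispatch(index + 1))
  }
  await dispatch(0)
}
```

Because both functions share one Context type, renaming or removing reviewId in State becomes a compile-time error in every middleware and handler that touches it.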

Preferred pattern

Recommended file layout:

node/
├── index.ts
├── clients/
│   └── index.ts
├── routes/
│   └── index.ts
├── events/
│   └── index.ts
├── graphql/
│   └── index.ts
└── middlewares/
    └── validate.ts

Minimal service runtime pattern:

import type { ClientsConfig, RecorderState, ServiceContext } from '@vtex/api'
import { Service } from '@vtex/api'
import { clients, Clients } from './clients'
import { routes } from './routes'

export interface State extends RecorderState {}

export type Context = ServiceContext<Clients, State>

const clientsConfig: ClientsConfig<Clients> = {
  implementation: clients,
  options: {},
}

export default new Service<Clients, State>({
  clients: clientsConfig,
  routes,
})

Minimal service.json pattern:

{
  "memory": 256,
  "timeout": 30,
  "ttl": 10,
  "minReplicas": 2,
  "maxReplicas": 5,
  "workers": 1,
  "rateLimitPerReplica": {
    "perMinute": 120,
    "concurrent": 4
  },
  "routes": {
    "status": {
      "path": "/_v/status/health",
      "public": true,
      "smartcache": true
    },
    "reviews": {
      "path": "/_v/api/reviews",
      "public": false
    }
  },
  "events": {
    "orderCreated": {
      "sender": "vtex.orders-broadcast",
      "topics": ["order-created"],
      "rateLimitPerReplica": {
        "perMinute": 60,
        "concurrent": 2
      }
    }
  }
}

Use the service entrypoint to compose runtime surfaces, then push business behavior into handlers, clients, and other focused modules. If routes/index.ts or events/index.ts grows too large, split it by domain such as routes/orders.ts or events/catalog.ts and keep the index file as a small registry.
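
A split-by-domain registry could look like the following sketch. The route names, the simplified Handler type, and the mergeRoutes helper are all hypothetical; the point is only that routes/index.ts stays a thin merge of per-domain maps.

```typescript
// Simplified handler type for illustration, not the type from '@vtex/api'.
type Handler = (ctx: { body?: unknown }) => Promise<void>
type RouteMap = Record<string, Handler>

// Hypothetical contents of routes/orders.ts and routes/catalog.ts.
const orderRoutes: RouteMap = {
  orderStatus: async (ctx) => {
    ctx.body = { ok: true }
  },
}

const catalogRoutes: RouteMap = {
  productSummary: async (ctx) => {
    ctx.body = { ok: true }
  },
}

// routes/index.ts stays a small registry. Merging through a helper makes a
// duplicate route name fail fast instead of silently overwriting a handler.
function mergeRoutes(...maps: RouteMap[]): RouteMap {
  const merged: RouteMap = {}
  for (const map of maps) {
    for (const [name, handler] of Object.entries(map)) {
      if (Object.prototype.hasOwnProperty.call(merged, name)) {
        throw new Error(`Duplicate route name: ${name}`)
      }
      merged[name] = handler
    }
  }
  return merged
}

const routes = mergeRoutes(orderRoutes, catalogRoutes)
```

A plain object spread would also work; the helper is a design choice that trades a few lines for an early error when two domains claim the same route name.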

Common failure modes

  • Putting business logic directly into node/index.ts.
  • Treating service.json as optional when runtime behavior depends on explicit resource settings.
  • Setting ttl too low and causing the service to sleep too aggressively between bursts of traffic.
  • Enabling smartcache on personalized or write-oriented routes and risking incorrect cache reuse across requests.
  • Registering routes, events, or GraphQL handlers without a clear typed Context and State.
  • Mixing runtime composition with client implementation details.
  • Letting one service entrypoint accumulate unrelated responsibilities across HTTP, events, and GraphQL without clear module boundaries.

Review checklist

  • Is node/index.ts acting as a runtime composition root rather than a business-logic file?
  • Are routes, events, and GraphQL handlers registered explicitly and cleanly?
  • Does service.json express the runtime behavior the app actually depends on?
  • Are Context, State, and params types shared consistently across handlers?
  • Are runtime concerns separated from client implementation and business logic?
