Cloudflare Container Gateway Auto-Start

Problem

When a bot (Discord, Telegram, Slack) runs inside a Cloudflare Workers Container protected by Cloudflare Access, it goes offline after a deploy or container restart because the gateway process starts only when an authenticated request arrives. Bot connections are outbound from the container, not inbound HTTP, so no request ever triggers the startup.

Context / Trigger Conditions

  • Bot stops responding to Discord/Telegram messages
  • /api/status returns {"ok": false, "status": "not_running"}
  • /sandbox-health returns {"status": "ok"} (container is up, but process isn't)
  • Bot works after visiting the web UI in browser (which authenticates via CF Access)
  • This happens after deploys or when the container restarts
  • Using Cloudflare Workers Containers (Sandbox) with Durable Objects

Root Cause

The container lifecycle:

  1. Container starts when any request hits the Durable Object
  2. /sandbox-health touches the container but doesn't call ensureMoltbotGateway()
  3. Protected routes (behind CF Access) call ensureMoltbotGateway() to start the process
  4. No authenticated requests = gateway never starts = bot stays offline

The Discord/Telegram connections are outbound from the container, so they don't trigger the gateway startup flow that expects inbound authenticated requests.
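The lifecycle above can be illustrated with a small in-memory model. This is only a sketch: `ContainerModel`, `handleRequest`, and the 403 response are hypothetical stand-ins for the real Worker, Durable Object, and Cloudflare Access behavior, and `ensureGateway()` merely imitates what `ensureMoltbotGateway()` is described as doing.

```typescript
// Hypothetical model of the lifecycle; none of these names exist in the real project.
type GatewayState = 'not_running' | 'running';

class ContainerModel {
  containerUp = false;
  gateway: GatewayState = 'not_running';

  // Any request that reaches the Durable Object boots the container.
  touch(): void {
    this.containerUp = true;
  }

  // Stand-in for ensureMoltbotGateway(): starting twice is harmless.
  ensureGateway(): void {
    this.touch();
    this.gateway = 'running';
  }
}

// Returns an HTTP-like status code for a request to the Worker.
function handleRequest(model: ContainerModel, path: string, authenticated: boolean): number {
  if (path === '/sandbox-health') {
    model.touch(); // container comes up, gateway is left alone
    return 200;
  }
  if (!authenticated) {
    return 403; // stand-in for the Access rejection; the gateway start is never reached
  }
  model.ensureGateway(); // only authenticated, protected routes get here
  return 200;
}
```

In this model a health check leaves `gateway` at `'not_running'` while `containerUp` is `true`, which matches the symptom pair listed under Context / Trigger Conditions.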

Solution

1. Add a Public /start Endpoint

Create a public endpoint that starts the gateway without requiring authentication:

// src/routes/public.ts
import { ensureMoltbotGateway } from '../gateway';

publicRoutes.post('/start', async (c) => {
  const sandbox = c.get('sandbox');

  try {
    console.log('[start] Starting gateway...');
    await ensureMoltbotGateway(sandbox, c.env);
    return c.json({ ok: true, status: 'started' });
  } catch (err) {
    console.error('[start] Failed to start gateway:', err);
    return c.json({
      ok: false,
      status: 'error',
      error: err instanceof Error ? err.message : 'Unknown error'
    }, 500);
  }
});

2. Make Cron Job Start the Gateway

Modify the scheduled handler to ensure the gateway is running:

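// src/index.ts
import { getSandbox } from '@cloudflare/sandbox';
import { ensureMoltbotGateway } from './gateway';

// `options` (the sandbox options) and syncToR2 are assumed to be defined elsewhere in the project
async function scheduled(event, env, ctx) {
  const sandbox = getSandbox(env.Sandbox, 'moltbot', options);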

  // Ensure gateway is running - keeps Discord/Telegram bots alive
  console.log('[cron] Ensuring gateway is running...');
  try {
    await ensureMoltbotGateway(sandbox, env);
    console.log('[cron] Gateway is running');
  } catch (err) {
    console.error('[cron] Failed to start gateway:', err);
  }

  // Continue with backup sync...
  await syncToR2(sandbox, env);
}

3. Configure Cron in wrangler.jsonc

{
  "triggers": {
    "crons": ["*/5 * * * *"]  // Every 5 minutes
  }
}

Verification

  1. Deploy the changes: npm run deploy
  2. Check status before starting: curl https://your-worker.workers.dev/api/status
    • Should return {"ok": false, "status": "not_running"} if the gateway has not started yet
  3. Start gateway: curl -X POST https://your-worker.workers.dev/start
    • Should return {"ok": true, "status": "started"}
  4. Verify running: curl https://your-worker.workers.dev/api/status
    • Should return {"ok": true, "status": "running", "processId": "..."}
  5. Wait 5 minutes, check status again - should still be running (cron keeps it alive)
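The check, start, and re-check steps above could also be scripted. The following is a minimal sketch, assuming the response shapes shown in this section; `FetchLike`, `getJson`, and `ensureStarted` are hypothetical names, and the fetch function is injected so the logic can be exercised with a stub.

```typescript
// Minimal shape of the fetch function this sketch needs, so a stub can be injected.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ json(): Promise<unknown> }>;

// Response shape assumed from the examples in this document.
interface StatusBody {
  ok: boolean;
  status: string;
}

async function getJson(fetchFn: FetchLike, url: string, method = 'GET'): Promise<StatusBody> {
  const res = await fetchFn(url, { method });
  return (await res.json()) as StatusBody;
}

// Starts the gateway only if /api/status says it is not running,
// then returns the status observed afterwards.
async function ensureStarted(baseUrl: string, fetchFn: FetchLike): Promise<StatusBody> {
  const before = await getJson(fetchFn, `${baseUrl}/api/status`);
  if (before.ok && before.status === 'running') {
    return before;
  }

  const started = await getJson(fetchFn, `${baseUrl}/start`, 'POST');
  if (!started.ok) {
    throw new Error(`start failed: ${started.status}`);
  }

  return getJson(fetchFn, `${baseUrl}/api/status`);
}
```

In a runtime with a global `fetch` (for example Node 18 or later), `ensureStarted('https://your-worker.workers.dev', fetch)` would run the same sequence as the curl commands.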

Example: Debugging Flow

# 1. Check if container is running (not the gateway)
curl https://moltbot.workers.dev/sandbox-health
# {"status":"ok","service":"moltbot-sandbox","gateway_port":18789}

# 2. Check if gateway process is running
curl https://moltbot.workers.dev/api/status
# {"ok":false,"status":"not_running"}  <-- Gateway not started!

# 3. Start the gateway
curl -X POST https://moltbot.workers.dev/start
# {"ok":true,"status":"started"}

# 4. Verify
curl https://moltbot.workers.dev/api/status
# {"ok":true,"status":"running","processId":"proc_123"}

Notes

  • The /start endpoint is intentionally public (no auth): it only starts a process that should already be running and does not expose any data
  • A 5-minute cron interval means the bot is offline for at most about 5 minutes after a crash, rather than indefinitely
  • Consider adding alerting if the gateway fails to start repeatedly
  • The keepAlive: true sandbox option keeps the container alive but not the process inside it; these are separate concerns
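For the alerting note, one possible approach is to count consecutive failed start attempts and alert once when a threshold is reached. The sketch below is hypothetical (`StartFailureTracker` is not part of the project) and keeps the counter in memory; Worker isolates are not guaranteed to persist between cron runs, so a real deployment would likely need to store the count somewhere durable.

```typescript
// Hypothetical helper: tracks consecutive failed gateway starts.
class StartFailureTracker {
  private consecutiveFailures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Record the outcome of one cron run.
  // Returns true exactly once per failure streak, when the streak reaches the threshold.
  record(success: boolean): boolean {
    if (success) {
      this.consecutiveFailures = 0;
      return false;
    }
    this.consecutiveFailures += 1;
    return this.consecutiveFailures === this.threshold;
  }
}
```

With the 5-minute cron and a threshold of 3, `record(false)` would return `true` on the third consecutive failed run; firing only at the threshold, rather than on every later failure, avoids repeated alerts for the same outage.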

Architecture Insight

Request Flow:
                                     ┌─────────────────────┐
    Browser ──CF Access──> Worker ──>│ ensureMoltbotGateway│──> Gateway starts
                                     └─────────────────────┘

    Discord ──────────────────────────────> Gateway (already running)
                                            Must pre-start!

Solution:
    Cron ──────────────────> Worker ──> ensureMoltbotGateway ──> Gateway stays alive
    POST /start ──────────> Worker ──> ensureMoltbotGateway ──> Manual recovery
