Node.js HTTP Proxy: Layered Timeouts for CDN-Tunneled Origins

Problem

When proxying HTTP requests through a CDN tunnel (e.g., Cloudflare Tunnel / cloudflared), a simple connect timeout is useless because the CDN edge server is always reachable. If the origin behind the tunnel goes down (laptop sleeps, server crashes), the TCP+TLS handshake succeeds instantly against the CDN, but the request hangs for the full request timeout (often 30-60s) waiting for the CDN to forward through the dead tunnel.

With a circuit breaker that requires N failures to open, the worst case is N * request_timeout seconds of total hang time before fallback works.
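To make the cost concrete, here is a quick back-of-the-envelope calculation (the threshold and timeout values are illustrative, not from any specific deployment):

```javascript
// Worst-case hang before the breaker opens: N consecutive requests,
// each waiting out the full request timeout against a dead tunnel.
const failureThreshold = 5;        // illustrative breaker threshold
const requestTimeoutMs = 45_000;   // typical overall request timeout
const worstCaseMs = failureThreshold * requestTimeoutMs;
console.log(worstCaseMs / 1000);   // → 225 (seconds of hang before failover)
```

With the TTFB layer described below capping each failure at ~8s instead, the same breaker opens in well under a minute.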

Context / Trigger Conditions

  • Proxy or reverse-proxy using http.request() / https.request() in Node.js
  • Origin is behind Cloudflare Tunnel, ngrok, or any CDN-to-origin tunnel
  • Origin goes offline (Mac sleep, server restart, network disconnect)
  • Connect timeout fires quickly but doesn't help (CDN edge is always up)
  • Requests hang for full REQUEST_TIMEOUT before failing over
  • socket.once('connect') handlers don't fire on keep-alive reused sockets

Solution

1. Three-Layer Timeout Architecture

const CONNECT_TIMEOUT = 5_000;   // TCP+TLS handshake
const TTFB_TIMEOUT = 8_000;      // After connect, waiting for response headers
const REQUEST_TIMEOUT = 45_000;  // Overall safety net for stalled mid-stream

let settled = false; // true once the request has a response or an error

proxyReq.on('socket', (socket) => {
  const startTtfbTimer = () => {
    socket.setTimeout(0); // clear the connect-phase idle timeout

    const ttfbTimer = setTimeout(() => {
      if (!settled) {
        proxyReq.destroy(new Error('TTFB timeout (tunnel backend unreachable)'));
      }
    }, TTFB_TIMEOUT);

    // Once headers arrive, switch to the overall safety-net timeout
    proxyReq.once('response', () => {
      settled = true;
      clearTimeout(ttfbTimer);
      socket.setTimeout(REQUEST_TIMEOUT, () => {
        proxyReq.destroy(new Error('response body timeout'));
      });
    });
    proxyReq.once('error', () => {
      settled = true;
      clearTimeout(ttfbTimer);
    });
  };

  if (socket.connecting) {
    // Fresh connection -- wait for the TCP+TLS handshake
    socket.setTimeout(CONNECT_TIMEOUT, () => {
      proxyReq.destroy(new Error('connect timeout'));
    });
    socket.once('connect', startTtfbTimer);
  } else {
    // Socket reused from the keep-alive pool -- already connected
    startTtfbTimer();
  }
});

2. Critical: socket.connecting Check

Node.js's HTTP agent maintains a keep-alive connection pool. On reused sockets:

  • socket.once('connect') never fires (the handshake already happened)
  • Any timeout set via socket.setTimeout() in the connect phase is never cleared
  • The TTFB timer is never started

Always check socket.connecting before attaching connect handlers:

if (socket.connecting) {
  socket.once('connect', startTtfbTimer);
} else {
  startTtfbTimer(); // already connected, start TTFB immediately
}

Without this, the second request to the same provider on a keep-alive connection will have no timeout at all -- silently defeating the entire failover mechanism.

3. Background Health Probe Pattern

When using fetch() in a setInterval health probe:

const probe = async () => {
  try {
    const res = await fetch(healthUrl, { signal: AbortSignal.timeout(5000) });
    await res.text(); // CRITICAL: drain body to release connection back to pool
    if (!res.ok) throw new Error('HTTP ' + res.status);
  } catch (err) {
    // Pre-emptively open circuit breaker
  }
};

probe(); // Run immediately, don't wait for first interval
const intervalId = setInterval(probe, 30_000);
intervalId.unref(); // Allow process to exit cleanly during shutdown

Three key details:

  • await res.text(): Without draining the body, the TCP connection is never released back to the pool. Over many intervals, this leaks connections.
  • probe() before setInterval: Don't wait 30s for the first health check. If the tunnel is down at startup, the first real request discovers it the hard way.
  • intervalId.unref(): Prevents the interval from keeping the Node.js process alive during graceful shutdown.
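The "pre-emptively open circuit breaker" referenced in the probe's catch block can be sketched as a small state machine. This is a minimal illustration, not the source's implementation -- the class name, threshold, and cooldown values are assumptions:

```javascript
// Minimal circuit-breaker sketch (illustrative names and defaults).
// CLOSED -> OPEN after `threshold` consecutive failures, or when a
// health probe fails; OPEN -> HALF_OPEN after `cooldownMs`; HALF_OPEN
// closes again on the next success.
class CircuitBreaker {
  constructor({ threshold = 3, cooldownMs = 30_000 } = {}) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.state = 'CLOSED';
    this.openedAt = 0;
  }
  recordSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }
  recordFailure() {
    this.failures++;
    if (this.failures >= this.threshold) this.open();
  }
  open() { // called by real traffic failures or a failed health probe
    this.state = 'OPEN';
    this.openedAt = Date.now();
  }
  allowRequest(now = Date.now()) {
    if (this.state === 'CLOSED') return true;
    if (now - this.openedAt >= this.cooldownMs) {
      this.state = 'HALF_OPEN'; // let a trial request through
      return true;
    }
    return this.state === 'HALF_OPEN';
  }
}

const breaker = new CircuitBreaker({ threshold: 2, cooldownMs: 10_000 });
breaker.recordFailure();
breaker.recordFailure();            // threshold hit -> breaker opens
console.log(breaker.state);         // → OPEN
console.log(breaker.allowRequest()); // → false (still cooling down)
```

The probe's catch handler would simply call `breaker.open()`, so the first real request after the tunnel dies is skipped instantly instead of waiting out even the TTFB timeout.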

Verification

  1. Deploy the proxy
  2. Kill the tunnel origin (e.g., launchctl unload the cloudflared plist)
  3. Send a request -- should fail over within TTFB_TIMEOUT (8s), not REQUEST_TIMEOUT (45s)
  4. Wait for health probe interval -- circuit should open pre-emptively
  5. Send another request -- should skip the dead provider instantly (0s)
  6. Restore the tunnel -- circuit recovers via HALF_OPEN after cooldown

Notes

  • Why 8s for TTFB? Cloudflare typically returns a 502 within 5-10s when a tunnel has no connectors. 8s gives headroom for legitimate slow responses (edge routing, tunnel handshake) while being much faster than a 45s overall timeout.
  • socket.setTimeout() vs setTimeout(): socket.setTimeout() uses the socket's built-in idle timer (resets on activity). setTimeout() is a hard wall-clock timer. Use setTimeout for TTFB since you want a hard deadline, and socket.setTimeout for the overall body timeout since it should reset on each chunk received.
  • HTTPS and TLS: The 'connect' event on the underlying TCP socket fires after TCP handshake but before TLS. For HTTPS via https.request(), this is fine because socket.connecting correctly reflects the TLS wrapper state too.
  • The health probe only proves the tunnel is alive, not that the upstream API is healthy. Don't use probe success to forcefully close a circuit opened by real traffic failures.
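The idle-timer vs wall-clock distinction from the second note can be modeled without sockets. In this toy model (timestamps in ms, purely illustrative), an idle timer fires only after `ms` of silence, while a wall-clock timer fires `ms` after it was armed regardless of activity:

```javascript
// Toy model of the two timer kinds, driven by explicit timestamps.
// Idle timer (socket.setTimeout semantics): fires when
//   now - lastActivity >= ms
function idleTimerFired(activityTimes, ms, now) {
  const lastActivity = activityTimes.length ? Math.max(...activityTimes) : 0;
  return now - lastActivity >= ms;
}
// Wall-clock timer (setTimeout semantics): fires when now - armedAt >= ms
function wallClockFired(armedAt, ms, now) {
  return now - armedAt >= ms;
}

// Body chunks arrive every 2s; both timers are 5s; we check at t=12s.
const chunks = [2000, 4000, 6000, 8000, 10000];
console.log(idleTimerFired(chunks, 5000, 12000)); // → false (reset by chunks)
console.log(wallClockFired(0, 5000, 12000));      // → true (hard deadline hit)
```

This is why the hard `setTimeout` suits TTFB (no chunks have arrived yet, so there is nothing to reset it) while the activity-resetting `socket.setTimeout` suits the body phase, where a slow-but-alive stream should not be killed.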
