deployment-checker
deployment-checker
Stage 2 of the autonomous deployment pipeline. Polls the team's health endpoint using a cheap bash script until the deployed sha matches the merged PR's sha, or times out. Exits with a structured result and applies deployed or deploy-failed label.
Design goal: zero token burn during polling. The Claude agent runs only to set up the poll and process the result. All polling is done by the bash script.
When to Use
- Dispatched by
/post-mergeafter a PR merges - Manual:
/deployment-checker PR_NUMBER org/repo
Invocation
/deployment-checker 42 fellowship-dev/pylot
Arguments
PR_NUMBER=$1 # PR number
REPO=$2 # org/repo
Runbook
Step 0: Dedup Gate
PR=$1
REPO=$2
ALREADY_DONE=$(gh pr view $PR --repo $REPO --json labels \
--jq '[.labels[].name] | (contains(["deployed"]) or contains(["deploy-failed"]))')
if [ "$ALREADY_DONE" = "true" ]; then
echo "[deployment-checker] outcome=\"already complete — deployed or deploy-failed label present\" status=success"
exit 0
fi
Step 1: Read Context from Job
Extract parameters from the job context (set by /post-merge):
# These are passed via job context or derived from PR
PR_URL=$(gh pr view $PR --repo $REPO --json url --jq '.url')
PR_TITLE=$(gh pr view $PR --repo $REPO --json title --jq '.title')
# From job context (set by post-merge):
# EXPECTED_SHA — merge commit sha to wait for
# HEALTH_URL — endpoint to poll
# TIMEOUT_MINUTES — max wait (default: 5)
# CHECKER_SCRIPT — path to team bash polling script
# POLL_INTERVAL — seconds between polls (default: 30)
EXPECTED_SHA="${EXPECTED_SHA:-}"
HEALTH_URL="${HEALTH_URL:-}"
TIMEOUT_MINUTES="${TIMEOUT_MINUTES:-5}"
CHECKER_SCRIPT="${CHECKER_SCRIPT:-}"
POLL_INTERVAL="${POLL_INTERVAL:-30}"
if [ -z "$HEALTH_URL" ] || [ -z "$CHECKER_SCRIPT" ]; then
echo "[deployment-checker] ERROR: HEALTH_URL and CHECKER_SCRIPT must be set in job context" >&2
exit 1
fi
Step 2: Run the Bash Polling Script
The team-specific bash script handles the actual polling. This keeps Claude out of the loop during the wait.
PYLOT_DIR="${PYLOT_DIR:-$HOME/projects/fellowship-dev/pylot}"
# Resolve checker script path
if [[ "$CHECKER_SCRIPT" != /* ]]; then
CHECKER_SCRIPT="$PYLOT_DIR/$CHECKER_SCRIPT"
fi
if [ ! -f "$CHECKER_SCRIPT" ]; then
echo "[deployment-checker] ERROR: checker script not found: $CHECKER_SCRIPT" >&2
DEPLOY_STATUS="failed"
FAILURE_REASON="checker script not found: $CHECKER_SCRIPT"
else
chmod +x "$CHECKER_SCRIPT"
# Run the script; it exits 0 on success, non-zero on failure/timeout
# stdout: "deploy_status=success sha=abc123" or "deploy_status=failed reason=OOM"
POLL_OUTPUT=$(HEALTH_URL="$HEALTH_URL" \
EXPECTED_SHA="$EXPECTED_SHA" \
TIMEOUT_MINUTES="$TIMEOUT_MINUTES" \
POLL_INTERVAL="$POLL_INTERVAL" \
bash "$CHECKER_SCRIPT" 2>&1) && POLL_EXIT=0 || POLL_EXIT=$?
echo "Checker output: $POLL_OUTPUT"
# Parse structured output (python3 avoids grep -P which is unavailable on macOS/BSD)
DEPLOY_STATUS=$(echo "$POLL_OUTPUT" | python3 -c "import sys,re; m=re.search(r'deploy_status=(\S+)', sys.stdin.read()); print(m.group(1) if m else 'failed')" 2>/dev/null || echo "failed")
DEPLOYED_SHA=$(echo "$POLL_OUTPUT" | python3 -c "import sys,re; m=re.search(r'sha=(\S+)', sys.stdin.read()); print(m.group(1) if m else '')" 2>/dev/null || echo "")
FAILURE_REASON=$(echo "$POLL_OUTPUT" | python3 -c "import sys,re; m=re.search(r'reason=(.+)', sys.stdin.read()); print(m.group(1).strip() if m else 'unknown')" 2>/dev/null || echo "unknown")
fi
Step 3: Apply Label and Comment
On success:
if [ "$DEPLOY_STATUS" = "success" ]; then
# Create label if missing
gh label create "deployed" --repo $REPO --color "0e8a16" \
--description "Deploy verified — health check passed" 2>/dev/null || true
gh pr edit $PR --repo $REPO --add-label "deployed"
gh pr comment $PR --repo $REPO --body "$(cat <<EOF
**Deployment verified** ✅
| Field | Value |
|-------|-------|
| Status | \`success\` |
| Deployed SHA | \`${DEPLOYED_SHA:-confirmed}\` |
| Health URL | \`$HEALTH_URL\` |
| Waited | up to ${TIMEOUT_MINUTES} min |
The \`deployed\` label triggers post-deploy file-match actions.
EOF
)"
fi
On failure or timeout:
if [ "$DEPLOY_STATUS" != "success" ]; then
# Create label if missing
gh label create "deploy-failed" --repo $REPO --color "d93f0b" \
--description "Deploy verification failed or timed out" 2>/dev/null || true
gh pr edit $PR --repo $REPO --add-label "deploy-failed"
gh pr comment $PR --repo $REPO --body "$(cat <<EOF
**Deployment check failed** ❌
| Field | Value |
|-------|-------|
| Status | \`failed\` |
| Reason | ${FAILURE_REASON} |
| Health URL | \`$HEALTH_URL\` |
| Timeout | ${TIMEOUT_MINUTES} min |
| Expected SHA | \`${EXPECTED_SHA:-unknown}\` |
**Action required:** Check the deployment manually. Remove \`deploy-failed\` and re-add \`deployed\` once confirmed.
EOF
)"
fi
Step 4: Write Report
PYLOT_DIR="${PYLOT_DIR:-$HOME/projects/fellowship-dev/pylot}"
REPORT="$PYLOT_DIR/reports/$(date +%Y-%m-%d)-deployment-checker-$(echo $REPO | tr '/' '-')-pr${PR}.md"
cat > "$REPORT" <<EOF
# Deployment Checker: $REPO PR #$PR
**Date**: $(date +%Y-%m-%d)
**PR**: [$REPO#$PR]($PR_URL) — $PR_TITLE
**Status**: $DEPLOY_STATUS
**SHA**: ${DEPLOYED_SHA:-not confirmed}
**Reason**: ${FAILURE_REASON:-n/a}
## Poll Output
\`\`\`
$POLL_OUTPUT
\`\`\`
EOF
Team Checker Script Interface
Each team provides a bash script at a path configured in their CLAUDE.md. The script must:
Inputs (env vars):
| Variable | Required | Description |
|---|---|---|
HEALTH_URL |
yes | URL to GET for health check |
EXPECTED_SHA |
no | Git sha to wait for (compare against health response) |
TIMEOUT_MINUTES |
no | Max wait in minutes (default: 5) |
POLL_INTERVAL |
no | Seconds between polls (default: 30) |
Exit codes:
0— deployment confirmed1— timeout or failure
stdout format (last line):
deploy_status=success sha=abc123def456
deploy_status=failed reason=timeout_after_5min
deploy_status=failed reason=health_check_returned_503
Example minimal script:
#!/bin/bash
# scripts/deployment-checker-pylot.sh
TIMEOUT=$((${TIMEOUT_MINUTES:-5} * 60))
INTERVAL=${POLL_INTERVAL:-30}
ELAPSED=0
while [ $ELAPSED -lt $TIMEOUT ]; do
RESPONSE=$(curl -sf "$HEALTH_URL" 2>/dev/null || echo "")
CURRENT_SHA=$(echo "$RESPONSE" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('sha',''))" 2>/dev/null || echo "")
if [ -n "$EXPECTED_SHA" ] && [ "$CURRENT_SHA" = "$EXPECTED_SHA" ]; then
echo "deploy_status=success sha=$CURRENT_SHA"
exit 0
elif [ -z "$EXPECTED_SHA" ] && [ -n "$RESPONSE" ]; then
# No sha to compare — just verify health returns 200
echo "deploy_status=success sha=confirmed"
exit 0
fi
sleep $INTERVAL
ELAPSED=$((ELAPSED + INTERVAL))
done
echo "deploy_status=failed reason=timeout_after_${TIMEOUT_MINUTES:-5}min"
exit 1
Notes
- Zero token burn during polling. Claude only runs before and after the bash script. The bash script does all the waiting.
- SHA comparison is optional. If
EXPECTED_SHAis empty, the checker confirms health endpoint returns 200. - Timeout is team-configurable. 5 min for Pylot (auto-pull), 30 min for Lexgo (GitHub Actions + Fly.io).
deployedlabel chains into/post-deployvia event-rules — no manual handoff needed.- Monorepo targets: dispatch one deployment-checker job per target if needed.
More from fellowship-dev/dogfooded-skills
entropy-check
Sensor — checks doc freshness and computes domain quality grades. Never fixes. Detects staleness, missing coverage, and FlowChad gaps. Updates QUALITY_SCORE.md. Skips inapplicable signals per repo.
16distill
Post-mission audit and distillation — capture mode classifies a completed mission using an 8-code failure taxonomy and writes an audit JSON; analyze mode aggregates audit JSONs into a findings report and creates GitHub issues with recommendations.
14migrate-skill
Move a skill from claude-toolkit plugin (or local .claude/skills) into the dogfooded-skills library, then import it back. Use when consolidating skills into the shared repo.
14skill-builder
Write a high-quality agent skill — covers frontmatter spec, section structure, quality criteria, and common antipatterns.
13popsicle
Agent-native onboarding doc generator — builds coverage maps, health baselines, generated docs, and agent adapters so any AI tool can autonomously navigate your repo.
8setup-harness
Scaffold the knowledge layer for a repo — ARCHITECTURE.md, QUALITY_SCORE.md, enhanced docs/code-structure.md, docs/code-guidelines.md, and FlowChad flow stubs. Gives agents a map, not a 1,000-page manual.
8