Sandbox Runner
SKILL.md
Sandbox Runner Skill
Run solutions through recursive sandboxing, iterating up to 100 times before presenting to human.
Concept
"Let agents fail 100 times in simulation so humans see only the winning solution."
Sandbox Loop
[Solution] → Sandbox → Test → Pass? → [Present to Human]
↑ │
└── No ──┘ (iterate, max 100x)
Configuration
sandbox:
max_iterations: 100
early_exit_threshold: 0.95 # Confidence to stop early
checkpoint_every: 10 # Save state every N iterations
tests:
- unit: "npm run test"
- lint: "npm run lint"
- build: "npm run build"
success_criteria:
all_tests_pass: true
no_lint_errors: true
build_success: true
Iteration Protocol
Per Iteration:
- Apply solution
- Run test suite
- Collect failures
- Analyze failures
- Generate fix
- Loop until pass OR max iterations
State Tracking:
iteration:
number: 42
status: running | passed | failed
failures: ["test A", "test B"]
fixes_attempted: [...]
confidence: 0.72
Early Exit Conditions
Stop before 100 iterations if:
- All tests pass (success)
- Confidence > 0.95
- Same failure 5x in a row (stuck)
- Critical error detected
Output Format
sandbox_result:
iterations_run: 47
final_status: passed | failed
solution:
code_changes: [...]
confidence: 0.89
iteration_log:
- iter: 1, status: failed, fixes: [...]
- iter: 2, status: failed, fixes: [...]
- iter: 47, status: passed
ready_for_hitl: true
Integration
- Input from: Swarm orchestrator
- Output to: HITL auditor
- Rollback: Restore to pre-sandbox state on failure