Tmux Remote Worker Setup
Use this skill when the user points at an existing local tmux session and wants the matching remote worker configured for Codex.
What this skill does
- Reads `memory/optimized-workflow.md` and recent run logs before acting so the latest proven workflow is reused.
- Reads `references/worker_setup_memory.md` before acting so prior pitfalls and proven fixes are reused.
- Reads tmux pane metadata and scrollback to resolve the worker replica tied to the named session.
- Uses local `/kubebrain/brainctl` as the primary control path; the tmux session is only used to discover the worker.
- Builds a local offline payload for `node`, `npm`, `npx`, `codex`, `rg`, and local Codex skills, then installs them onto the remote worker.
- Copies `~/.codex/.env`, `~/.codex/AGENTS.md`, and `~/.codex/feishu_notify.py` through the local payload so the worker gets the same Codex config without manual paste.
- Verifies the final remote environment and checks that `python /home/i-huangsiming/work/tools/gpu_util.py` is still running.
Important rules
- Use the exact tmux session name the user gave you. If they did not provide one, ask for it.
- Do not guess the replica. The helper script extracts the latest `replica=` / `pod=` / `JOB_ID=` evidence from tmux history. If that evidence is missing, stop and ask the user for a session with preserved scrollback or for the explicit replica name.
- Always use the absolute brainctl path `/kubebrain/brainctl`.
- Keep Python execution inside `/mnt/step3-abla/siming/.venv/bin/python`.
- Do not interrupt the tmux pane or kill `gpu_util.py` unless the user explicitly asks. This workflow configures the worker through local `brainctl exec`, so the foreground process in tmux can stay untouched.
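The "do not guess the replica" rule can be sketched as below. This is only an illustration, not the helper script's actual code: the function name `latest_replica_evidence` is hypothetical, and it assumes each marker appears as `key=value` with a value that contains no whitespace.

```python
import re
from typing import Optional, Tuple

# Markers the helper is described as looking for in tmux history.
# Assumption: the value is the run of non-whitespace characters after '='.
EVIDENCE_PATTERN = re.compile(r"\b(replica|pod|JOB_ID)=(\S+)")


def latest_replica_evidence(scrollback: str) -> Optional[Tuple[str, str]]:
    """Return the last (key, value) marker in the scrollback, or None."""
    matches = EVIDENCE_PATTERN.findall(scrollback)
    if not matches:
        # No evidence: the caller should stop and ask the user, not guess.
        return None
    return matches[-1]
```

One way to obtain the scrollback text is `tmux capture-pane -p -S - -t <tmux-session-name>`, which prints the pane history to stdout. A `None` result corresponds to the "stop and ask" branch of the rule.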
Self-evolution loop
Before running the helper:
- Read `memory/optimized-workflow.md`.
- Read `references/worker_setup_memory.md`.
- Read the newest 3 logs in `memory/runs/` if they exist.
- Reuse any existing workaround or validation step that matches the current failure mode or environment.
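Selecting the newest three run logs could look like the sketch below. It assumes "newest" means most recently modified and that every file in `memory/runs/` is a run log; the function name `newest_run_logs` is hypothetical and is not part of the skill's scripts.

```python
from pathlib import Path
from typing import List


def newest_run_logs(runs_dir: str = "memory/runs", limit: int = 3) -> List[Path]:
    """Return up to `limit` files from runs_dir, most recently modified first."""
    root = Path(runs_dir)
    if not root.is_dir():
        # "if they exist": a missing directory simply means no logs yet.
        return []
    files = [p for p in root.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```

If the logs carry a timestamp in their file names, sorting by name may be more robust than modification time, since copying or syncing the directory can reset mtimes.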
After each configure or repair run:
- Inspect the current run artifacts in the workspace, especially:
  - `.../session_context.json`
  - `.../access_check.txt`
  - `.../remote_setup.stdout.txt`
  - `.../remote_setup.stderr.txt`
  - `.../verification.txt`
  - `.../verification.stderr.txt`
- Decide whether the run produced a new durable lesson:
  - a non-obvious failure signature
  - a more reliable fallback
  - a hidden dependency
  - a better verification technique
- Record the run with `python3 scripts/log_run.py ...`.
- Rebuild `memory/optimized-workflow.md` with `python3 scripts/update_workflow.py`.
- If the run produced a new durable lesson, add a concise entry to `references/worker_setup_memory.md`.
- If the run only repeated known behavior, keep `references/worker_setup_memory.md` unchanged.
When updating memory, prefer this shape:
- Signal: what was observed
- Lesson: the reusable decision or workaround
- Why it matters: how it improves future robustness
The goal is to make future worker setup more reliable, not to create a run log.
Default workflow
- Run the helper wrapper from this skill directory:
scripts/setup_remote_worker_from_tmux.sh configure <tmux-session-name> \
--workspace-dir workspace
- Read these generated files before replying:
  - `memory/optimized-workflow.md`
  - `references/worker_setup_memory.md`
  - `workspace/.../session_context.json`
  - `workspace/.../access_check.txt`
  - `workspace/.../verification.txt`
  - `workspace/.../summary.json`
- Report back:
  - resolved tmux session -> replica mapping
  - whether setup finished successfully
  - versions found for `codex`, `node`, `npm`, `rg`, and `nvitop`
  - confirmation that `notify` and trusted-project config were written
  - confirmation that `gpu_util.py` is still running
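The last item, confirming that `gpu_util.py` is still running, amounts to searching a process listing. The sketch below assumes the listing was captured on the worker in the format of `ps -eo pid,args`; how the helper actually performs this check is not documented here, and the function name `pids_running` is hypothetical.

```python
from typing import List


def pids_running(ps_output: str, script_path: str) -> List[int]:
    """Return PIDs whose command line mentions script_path.

    Expects text shaped like `ps -eo pid,args` output: one header line,
    then one process per line with the PID first and the command after it.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and script_path in parts[1]:
            pids.append(int(parts[0]))
    return pids
```

An empty list means the process was not found and should be reported as such; restarting or killing it is out of scope unless the user explicitly asks.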
Run log shape
Use scripts/log_run.py with fields such as:
- `request`
- `status`
- `session_name`
- `replica`
- `issues`
- `lessons`
- `actions`
- `outputs`
- `notes`
Keep logs concise and durable. Avoid secrets and transient noise.
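As an illustration of the log shape, the sketch below assembles a record from those field names and rejects anything else. The JSON layout and the function name `build_run_record` are assumptions; the real command-line interface of `scripts/log_run.py` is not shown in this document and may differ.

```python
import json

# Field names taken from the list above.
RUN_LOG_FIELDS = (
    "request", "status", "session_name", "replica", "issues",
    "lessons", "actions", "outputs", "notes",
)


def build_run_record(**fields) -> str:
    """Serialize the given fields as one JSON object, rejecting unknown keys."""
    unknown = sorted(set(fields) - set(RUN_LOG_FIELDS))
    if unknown:
        raise ValueError(f"unknown run log fields: {unknown}")
    record = {name: fields[name] for name in RUN_LOG_FIELDS if name in fields}
    return json.dumps(record, ensure_ascii=False)
```

Rejecting unknown keys keeps the records uniform across runs, which makes it easier to rebuild the optimized workflow from them later.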
If proxy/bootstrap fails
The helper script first runs networked commands with:
eval $(curl -s http://deploy.i.shaipower.com/httpproxy)
If that attempt fails, it retries once after:
unset https_proxy http_proxy all_proxy
If both attempts fail, surface the error instead of inventing a workaround.
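The retry policy above can be sketched as follows. This is not the helper script itself: the function name `run_with_proxy_fallback` is hypothetical, and the proxy settings are passed in as a dictionary rather than obtained by evaluating the output of the proxy endpoint.

```python
import os
import subprocess
from typing import Dict, Sequence

PROXY_VARS = ("https_proxy", "http_proxy", "all_proxy")


def run_with_proxy_fallback(
    cmd: Sequence[str], proxy_env: Dict[str, str]
) -> subprocess.CompletedProcess:
    """Run cmd with proxy variables set; retry once with them unset."""
    with_proxy = {**os.environ, **proxy_env}
    first = subprocess.run(cmd, env=with_proxy, capture_output=True, text=True)
    if first.returncode == 0:
        return first

    # Second and final attempt: same command, proxy variables removed.
    without_proxy = {k: v for k, v in os.environ.items() if k not in PROXY_VARS}
    second = subprocess.run(cmd, env=without_proxy, capture_output=True, text=True)
    if second.returncode == 0:
        return second

    # Both attempts failed: surface the errors instead of improvising.
    raise RuntimeError(
        "command failed with and without proxy\n"
        f"with proxy: {first.stderr.strip()}\n"
        f"without proxy: {second.stderr.strip()}"
    )
```

Raising with both error messages mirrors the rule: the caller sees exactly what failed in each attempt and no third strategy is attempted.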
Example prompts that should trigger this skill
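- "Set up the remote machine behind tmux session gpu2 so it can run codex"
- "The tmux session is named gpu; fix the codex / node / nvitop configuration on the matching worker"
- "All I know is that the tmux session is gpu2; go ahead and configure that remote server's environment"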