codex-prompt-engineering
Codex Prompt Engineering
Knowledge snapshot: 2026-02-20
Purpose
Use this skill to design or review prompts for Codex/GPT coding agents with focus on:
- quality and reliability
- token efficiency
- correct tool usage
- safe autonomy
Use When
- Writing or revising system prompts for coding agents
- Debugging weak agent behavior (stalling, verbosity, bad tool calls)
- Calibrating
reasoning_effort(none|low|medium|high|xhigh) - Defining tool orchestration and planning behaviors
- Building evaluation loops for prompt iteration
Core Rules
- Calibrate reasoning effort
noneor omit for trivial formatting/retrievalmediumdefault for interactive codinghigh/xhighfor complex autonomous tasks
- Run end-to-end autonomously
- gather context -> plan -> implement -> test -> refine
- ask only for ambiguity, destructive actions, or major architecture trade-offs
- Use correct tool contracts
- file edits:
apply_patch - shell execution:
exec_command - task tracking:
update_planwithplan: [{step, status}] - batch independent calls with
multi_tool_use.parallel
- Keep communication compact
- tiny change: 2-5 sentences
- medium change: <=6 bullets
- large change: per-file summary + rationale
- Evaluate systematically
- Analyze -> Measure -> Improve -> Repeat
- keep graders and representative datasets
- Apply security basics
- validate/sanitize untrusted inputs
- defend against direct/indirect prompt injection
- enforce least privilege for tools/data
Runtime Mapping Note
Some environments expose shell tools under different names. In this repository runtime:
- use
exec_command(notshell_command) - use
update_planschema:planwithstep/status
File Guide
reference.md: compact canonical guidance and checklistspatterns.md: reusable patterns beyond prompt engineeringexamples.md: concise before/after examples with correct contracts
Invocation
Use codex-prompt-engineering to review this prompt for quality and token efficiency.
More from williamhallatt/cogworks
cogworks
Start here — turn source material into a validated agent skill. Orchestrates cogworks-encode (synthesis) and cogworks-learn (skill generation).
29cogworks-encode
Use when combining 2+ sources on a single topic to produce a unified, decision-first knowledge base — especially when sources conflict, overlap, or must be mapped to explicit decision rules. Handles multi-source synthesis, contradiction resolution, and cross-source relationship extraction. Does not handle single-source summarization, copy-editing, or format conversion.
25cogworks-learn
Generate and validate agent skill files (SKILL.md, reference.md, metadata). Enforces structural contracts, quality gates, and runtime compatibility.
24claude-prompt-engineering
Optimize Claude Code prompts for Opus 4.6, Sonnet 4.5, and Haiku 4.5 with model-aware reasoning settings, context control, safe tool use, and concise output shaping.
8skill-evaluation
Guides systematic evaluation of Claude Code skills through eval-driven development, SMART success criteria, layered grading (deterministic then LLM-as-judge then human), four-category test datasets with negative controls, and observable behavior checks. Apply when designing skill tests, defining quality metrics, building test cases, grading skill outputs, choosing graders, calibrating LLM judges, or assessing whether a skill is production-ready. Use when asked "how do I test this skill", "is this ready to ship", or "what should my success criteria be".
6