prompt-engineering
User request: $ARGUMENTS
Create or update an LLM prompt. Prompts act as manifests: clear goal, clear constraints, freedom in execution.
If no request provided: Ask the user whether they want to create a new prompt, update an existing one, or review prompt structure.
If creating: Discover goal, constraints, and structure through targeted questions.
If updating: Read existing prompt, identify issues against principles, make targeted fixes.
If creating or updating a skill: Read references/skills.md for skill-specific architecture patterns (folder structure, progressive disclosure, gotchas, setup config, description-as-trigger, skill type awareness) before proceeding.
Context Discovery
Before writing or improving a prompt, surface all required context through user engagement. Missing domain knowledge creates ambiguous prompts. You can't surface latent requirements you don't understand.
What to discover:
| Context Type | What to Surface |
|---|---|
| Domain knowledge | Industry terms, conventions, patterns, constraints |
| User types | Who interacts, expertise level, expectations |
| Success criteria | What good output looks like, what makes it fail |
| Edge cases | Unusual inputs, error handling, boundary conditions |
| Constraints | Hard limits (length, format, tone), non-negotiables |
| Integration context | Where prompt fits, what comes before/after |
Interview method:
| Principle | How |
|---|---|
| Generate candidates, learn from reactions | Don't ask open-ended "what do you want?" Propose concrete options: "Should this be formal or conversational? (Recommended: formal for enterprise context)" |
| Mark recommended options | Reduce cognitive load. For single-select, mark one "(Recommended)". For multi-select, mark sensible defaults or none if all equally valid. |
| Outside view | "What typically fails in prompts like this?" "What have you seen go wrong before?" |
| Pre-mortem | "If this prompt failed in production, what would likely cause it?" |
| Discovered ≠ confirmed | When you infer constraints from context, confirm before encoding: "I'm inferring X should be a constraint. Is that right?" This includes ambiguous scope: list your in-scope and out-of-scope assumptions. |
| Encode explicit statements | When user states a preference or requirement, it must appear in the final prompt. Don't let constraints get lost. |
| Domain terms | Ask for definitions, don't guess. Jargon you don't understand creates ambiguous prompts. |
| Missing examples | Ask for good/bad output examples when success criteria are unclear. |
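The "generate candidates" and "mark recommended options" rows can be sketched as a small helper that renders a concrete question. This is a minimal illustration, not part of the skill itself: the `Option` class and `render_question` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One concrete candidate the user can react to (hypothetical helper type)."""
    label: str
    recommended: bool = False
    reason: str = ""


def render_question(question: str, options: list[Option]) -> str:
    """Format a candidate-based question, marking recommended options."""
    lines = [question]
    for number, option in enumerate(options, start=1):
        line = f"{number}. {option.label}"
        if option.recommended:
            # Append the marker, with the reason when one is given.
            line += " (Recommended"
            if option.reason:
                line += f": {option.reason}"
            line += ")"
        lines.append(line)
    return "\n".join(lines)


question = render_question(
    "Should this be formal or conversational?",
    [
        Option("Formal", recommended=True, reason="formal for enterprise context"),
        Option("Conversational"),
    ],
)
print(question)
```

For a multi-select question, the same helper works with several options marked as recommended, or none when all are equally valid.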
Stopping rule: Continue probing until you are very confident that further questions would yield nothing new, or the user signals "enough". Err toward more probing: every requirement discovered now is one fewer failure later.
Handling ambiguity: Critical ambiguities (those that would cause prompt failure) require clarification even if the user wants to move on. For minor ambiguities, document the chosen default and proceed. When in doubt, ask: a prompt built on assumptions will fail in ways the user didn't expect.
Core Principles
| Principle | What It Means |
|---|---|
| WHAT and WHY, not HOW | State goals and constraints. Don't prescribe steps the model knows how to do. |
| Trust capability, enforce discipline | Model knows how to search, analyze, generate. Only specify guardrails. |
| Maximize information density | Every word earns its place. Same meaning in fewer words is better. |
| Avoid arbitrary values | "Max 4 rounds" becomes rigid. State the principle: "stop when converged". |
| Output structure when needed | Define format only if artifact requires it. Otherwise let agent decide. |
Issue Types
Clarity:
- Ambiguous instructions (multiple interpretations)
- Vague language ("be helpful", "use good judgment", "when appropriate")
- Implicit expectations (unstated assumptions)
Conflict:
- Contradictory rules ("Be concise" vs "Explain thoroughly")
- Priority collisions (two MUST rules that can't both be satisfied)
- Edge case gaps (what happens when rules don't cover a situation?)
Structure:
- Buried critical info (important rules hidden in middle)
- No hierarchy (all instructions treated as equal priority)
- Unintentional redundancy (but: repetition can be intentional emphasis—don't remove if it reinforces critical rules)
Anti-Patterns to Eliminate
| Anti-pattern | Example | Fix |
|---|---|---|
| Prescribing HOW | "First search, then read, then analyze..." | State goal: "Understand the pattern" |
| Arbitrary limits | "Max 3 iterations", "2-4 examples" | Principle: "until converged", "as needed" |
| Capability instructions | "Use grep to search", "Read the file" | Remove - model knows how |
| Rigid checklists | Step-by-step heuristics tables | Convert to principles |
| Weak hedging | "Try to", "maybe", "if possible" | Direct imperative: "Do X" |
| Absolutes for judgment calls | "ALWAYS", "NEVER", "MUST" applied to non-invariants (when to search, ask, iterate, retry) | Decision rules: "When X, do Y; otherwise Z". Reserve absolutes for true invariants — safety rules, required fields, hard constraints |
| Buried critical info | Important rules in middle | Surface prominently |
| Over-engineering | 10 phases for a simple task | Match complexity to need |
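Two rows of the table above, weak hedging and absolutes, lend themselves to a rough mechanical first pass. The sketch below assumes a simple phrase match; the phrase lists come from the table's own examples, and the `find_flags` function is a hypothetical helper. A flagged absolute still needs human review, since absolutes are legitimate for true invariants.

```python
import re

# Phrase lists taken from the table's examples (assumption: extend per project).
WEAK_HEDGES = ["try to", "maybe", "if possible"]
ABSOLUTES = ["ALWAYS", "NEVER", "MUST"]


def find_flags(prompt: str) -> list[tuple[int, str, str]]:
    """Return (line_number, kind, phrase) for each flagged phrase in the prompt."""
    flags = []
    for line_number, line in enumerate(prompt.splitlines(), start=1):
        for phrase in WEAK_HEDGES:
            # Hedges are matched case-insensitively on word boundaries.
            if re.search(rf"\b{re.escape(phrase)}\b", line, flags=re.IGNORECASE):
                flags.append((line_number, "weak hedging", phrase))
        for word in ABSOLUTES:
            # Case-sensitive: only the upper-case form is flagged.
            if re.search(rf"\b{word}\b", line):
                flags.append((line_number, "absolute", word))
    return flags


sample = "Try to keep answers short.\nALWAYS search before answering."
for flag in find_flags(sample):
    print(flag)
```

The output is a list of candidates to inspect, not a verdict: whether "ALWAYS search" is an invariant or a judgment call is still decided by the reader.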
When Updating Prompts
High-signal changes only: Every change must address a real failure mode or materially improve clarity. Don't change for the sake of change.
Right-sized changes: Don't overcorrect. One edge case doesn't warrant restructuring.
Questions before changing:
- Does this change address a real failure mode?
- Am I adding complexity to solve a rare case?
- Can this be said in fewer words?
- Am I turning a principle into a rigid rule?
Over-engineering warning signs:
- Prompt length doubled or tripled
- Adding edge cases that won't happen
- "Improving" clear language into verbose language
- Adding examples for obvious behaviors
Memento Pattern (Multi-Phase Workflows Only)
For prompts involving accumulated findings across steps:
| LLM Limitation | Pattern Response |
|---|---|
| Context rot (middle content lost) | Write findings to log after EACH step |
| Working memory is limited | Todo lists externalize tracked areas |
| Synthesis failure at scale | Read full log BEFORE final output |
| Recency bias | Refresh moves findings to context end |
Key disciplines:
- →log after each collection step (discipline, not capability)
- Refresh: read full log before synthesis (restores context)
- Acceptance criteria on each todo ("; done when X")
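The log-then-refresh discipline can be illustrated with a small append-only log. This is a sketch under stated assumptions: the `FindingsLog` class, the file name `findings.log`, and the sample findings are all hypothetical, and a real workflow would have the agent write to whatever log path the prompt specifies.

```python
import tempfile
from pathlib import Path


class FindingsLog:
    """Append-only log that externalizes findings between steps (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.path.write_text("", encoding="utf-8")  # start with an empty log

    def log(self, step: str, finding: str) -> None:
        # Written after each step so nothing depends on in-context memory.
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(f"[{step}] {finding}\n")

    def refresh(self) -> str:
        # Read the full log before synthesis; the caller places this text
        # at the end of the context.
        return self.path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as directory:
    log = FindingsLog(Path(directory) / "findings.log")
    log.log("step 1", "API returns paginated results")
    log.log("step 2", "Rate limit applies per token")
    full_log = log.refresh()

print(full_log)
```

Reading the whole file back just before synthesis is what counters both context rot and recency bias: every finding reappears together at the end of the context.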
Prompt Structure Reference
Skills/Agents
---
name: kebab-case-name
description: 'What it does. When to use. Trigger terms.'
---
**User request**: $ARGUMENTS
{One-line mission - WHAT, not HOW}
{Empty input handling}
{Log file path if multi-phase}
## {Sections based on actual workflow needs}
{Goals and constraints per section}
## Key Principles
| Principle | Rule |
|-----------|------|
| {Discipline} | {Enforcement} |
## Gotchas
{Known failure modes Claude hits — specific, actionable, observed}
## Never Do
- {Anti-pattern}
System Instructions
## Role
{Identity and stance — who the model is and how it behaves}
## Goal
{User-visible outcome — what the run produces}
## Success criteria
{Anything that would cause dissatisfaction with the run:
output correctness; validation passing (tests, lint, schema when available);
time / iteration bounds; handling of non-success cases — retry, fallback, abstain, ask}
## Constraints
{MUST > SHOULD > PREFER priority — reserve MUST for true invariants per "Absolutes for judgment calls" anti-pattern}
## Output
{Format requirements if needed}
Skill Description Pattern
Descriptions drive auto-invocation. Pattern: What + When + Triggers
# Weak
description: 'Helps with prompts'
# Strong
description: 'Craft or update LLM prompts from first principles. Use when creating new prompts, updating existing ones, or reviewing prompt structure.'
- Include trigger terms users say
- Specify when to use
- Under 1024 chars
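The three checks in the list above can be approximated with a short validator. This is a heuristic sketch, not an official check: the `check_description` function is hypothetical, and detecting a "when to use" clause by looking for "use when" or "use for" is an assumption made for illustration.

```python
MAX_DESCRIPTION_CHARS = 1024  # limit stated in the list above


def check_description(description: str, trigger_terms: list[str]) -> list[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if len(description) >= MAX_DESCRIPTION_CHARS:
        problems.append(f"{len(description)} chars; keep under {MAX_DESCRIPTION_CHARS}")
    lowered = description.lower()
    # Heuristic (assumption): a "when" clause is signalled by "use when" or "use for".
    if "use when" not in lowered and "use for" not in lowered:
        problems.append("no 'when to use' clause found")
    # Trigger terms are matched as case-insensitive substrings.
    missing = [term for term in trigger_terms if term.lower() not in lowered]
    if missing:
        problems.append("missing trigger terms: " + ", ".join(missing))
    return problems


weak = "Helps with prompts"
strong = (
    "Craft or update LLM prompts from first principles. Use when creating new "
    "prompts, updating existing ones, or reviewing prompt structure."
)
print(check_description(weak, ["prompt"]))
print(check_description(strong, ["prompt", "updating"]))
```

The trigger terms passed in should be the words users actually say when they need the skill; the validator only confirms they are present.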
Emotional Tone
Prompts shape the model's internal emotional state before generation begins. Research on transformer internals shows emotion concept representations that causally influence behavior — including sycophancy, reward hacking, and misalignment. These principles help calibrate the emotional context a prompt creates.
| Principle | What It Means | Why |
|---|---|---|
| Keep arousal low | Avoid urgency language ("CRITICAL", "you MUST"), excessive praise ("you're amazing at this!"), and pressure framing. | High-arousal emotions causally drive sycophancy (positive arousal) or corner-cutting and misalignment (negative arousal). |
| Opening framing propagates | The emotional tone set in a prompt's opening persists into the model's response planning. A tense opening produces a tense response. | Emotional context from early tokens propagates through later processing layers, even when subsequent content is neutral. |
| Normalize failure in iterative prompts | For agentic or multi-step prompts, explicitly frame failure as acceptable: "if this approach doesn't work, try another." | Repeated failures build desperation that causally drives reward hacking and corner-cutting solutions. |
| Sycophancy-harshness tradeoff | Pushing toward warmth and positivity increases sycophancy. Pushing away from warmth increases bluntness and harshness. Aim for a "trusted advisor" tone — honest pushback delivered with care. | Positive-valence emotion representations causally increase agreement-seeking behavior; their absence produces unnecessary harshness. |
| Avoid unintended high-stakes framing | The model reads semantic intensity, not surface patterns. "This is critical to my career" or "failure is not an option" activates negative emotion representations even if intended as motivation. | Emotion representations respond to the meaning of situations — quantities, stakes, consequences — not to keywords. |
Gotchas
- Rewriting working language for style: Claude rewrites clear, working prompt text for stylistic preference. If existing language is unambiguous and effective, don't touch it.
- Skipping context discovery when the task seems obvious: Claude jumps to writing/editing without probing. Even "simple" prompt tasks have hidden constraints — force discovery before producing output.
- Over-engineering simple prompts: A 3-line prompt doesn't need 10 sections, a memento pattern, and a validation checklist. Match complexity to the task.
- Converting principles into rigid rules: "Stop when converged" becomes "Max 5 iterations." Principles give flexibility; rigid rules create edge cases.
- Adding examples for behaviors Claude already knows: Examples earn their place only when they demonstrate non-obvious or counter-intuitive behavior.
Validation Checklist
Before finalizing any prompt:
- All ambiguities resolved through user questions
- Domain context gathered (terms, conventions, constraints)
- Goals stated, not steps prescribed
- No arbitrary numbers (or justified if present)
- Weak language replaced with direct imperatives
- Critical rules surfaced prominently
- Complexity matches the task
- Each word earns its place
- If multi-phase: memento pattern applied correctly