prompt-engineering

User request: $ARGUMENTS

Create or update an LLM prompt. Prompts act as manifests: clear goal, clear constraints, freedom in execution.

If no request provided: Ask the user whether they want to create a new prompt, update an existing one, or review prompt structure.

If creating: Discover goal, constraints, and structure through targeted questions.

If updating: Read existing prompt, identify issues against principles, make targeted fixes.

If creating or updating a skill: Read references/skills.md for skill-specific architecture patterns (folder structure, progressive disclosure, gotchas, setup config, description-as-trigger, skill type awareness) before proceeding.

Context Discovery

Before writing or improving a prompt, surface all required context through user engagement. Missing domain knowledge creates ambiguous prompts. You can't surface latent requirements in a domain you don't understand.

What to discover:

| Context Type | What to Surface |
|--------------|-----------------|
| Domain knowledge | Industry terms, conventions, patterns, constraints |
| User types | Who interacts, expertise level, expectations |
| Success criteria | What good output looks like, what makes it fail |
| Edge cases | Unusual inputs, error handling, boundary conditions |
| Constraints | Hard limits (length, format, tone), non-negotiables |
| Integration context | Where prompt fits, what comes before/after |

Interview method:

| Principle | How |
|-----------|-----|
| Generate candidates, learn from reactions | Don't ask open-ended "what do you want?" Propose concrete options: "Should this be formal or conversational? (Recommended: formal for enterprise context)" |
| Mark recommended options | Reduce cognitive load. For single-select, mark one "(Recommended)". For multi-select, mark sensible defaults, or none if all are equally valid. |
| Outside view | "What typically fails in prompts like this?" "What have you seen go wrong before?" |
| Pre-mortem | "If this prompt failed in production, what would likely cause it?" |
| Discovered ≠ confirmed | When you infer constraints from context, confirm before encoding: "I'm inferring X should be a constraint. Is that right?" This includes ambiguous scope (list what you assume is in and out). |
| Encode explicit statements | When the user states a preference or requirement, it must appear in the final prompt. Don't let constraints get lost. |
| Domain terms | Ask for definitions, don't guess. Jargon you don't understand creates ambiguous prompts. |
| Missing examples | Ask for good/bad output examples when success criteria are unclear. |
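As a minimal sketch of "generate candidates" and "mark recommended options", the helper below renders a single-select question with at most one option marked "(Recommended)". The `Option` class and `render_single_select` function are hypothetical names introduced for illustration; they are not part of this skill.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One concrete candidate answer (hypothetical helper type)."""
    label: str
    recommended: bool = False
    reason: str = ""


def render_single_select(question: str, options: list[Option]) -> str:
    """Format a concrete-choice question, marking at most one recommended option."""
    if sum(opt.recommended for opt in options) > 1:
        raise ValueError("single-select questions take at most one recommended option")
    lines = [question]
    for i, opt in enumerate(options, start=1):
        suffix = ""
        if opt.recommended:
            # Append the reason only when one was given.
            suffix = " (Recommended" + (f": {opt.reason}" if opt.reason else "") + ")"
        lines.append(f"  {i}. {opt.label}{suffix}")
    return "\n".join(lines)


print(render_single_select(
    "Should this be formal or conversational?",
    [Option("Formal", True, "formal for enterprise context"), Option("Conversational")],
))
```

Raising on two recommended options keeps the single-select rule from the table enforceable; a multi-select variant would drop that check.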

Stopping rule: Continue probing until you are very confident that further questions would yield nothing new, or the user signals "enough". Err toward more probing: every requirement discovered now is one fewer failure later.

Handling ambiguity: Critical ambiguities (those that would cause the prompt to fail) require clarification even if the user wants to move on. For minor ambiguities, document the chosen defaults and proceed. When in doubt, ask: a prompt built on assumptions will fail in ways the user didn't expect.

Core Principles

| Principle | What It Means |
|-----------|---------------|
| WHAT and WHY, not HOW | State goals and constraints. Don't prescribe steps the model knows how to do. |
| Trust capability, enforce discipline | Model knows how to search, analyze, generate. Only specify guardrails. |
| Maximize information density | Every word earns its place. Fewer words with the same meaning is better. |
| Avoid arbitrary values | "Max 4 rounds" becomes rigid. State the principle: "stop when converged". |
| Output structure when needed | Define format only if the artifact requires it. Otherwise let the agent decide. |

Issue Types

Clarity:

  • Ambiguous instructions (multiple interpretations)
  • Vague language ("be helpful", "use good judgment", "when appropriate")
  • Implicit expectations (unstated assumptions)

Conflict:

  • Contradictory rules ("Be concise" vs "Explain thoroughly")
  • Priority collisions (two MUST rules that can't both be satisfied)
  • Edge case gaps (what happens when rules don't cover a situation?)

Structure:

  • Buried critical info (important rules hidden in middle)
  • No hierarchy (all instructions treated as equal priority)
  • Unintentional redundancy (but: repetition can be intentional emphasis—don't remove if it reinforces critical rules)

Anti-Patterns to Eliminate

| Anti-pattern | Example | Fix |
|--------------|---------|-----|
| Prescribing HOW | "First search, then read, then analyze..." | State goal: "Understand the pattern" |
| Arbitrary limits | "Max 3 iterations", "2-4 examples" | Principle: "until converged", "as needed" |
| Capability instructions | "Use grep to search", "Read the file" | Remove - model knows how |
| Rigid checklists | Step-by-step heuristics tables | Convert to principles |
| Weak hedging | "Try to", "maybe", "if possible" | Direct imperative: "Do X" |
| Absolutes for judgment calls | "ALWAYS", "NEVER", "MUST" applied to non-invariants (when to search, ask, iterate, retry) | Decision rules: "When X, do Y; otherwise Z". Reserve absolutes for true invariants: safety rules, required fields, hard constraints |
| Buried critical info | Important rules in middle | Surface prominently |
| Over-engineering | 10 phases for a simple task | Match complexity to need |
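A few of these anti-patterns leave surface traces that a script can flag for human review. The sketch below is a rough heuristic, not a substitute for reading the prompt: the phrase lists come from the examples in the table, while `lint_prompt` and the `ARBITRARY_LIMIT` regex are assumptions introduced here. A hit on an absolute is only a candidate, since true invariants may legitimately use one.

```python
import re

# Phrase lists taken from the table's examples; extend or trim for your prompts.
WEAK_HEDGES = ("try to", "maybe", "if possible")
ABSOLUTES = ("ALWAYS", "NEVER", "MUST")
# Assumed pattern for limits such as "Max 3 iterations".
ARBITRARY_LIMIT = re.compile(r"\bmax(?:imum)?\s+\d+\b", re.IGNORECASE)


def lint_prompt(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, anti_pattern, matched_text) for each candidate hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for phrase in WEAK_HEDGES:
            if re.search(rf"\b{re.escape(phrase)}\b", line, re.IGNORECASE):
                findings.append((lineno, "weak hedging", phrase))
        for word in ABSOLUTES:
            # Case-sensitive on purpose: only the shouted form is flagged.
            if re.search(rf"\b{word}\b", line):
                findings.append((lineno, "absolute", word))
        match = ARBITRARY_LIMIT.search(line)
        if match:
            findings.append((lineno, "arbitrary limit", match.group(0)))
    return findings


sample = "Try to be brief.\nALWAYS search first. Max 3 iterations.\nStop when converged."
for finding in lint_prompt(sample):
    print(finding)
```

Anti-patterns such as prescribing HOW or over-engineering have no reliable keyword signature, so they are left to review.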

When Updating Prompts

High-signal changes only: Every change must address a real failure mode or materially improve clarity. Don't change for the sake of change.

Right-sized changes: Don't overcorrect. One edge case doesn't warrant restructuring.

Questions before changing:

  • Does this change address a real failure mode?
  • Am I adding complexity to solve a rare case?
  • Can this be said in fewer words?
  • Am I turning a principle into a rigid rule?

Over-engineering warning signs:

  • Prompt length doubled or tripled
  • Adding edge cases that won't happen
  • "Improving" clear language into verbose language
  • Adding examples for obvious behaviors

Memento Pattern (Multi-Phase Workflows Only)

For prompts involving accumulated findings across steps:

| LLM Limitation | Pattern Response |
|----------------|------------------|
| Context rot (middle content lost) | Write findings to log after EACH step |
| Working memory is limited | Todo lists externalize tracked areas |
| Synthesis failure at scale | Read full log BEFORE final output |
| Recency bias | Refresh moves findings to context end |

Key disciplines:

  • Write to the log after each collection step (discipline, not capability)
  • Refresh: read full log before synthesis (restores context)
  • Acceptance criteria on each todo ("; done when X")
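
The three disciplines can be sketched as plain file operations. This is an illustration under assumptions, not how any particular agent runtime implements the pattern: the function names, the `findings.md` file name, the log entry layout, and the sample findings are all made up for the example.

```python
import tempfile
from pathlib import Path


def append_finding(log_path: Path, step: str, finding: str) -> None:
    """Append one finding to the log right after the step that produced it."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(f"## {step}\n{finding}\n\n")


def refresh(log_path: Path) -> str:
    """Re-read the full log so all findings are recent context before synthesis."""
    return log_path.read_text(encoding="utf-8")


def format_todo(task: str, done_when: str) -> str:
    """Attach an acceptance criterion to a todo item."""
    return f"- [ ] {task}; done when {done_when}"


def demo() -> str:
    """Log two illustrative findings in a temporary directory, then refresh."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "findings.md"
        append_finding(log, "Step 1", "Auth uses session cookies.")
        append_finding(log, "Step 2", "Tokens expire after logout.")
        return refresh(log)


print(format_todo("Survey auth module", "all entry points are listed"))
print(demo())
```

Appending after every step means a lost or truncated context costs at most the current step; the refresh then places the whole record at the end of context, where recency bias works in its favor.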

Prompt Structure Reference

Skills/Agents

---
name: kebab-case-name
description: 'What it does. When to use. Trigger terms.'
---

**User request**: $ARGUMENTS

{One-line mission - WHAT, not HOW}

{Empty input handling}

{Log file path if multi-phase}

## {Sections based on actual workflow needs}

{Goals and constraints per section}

## Key Principles

| Principle | Rule |
|-----------|------|
| {Discipline} | {Enforcement} |

## Gotchas

{Known failure modes Claude hits — specific, actionable, observed}

## Never Do

- {Anti-pattern}

System Instructions

## Role
{Identity and stance — who the model is and how it behaves}

## Goal
{User-visible outcome — what the run produces}

## Success criteria
{Anything that would cause dissatisfaction with the run:
 output correctness; validation passing (tests, lint, schema when available);
 time / iteration bounds; handling of non-success cases — retry, fallback, abstain, ask}

## Constraints
{MUST > SHOULD > PREFER priority — reserve MUST for true invariants per "Absolutes for judgment calls" anti-pattern}

## Output
{Format requirements if needed}

Skill Description Pattern

Descriptions drive auto-invocation. Pattern: What + When + Triggers

# Weak
description: 'Helps with prompts'

# Strong
description: 'Craft or update LLM prompts from first principles. Use when creating new prompts, updating existing ones, or reviewing prompt structure.'
  • Include trigger terms users say
  • Specify when to use
  • Under 1024 chars
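
A description can be checked mechanically against part of this pattern. The sketch below assumes the kebab-case `name` convention from the Skills/Agents template and the 1024-character limit above; `check_frontmatter` is a hypothetical helper, and looking for the phrase "use when" is only a rough proxy for "specify when to use".

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # limit stated in the list above
KEBAB_CASE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def check_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not KEBAB_CASE.fullmatch(name):
        problems.append("name is not kebab-case")
    if len(description) >= MAX_DESCRIPTION_CHARS:
        problems.append("description is not under 1024 characters")
    # Heuristic (assumption): a "Use when ..." clause signals the When part.
    if "use when" not in description.lower():
        problems.append("description does not say when to use the skill")
    return problems


strong = (
    "Craft or update LLM prompts from first principles. Use when creating new "
    "prompts, updating existing ones, or reviewing prompt structure."
)
print(check_frontmatter("prompt-engineering", strong))
print(check_frontmatter("Prompt Engineering", "Helps with prompts"))
```

Whether the description contains the trigger terms users actually say still needs a human check.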

Emotional Tone

Prompts shape the model's internal emotional state before generation begins. Research on transformer internals shows emotion concept representations that causally influence behavior — including sycophancy, reward hacking, and misalignment. These principles help calibrate the emotional context a prompt creates.

| Principle | What It Means | Why |
|-----------|---------------|-----|
| Keep arousal low | Avoid urgency language ("CRITICAL", "you MUST"), excessive praise ("you're amazing at this!"), and pressure framing. | High-arousal emotions causally drive sycophancy (positive arousal) or corner-cutting and misalignment (negative arousal). |
| Opening framing propagates | The emotional tone set in a prompt's opening persists into the model's response planning. A tense opening produces a tense response. | Emotional context from early tokens propagates through later processing layers, even when subsequent content is neutral. |
| Normalize failure in iterative prompts | For agentic or multi-step prompts, explicitly frame failure as acceptable: "if this approach doesn't work, try another." | Repeated failures build desperation that causally drives reward hacking and corner-cutting solutions. |
| Sycophancy-harshness tradeoff | Pushing toward warmth and positivity increases sycophancy. Pushing away from warmth increases bluntness and harshness. Aim for a "trusted advisor" tone: honest pushback delivered with care. | Positive-valence emotion representations causally increase agreement-seeking behavior; their absence produces unnecessary harshness. |
| Avoid unintended high-stakes framing | The model reads semantic intensity, not surface patterns. "This is critical to my career" or "failure is not an option" activates negative emotion representations even if intended as motivation. | Emotion representations respond to the meaning of situations (quantities, stakes, consequences), not to keywords. |

Gotchas

  • Rewriting working language for style: Claude rewrites clear, working prompt text for stylistic preference. If existing language is unambiguous and effective, don't touch it.
  • Skipping context discovery when the task seems obvious: Claude jumps to writing/editing without probing. Even "simple" prompt tasks have hidden constraints — force discovery before producing output.
  • Over-engineering simple prompts: A 3-line prompt doesn't need 10 sections, a memento pattern, and a validation checklist. Match complexity to the task.
  • Converting principles into rigid rules: "Stop when converged" becomes "Max 5 iterations." Principles give flexibility; rigid rules create edge cases.
  • Adding examples for behaviors Claude already knows: Examples earn their place only when they demonstrate non-obvious or counter-intuitive behavior.

Validation Checklist

Before finalizing any prompt:

  • All ambiguities resolved through user questions
  • Domain context gathered (terms, conventions, constraints)
  • Goals stated, not steps prescribed
  • No arbitrary numbers (or justified if present)
  • Weak language replaced with direct imperatives
  • Critical rules surfaced prominently
  • Complexity matches the task
  • Each word earns its place
  • If multi-phase: memento pattern applied correctly