prompt-engineering

Installation

SKILL.md

User request: $ARGUMENTS

Create or update an LLM prompt. Prompts act as manifests: clear goal, clear constraints, freedom in execution.

If no request provided: Ask the user whether they want to create a new prompt, update an existing one, or review prompt structure.

If creating: Discover goal, constraints, and structure through targeted questions.

If updating: Read existing prompt, identify issues against principles, make targeted fixes.

If creating or updating a skill: Read references/skills.md for skill-specific architecture patterns (folder structure, progressive disclosure, gotchas, setup config, description-as-trigger, skill type awareness) before proceeding.

Context Discovery

Before writing or improving a prompt, surface all required context through user engagement. Missing domain knowledge creates ambiguous prompts. You can't surface latent requirements you don't understand.

What to discover:

Context Type	What to Surface
Domain knowledge	Industry terms, conventions, patterns, constraints
User types	Who interacts, expertise level, expectations
Success criteria	What good output looks like, what makes it fail
Edge cases	Unusual inputs, error handling, boundary conditions
Constraints	Hard limits (length, format, tone), non-negotiables
Integration context	Where prompt fits, what comes before/after

Interview method:

Principle	How
Generate candidates, learn from reactions	Don't ask open-ended "what do you want?" Propose concrete options: "Should this be formal or conversational? (Recommended: formal for enterprise context)"
Mark recommended options	Reduce cognitive load. For single-select, mark one "(Recommended)". For multi-select, mark sensible defaults or none if all equally valid.
Outside view	"What typically fails in prompts like this?" "What have you seen go wrong before?"
Pre-mortem	"If this prompt failed in production, what would likely cause it?"
Discovered ≠ confirmed	When you infer constraints from context, confirm before encoding: "I'm inferring X should be a constraint?" Includes ambiguous scope (list in/out assumptions).
Encode explicit statements	When user states a preference or requirement, it must appear in the final prompt. Don't let constraints get lost.
Domain terms	Ask for definitions, don't guess. Jargon you don't understand creates ambiguous prompts.
Missing examples	Ask for good/bad output examples when success criteria are unclear.

Stopping rule: Continue probing until very confident further questions would yield nothing new, or user signals "enough". Err toward more probing—every requirement discovered now is one fewer failure later.

Handling ambiguity: Critical ambiguities (those that would cause prompt failure) require clarification even if user wants to move on. Minor ambiguities can be documented with chosen defaults and proceed. When in doubt, ask—a prompt built on assumptions will fail in ways the user didn't expect.

Core Principles

Principle	What It Means
WHAT and WHY, not HOW	State goals and constraints. Don't prescribe steps the model knows how to do.
Trust capability, enforce discipline	Model knows how to search, analyze, generate. Only specify guardrails.
Maximize information density	Every word earns its place. Fewer words = same meaning = better.
Avoid arbitrary values	"Max 4 rounds" becomes rigid. State the principle: "stop when converged".
Output structure when needed	Define format only if artifact requires it. Otherwise let agent decide.

Issue Types

Clarity:

Ambiguous instructions (multiple interpretations)
Vague language ("be helpful", "use good judgment", "when appropriate")
Implicit expectations (unstated assumptions)

Conflict:

Contradictory rules ("Be concise" vs "Explain thoroughly")
Priority collisions (two MUST rules that can't both be satisfied)
Edge case gaps (what happens when rules don't cover a situation?)

Structure:

Buried critical info (important rules hidden in middle)
No hierarchy (all instructions treated as equal priority)
Unintentional redundancy (but: repetition can be intentional emphasis—don't remove if it reinforces critical rules)

Anti-Patterns to Eliminate

Anti-pattern	Example	Fix
Prescribing HOW	"First search, then read, then analyze..."	State goal: "Understand the pattern"
Arbitrary limits	"Max 3 iterations", "2-4 examples"	Principle: "until converged", "as needed"
Capability instructions	"Use grep to search", "Read the file"	Remove - model knows how
Rigid checklists	Step-by-step heuristics tables	Convert to principles
Weak hedging	"Try to", "maybe", "if possible"	Direct imperative: "Do X"
Absolutes for judgment calls	"ALWAYS", "NEVER", "MUST" applied to non-invariants (when to search, ask, iterate, retry)	Decision rules: "When X, do Y; otherwise Z". Reserve absolutes for true invariants — safety rules, required fields, hard constraints
Buried critical info	Important rules in middle	Surface prominently
Over-engineering	10 phases for a simple task	Match complexity to need

When Updating Prompts

High-signal changes only: Every change must address a real failure mode or materially improve clarity. Don't change for the sake of change.

Right-sized changes: Don't overcorrect. One edge case doesn't warrant restructuring.

Questions before changing:

Does this change address a real failure mode?
Am I adding complexity to solve a rare case?
Can this be said in fewer words?
Am I turning a principle into a rigid rule?

Over-engineering warning signs:

Prompt length doubled or tripled
Adding edge cases that won't happen
"Improving" clear language into verbose language
Adding examples for obvious behaviors

Memento Pattern (Multi-Phase Workflows Only)

For prompts involving accumulated findings across steps:

LLM Limitation	Pattern Response
Context rot (middle content lost)	Write findings to log after EACH step
Working memory is limited	Todo lists externalize tracked areas
Synthesis failure at scale	Read full log BEFORE final output
Recency bias	Refresh moves findings to context end

Key disciplines:

→log after each collection step (discipline, not capability)
Refresh: read full log before synthesis (restores context)
Acceptance criteria on each todo ("; done when X")

Prompt Structure Reference

Skills/Agents

---
name: kebab-case-name
description: 'What it does. When to use. Trigger terms.'
---

**User request**: $ARGUMENTS

{One-line mission - WHAT, not HOW}

{Empty input handling}

{Log file path if multi-phase}

## {Sections based on actual workflow needs}

{Goals and constraints per section}

## Key Principles

| Principle | Rule |
|-----------|------|
| {Discipline} | {Enforcement} |

## Gotchas

{Known failure modes Claude hits — specific, actionable, observed}

## Never Do

- {Anti-pattern}

System Instructions

## Role
{Identity and stance — who the model is and how it behaves}

## Goal
{User-visible outcome — what the run produces}

## Success criteria
{Anything that would cause dissatisfaction with the run:
 output correctness; validation passing (tests, lint, schema when available);
 time / iteration bounds; handling of non-success cases — retry, fallback, abstain, ask}

## Constraints
{MUST > SHOULD > PREFER priority — reserve MUST for true invariants per "Absolutes for judgment calls" anti-pattern}

## Output
{Format requirements if needed}

Skill Description Pattern

Descriptions drive auto-invocation. Pattern: What + When + Triggers

# Weak
description: 'Helps with prompts'

# Strong
description: 'Craft or update LLM prompts from first principles. Use when creating new prompts, updating existing ones, or reviewing prompt structure.'

Include trigger terms users say
Specify when to use
Under 1024 chars

Emotional Tone

Prompts shape the model's internal emotional state before generation begins. Research on transformer internals shows emotion concept representations that causally influence behavior — including sycophancy, reward hacking, and misalignment. These principles help calibrate the emotional context a prompt creates.

Principle	What It Means	Why
Keep arousal low	Avoid urgency language ("CRITICAL", "you MUST"), excessive praise ("you're amazing at this!"), and pressure framing.	High-arousal emotions causally drive sycophancy (positive arousal) or corner-cutting and misalignment (negative arousal).
Opening framing propagates	The emotional tone set in a prompt's opening persists into the model's response planning. A tense opening produces a tense response.	Emotional context from early tokens propagates through later processing layers, even when subsequent content is neutral.
Normalize failure in iterative prompts	For agentic or multi-step prompts, explicitly frame failure as acceptable: "if this approach doesn't work, try another."	Repeated failures build desperation that causally drives reward hacking and corner-cutting solutions.
Sycophancy-harshness tradeoff	Pushing toward warmth and positivity increases sycophancy. Pushing away from warmth increases bluntness and harshness. Aim for a "trusted advisor" tone — honest pushback delivered with care.	Positive-valence emotion representations causally increase agreement-seeking behavior; their absence produces unnecessary harshness.
Avoid unintended high-stakes framing	The model reads semantic intensity, not surface patterns. "This is critical to my career" or "failure is not an option" activates negative emotion representations even if intended as motivation.	Emotion representations respond to the meaning of situations — quantities, stakes, consequences — not to keywords.

Gotchas

Rewriting working language for style: Claude rewrites clear, working prompt text for stylistic preference. If existing language is unambiguous and effective, don't touch it.
Skipping context discovery when the task seems obvious: Claude jumps to writing/editing without probing. Even "simple" prompt tasks have hidden constraints — force discovery before producing output.
Over-engineering simple prompts: A 3-line prompt doesn't need 10 sections, a memento pattern, and a validation checklist. Match complexity to the task.
Converting principles into rigid rules: "Stop when converged" becomes "Max 5 iterations." Principles give flexibility; rigid rules create edge cases.
Adding examples for behaviors Claude already knows: Examples earn their place only when they demonstrate non-obvious or counter-intuitive behavior.

Validation Checklist

Before finalizing any prompt:

All ambiguities resolved through user questions
Domain context gathered (terms, conventions, constraints)
Goals stated, not steps prescribed
No arbitrary numbers (or justified if present)
Weak language replaced with direct imperatives
Critical rules surfaced prominently
Complexity matches the task
Each word earns its place
If multi-phase: memento pattern applied correctly

Related skills

More from doodledood/claude-code-plugins

Installs

Repository

doodledood/clau…-plugins

GitHub Stars

First Seen

Mar 1, 2026

Security Audits

Gen Agent Trust HubPass

SocketPass

SnykPass