Rethink Surveys
A survey is a conversation with a busy person. Design it like one.
This skill packages a framework for designing, critiquing, and operationalizing surveys that capture real intent — not satisficed checkbox approval. Built from the experience of designing the Proxymate muShanghai survey (which surfaced bugs in ~6 hours of testing that a category-checkbox flow would have hidden for weeks).
Tool affordances (read first)
The bundled rethink-survey-mcp-server exposes the skill's deterministic operations as MCP tools, its canonical content as MCP resources, and three framing prompts as slash-prompts. When available, prefer them over reasoning from memory.
- Use MCP tools when the user wants a specific deterministic result: `critique_survey` for lint, `get_template` for a use-case starter, `design_survey_session` for the staged wizard, `score_response` for a single response, `cluster_responses` for a batch.
- Reason from skill content when the user wants to understand: debate a principle, discuss trade-offs, explain why hypotheticals fail, or do anything not exposed as a tool (e.g. `/turn-into-app` codegen).
- Fetch resources for grounding instead of paraphrasing: `rethink://principles`, `rethink://principles/{id}`, `rethink://structure/4-part`, `rethink://question-library/{part}`, `rethink://use-case/{event|founder|gig}`, `rethink://scoring/rubrics`, `rethink://modality/decision-tree`, `rethink://anti-patterns`.
- Slash-prompts the user may invoke: `/jarrett-review` (Caroline Jarrett-style review), `/design-coaching` (Socratic design questions), `/mom-test-check` (Rob Fitzpatrick founder-discovery lint).
If the MCP server isn't connected, all of the below still works from this skill's reference files. For details on tool inputs/outputs, fallback rules, and prompt framing, read references/mcp-integration.md.
Three operating modes (commands)
- `/design-survey` — interactive design session: research goal → audience → hypotheses → questions → modality → output schema. Walks the user through 5–8 decisions, each with explicit trade-offs. Reuses templates where appropriate.
- `/critique-survey` — paste-in review: identify leading questions, double-barreled items, satisficing bait, missing behavioral anchors, and demographics-for-demographics'-sake noise. Returns a Jarrett-style critique with concrete rewrites.
- `/turn-into-app` — implementation handoff: takes a finalized question set and produces an app scaffold (TanStack Start / Next.js / pure HTML, plus Supabase or simple JSON storage). The user picks the stack; this command emits the code.
If the user's intent is unclear, ask: "Are you (a) starting fresh, (b) reviewing something you already wrote, or (c) ready to ship — meaning: turn an existing question set into an app?"
The seven design principles (the spine)
Every survey-design decision routes back to one of these. Memorize.
- Ask about real past behavior, not hypotheticals. "Last time X happened, what did you do?" beats "What would you do if X happened?" by an order of magnitude in signal quality. Hypotheticals invite imagined-self answers.
- Don't pre-signal the right answer. "Most people find X useful — would you?" is broken. Replace with neutral framing.
- Start broad and open-ended before narrowing. Open-ended question first → respondent's authentic frame; structured questions after → that frame doesn't get lost in a checkbox.
- Separate screening / diagnosis / segmentation. Screening = "are you the right respondent?" Diagnosis = "what's actually happening?" Segmentation = "what cohort do you fit into?" Mixing them leads to confusing branching and worse data.
- No double-barreled questions. "Was the food fast and cheap?" should be two questions. If you can't decompose without losing the meaning, reword.
- Behavioral anchors over Likert scales when anchors exist. "I'd lose ~30 minutes" is universal; "Somewhat annoying" is whatever the respondent thinks today.
- Honest length claim on the landing page. If you say 60 seconds, deliver 60 seconds. Lying about length is the single fastest way to nuke completion rate. Jarrett's rule: never lie about length.
For the why behind each principle (Tourangeau's 4-stage cognitive model, satisficing theory, etc.) read references/design-principles.md.
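Principle 5 can be partially automated. The sketch below is a crude lint pass for double-barreled questions, not the skill's actual `critique_survey` implementation; the conjunction heuristic is an assumption and will over-flag, so treat hits as review candidates rather than verdicts.

```typescript
// Crude double-barreled lint (principle 5): flag questions that join
// two predicates with "and"/"or" in a single ask. Heuristic only —
// expect false positives; a human reviews every flagged item.
function isDoubleBarreled(question: string): boolean {
  const joinsTwoPredicates = /\b(and|or)\b/i.test(question);
  const isQuestion = question.trim().endsWith("?");
  return isQuestion && joinsTwoPredicates;
}

isDoubleBarreled("Was the food fast and cheap?");  // flagged for decomposition
isDoubleBarreled("How long did you wait?");        // passes
```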
The 4-part hybrid structure
This is the structural skeleton every well-designed survey shares. Use it as a checklist:
| Part | Purpose | Example questions | Failure mode |
|---|---|---|---|
| 1. Discovery (open-ended) | Capture authentic frame before constraining it | "What's the one thing you're worried about?" / "Last time X, what did you do?" | Skipping → checkbox-only data with zero novelty |
| 2. Diagnostic (targeted, structured) | Make responses analyzable across users | Severity 1–5 with anchors / Frequency / Workaround text | Skipping → can't cluster or rank |
| 3. Intent & prioritization | Surface willingness, urgency, willingness-to-pay | Top-3 priorities / WTP buckets / "When would help most?" | Skipping → no demand signal, can't prioritize |
| 4. Segmentation & follow-up | Cohort the respondent + qualify for follow-up | Who-are-you / Group composition / Contact + split consent | Skipping → can't compare cohorts, no interview pipeline |
For each question, tag it with which part it serves. Questions that don't clearly belong to one part are usually candidates for cutting.
For battle-tested question patterns by part (with EN+ZH parallel text and trade-off notes), read references/question-library.md.
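The tagging rule above can be made mechanical. A minimal sketch, assuming a hypothetical question shape (these field names are illustrative, not the skill's canonical schema):

```typescript
// Each question carries exactly one 4-part tag; an untagged question
// is, per the rule above, a cut candidate.
type SurveyPart = "discovery" | "diagnostic" | "intent" | "segmentation";

interface TaggedQuestion {
  id: string;
  text: string;
  part?: SurveyPart; // undefined = doesn't clearly serve one part
}

function cutCandidates(questions: TaggedQuestion[]): TaggedQuestion[] {
  return questions.filter((q) => q.part === undefined);
}
```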
Modality decision tree
Match modality to context:
- Pure form (tap-to-select) — physical QR scan, on-venue, low-attention. Target ≤60s.
- Form with optional voice on key questions — moderate attention; the diagnostic question benefits from authentic phrasing. Target ≤90s.
- AI-conversational interviewer — pre-event email warm-list, deep research mode, willing respondents. AI probes with follow-ups based on prior answers. Target 3–5 minutes, ~10 substantive answers.
- Voice-note only — accessibility-first / language-flexible / low-literacy contexts. Single open prompt, transcribed server-side, post-processed for cluster signals.
Default rule: start with form + optional voice on the discovery question. Add AI-interviewer only if the research goal requires probing depth that a flat form can't deliver.
For the full UX/UI guide (with React/HTML scaffolds and storage schema recommendations), read references/multimodal-ux.md.
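The default rule above can be sketched as a selection function. The context fields and branch order here are assumptions for illustration; the authoritative tree lives in `rethink://modality/decision-tree`.

```typescript
type Modality = "form" | "form+voice" | "ai-interviewer" | "voice-only";

interface SurveyContext {
  channel: "qr-on-venue" | "email-warmlist";
  needsProbingDepth: boolean;  // research goal needs follow-up probing
  accessibilityFirst: boolean; // low-literacy / language-flexible audience
}

function pickModality(ctx: SurveyContext): Modality {
  if (ctx.accessibilityFirst) return "voice-only";
  if (ctx.channel === "email-warmlist" && ctx.needsProbingDepth) {
    return "ai-interviewer"; // only when a flat form can't deliver the depth
  }
  if (ctx.channel === "qr-on-venue" && !ctx.needsProbingDepth) {
    return "form"; // low-attention, target ≤60s
  }
  return "form+voice"; // default: form + optional voice on discovery
}
```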
What to ask AI conversationally vs. show as form fields
| Show as form field | Ask via AI |
|---|---|
| Multiple-choice with finite known options | Open-ended diagnosis with potential follow-up |
| Numeric or scale (with behavioral anchors) | Causal explanation: "Why did that happen?" |
| Demographic/segmentation | Workaround: "What did you try first?" |
| Consent checkboxes | Edge cases the form options don't cover |
AI is for the parts where you don't know what answers to expect. Form is for the parts where you do.
Scoring & tagging framework
After collection, every response should map to:
- `pain_unit` — the specific worry/problem captured in the discovery question, ideally with a cluster ID assigned by post-hoc LLM enrichment
- `severity` — 1–5 with behavioral anchors
- `willingness_to_pay` — bucket or numeric range, currency-localized for bilingual audiences
- `workaround_strength` — derived from whether the respondent has an existing fallback (text length + sentiment)
- `segment` — derived from screening/segmentation answers
- `interview_score` — composite for ranking interview-call candidates: `severity × specificity × wtp_weight × consent_research_call × novelty_bonus`
For the full intent-scoring math (composite formula, tag taxonomy, cluster assignment heuristics), read references/scoring-framework.md.
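The composite can be sketched directly from the formula named above. The value ranges are assumptions; the real weights and taxonomy live in `references/scoring-framework.md`.

```typescript
// interview_score = severity × specificity × wtp_weight
//                 × consent_research_call × novelty_bonus
interface ScoredResponse {
  severity: number;              // 1–5, behaviorally anchored
  specificity: number;           // assumed 0–1: how concrete the pain_unit is
  wtpWeight: number;             // assumed 0–1, derived from WTP bucket
  consentResearchCall: boolean;  // no consent → never rank for a call
  noveltyBonus: number;          // assumed ≥1: boost for out-of-cluster pain
}

function interviewScore(r: ScoredResponse): number {
  const consent = r.consentResearchCall ? 1 : 0;
  return r.severity * r.specificity * r.wtpWeight * consent * r.noveltyBonus;
}
```

Multiplicative composition means any zero factor (no consent, zero specificity) removes the respondent from the interview queue entirely, which matches the intent of a call-candidate ranking.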
Use-case templates
Three full templates with research goal, target respondent, hypotheses, full question set, branching, and output schema:
- Event organizers — pre-event attendee surveys, post-event NPS-with-teeth, vendor-feedback
- Startup founders — customer-discovery (Mom-Test-style), product-validation, churn diagnosis
- Gig-economy workers — supply-side onboarding, fairness/pay diagnosis, retention drivers
When the user names a use case that matches one of the three (or wants to fork from one), read references/use-cases.md.
Anti-patterns to flag aggressively
- "Rate your excitement 1–10" — pure satisficing bait. Always cut.
- Demographics for demographics' sake (age, gender, nationality without a routing reason) — increases drop-off, adds zero signal. Cut unless one of those answers actually changes downstream behavior.
- Single combined consent checkbox ("OK to contact me about anything") — forces respondents to opt in to too much. Split into specific consents per channel of follow-up.
- "Optional" research questions at the bottom — respondents infer "this is the throwaway part." Move research-grade questions to the middle, where engagement is highest.
- Self-routing landing pages ("Got 60s or 2min?") — pushes a design decision onto the respondent. Pick the right instrument for the channel and serve it directly.
Workflow patterns (high level)
When user says "I want to design a survey for X"
If design_survey_session MCP tool is available, prefer it: call with stage: start, then iterate add_question, then finalize. Otherwise walk through manually:
- Scope: what's the research goal (single sentence)? Don't proceed without this.
- Audience: who's the respondent? How will they encounter the survey?
- Length budget: 60s / 90s / 3min / longer? Be honest with yourself before being honest with them.
- Hypotheses: what 2–4 things are you trying to prove or disprove? Each becomes a section.
- Match to a template if available — call `get_template` if MCP is up, else read `references/use-cases.md`.
- Walk through the 4-part hybrid structure, picking 1–3 questions per part.
- Modality: form / voice-mixed / AI-interviewer.
- Output schema: what columns/JSONB shape lands in the DB?
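The final step's output schema can be sketched as a row type. Column names here are hypothetical illustrations, not the reference implementation's schema:

```typescript
// One submitted response = one row; structured answers land in a
// JSONB-style catch-all, scored fields get their own columns.
interface ResponseRow {
  id: string;
  createdAt: string;                // ISO timestamp
  painUnit: string;                 // verbatim discovery answer
  severity: number | null;          // 1–5, behaviorally anchored
  wtpBucket: string | null;
  segment: string | null;
  answers: Record<string, unknown>; // JSONB catch-all for structured parts
  consentFollowup: boolean;         // split consents, per anti-pattern list
  consentResearchCall: boolean;
}

function emptyRow(id: string): ResponseRow {
  return {
    id,
    createdAt: new Date().toISOString(),
    painUnit: "",
    severity: null,
    wtpBucket: null,
    segment: null,
    answers: {},
    consentFollowup: false,
    consentResearchCall: false,
  };
}
```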
When user says "Critique this survey"
If critique_survey MCP tool is available, call it with the pasted survey and surface the typed CritiqueReport. Fetch rethink://principles/{id} if a flagged item needs explanation. Otherwise lint manually:
- Read the survey question by question.
- Check each against the seven principles. Note violations.
- Check the 4-part structure: which parts are missing or duplicated?
- Check the length claim against actual estimated time.
- Return: a punch-list of issues + concrete rewrites for the worst 3–5 + a verdict on the overall instrument.
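The manual lint above mirrors the tool's typed output. A minimal sketch of that shape, assuming field names for illustration (the MCP tool's actual `CritiqueReport` may differ):

```typescript
interface CritiqueIssue {
  questionIndex: number;
  principle: 1 | 2 | 3 | 4 | 5 | 6 | 7; // which of the seven it violates
  problem: string;
  rewrite?: string; // concrete fix, supplied for the worst 3–5
}

interface CritiqueReport {
  issues: CritiqueIssue[];
  missingParts: string[]; // gaps in the 4-part structure
  claimedSeconds: number; // what the landing page promises
  estimatedSeconds: number;
  verdict: "ship" | "revise" | "rethink";
}

// Principle 7 check: the length claim must hold.
function honestLength(r: CritiqueReport): boolean {
  return r.estimatedSeconds <= r.claimedSeconds;
}
```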
When user says "Score / cluster these responses"
- Single response → call `score_response`. Batch → call `cluster_responses`. Both return rubric prompts + JSON Schema; the host LLM (you) then executes the extraction.
- If MCP is unavailable, work from `references/scoring-framework.md`.
When user says "Turn this into an app"
No MCP tool for this — codegen stays host-side.
- Confirm the question set is final. If not → bounce back to `/design-survey`.
- Pick the stack: TanStack Start (we have a reference impl), Next.js, or static HTML. Default: TanStack Start if the user has no preference and wants voice support.
- Pick the storage: Supabase (default), or simple JSON-on-disk for local prototypes.
- Emit the scaffold per `references/multimodal-ux.md`.
Reference files (load on demand)
- `references/mcp-integration.md` — full MCP tool/resource/prompt reference: when to call each tool, expected inputs/outputs, fallback rules. Read when first deciding tool-vs-manual, or when a tool call needs disambiguation.
- `references/design-principles.md` — Jarrett, Dillman, Tourangeau primer + satisficing theory. Read when designing from scratch and the user wants the why.
- `references/question-library.md` — battle-tested question patterns by part, with EN+ZH parallel text. Read when proposing specific questions.
- `references/use-cases.md` — three full templates: event organizers, startup founders, gig-economy workers. Read when the user names a matching use case.
- `references/multimodal-ux.md` — UX/UI patterns + React scaffolds. Read when running `/turn-into-app`.
- `references/scoring-framework.md` — full intent-scoring math + cluster taxonomy. Read when ranking or clustering responses.
What this skill is NOT for
- Statistical analysis of response data — that's a downstream job, use a different tool.
- Survey-platform recommendations (Typeform vs. Tally vs. SurveyMonkey) — this skill is opinionated about the survey itself, not the SaaS that hosts it. We assume custom code.
- General market research strategy — survey design is one tool, not the strategy.