# Paper-to-Skill Extractor
An interactive skill for extracting research paradigms and methodological techniques from cognitive science and neuroscience papers. The output is a well-structured skill conforming to this project's SKILL.md format.
Focus: Strict extraction of reproducible methods — experimental designs, data acquisition parameters, processing pipelines, analysis procedures, and stimulus specifications. This is NOT about summarizing a paper's novelty or theoretical contributions.
## Trigger Conditions
Activate this skill when the user:
- Provides a paper (PDF path, file, or pasted text) and asks to extract research skills/methods
- Uses phrases like "extract skills from this paper", "turn this paper into a skill", "what methods can I reuse from this paper"
## Research Planning Protocol
Before extracting skills from a paper, you MUST:
- **Clarify the extraction goal** — What type of methodological knowledge is the user looking for?
- **Justify the source** — Is this paper a suitable source (empirical, methods, review)? What type-specific extraction strategy applies?
- **Declare expected outputs** — What kind of skill(s) do you expect to generate (paradigm design, analysis pipeline, modeling)?
- **Note limitations** — Are there missing parameters, ambiguous descriptions, or domain gaps in this paper?
- **Present the extraction plan** to the user and WAIT for confirmation before proceeding.
For detailed methodology guidance, see the `research-literacy` skill.
## ⚠️ Verification Notice
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
## Interactive Workflow
### Phase 1: Paper Ingestion
- Read the paper provided by the user (PDF path, file content, or pasted sections).
**PDF Reading Guidance** — Claude Code's Read tool natively supports PDF files. Use the following strategy:

- **Short PDFs (up to ~10 pages):** Read the entire file in a single call with no `pages` parameter.
- **Long PDFs (more than 10 pages):** Read in chunks using the `pages` parameter (maximum 20 pages per request). Example sequence: `pages: "1-10"`, then `pages: "11-20"`, and so on.
- **Recommended reading order:**
  1. Read pages 1-2 first (abstract + introduction) to identify the paper type and decide whether full extraction is warranted.
  2. Then read the Methods section in detail (locate the relevant page range from the table of contents or section headers).
  3. Read Results and Discussion selectively for reported parameter values not stated in Methods.
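The chunking strategy above can be sketched as a small helper — hypothetical, not part of any tool API:

```python
def page_ranges(total_pages: int, chunk: int = 10, max_chunk: int = 20) -> list[str]:
    """Build page-range strings for chunked PDF reads, e.g. "1-10", "11-20"."""
    if chunk > max_chunk:
        raise ValueError(f"chunk size {chunk} exceeds the {max_chunk}-page limit")
    return [
        f"{start}-{min(start + chunk - 1, total_pages)}"
        for start in range(1, total_pages + 1, chunk)
    ]

# A 25-page paper read in 10-page chunks:
# page_ranges(25) → ["1-10", "11-20", "21-25"]
```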
- Identify the paper type — this determines the extraction strategy:
  - **Experimental paper** — contains original experiments with participants
  - **Methods paper** — introduces or validates an analysis technique/pipeline
  - **Computational modeling paper** — builds or tests formal models of cognition
  - **Review/theoretical paper** — synthesizes literature or proposes theoretical frameworks
- Confirm the paper type with the user before proceeding.
See `references/extraction-guide.md` for detailed extraction strategies per paper type.
### Phase 2: Content Scanning and Candidate Identification
Scan the paper and identify all extractable methodological content organized into these categories:
| Category | What to Look For |
|---|---|
| Experimental Design | Paradigm name, trial structure, timing parameters, condition setup, counterbalancing scheme, block design |
| Data Acquisition | Sampling rate, electrode montage, imaging parameters, eye-tracking settings, physiological recording setup |
| Data Processing | Preprocessing steps with parameters, artifact handling methods, data cleaning criteria, epoching parameters |
| Analysis Methods | Statistical models, multiple comparison corrections, effect size calculations, visualization methods, decoding approaches |
| Stimulus Materials | Construction rules, control variables, norming standards, presentation parameters, response mappings |
Present candidates to the user in the following format:
    I identified the following extractable methods from this paper:

    ## Experimental Design
    - [1] Paradigm: <name> — <brief description>
    - [2] Trial structure: <summary of trial flow and timing>

    ## Data Acquisition
    - [3] <Modality> recording setup: <key parameters>

    ## Data Processing
    - [4] Preprocessing pipeline: <step summary>
    - [5] Artifact rejection: <method and criteria>

    ## Analysis Methods
    - [6] <Analysis name>: <brief description>
    - [7] <Analysis name>: <brief description>

    ## Stimulus Materials
    - [8] <Material type>: <construction approach>

    Which items would you like me to extract into skills?
    (Enter numbers, ranges like 1-4, or "all")
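The selection reply can be parsed mechanically; a sketch (the function name and error handling are illustrative only):

```python
def parse_selection(reply: str, n_candidates: int) -> list[int]:
    """Parse a user selection like "1, 3, 4-6" or "all" into candidate numbers."""
    reply = reply.strip().lower()
    if reply == "all":
        return list(range(1, n_candidates + 1))
    chosen: set[int] = set()
    for token in reply.replace(",", " ").split():
        if "-" in token:  # a range like "4-6"
            lo, hi = (int(part) for part in token.split("-"))
            chosen.update(range(lo, hi + 1))
        else:
            chosen.add(int(token))
    # Drop out-of-range numbers rather than erroring on a typo
    return sorted(i for i in chosen if 1 <= i <= n_candidates)

# parse_selection("1, 3, 4-6", 8) → [1, 3, 4, 5, 6]
```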
### Phase 2.5: Suitability Gate
Before presenting candidates, apply this strict suitability filter to each one:
**SUITABLE** — include if the candidate:
| Criterion | Examples |
|---|---|
| Describes an experimental paradigm or design with specifics | Trial structure, timing parameters, condition definitions, counterbalancing |
| Describes a data processing pipeline with parameters | Preprocessing steps, filter cutoffs, software settings |
| Describes an analysis method with concrete steps | Statistical model specification, time-frequency decomposition, classification pipeline |
| Contains specific numerical parameters or settings | Thresholds, epoch windows, stimulus dimensions, sample sizes |
| Describes stimulus construction norms | Norming procedures, controlled variables, material selection criteria |
| Describes a computational model with equations/parameters | Model fitting procedure, parameter priors, model comparison strategy |
| Provides actionable methodological recommendations with specific values | "Use minimum 30 trials per condition", "Set high-pass filter no lower than 0.1 Hz" |
**NOT SUITABLE** — filter out if the candidate:
| Criterion | Examples |
|---|---|
| Is narrative or historical overview | "The study of attention began with William James..." |
| Is a definition without actionable parameters | "Working memory is defined as..." |
| Is theoretical debate without methods | "The modularity hypothesis predicts..." |
| Is motivation or background only | "Previous studies have shown that..." leading to no method |
| Contains only results without methodological detail | "The ANOVA revealed a significant main effect..." |
**Decision rule:** "Does this candidate contain enough specific, actionable detail that a researcher could REPRODUCE a method, pipeline, or paradigm from it?" If YES → [SUITABLE]. If NO or UNCERTAIN → [FILTERED — reason].
Mark each candidate when presenting to the user. Filtered candidates are shown but de-prioritized — the user can override any filter decision.
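If a pre-screening pass is useful, the decision rule can be roughly approximated in code — purely a heuristic sketch; the real gate is a semantic judgment:

```python
import re

def looks_reproducible(candidate: str) -> bool:
    """Crude pre-screen for the suitability gate: text with concrete
    number-plus-unit values or explicit procedural language is more likely
    reproducible. This only surfaces obvious cases; the final call is manual."""
    has_quantities = bool(
        re.search(r"\d+(\.\d+)?\s*(ms|hz|khz|mm|trials?|%)", candidate, re.I)
    )
    has_steps = bool(re.search(r"\b(then|followed by|step \d)\b", candidate, re.I))
    return has_quantities or has_steps

# looks_reproducible("Bandpass filtered 0.1-30 Hz, epochs of -200 to 800 ms") → True
# looks_reproducible("The study of attention began with William James") → False
```

Candidates the heuristic misses still go through the full decision rule; it should never be the sole basis for filtering.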
### Phase 3: User Selection and Confirmation
- Receive the user's selection of which items to extract.
- For each selected item, perform deep extraction (see extraction depth requirements below).
- Present the extracted detail for user review before generating the final skill file.
### Phase 4: Skill Generation
- Generate the skill file(s) using the standard template (see `references/skill-template.md`).
- Each generated skill must:
  - Have valid YAML frontmatter with `name`, `description`, and `papers` fields
  - Include all numerical parameters with their citations from the source paper
  - Stay under 500 lines; use the `references/` subdirectory for overflow
  - Pass the domain-knowledge litmus test: "Would a competent programmer who has never taken a cognitive science course get this wrong?"
- Present the generated skill to the user and ask for confirmation before saving.
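For illustration, a minimal frontmatter satisfying those requirements might look like this (all field values are placeholders, not drawn from a real paper):

```yaml
---
name: MMN Oddball Paradigm
description: One-sentence summary of the extracted method.
papers:
  - "Author, Year"
dependencies:
  required: [research-literacy]
---
```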
### Phase 5: Self-Verification (Hallucination Check)
After generating the skill but before saving, perform a systematic verification of every numerical parameter and specific factual claim against the source paper.
Verification procedure — for each numerical value or specific claim in the generated skill:
- Locate in source — Find the corresponding statement in the original paper. Use the source location recorded during extraction.
- Verify value — Confirm exact numerical match, correct units, and complete context (e.g., "0.1-30 Hz bandpass" must not be truncated to "0.1 Hz").
- Classify any issues found:
| Issue Type | Description | Severity |
|---|---|---|
| `not_found` | Claim appears in the skill but cannot be found in the source — likely hallucinated | High |
| `value_mismatch` | Value exists in source but differs (e.g., skill says "250 ms", source says "200 ms") | High |
| `unit_error` | Numerical value matches but units are wrong or missing | High |
| `context_distortion` | Value is technically present but used in misleading context | Medium |
| `location_wrong` | Value is correct but the claimed source location is wrong | Low |
| `incomplete` | Skill presents a partial version of a parameter that has important qualifiers | Low |
**Reporting** — Present the verification results to the user:

    Self-Verification Results:
    - Claims checked: N
    - Verified: M
    - Issues found: K
      - [HIGH] <claim> — <issue type>: <details>
      - [LOW] <claim> — <issue type>: <details>
Rules:
- High-severity issues (`not_found`, `value_mismatch`, `unit_error`) must be corrected before saving.
- Medium/low-severity issues are flagged but the skill can be saved with them annotated.
- Do NOT flag reasonable paraphrasing, organizational differences, or standard terminology substitutions.
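The save gate these rules describe can be sketched as follows (the issue-type strings match the classification table; the dict shape is an assumption):

```python
# Issue types that block saving until corrected
HIGH_SEVERITY = {"not_found", "value_mismatch", "unit_error"}

def can_save(issues: list[dict]) -> bool:
    """A skill may be saved only when no high-severity verification
    issues remain. Each issue is a dict like
    {"type": "value_mismatch", "claim": "sampling rate 500 Hz"}."""
    return not any(issue["type"] in HIGH_SEVERITY for issue in issues)

# can_save([{"type": "location_wrong", "claim": "epoch window"}]) → True
# can_save([{"type": "not_found", "claim": "filter order"}]) → False
```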
## Extraction Depth Requirements
For every extracted item, the following cross-cutting rules apply to ALL categories:
### Cross-Cutting Extraction Rules
- **Preserve exact numbers** — Never round. If the paper says "513 ms", write "513 ms", not "~500 ms".
- **Track source location** — For every extracted numerical value, record where it appears in the paper: "Section X.Y, paragraph N", "Table N", "Figure N caption", or "Supplementary Materials, page N". This enables downstream verification.
- **Flag missing information** — If a standard parameter for this method type is not reported in the paper, explicitly note its absence (e.g., "Filter order: not reported").
- **Capture rationale** — When the authors explain WHY they chose a parameter value, include that justification alongside the value.
- **Note deviations from convention** — When authors explicitly deviate from field conventions, capture both what they did and their stated reason.
These rules apply to every category below. The parameter tables in generated skills must include a Source Location column (see `references/skill-template.md`).
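One way to keep values paired with their source locations during extraction — a hypothetical record type, not a mandated format:

```python
from dataclasses import dataclass

@dataclass
class ExtractedParameter:
    name: str              # e.g. "high-pass cutoff"
    value: str             # exact reported value, never rounded: "513 ms", not "~500 ms"
    source: str            # e.g. "Section 2.3, paragraph 2" or "Table 1"
    rationale: str = ""    # authors' stated justification, if any
    reported: bool = True  # False when a standard parameter is absent from the paper

# A standard parameter the paper omits is recorded explicitly, not silently skipped:
missing = ExtractedParameter("filter order", "not reported", "Methods (absent)", reported=False)
```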
### Experimental Design Parameters
- Paradigm name and classification (e.g., "oddball paradigm", "visual world paradigm")
- Number of conditions and their operational definitions
- Trial sequence: fixation → stimulus → ISI → response window (with exact ms values)
- Number of trials per condition and total
- Block structure and rest intervals
- Counterbalancing method (Latin square, full counterbalancing, pseudo-randomization constraints)
- Practice trial specifications
- Participant exclusion criteria applied at the design level
### Data Acquisition Parameters
- EEG: Sampling rate (Hz), electrode count and montage system, reference electrode, ground electrode, impedance threshold, amplifier model, filter settings during recording
- fMRI: TR, TE, voxel size, slice count, slice order, field strength, coil type, number of volumes, dummy scans discarded
- Eye-tracking: Sampling rate, calibration procedure (5-point, 9-point), fixation definition criteria, saccade velocity threshold
- Behavioral: Response device, response mapping, timeout duration, feedback presence/absence
- MEG: Sampling rate, sensor type and count, head position indicator settings, noise reduction method
### Data Processing Pipeline
- Software used (with version numbers)
- Step-by-step sequence with order preserved
- Filter parameters: type (FIR/IIR), cutoff frequencies, order/transition bandwidth, causal vs. zero-phase
- Re-referencing scheme
- Epoching: time window relative to event, baseline correction window
- Artifact rejection: method (threshold, ICA, regression), specific thresholds and criteria
- Trial/participant exclusion rates reported
- Interpolation method for bad channels
### Analysis Method Details
- Statistical test name and implementation
- Model specification (for regression/mixed models: fixed effects, random effects, link function)
- Multiple comparison correction: method, parameters (e.g., cluster-forming threshold, number of permutations)
- Region of interest definitions (coordinates, anatomical labels, time windows)
- Effect size measure used and interpretation benchmarks
- Visualization methods with axis specifications
### Stimulus Material Specifications
- Material type (words, images, sounds, videos)
- Total number of stimuli and per-condition counts
- Controlled variables and matching criteria (frequency, length, luminance, valence)
- Norming source and database (e.g., "SUBTLEX-US word frequency", "IAPS valence ratings")
- Presentation parameters: duration, size/visual angle, position, contrast
- Randomization constraints applied to stimulus ordering
### Computational Modeling Parameters
- Model name and class (e.g., "drift-diffusion model", "Bayesian ideal observer", "recurrent neural network")
- Model equations and architecture: All equations with variable definitions, relationship between equations, boundary/initial conditions
- Free vs. fixed parameters: List each with cognitive interpretation and role
- Parameter constraints and priors:
- Constraint bounds (lower, upper) for each free parameter
- Prior distribution family and hyperparameters (if Bayesian), with justification
- Starting values and number of starting points (if frequentist optimization)
- Fitting methods:
- Objective function (maximum likelihood, least squares, Bayesian posterior)
- Optimization algorithm and implementation (software, package, version)
- For MCMC: number of chains, samples per chain, burn-in period, thinning, convergence diagnostic (R-hat threshold)
- For MLE: convergence criteria, number of random restarts
- Data summary statistics used for fitting (if not raw trial data)
- Model comparison: Comparison metric (AIC, BIC, WAIC, Bayes factor), how group-level comparison was performed, model recovery/confusion matrix results
- Simulation procedures: Parameter settings used, number of simulated datasets, random seed handling, what predictions were generated
### Methodological Recommendations (for Reviews/Textbooks)
When extracting from review papers, meta-analyses, or textbook chapters, capture:
- Specific parameter recommendations with justification and evidence strength
- Recommended analysis pipelines with step-by-step parameter values
- Decision trees or flowcharts for method selection (e.g., "if X, use method A; if Y, use method B")
- Meta-analytic effect sizes with confidence intervals and moderator results
- Sample size recommendations based on reported effect sizes and power analyses
- Common methodological pitfalls identified across studies, with concrete examples
## Quality Checks Before Output
Before presenting the final skill, verify both structural compliance and content quality.
### Structural Compliance Checklist
Every generated skill must pass these checks before saving:
- **File name:** The core file is named exactly `SKILL.md` (uppercase) — not `skill.md`, `Skill.md`, or any other variant
- **Directory name:** Uses kebab-case (lowercase, hyphen-separated) — e.g., `mmn-oddball-paradigm/`, not `MMN_Oddball_Paradigm/`
- **YAML frontmatter:** Contains at minimum `name` (human-readable) and `description` (one-sentence summary) fields
- **Papers field:** Frontmatter includes a `papers` field listing the source paper(s) in "Author, Year" format
- **Dependencies field:** Frontmatter includes `dependencies.required: [research-literacy]` (all domain skills require this)
- **Research Planning Protocol:** A customized version of the standard preamble is included after the "When to Use" section and before the first domain-specific logic section (see the `research-literacy` skill for the template)
- **Line count:** SKILL.md is under 500 lines; overflow content is placed in the `references/` subdirectory
- **References directory:** If supplementary files exist, they live in `references/` and are explicitly referenced from SKILL.md
- **Encoding:** UTF-8, LF line endings, 2-space indentation for YAML
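Several of these checks are mechanical and could be automated; a sketch using only the standard library (the frontmatter parsing is deliberately naive, and the function name is illustrative):

```python
import re
from pathlib import Path

def check_skill_dir(skill_dir: Path) -> list[str]:
    """Return structural-compliance problems (empty list = pass). Covers
    only the mechanically checkable rules; semantic checks remain manual."""
    problems: list[str] = []
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", skill_dir.name):
        problems.append(f"directory {skill_dir.name!r} is not kebab-case")
    core = skill_dir / "SKILL.md"
    if not core.exists():
        problems.append("core file SKILL.md (exact uppercase name) is missing")
        return problems
    text = core.read_text(encoding="utf-8")
    if len(text.splitlines()) >= 500:
        problems.append("SKILL.md is 500+ lines; move overflow to references/")
    # Naive frontmatter scan: the text between the first pair of --- markers
    frontmatter = text.split("---")[1] if text.startswith("---") else ""
    for field in ("name:", "description:", "papers:"):
        if field not in frontmatter:
            problems.append(f"frontmatter is missing the {field[:-1]!r} field")
    return problems
```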
### Content Quality Checklist
- **Completeness** — Every numerical parameter mentioned in the paper's methods section is captured
- **Citation accuracy** — All values cite the source paper (Author, Year) and page/table number where possible
- **Reproducibility** — Another researcher could implement this method from the skill alone, without reading the original paper
- **Domain specificity** — Every item passes the litmus test: "Would a competent programmer who has never taken a cognitive science course get this wrong?"
- **Parameter precision** — No rounding or approximation of reported values; use exact figures from the paper
- **Source traceability** — Every numerical parameter includes a source location (Section/Table/Figure reference)
### Required Structured Sections in Generated Skills
Every generated skill must include these sections (may be empty if no items apply, but must be explicitly checked):
- `## Missing Information` — List standard parameters for this method type that the paper does not report. Format: "- [Parameter name]: Not reported. Standard value from [field/reference] is [value]." This section helps users know what they must determine independently.
- `## Deviations from Convention` — List any methodological choices that deviate from field conventions, with the authors' stated rationale. Format: "- [Choice]: Authors used [X] instead of conventional [Y] because [reason]." This section alerts users to non-standard decisions.
## Handling Ambiguity
When the paper is unclear or omits details:
- Missing parameters: Flag explicitly — "The paper does not report [X]. This must be determined empirically or sourced from [suggested reference]."
- Ambiguous descriptions: Present both plausible interpretations and ask the user to select one.
- Non-standard methods: Note deviations from field conventions and flag whether the deviation is intentional (per authors' justification) or potentially an error.
- Supplementary materials: Ask the user if supplementary materials are available, as critical method details are often reported there.
## Multi-Skill Extraction
When a paper contains multiple independent methods worth extracting:
- Generate separate skills for each method that can stand alone (e.g., a paradigm skill and an analysis skill from the same paper).
- Cross-reference between skills using relative paths when methods are interdependent.
- Each skill must be independently usable — no skill should require reading another skill to function.
## Batch Extraction Mode
When the user provides multiple PDFs or a directory of papers, apply the following workflow:
### Triggering Batch Mode
Batch mode activates when the user:
- Provides two or more PDF paths in a single message
- Points to a directory containing multiple papers
- Uses phrases like "extract skills from all these papers" or "process this folder"
### Batch Processing Steps
- **Inventory the inputs** — List all papers found (file names + page counts if determinable) and present the list to the user for confirmation before reading anything.
- **Process each paper sequentially** — Run each paper through the full workflow (Ingestion → Scanning → Selection → Generation → Self-Verification). Apply the PDF reading strategy from Phase 1 to every paper.
- **Present candidates grouped by paper** — After scanning all papers, show all extractable candidates together, clearly grouped under each paper's title:
    ## Paper 1: <Title / filename>
    - [1] Paradigm: ...
    - [2] Analysis: ...

    ## Paper 2: <Title / filename>
    - [3] Paradigm: ...
    - [4] Data Acquisition: ...

    Which items would you like to extract? (Enter numbers, ranges, "all", or "all from paper 1")
- **Allow cross-paper skill merging** — If two or more papers describe the same or highly overlapping methods (e.g., both use the same EEG preprocessing pipeline with the same parameters), flag the overlap and offer to merge them into a single skill that cites all source papers. Only merge when the core parameters and decision logic are genuinely shared; keep skills separate when parameter choices differ.
- **Generate skills independently** — Each generated skill must be fully self-contained. No skill may depend on another skill generated from a different paper in the same batch. Cross-reference between skills using relative paths only for closely related methods from the same paper (as in Multi-Skill Extraction above).
### Batch Quality Checks
Before finalizing batch output, verify:
- Every skill cites its specific source paper(s), not just the batch as a whole.
- Merged skills list all contributing papers in the `papers` frontmatter field.
- Skill directory names remain unique across the batch; if two papers generate a similar skill, append a disambiguating suffix (e.g., `mmn-oddball-paradigm-smith2019` vs. `mmn-oddball-paradigm-jones2021`).
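The suffix rule can be sketched as a helper — illustrative only; it assumes an author-year tag is available for each paper:

```python
from collections import Counter

def unique_names(skills: list[tuple[str, str]]) -> list[str]:
    """Given (base_name, author_year) pairs, append the author-year suffix
    to every member of a colliding group, per the batch naming rule above."""
    counts = Counter(base for base, _ in skills)
    return [f"{base}-{tag}" if counts[base] > 1 else base for base, tag in skills]

# unique_names([("mmn-oddball-paradigm", "smith2019"),
#               ("mmn-oddball-paradigm", "jones2021"),
#               ("fmri-preprocessing-pipeline", "lee2020")])
# → ["mmn-oddball-paradigm-smith2019", "mmn-oddball-paradigm-jones2021",
#    "fmri-preprocessing-pipeline"]
```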
## Output Structure
Generated skills follow the directory convention:
    skills/<generated-skill-name>/
      SKILL.md                # Core skill content (<500 lines)
      references/
        parameter-table.yaml  # Full parameter values (if needed)
        pipeline-steps.md     # Extended pipeline details (if needed)
The `<generated-skill-name>` uses kebab-case and is descriptive of the specific method, not the paper. For example:

- A paper on ERP mismatch negativity → `skills/mmn-oddball-paradigm/`
- A paper on fMRI preprocessing → `skills/fmri-preprocessing-pipeline/`
- A paper on drift-diffusion modeling → `skills/drift-diffusion-fitting/`
## Example Interaction Summary
**User:** Extract skills from this paper. [provides PDF path]

**Agent:** [Reads paper]
This is an experimental paper (EEG study on language processing).
Does that match your understanding?

**User:** Yes.

**Agent:** I identified the following extractable methods:
[Lists candidates by category with numbers]
Which items would you like me to extract?

**User:** 1, 3, 4-6

**Agent:** Here is the detailed extraction for your selected items:
[Shows extracted parameters for review]
Does this look correct? Any corrections needed?

**User:** Looks good, but the sampling rate was 512 Hz not 500 Hz.

**Agent:** Corrected. Here is the generated skill:
[Shows SKILL.md content]
Shall I save this to skills/<name>/SKILL.md?

**User:** Yes.

**Agent:** [Saves file]
Skill saved to skills/<name>/SKILL.md.