skills/kaichop/phenoskill/phenosnap-phenotype-extractor

phenosnap-phenotype-extractor

SKILL.md

When to use

Use this skill when the user provides their own:

  • clinical phenotypes / symptoms / diagnoses (free text, bullet lists, clinical note-like text), and/or
  • drugs/medications (names, dosages, frequencies).

Examples that should trigger:

  • “Symptoms: ataxia, seizures, developmental delay. Meds: levetiracetam 500 mg BID.”
  • “I’m taking metformin 500mg daily and have fatigue, polyuria, blurry vision.”

When NOT to use

Do not use this skill when:

  • The user asks general questions (e.g., “What is HPO?”, “What is a phenotype?”, “What is GLP-1?”).
  • The user provides text that is not personal clinical information (news articles, academic paragraphs, code, etc.).
  • The user asks you to interpret someone else’s private clinical record (PHI) without clear permission.

Safety & privacy

  • Treat user input as potentially sensitive clinical information.
  • Do not upload user text or extracted results anywhere (this skill is local-only).
  • Before writing any input to disk, redact obvious identifiers:
    • emails, phone numbers, street addresses
    • MRN-like long numeric identifiers (e.g., 8+ digits)
    • names when clearly presented as “Name: …”
  • If the message appears to include highly identifying PHI (e.g., name + DOB + address, or name + MRN), pause and ask for confirmation to proceed, recommending the user remove identifiers first.

Requirements / setup

  • Python: python3 available on PATH.
  • Network access (only for initial PhenoSnap download if missing).
  • HPO OBO file:
    • Default expected path: {baseDir}/resources/hp.obo
    • Override path via environment variable: HPO_OBO_PATH
    • This skill does not auto-download hp.obo. You must supply it.

Recommended (best practice):

  • Use a virtual environment (venv/conda) before running this skill, because it may install Python packages via pip.

Inputs & outputs

  • Input text file (redacted):
    • {baseDir}/artifacts/phenosnap_inputs/input_<YYYYMMDD_HHMMSS>.txt
  • Output JSON file (timestamped):
    • {baseDir}/artifacts/phenosnap_outputs/phenotypes_<YYYYMMDD_HHMMSS>.json
  • Third-party download cache:
    • {baseDir}/third_party/phenosnap_main.zip
    • {baseDir}/third_party/get-pip.py

Detection heuristic (activation check)

Trigger if the user message contains any of:

  • phenotype cues: symptom(s), phenotype(s), Dx, diagnosis, PMH, Hx, history of, or a symptom-like list
  • medication cues: meds, medications, taking, prescribed, plus patterns like:
    • dosages: \b\d+(\.\d+)?\s?(mg|mcg|g|ml|units)\b
    • frequencies: qd, q.d., bid, b.i.d., tid, t.i.d., qhs, qAM, qPM, daily, weekly

Do not trigger for purely informational questions without user-provided phenotype/medication content.


Procedure

0) Create required directories

Create if missing:

  • {baseDir}/PhenoSnap/
  • {baseDir}/artifacts/phenosnap_inputs/
  • {baseDir}/artifacts/phenosnap_outputs/
  • {baseDir}/resources/
  • {baseDir}/third_party/

1) Confirm / redact sensitive identifiers

  1. Scan the user message for identifiers (email/phone/address/MRN/name fields).
  2. If strongly identifying PHI is present (name + DOB/address/MRN):
    • Ask the user to confirm proceeding and recommend removing identifiers.
  3. Produce a redacted version of the user text:
    • Replace emails with [REDACTED_EMAIL]
    • Replace phone numbers with [REDACTED_PHONE]
    • Replace long numeric IDs with [REDACTED_ID]
    • Replace address-like patterns with [REDACTED_ADDRESS]
    • Replace explicit “Name: …” fields with Name: [REDACTED_NAME]

2) Ensure PhenoSnap exists locally

Target location: {baseDir}/PhenoSnap/

2A) If {baseDir}/PhenoSnap/ exists and contains extract_phenotypes.py

Proceed to dependency self-test.

2B) If {baseDir}/PhenoSnap/ does NOT exist (or missing extract_phenotypes.py)

Prefer git; fallback to zip.

If git is available

Run:

  • git clone https://github.com/WGLab/PhenoSnap.git "{baseDir}/PhenoSnap"

If git is NOT available

Download zip:

  • URL: https://github.com/WGLab/PhenoSnap/archive/refs/heads/main.zip
  • Destination: {baseDir}/third_party/phenosnap_main.zip

Download method (pick first available):

  • If curl exists:
    • curl -L "https://github.com/WGLab/PhenoSnap/archive/refs/heads/main.zip" -o "{baseDir}/third_party/phenosnap_main.zip"
  • Else on Windows PowerShell:
    • Invoke-WebRequest -Uri "https://github.com/WGLab/PhenoSnap/archive/refs/heads/main.zip" -OutFile "{baseDir}/third_party/phenosnap_main.zip"

Unzip (choose by OS/tools):

  • Windows PowerShell:
    • Expand-Archive -Path "{baseDir}/third_party/phenosnap_main.zip" -DestinationPath "{baseDir}/third_party/phenosnap_unzip" -Force
  • macOS/Linux with unzip:
    • unzip -o "{baseDir}/third_party/phenosnap_main.zip" -d "{baseDir}/third_party/phenosnap_unzip"

If neither unzip nor Expand-Archive is available, use Python:

  • python3 -c "import zipfile; z=zipfile.ZipFile(r'{baseDir}/third_party/phenosnap_main.zip'); z.extractall(r'{baseDir}/third_party/phenosnap_unzip')"

Then rename/move the extracted folder to {baseDir}/PhenoSnap/:

  • The extracted folder is typically {baseDir}/third_party/phenosnap_unzip/PhenoSnap-main
  • Move/rename to {baseDir}/PhenoSnap/

Final check:

  • Verify {baseDir}/PhenoSnap/extract_phenotypes.py exists.
  • If not found, stop and report the directory listing of {baseDir}/PhenoSnap/ and {baseDir}/third_party/phenosnap_unzip/.

3) Dependency self-test and auto-install

Run from inside {baseDir}/PhenoSnap/.

3A) Smoke test importability

Run:

  • python3 -c "import importlib.util; spec=importlib.util.spec_from_file_location('extract_phenotypes','extract_phenotypes.py'); m=importlib.util.module_from_spec(spec); spec.loader.exec_module(m); print('ok')"

If it prints ok, proceed.

3B) If smoke test fails with missing module (ModuleNotFoundError / ImportError)

Step 1: Ensure pip exists for python3

Check:

  • python3 -m pip --version

If that fails, try:

  • python3 -m ensurepip --upgrade

Check again:

  • python3 -m pip --version

If still failing, bootstrap pip via get-pip.py:

  • Download https://bootstrap.pypa.io/get-pip.py to {baseDir}/third_party/get-pip.py
    • with curl:
      • curl -L "https://bootstrap.pypa.io/get-pip.py" -o "{baseDir}/third_party/get-pip.py"
    • or PowerShell:
      • Invoke-WebRequest -Uri "https://bootstrap.pypa.io/get-pip.py" -OutFile "{baseDir}/third_party/get-pip.py"
  • Install:
    • python3 "{baseDir}/third_party/get-pip.py"
  • Verify:
    • python3 -m pip --version

If pip still cannot be used, stop and report the error output.

Step 2: Install PhenoSnap dependencies

From {baseDir}/PhenoSnap/, run:

  • python3 -m pip install -r requirements.txt

Step 3: Re-run smoke test once

Re-run:

  • python3 -c "import importlib.util; spec=importlib.util.spec_from_file_location('extract_phenotypes','extract_phenotypes.py'); m=importlib.util.module_from_spec(spec); spec.loader.exec_module(m); print('ok')"

If still failing, stop and return:

  • the missing module name (from error)
  • the command output
  • recommended fix (use a venv/conda env; verify python3/pip; rerun pip install)

4) Prepare HPO OBO path

Resolve HPO OBO path in this order:

  1. If env var HPO_OBO_PATH is set and file exists, use that.
  2. Else use {baseDir}/resources/hp.obo if it exists.

If the resolved file does not exist:

  • Stop and tell the user to place hp.obo at {baseDir}/resources/hp.obo or set HPO_OBO_PATH to its full path.

5) Write input file (redacted)

  • Timestamp format: YYYYMMDD_HHMMSS (local time).
  • Write redacted user text to:
    • {baseDir}/artifacts/phenosnap_inputs/input_<TS>.txt

6) Run extraction

From {baseDir}/PhenoSnap/, run:

  • python3 extract_phenotypes.py --input-file "{baseDir}/artifacts/phenosnap_inputs/input_<TS>.txt" --hpo-obo "<HPO_OBO_PATH>" --output "{baseDir}/artifacts/phenosnap_outputs/phenotypes_<TS>.json" --format json

Validate:

  • Output file exists
  • Output file is non-empty

If validation fails:

  • Return stderr/stdout
  • Provide troubleshooting steps (missing hp.obo, permission issues, dependency issues)

7) Respond to user

Return a concise confirmation:

  • Detected content: “phenotypes” and/or “medications”
  • Input file path (redacted)
  • Output file path (timestamped JSON)
  • Note: no data uploaded; local-only
  • Any warnings (e.g., missing hp.obo, PHI redaction/confirmation)

Troubleshooting

  • PhenoSnap folder exists but script missing: confirm {baseDir}/PhenoSnap/extract_phenotypes.py exists.
  • No git: zip fallback should run; ensure curl/PowerShell is available for download.
  • Unzip fails: use Python zipfile fallback.
  • pip missing: ensurepip then get-pip.py steps above; consider installing Python with “pip” included.
  • Permission errors installing packages: use a virtual environment:
    • python3 -m venv .venv then activate and rerun skill.
  • hp.obo missing: place file at {baseDir}/resources/hp.obo or set HPO_OBO_PATH.

Examples

Example 1 (phenotypes + meds)

User:

  • “Symptoms: developmental delay, seizures, ataxia. Meds: valproate 250 mg BID.”

Action:

  • Write redacted input → run PhenoSnap → output phenotypes_<TS>.json

Example 2 (meds only)

User:

  • “Current meds: metformin 500mg daily, atorvastatin 20 mg qhs.”

Action:

  • Extract medication entities/phenotype-related terms supported by PhenoSnap → output JSON

Example 3 (should NOT trigger)

User:

  • “What is the Human Phenotype Ontology and how is it used?”

Action:

  • Do not run extraction; answer informationally outside this skill.
Weekly Installs
3
GitHub Stars
1
First Seen
Feb 27, 2026
Installed on
openclaw3
gemini-cli3
github-copilot3
codex3
kimi-cli3
cursor3