PrimeKG Knowledge Graph Skill
Overview
PrimeKG is a precision medicine knowledge graph that integrates over 20 primary databases and high-quality scientific literature into a single resource. It contains roughly 129,000 nodes and about 4 million edges across 29 relationship types, including drug-target, disease-gene, and phenotype-disease associations.
Key capabilities:
- Search for nodes (genes, proteins, drugs, diseases, phenotypes)
- Retrieve direct neighbors (associated entities and clinical evidence)
- Analyze local disease context (related genes, drugs, phenotypes)
- Identify drug-disease paths (potential repurposing opportunities)
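Data access: Query the graph programmatically through scripts/query_primekg.py. The data is stored at C:\Users\eamon\Documents\Data\PrimeKG\kg.csv (the same file appears as /mnt/c/Users/eamon/Documents/Data/PrimeKG/kg.csv when accessed through a /mnt/c mount).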
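As a rough illustration of what reading the edge list with pandas could look like, the sketch below loads only a handful of columns. The column names (relation, x_id, x_type, x_name, y_id, y_type, y_name) follow the public PrimeKG kg.csv release and are an assumption about the local copy; load_edges is a hypothetical helper, not a function from query_primekg.py, and the rows are made up so the example runs without the real file.

```python
import io

import pandas as pd

# Columns of interest. These names follow the public PrimeKG kg.csv release
# and are an assumption about the local copy.
EDGE_COLUMNS = ["relation", "x_id", "x_type", "x_name", "y_id", "y_type", "y_name"]


def load_edges(source, columns=EDGE_COLUMNS):
    """Read only the needed columns; dtype=str keeps numeric-looking IDs intact."""
    return pd.read_csv(source, usecols=columns, dtype=str)


# Toy stand-in for kg.csv (made-up rows) so the sketch runs without the real file.
toy_csv = io.StringIO(
    "relation,display_relation,x_id,x_type,x_name,y_id,y_type,y_name\n"
    "disease_gene,associated with,EFO_0000249,disease,Alzheimer's disease,GENE_1,gene,APOE\n"
    "drug_protein,target,DRUG_A,drug,Drug A,GENE_1,gene,APOE\n"
)
edges = load_edges(toy_csv)
print(edges.shape)  # (2, 7)
```

To try this against the real data, pass the kg.csv path in place of toy_csv. Restricting usecols is one way to keep memory use down on a file with millions of rows.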
When to Use This Skill
Use this skill for:
- Knowledge-based drug discovery: Identifying targets and mechanisms for diseases.
- Drug repurposing: Finding existing drugs that might have evidence for new indications.
- Phenotype analysis: Understanding how symptoms/phenotypes relate to diseases and genes.
- Multiscale biology: Bridging the gap between molecular targets (genes) and clinical outcomes (diseases).
- Network pharmacology: Investigating the broader network effects of drug-target interactions.
Core Workflow
1. Search for Entities
Find identifiers for genes, drugs, or diseases.
from scripts.query_primekg import search_nodes
# Search for Alzheimer's disease nodes
results = search_nodes("Alzheimer", node_type="disease")
# Returns: [{"id": "EFO_0000249", "type": "disease", "name": "Alzheimer's disease", ...}]
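The internals of search_nodes are not shown here, so the following is only a sketch of how a name search over an edge list might work: collect both endpoints of every edge into one node table, then do a case-insensitive substring match. The function names, column names, and toy rows are all illustrative assumptions.

```python
import pandas as pd

# Toy edge list (made-up rows); column names are assumed, not confirmed by the source.
kg = pd.DataFrame(
    [
        ("disease_gene", "EFO_0000249", "disease", "Alzheimer's disease", "GENE_1", "gene", "APOE"),
        ("drug_protein", "DRUG_A", "drug", "Drug A", "GENE_1", "gene", "APOE"),
    ],
    columns=["relation", "x_id", "x_type", "x_name", "y_id", "y_type", "y_name"],
)


def build_node_table(edges):
    """Stack both edge endpoints into one de-duplicated id/type/name table."""
    x = edges[["x_id", "x_type", "x_name"]].rename(columns=lambda c: c[2:])
    y = edges[["y_id", "y_type", "y_name"]].rename(columns=lambda c: c[2:])
    return pd.concat([x, y], ignore_index=True).drop_duplicates().reset_index(drop=True)


def search_nodes_sketch(nodes, query, node_type=None):
    """Case-insensitive substring match on node names, optionally limited to one type."""
    mask = nodes["name"].str.contains(query, case=False, regex=False, na=False)
    if node_type is not None:
        mask &= nodes["type"] == node_type
    return nodes[mask].to_dict("records")


nodes = build_node_table(kg)
hits = search_nodes_sketch(nodes, "alzheimer", node_type="disease")
print(hits)
```

Building the node table once and reusing it avoids rescanning the full edge list on every search.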
2. Get Neighbors (Direct Associations)
Retrieve all connected nodes and relationship types.
from scripts.query_primekg import get_neighbors
# Get all neighbors of a specific disease ID
neighbors = get_neighbors("EFO_0000249")
# Returns: List of neighbors like {"neighbor_name": "APOE", "relation": "disease_gene", ...}
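A neighbor lookup of this kind can be sketched as a filter on both endpoints of the edge list, with an optional relation filter applied afterwards. This is a minimal illustration under assumed column names and made-up rows; get_neighbors_sketch is hypothetical and may differ from how get_neighbors is actually implemented.

```python
import pandas as pd

# Toy edge list (made-up rows); column names are assumed.
kg = pd.DataFrame(
    [
        ("disease_gene", "EFO_0000249", "disease", "Alzheimer's disease", "GENE_1", "gene", "APOE"),
        ("disease_phenotype", "EFO_0000249", "disease", "Alzheimer's disease", "PHENO_1", "phenotype", "Memory impairment"),
        ("drug_disease", "DRUG_A", "drug", "Drug A", "EFO_0000249", "disease", "Alzheimer's disease"),
    ],
    columns=["relation", "x_id", "x_type", "x_name", "y_id", "y_type", "y_name"],
)


def _side(edges, node_id, here, there):
    """Rows where node_id sits on the `here` end, described from the `there` end."""
    rows = edges[edges[f"{here}_id"] == node_id]
    return pd.DataFrame(
        {
            "relation": rows["relation"],
            "neighbor_id": rows[f"{there}_id"],
            "neighbor_type": rows[f"{there}_type"],
            "neighbor_name": rows[f"{there}_name"],
        }
    )


def get_neighbors_sketch(edges, node_id, relation_type=None):
    """Collect neighbors from both edge directions, optionally filtered by relation."""
    both = pd.concat(
        [_side(edges, node_id, "x", "y"), _side(edges, node_id, "y", "x")],
        ignore_index=True,
    ).drop_duplicates()
    if relation_type is not None:
        both = both[both["relation"] == relation_type]
    return both.to_dict("records")


all_neighbors = get_neighbors_sketch(kg, "EFO_0000249")
gene_neighbors = get_neighbors_sketch(kg, "EFO_0000249", relation_type="disease_gene")
print(len(all_neighbors), gene_neighbors)
```

Checking both endpoints means the lookup works whether or not the file stores each edge in both directions; drop_duplicates removes the repeats in the case where it does.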
3. Analyze Disease Context
A high-level function to summarize associations for a disease.
from scripts.query_primekg import get_disease_context
# Comprehensive summary for a disease
context = get_disease_context("Alzheimer's disease")
# Access: context['associated_genes'], context['associated_drugs'], context['phenotypes']
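One plausible way to produce such a summary is to gather every edge touching the disease and group the other endpoint by node type. The sketch below does this on made-up rows; the node type labels ("gene", "drug", "phenotype"), the column names, and disease_context_sketch itself are assumptions for illustration, while the three result keys are the ones listed above.

```python
import pandas as pd

# Toy edge list (made-up rows); column names and type labels are assumed.
kg = pd.DataFrame(
    [
        ("disease_gene", "EFO_0000249", "disease", "Alzheimer's disease", "GENE_1", "gene", "APOE"),
        ("disease_phenotype", "EFO_0000249", "disease", "Alzheimer's disease", "PHENO_1", "phenotype", "Memory impairment"),
        ("drug_disease", "DRUG_A", "drug", "Drug A", "EFO_0000249", "disease", "Alzheimer's disease"),
    ],
    columns=["relation", "x_id", "x_type", "x_name", "y_id", "y_type", "y_name"],
)

# Hypothetical mapping from result keys to node type labels.
CONTEXT_KEYS = {"associated_genes": "gene", "associated_drugs": "drug", "phenotypes": "phenotype"}


def disease_context_sketch(edges, disease_name):
    """Group the names of a disease's neighbors by node type."""
    # View every edge from both ends so the disease may sit on either side.
    swap = {f"x_{f}": f"y_{f}" for f in ("id", "type", "name")}
    swap.update({f"y_{f}": f"x_{f}" for f in ("id", "type", "name")})
    both = pd.concat([edges, edges.rename(columns=swap)], ignore_index=True)
    rows = both[(both["x_type"] == "disease") & (both["x_name"] == disease_name)]
    return {
        key: sorted(rows.loc[rows["y_type"] == node_type, "y_name"].unique())
        for key, node_type in CONTEXT_KEYS.items()
    }


context = disease_context_sketch(kg, "Alzheimer's disease")
print(context["associated_genes"], context["associated_drugs"], context["phenotypes"])
```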
Relationship Types in PrimeKG
The graph contains several key relationship types including:
- protein_protein: Physical PPIs
- drug_protein: Drug target/mechanism associations
- disease_gene: Genetic associations
- drug_disease: Indications and contraindications
- disease_phenotype: Clinical signs and symptoms
- gwas: Genome-wide association studies evidence
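The relation labels stored in a particular kg.csv may not match the names listed above exactly, so it can help to tabulate what the file actually contains before filtering on them. The sketch below counts edges per relation and per pair of node types on made-up rows; the column names are assumed.

```python
import pandas as pd

# Toy edge list (made-up rows); column names are assumed.
kg = pd.DataFrame(
    [
        ("disease_gene", "disease", "gene"),
        ("disease_gene", "disease", "gene"),
        ("drug_protein", "drug", "gene"),
        ("disease_phenotype", "disease", "phenotype"),
    ],
    columns=["relation", "x_type", "y_type"],
)

# How many edges carry each relation label.
relation_counts = kg["relation"].value_counts()

# Which node types each relation connects, with edge counts.
type_pairs = kg.groupby(["relation", "x_type", "y_type"]).size()

print(relation_counts)
print(type_pairs)
```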
Best Practices
- Use specific IDs: When using get_neighbors, ensure you have the correct ID from search_nodes.
- Context first: Use get_disease_context for a broad overview before diving into specific genes or drugs.
- Filter relationships: Use the relation_type filter in get_neighbors to focus on specific evidence (e.g., only drug_protein).
- Multiscale integration: Combine with OpenTargets for deeper genetic evidence or Semantic Scholar for the latest literature context.
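Filtering by relation also makes it possible to chain edges into drug-disease paths, the repurposing use case mentioned in the overview. The sketch below joins drug_protein edges to disease_gene edges on the shared gene. It assumes the drug sits on the x side of drug_protein edges and the disease on the x side of disease_gene edges, uses made-up rows, and is not a function from query_primekg.py. A shared target is only a hypothesis to follow up, not evidence of efficacy.

```python
import pandas as pd

# Toy edge list (made-up rows); column names and edge directions are assumed.
kg = pd.DataFrame(
    [
        ("disease_gene", "EFO_0000249", "disease", "Alzheimer's disease", "GENE_1", "gene", "APOE"),
        ("drug_protein", "DRUG_A", "drug", "Drug A", "GENE_1", "gene", "APOE"),
        ("drug_protein", "DRUG_B", "drug", "Drug B", "GENE_2", "gene", "Gene 2"),
    ],
    columns=["relation", "x_id", "x_type", "x_name", "y_id", "y_type", "y_name"],
)


def drug_gene_disease_paths(edges, disease_id):
    """Find drugs whose targets overlap the genes associated with a disease."""
    # Genes linked to the disease (disease assumed on the x side).
    disease_genes = edges.loc[
        (edges["relation"] == "disease_gene") & (edges["x_id"] == disease_id),
        ["y_id", "y_name"],
    ].rename(columns={"y_id": "gene_id", "y_name": "gene_name"})

    # Drug targets (drug assumed on the x side).
    drug_targets = edges.loc[
        edges["relation"] == "drug_protein", ["x_id", "x_name", "y_id"]
    ].rename(columns={"x_id": "drug_id", "x_name": "drug_name", "y_id": "gene_id"})

    # Inner join on the shared gene yields drug -> gene -> disease paths.
    paths = drug_targets.merge(disease_genes, on="gene_id")
    return paths[["drug_name", "gene_name"]].drop_duplicates().to_dict("records")


paths = drug_gene_disease_paths(kg, "EFO_0000249")
print(paths)
```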
Resources
Scripts
scripts/query_primekg.py: Core functions for searching and querying the knowledge graph.
Data Path
- Data: /mnt/c/Users/eamon/Documents/Data/PrimeKG/kg.csv
- Total nodes: ~129,000
- Total edges: ~4,000,000
- Database: CSV-based, optimized for pandas querying.