NYC
skills/letta-ai/skills/sparql-university

sparql-university

SKILL.md

SPARQL University Query Tasks

Overview

This skill provides guidance for writing SPARQL queries against RDF/Turtle datasets, with emphasis on ensuring complete data analysis, proper query construction, and thorough verification.

Workflow

Step 1: Complete Data Acquisition

Before writing any query, ensure complete visibility of the source data.

Critical actions:

  • Read the entire Turtle (.ttl) or RDF file without truncation
  • If data appears truncated, request additional content or use pagination
  • Count distinct entities to verify data completeness
  • Document all entity types, predicates, and relationships observed

Verification checkpoint: Confirm the number of distinct entities matches expectations before proceeding.

Step 2: Schema Understanding

Map out the data structure before query construction.

Key elements to identify:

  • All entity types (classes) in the dataset
  • All predicates/properties used
  • Relationships between entities (e.g., professor → department → students)
  • Data types for literals (strings, dates, integers)
  • Naming conventions and value formats

Common patterns in academic data:

  • Roles/titles often use specific prefixes (e.g., "Professor of", "Associate Professor")
  • Dates may require comparison logic for "current" status
  • Geographic codes may use ISO standards (country codes)
  • Enrollment may span multiple departments

Step 3: Criteria Decomposition

Break down filtering requirements into discrete, testable conditions.

For each criterion:

  1. Identify the exact predicate path to the relevant data
  2. Determine the comparison type (equality, prefix match, membership, numeric)
  3. Consider edge cases in the criterion interpretation
  4. Test each criterion independently before combining

Example decomposition:

  • "Full professors" → Filter where role starts with specific prefix
  • "Working in EU countries" → Filter country codes against EU membership list
  • "Departments with >10 students" → Count students per department, apply threshold

Step 4: Query Construction

Build the query incrementally with validation at each stage.

Construction sequence:

  1. Start with the most restrictive filter to reduce result set
  2. Add one filter at a time, verifying intermediate results
  3. Include all necessary SELECT variables
  4. Add aggregation (GROUP BY, GROUP_CONCAT) last

Syntax validation:

  • Verify all prefixes are declared
  • Ensure FILTER expressions are properly closed
  • Check string comparisons use correct functions (STRSTARTS, CONTAINS, regex)
  • Confirm numeric comparisons handle data types correctly

Output format considerations:

  • Determine if results need aggregation (e.g., concatenating multiple values)
  • Specify sort order and separators for concatenated values
  • Distinguish between filtering criteria and output requirements (e.g., filter by EU countries but output ALL countries)

Step 5: Verification Strategy

Test the query against known expectations.

Verification methods:

  1. Run the query and examine raw output
  2. Manually trace through data for at least 2-3 entities to verify correctness
  3. Check for both inclusion (expected entities present) AND exclusion (unexpected entities absent)
  4. Verify aggregated values by manual count

Cross-reference checklist:

  • Do the returned entities match manual analysis?
  • Are all expected entities present in results?
  • Are any unexpected entities incorrectly included?
  • Do aggregated counts/values match manual verification?

Common Pitfalls

Incomplete Data Reading

  • Problem: Working with truncated data leads to missing entities
  • Prevention: Always confirm complete file content; re-read if truncated

Query Truncation

  • Problem: Long queries may be incompletely written
  • Prevention: After writing, read back the query file to verify completeness

Criterion Misinterpretation

  • Problem: Confusing filter criteria with output requirements
  • Prevention: Distinguish between "filter BY X" vs "output X" - these may differ

Date/Time Edge Cases

  • Problem: Incorrect handling of boundary dates
  • Prevention: Clarify whether comparisons are inclusive or exclusive; test boundaries

Aggregation Errors

  • Problem: Missing GROUP BY clauses or incorrect GROUP_CONCAT usage
  • Prevention: Verify aggregation syntax matches the query structure

EU Country List

  • Problem: Incomplete or outdated list of EU member country codes
  • Prevention: Use comprehensive list: AT, BE, BG, HR, CY, CZ, DK, EE, FI, FR, DE, GR, HU, IE, IT, LV, LT, LU, MT, NL, PL, PT, RO, SK, SI, ES, SE

Cross-Entity Relationships

  • Problem: Miscounting entities across relationships (e.g., students in departments)
  • Prevention: Trace the full predicate path; verify join conditions

Testing Protocol

  1. Syntax check: Ensure query parses without errors
  2. Subset test: Run on a known subset of data with expected results
  3. Full test: Run on complete dataset
  4. Manual verification: Trace 2-3 results through source data
  5. Boundary test: Check edge cases in filters (dates, counts, string matches)

Iteration Approach

If initial results do not match expectations:

  1. Isolate which filter condition is causing discrepancies
  2. Test each filter independently
  3. Examine entities that should appear but don't (false negatives)
  4. Examine entities that shouldn't appear but do (false positives)
  5. Adjust filter logic based on findings
  6. Re-verify after each adjustment
Weekly Installs
20
Repository
letta-ai/skills
First Seen
Jan 24, 2026
Installed on
claude-code14
codex13
opencode13
gemini-cli13
antigravity12
cursor10