Code Semantic Search
Overview
Semantic code search using Phase 1 vector embeddings and Phase 2 hybrid search (semantic + structural). Find code by meaning, not just keywords.
Core principle: Search code by what it does, not what it's called.
Phase 2: Hybrid Search
This skill now supports three search modes:
- Hybrid (Default):
  - Combines semantic + structural search
  - Best accuracy (95%+)
  - Slightly slower but still <150ms
  - Recommended for all searches
- Semantic-Only:
  - Uses only Phase 1 semantic vectors
  - Fastest (<50ms)
  - Good for conceptual searches
  - Use when structure doesn't matter
- Structural-Only:
  - Uses only ast-grep patterns
  - Precise for exact matches
  - Best for finding function/class definitions
  - Use when you need exact structural patterns
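One way to picture how hybrid mode could combine the two signals is a weighted score merge. The sketch below is illustrative only: the `mergeResults` function, the `{ file, score }` result shape, and the 0.6/0.4 weights are assumptions, not the behavior of the skill's actual hybrid search.

```javascript
// Hypothetical sketch: merge semantic and structural hits into one ranked list.
// The weights and the { file, score } shape are assumptions for illustration.
function mergeResults(
  semanticHits,
  structuralHits,
  weights = { semantic: 0.6, structural: 0.4 },
) {
  const combined = new Map();

  // Accumulate a weighted score per file across both result sets.
  const add = (hits, weight) => {
    for (const { file, score } of hits) {
      combined.set(file, (combined.get(file) ?? 0) + weight * score);
    }
  };

  add(semanticHits, weights.semantic);
  add(structuralHits, weights.structural);

  // Highest combined score first.
  return [...combined.entries()]
    .map(([file, score]) => ({ file, score }))
    .sort((a, b) => b.score - a.score);
}

const ranked = mergeResults(
  [
    { file: 'auth/login.js', score: 0.9 },
    { file: 'db/users.js', score: 0.5 },
  ],
  [{ file: 'auth/login.js', score: 1.0 }],
);
console.log(ranked.map((r) => r.file));
```

A file found by both searches accumulates both weighted scores, which is why it outranks a file found by only one.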
Performance Comparison
| Mode | Speed | Accuracy | Best For |
|---|---|---|---|
| Hybrid | <150ms | 95% | General search |
| Semantic-only | <50ms | 85% | Concepts |
| Structural-only | <50ms | 100% | Exact patterns |
| Phase 1 only | <50ms | 80% | Legacy (fallback) |
When to Use
Use:
- Finding authentication logic without knowing function names
- Searching for error handling patterns
- Locating database queries
- Finding similar code to a concept
- Discovering implementation patterns
Don't Use:
- Exact text matching (use Grep instead)
- File name searches (use Glob instead)
- Simple keyword searches (use ripgrep instead)
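The split above can be sketched as a small routing function. This is a rough heuristic under stated assumptions: `pickTool` and its regular expressions are hypothetical and not part of the skill. They only illustrate how the shape of a query might decide which tool to reach for.

```javascript
// Hypothetical router: choose a tool from the shape of the query.
function pickTool(query) {
  const q = query.trim();
  // Fully quoted text signals an exact string match.
  if (/^".*"$|^'.*'$/.test(q)) return 'grep';
  // Glob characters or a bare file name with an extension signal a file-name search.
  if (/[*?]/.test(q) || /^[\w./-]+\.\w+$/.test(q)) return 'glob';
  // A single token is a plain keyword search.
  if (!/\s/.test(q)) return 'ripgrep';
  // Multi-word descriptions are meaning-based queries.
  return 'code-semantic-search';
}

console.log(pickTool('where are auth tokens validated'));
```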
Usage Examples
Hybrid Search (Recommended)
// Basic hybrid search
Skill({ skill: 'code-semantic-search', args: 'find authentication logic' });

// With options
Skill({
  skill: 'code-semantic-search',
  args: 'database queries',
  options: {
    mode: 'hybrid',
    language: 'javascript',
    limit: 10,
  },
});
Semantic-Only Search
// Fast conceptual search
Skill({
  skill: 'code-semantic-search',
  args: 'find authentication',
  options: { mode: 'semantic-only' },
});
Structural-Only Search
// Exact pattern matching
Skill({
  skill: 'code-semantic-search',
  args: 'find function authenticate',
  options: { mode: 'structural-only' },
});
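The three calls above differ only in `options.mode`, so the choice could be wrapped in a helper. The following is a minimal sketch, assuming a hypothetical `buildSearchRequest` function and a hypothetical `needSpeed` flag. It only builds the argument object and does not call `Skill` itself.

```javascript
// Hypothetical helper: build the Skill() argument object with a mode chosen from the query.
function buildSearchRequest(query, { needSpeed = false } = {}) {
  // Queries that name a definition ("function foo", "class Bar") suit structural search.
  const namesDefinition = /\b(function|class|method|interface)\s+\w+/i.test(query);

  let mode = 'hybrid'; // the documented default
  if (namesDefinition) mode = 'structural-only';
  else if (needSpeed) mode = 'semantic-only';

  return { skill: 'code-semantic-search', args: query, options: { mode } };
}

console.log(buildSearchRequest('find function authenticate'));
```

The returned object has the same shape as the arguments in the examples, so it could be passed straight to `Skill(...)`.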
Implementation Reference
Hybrid Search: .claude/lib/code-indexing/hybrid-search.cjs
Query Analysis: .claude/lib/code-indexing/query-analyzer.cjs
Result Ranking: .claude/lib/code-indexing/result-ranker.cjs
Integration Points
- developer: Code exploration, implementation discovery
- architect: System understanding, pattern analysis
- code-reviewer: Finding similar patterns, consistency checks
- reverse-engineer: Understanding unfamiliar codebases
- researcher: Research existing implementations
Iron Laws
- ALWAYS use hybrid mode (semantic + structural) for general searches. Semantic-only misses exact matches and structural-only misses conceptual variants; hybrid provides 95% accuracy vs 85% for semantic-only.
- ALWAYS try ripgrep/keyword search first when the target is an exact string, function name, or filename. Semantic search is for meaning-based queries; exact text is found faster with ripgrep.
- NEVER use semantic search without a meaningful natural-language query. Single-word or code-syntax queries produce poor semantic results; describe what the code does, not what it's called.
- ALWAYS combine with code-structural-search for precision refinement. Start broad with semantic discovery, then use ast-grep patterns to find exact structural matches within the semantic results.
- NEVER ignore low-similarity results without checking them. Similarity scores are approximations; a result scored 0.7 may be more relevant than one scored 0.9 for uncommon patterns.
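The discovery-then-refinement flow in the fourth and fifth laws could be sketched as follows. Everything here is an assumption for illustration: `discoverThenRefine`, the injected `semanticSearch` and `matchesPattern` functions, the hit shape, and the 0.5 score floor are hypothetical. A regular expression stands in for an ast-grep pattern.

```javascript
// Hypothetical two-stage search: semantic discovery, then structural filtering.
// `semanticSearch` and `matchesPattern` are injected stand-ins, not real APIs.
function discoverThenRefine(query, pattern, { semanticSearch, matchesPattern, minScore = 0.5 }) {
  return semanticSearch(query)
    // Keep a low floor so a 0.7-score hit still reaches the structural check.
    .filter((hit) => hit.score >= minScore)
    // Keep only hits whose snippet has the required structure.
    .filter((hit) => matchesPattern(hit.snippet, pattern));
}

const hits = discoverThenRefine('validate session tokens', /function\s+\w*[Tt]oken\w*/, {
  // Canned results standing in for a real semantic search.
  semanticSearch: () => [
    { file: 'auth/session.js', score: 0.7, snippet: 'function verifyToken(t) {}' },
    { file: 'auth/ui.js', score: 0.9, snippet: 'const label = "Token"' },
    { file: 'misc/log.js', score: 0.2, snippet: 'function tokenLog() {}' },
  ],
  // A regex test stands in for an ast-grep structural match.
  matchesPattern: (snippet, re) => re.test(snippet),
});
console.log(hits.map((h) => h.file));
```

In this canned data the 0.9-score hit is dropped by the structural check while the 0.7-score hit survives, which is the situation the last law warns about.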
Anti-Patterns
| Anti-Pattern | Why It Fails | Correct Approach |
|---|---|---|
| Semantic search for exact string matching | Slower and less accurate than text search | Use ripgrep/Grep for exact keyword matching |
| Single-word queries ("auth") | Too vague for semantic matching; returns noise | Use natural-language descriptions ("authentication token validation logic") |
| Using semantic-only mode for general searches | Misses structural variants; 85% vs 95% accuracy | Use hybrid mode (default) for general queries |
| Ignoring search results that don't match expectations | Semantic results find surprising-but-relevant code | Read all results; unexpected matches are often the most valuable |
| Not combining with structural search | Finds concepts but not exact patterns | Use semantic for discovery → structural for precision |
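Two of the anti-patterns above (single-word queries and code-syntax queries) can be caught before a search runs. The check below is a minimal sketch: `checkQuery`, its thresholds, and its messages are hypothetical and not part of the skill.

```javascript
// Hypothetical pre-flight check for semantic queries.
function checkQuery(query) {
  const words = query.trim().split(/\s+/).filter(Boolean);

  // Single-word (or empty) queries are too vague for semantic matching.
  if (words.length < 2) {
    return { ok: false, reason: 'single-word query; describe what the code does' };
  }
  // Brackets, semicolons, assignments, or arrows suggest code syntax rather than a description.
  if (/[(){};=]/.test(query)) {
    return { ok: false, reason: 'looks like code syntax; use structural or text search' };
  }
  return { ok: true, reason: '' };
}

console.log(checkQuery('auth'));
console.log(checkQuery('authentication token validation logic'));
```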
Memory Protocol (MANDATORY)
Before starting:
Read .claude/context/memory/learnings.md
After completing:
- New pattern -> .claude/context/memory/learnings.md
- Issue found -> .claude/context/memory/issues.md
- Decision made -> .claude/context/memory/decisions.md
ASSUME INTERRUPTION: If it's not in memory, it didn't happen.
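The "after completing" step could be scripted along these lines. This is a sketch under assumptions: the `recordMemory` helper, the `kind` keys, and the dated-bullet entry format are hypothetical. Only the three file names come from the protocol above.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Map each outcome type to its memory file (file names from the protocol above).
const MEMORY_FILES = {
  pattern: 'learnings.md',
  issue: 'issues.md',
  decision: 'decisions.md',
};

// Hypothetical helper: append a dated note to the matching memory file.
// The "- YYYY-MM-DD: note" entry format is an assumption.
function recordMemory(kind, note, memoryDir = '.claude/context/memory') {
  const fileName = MEMORY_FILES[kind];
  if (!fileName) throw new Error(`Unknown memory kind: ${kind}`);

  fs.mkdirSync(memoryDir, { recursive: true });
  const target = path.join(memoryDir, fileName);
  const date = new Date().toISOString().slice(0, 10);
  fs.appendFileSync(target, `- ${date}: ${note}\n`);
  return target;
}
```

A call such as `recordMemory('issue', 'semantic-only mode missed an exact match')` would append one line to `issues.md` under the default directory.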
Repository: oimiragieo/agent-studio
GitHub Stars: 16
First Seen: Feb 8, 2026