code-search-assistant

Code Search Assistant

There are four kinds of code search. Picking the wrong one wastes time or misses results. The skill is matching the question to the tool.

Question → tool

Question shape	Tool	Why
"Where is `FooBar` defined?"	Text grep (`rg -w`)	Exact symbol — fast, precise
"Where is `FooBar` used?"	Text grep + filter, or LSP "find references"	Same symbol, many hits
"What calls this function, transitively?"	Call graph walk	Grep finds direct calls; you need the tree
"Where do we validate email addresses?"	Semantic / fuzzy search	Concept, not symbol — no single keyword
"Find all places that cast then dereference"	AST / structural query	Syntactic pattern, not a string
"What's the code path from HTTP to DB?"	Dataflow / taint trace	Cross-function, value-following

Text search — do it right

Grep is fast but dumb. Make it less dumb:

Trick	Example
Word boundaries	`rg -w foo` — matches `foo` not `foobar`
File type filter	`rg -t py foo` — only Python files
Definition vs use	`rg '^(def|class) foo'`
Multi-line pattern	`rg -U 'if.*\n.*return None'`
Exclude vendored/generated	`rg foo -g '!vendor/' -g '!*.pb.go'`
Case-insensitive for NL concepts	`rg -i 'email.*valid'`

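False-positive pruning: comments, strings, tests. rg foo | rg -v test_ | rg -v '^.*#' is crude but works (it drops every line containing a #, including code with a trailing comment). To skip test directories where the language has a naming convention, exclude them with a glob such as -g '!tests/'; the -t type filter selects by file type, not by directory.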
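The same pruning can be done after the search in a few lines of Python, which makes the rules explicit. This is a minimal sketch, not part of the skill itself: the `path:line:text` hit format (as printed by `rg -n`), the `test_` / `tests/` naming convention, and the function name `prune` are all assumptions for illustration.

```python
import re

# Assumed hit format: "path:line_number:source text", as printed by `rg -n`.
HIT = re.compile(r"^(?P<path>[^:]+):(?P<line>\d+):(?P<text>.*)$")

def prune(hits, symbol):
    """Drop hits in test files and hits where the symbol only appears after a '#'."""
    kept = []
    for raw in hits:
        m = HIT.match(raw)
        if m is None:
            continue
        path, text = m["path"], m["text"]
        # Convention-based test detection (an assumption; adjust per project).
        name = path.rsplit("/", 1)[-1]
        if name.startswith("test_") or "/tests/" in f"/{path}":
            continue
        # Naive comment strip: cut at the first '#'. Wrong for '#' inside strings.
        code = text.split("#", 1)[0]
        if not re.search(rf"\b{re.escape(symbol)}\b", code):
            continue
        kept.append(raw)
    return kept
```

Unlike `rg -v '^.*#'`, this keeps a line whose code uses the symbol and merely has a trailing comment; it is still crude about `#` inside string literals.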

Structural search — when grep lies

Grep finds text. AST search finds structure. You need AST when:

The pattern has nesting: "a return inside a for inside a try."
The pattern is semantic: "function calls where the 2nd arg is a string literal."
The pattern spans lines in ways regex can't track.

Tools: semgrep (pattern syntax looks like code with holes), ast-grep, language-specific (Python ast module, clang-query).

Example semgrep pattern — find SQL built by concatenation:

pattern: |
  $CURSOR.execute($X + $Y)

Grep for execute gives you thousands of hits. The pattern gives you the dangerous ones.
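Where semgrep is not available, roughly the same query can be sketched with the Python ast module mentioned above. This is an approximation rather than an equivalent of the semgrep pattern: it only inspects the first argument of any `.execute(...)` call, and the function name `find_concat_execute` and the sample source are illustrative assumptions.

```python
import ast

def find_concat_execute(source):
    """Return line numbers of `<obj>.execute(<a> + <b>, ...)` calls in Python source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)   # something.execute
            and node.func.attr == "execute"
            and node.args
            and isinstance(node.args[0], ast.BinOp)    # first argument is a binary op
            and isinstance(node.args[0].op, ast.Add)   # ...specifically `+`
        ):
            hits.append(node.lineno)
    return sorted(hits)

# Hypothetical input: one concatenated query, one parameterized, one chained concat.
sample = '''
cur.execute("SELECT * FROM t WHERE id = " + user_id)
cur.execute("SELECT * FROM t WHERE id = %s", (user_id,))
db.cursor().execute(base + clause + " LIMIT 1")
'''
print(find_concat_execute(sample))
```

The parameterized call on the middle line is skipped because its first argument is a plain string, which is the distinction a text grep for execute cannot make.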

Call graph — the transitive question

"What eventually calls dangerous_write?" Grep finds direct callers. For the full tree:

Find direct callers of dangerous_write.
For each, find their callers.
Repeat until you hit entry points (main, route handlers, tests).

LSP "call hierarchy" does this in IDEs. Manually: breadth-first, dedupe visited functions. Output is a tree, not a list.
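The manual walk can be sketched as a breadth-first search over a reverse call graph. The graph below is a made-up example (only `dangerous_write` comes from the text); in practice each entry would be filled in by a grep or LSP "find references" step.

```python
from collections import deque

# Hypothetical reverse call graph: function -> functions that call it directly.
CALLERS = {
    "dangerous_write": ["save_report", "flush_cache"],
    "save_report": ["handle_post", "nightly_job"],
    "flush_cache": ["nightly_job"],
    "nightly_job": ["main"],
    "handle_post": [],
    "main": [],
}

def caller_tree(target, callers):
    """Breadth-first walk up the graph. Returns {function: next hop toward target}."""
    parent = {target: None}
    queue = deque([target])
    while queue:
        fn = queue.popleft()
        for caller in callers.get(fn, []):
            if caller not in parent:      # dedupe visited functions
                parent[caller] = fn
                queue.append(caller)
    return parent

def path_to_target(fn, parent):
    """Follow parent links from a caller down to the target."""
    path = []
    while fn is not None:
        path.append(fn)
        fn = parent[fn]
    return path

tree = caller_tree("dangerous_write", CALLERS)
entry_points = sorted(f for f in tree if not CALLERS.get(f))
```

Because visited functions are deduplicated, the result is a spanning tree: each function keeps one shortest path to the target, so a second route (here, `nightly_job` via `flush_cache`) is not recorded. Keep a list of parents per function instead if every path matters.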

Semantic search — the fuzzy question

"Where do we handle session expiry?" — no single symbol. The code might say timeout, ttl, expires_at, staleness, max_age. Semantic search embeds code and query, ranks by meaning.

When you don't have a semantic search index, approximate:

Brainstorm synonyms: expir, timeout, ttl, stale, max_age.
Grep for each, union results.
Rank by proximity to other clue words (session, auth, cookie).
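The three steps above can be sketched for a single file as follows. The synonym and clue lists are the ones from the text; the scoring rule (count of distinct clue words within a few lines of the hit), the `window` parameter, and the function name `rank_hits` are assumptions chosen for illustration, not a tuned ranking.

```python
import re

SYNONYMS = ["expir", "timeout", "ttl", "stale", "max_age"]
CLUES = ["session", "auth", "cookie"]

def rank_hits(lines, synonyms=SYNONYMS, clues=CLUES, window=2):
    """Return (score, line_number, text) for lines matching any synonym, best first.

    score = number of distinct clue words within +/- `window` lines of the hit.
    """
    syn_re = re.compile("|".join(map(re.escape, synonyms)), re.IGNORECASE)
    scored = []
    for i, line in enumerate(lines):
        if not syn_re.search(line):
            continue
        nearby = " ".join(lines[max(0, i - window): i + window + 1]).lower()
        score = sum(clue in nearby for clue in clues)
        scored.append((score, i + 1, line.strip()))
    # Highest score first; ties broken by line number.
    return sorted(scored, key=lambda t: (-t[0], t[1]))
```

A hit such as an unrelated HTTP timeout constant still appears in the output, but with a score of zero it sinks below hits that sit next to session or cookie code.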

Worked example — a real question

Q: "Where in this Django app do we actually write to the orders table?"

Wrong first move: grep orders — 847 hits, mostly templates and tests.

Right sequence:

Find the model. rg 'class.*Model.*orders' -t py or rg "db_table.*orders" → Order in models/order.py.
Find writes. ORM writes are .save(), .create(), .update(), .delete(), .bulk_create(). But those are on any model — need to narrow.
Narrow to the model: rg 'Order\.(objects\.)?(create|update|bulk)' -t py plus rg 'order\.save\(\)' (instance-level is harder, since order could be any variable name).
Cross-reference: find functions that take an Order and call .save(). rg 'def.*order.*:' -A 20 | rg save.
Raw SQL escape hatch: rg 'INSERT INTO orders|UPDATE orders' -i — catches anyone bypassing the ORM.

Result: 6 write sites. 4 through the ORM (service layer), 1 in a migration, 1 raw SQL in a management command (flagged — why is this bypassing the ORM?).
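To group hits the way the result above does, the same regexes can be run from a short script that labels each write site. This is a sketch under assumptions: the input is a hypothetical `{path: source}` mapping, the labels `orm` and `raw-sql` are invented here, and the patterns inherit the blind spot already noted (an instance saved under a variable name other than order is missed).

```python
import re

# Patterns taken from the search sequence above.
ORM_CLASS = re.compile(r"Order\.(objects\.)?(create|update|bulk)")
ORM_INSTANCE = re.compile(r"order\.save\(\)")
RAW_SQL = re.compile(r"INSERT INTO orders|UPDATE orders", re.IGNORECASE)

def classify(files):
    """files: {path: source text}. Returns sorted (path, line_number, kind) tuples."""
    sites = []
    for path, source in files.items():
        for n, line in enumerate(source.splitlines(), start=1):
            if RAW_SQL.search(line):
                kind = "raw-sql"      # bypasses the ORM; worth flagging
            elif ORM_CLASS.search(line) or ORM_INSTANCE.search(line):
                kind = "orm"
            else:
                continue
            sites.append((path, n, kind))
    return sorted(sites)
```

Grouping by kind and by directory (service layer, migrations, management commands) turns a flat hit list into the kind of summary shown in the result.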

Do not

Do not grep when the question is transitive. "Who calls X" (direct) is grep. "Who eventually calls X" is a graph walk.
Do not trust grep for "all usages" in dynamically-typed languages. getattr(obj, 'foo')() won't match rg foo\(. Know your language's reflection escape hatches.
Do not semantic-search when you have an exact symbol. It's slower and less precise than grep.
Do not present 847 hits. Filter, rank, group. "Here are 847 matches" is not an answer.
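The reflection blind spot can be made concrete with a small comparison, assuming Python source. The regex stands in for rg foo\(; the AST check looks for getattr calls whose attribute name is a string literal. The function name `dynamic_lookups` and the sample are illustrative assumptions.

```python
import ast
import re

def dynamic_lookups(source, name):
    """Line numbers of getattr(<obj>, "<name>") calls with a string-literal name."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "getattr"
            and len(node.args) >= 2
            and isinstance(node.args[1], ast.Constant)
            and node.args[1].value == name
        ):
            hits.append(node.lineno)
    return sorted(hits)

# Hypothetical sample: a direct call, a literal getattr, and a computed getattr.
sample = "obj.foo()\ngetattr(obj, 'foo')()\ngetattr(obj, attr_name)()\n"

# What a text search for `foo\(` sees.
grep_lines = [n for n, line in enumerate(sample.splitlines(), 1)
              if re.search(r"foo\(", line)]
ast_lines = dynamic_lookups(sample, "foo")
```

The text search finds only the direct call and the AST check finds only the literal getattr. The third line, where the attribute name is computed at runtime, escapes both, which is the kind of blind spot that belongs in the Notes section of the output.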

Output format

## Query
<what was asked>

## Search strategy
<text | AST | call-graph | semantic> — <why this one>

## Searches run
1. <command / pattern> → <N> hits
2. <refinement> → <M> hits
...

## Results (ranked)
| Location | Snippet | Relevance |
| -------- | ------- | --------- |

## Notes
<known blind spots — reflection, generated code, dynamic dispatch>

More from santosomar/general-secure-coding-agent-skills

Repository

santosomar/gene…t-skills

First Seen

Mar 29, 2026

Security Audits

Gen Agent Trust Hub: Pass

Snyk: Pass
