CodeAlive Context Engine

Semantic code intelligence across your entire code ecosystem — current project, organizational repos, dependencies, and any indexed codebase.

Authentication

All scripts require a CodeAlive API key. If any script fails with "API key not configured", help the user set it up:

Option 1 (recommended): Run the interactive setup and wait for the user to complete it:

python setup.py

Option 2 (not recommended — key visible in chat history): If the user pastes their API key directly in chat, save it via:

python setup.py --key THE_KEY

Do NOT retry the failed script until setup completes successfully.

Tools Overview

| Tool | Script | Speed | Cost | Best For |
| --- | --- | --- | --- | --- |
| List Data Sources | datasources.py | Instant | Free | Discovering indexed repos and workspaces |
| Semantic Search | search.py | Fast | Low | Default discovery: finds code by meaning (concepts, behavior, architecture) |
| Grep Search | grep.py | Fast | Low | Finds code containing a specific string or regex (identifiers, literals, patterns) |
| Fetch Artifacts | fetch.py | Fast | Low | Retrieving full content; function-like artifacts also include up to 3 outgoing/incoming calls as a preview |
| Artifact Relationships | relationships.py | Fast | Low | Full call graph (past the fetch preview's 3-cap), inheritance, or symbol references for one artifact |
| Chat with Codebase | chat.py | Slow | High | Not recommended. Call ONLY when the user explicitly asks (e.g. "use chat"). |

Cost guidance: semantic search (search.py) and grep search (grep.py) are the default starting point because they are fast and cheap. Use fetch.py to load full source and relationships.py to trace call graphs. All four tools are low-cost.

Chat is not recommended: chat.py invokes an LLM on the server side, can take up to 30 seconds, and is significantly more expensive per call. Do NOT call it unless the user has explicitly requested it (e.g. "use chat", "use codebase_consultant", "call the chat tool"). Phrases like "ask CodeAlive" or "search CodeAlive" do NOT qualify — they refer to search tools.
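The explicit-request rule can be sketched as a small guard. This is only an illustration: the function name chat_explicitly_requested and the phrase list are assumptions built from the examples in this section, not part of the CodeAlive scripts.

```python
# Hypothetical guard; the phrase list mirrors the examples given in this document.
EXPLICIT_CHAT_PHRASES = ("use chat", "use codebase_consultant", "call the chat tool")


def chat_explicitly_requested(user_message: str) -> bool:
    """Return True only when the message contains an explicit chat request."""
    text = user_message.lower()
    return any(phrase in text for phrase in EXPLICIT_CHAT_PHRASES)
```

A plain substring match is deliberately conservative: anything that does not name the chat tool outright falls through to the search tools.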

Highest-confidence guidance: If your agent supports subagents and the task needs maximum reliability or depth, prefer a subagent-driven workflow that combines search.py, grep.py, fetch.py, relationships.py, and local file reads.

Three-step workflow (search → triage → load real content):

  1. Search — find relevant code locations with descriptions and identifiers
  2. Triage — use description ONLY to decide which results are worth a closer look. It is a pointer, NOT the source of truth. Do not draw conclusions from it.
  3. Get real content — for every artifact you decide is relevant:
    • External repos (no local access): python fetch.py <identifier>
    • Current working repo: read the file at the shown path with your editor's file-read tool

     Treat only that real content as ground truth.
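Step 3 of the workflow above might look like the following sketch. The result shape (dicts with repo, path, and identifier keys) and the function name plan_content_loads are assumptions for illustration; the real search output may be structured differently.

```python
from typing import Dict, List, Set


def plan_content_loads(results: List[Dict[str, str]], local_repos: Set[str]) -> List[str]:
    """Turn triaged search results into 'load real content' actions.

    Each result is assumed (hypothetically) to carry 'repo', 'path' and
    'identifier' keys.
    """
    actions = []
    for result in results:
        if result["repo"] in local_repos:
            # Current working repo: read the file directly from disk.
            actions.append(f"read {result['path']}")
        else:
            # External repo: fetch the artifact by its identifier.
            actions.append(f'python scripts/fetch.py "{result["identifier"]}"')
    return actions
```

Only results that survived triage should be passed in; the description field plays no part once real content is being loaded.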

Drill into relationships.py when the fetch preview isn't enough. The fetch.py response already previews up to 3 outgoing + 3 incoming calls for function-like artifacts, so the call graph alone is rarely a reason to run relationships.py after a full fetch of a small artifact. Reach for it when:

  • You need all incoming callers — the fetch preview is capped at 3. The full incoming list also surfaces test coverage (incoming from test files).
  • You need the inheritance tree: --profile inheritanceOnly returns ancestors + descendants (interface implementations, subclasses, base-class chains). The preview doesn't include inheritance.
  • You need symbol references: use --profile referencesOnly for places that reference a type or identifier.
  • The artifact is too large to fetch into context — the call graph is a cheaper summary than pulling the full source.

Analyzer noise: outgoing calls occasionally include compiler-generated helpers (MoveNext, GetEnumerator, closure invocations) from methods using foreach/LINQ. Ignore outgoing hits that don't match the artifact's real logic.
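One way to drop this noise before reading a call list is a simple name filter. The sketch below assumes outgoing calls arrive as "Type.Method" strings, which is a hypothetical shape; only MoveNext and GetEnumerator come from this document, and closure invocations are not covered.

```python
# Helper names listed in this document as typical compiler-generated noise.
COMPILER_GENERATED = {"MoveNext", "GetEnumerator"}


def drop_analyzer_noise(outgoing_calls):
    """Remove outgoing calls whose method name is a known generated helper.

    Calls are assumed (hypothetically) to be 'Type.Method' strings, with or
    without trailing parentheses.
    """
    kept = []
    for call in outgoing_calls:
        # Take the part after the last dot and strip a trailing "()".
        method = call.rsplit(".", 1)[-1].rstrip("()")
        if method not in COMPILER_GENERATED:
            kept.append(call)
    return kept
```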

When to Use

Semantic search (default) — you describe behavior or concept:

  • "How is authentication implemented?"
  • "Show me error handling patterns across services"
  • "How does this library work internally?"
  • "Find similar features to guide my implementation"

Grep search — you know the exact text:

  • "Find all usages of RepositoryDeleted"
  • "Where is ConnectionString configured?"
  • "Search for TODO: fix across the codebase"
  • Error messages, URLs, config keys, import paths, regex patterns

Use local file tools instead for:

  • Finding specific files by name or pattern
  • Exact keyword search in the current directory
  • Reading known file paths
  • Searching uncommitted changes

Quick Start

1. Discover what's indexed

python scripts/datasources.py

2. Search for code (fast, cheap)

python scripts/search.py "JWT token validation" my-backend
python scripts/search.py "authentication flow" my-repo --path src/auth --ext .py
python scripts/grep.py "AuthService" my-repo
python scripts/grep.py "auth\\(" my-repo --regex

3. Fetch full content (for external repos)

python scripts/fetch.py "my-org/backend::src/auth.py::AuthService.login()"

4. Drill into an artifact's relationships (optional)

# Full call graph (default)
python scripts/relationships.py "my-org/backend::src/auth.py::AuthService.login()"

# Inheritance hierarchy for a class
python scripts/relationships.py "my-org/backend::src/models.py::User" --profile inheritanceOnly

# Calls + inheritance, raise the per-type cap
python scripts/relationships.py "my-org/backend::src/svc.py::Service" --profile allRelevant --max-count 200

5. Chat with codebase (not recommended — only if user explicitly asks)

python scripts/chat.py "Explain the authentication flow" my-backend
python scripts/chat.py "What about security considerations?" --continue CONV_ID

Do not call chat unless the user explicitly asks for it. Use search, grep, fetch, and relationships for all other tasks.

Tool Reference

datasources.py — List Data Sources

python scripts/datasources.py              # Ready-to-use sources
python scripts/datasources.py --all        # All (including processing)
python scripts/datasources.py --json       # JSON output

search.py — Semantic Code Search (default discovery tool)

The default starting point. Finds code by WHAT it does — concepts, behavior, architecture — not by exact text. Use when you can describe what you're looking for but don't know the exact names in the codebase.

python scripts/search.py <query> <data_sources...> [options]

| Option | Description |
| --- | --- |
| --max-results N | Optional cap for the number of returned artifacts |
| --path PATH | Repo-relative path or directory scope (repeatable) |
| --ext EXT | File extension scope such as .py or .ts (repeatable) |

description is a triage pointer ONLY — it tells you which artifacts are worth a closer look. It is NOT the source of truth and you must NOT draw conclusions from it. For every result you consider relevant, load the real source: use fetch.py <identifier> for external repos, or your editor's file-read tool on the path for repos in the current working directory. Treat only that real content as ground truth.
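When search.py is driven from another Python script, the repeatable --path and --ext options can be assembled as an argument list. This is a minimal sketch: build_search_argv is a hypothetical helper, and it assumes the scripts live under scripts/ as in the Quick Start.

```python
def build_search_argv(query, data_sources, paths=(), exts=(), max_results=None):
    """Build an argv list for search.py; repeatable flags are emitted once per value."""
    argv = ["python", "scripts/search.py", query, *data_sources]
    if max_results is not None:
        argv += ["--max-results", str(max_results)]
    for path in paths:
        argv += ["--path", path]
    for ext in exts:
        argv += ["--ext", ext]
    return argv
```

The resulting list can be passed to subprocess.run without shell quoting, which avoids escaping issues in queries that contain spaces or quotes.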

grep.py — Exact Text / Regex Search

Finds code containing a specific string or regex pattern. Use when you know the exact text to look for: identifiers, error messages, config keys, URLs, domain events, import paths, TODO comments.

python scripts/grep.py <query> <data_sources...> [--regex] [--max-results N] [--path PATH] [--ext EXT]

| Option | Description |
| --- | --- |
| --regex | Interpret the query as a regex pattern |
| --max-results N | Optional cap for the number of returned artifacts |
| --path PATH | Repo-relative path or directory scope (repeatable) |
| --ext EXT | File extension scope such as .py or .ts (repeatable) |

Line previews are still search evidence, not source of truth. Use fetch.py or your local file-read tool before drawing conclusions about behavior.

fetch.py — Fetch Artifact Content

Retrieves the full source code content for artifacts found via search. Use this for external repositories you cannot access locally.

python scripts/fetch.py <identifier1> [identifier2...]

| Constraint | Value |
| --- | --- |
| Max identifiers per request | 20 |
| Identifiers source | identifier field from search results |
| Identifier format | {owner/repo}::{path}::{symbol} (symbols), {owner/repo}::{path} (files) |

For function-like artifacts the response includes a small relationships preview (up to 3 outgoing/incoming calls per direction). To see the full call graph, inheritance, or references, run relationships.py with the artifact's identifier.
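The identifier format and the 20-per-request limit can be handled with two small helpers. Both function names are hypothetical, and splitting at most twice is an assumption made so that a symbol containing "::" stays intact.

```python
def parse_identifier(identifier):
    """Split '{owner/repo}::{path}' or '{owner/repo}::{path}::{symbol}' into parts."""
    # maxsplit=2 keeps any '::' inside the symbol part untouched (an assumption).
    parts = identifier.split("::", 2)
    if len(parts) < 2:
        raise ValueError(f"not a valid identifier: {identifier!r}")
    repo, path = parts[0], parts[1]
    symbol = parts[2] if len(parts) == 3 else None
    return {"repo": repo, "path": path, "symbol": symbol}


def batch_identifiers(identifiers, batch_size=20):
    """Chunk identifiers so each fetch.py call stays within the per-request limit."""
    return [identifiers[i:i + batch_size] for i in range(0, len(identifiers), batch_size)]
```

Each batch then maps to one invocation of python scripts/fetch.py with the batch's identifiers as arguments.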

relationships.py — Drill into an Artifact's Relationship Graph

Returns the full call graph (incoming/outgoing calls), inheritance hierarchy (ancestors/descendants), or symbol references for a single artifact. This is the drill-down tool — use it AFTER search.py or fetch.py once you have an identifier and want to understand how the artifact relates to the rest of the codebase.

python scripts/relationships.py <identifier> [--profile PROFILE] [--max-count N]

| Option | Description |
| --- | --- |
| --profile callsOnly | Default. Outgoing + incoming calls |
| --profile inheritanceOnly | Ancestors + descendants |
| --profile allRelevant | Calls + inheritance (4 groups) |
| --profile referencesOnly | Symbol references |
| --max-count N | Max related artifacts per relationship type (1–1000, default 50) |
| --json | Emit the raw JSON response instead of the formatted view |
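The documented profiles and the --max-count range can be checked before the script is invoked. The helper below is a sketch with a hypothetical name; relationships.py may perform its own validation, and the scripts/ path follows the Quick Start.

```python
# Profile names as listed in the options for relationships.py.
VALID_PROFILES = ("callsOnly", "inheritanceOnly", "allRelevant", "referencesOnly")


def build_relationships_argv(identifier, profile="callsOnly", max_count=50):
    """Build an argv list for relationships.py, rejecting out-of-range options."""
    if profile not in VALID_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    if not 1 <= max_count <= 1000:
        raise ValueError("max_count must be between 1 and 1000")
    return [
        "python", "scripts/relationships.py", identifier,
        "--profile", profile,
        "--max-count", str(max_count),
    ]
```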

When this adds value vs the fetch preview:

  • You need all incoming callers (including tests) — the fetch preview caps at 3 per direction
  • You need the inheritance tree (--profile inheritanceOnly) — preview doesn't include ancestors/descendants
  • You need symbol references (--profile referencesOnly) — preview doesn't include references
  • The artifact is too large to fetch into context

When it's usually redundant: you already ran fetch.py on a small artifact that fits in context. The outgoing calls you need are either in the source you just read or in the preview's 3-cap — reach for relationships.py only when you specifically need incoming calls, inheritance, or references.

Noise caveat: outgoing calls occasionally include compiler-generated helpers (MoveNext, GetEnumerator, closure invocations) for methods using foreach/LINQ. These are analyzer artifacts — ignore outgoing hits that don't match the artifact's real logic.

chat.py — Chat with Codebase (not recommended)

Do NOT call unless the user explicitly asks (e.g. "use chat", "use codebase_consultant", "call the chat tool"). Phrases like "ask CodeAlive" or "search CodeAlive" refer to search tools, not chat.

Sends your question to an AI consultant that has full context of the indexed codebase. Returns synthesized, ready-to-use answers. Supports conversation continuity for follow-ups.

This is slow and expensive — runs an LLM on the server side, up to 30 seconds per call. For all standard tasks (finding code, understanding architecture, debugging), use search.py, grep.py, fetch.py, and relationships.py instead.

python scripts/chat.py <question> <data_sources...> [options]

| Option | Description |
| --- | --- |
| --continue <id> | Continue a previous conversation (saves context and cost) |

Conversation continuity: Every response includes a conversation_id. Pass it with --continue for follow-up questions — this preserves context and is cheaper than starting fresh.

Data Sources

Repository — single codebase, for targeted searches:

python scripts/search.py "query" my-backend-api

Workspace — multiple repos, for cross-project patterns:

python scripts/search.py "query" workspace:backend-team

Multiple repositories:

python scripts/search.py "query" repo-a repo-b repo-c

Configuration

Prerequisites

  • Python 3.8+ (no third-party packages required — uses only stdlib)

API Key Setup

The skill needs a CodeAlive API key. Resolution order:

  1. CODEALIVE_API_KEY environment variable
  2. OS credential store (macOS Keychain / Linux secret-tool / Windows Credential Manager)
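The resolution order above can be pictured as follows. This is a sketch, not the skill's actual code: resolve_api_key and the credential_store_lookup callable are hypothetical, and the service name codealive-api-key is taken from the platform commands in this section.

```python
import os


def resolve_api_key(environ=None, credential_store_lookup=None):
    """Return the API key from the environment first, then the OS credential store.

    credential_store_lookup is a hypothetical callable that takes a service
    name and returns the stored secret, or None when nothing is stored.
    """
    environ = os.environ if environ is None else environ
    key = environ.get("CODEALIVE_API_KEY")
    if key:
        return key
    if credential_store_lookup is not None:
        return credential_store_lookup("codealive-api-key") or None
    return None
```

Passing the environment and the lookup as parameters keeps the order easy to test without touching a real keychain.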

Environment variable (all platforms):

export CODEALIVE_API_KEY="your_key_here"

macOS Keychain:

security add-generic-password -a "$USER" -s "codealive-api-key" -w "YOUR_API_KEY"

Linux (freedesktop secret-tool):

secret-tool store --label="CodeAlive API Key" service codealive-api-key

Windows Credential Manager:

cmdkey /generic:codealive-api-key /user:codealive /pass:"YOUR_API_KEY"

Base URL (optional, defaults to https://app.codealive.ai):

export CODEALIVE_BASE_URL="https://your-instance.example.com"

For self-hosted CodeAlive, use your deployment origin. https://your-instance.example.com is preferred, but https://your-instance.example.com/api is also accepted and normalized automatically.
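The normalization described here could be as simple as trimming a trailing /api segment. How the scripts actually do it is not documented in this section, so the function below is an assumption for illustration only.

```python
def normalize_base_url(raw="https://app.codealive.ai"):
    """Reduce a configured base URL to the deployment origin.

    Hypothetical sketch: strips surrounding whitespace, trailing slashes,
    and a single trailing '/api' segment.
    """
    url = raw.strip().rstrip("/")
    if url.endswith("/api"):
        url = url[: -len("/api")]
    return url
```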

Get API keys at: https://app.codealive.ai/settings/api-keys

Using with CodeAlive MCP Server

This skill works standalone, but delivers the best experience when combined with the CodeAlive MCP server. The MCP server provides direct tool access via the Model Context Protocol, while this skill provides the workflow knowledge and query patterns to use those tools effectively.

| Component | What it provides |
| --- | --- |
| This skill | Query patterns, workflow guidance, cost-aware tool selection |
| MCP server | Direct semantic_search, grep_search, fetch_artifacts, get_artifact_relationships, get_data_sources tools via MCP protocol |

When both are installed, prefer the MCP server's tools for direct operations and this skill's scripts for guided workflows.

Detailed Guides

For advanced usage, see reference files:

  • Query Patterns — effective query writing, anti-patterns, language-specific examples
  • Workflows — step-by-step workflows for onboarding, debugging, feature planning, and more