deepsec Vulnerability Scanner

Skill by ara.so — Daily 2026 Skills collection.

deepsec is an agent-powered vulnerability scanner that uses coding agents (Claude, Codex) at maximum thinking levels to surface hard-to-find security issues in large codebases. It uses regex matchers to find candidate sites, then AI to deeply investigate each one, producing actionable findings with severity ratings.

Installation

Navigate to the root of the repo you want to scan:

npx deepsec init       # creates .deepsec/ directory with project scaffolding
cd .deepsec
pnpm install           # installs deepsec from npm

After init, bootstrap the installation by prompting your coding agent:

Read .deepsec/node_modules/deepsec/SKILL.md to understand the tool. Then read .deepsec/data/<id>/SETUP.md and follow it: skim this repo's README, any AGENTS.md/CLAUDE.md, and a handful of representative code files, then replace each section of .deepsec/data/<id>/INFO.md. Keep it SHORT — target 50–100 lines total.

AI Provider Setup

Vercel AI Gateway (recommended for large scans)

export AI_GATEWAY_API_KEY=vck_...

One key covers both Claude and Codex. Get a key from the Vercel dashboard.

Direct provider keys (bypasses gateway)

# Anthropic
export ANTHROPIC_AUTH_TOKEN=sk-ant-...
export ANTHROPIC_BASE_URL=https://api.anthropic.com

# OpenAI
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://api.openai.com/v1

Explicitly set provider variables always take precedence over the values expanded from AI_GATEWAY_API_KEY.
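The precedence rule can be pictured with a small sketch. This is only an illustration of "explicit wins over gateway", not deepsec's actual resolution code: `resolveAnthropic`, `ResolvedCreds`, and the `gatewayBaseUrl` parameter are hypothetical names, and the gateway URL is supplied by the caller rather than assumed here.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedCreds {
  token?: string;
  baseUrl?: string;
  source: "explicit" | "gateway" | "none";
}

// Hypothetical helper: explicit provider variables take precedence,
// the gateway key is used only as a fallback.
function resolveAnthropic(env: Env, gatewayBaseUrl: string): ResolvedCreds {
  if (env.ANTHROPIC_AUTH_TOKEN) {
    return {
      token: env.ANTHROPIC_AUTH_TOKEN,
      baseUrl: env.ANTHROPIC_BASE_URL ?? "https://api.anthropic.com",
      source: "explicit",
    };
  }
  if (env.AI_GATEWAY_API_KEY) {
    return { token: env.AI_GATEWAY_API_KEY, baseUrl: gatewayBaseUrl, source: "gateway" };
  }
  return { source: "none" };
}
```

With both variables set, the sketch returns the explicit token and reports `source: "explicit"`; with only the gateway key set, it falls back to the gateway.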

Core Workflow

Run these commands from inside .deepsec/:

# Step 1: Find candidate sites with regex matchers (fast, no AI)
pnpm deepsec scan

# Step 2: AI investigation — emits findings + recommendations
pnpm deepsec process

# Step 3: Optional — re-check findings, cuts false positive rate
pnpm deepsec revalidate

# Step 4: Export findings
pnpm deepsec export --format md-dir --out ./findings

Full Command Reference

Command         What it does
scan            Regex matcher pass (fast, no AI cost)
process         AI deep-investigation of candidates
triage          Lightweight P0/P1/P2 classification (cheaper model)
revalidate      Re-check findings; checks git history for fixes
enrich          Add git committer info + ownership data
report          Markdown + JSON summary for one project
export          Per-finding JSON or directory of markdown files
metrics         Cross-project counts: severities, vulns by type, TPs
status          Snapshot of the project mirror
sandbox <cmd>   Run any command on Vercel Sandbox microVMs

Configuration

Create deepsec.config.ts in your .deepsec/ directory:

import { defineConfig } from 'deepsec';

export default defineConfig({
  projects: [
    {
      id: 'my-app',
      root: '../',          // path to repo root, relative to .deepsec/
      name: 'My Application',
    }
  ],
  // Model selection — defaults to highest capability
  model: {
    scan: 'claude-opus-4',
    triage: 'claude-haiku-4',
  },
  // Concurrency for local processing
  concurrency: 4,
});

See docs/configuration.md for the full deepsec.config.ts reference.

Writing Custom Matchers

Matchers are regex patterns that identify candidate code sites for AI investigation. Prompt your coding agent with the writing-matchers doc to grow your matcher set:

Read docs/writing-matchers.md and add matchers for [specific concern] in our codebase.

Example matcher file structure:

// .deepsec/matchers/auth.ts
import { defineMatcher } from 'deepsec';

export default defineMatcher({
  id: 'jwt-none-alg',
  description: 'JWT algorithm set to none or not verified',
  severity: 'critical',
  pattern: /jwt\.verify\s*\(|algorithm['":\s]+['"]none['"]/gi,
  fileGlobs: ['**/*.ts', '**/*.js'],
  // Context lines to include around match
  contextLines: 10,
});

// .deepsec/matchers/sql.ts
import { defineMatcher } from 'deepsec';

export default defineMatcher({
  id: 'raw-sql-interpolation',
  description: 'String interpolation directly into SQL queries',
  severity: 'high',
  // Group the verbs so the leading backtick and the `${` apply to every alternative
  pattern: /`\s*(?:SELECT|INSERT|UPDATE|DELETE)\b[^`]*\$\{/gi,
  fileGlobs: ['**/*.ts', '**/*.js', '**/*.py'],
  contextLines: 15,
  // Provide project-specific context to the AI investigator
  hint: 'Check if user-controlled input reaches query construction. Our ORM is Prisma; raw queries use prisma.$queryRaw.',
});
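Before committing a new pattern, it can help to run it against a few known-good and known-bad snippets. The sketch below is a standalone approximation of a matcher pass with context lines; `findCandidates` and `MatchSite` are hypothetical names for illustration and do not reflect deepsec's actual matcher engine.

```typescript
interface MatchSite {
  line: number;      // 1-based line number of the match
  context: string[]; // surrounding lines, including the matched line
}

// Illustrative only: test each line against the pattern and collect
// the match together with `contextLines` lines on either side.
function findCandidates(source: string, pattern: RegExp, contextLines: number): MatchSite[] {
  const lines = source.split("\n");
  const sites: MatchSite[] = [];
  lines.forEach((text, i) => {
    // Global regexes keep state between test() calls, so reset it per line.
    pattern.lastIndex = 0;
    if (pattern.test(text)) {
      const start = Math.max(0, i - contextLines);
      const end = Math.min(lines.length, i + contextLines + 1);
      sites.push({ line: i + 1, context: lines.slice(start, end) });
    }
  });
  return sites;
}

// Grouped alternation: backtick, then any SQL verb, then an interpolation.
const rawSql = /`\s*(?:SELECT|INSERT|UPDATE|DELETE)\b[^`]*\$\{/gi;

const sample = [
  "const safe = await prisma.user.findMany();",
  "const rows = await db.query(`SELECT * FROM users WHERE id = ${id}`);",
  "const label = 'UPDATE available';",
].join("\n");

const hits = findCandidates(sample, rawSql, 1);
```

Here only the second line is reported. The plain string containing `UPDATE` is skipped because the pattern requires a template literal with an interpolation, which keeps the candidate list (and the downstream AI cost) smaller.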

INFO.md: Project Context for AI

The INFO.md file is injected into every scan batch. Keep it 50–100 lines:

## Auth
- JWT issued by `lib/auth/jwt.ts` → `signToken()` / `verifyToken()`
- Session middleware: `middleware/session.ts` wraps all `/api/*` routes
- RBAC: `lib/permissions.ts` → `can(user, action, resource)`

## Data Access
- ORM: Prisma via `lib/db.ts` singleton
- Raw queries only in `lib/db/raw.ts` — uses tagged template `sql\`\``
- User input reaches DB through `services/` layer only

## External Inputs
- Webhooks: `app/api/webhooks/` — bodies parsed before signature check in v1 routes
- File uploads: `app/api/upload/` → stored in S3, filenames sanitized by `lib/storage.ts`

## Known Sensitive Areas
- `lib/crypto.ts` — key derivation, do not flag standard bcrypt usage as vuln
- `app/admin/` — intentionally privileged, verify RBAC not auth bypass

Distributed Execution (Vercel Sandbox)

Fan work across microVMs for large monorepos:

# Process using 10 sandboxes, 4 concurrent per sandbox
pnpm deepsec sandbox process \
  --project-id my-app \
  --sandboxes 10 \
  --concurrency 4

# Full distributed pipeline
pnpm deepsec sandbox scan --project-id my-app --sandboxes 5
pnpm deepsec sandbox process --project-id my-app --sandboxes 10 --concurrency 4
pnpm deepsec sandbox revalidate --project-id my-app --sandboxes 5

Requires a Vercel account. The local working tree is tarballed and uploaded (.git excluded). Supports both OIDC tokens (local) and access tokens (CI).
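To make the fan-out idea concrete, here is a minimal round-robin partition of work items across a fixed number of sandboxes. It is an assumption-laden sketch: `shard` is a hypothetical helper, and deepsec's real scheduler may split work differently (for example by file size or directory).

```typescript
// Illustrative round-robin partition; not deepsec's actual scheduling logic.
function shard<T>(items: T[], sandboxes: number): T[][] {
  if (sandboxes < 1 || Math.floor(sandboxes) !== sandboxes) {
    throw new Error("sandboxes must be a positive integer");
  }
  const shards: T[][] = [];
  for (let i = 0; i < sandboxes; i++) {
    shards.push([]);
  }
  // Item i goes to shard (i mod sandboxes), so sizes differ by at most one.
  items.forEach((item, i) => {
    shards[i % sandboxes].push(item);
  });
  return shards;
}

const files = ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts"];
const shards = shard(files, 2);
// shards[0] is ["a.ts", "c.ts", "e.ts"], shards[1] is ["b.ts", "d.ts"]
```

Within each sandbox, the `--concurrency` value then bounds how many items are investigated at once, so total parallelism is roughly sandboxes × concurrency.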

For CI environments:

export VERCEL_ACCESS_TOKEN=your_token_here
export VERCEL_TEAM_ID=team_xxx

Export Formats

# Directory of markdown files (one per finding)
pnpm deepsec export --format md-dir --out ./findings

# Single JSON file with all findings
pnpm deepsec export --format json --out ./findings.json

# Per-finding JSON files
pnpm deepsec export --format json-dir --out ./findings-json
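Exported JSON is convenient for quick post-processing. The sketch below tallies findings by severity; the `ExportedFinding` shape (a `severity` string and a `file` path) is an assumption made for illustration, so check the real export schema before relying on these field names.

```typescript
// Assumed shape: only `severity` and `file` are used; real exports may differ.
interface ExportedFinding {
  severity: string;
  file: string;
}

// Count how many findings fall under each severity label.
function countBySeverity(findings: ExportedFinding[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const f of findings) {
    counts[f.severity] = (counts[f.severity] ?? 0) + 1;
  }
  return counts;
}

const summary = countBySeverity([
  { severity: "critical", file: "src/auth/login.ts" },
  { severity: "high", file: "src/api/users.ts" },
  { severity: "high", file: "src/api/orders.ts" },
]);
// summary is { critical: 1, high: 2 }
```

In practice the input would be the parsed contents of the exported file, assuming it contains an array of findings.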

Viewing Results

# Summary report for a project
pnpm deepsec report --project-id my-app

# Cross-project metrics
pnpm deepsec metrics

# Current pipeline status
pnpm deepsec status

Idempotency and Resuming

Commands are idempotent — safe to interrupt and restart:

# If process is interrupted, just re-run — it picks up where it left off
pnpm deepsec process

# Force re-process specific files
pnpm deepsec process --force --file src/auth/login.ts
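Resumable processing of this kind usually comes down to skipping work that already has a stored result. The following is a sketch of that idea under stated assumptions: `WorkItem`, `pendingItems`, and the id-based bookkeeping are hypothetical and do not describe how deepsec tracks state internally.

```typescript
interface WorkItem {
  id: string;
  file: string;
}

// Illustrative resume logic: return items that have no stored result yet,
// plus any item whose file was explicitly forced for re-processing.
function pendingItems(items: WorkItem[], doneIds: string[], forceFiles: string[] = []): WorkItem[] {
  return items.filter(
    (item) => doneIds.indexOf(item.id) === -1 || forceFiles.indexOf(item.file) !== -1
  );
}

const queue: WorkItem[] = [
  { id: "c1", file: "src/auth/login.ts" },
  { id: "c2", file: "src/api/users.ts" },
  { id: "c3", file: "src/api/orders.ts" },
];

// After an interrupted run that finished c1 and c2, only c3 remains.
const resumed = pendingItems(queue, ["c1", "c2"]);

// Forcing a file re-queues it even though its result already exists.
const forced = pendingItems(queue, ["c1", "c2"], ["src/auth/login.ts"]);
```

Because the filter depends only on what has already been recorded, running it twice with the same inputs yields the same queue, which is what makes re-running safe.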

Plugin Authoring

// .deepsec/plugins/jira-ownership.ts
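import { definePlugin } from 'deepsec';

// Placeholder for your own lookup (not part of deepsec): map a file path to
// the owning team, e.g. by querying Jira or a CODEOWNERS-style index.
async function fetchJiraTeamForPath(path: string): Promise<string> {
  return 'unassigned'; // replace with a real lookup
}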

export default definePlugin({
  name: 'jira-ownership',
  hooks: {
    // Called during `enrich` — add owner metadata to findings
    async enrichFinding(finding) {
      const owner = await fetchJiraTeamForPath(finding.file);
      return {
        ...finding,
        metadata: { ...finding.metadata, team: owner },
      };
    },
  },
});

Register in deepsec.config.ts:

import { defineConfig } from 'deepsec';
import jiraOwnership from './plugins/jira-ownership';

export default defineConfig({
  plugins: [jiraOwnership],
  projects: [{ id: 'my-app', root: '../' }],
});

Data Layout

Key paths inside .deepsec/data/<project-id>/:

data/<id>/
  SETUP.md          # one-time agent bootstrap instructions
  INFO.md           # project context injected into every scan batch
  scan/             # FileRecord JSON from matcher pass
  findings/         # AI-produced findings (one JSON per finding)
  revalidated/      # findings after revalidation pass
  run-meta.json     # pipeline state, timestamps, model used

Common Patterns

Scan specific directories only

pnpm deepsec scan --include 'src/api/**' --include 'src/auth/**'

Focus on high-severity only during process

pnpm deepsec process --min-severity high

Triage first to prioritize expensive revalidation

pnpm deepsec process
pnpm deepsec triage          # classify P0/P1/P2 cheaply
pnpm deepsec revalidate --min-priority P0  # revalidate only P0 findings
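The effect of a minimum-priority filter can be sketched as a simple rank comparison. The types and the `atLeast` helper below are hypothetical, written only to show how P0/P1/P2 ordering narrows the set sent to the expensive pass.

```typescript
type Priority = "P0" | "P1" | "P2";

interface TriagedFinding {
  id: string;
  priority: Priority;
}

// Lower rank means more urgent.
const RANK: Record<Priority, number> = { P0: 0, P1: 1, P2: 2 };

// Keep findings whose priority is at least as urgent as `minPriority`.
function atLeast(findings: TriagedFinding[], minPriority: Priority): TriagedFinding[] {
  return findings.filter((f) => RANK[f.priority] <= RANK[minPriority]);
}

const triaged: TriagedFinding[] = [
  { id: "f1", priority: "P2" },
  { id: "f2", priority: "P0" },
  { id: "f3", priority: "P1" },
];

const onlyP0 = atLeast(triaged, "P0");   // just f2
const p0AndP1 = atLeast(triaged, "P1");  // f2 and f3
```

Starting with P0 and widening to P1 later keeps the first revalidation run small while still allowing a broader pass afterwards.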

Add git blame context to findings

pnpm deepsec enrich          # adds committer info from git log

Troubleshooting

Scan finds nothing: Check that matchers cover your file extensions. Add custom matchers for project-specific patterns (e.g., internal auth helpers, custom ORMs).

High false positive rate: Run revalidate — it re-checks findings and checks git history. Also improve INFO.md with project-specific context so the AI understands your conventions.

AI model refusals: Some models refuse certain security content. See docs/models.md for refusal handling and alternative model configuration.

Cost is too high: Run triage after process so the more expensive revalidate pass can target only high-priority findings (for example, --min-priority P0). Use --include flags to focus on high-risk directories. Switch to a cheaper model for initial passes via deepsec.config.ts.

Interrupted scan: Just re-run the same command. All commands are idempotent and resume from where they stopped.

Prompt injection concern: Run on Vercel Sandbox — API keys are injected outside the sandbox and cannot be exfiltrated. Worker sandboxes have limited network egress.

Security Model

Treat deepsec like a coding agent with full shell access. It's designed for trusted inputs (your source code). For untrusted inputs (vendored code, external deps), use Vercel Sandbox to limit exposure — API keys are injected outside sandbox scope and network egress is restricted to coding agent hosts.

Cost Expectations

  • Uses best models at maximum thinking levels
  • Large codebases can cost thousands to tens of thousands of dollars
  • Use triage (cheap model) → revalidate (targeted) to control costs
  • Distributed sandbox runs are billed per compute time on Vercel

Key Files and Docs

File                       Purpose
docs/getting-started.md    First-scan walkthrough
docs/writing-matchers.md   Prompt your agent to grow matchers
docs/configuration.md      Full deepsec.config.ts reference
docs/models.md             Model selection, defaults, refusals
docs/vercel-setup.md       AI Gateway + Sandbox keys
docs/architecture.md       Pipeline internals
docs/faq.md                Cost, FP rate, sandbox mode
samples/webapp/            Copy-paste starting points