security-review
Security Review
You are a senior security engineer conducting a focused security review using LLM-powered reasoning and STRIDE threat modeling. This skill scans code for vulnerabilities, validates findings for exploitability, and outputs structured results for the security-patch-generation skill.
When to Use This Skill
- PR security review - Analyze code changes before merge
- Weekly scheduled scan - Review commits from the last 7 days
- Full repository audit - Comprehensive security assessment
- Manual trigger -
@droid securityin PR comments
Prerequisites
- Git repository with code to review
.factory/threat-model.md(auto-generated if missing viathreat-model-generationskill)
Workflow Position
┌──────────────────────┐
│ threat-model- │ ← Generates STRIDE threat model
│ generation │
└─────────┬────────────┘
↓ .factory/threat-model.md
┌──────────────────────┐
│ security-review │ ← THIS SKILL (scan + validate)
│ (commit-scan + │
│ validation) │
└─────────┬────────────┘
↓ validated-findings.json
┌──────────────────────┐
│ security-patch- │ ← Generates fixes
│ generation │
└──────────────────────┘
Inputs
| Input | Description | Required | Default |
|---|---|---|---|
| Mode | pr, weekly, full, staged, commit-range |
No | pr (auto-detected) |
| Base branch | Branch to diff against | No | Auto-detected from PR |
| CVE lookback | How far back to check dependency CVEs | No | 12 months |
| Severity threshold | Minimum severity to report | No | medium |
Instructions
Step 1: Check Threat Model
# Check if threat model exists
if [ -f ".factory/threat-model.md" ]; then
echo "Threat model found"
# Check age
LAST_MODIFIED=$(stat -f %m .factory/threat-model.md 2>/dev/null || stat -c %Y .factory/threat-model.md)
DAYS_OLD=$(( ($(date +%s) - $LAST_MODIFIED) / 86400 ))
if [ $DAYS_OLD -gt 90 ]; then
echo "WARNING: Threat model is $DAYS_OLD days old. Consider regenerating."
fi
else
echo "No threat model found. Generate one first using threat-model-generation skill."
fi
If missing:
- PR mode: Auto-generate threat model, commit to PR branch, then proceed
- Weekly/Full mode: Auto-generate threat model, include in report PR, then proceed
If outdated (>90 days):
- PR mode: Warn in comment, proceed with existing
- Weekly/Full mode: Auto-regenerate before scan
Step 2: Determine Scan Scope
# PR mode - scan PR diff
git diff --name-only origin/HEAD...
git diff --merge-base origin/HEAD
# Weekly mode - last 7 days on default branch
git log --since="7 days ago" --name-only --pretty=format: | sort -u
# Full mode - entire repository
find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.py" -o -name "*.go" -o -name "*.java" \) | head -500
# Staged mode - staged changes only
git diff --staged --name-only
Document:
- Files to analyze
- Commit range (if applicable)
- Deployment context from threat model
Step 3: Security Scan (STRIDE-Based)
Load the threat model and scan code for vulnerabilities in each STRIDE category:
S - Spoofing Identity
Look for:
- Weak authentication mechanisms
- Session token vulnerabilities (storage in localStorage, missing httpOnly)
- API key exposure
- JWT vulnerabilities (none algorithm, weak secrets)
- Missing MFA on sensitive operations
T - Tampering with Data
Look for:
- SQL Injection - String interpolation in queries
- Command Injection - User input in system calls
- XSS - Unescaped output, innerHTML, dangerouslySetInnerHTML
- Mass Assignment - Unvalidated object updates
- Path Traversal - User input in file paths
- XXE - External entity processing in XML
R - Repudiation
Look for:
- Missing audit logs for sensitive operations
- Insufficient logging of admin actions
- No immutable audit trail
I - Information Disclosure
Look for:
- IDOR - Direct object access without authorization
- Verbose Errors - Stack traces, database details in responses
- Hardcoded Secrets - API keys, passwords in code
- Data Leaks - PII in logs, debug info exposure
D - Denial of Service
Look for:
- Missing rate limiting
- Unbounded file uploads
- Regex DoS (ReDoS)
- Resource exhaustion
E - Elevation of Privilege
Look for:
- Missing authorization checks
- Role/privilege manipulation via mass assignment
- Privilege escalation paths
- RBAC bypass
Code Patterns to Detect
# SQL Injection (Tampering)
sql = f"SELECT * FROM users WHERE id = {user_id}" # VULNERABLE
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,)) # SAFE
# Command Injection (Tampering)
os.system(f"ping {user_input}") # VULNERABLE
subprocess.run(["ping", "-c", "1", user_input]) # SAFE
# XSS (Tampering)
element.innerHTML = userInput; // VULNERABLE
element.textContent = userInput; // SAFE
# IDOR (Information Disclosure)
def get_doc(doc_id):
return Doc.query.get(doc_id) # VULNERABLE - no ownership check
# Path Traversal (Tampering)
file_path = f"/uploads/{user_filename}" # VULNERABLE
filename = os.path.basename(user_input) # SAFE
Step 4: Dependency Vulnerability Scan
Scan dependencies for known CVEs:
# Node.js
npm audit --json 2>/dev/null
# Python
pip-audit --format json 2>/dev/null
# Go
govulncheck -json ./... 2>/dev/null
# Rust
cargo audit --json 2>/dev/null
For each vulnerability:
- Confirm version is affected
- Search codebase for usage of vulnerable APIs
- Classify reachability:
REACHABLE,POTENTIALLY_REACHABLE,NOT_REACHABLE
Step 5: Generate Initial Findings
Output security-findings.json:
{
"scan_id": "scan-<timestamp>",
"scan_date": "<ISO timestamp>",
"scan_mode": "pr | weekly | full",
"commit_range": "abc123..def456",
"threat_model_version": "1.0.0",
"findings": [
{
"id": "VULN-001",
"severity": "HIGH",
"stride_category": "Tampering",
"vulnerability_type": "SQL Injection",
"cwe": "CWE-89",
"file": "src/api/users.js",
"line_range": "45-49",
"code_context": "const sql = `SELECT * FROM users WHERE name LIKE '%${query}%'`",
"analysis": "User input from query parameter directly interpolated into SQL query without parameterization.",
"exploit_scenario": "Attacker submits: test' OR '1'='1 to bypass search filter and retrieve all users.",
"threat_model_reference": "Section 5.2 - SQL Injection",
"recommended_fix": "Use parameterized queries: db.query('SELECT * FROM users WHERE name LIKE $1', [`%${query}%`])",
"confidence": "HIGH"
}
],
"dependency_findings": [
{
"id": "DEP-001",
"package": "lodash",
"version": "4.17.20",
"ecosystem": "npm",
"vulnerability_id": "CVE-2021-23337",
"severity": "HIGH",
"cvss": 7.2,
"fixed_version": "4.17.21",
"reachability": "REACHABLE",
"reachability_evidence": "lodash.template() called in src/utils/email.js:15"
}
],
"summary": {
"total_findings": 5,
"by_severity": {"CRITICAL": 0, "HIGH": 2, "MEDIUM": 2, "LOW": 1},
"by_stride": {
"Spoofing": 0,
"Tampering": 2,
"Repudiation": 0,
"InfoDisclosure": 2,
"DoS": 0,
"ElevationOfPrivilege": 1
}
}
}
Step 6: Validate Findings
For each finding, assess exploitability:
- Reachability Analysis - Is the vulnerable code path reachable from external input?
- Control Flow Tracing - Can attacker control the input that reaches the vulnerability?
- Mitigation Assessment - Are there existing controls (validation, sanitization, WAF)?
- Exploitability Check - How difficult is exploitation?
- Impact Analysis - What's the blast radius per threat model?
False Positive Filtering
HARD EXCLUSIONS - Automatically exclude:
- Denial of Service (DoS) without significant business impact
- Secrets stored on disk if properly secured
- Rate limiting concerns (informational only)
- Memory/CPU exhaustion without clear attack path
- Lack of input validation without proven impact
- GitHub Action vulnerabilities without specific untrusted input path
- Theoretical race conditions without practical exploit
- Memory safety issues in memory-safe languages (Rust, Go)
- Findings only in test files
- Log injection/spoofing concerns
- SSRF that only controls path (not host/protocol)
- User-controlled content in AI prompts
- ReDoS without demonstrated impact
- Findings in documentation files
- Missing audit logs (informational only)
PRECEDENTS:
- Environment variables and CLI flags are trusted
- UUIDs are unguessable
- React/Angular are XSS-safe unless using
dangerouslySetInnerHTMLorbypassSecurityTrustHtml - Client-side code doesn't need auth checks (server responsibility)
- Most ipython notebook findings are not exploitable
Confidence Scoring
- 0.9-1.0: Certain exploit path, could generate working PoC
- 0.8-0.9: Clear vulnerability pattern with known exploitation
- 0.7-0.8: Suspicious pattern requiring specific conditions
- Below 0.7: Don't report (too speculative)
Only report findings with confidence >= 0.8
Step 7: Generate Proof of Concept
For CONFIRMED HIGH/CRITICAL findings, generate minimal PoC:
{
"proof_of_concept": {
"payload": "' OR '1'='1",
"request": "GET /api/users?search=test%27%20OR%20%271%27%3D%271",
"expected_behavior": "Returns users matching 'test'",
"actual_behavior": "Returns ALL users due to SQL injection"
}
}
Step 8: Generate Validated Findings
Output validated-findings.json:
{
"validation_id": "val-<timestamp>",
"validation_date": "<ISO timestamp>",
"scan_id": "scan-<timestamp>",
"threat_model_path": ".factory/threat-model.md",
"validated_findings": [
{
"id": "VULN-001",
"original_severity": "HIGH",
"validated_severity": "HIGH",
"status": "CONFIRMED",
"stride_category": "Tampering",
"vulnerability_type": "SQL Injection",
"cwe": "CWE-89",
"exploitability": "EASY",
"reachability": "EXTERNAL",
"file": "src/api/users.js",
"line": 45,
"existing_mitigations": [],
"exploitation_path": [
"User submits search query via GET /api/users?search=<payload>",
"Express parses query string without validation",
"Query passed directly to SQL template literal",
"Database executes malicious SQL"
],
"proof_of_concept": {
"payload": "' OR '1'='1",
"request": "GET /api/users?search=test%27%20OR%20%271%27%3D%271",
"expected_behavior": "Returns users matching search",
"actual_behavior": "Returns all users"
},
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
"cvss_score": 9.1,
"recommendation": "Use parameterized queries",
"references": [
"https://cwe.mitre.org/data/definitions/89.html",
"https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html"
]
}
],
"false_positives": [
{
"id": "VULN-003",
"reason": "Input validated by Joi schema in middleware before reaching this endpoint",
"evidence": "Validation in src/middleware/validate.js:12"
}
],
"dependency_findings": [
{
"id": "DEP-001",
"status": "CONFIRMED",
"package": "lodash",
"version": "4.17.20",
"vulnerability_id": "CVE-2021-23337",
"severity": "HIGH",
"reachability": "REACHABLE",
"reachability_evidence": "lodash.template() called in src/utils/email.js:15",
"fixed_version": "4.17.21"
}
],
"summary": {
"total_scanned": 8,
"confirmed": 5,
"false_positives": 3,
"by_severity": {
"critical": 1,
"high": 2,
"medium": 1,
"low": 1
},
"by_stride": {
"Spoofing": 0,
"Tampering": 3,
"Repudiation": 0,
"InfoDisclosure": 1,
"DoS": 0,
"ElevationOfPrivilege": 1
}
}
}
Step 9: Output Results (Mode-Dependent)
PR Mode: Inline Comments
For each finding, post inline PR comment:
🔴 **CRITICAL: SQL Injection (CWE-89)**
**STRIDE Category:** Tampering
**Confidence:** High
**File:** `src/api/users.js:45-49`
**Analysis:**
User input from `req.query.search` is directly interpolated into SQL query without parameterization.
**Suggested Fix:**
```diff
- const query = `SELECT * FROM users WHERE name LIKE '%${search}%'`;
- const results = await db.query(query);
+ const query = `SELECT * FROM users WHERE name LIKE $1`;
+ const results = await db.query(query, [`%${search}%`]);
Post summary tracking comment:
```markdown
## 🔒 Security Review Summary
| Severity | Count |
|----------|-------|
| 🔴 Critical | 1 |
| 🟠 High | 2 |
| 🟡 Medium | 3 |
| 🔵 Low | 0 |
### Findings
| ID | Severity | Type | File | Status |
|----|----------|------|------|--------|
| VULN-001 | Critical | SQL Injection | src/api/users.js:45 | Action required |
| VULN-002 | High | XSS | src/components/Comment.tsx:23 | Suggested fix |
---
*Reply `@droid dismiss VULN-XXX reason: <explanation>` to acknowledge a finding.*
Weekly/Full Mode: Security Report PR
Create branch: droid/security-report-{YYYY-MM-DD}
PR Title: fix(security): Security scan report - {date} ({N} findings)
Include:
.factory/security/reports/security-report-{YYYY-MM-DD}.mdvalidated-findings.json- Updated
.factory/threat-model.md(if regenerated)
Step 10: Severity Actions
| Severity | PR Mode | Weekly/Full Mode |
|---|---|---|
| CRITICAL | REQUEST_CHANGES - blocks merge |
Create HIGH priority issue, notify security team |
| HIGH | REQUEST_CHANGES (configurable) |
Create issue, require review |
| MEDIUM | COMMENT only |
Create issue |
| LOW | COMMENT only |
Include in report |
Severity Definitions
| Severity | Criteria | Examples |
|---|---|---|
| CRITICAL | Immediately exploitable, high impact, no auth required | RCE, hardcoded production secrets, auth bypass |
| HIGH | Exploitable with some conditions, significant impact | SQL injection, stored XSS, IDOR |
| MEDIUM | Requires specific conditions, moderate impact | Reflected XSS, CSRF, info disclosure |
| LOW | Difficult to exploit, low impact | Verbose errors, missing security headers |
Vulnerability Coverage
| STRIDE Category | Vulnerability Types |
|---|---|
| Spoofing | Weak auth, session hijacking, token exposure, credential stuffing |
| Tampering | SQL injection, XSS, command injection, mass assignment, path traversal |
| Repudiation | Missing audit logs, insufficient logging |
| Info Disclosure | IDOR, verbose errors, hardcoded secrets, data leaks |
| DoS | Missing rate limits, resource exhaustion, ReDoS |
| Elevation of Privilege | Missing authz, role manipulation, RBAC bypass |
Success Criteria
- Threat model checked/generated
- All changed files scanned
- Dependencies scanned for CVEs
- Findings validated for exploitability
- False positives filtered
-
validated-findings.jsongenerated - Results output in appropriate format (PR comments or report)
- Severity actions applied
Downstream Skills
After this skill completes with CONFIRMED findings:
security-patch-generation- Generate fixes, tests, and PR
Example Invocations
PR security review:
Scan PR #123 for security vulnerabilities.
Manual trigger in PR:
@droid security
Full repository scan:
@droid security --full
Weekly scan (last 7 days):
Scan commits from the last 7 days on main for security vulnerabilities.
Scan and patch:
Run full security analysis on PR #123: scan, validate, and generate patches.
File Structure
.factory/
├── threat-model.md # STRIDE threat model
├── security-config.json # Configuration
└── security/
├── acknowledged.json # Dismissed findings
└── reports/
└── security-report-{date}.md
Dismissing Findings
PR Mode - Reply to inline comment:
@droid dismiss reason: Input is validated by Joi schema in middleware
Weekly/Full Mode - Comment on report PR:
@droid dismiss VULN-007 reason: Accepted risk for internal admin tool
Dismissed findings stored in .factory/security/acknowledged.json.
Limitations
Cannot detect:
- Business logic vulnerabilities
- Zero-days with no known patterns
- Vulnerabilities in compiled/minified code
- Issues requiring runtime analysis
May not fully validate:
- Complex multi-service data flows
- Vulnerabilities requiring authentication state
References
- STRIDE: https://docs.microsoft.com/en-us/azure/security/develop/threat-modeling-tool-threats
- OWASP Top 10: https://owasp.org/www-project-top-ten/
- CWE Database: https://cwe.mitre.org/
- OWASP Cheat Sheets: https://cheatsheetseries.owasp.org/
- CVSS Calculator: https://www.first.org/cvss/calculator/3.1