Opengrep - Open Source Code Security Engine

What is Opengrep?

Opengrep is a fork of Semgrep CE (Community Edition), launched in early 2025 by a consortium including JIT, Aikido Security, Endor Labs, and other companies. It was created in response to Semgrep's licensing changes that restricted community-contributed rules from being used in commercial products.

Key Differences from Semgrep:

Fully open-source rules (no license restrictions)
Community-driven governance
No proprietary feature lock-in
Compatible with Semgrep CE syntax and rules
Focused on keeping critical features open
Commercial integration friendly

When to Use Opengrep

Ideal scenarios:

Quick security scans (minutes, not hours)
Pattern-based vulnerability detection
Using community rules without license concerns
Commercial product integration requiring open-source SAST
Dataflow and taint analysis within files
Multi-language security scanning
First-pass security analysis before deeper tools
When Semgrep licensing is a concern

Consider CodeQL instead when:

Need interprocedural taint tracking across files
Complex data flow analysis across modules required
Analyzing custom proprietary frameworks with deep integration

When NOT to Use

Do NOT use this skill for:

Complex cross-file data flow analysis (use CodeQL)
Binary or compiled code analysis without source
Deep semantic analysis requiring full program analysis
Runtime vulnerability detection
Secrets scanning (use Gitleaks)
Dependency scanning (use OSV-Scanner or Depscan)

Installation

# Homebrew
brew install opengrep

# pip
pip install opengrep

# pipx (recommended)
pipx install opengrep

# Docker
docker pull ghcr.io/opengrep/opengrep:latest

# From source
git clone https://github.com/opengrep/opengrep.git
cd opengrep
pip install -e .

# Verify
opengrep --version

Core Workflow

1. Quick Scan

# Auto scan with default rules
opengrep scan .

# Scan with specific ruleset
opengrep scan -f p/security-audit .

# Multiple rulesets
opengrep scan -f p/owasp-top-ten -f p/cwe-top-25 .

2. SARIF Output

# Generate SARIF report
opengrep scan --sarif -o results.sarif .

# SARIF with specific rules
opengrep scan -f p/security-audit --sarif -o results.sarif .

# Filter by severity in SARIF
opengrep scan \
  --severity=WARNING \
  --severity=ERROR \
  --sarif \
  -o results.sarif \
  .

3. Advanced Scanning

# Enable dataflow traces
opengrep scan --dataflow-traces .

# Taint analysis (intra-file)
opengrep scan --taint-intrafile .

# Experimental features
opengrep scan --experimental .

# Combined: dataflow + taint + experimental
opengrep scan \
  --dataflow-traces \
  --taint-intrafile \
  --experimental \
  .

4. Custom Rules

# Local rule files
opengrep scan -f /path/to/rules .

# Multiple rule directories
opengrep scan -f ./rules -f ./custom-rules .

# Exclude specific rules
opengrep scan \
  -f p/security-audit \
  --exclude-rule="rule-id-to-skip" \
  .

Rulesets

Public Rulesets

Ruleset	Description
`p/default`	General security and code quality
`p/security-audit`	Comprehensive security rules
`p/owasp-top-ten`	OWASP Top 10 vulnerabilities
`p/cwe-top-25`	CWE Top 25 vulnerabilities
`p/trailofbits`	Trail of Bits security rules
`p/python`	Python-specific security
`p/javascript`	JavaScript/TypeScript security
`p/golang`	Go-specific security
`p/java`	Java security patterns
`p/ruby`	Ruby security patterns

Community Rules

# Clone community rules
git clone https://github.com/opengrep/opengrep-rules.git

# Use community rules
opengrep scan -f opengrep-rules/ .

# Trail of Bits rules (fully open)
git clone https://github.com/trailofbits/semgrep-rules.git
opengrep scan -f semgrep-rules/rules .

Output Formats

# Text output (default)
opengrep scan .

# SARIF (for CI/CD)
opengrep scan --sarif .

# JSON
opengrep scan --json .

# JUnit XML
opengrep scan --junit-xml .

# GitLab SAST format
opengrep scan --gitlab-sast .

# Vim quickfix
opengrep scan --vim .

# Emacs format
opengrep scan --emacs .

Configuration

.opengrepignore

Create .opengrepignore:

tests/fixtures/
**/testdata/
generated/
vendor/
node_modules/
__pycache__/
*.test.js
*.spec.ts

Project Configuration

Create .opengrep.yml:

rules:
  - id: custom-hardcoded-secret
    languages: [python, javascript]
    message: "Hardcoded secret detected"
    severity: ERROR
    pattern: |
      $VAR = "$SECRET"
    metadata:
      cwe: "CWE-798"
      owasp: "A07:2021 - Identification and Authentication Failures"

  - id: sql-injection-risk
    languages: [python]
    message: "Potential SQL injection"
    severity: ERROR
    mode: taint
    pattern-sources:
      - pattern: request.args.get(...)
    pattern-sinks:
      - pattern: cursor.execute($QUERY)
    pattern-sanitizers:
      - pattern: int(...)

exclude:
  - tests/
  - vendor/

Use config:

opengrep scan --config .opengrep.yml .

CI/CD Integration (GitHub Actions)

name: Opengrep Security Scan

on:
  push:
    branches: [main]
  pull_request:
  schedule:
    - cron: '0 0 * * 1'  # Weekly

jobs:
  opengrep:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Opengrep
        run: pip install opengrep

      - name: Run Opengrep
        run: |
          opengrep scan \
            -f p/security-audit \
            -f p/owasp-top-ten \
            --dataflow-traces \
            --taint-intrafile \
            --experimental \
            --sarif \
            -o opengrep-results.sarif \
            --severity=WARNING \
            --severity=ERROR \
            --exclude=test \
            --exclude=tests \
            .

      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: opengrep-results.sarif
          category: opengrep

      - name: Upload Results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: opengrep-results
          path: opengrep-results.sarif

Writing Custom Rules

Basic Rule Structure

rules:
  - id: dangerous-eval
    languages: [javascript, python]
    message: "Use of eval() is dangerous"
    severity: ERROR
    patterns:
      - pattern: eval($CODE)
      - pattern-not: eval("...")  # Literal strings okay

Pattern Syntax

Syntax	Description	Example
`...`	Match anything	`func(...)`
`$VAR`	Capture metavariable	`$FUNC($INPUT)`
`<... ...>`	Deep expression match	`<... user_input ...>`

Pattern Operators

Operator	Description
`pattern`	Match exact pattern
`patterns`	All must match (AND)
`pattern-either`	Any matches (OR)
`pattern-not`	Exclude matches
`pattern-inside`	Match only inside context
`pattern-not-inside`	Match only outside context
`pattern-regex`	Regex matching
`metavariable-regex`	Regex on captured value

Taint Mode

rules:
  - id: xss-vulnerability
    languages: [javascript]
    message: "User input flows to innerHTML (XSS risk)"
    severity: ERROR
    mode: taint
    pattern-sources:
      - pattern: req.query.$PARAM
      - pattern: req.body.$PARAM
    pattern-sinks:
      - pattern: $ELEMENT.innerHTML = $DATA
    pattern-sanitizers:
      - pattern: escapeHtml(...)
      - pattern: DOMPurify.sanitize(...)

Common Use Cases

1. Comprehensive Security Audit

# Multi-ruleset scan
opengrep scan \
  -f p/security-audit \
  -f p/owasp-top-ten \
  -f p/cwe-top-25 \
  --dataflow-traces \
  --experimental \
  --sarif \
  -o security-audit.sarif \
  .

2. Language-Specific Scan

# Python security
opengrep scan \
  -f p/python \
  --taint-intrafile \
  --sarif \
  -o python-security.sarif \
  ./src

# JavaScript/TypeScript security
opengrep scan \
  -f p/javascript \
  -f p/typescript \
  --dataflow-traces \
  --sarif \
  -o js-security.sarif \
  ./frontend

3. Pre-commit Hook

# Scan staged files only
git diff --cached --name-only --diff-filter=ACMR | \
  grep -E '\.(py|js|ts|go|java|rb)$' | \
  xargs opengrep scan -f p/security-audit

5. Diff Scan (Changed Files Only)

# Scan only modified files
git diff --name-only origin/main...HEAD | \
  xargs opengrep scan -f p/security-audit --sarif -o diff-scan.sarif

Suppressing False Positives

Inline Suppressions

# nosemgrep: rule-id
password = get_from_vault()

# Multiple rules
eval(safe_code)  # nosemgrep: dangerous-eval, code-injection

// nosemgrep: xss-vulnerability
element.innerHTML = sanitizedContent;

Configuration-Based Suppressions

# .opengrep.yml
exclude-rules:
  - rule-id-1
  - rule-id-2

exclude-paths:
  - tests/
  - generated/

Performance Optimization

# Limit to specific file types
opengrep scan --include='*.py' --include='*.js' .

# Exclude large directories
opengrep scan --exclude=node_modules --exclude=vendor .

# Set timeout per file
opengrep scan --timeout 60 .

# Disable experimental features for speed
opengrep scan -f p/security-audit .  # No --experimental

Comparing with Semgrep

Compatibility

Opengrep maintains compatibility with Semgrep CE:

Same rule syntax (YAML)
Same pattern language
Same command-line interface
Can use Semgrep rules directly

Key Differences

Feature	Opengrep	Semgrep CE
License	LGPL 2.1 (fully open)	LGPL 2.1 (engine), restrictive rules
Rules	Fully open, no restrictions	Community rules have usage restrictions
Governance	Community consortium	r2c/Semgrep Inc.
Commercial Use	Unrestricted	Restricted for community rules
Pro Features	Being migrated to open	Proprietary
Development	Community-driven	Company-driven

Migration from Semgrep

# Rules are compatible - just change binary
alias opengrep=semgrep  # For testing
opengrep scan -f p/security-audit .

# Update CI/CD configs
sed -i 's/semgrep/opengrep/g' .github/workflows/security.yml

Supported Languages

Web: JavaScript, TypeScript, JSX, TSX
Backend: Python, Go, Java, Kotlin, Scala
Systems: C, C++, Rust
Mobile: Swift, Kotlin, Java
Scripting: Ruby, PHP, Bash, Lua
Infrastructure: Terraform, Dockerfile, YAML, JSON
Other: C#, Elixir, Solidity, Apex

Limitations

Intra-file taint only: Cross-file dataflow requires CodeQL
Pattern-based: Can't understand complex program semantics
No runtime analysis: Static analysis only
Performance: Large codebases may be slow with all features enabled
Experimental features: May have bugs or incomplete coverage

Rationalizations to Reject

Shortcut	Why It's Wrong
"Opengrep found nothing = code is secure"	Pattern-based analysis can miss context-specific vulnerabilities
"Just use default rules"	Default rules are generic; custom rules for your stack are essential
"Skip dataflow/taint analysis for speed"	These features catch vulnerabilities simple patterns miss
"Semgrep and Opengrep are identical"	Licensing differences matter for commercial use; feature sets diverging
"Don't need both Opengrep and CodeQL"	Complementary: Opengrep is fast/broad, CodeQL is deep/precise

References

Repository: https://github.com/opengrep/opengrep
Website: https://www.opengrep.dev/
Rules Repository: https://github.com/opengrep/opengrep-rules
Documentation: https://www.opengrep.dev/docs/
Trail of Bits Rules: https://github.com/trailofbits/semgrep-rules
Comparison with Semgrep: https://semgrep.dev/docs/faq/comparisons/opengrep
Launch Announcement: https://www.aikido.dev/blog/launching-opengrep-why-we-forked-semgrep

Articles:

opengrep