Signal Scanner

Scheduled scanner that detects buying signals on TAM companies and watchlist personas, writes them to the signals table, and sets up downstream activation.

When to Use

After TAM Builder has populated companies and personas
As a recurring scan (daily/weekly) to detect timing-based outreach triggers
When you need to move from static lists to intent-driven outreach

Prerequisites

SUPABASE_URL + SUPABASE_SERVICE_ROLE_KEY in .env
APIFY_TOKEN in .env (for Phase 2 signals)
ANTHROPIC_API_KEY in .env (optional, for LLM content analysis)
TAM companies populated via tam-builder
Watchlist personas created for Tier 1-2 companies

Signal Types

Priority	Signal	Level	Source	Cost
P0	Headcount growth (>10% in 90d)	Company	Data diffs	Free
P0	Tech stack changes	Company	Data diffs	Free
P0	Funding round	Company	Data diffs	Free
P0	Job posting for relevant roles	Company	Apify linkedin-job-search	~$0.001/job
P1	Leadership job change	Person	Apify linkedin-profile-scraper	~$3/1k
P1	LinkedIn content analysis	Person	Apify linkedin-profile-posts + LLM	~$2/1k + LLM
P1	LinkedIn profile updates	Person	Apify linkedin-profile-scraper	~$3/1k
P2	New C-suite hire	Company	Derived from person scans	Free

Config Format

See configs/example.json for full schema. Key sections:

client_name — which client's TAM to scan
signals.* — enable/disable each signal type with thresholds
scan_scope — filter by tier, status, lead_status

Database Write Policy

CRITICAL: Never write signals or update lead statuses without explicit user approval.

The signal scanner writes to multiple tables: signals (insert), enrichment_log (insert), companies (patch snapshots), and people (patch lead_status). These writes affect downstream outreach decisions — bad signals lead to bad outreach timing.

Required flow:

Always run --dry-run first to detect signals without writing to the database
Present the dry-run results to the user: signal count, types, top signals, affected companies/people
Get explicit user approval before running without --dry-run
Only then run the actual scan that writes to the database

Why this matters:

Signals drive outreach timing — incorrect signals trigger premature outreach
lead_status changes from monitoring to signal_detected are hard to undo across many records
Snapshot updates affect future signal diffs — bad snapshots cascade into future scans
Enrichment log entries track Apify credit spend

The agent must NEVER pass --yes on a first run. The --yes flag is only for pre-approved scheduled scans where the user has already validated the signal detection logic.

Usage

# Dry run first (ALWAYS DO THIS) — detect signals without writing to DB
python skills/capabilities/signal-scanner/scripts/signal_scanner.py \
  --config skills/capabilities/signal-scanner/configs/my-client.json --dry-run

# Full scan (only after user reviews dry-run results and approves)
python skills/capabilities/signal-scanner/scripts/signal_scanner.py \
  --config skills/capabilities/signal-scanner/configs/my-client.json

# Test mode (5 companies max)
python skills/capabilities/signal-scanner/scripts/signal_scanner.py \
  --config configs/example.json --test --dry-run

# Free signals only (skip Apify)
# Set all Apify signals to enabled: false in config

Flags

Flag	Effect
`--config PATH`	Path to config JSON (required)
`--test`	Limit to 5 companies, 3 people
`--yes`	Auto-confirm Apify cost prompts. Only use for pre-approved scheduled scans.
`--dry-run`	Detect signals but don't write to DB. Always run this first.
`--max-runs N`	Override Apify run limit (default 50)

Output

Signals table writes

Each signal includes: client_name, company_id, person_id, signal_level (company or person), signal_type, signal_source, strength, signal_data (JSON), activation_score, detected_at, acted_on, run_id.

Other database writes

Person lead_status updated to signal_detected when activation_score >= threshold
Company metadata._signal_snapshot updated for next diff cycle
Person raw_data._signal_snapshot updated for next diff cycle
enrichment_log entries with tool='apify', action='search' or 'enrich', plus credits_used

Console output

Summary stats printed to stdout

Activation Score

activation_score = strength * recency_multiplier * account_fit

Recency:   <24h = 1.5, 1-3d = 1.2, 3-7d = 1.0, 1-2w = 0.8, 2-4w = 0.5
Account:   Tier 1 = 1.3, Tier 2 = 1.0, Tier 3 = 0.7

Connects To

Upstream: tam-builder (provides companies + people)
Downstream: cold-email-outreach (acts on signals)

File Structure

signal-scanner/
├── SKILL.md
├── configs/
│   └── example.json
└── scripts/
    └── signal_scanner.py