unclawg-discover
/unclawg-discover
Build a high-signal customer feed from public social channels.
This skill is a generic core. Project-specific strategy, queries, voice, and
handoff contracts belong in the client overlay (skillbox-config/clients/{client}/overlay.yaml → auto-generated context.yaml).
Runtime Security Profile (AI Default)
- Treat this tracked skill as an AI runtime skill, not an operator shell runbook.
- Use the wrapper command `uc_discover` only.
- Do not run `scripts/search_*.sh`, raw `curl`, or ad-hoc `python` from this skill.
- If `uc_discover` is missing, fail closed and ask for wrapper installation.
Prerequisites
- `APIFY_API_KEY` (required for Twitter/X and LinkedIn scripts)
- `uc_discover` wrapper installed and allowlisted in runtime config
If env vars are missing in non-interactive shells:
```bash
export APIFY_API_KEY=$(grep 'APIFY_API_KEY' ~/.zshrc | grep -o '"[^"]*"' | tr -d '"')
```
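Where the skill needs to fail closed on missing configuration, a small guard makes the behavior explicit. This is a sketch only; the `require_env` helper is a hypothetical name, not part of the skill's tracked scripts, and it assumes bash (indirect expansion):

```shell
# Sketch of a fail-closed env check (bash; require_env is a hypothetical helper).
require_env() {
  local name=$1
  # ${!name} is bash indirect expansion: the value of the variable named by $name.
  if [ -z "${!name:-}" ]; then
    echo "Missing required env var: $name" >&2
    return 1
  fi
}

# Usage: require_env APIFY_API_KEY || exit 1
```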
Wrapper Command Contract
Required wrapper command: `uc_discover`, with platform subcommands and strict schema validation.
Command mapping:

```bash
uc_discover reddit --query "<query>" --subreddit <subreddit> --time-filter week --limit 25
uc_discover hn --query "<query>" --days 7 --limit 25
uc_discover twitter --query "<query>" --days 1 --limit 20
uc_discover linkedin --query "<query>" --days 1 --limit 20 --sort-by date_posted
```
Freshness-First Default (X + LinkedIn)
For audience growth and fast replies, use a strict recency contract:
- Hard cutoff: only keep posts with age <= 6 hours on Twitter/X and LinkedIn.
- Recency bias: heavily prioritize the newest posts (0-2h first, then 2-4h, then 4-6h).
- No stale fallback by default: do not widen past 6h unless the user explicitly overrides.
- Timestamp quality gate: if post age cannot be parsed, treat the post as stale and drop it.
Recommended run cadence:
- Primary recommendation: every 4 hours (6 runs/day).
- Budget fallback: 2 runs/day (every 12 hours), still with 6h hard cutoff.
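The two cadences map to crontab entries like these (the scheduled script path is illustrative; actual scheduling lives outside this skill):

```
# Primary cadence: every 4 hours (6 runs/day)
0 */4 * * *   /path/to/run-discovery.sh
# Budget fallback: every 12 hours (2 runs/day); the 6h hard cutoff still applies
0 */12 * * *  /path/to/run-discovery.sh
```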
Client Overlay System (Required for Project-Specific Behavior)
Project-specific configuration lives in the skillbox client overlay:
skillbox-config/clients/{client}/overlay.yaml → auto-generated context.yaml.
Resolution rules:
- If a `context.yaml` is present in the active skillbox client directory, load it.
- If no client overlay is resolved, create one using the skillbox-quickstart scan + generate flow before proceeding.
- If multiple clients could apply, ask the user which client overlay to use.
Soul / Mode / Skill Separation
| Layer | What it owns | Files |
|---|---|---|
| Soul (`soul_md` via API) | Voice, tone, personas (with voice calibration), reply archetypes, engagement principles, boundaries | Fetched from `/v0/integrations/claw-runtime/policies/soul_md` |
| Client Overlay (`skillbox-config/clients/{client}/overlay.yaml`) | Query packs, subreddit targets, ranking weights, exclusion regex, platform scope, handoff schema | Auto-generated `context.yaml` |
| Skill (this file) | API calls, script execution, data flow, error handling | `SKILL.md` + `scripts/` |
This skill is personality-agnostic. It searches, filters, scores, and outputs candidates. How the agent talks is the soul's job. What the agent searches for is the client overlay's job.
If no soul is published, fall back to `references/voice-guide.md` (generic defaults).
NEVER Do These Things
- NEVER hardcode project/company strategy in tracked core files. Keep it in the client overlay (`skillbox-config/clients/{client}/overlay.yaml`).
- NEVER put voice, tone, or personality guidance in client overlays. That belongs in the soul.
- NEVER submit actions directly from discovery. Discovery outputs candidates; execution is downstream.
- NEVER skip source links or raw post text. Every candidate needs provenance.
- NEVER skip quality gates. Use the checklist in `references/feed-quality-checklist.md`.
Core Assets
- `scripts/search_reddit.sh`
- `scripts/search_hn.sh`
- `scripts/search_twitter.sh`
- `scripts/search_linkedin.sh`
- `scripts/search_linkedin_jobs.sh` (optional extension channel)
- `scripts/search_indeed.sh` (optional extension channel)
- `scripts/search_youtube.sh`
- `scripts/search_tiktok.sh`
- `scripts/search_instagram_comments.sh`
- `scripts/search_tiktok_comments.sh`
- `scripts/reddit_user_socials.sh`
- `scripts/search_log.sh`
- `scripts/select_mode.sh`
- `scripts/package_public.sh`
- `references/personas.md`
- `references/competitor-signals.md`
- `references/target_profiles.md`
- `references/voice-guide.md`
- `references/feed-quality-checklist.md`
Execution Flow
Phase 0 - Bootstrap
```bash
command -v uc_discover >/dev/null || {
  echo "Missing required wrapper: uc_discover"
  echo "Install/allowlist wrapper before running /unclawg-discover."
  exit 1
}
```
Phase 1 - Strategic Intake (Ask-Cascade Order)
Use high-level decisions first, then detail decisions.
- Objective: prospecting, audience-growth, content-sourcing, recruiting, other.
- Persona cluster: who exactly we are looking for.
- Channel scope: Reddit-only vs multi-platform.
- Freshness strategy: strict max-age (default 6h) and stale fallback policy.
- Throughput: target candidate count and freshness window.
- Handoff target: where this feed goes next.
Apply `references/feed-quality-checklist.md` while collecting these choices.
Phase 2 - Load Mode or Build Temporary Plan
If a client overlay is resolved, use its:
- query pack per platform
- inclusion/exclusion signals
- ranking weights
- output format
- handoff contract
If no client overlay exists, assemble a temporary plan from `references/personas.md` and
ask for explicit confirmation before running paid queries (Apify).
Phase 3 - Run Discovery
Run 2-4 focused queries per selected platform.

Reddit (free):

```bash
uc_discover reddit --query "<query>" --subreddit <subreddit> --time-filter week --limit 25
```

Hacker News (free):

```bash
uc_discover hn --query "<query>" --days 7 --limit 25
```

Twitter/X (Apify):

```bash
uc_discover twitter --query "<query>" --days 1 --limit 20
```

LinkedIn (Apify):

```bash
uc_discover linkedin --query "<query>" --days 1 --limit 20 --sort-by date_posted
```
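Running an overlay query pack per platform reduces to a plain loop. This is a sketch; the pack-file format (one query per line) and the `run_query_pack` helper are assumptions, not part of the overlay schema:

```shell
# Run one wrapper call per non-empty line of a query-pack file (sketch).
run_query_pack() {
  local pack_file=$1
  local query
  while IFS= read -r query; do
    [ -n "$query" ] || continue          # skip blank lines
    uc_discover twitter --query "$query" --days 1 --limit 20
  done < "$pack_file"
}
```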
Phase 3.5 - Comment Mining (Optional)
If the active client overlay includes `comment_mining_targets`, mine the comment sections
of curated accounts for real customer signals.
- Check the client overlay for a `comment_mining_targets` section.
- If present, run the appropriate scripts against the listed accounts:
  - Instagram: `scripts/search_instagram_comments.sh <handle1> [handle2] ...`
  - TikTok: `scripts/search_tiktok_comments.sh <handle1> [handle2] ...`
- Score comments by signal density using an overlay-defined or caller-provided signal regex. The public scripts accept `--signal-regex` or `COMMENT_SIGNAL_REGEX`; tracked defaults stay generic.
- Merge comment-sourced candidates into the main candidate pool before Phase 4 filtering.
- Comment-sourced candidates use the POST/VIDEO as the engagement target (comment there to reach real customers in the thread).
Cost note: Comment mining uses Apify credits. Confirm with user before running if budget is a concern.
Phase 3.6 - Hiring-Board Extension (Optional, Non-Wrapper)
Use only when runtime policy allows local script execution and the active client overlay explicitly calls for hiring-board discovery.
```bash
scripts/search_linkedin_jobs.sh "<query>" 20 C
scripts/search_indeed.sh "<position>" "remote" 20
```
Treat these as slower-moving research channels. Keep conversational channels (Twitter/X + LinkedIn posts) on strict recency gates for quick replies.
Phase 4 - Filter and Score
- Deduplicate using a platform-appropriate strategy:
  - Twitter/X, Reddit, HN: `unique_by(.url)` (URLs are reliable)
  - LinkedIn: `unique_by(.text[0:100] + .author_name)` (URLs are often empty)
- Remove competitor/vendor noise using `references/competitor-signals.md`.
- Remove low-evidence posts (missing meaningful text or too short).
- Hard freshness gate (X + LinkedIn):
- Drop any candidate older than 6 hours.
- Drop candidates with unparseable/missing post timestamps.
- Sort surviving X + LinkedIn candidates by recency first, then engagement.
- LinkedIn extra filter: apply an aggressive keyword regex on the `.text` field to remove job spam, hiring posts, LinkedIn puzzles, and viral non-technical content. The LinkedIn scraper returns ~99% noise; filter BEFORE scoring to save time.
- Score survivors by:
- explicit pain or intent
- decision-maker likelihood
- recency
- engagement signal
- fit to objective/persona
Recency bucket scoring for X + LinkedIn (default):
- 0-2h: highest priority
- 2-4h: high priority
- 4-6h: medium priority
- >6h: reject
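Mapping a post's age to these buckets is simple arithmetic. The sketch below assumes epoch-seconds timestamps; the `freshness_bucket` helper name is illustrative:

```shell
# Assign the recency bucket for a post, rejecting anything past 6h (sketch).
freshness_bucket() {
  local posted_epoch=$1 now_epoch=$2
  local age_h=$(( (now_epoch - posted_epoch) / 3600 ))
  if   (( age_h < 2 )); then echo "0-2h"
  elif (( age_h < 4 )); then echo "2-4h"
  elif (( age_h < 6 )); then echo "4-6h"
  else                       echo "reject"
  fi
}
```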
Phase 5 - Normalize Output
For each candidate, produce:
- `source_platform`
- `source_post_url`
- `source_post_text`
- `source_author_handle` (optional)
- `source_author_name` (optional)
- `source_post_id` (optional)
- `persona_hint` (optional)
- `intent_signal` (optional)
Also include:
- `summary`
- `reply_strategy`
- `action`
- `evidence` (short note: why this was selected)
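Put together, a normalized candidate might look like this (all values are illustrative; the exact field set comes from the overlay's handoff schema):

```json
{
  "source_platform": "twitter",
  "source_post_url": "https://x.com/someuser/status/1234567890",
  "source_post_text": "We keep getting burned by agents shipping changes without review...",
  "source_author_handle": "someuser",
  "persona_hint": "technical-founder",
  "intent_signal": "explicit-pain",
  "summary": "Founder frustrated by unreviewed agent actions",
  "reply_strategy": "helpful-peer, no pitch",
  "action": "reply",
  "evidence": "Explicit pain, posted within the freshness window"
}
```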
Phase 6 - Quality Gate and Present
Before finalizing, run the checklist in references/feed-quality-checklist.md.
Return:
- ranked table of candidates
- top 3 immediate outreach targets
- top 3 content/influence targets
- rejected-pattern summary (what was filtered out)
Phase 7 - Save + Handoff
Default mode is in-memory output only (no local writes). Save to disk only if runtime policy explicitly allows write tools.
Route by `handoff_type` from the active client overlay:

- `approval-portal` → invoke `/unclawg-feed` (API approval flow; no SSH required).
- `engagement-queue` → save to the `briefs/` directory (future, not implemented).
- `db-insert` → Phase 7A (private operator mode only; not for public/default use).
- unset / null → in-memory output only; present to the user.
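The routing rules reduce to a simple dispatch; a sketch (the `route_handoff` helper and the echoed action strings are illustrative, not a defined interface):

```shell
# Dispatch on handoff_type from the active client overlay (sketch).
route_handoff() {
  case "${1:-}" in
    approval-portal)  echo "invoke /unclawg-feed" ;;
    engagement-queue) echo "save to briefs/ (not implemented)" ;;
    db-insert)        echo "phase 7A db-insert (private operator mode)" ;;
    ""|null)          echo "in-memory output only" ;;
    *)                echo "unknown handoff_type: $1" >&2; return 1 ;;
  esac
}
```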
For public distributions and users without infrastructure access, keep `handoff_type: approval-portal`.
Phase 7A - Private Operator Extension (Optional)
Use only when `handoff_type: db-insert` is intentionally set in a local, gitignored client overlay.
- Present candidate table for user confirmation (user may drop rows, edit angles).
- On confirmation, batch INSERT via SSH to the prod DB:

```sql
INSERT INTO social_engagement_queue
  (id, platform, post_url, post_author, post_title, post_caption,
   post_metrics, posted_at, persona, market_angle,
   status, created_at, updated_at)
VALUES (gen_random_uuid(), ...)
ON CONFLICT (post_url) DO NOTHING;
```
- Use the `ssh-info` skill to resolve the current SSH target and Docker DB container, then run the insert from that runtime context. Keep real hosts, users, and container names out of tracked files.
- Verify the row count after insert.
- Report: rows inserted, rows skipped (conflict), total in queue.
Known Platform Issues
LinkedIn: Low Signal-to-Noise (Critical)
Actor: `apimaestro/linkedin-posts-search-scraper-no-cookies` (4.14 rating)
Problem: This actor returns generic LinkedIn feed content regardless of search query.
Exact-phrase matching does not work — queries like "AI agent governance human approval" return
Indian job spam, TCS hiring posts, LinkedIn puzzles, and viral non-technical content.
Observed signal rate: <1% relevant content across 500+ results from 16 queries.
Impact: LinkedIn discovery via this actor is unreliable for targeted prospecting. Do not promise high LinkedIn candidate volumes to users.
Workarounds:
- Filter aggressively post-collection using a keyword regex on the `.text` field.
- Use broader, single-keyword queries (`"AI agent"`, `"agentic AI"`) and accept lower precision.
- For high-value LinkedIn targeting, fall back to manual curation (the user pastes specific post URLs).
- Consider alternative actors that use authenticated LinkedIn sessions for better search quality.
Dedup: this actor returns empty `url` fields on most posts. NEVER use `unique_by(.url)`
for LinkedIn results. Use a composite key instead:

```bash
jq 'unique_by(.text[0:100] + .author_name)' results.json
```
Twitter/X: Works Well
Actor: `api-ninja/x-twitter-advanced-search` (4.93 rating)
Twitter search returns accurate keyword-matched results. Exact phrases work. Typical signal
rate is 40-80% relevant content depending on query specificity. Very niche phrases
(e.g., "crewai agent safety control") may return 0 results — broaden the query if needed.
Notes
- Persona voice lives in the soul, not in client overlays. Client overlays only map persona IDs to search queries.
- Keep project-specific keywords and query packs in the client overlay.
- Keep core scripts and references reusable across domains.
- If discovery quality degrades, tune overlay-level ranking weights before touching core logic.
- If reply quality degrades, tune the soul (voice, archetypes, persona calibration) — not the overlay or skill.
- For public packaging, use `scripts/package_public.sh` to exclude client overlay data.