GEO audit report (Bright Data → dashboard)

You help the user:

  1. define prompts that matter for their business,
  2. collect results from major LLM experiences (ChatGPT / Perplexity / optionally Gemini),
  3. generate a dashboard-ready audit package with explicit JSON artifacts,
  4. review the results and turn them into actionable insights.

The Bright Data API is mandatory for automated collection.

Shared context first

Before asking repeated discovery questions, check whether .agents/seo-context.md exists.

If it does:

  • read it first
  • reuse the saved site, market, competitor, and content-priority context
  • ask only for GEO-specific gaps that are still missing

Inputs to collect

  • Brand/site:
    • target domains (ex: example.com)
    • brand terms (ex: product name, founder name)
    • competitor domains (optional)
  • Prompt list (2-10 to start)
  • Country / language

Execution phases

Treat the workflow as four distinct phases. Do not blur them together.

Phase 1: prompt set

  • find or create the prompt list
  • validate prompt coverage before running anything
  • save or confirm the prompt file path

Phase 2: audit run

  • run the Bright Data collector
  • produce the dated audit folder
  • do not start editing the dashboard template yet

Phase 3: audit package

  • verify that the JSON artifacts are complete
  • prepare the dashboard inputs
  • only then connect the run to the React template

Phase 4: analysis

  • analyze the run
  • summarize visibility, citations, search-trigger behavior, and fan-out
  • add optional manual recommendations

If a run failed or the JSON is incomplete, stop in Phase 2 or 3 and say so clearly instead of pretending the dashboard is ready.

How to gather prompts with the user

Do not assume the user already has a finished prompt list. In most cases, help them find one.

Prompt preflight

Before generating new prompts:

  1. Check whether a prompt file already exists in the workspace.
  2. Prefer likely files such as:
    • a user-provided prompts file path
    • prompts.txt in the current project
    • geo-audit-report/test-prompts.txt
    • prompt/tracking files from earlier GEO runs if the project has them
  3. If you find an existing prompts file, read it first and summarize it briefly:
    • number of prompts
    • a few representative examples
    • whether it looks broad, branded, competitor-led, or JTBD-focused
  4. Ask the user whether they want to:
    • run the audit on those prompts as-is
    • refine that list
    • replace it with a new list

Do not jump straight into generating a fresh prompt set if a plausible existing file is already present.

Prompt discovery workflow

If the user does not already have a strong prompt list, build one with them from business context.

Collect or infer:

  • the main product or service
  • the core jobs-to-be-done
  • the target audience or persona
  • key alternatives / competitors
  • the site sections that matter most
  • any geographic context that changes recommendations

Then gather direction from available materials. Prefer concrete inputs over guessing:

  • visit the main website
  • inspect the homepage and core product/service pages
  • inspect the blog, resources, docs, integrations, use-case pages, and comparison pages if they exist
  • look for sitemap files, article lists, navigation labels, collections, and category pages
  • use headings, CTAs, repeated claims, and topic clusters to identify high-intent themes

Useful sources of prompt ideas:

  • homepage positioning and category language
  • product or feature pages
  • use-case and industry pages
  • competitor comparison pages
  • blog posts and article titles
  • FAQ sections
  • templates, tools, calculators, and lead magnets
  • integration pages

What good prompts should look like

Prompts should sound like real user intent, not internal marketing copy.

Aim for a mix of:

  • category discovery prompts
  • best-tool / ranking prompts
  • alternatives / comparison prompts
  • workflow or problem-solving prompts
  • no-code / privacy / compliance / localization prompts when relevant
  • prompts tied to the user's actual content themes

The list should usually contain 2-10 prompts to start, but the important part is coverage, not round numbers.

How to turn site content into prompts

Translate content themes into natural-language prompts a buyer or user might ask:

  • category page -> "best [category] tool for [persona]"
  • workflow page -> "how to [job to be done] with [constraints]"
  • comparison page -> "alternatives to [competitor] for [use case]"
  • blog cluster -> "how to solve [pain point]" / "tools for [problem]"
  • feature page -> "tool that lets me [specific outcome]"
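
For example, translating themes from a hypothetical scheduling product's site (the brand, competitors, and integrations here are purely illustrative, not from any real audit) could yield a small prompts.txt like this, one prompt per line:

best meeting scheduling tool for remote teams
how to automate interview scheduling across time zones
alternatives to Calendly for small agencies
tools for reducing no-shows on sales calls
privacy-friendly scheduling software for EU companies
scheduling tool that integrates with HubSpot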

When useful, generate prompt clusters around:

  • best / top / rank / compare
  • alternatives / versus
  • how to / set up / automate
  • privacy-safe / no-code / local-first / global-friendly
  • timezone-aware / multilingual / regional requirements

Prompt quality bar

Avoid:

  • branded prompts only
  • vague one-word prompts
  • prompts that merely restate the company tagline
  • prompts disconnected from pages the site already has or wants to win on

Prefer prompts that:

  • map to an actual business outcome
  • map to existing or desired pages/content
  • reveal whether the brand gets cited, mentioned, or ignored
  • help uncover fan-out opportunities and missing content

Prompt coverage guide

When helping the user shape the prompt set, try to guide it toward a healthy mix across:

  • category prompts
  • competitor / alternative prompts
  • workflow / JTBD prompts
  • buyer-evaluation or ranking prompts
  • brand-adjacent prompts tied to actual product strengths

This is not a rigid checklist. The goal is to help the user find a set that reflects how buyers actually search, while avoiding a list that is too concentrated in only one angle.

Required tools

  • BRIGHTDATA_API_KEY env var
  • Bright Data dataset IDs for each chatbot you plan to run

Tooling & credentials

  • Auth mode: env
  • Requires: BRIGHTDATA_API_KEY
  • Fallback: none for automated collection
  • If missing: stop, ask the user to set the env var locally, and continue only after they confirm it is ready

Follow the shared setup and missing-access rules in docs/credentials-and-tooling.md.

Dataset IDs (always the same)

CHATGPT_DATASET_ID = "gd_m7aof0k82r803d5bjm"
PERPLEXITY_DATASET_ID = "gd_m7dhdot1vw9a7gc1n"
GEMINI_DATASET_ID = "gd_mbz66arm2mf9cu856y"

IMPORTANT: Before running the script, ask the user which chatbots they want to run:

  • ChatGPT
  • Perplexity
  • Gemini

They can select one, two, or all three. Only pass the dataset IDs for the selected chatbots to the script.

API key handling

Before running the script, verify only whether BRIGHTDATA_API_KEY is set:

if [ -z "$BRIGHTDATA_API_KEY" ]; then echo "Not set"; else echo "Set"; fi

If it is missing, ask the user to set it locally in their own terminal:

export BRIGHTDATA_API_KEY="your-key-here"

Do not ask the user to paste the key in chat, and never print the key value.

Collection script (Python)

Use:

  • geo-audit-report/scripts/brightdata-geo.py

It:

  • triggers selected datasets up front (part 1),
  • polls pending snapshots together (part 2),
  • downloads snapshots,
  • saves results to results.json with both response-level details and summary metrics.

Important implementation notes:

  • Do not force ChatGPT web search on or off in the input payload. Let the model decide naturally so the audit stays as close as possible to a real user experience.
  • Query fan-out coverage depends on what each provider exposes through Bright Data. ChatGPT currently returns the richest query-level search data.
  • Always preserve and surface whether search was actually triggered per response. Missing citations on a no-search response should be interpreted differently from missing citations on a searched response.
  • The collector should be resumable in a lightweight way: reuse existing snapshot metadata when available, skip raw redownloads unless forced, and write results.partial.json during progress.
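
For orientation only, the core trigger → poll → download loop looks roughly like the sketch below. The endpoint paths, input payload shape, and status values are assumptions based on the Bright Data Datasets v3 API; geo-audit-report/scripts/brightdata-geo.py remains the source of truth.

# Minimal sketch: endpoint paths, payload shape, and status values are
# assumptions here -- scripts/brightdata-geo.py implements the real flow.
import json
import os
import time

import requests

API = "https://api.brightdata.com/datasets/v3"
HEADERS = {"Authorization": f"Bearer {os.environ['BRIGHTDATA_API_KEY']}"}

def trigger_snapshot(dataset_id, prompts):
    # Part 1: trigger one snapshot per selected dataset, all prompts up front
    payload = [{"prompt": p} for p in prompts]  # real input fields may differ per dataset
    resp = requests.post(f"{API}/trigger", params={"dataset_id": dataset_id},
                         json=payload, headers=HEADERS, timeout=60)
    resp.raise_for_status()
    return resp.json()["snapshot_id"]

def wait_and_download(snapshot_id, out_path, poll_seconds=30):
    # Part 2: poll the pending snapshot, then download the raw rows
    while True:
        progress = requests.get(f"{API}/progress/{snapshot_id}",
                                headers=HEADERS, timeout=60).json()
        if progress.get("status") == "ready":
            break
        time.sleep(poll_seconds)  # runs often take several minutes
    rows = requests.get(f"{API}/snapshot/{snapshot_id}",
                        params={"format": "json"}, headers=HEADERS, timeout=120).json()
    with open(out_path, "w") as f:
        json.dump(rows, f, indent=2)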

Important timing note: Bright Data processing can take from one minute to several minutes depending on the number of prompts. With many prompts (e.g., 10+), expect 3-5+ minutes of wait time. Inform the user about this before running the script so they know what to expect.

Output structure: All files are saved in a dated folder (YYYY-MM-DD) within the specified --out-dir:

  • {out-dir}/{YYYY-MM-DD}/results.json - Complete results data
  • {out-dir}/{YYYY-MM-DD}/snapshots/{chatbot}.json - Snapshot metadata per chatbot
  • {out-dir}/{YYYY-MM-DD}/raw/{chatbot}-{snapshot_id}.json - Raw snapshot data
  • {out-dir}/{YYYY-MM-DD}/report.html - optional HTML report (generated by AI if requested)

Example run

# Ensure BRIGHTDATA_API_KEY is set (user should export it themselves)
python3 geo-audit-report/scripts/brightdata-geo.py \
  --check-url "https://example.com" \
  --prompts-file prompts.txt \
  --chatgpt-dataset-id "gd_m7aof0k82r803d5bjm" \
  --perplexity-dataset-id "gd_m7dhdot1vw9a7gc1n" \
  --gemini-dataset-id "gd_mbz66arm2mf9cu856y" \
  --target-domains "example.com" \
  --brand-terms "Example,Example Product" \
  --out-dir ./geo-run

# Files will be saved in: ./geo-run/2025-01-15/ (or current date)

Note: Only include the dataset ID flags for chatbots the user selected (ChatGPT, Perplexity, and/or Gemini).

Required audit artifacts

The audit is not complete unless the dated run folder contains the expected artifacts.

Required

  • {out-dir}/{YYYY-MM-DD}/results.json
  • {out-dir}/{YYYY-MM-DD}/snapshots/{chatbot}.json
  • {out-dir}/{YYYY-MM-DD}/raw/{chatbot}-{snapshot_id}.json

Optional but expected in normal usage

  • {out-dir}/{YYYY-MM-DD}/tracked-prompts.json
  • {out-dir}/{YYYY-MM-DD}/report.html

Artifact roles

  • results.json: main run output for the dashboard and summary analysis.
  • tracked-prompts.json: long-lived prompt-tracking companion file used to compare prompt visibility over time.
  • report.html: optional static export if the user wants a standalone deliverable.

Post-execution analysis

After the script completes successfully, you MUST:

  1. Read the results.json file from the dated output folder

  2. Analyze the data and provide initial conclusions, including:

    • Overall visibility summary (cited vs mentioned vs not visible)
    • Which chatbots perform best/worst for the brand
    • Key patterns across prompts (e.g., which prompts get cited, which don't)
    • Which prompts triggered search and which did not
    • Fan-out query insights (what related queries are being suggested)
    • Aggregated fan-out query summary (query, count, whether the brand appeared, whether it was cited, whether the domain appeared in captured search results)
    • Source breakdown insights (UGC vs YouTube vs web dominance)
    • Competitor mentions if any
    • Top priority actions based on the data
  3. Prepare the dashboard files:

    • Use results.json as the main audit input for the React template in geo-audit-report/template
    • Prepare or update tracked-prompts.json if prompt history is available or should start from this run
    • Optionally add a couple of manual recommendations in manual_recommendations
    • If the user wants an HTML export too, generate report.html in the same dated output folder
  4. Present these conclusions clearly and concisely to the user, and let them know the dashboard JSON is ready.
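
As a starting point for steps 1-2, a rough first pass over results.json might look like the sketch below. The field names (responses, chatbot, cited, mentions, used_web_search, prompt) follow the dashboard input contract described under "Output requirements" and should be checked against the actual file before relying on them.

# Rough first-pass visibility summary; adjust field names to match the real results.json.
import json
from collections import Counter

with open("geo-run/2025-01-15/results.json") as f:  # example path from the run above
    run = json.load(f)

by_chatbot = Counter()
for r in run.get("responses", []):
    status = "cited" if r.get("cited") else "mentioned" if r.get("mentions") else "not visible"
    by_chatbot[(r.get("chatbot", "unknown"), status)] += 1
    if not r.get("used_web_search") and not r.get("cited"):
        # no-search responses without citations are a different signal than searched ones
        print(f"no search triggered: {r.get('prompt')!r} on {r.get('chatbot')}")

for (chatbot, status), count in sorted(by_chatbot.items()):
    print(chatbot, status, count)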

Output requirements

React dashboard template

IMPORTANT: The Python script produces the audit JSON, and the React template in geo-audit-report/template reads that JSON directly.

Treat the React template as the expected companion artifact for the skill, not an optional extra. If the user wants a usable deliverable, the default path is:

  1. run the collector
  2. verify the JSON artifacts
  3. connect the run to the template
  4. review the dashboard

Do not describe the audit as “done” if the collector succeeded but the JSON does not match what the dashboard expects.

The expected workflow is:

  1. Run the collector to produce results.json
  2. Copy that file into the template app's public/data/... folder or point the app to it with env vars
  3. Add a few manual recommendations directly in the JSON after reviewing the data
  4. Launch the app and review the three tabs:
    • Overview
    • Prompts
    • Sources

Dashboard input contract

At minimum, the dashboard run should expose:

  • run_at
  • check_url
  • target_domains
  • brand_terms
  • snapshots
  • responses

Recommended fields:

  • manual_recommendations
  • fan_out_summary

Each response should ideally expose:

  • prompt
  • chatbot
  • model
  • mentions
  • cited
  • first_citation_rank
  • used_web_search
  • fan_out_queries
  • fan_out_details
  • sources
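
Put together, a results.json that satisfies this contract might look roughly like the following (values are illustrative, and optional fields can be absent depending on what each provider exposes through Bright Data):

{
  "run_at": "2025-01-15T10:30:00Z",
  "check_url": "https://example.com",
  "target_domains": ["example.com"],
  "brand_terms": ["Example", "Example Product"],
  "snapshots": { "chatgpt": "snapshot-id-here" },
  "responses": [
    {
      "prompt": "best treasury software for multi-entity groups",
      "chatbot": "chatgpt",
      "model": "gpt-4o",
      "mentions": 1,
      "cited": false,
      "first_citation_rank": null,
      "used_web_search": true,
      "fan_out_queries": ["treasury management software comparison"],
      "fan_out_details": [],
      "sources": [{ "domain": "g2.com", "url": "https://www.g2.com/example" }]
    }
  ],
  "manual_recommendations": [],
  "fan_out_summary": []
}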

Dashboard requirements

The dashboard should support:

  • Overview metrics and prompt visibility
  • Prompt-level drilldown per chatbot
  • Source grouping by domain
  • Search-trigger status per response and per prompt rollup
  • Fan-out queries per response, including whether we appeared in the response and whether our domain appeared in captured search results
  • A global fan-out query summary with counts and appearance status
  • Manual recommendations added by AI in the JSON
  • A separate JSON file tracking prompts over time
  • Footer including holly-and-stick.com

Tracked prompts companion file

tracked-prompts.json should be treated as a first-class companion artifact for repeated audits.

Use this minimum structure:

{
  "tracked_prompts": [
    {
      "prompt": "best treasury software for multi-entity groups",
      "first_tracked_at": "2026-04-24",
      "last_tracked_at": "2026-04-24",
      "status": "active",
      "notes": "Enterprise category prompt",
      "history": [
        {
          "date": "2026-04-24",
          "visibility": 0
        }
      ]
    }
  ]
}

When to create or update it:

  • create it on the first run if the user wants repeated GEO tracking
  • update it on later runs when the same prompt reappears
  • keep it separate from results.json so each run stays immutable

If the user only wants a one-off audit, say explicitly that tracked-prompts.json is optional and may be skipped.
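
For repeated audits, folding a new run into the companion file can be as simple as the sketch below (a minimal example assuming the structure above; the 0/1/2 visibility scale for not visible / mentioned / cited is just one possible convention):

# Minimal sketch: append a history point for tracked prompts that reappear in a new run.
import json
from datetime import date

def update_tracked_prompts(tracked_path, run):
    with open(tracked_path) as f:
        tracked = json.load(f)
    by_prompt = {t["prompt"]: t for t in tracked["tracked_prompts"]}
    today = date.today().isoformat()
    for r in run.get("responses", []):
        entry = by_prompt.get(r.get("prompt"))
        if entry is None:
            continue  # only update prompts that are already being tracked
        visibility = 2 if r.get("cited") else 1 if r.get("mentions") else 0
        entry["history"].append({"date": today, "visibility": visibility})
        entry["last_tracked_at"] = today
    with open(tracked_path, "w") as f:
        json.dump(tracked, f, indent=2)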

Fan-out analysis requirements

Treat fan-out analysis as a required output, not a side observation.

Per response

For each response with search fan-out data, capture:

  • the exact query or queries
  • whether search triggered
  • whether the brand appeared in the response
  • whether the brand was cited
  • whether the target domain appeared in captured search results

Aggregated

Across the run, provide a grouped query summary with:

  • query
  • count
  • prompts it appeared on
  • chatbots it appeared on
  • how many times the brand appeared in the response
  • how many times the brand was cited
  • how many times the target domain appeared in captured search results
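
A single entry in the aggregated fan_out_summary could then look something like this (field names are illustrative and should match whatever the dashboard template expects):

{
  "query": "treasury management software comparison",
  "count": 3,
  "prompts": ["best treasury software for multi-entity groups"],
  "chatbots": ["chatgpt", "perplexity"],
  "brand_appeared": 1,
  "brand_cited": 0,
  "domain_in_search_results": 2
}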

Use guidance from obsidian/GEO Playbook.md:

  • list prompts → track → wait → analyze fan-outs
  • create missing content for fan-outs
  • target UGC-dominant sources strategically