geo-state-report
GEO state report (Bright Data → HTML)
You help the user:
- define prompts that matter for their business,
- collect results from major LLM experiences (ChatGPT / Perplexity / optionally Gemini),
- generate an HTML report with actions.
The Bright Data API is mandatory for automated collection.
Inputs to collect
- Brand/site:
  - target domains (ex: example.com)
  - brand terms (ex: product name, founder name)
  - competitor domains (optional)
- Prompt list (2-10 to start)
- Country / language
Required tools
- BRIGHTDATA_API_KEY env var (or provided by the user)
- Bright Data dataset IDs for each chatbot you plan to run
Dataset IDs (always the same)
CHATGPT_DATASET_ID = "gd_m7aof0k82r803d5bjm"
PERPLEXITY_DATASET_ID = "gd_m7dhdot1vw9a7gc1n"
GEMINI_DATASET_ID = "gd_mbz66arm2mf9cu856y"
IMPORTANT: Before running the script, ask the user which chatbots they want to run:
- ChatGPT
- Perplexity
- Gemini
They can select one, two, or all three. Only pass the dataset IDs for the selected chatbots to the script.
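The selection rule above can be sketched in Python. Note that `dataset_flags` is an illustrative helper for building the CLI arguments, not a function in the collection script; the dataset IDs are the ones listed above.

```python
# Bright Data dataset IDs for each supported chatbot (from this doc).
DATASET_IDS = {
    "chatgpt": "gd_m7aof0k82r803d5bjm",
    "perplexity": "gd_m7dhdot1vw9a7gc1n",
    "gemini": "gd_mbz66arm2mf9cu856y",
}

def dataset_flags(selected):
    """Build CLI flags for only the chatbots the user selected."""
    flags = []
    for name in selected:
        flags += [f"--{name}-dataset-id", DATASET_IDS[name]]
    return flags
```

For example, if the user picks only ChatGPT and Gemini, the Perplexity flag is simply never passed to the script.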
API Key Security Instructions
Avoid exposing the BrightData API key in chat messages or code.
- Check if key exists: Before running the script, check whether BRIGHTDATA_API_KEY is already set in the environment (without exposing its value):
  # Check without exposing the value
  if [ -z "$BRIGHTDATA_API_KEY" ]; then echo "Not set"; else echo "Set"; fi
If key is not set: Ask the user to export it themselves with these instructions:
- Go to https://brightdata.com and log in
- Navigate to your account settings/API section
- Generate a new API key if needed
  - Open a terminal and run:
    export BRIGHTDATA_API_KEY="your-key-here"
- Do NOT paste the key in chat: the user should run the export command themselves. If they paste it anyway, go ahead and use it (the key is already exposed at that point), but suggest they rotate it.
- Never read or display the key: if you need to verify it is set, only check that the variable is non-empty; never echo or display its value.
- If the key is missing: stop and ask the user to set it using the export command above.
Collection script (Python)
Use: geo-state-report/scripts/brightdata-geo.py
It:
- triggers datasets (part 1),
- polls until ready (part 2),
- downloads snapshots,
- saves results to results.json (the HTML report is NOT generated by the script).
Output structure: All files are saved in a dated folder (YYYY-MM-DD) within the specified --out-dir:
- {out-dir}/{YYYY-MM-DD}/results.json - complete results data
- {out-dir}/{YYYY-MM-DD}/snapshots/{chatbot}.json - snapshot metadata per chatbot
- {out-dir}/{YYYY-MM-DD}/raw/{chatbot}-{snapshot_id}.json - raw snapshot data
- {out-dir}/{YYYY-MM-DD}/report.html - HTML report (generated by AI, see below)
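The dated layout can be sketched as follows; `dated_paths` is a hypothetical helper for illustration, not part of brightdata-geo.py:

```python
from datetime import date
from pathlib import Path

def dated_paths(out_dir):
    """Return the per-run file layout under {out-dir}/{YYYY-MM-DD}/."""
    run_dir = Path(out_dir) / date.today().isoformat()
    return {
        "results": run_dir / "results.json",
        "snapshots": run_dir / "snapshots",  # {chatbot}.json per chatbot
        "raw": run_dir / "raw",              # {chatbot}-{snapshot_id}.json
        "report": run_dir / "report.html",   # written by the AI, not the script
    }
```

Every artifact of a run, including the AI-generated report, lives in the same dated folder, so reruns on different days never overwrite each other.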
Example run
# Ensure BRIGHTDATA_API_KEY is set (user should export it themselves)
python3 geo-state-report/scripts/brightdata-geo.py \
--check-url "https://example.com" \
--prompts-file prompts.txt \
--chatgpt-dataset-id "gd_m7aof0k82r803d5bjm" \
--perplexity-dataset-id "gd_m7dhdot1vw9a7gc1n" \
--gemini-dataset-id "gd_mbz66arm2mf9cu856y" \
--target-domains "example.com" \
--brand-terms "Example,Example Product" \
--out-dir ./geo-run
# Files will be saved in: ./geo-run/2025-01-15/ (or current date)
Note: Only include the dataset ID flags for chatbots the user selected (ChatGPT, Perplexity, and/or Gemini).
Post-execution analysis
After the script completes successfully, you MUST:
- Read the results.json file from the dated output folder.
- Analyze the data and provide initial conclusions, including:
- Overall visibility summary (cited vs mentioned vs not visible)
- Which chatbots perform best/worst for the brand
- Key patterns across prompts (e.g., which prompts get cited, which don't)
- Fan-out query insights (what related queries are being suggested)
- Source breakdown insights (UGC vs YouTube vs web dominance)
- Competitor mentions if any
- Top priority actions based on the data
- Generate the HTML report (report.html) in the same dated output folder:
  - Create a beautiful, customized HTML report based on the actual results.json data
  - Follow the customization guidelines in the "Output requirements" section below
  - Save it to {out-dir}/{YYYY-MM-DD}/report.html
- Present these conclusions clearly and concisely to the user, and let them know the HTML report has been generated.
Output requirements
HTML Report Generation
IMPORTANT: The Python script does NOT generate the HTML report. YOU (the AI) must generate report.html from scratch after reading the results.json file. Create a customized, beautiful report based on:
- Results data: Analyze the actual results.json to highlight key insights, anomalies, and patterns
- User expertise: Adjust technical depth and explanations based on user's SEO knowledge level
- Business context: Tailor recommendations to their specific industry, stage, and goals
- Actionability: Make insights immediately actionable with clear next steps
Report generation guidelines:
- Read the results.json after script execution to understand the actual data
- Generate the HTML report by:
- Adding contextual comments/insights directly in the HTML (use HTML comments or visible callout sections)
- Highlighting the most important findings with visual emphasis
- Customizing the "Actions" section with specific, prioritized recommendations based on the data
- Adding data-driven insights that aren't obvious from raw numbers
- Including fan-out query analysis with specific content opportunities
- Adding competitor analysis if competitors are mentioned
- Creating visual hierarchy to guide the reader's attention
- Design principles:
- Beautiful: Use modern, clean design with good typography, spacing, and color contrast
- Easy to read: Clear sections, scannable layout, visual hierarchy
- Actionable: Every insight should lead to a clear next step or decision
- Required elements:
  - A single report.html (generated by AI, saved to the dated output folder)
  - Summary metrics per prompt (cited, first citation rank, mentioned, fan-out queries, sources breakdown)
  - A prioritized "actions" section tailored to the specific results
  - Footer must include holly-and-stick.com
Use guidance from obsidian/GEO Playbook.md:
- list prompts → track → wait → analyze fan-outs
- create missing content for fan-outs
- target UGC-dominant sources strategically