# Topic Gap Analyzer
## What This Skill Does
Given:
- A topic research PDF (from the user's Google AI Studio app, covering semantic analysis, related subtopics, entity salience, related keywords, and search volume context)
- A URL inventory of the user's website — either an XML sitemap URL or a Screaming Frog CSV/XLSX export
…this skill analyzes the semantic gap between what the site currently covers and what the topic report recommends, then outputs a structured content action plan.
## Step-by-Step Instructions
### Step 1 — Parse the Topic Report PDF
Extract the following from the PDF:
- Core topic and its semantic definition (what the topic actually means in the current search landscape)
- Primary audience segments (e.g., new users vs. scaling brands)
- Related subtopics with their volume signals (HIGH / MEDIUM) and trend direction (Rising / Stable)
- Related keywords / search volume context — the keyword clusters and variants listed
- Top topic entities with their salience scores and entity type (Organization, Concept, etc.)
- Key concepts that modern content must address (e.g., "Lump Sum Trap", "payout-centric automation", "AI transaction categorization")
Entity handling rule: Entities that are Organizations (brands/competitors) must be flagged. Only include competitor/brand names in content recommendations if the page type explicitly calls for it — comparison pages, alternative pages, review roundups, "best X" lists. For all other page types, recommend covering the concepts those entities represent without naming the brands.
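The entity-handling rule above can be sketched in Python. This is a minimal illustration, not part of the skill itself: the entity dict shape and the page-type labels are assumptions, and "QuickBooks" is just a stand-in brand name.

```python
# Sketch of the entity-handling rule. The entity dict shape, page-type
# labels, and example brand name are illustrative assumptions.
COMPARATIVE_PAGE_TYPES = {"comparison", "alternative", "review", "best-list"}

def allowed_entity_mentions(entities, page_type):
    """Return entity names safe to mention on the given page type.

    Organization entities (brands/competitors) are held back unless the
    page type is explicitly comparative; all other entity types pass.
    """
    mentions = []
    for entity in entities:
        if entity["type"] == "Organization" and page_type not in COMPARATIVE_PAGE_TYPES:
            continue  # recommend the concept the brand represents instead
        mentions.append(entity["name"])
    return mentions
```

On an educational page the Organization entry is dropped and only the concept entities survive; on a comparison page the brand passes through.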
### Step 2 — Ingest the URL Inventory
If the user provides an XML sitemap URL:
- Fetch the sitemap using the `web_fetch` tool
- If it's a sitemap index, fetch the child sitemaps and collect all `<loc>` URLs
- Extract all URLs into a flat list
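As a sketch, the sitemap steps could look like this (assumptions: the XML has already been fetched, e.g. with `web_fetch`, and uses the standard sitemap namespace):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def extract_locs(sitemap_xml: str):
    """Return ("index" | "urlset", [loc URLs]) for one sitemap document.

    For a sitemap index, the returned URLs are child sitemaps that must
    themselves be fetched and passed back through this function.
    """
    root = ET.fromstring(sitemap_xml)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    urls = [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]
    return kind, urls
```

Calling this on an index, then on each child sitemap it returns, yields the flat URL list.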
If the user provides a Screaming Frog export (CSV or uploaded file):
- Read the file from `/mnt/user-data/uploads/`
- Extract the `Address` column (or equivalent URL column)
- Filter to `200` status codes only (ignore redirects and errors) if a Status column is present
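A minimal sketch of the Screaming Frog path, assuming the default column names `Address` and `Status Code` (other exports may name these differently):

```python
import csv
import io

def extract_ok_urls(csv_text: str):
    """Return Address values, keeping only 200-status rows when a
    'Status Code' column exists (redirects and errors are skipped)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        status = row.get("Status Code")
        if status is not None and status.strip() != "200":
            continue  # skip redirects and errors
        urls.append(row["Address"])
    return urls
```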
Parse each URL to extract:
- The URL slug / path
- Any inferred page title or topic from the slug (split hyphens/underscores into readable words)
Do not fetch individual page content unless the user explicitly asks for deeper analysis — slug-level semantics are sufficient for gap analysis.
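Slug-level parsing might be sketched like this (a simple heuristic: it only looks at the last path segment and drops a trailing file extension):

```python
import re
from urllib.parse import urlparse

def slug_terms(url: str):
    """Infer readable topic terms from a URL slug without fetching the page."""
    path = urlparse(url).path.strip("/")
    slug = path.split("/")[-1] if path else ""
    slug = re.sub(r"\.\w+$", "", slug)  # drop .html-style extensions
    return [term for term in re.split(r"[-_]+", slug.lower()) if term]
```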
### Step 3 — Semantic Matching
For each subtopic and keyword cluster from the report, scan the URL inventory for semantic matches using slug analysis:
- Strong match: The URL slug contains 2+ key terms from the subtopic/keyword cluster
- Weak match: The URL slug contains exactly 1 key term from the subtopic/keyword cluster
- No match: No URL covers this subtopic
Also flag URLs that are adjacent — they cover a related theme and could be expanded rather than replaced.
Consider page type signals in URL slugs (e.g., /blog/, /vs/, /alternatives/, /reviews/, /best-) when making recommendations, as these affect whether competitor names are appropriate.
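The matching tiers and page-type signals above can be sketched together; the signal-to-type mapping below is an assumption built from the example slugs, and check order matters (e.g. `/blog/` before `/best-` so blog posts stay classified as blog):

```python
# Illustrative signal-to-type mapping; checked in insertion order.
PAGE_TYPE_SIGNALS = {
    "/vs/": "comparison",
    "/alternatives/": "alternative",
    "/reviews/": "review",
    "/blog/": "blog",
    "/best-": "best-list",
}

def classify_match(slug_words, cluster_terms):
    """Strong: 2+ overlapping key terms; weak: exactly 1; otherwise none."""
    overlap = set(slug_words) & set(cluster_terms)
    if len(overlap) >= 2:
        return "strong"
    if len(overlap) == 1:
        return "weak"
    return "none"

def infer_page_type(url: str):
    """Guess a page type from URL signals; 'unknown' when nothing matches."""
    for signal, page_type in PAGE_TYPE_SIGNALS.items():
        if signal in url:
            return page_type
    return "unknown"
```

The inferred page type then gates whether competitor names may appear in the recommendation, per the entity-handling rule in Step 1.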
### Step 4 — Generate the Content Action Plan
Structure the output as follows:
#### 📊 Topic Overview
Brief 2–3 sentence summary of the topic, its current search intent landscape, and the two primary audience segments from the report.
#### ✅ UPDATE Recommendations
Pages that already exist and should be enriched to cover the topic or subtopics better.
For each recommendation:
- URL: [existing URL]
- Why update: What the page currently covers and what's missing
- What to add: Specific subtopics, keyword angles, or concepts to incorporate
- Entity requirements: Which non-brand entities/concepts from the report must appear
- Priority: High / Medium (based on subtopic volume + trend)
#### 🆕 CREATE Recommendations
Net-new pages to build. Group by content type where relevant (e.g., educational guide, comparison page, feature landing page).
For each recommendation:
- Suggested URL: Proposed slug
- Page type: Educational / Landing / Comparison / Alternative / Hub etc.
- Target topic/subtopic: Which part of the report this addresses
- Target audience segment: Which persona this serves
- Core angle: The specific hook or framing for the page
- Must-cover entities/concepts: Non-negotiable topic entities from the report with high salience
- Competitor mentions: Only include if the page type is Comparison / Alternative / Review / Best-list — flag clearly with "⚠️ Competitor mentions appropriate"
- Priority: High / Medium / Low
#### 🔑 Entity Coverage Checklist
A table of the top entities from the report and which recommended pages (UPDATE or CREATE) will cover them. This ensures no high-salience entity is left unaddressed.
| Entity | Salience | Type | Covered By |
|---|---|---|---|
| ... | ... | ... | ... |
#### 📋 Quick Summary
- Total URLs analyzed: X
- Strong matches found: X
- Weak/adjacent matches: X
- UPDATE recommendations: X
- CREATE recommendations: X
## Important Rules
- Never suggest adding competitor brand names to non-comparative content. Instead, recommend covering the function or concept they represent (e.g., "AI-powered transaction categorization" rather than naming a specific tool).
- Prioritize Rising-trend subtopics over Stable ones when there are more opportunities than the site can realistically tackle at once.
- Respect page type context: a `/blog/` URL is a different recommendation context than a product landing page slug.
- Don't hallucinate page content — all analysis must be based on what the slug implies, not assumed body content.
- Be specific about what to add — vague recommendations like "cover this topic better" are not useful. Name the specific concepts, angles, and entity terms that should appear.
- Hub vs. spoke thinking — if multiple subtopics cluster together with no existing hub page, recommend a pillar/hub page as a CREATE alongside the spoke pages.