# Topic Gap Analyzer
## What This Skill Does
Given:
- A topic research PDF (from the user's Google AI Studio app, covering semantic analysis, related subtopics, entity salience, related keywords, and search volume context)
- A URL inventory of the user's website — either an XML sitemap URL or a Screaming Frog CSV/XLSX export
…this skill analyzes the semantic gap between what the site currently covers and what the topic report recommends, then outputs a structured content action plan.
## Step-by-Step Instructions
### Step 1 — Parse the Topic Report PDF
Extract the following from the PDF:
- Core topic and its semantic definition (what the topic actually means in the current search landscape)
- Primary audience segments (e.g., new users vs. scaling brands)
- Related subtopics with their volume signals (HIGH / MEDIUM) and trend direction (Rising / Stable)
- Related keywords / search volume context — the keyword clusters and variants listed
- Top topic entities with their salience scores and entity type (Organization, Concept, etc.)
- Key concepts that modern content must address (e.g., "Lump Sum Trap", "payout-centric automation", "AI transaction categorization")
Entity handling rule: Entities that are Organizations (brands/competitors) must be flagged. Only include competitor/brand names in content recommendations if the page type explicitly calls for it — comparison pages, alternative pages, review roundups, "best X" lists. For all other page types, recommend covering the concepts those entities represent without naming the brands.
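The entity-handling rule above can be sketched in Python. This is a minimal illustration, not part of the skill itself: the entity dict shape and the page-type labels are assumptions, and "QuickBooks" is just a stand-in brand name.

```python
# Sketch of the entity-handling rule. The entity dict shape, page-type
# labels, and example brand name are illustrative assumptions.
COMPARATIVE_PAGE_TYPES = {"comparison", "alternative", "review", "best-list"}

def allowed_entity_mentions(entities, page_type):
    """Return entity names safe to mention on the given page type.

    Organization entities (brands/competitors) are held back unless the
    page type is explicitly comparative; all other entity types pass.
    """
    mentions = []
    for entity in entities:
        if entity["type"] == "Organization" and page_type not in COMPARATIVE_PAGE_TYPES:
            continue  # recommend the concept the brand represents instead
        mentions.append(entity["name"])
    return mentions
```

On an educational page the Organization entry is dropped and only the concept entities survive; on a comparison page the brand passes through.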
### Step 2 — Ingest the URL Inventory
If the user provides an XML sitemap URL:
- Fetch the sitemap using the `web_fetch` tool
- If it's a sitemap index, fetch the child sitemaps and collect all `<loc>` URLs
- Extract all URLs into a flat list
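As a sketch, the sitemap steps could look like this (assumptions: the XML has already been fetched, e.g. with `web_fetch`, and uses the standard sitemap namespace):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def extract_locs(sitemap_xml: str):
    """Return ("index" | "urlset", [loc URLs]) for one sitemap document.

    For a sitemap index, the returned URLs are child sitemaps that must
    themselves be fetched and passed back through this function.
    """
    root = ET.fromstring(sitemap_xml)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    urls = [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]
    return kind, urls
```

Calling this on an index, then on each child sitemap it returns, yields the flat URL list.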
If the user provides a Screaming Frog export (CSV or uploaded file):
- Read the file from `/mnt/user-data/uploads/`
- Extract the `Address` column (or equivalent URL column)
- Filter to `200` status codes only (ignore redirects and errors) if a Status column is present
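A minimal sketch of the Screaming Frog path, assuming the default column names `Address` and `Status Code` (other exports may name these differently):

```python
import csv
import io

def extract_ok_urls(csv_text: str):
    """Return Address values, keeping only 200-status rows when a
    'Status Code' column exists (redirects and errors are skipped)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        status = row.get("Status Code")
        if status is not None and status.strip() != "200":
            continue  # skip redirects and errors
        urls.append(row["Address"])
    return urls
```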
Parse each URL to extract:
- The URL slug / path
- Any inferred page title or topic from the slug (split hyphens/underscores into readable words)
Do not fetch individual page content unless the user explicitly asks for deeper analysis — slug-level semantics are sufficient for gap analysis.
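Slug-level parsing might be sketched like this (a simple heuristic: it only looks at the last path segment and drops a trailing file extension):

```python
import re
from urllib.parse import urlparse

def slug_terms(url: str):
    """Infer readable topic terms from a URL slug without fetching the page."""
    path = urlparse(url).path.strip("/")
    slug = path.split("/")[-1] if path else ""
    slug = re.sub(r"\.\w+$", "", slug)  # drop .html-style extensions
    return [term for term in re.split(r"[-_]+", slug.lower()) if term]
```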
### Step 3 — Semantic Matching
For each subtopic and keyword cluster from the report, scan the URL inventory for semantic matches using slug analysis:
- Strong match: The URL slug contains 2+ key terms from the subtopic/keyword cluster
- Weak match: The URL slug contains exactly 1 key term from the subtopic/keyword cluster
- No match: No URL covers this subtopic
Also flag URLs that are adjacent — they cover a related theme and could be expanded rather than replaced.
Consider page type signals in URL slugs (e.g., /blog/, /vs/, /alternatives/, /reviews/, /best-) when making recommendations, as these affect whether competitor names are appropriate.
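The matching tiers and page-type signals above can be sketched together; the signal-to-type mapping below is an assumption built from the example slugs, and check order matters (e.g. `/blog/` before `/best-` so blog posts stay classified as blog):

```python
# Illustrative signal-to-type mapping; checked in insertion order.
PAGE_TYPE_SIGNALS = {
    "/vs/": "comparison",
    "/alternatives/": "alternative",
    "/reviews/": "review",
    "/blog/": "blog",
    "/best-": "best-list",
}

def classify_match(slug_words, cluster_terms):
    """Strong: 2+ overlapping key terms; weak: exactly 1; otherwise none."""
    overlap = set(slug_words) & set(cluster_terms)
    if len(overlap) >= 2:
        return "strong"
    if len(overlap) == 1:
        return "weak"
    return "none"

def infer_page_type(url: str):
    """Guess a page type from URL signals; 'unknown' when nothing matches."""
    for signal, page_type in PAGE_TYPE_SIGNALS.items():
        if signal in url:
            return page_type
    return "unknown"
```

The inferred page type then gates whether competitor names may appear in the recommendation, per the entity-handling rule in Step 1.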
### Step 4 — Generate the Content Action Plan
Structure the output as follows:
#### 📊 Topic Overview
Brief 2–3 sentence summary of the topic, its current search intent landscape, and the two primary audience segments from the report.
#### ✅ UPDATE Recommendations
Pages that already exist and should be enriched to cover the topic or subtopics better.
For each recommendation:
- URL: [existing URL]
- Why update: What the page currently covers and what's missing
- What to add: Specific subtopics, keyword angles, or concepts to incorporate
- Entity requirements: Which non-brand entities/concepts from the report must appear
- Priority: High / Medium (based on subtopic volume + trend)
#### 🆕 CREATE Recommendations
Net-new pages to build. Group by content type where relevant (e.g., educational guide, comparison page, feature landing page).
For each recommendation:
- Suggested URL: Proposed slug
- Page type: Educational / Landing / Comparison / Alternative / Hub etc.
- Target topic/subtopic: Which part of the report this addresses
- Target audience segment: Which persona this serves
- Core angle: The specific hook or framing for the page
- Must-cover entities/concepts: Non-negotiable topic entities from the report with high salience
- Competitor mentions: Only include if the page type is Comparison / Alternative / Review / Best-list — flag clearly with "⚠️ Competitor mentions appropriate"
- Priority: High / Medium / Low
#### 🔑 Entity Coverage Checklist
A table of the top entities from the report and which recommended pages (UPDATE or CREATE) will cover them. This ensures no high-salience entity is left unaddressed.
| Entity | Salience | Type | Covered By |
|---|---|---|---|
| ... | ... | ... | ... |
#### 📋 Quick Summary
- Total URLs analyzed: X
- Strong matches found: X
- Weak/adjacent matches: X
- UPDATE recommendations: X
- CREATE recommendations: X
## Important Rules
- Never suggest adding competitor brand names to non-comparative content. Instead, recommend covering the function or concept they represent (e.g., "AI-powered transaction categorization" rather than naming a specific tool).
- Prioritize Rising-trend subtopics over Stable ones when there are more opportunities than the site can realistically tackle at once.
- Respect page type context: a `/blog/` URL is a different recommendation context than a product landing page slug.
- Don't hallucinate page content — all analysis must be based on what the slug implies, not assumed body content.
- Be specific about what to add — vague recommendations like "cover this topic better" are not useful. Name the specific concepts, angles, and entity terms that should appear.
- Hub vs. spoke thinking — if multiple subtopics cluster together with no existing hub page, recommend a pillar/hub page as a CREATE alongside the spoke pages.