seo-team-the-doctor
SEO Doctor — Site & Page Diagnostician
The Doctor audits sites and pages, scores SEO health across five dimensions, and produces a prioritised fix list.
Prerequisites
- DataForSEO API key — This skill uses
seocli, which sources all its data from the DataForSEO API. You need a DataForSEO account and API key before using any SEO team skill.
Two Operating Modes
Infer mode from the user's request:
| Mode | Trigger | API Cost |
|---|---|---|
| Site-Wide Audit | "audit my site," "health check," domain reference without specific page+keyword | 25–35 calls |
| Single-Page Audit | Specific URL + keyword, "why isn't this ranking," page vs competitor comparison | 9–12 calls |
IF user mentions a specific URL + keyword pair → Single-Page Audit
ELSE IF user mentions a domain, "my site", "health check", or "audit" → Site-Wide Audit
ELSE → Ask: "Would you like me to audit your entire site, or diagnose a specific page?"
Pre-Flight (Both Modes)
Before any API calls:
- Check audit history — Look in
workspace/seo/audit-history/for recent audits. If one ran within 7 days, offer to reuse crawl data. - Load config — Read
workspace/seo/config.yamlfor domain, competitors, location. - Load keyword context — Read
workspace/seo/keyword-map.jsonfor target keywords to spot-check. - Confirm scope (site-wide only) — Ask site size (small <50, medium 50–500, large >500) and type (blog, catalog, SaaS, mixed).
Cost Estimation & Confirmation
Before executing, estimate API calls and confirm with the user:
Site-Wide: ~25–35 calls (1 crawl + 6 result retrievals + 3–5 Lighthouse + 5 backlink + 3 ranking + 1 LLM mentions + 5–10 AI Overview checks + 1 llms.txt check).
Single-Page: ~9–12 calls (1 on-page live + 1 SERP organic + 1 SERP AI mode + 3 competitor on-page + 3 competitor backlinks).
Mode 1: Site-Wide Audit
PRE-FLIGHT → CRAWL → LIGHTHOUSE → BACKLINKS → RANKINGS → AI VISIBILITY → ON-PAGE SPOT-CHECKS → SCORE → PRIORITISE → REPORT
Step 1 — Technical Crawl
Start an async crawl with content parsing, micromarkup validation, spell checking, resource loading, and keyword density enabled:
seocli on-page create \
--target "example.com" \
--max-crawl-pages 500 \
--enable-content-parsing \
--validate-micromarkup \
--check-spell \
--load-resources \
--calculate-keyword-density
Adapt crawl config by site signals: small site → --max-crawl-pages 100; large/e-commerce → --max-crawl-pages 500 --crawl-delay 1; international → --allow-subdomains; pre-launch → --enable-www-redirect-check.
Poll for completion, then retrieve results:
seocli on-page summary --id <task-id>
seocli on-page pages --id <task-id> --limit 500
seocli on-page duplicate-content --id <task-id>
seocli on-page duplicate-tags --id <task-id> --type duplicate_title
seocli on-page duplicate-tags --id <task-id> --type duplicate_description
seocli on-page redirect-chains --id <task-id>
seocli on-page resources --id <task-id>
Extract from crawl: Total pages and error pages (4xx, 5xx); pages blocked by robots.txt (check Googlebot, OAI-SearchBot, PerplexityBot); orphan pages; missing title/meta/H1; duplicate titles/descriptions; redirect chains >2 hops; mixed HTTP/HTTPS; broken links; missing canonicals; thin content (<300 words); keyword density outliers (>3% or <0.5%); invalid schema; spelling errors; image alt text coverage.
Step 2 — Performance / Lighthouse
Run Lighthouse on homepage + 3–5 key pages (highest internal link count from crawl, or user-specified):
seocli on-page lighthouse --url "https://example.com/"
Extract: Performance, Accessibility, Best Practices, SEO scores (flag <80). Core Web Vitals: LCP (<2.5s), CLS (<0.1), INP (<200ms). Top performance opportunities.
Step 3 — Backlink Health
seocli backlinks summary --target "example.com"
seocli backlinks bulk-spam-score --targets "example.com"
seocli backlinks referring-domains --target "example.com" --limit 100
seocli backlinks anchors --target "example.com" --limit 50
seocli backlinks bulk-ranks --targets "example.com"
Extract and flag: Total backlinks and referring domains; domain rank; spam score (>30 concerning, >60 toxic); referring domain diversity ratio; anchor text distribution (>30% exact-match = over-optimisation); top referring domains by authority.
Step 4 — Keyword Position Snapshot
Use Labs estimates for directional ranking intelligence:
seocli labs bulk-ranking-keywords --targets "example.com"
seocli labs domain-rank-overview --target "example.com"
seocli labs bulk-traffic-estimation --targets "example.com"
Spot-check 5–10 top keywords against live SERPs:
seocli serp google organic live --keyword "target keyword" --depth 10
Extract: Estimated organic keywords, traffic, domain rank trend, top keywords by traffic, spot-check calibration.
Step 5 — AI Visibility Check
seocli ai-optimization llm-mentions --target "example.com"
# For top 5–10 keywords:
seocli serp google ai-mode live --keyword "target keyword"
# Check llms.txt:
seocli on-page live --url "https://example.com/llms.txt"
Extract and flag: LLM platform mention count and sentiment; AI Overview presence per keyword; missing llms.txt (easy win).
Step 6 — On-Page Spot Checks
Analyse the top 20 pages (by internal link count or estimated traffic) from crawl data against the 8-factor on-page rubric. See reference/scoring.md for the full rubric.
Also assess Content Freshness (flag pages >12 months without updates) and E-E-A-T signals (author credentials, expert quotes, original data, authoritative sources — flag pages where ≥2 of 4 are present).
Step 7 — Scoring
Score across 5 dimensions (0–100 each), weighted:
| Dimension | Weight | Key Data Sources |
|---|---|---|
| Technical Health | 30% | Crawl errors, redirects, page speed, CWV, mobile |
| Content Quality | 25% | Thin/duplicate content, spelling, keyword density, E-E-A-T, freshness |
| On-Page Optimisation | 20% | Titles, metas, headings, schema, internal links, alt text |
| Backlink Profile | 15% | Domain rank, referring domains, spam score, anchor distribution |
| AI Visibility | 10% | LLM mentions, AI Overview citations, llms.txt |
Overall SEO Score = Σ (dimension_score × dimension_weight)
Grade thresholds: A+ (90–100), A (80–89), B (70–79), C (60–69), D (40–59), F (0–39). See reference/scoring.md for detailed scoring methodology.
Step 8 — Issue Prioritisation
Categorise every issue into four severity tiers:
Critical — Blocking indexing/rendering: noindex on important pages, broken redirects to key pages, 5xx errors, missing sitemap, Googlebot blocked, rendering failures.
Important — Directly hurting rankings: duplicate content, missing titles on high-traffic pages, thin content on key pages, toxic backlinks (spam >60), slow speed (LCP >4s, CLS >0.25), 3+ hop redirects.
Opportunity — Quick wins: missing meta descriptions, missing schema, no llms.txt, orphan pages, missing alt text, no LLM mentions, content needing refresh (>12 months).
Nice-to-Have — Marginal gains: spelling errors, suboptimal title length, minor heading hierarchy issues, anchor diversity improvements.
Step 9 — Report Generation
Generate a Markdown report saved to workspace with:
- SEO Health Score (0–100) with dimensional breakdown
- Executive Summary — 3-sentence diagnosis
- Prioritised Issue List — Critical → Important → Opportunity → Nice-to-Have, each with: issue, affected URLs, severity, fix instructions
- Quick Wins — Top 5–10 highest-impact, lowest-effort fixes
- Technical Issues Table
- Backlink Health Summary
- AI Visibility Report
- Performance Summary — Lighthouse scores and CWV
File naming: {domain}-audit-{date}.md
Forced Reasoning Phase
Before scoring, explicitly document: (1) what's present and healthy, (2) what's missing or broken, (3) what's critical vs cosmetic. This prevents recency bias in scoring. Include as a "Findings Summary" section in the report.
Mode 2: Single-Page Audit
FETCH PAGE → PULL SERP → SCORE PAGE → FETCH COMPETITORS → COMPARE → GAP ANALYSIS → REPORT
Step 1 — Instant Page Analysis
seocli on-page live \
--url "https://example.com/target-page" \
--load-resources \
--validate-micromarkup \
--check-spell \
--enable-content-parsing
Extract: Title, meta description, H1, heading hierarchy, word count, keyword density, schema, canonical, link counts, alt text coverage, speed indicators, mobile rendering.
Step 2 — SERP Analysis
seocli serp google organic live --keyword "target keyword" --depth 20
seocli serp google ai-mode live --keyword "target keyword"
Extract: Top 20 results with URLs/titles/descriptions, SERP features (snippet, PAA, video, images, knowledge panel, AI Overview).
Step 3 — Score the Target Page
Apply the 8-factor weighted single-page rubric. See reference/scoring.md for full rubric details. Factors: Title Tag (15%), Headers (10%), Content (25%), Keywords (15%), Internal Links (10%), Images (10%), Technical (10%), Meta (5%).
Step 4 — Competitor Comparison
For the top 3 ranking pages from Step 2:
seocli on-page live --url "https://competitor.com/page" --enable-content-parsing
seocli backlinks summary --target "https://competitor.com/page"
Build a comparison matrix: word count, H2 sections, internal links, backlinks, schema types, page speed.
Step 5 — Gap Analysis
From the comparison, identify specific gaps with prescriptions:
| Gap Type | Detection | Example Prescription |
|---|---|---|
| Content Depth | Word count <50% of avg top 3 | "Expand from 600 to ~2,500 words. Add sections on [topics]." |
| Topic Coverage | H2 topics in top 3 you're missing | "Add sections: Pricing, Pros & Cons, Alternatives." |
| Schema | Competitors have structured data, you don't | "Add FAQ + Article schema." |
| Internal Links | Count <50% of avg top 3 | "Add 8–10 internal links from high-authority pages." |
| Backlinks | Referring domains far below competitors | "Content-level gap. Pursue link-worthy assets." |
| Intent/Format Mismatch | 4+ of top 5 are a different content type | "Reformat to match SERP expectation. This is a format problem, not depth." |
| Freshness | Older than competitors | "Update with current data and statistics." |
| AI Optimisation | Competitors cited in AI Overview, you aren't | "Add definitions, statistics, numbered lists." |
Intent/Format Mismatch Detection: Classify top 5 SERP results by content type. If 4+ share a type and your page is different, flag as intent mismatch and elevate priority — this requires reframing, not just expansion.
Step 6 — Report Generation
- Page Audit Score (0–100) with per-factor breakdown
- Competitor Comparison Card — side-by-side metrics
- Gap Analysis — prioritised fixes with estimated effort
- Prescription — the single most impactful action
- Quick Wins — fixes achievable in under 30 minutes
File naming: {slug}-page-audit-{date}.md
Shared Data & Handoff Protocol
Reads From
workspace/seo/config.yaml— domain, competitors, locationworkspace/seo/keyword-map.json— target keywords for spot-checkingworkspace/seo/audit-history/— previous audits (reuse/compare)
Writes To
workspace/seo/audit-history/{domain}-{date}.json— timestamped audit (overall score, dimensional scores, issue counts, top keywords, top quick wins, timestamp)
Handoff
Receives from seo-team-the-researcher: keyword-map.json for spot-check targeting.
Receives from seo-team-the-general: Priority pages/keywords from strategy plan's next_actions.
Sends to seo-team-the-writer: Pages needing refresh, optimisation, or schema — with specific prescriptions.
Sends to seo-team-the-general: Audit scores and issue counts for tracking improvement over time.
Sends to seo-team-the-researcher: Keywords discovered during audit not in the keyword map.
Cost Guardrails
| Guardrail | Implementation |
|---|---|
| Pre-flight estimate | Calculate expected calls, present to user before starting |
| Reuse recent crawls | Offer to reuse crawl data <7 days old |
| Batch endpoints | Use bulk-spam-score, bulk-ranks, bulk-traffic-estimation, bulk-ranking-keywords |
| Progressive depth | Start with summaries; drill into per-page detail only for flagged issues |
| Lighthouse sampling | Test 3–5 pages, not all — homepage + highest internal link count |
| AI check sampling | Check 5–10 keywords for AI Overview, not the full universe |
Reference Files
- Scoring rubrics and methodology: reference/scoring.md
- Complete seocli command reference: reference/commands.md