astro-seo
Astro SEO
Audits and improves the SEO setup of an Astro site against the full stack described in Astro SEO: the definitive guide. The skill covers nine areas — technical foundation, structured data, content, site structure, performance, sitemaps and indexing, agent discovery, redirects, and analytics — and produces drop-in code for anything missing or weak.
The opinionated spine of this skill is @jdevalk/astro-seo-graph. Most of the fixes route through it. If the project doesn't use it yet, installing it is the first recommendation.
Code recipes live in AGENTS.md — read it when you need to implement a specific fix. This file has the workflow and audit checklist.
Workflow
- Detect the project — confirm this is an Astro site and understand its shape.
- Audit — score nine categories and produce actionable findings.
- Improve — generate or modify files to close the gaps. Recipes are in
AGENTS.md. - Metadata pass — invoke
metadata-checkon every short string the skill generated (titles, descriptions, schemadescriptionfields, FAQ answers, frontmatter excerpts). - Verify — run the build, check validations pass, remind the user about non-file tasks (Search Console, Bing Webmaster Tools, IndexNow key verification).
Phase 0: Detect the project
Confirm the basics before auditing:
astro.config.mjs/astro.config.tsexists.package.jsonhasastroas a dependency.site:is set inastro.config— canonicals, sitemaps, and OG image URLs all derive from this. If it's missing, empty, orhttp://localhost, flag it as a blocking issue before anything else. This is the single most common misconfiguration.- Content collections in
src/content/(or legacysrc/pages/markdown). - Deployment target — read
package.json,vercel.json,netlify.toml,wrangler.toml, orpublic/_headersto determine the host. This drives redirect and header syntax in Phase 2. - Is
@jdevalk/astro-seo-graphalready installed? If yes, record the version and which features are wired (grep for<Seo,seoGraph(,createSchemaEndpoint,createSchemaMap,FuzzyRedirect,createIndexNowKeyRoute,createMarkdownEndpoint,createApiCatalog,gitLastmod). Check the installed version against the latest on npm withnpm view @jdevalk/astro-seo-graph version. If the project is behind, recommend an upgrade in Phase 2 before auditing feature gaps — the package ships new defaults and fixes regularly, and an outdated version is a plausible cause for any audit finding. Phase 2 branches on this. - Is the site multilingual? Check for
i18ninastro.configor multiple locale directories undersrc/content/. If yes, hreflang matters; if no, skip it.
Ask only what you can't detect. Don't ask the user what the site is about — read astro.config.mjs and the homepage content.
Phase 1: Audit
Score each category out of 10. For each, give 2–4 specific findings that quote the actual code or config. Within each category, checks are tiered:
- Must — ship blockers. A failure here causes visible SEO regression.
- Should — standard practice. Skipping costs reach.
- Nice — forward-looking or situational. Useful but not baseline for every site.
Skip Nice checks for small personal blogs unless the user asks for the full treatment.
1. <Seo> component and head metadata (/10)
- Must — single component for all head metadata (not scattered across layouts).
- Must —
site:inastro.configis set to the production origin. - Must — canonical URLs derived from
siteconfig with tracking params stripped. - Must — canonical omitted when
noindexis true (per Google's recommendation). - Must — fallback chain for missing SEO fields:
seo.title → title → siteName;seo.description → excerpt → first paragraph. Pages with blank titles or descriptions are the most common symptom of a broken fallback. - Should —
robotsmeta includesmax-snippet:-1,max-image-preview:large,max-video-preview:-1. - Should — Twitter tags suppressed when they duplicate Open Graph (Twitter falls back automatically).
- Should —
hreflangalternates present on multilingual sites. Skip if monolingual. - Nice — uses
@jdevalk/astro-seo-graph's<Seo>component rather than hand-rolled. (Hand-rolled that covers everything above is fine; this skill nudges toward the package because it handles the fallback chain and robots rules by default.)
2. Structured data / JSON-LD graph (/10)
- Single flat
Articleobject, or a linked@graphwith multiple entities? - Entities wired with
@idreferences? WebSite,Blog/WebPage,Person/Organization,BlogPosting/Article,BreadcrumbList,ImageObjectall present where relevant?- Trust signals:
publishingPrinciples,copyrightHolder,copyrightYear,knowsAbout,SearchAction? - Validates in Rich Results Test and ClassySchema?
3. Content collections and SEO schema (/10)
- Content collections defined with Zod schemas?
seoSchemafrom@jdevalk/astro-seo-graphenforcing title (5–120) and description (15–160) lengths?- Required fields (
publishDate,title,excerpt) enforced at build time? - Markdown-stripped
articleBodyexposed in schema endpoints (up to 10K chars)?
4. Open Graph images (/10)
- Every page has an OG image, or many missing?
- 1200×675 (Google Discover minimum 1200px wide, 16:9 ratio)?
- Generated at build time via satori + sharp, or manual?
- JPEG (social platforms don't reliably support WebP/AVIF)?
- Route derives OG URL from the slug automatically?
- Every
<img>in rendered HTML has analtattribute (oralt=""/role="presentation"for decorative images)?validateImageAltonseoGraph()catches this at build time in ≥ 1.1.0.
5. Sitemaps and indexing (/10)
- Must —
@astrojs/sitemapinstalled, sitemap index reachable. - Must —
robots.txtreferences the sitemap index. - Must — RSS feed exists (
@astrojs/rss), advertised via<link rel="alternate" type="application/rss+xml">, contains full post content (not truncated excerpts). - Should — split per-collection via
chunksoption (sitemap-posts-0.xml, etc.) — much easier to debug indexing in GSC. - Should —
lastmodpopulated from git commit timestamps, not frontmatter dates or CI file timestamps.@jdevalk/astro-seo-graph≥ 1.4.0 exportsgitLastmod(filePath, { excludeCommits, depth })for this — it returns the committer date of the most recent non-excluded commit that touched the file, ornullif git is unavailable. Wire it into the sitemapserializecallback. If the codebase has a hand-rolledexecSync('git log ...')helper, replace it with the package export — the export handles bulk-commit exclusion (imports, reformats) which a naïve-1lookup can't. - Should — IndexNow integrated and submitting on each build, with key verification route at
/[key].txt. ≥ 1.0.1 excludes/404from submissions by default. Gate submission on the production host (e.g.process.env.CF_PAGES === '1' && CF_PAGES_BRANCH === 'main',VERCEL_ENV === 'production',CONTEXT === 'production'). Unconditional submission pings the endpoint on every localnpm run buildand preview deploy with URLs the production host hasn't served yet, which gets the key marked invalid (403) and forces rotation.
6. Agent discovery (/10)
- Should — schema endpoints (
/schema/*.json) exposing corpus-wide JSON-LD. - Should — schema map (
/schemamap.xml) listing all endpoints, withSchemamap:directive inrobots.txt. - Should —
llms.txtat the site root listing pages (title + description) for LLM consumers.@jdevalk/astro-seo-graph≥ 0.9.0 generates this via thellmsTxtintegration option. - Should — markdown-alternate URLs (
/blog/post.mdnext to/blog/post/) serving clean markdown with YAML frontmatter for AI agents to consume without HTML parsing.@jdevalk/astro-seo-graph≥ 1.2.0 shipscreateMarkdownEndpointfor the route and amarkdownAlternate: trueintegration option that emits<link rel="alternate" type="text/markdown">on every page. ≥ 1.3.0 adds post-build verification: the integration walks the build output and strips any link whose.mdtarget isn't on disk (with awarnper occurrence) — so a misconfigured endpoint surfaces as a build warning instead of a shipped 404. Pair with a Cloudflare Transform Rule (URL rewrite viawildcard_replace, noVary: Acceptheader — CF strips custom Vary values) for content negotiation onAccept: text/markdownwithout needing SSR. - Should — API catalog at
/.well-known/api-catalogper RFC 9727.@jdevalk/astro-seo-graph≥ 1.4.0 shipscreateApiCatalog, which auto-types schema endpoints to theirhttps://schema.org/<Type>URLs and absolutizes paths againstsiteUrl. List the schema endpoints, schemamap, and any site-specific APIs (/ask,/feed.xml, etc.). The RFC is finalized so the format is stable; adoption is early but the cost is one route file. - Should — Content Signals directive in
robots.txt(e.g.Content-Signal: ai-train=yes, search=yes, ai-input=yes). Declares preferences for AI training, search grounding, and AI input use independently of crawl access. The spec is an IETF draft and adoption is early, but it's one line in a file every site already has. - Should —
Linkheader on/*pointing to discovery files (sitemap, llms.txt, api-catalog, schemamap, and any MCP / A2A cards the site publishes). Agents reading response headers find them without parsing HTML. On Cloudflare Pages / Netlify this ispublic/_headers; on Vercel it'svercel.json. - Nice — MCP server card at
/.well-known/mcp/server-card.json(SEP-1649) and / or A2A agent card at/.well-known/agent-card.json(A2A protocol). Only relevant when the site actually exposes an MCP server or A2A agent. Both are static JSON files declaring the endpoint, capabilities, and skills. Skip otherwise. - Nice —
<link rel="nlweb">pointing to a conversational endpoint. NLWeb is early; the tag is one line and worth having, but it's not a scoring blocker in 2026.
7. Performance (/10)
- Static output by default (no SSR on pages that don't need it)?
- Zero client-side JS unless an island requires it?
- Astro
<Image>component used for all content images (responsive srcset, WebP, lazy, async)? - Primary web font preloaded in woff2?
<ClientRouter />withdefaultStrategy: 'viewport'for prefetch?- Hashed assets under
/_astro/serveCache-Control: public, max-age=31536000, immutable? No-Vary-Searchresponse header stripping UTM parameters from cache key?
8. Redirects and error handling (/10)
public/_redirects(or equivalent) maintained for every URL that ever existed and moved?- 301 not 302 for permanent moves?
FuzzyRedirectcomponent from@jdevalk/astro-seo-graphwired into the 404 page?- 404 page itself returns a 404 status, not 200?
9. Build-time validation and content quality (/10)
- Must —
seoGraph()integration running on each build withvalidateH1andvalidateUniqueMetadataenabled. For JSON-LD validation, passwarnOnDanglingReferences: truetoassembleGraph()inseo-graph-core— that's the assembly-time check, not an integration option. - Should —
validateImageAlt,validateMetadataLength, andvalidateInternalLinksenabled onseoGraph()(all defaulttruein ≥ 1.1.0). They catch missing alt text, titles or descriptions outside SERP bounds (defaults: title 30–65, description 70–200), and internal links that 404 or hit a trailing-slash mismatch. Upgrade to ≥ 1.1.1 if the project is on 1.1.0 — that patch release fixes two validator bugs:validateInternalLinksnow recognisespublic/assets as valid targets (no more false positives on/images/*or/fonts/*), andvalidateMetadataLengthno longer truncates descriptions containing a raw apostrophe. Useskiponly for SSR-only routes, wildcards, and[slug]params. - Should — broken link checker in CI for external links. A lychee GitHub Action on every push to content files catches dead links before they go live; a weekly scheduled run catches link rot as external sites move or disappear. Broken outbound links are a bad UX and a negative trust signal. Internal links are covered by
validateInternalLinksat build time; lychee handles everything else. - Should — SEO strings (titles, descriptions, FAQ answers) audited for metadata quality — front-loading, concreteness, truncation fit, no title/description duplication. Phase 2.5 chains this in via
metadata-check. Individual post prose can be audited separately viareadability-check.
Phase 2: Improve
Based on the audit, produce the concrete code. Always ask before overwriting. Read AGENTS.md for detailed recipes.
Branch on the Phase 0 findings. If @jdevalk/astro-seo-graph is already installed, skip the install step and focus on wiring the features the audit flagged as missing (IndexNow, FuzzyRedirect, schema endpoints, build validation). If the user has a hand-rolled setup that already satisfies the Must checks in category 1, don't rip it out — add only what's missing. Replacement is a last resort, not the default.
AGENTS.md sections: Install/upgrade, Integration config, BaseHead.astro, Content collection schema, Sitemap + git lastmod, OG image route, Schema endpoints, llms.txt, Markdown alternates, API catalog, Content Signals in robots.txt, MCP and A2A discovery cards, Link headers for agent discovery, RSS feed, Redirects + FuzzyRedirect, Performance headers, Broken link checker in CI.
Phase 2.5: Metadata and readability pass
Invoke the metadata-check skill on every short string the skill generated or modified: page titles, meta descriptions, schema description fields, FAQ answers, and any blog post frontmatter excerpt values you wrote. It checks front-loading, concreteness, filler, active voice, title/description duplication, difficult words, SERP-truncation fit (title 30–65, description 70–200 — the same bounds validateMetadataLength enforces), and one-idea-per-field. Apply the fixes directly. Skip the pass entirely for technical strings (URLs, schema @id values, enum values).
If the project has a blog or docs content collection, mention to the user as a follow-up that the readability-check skill can audit individual posts for multi-paragraph prose quality — but don't audit the entire content corpus yourself.
Phase 3: Verify
- Run
npm run build. IfseoGraph()is wired, this also runs H1 validation, duplicate-meta detection, and schema validation — surface any warnings. - Spot-check the built HTML: one page's
<head>should now be clean, canonical correct, JSON-LD graph present and linked. - Run the homepage through Rich Results Test and ClassySchema.
- Confirm
/sitemap-index.xmlexists and references per-collection sitemaps. - If IndexNow is wired, confirm the key verification route returns the key at
/[key].txt. - Remind the user about tasks that can't be automated:
- Register the site in Google Search Console and Bing Webmaster Tools.
- Submit the sitemap index in both.
- Generate an IndexNow key and commit it to config.
- Install Plausible or equivalent privacy-friendly analytics.
Output format
## Astro SEO audit: [site name]
### Score
| Category | Score |
|---------------------------------------|------:|
| 1. `<Seo>` component and head | x/10 |
| 2. Structured data / JSON-LD graph | x/10 |
| 3. Content collections and schema | x/10 |
| 4. Open Graph images | x/10 |
| 5. Sitemaps and indexing | x/10 |
| 6. Agent discovery | x/10 |
| 7. Performance | x/10 |
| 8. Redirects and error handling | x/10 |
| 9. Build-time validation and content | x/10 |
| **Total** | xx/90 |
### Findings
[Grouped by category. Quote actual code/config. Be specific.]
### Files generated or changed
[List with short description of each.]
### Next steps
[Non-file tasks: GSC, Bing Webmaster Tools, IndexNow key generation, analytics.]
Key principles
- Opinionated defaults over optionality. The guide picks a stack; this skill applies it. Don't offer five alternatives when one works.
@jdevalk/astro-seo-graphis the spine. Route the<Seo>component, schema endpoints, IndexNow, FuzzyRedirect, and build validation through it unless the user has a strong reason to hand-roll.- Topics, not keyphrases. When reviewing content, focus on topical coverage and readability, not keyword density.
- Static, CDN-served HTML is the baseline. Don't add SSR to solve problems static builds already don't have.
- Agent discovery matters now. Schema endpoints, schema map, NLWeb tags — the crawler is no longer the only consumer.