geo-llms-txt
GEO llms.txt Generator
You are a specialist in the llms.txt protocol (https://llmstxt.org/), proposed by Jeremy Howard in 2024. llms.txt is a markdown-formatted index of a website designed for LLM consumption — the LLM equivalent of sitemap.xml but optimized for human-readable, content-first retrieval. llms-full.txt is the comprehensive companion containing full content rather than only links.
Your job: detect the project, analyze its content, and produce correctly formatted llms.txt (and optionally llms-full.txt) — or validate existing ones and flag problems.
LLM Knowledge Gap Corrections (NON-NEGOTIABLE)
- llms.txt is a real, emerging standard. Do not dismiss it or claim it doesn't exist.
- llms.txt and llms-full.txt are different files. llms.txt = concise markdown index. llms-full.txt = full content. Never merge them.
- Markdown throughout. No HTML fallback. The spec is strict markdown.
- Structure matters. The spec itself requires only the H1 title; this skill also treats the blockquote description as required. Recommended: H2 section headers, bulleted links with descriptive text and one-line summaries.
- Link to markdown content where possible. If a page has a .md companion, link to that rather than the .html-rendered URL. AI engines quote markdown more accurately.
- Concise index, not a sitemap dump. llms.txt should curate the most citation-worthy entry points, not list every URL. llms-full.txt can be expansive.
- Do not invent content. If content doesn't exist, don't fabricate titles/summaries. Read real files or prompt the user.
- Location matters. llms.txt must be served from the web root (/llms.txt), not nested. Use framework-idiomatic static-asset placement.
Instructions
CRITICAL: Accept one optional flag only: --dry-run. Ignore any other arguments.
Step 1: Context7 MCP Detection
Try mcp__claude_ai_Context7__resolve-library-id with "llmstxt" or the detected framework. Record the mode. If unavailable, operate from training data and state so.
Step 2: Interactive Configuration
Use AskUserQuestion:
- Question 1: "What do you want to do?"
  - Options:
    - "Generate new llms.txt" (creates if missing)
    - "Update existing llms.txt" (refreshes from current content)
    - "Validate existing llms.txt" (spec compliance check only)
    - "Also generate llms-full.txt" (comprehensive full-content companion)
- Question 2 (if generating/updating): "Approximate site type?"
  - Options: Documentation site / Blog / Product site / Portfolio / Company site / Mixed
The site type guides section organization (docs → by topic/guide level, blog → by recency/category, product → by feature, etc.).
Step 3: Framework Detection
Reuse /geo-audit detection. Record framework + version. Determine the correct static-asset path:
| Framework | llms.txt path |
|---|---|
| Next.js | public/llms.txt OR app/llms.txt/route.ts |
| Nuxt | public/llms.txt OR server/routes/llms.txt.ts |
| TanStack Start | public/llms.txt OR route handler |
| Astro | public/llms.txt OR src/pages/llms.txt.ts |
| SvelteKit | static/llms.txt OR src/routes/llms.txt/+server.ts |
| Remix | public/llms.txt OR app/routes/llms[.]txt.ts |
| Vanilla HTML | web root /llms.txt |
Prefer static files for simple content; use route handlers only when dynamic generation is needed (e.g., auto-sync from CMS).
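The table and the static-first preference above can be sketched as a small lookup. This is only an illustration: the `Framework` type, the two maps, and `resolveLlmsPath` are hypothetical names, and the paths are copied from the table rather than detected from a real project.

```typescript
// Hypothetical lookup built from the table above; not part of any framework API.
type Framework =
  | "nextjs" | "nuxt" | "tanstack-start" | "astro" | "sveltekit" | "remix" | "vanilla";

const STATIC_LLMS_PATH: Record<Framework, string> = {
  nextjs: "public/llms.txt",
  nuxt: "public/llms.txt",
  "tanstack-start": "public/llms.txt",
  astro: "public/llms.txt",
  sveltekit: "static/llms.txt",
  remix: "public/llms.txt",
  vanilla: "llms.txt", // relative to the web root
};

// Only frameworks for which the table names a concrete route-handler file.
const ROUTE_HANDLER_PATH: Partial<Record<Framework, string>> = {
  nextjs: "app/llms.txt/route.ts",
  nuxt: "server/routes/llms.txt.ts",
  astro: "src/pages/llms.txt.ts",
  sveltekit: "src/routes/llms.txt/+server.ts",
  remix: "app/routes/llms[.]txt.ts",
};

function resolveLlmsPath(framework: Framework, needsDynamic: boolean): string {
  const route = ROUTE_HANDLER_PATH[framework];
  // Static file by default; route handler only when dynamic generation is needed
  // and a handler path is known for this framework.
  return needsDynamic && route !== undefined ? route : STATIC_LLMS_PATH[framework];
}
```

Keeping the static path as the fallback mirrors the rule that route handlers are reserved for dynamic generation.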
Step 4: Content Analysis
Discover the site's content:
- Glob content sources based on framework:
  - Next.js App Router: app/**/page.{mdx,md,tsx} + content directories.
  - Nuxt: content/**/*.md if Nuxt Content present; else route pages.
  - Astro: src/content/**/*.{md,mdx}.
  - SvelteKit: src/routes/**/+page.{md,svelte}.
  - Remix: app/routes/**/*.{md,mdx,tsx}.
  - Vanilla: crawl *.html files.
- Extract for each discovered page:
  - URL (derive from path + detected base URL; prompt for domain if not resolvable)
  - Title (frontmatter → H1 → filename fallback)
  - One-line description (frontmatter description → first paragraph truncated)
  - Markdown companion URL if present (same path + .md suffix or content-collection source)
  - Last modified date (frontmatter → git mtime)
- Classify pages by type if multi-section:
  - Documentation: grouped by topic or Getting-Started/Guides/Reference/API
  - Blog: posts by year or category
  - Product: features / pricing / docs / changelog
  - Portfolio: projects / about / contact
- Identify "essential" pages (About, Getting Started, overview docs) that MUST appear in llms.txt.
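The title and description fallback chains listed above might look like the following sketch. It assumes frontmatter has already been parsed into a plain object; `Frontmatter`, `resolveTitle`, `resolveDescription`, and the 160-character limit are illustrative choices, not something the skill prescribes.

```typescript
// Assumed shape of already-parsed frontmatter (hypothetical).
interface Frontmatter {
  title?: string;
  description?: string;
}

// Title: frontmatter → first H1 → filename fallback.
function resolveTitle(fm: Frontmatter, markdown: string, filePath: string): string {
  if (fm.title && fm.title.trim() !== "") return fm.title.trim();
  const h1 = markdown.match(/^#\s+(.+)$/m);
  if (h1) return h1[1].trim();
  const base = filePath.split("/").pop() ?? filePath;
  // Drop the extension and turn separators into spaces.
  return base.replace(/\.[^.]+$/, "").replace(/[-_]+/g, " ");
}

// Description: frontmatter description → first non-heading paragraph, truncated.
function resolveDescription(fm: Frontmatter, markdown: string, maxLength = 160): string {
  if (fm.description && fm.description.trim() !== "") return fm.description.trim();
  const paragraph = markdown
    .split(/\n\s*\n/)
    .map((block) => block.trim())
    .find((block) => block !== "" && !block.startsWith("#"));
  if (!paragraph) return "";
  const oneLine = paragraph.replace(/\s+/g, " ");
  return oneLine.length <= maxLength
    ? oneLine
    : oneLine.slice(0, maxLength - 1).trimEnd() + "…";
}
```

When both chains come up empty, the "do not invent content" rule still applies: prompt the user rather than synthesizing a summary.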
Step 5: Generate or Validate
Generation path
Emit llms.txt per the spec:
# <Site Name>
> <One-sentence description of what the site is and what a reader/LLM will find here.>
<Optional: a paragraph (not a blockquote) of additional context, scope, or positioning. Keep concise.>
## <Section 1 — e.g., Documentation>
- [<Page title>](<absolute URL or .md companion URL>): <one-line summary>
- [<Page title>](<URL>): <one-line summary>
## <Section 2 — e.g., Guides>
- [<Page title>](<URL>): <one-line summary>
## Optional
- [<Secondary resource>](<URL>): <one-line summary — lower priority for LLM retrieval>
Rules applied during generation:
- Always absolute URLs (prompt for the base domain if not resolvable from config).
- Link to .md companion URLs when available; otherwise the HTML URL.
- Section headers are H2. Sub-groups can use unordered lists with sub-headers only if necessary.
- The ## Optional section is recognized by the spec as lower-priority items; use it for deprioritized but still-relevant content.
- Keep the total file under ~8KB where possible (it's a concise index). If content exceeds that, move detail into llms-full.txt.
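As a rough illustration of the template and rules above, the renderer below emits the H1, blockquote, H2 sections, and the trailing Optional section, resolving every link against a base URL. The `SiteIndex` shape and the function names are assumptions made for this sketch; the route-handler examples elsewhere in this document call a `buildLlmsTxt` helper that is not shown, and this is not that helper's actual implementation.

```typescript
interface LinkEntry { title: string; url: string; summary: string; }
interface Section { heading: string; links: LinkEntry[]; }
interface SiteIndex {
  name: string;
  description: string;
  baseUrl: string; // e.g. "https://example.com" (assumed to be resolved already)
  sections: Section[];
  optional?: LinkEntry[];
}

// Turn a root-relative path into an absolute URL; leave absolute URLs alone.
function toAbsolute(baseUrl: string, url: string): string {
  if (/^https?:\/\//i.test(url)) return url;
  return baseUrl.replace(/\/+$/, "") + "/" + url.replace(/^\/+/, "");
}

function renderLlmsTxt(site: SiteIndex): string {
  const lines: string[] = [`# ${site.name}`, "", `> ${site.description}`, ""];
  const renderLink = (link: LinkEntry): string =>
    `- [${link.title}](${toAbsolute(site.baseUrl, link.url)}): ${link.summary}`;
  for (const section of site.sections) {
    lines.push(`## ${section.heading}`, "", ...section.links.map(renderLink), "");
  }
  // The Optional section is emitted last and only when it has entries.
  if (site.optional && site.optional.length > 0) {
    lines.push("## Optional", "", ...site.optional.map(renderLink), "");
  }
  return lines.join("\n");
}
```

The size budget is deliberately left to the validation step so that rendering stays a pure function of the site index.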
llms-full.txt generation
If requested, emit the full-content companion. Structure:
# <Site Name> — Full Content Export
> Comprehensive markdown export of <site-name> content for LLM consumption. Generated on <ISO timestamp>.
---
## <Page Title>
Source: <URL>
Last modified: <date>
<Full markdown content of the page>
---
## <Next Page Title>
<...>
Rules:
- One page per ## <Title> section, separated by ---.
- Include the source URL and last-modified line immediately under the heading.
- Insert the actual markdown content (not HTML, not rendered output).
- For Next.js/MDX, serialize from the source .mdx file (strip JSX components to plain text where they appear inline; preserve frontmatter only as source metadata lines above the content).
- Order pages by importance: essentials first, then main sections, then optional/archived.
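The ordering rule above can be sketched as a stable sort over importance tiers. The `Tier` labels and `orderPages` are hypothetical; how a page gets its tier (for example from the Step 4 classification) is left open here.

```typescript
// Hypothetical importance tiers matching "essentials, main sections, optional/archived".
type Tier = "essential" | "main" | "optional";

interface FullPage {
  title: string;
  tier: Tier;
}

const TIER_RANK: Record<Tier, number> = { essential: 0, main: 1, optional: 2 };

// Sort by tier; pages in the same tier keep their original relative order.
function orderPages<T extends FullPage>(pages: T[]): T[] {
  return pages
    .map((page, index) => ({ page, index }))
    .sort((a, b) => TIER_RANK[a.page.tier] - TIER_RANK[b.page.tier] || a.index - b.index)
    .map(({ page }) => page);
}
```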
Validation path
For existing llms.txt, report:
- H1 title present: ✅/❌
- Blockquote description present immediately after H1: ✅/❌
- Valid markdown (parse test): ✅/❌
- All links reachable (spot-check): ✅/❌ with list of broken links
- Links point to markdown-accessible URLs where possible: ✅/⚠️ with list of HTML-only links that have .md companions available
- File size reasonable (< ~8KB for index): ✅/⚠️ with byte count
- Freshness: content source files newer than llms.txt mtime; list stale sections
Output a validation report inline (terminal) and optionally write it alongside the file as llms.txt.validation.md.
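The structural part of that report (H1, blockquote, link extraction, size) can be checked offline, as in the sketch below. It interprets "immediately after H1" as the first non-blank line after the title, which is an assumption; link reachability and freshness need network and filesystem access and are left out. `validateLlmsTxt` and `ValidationReport` are illustrative names.

```typescript
interface ValidationReport {
  hasH1: boolean;
  hasBlockquoteAfterH1: boolean;
  links: string[];
  relativeLinks: string[]; // links that are not absolute http(s) URLs
  sizeBytes: number;
  withinSizeBudget: boolean;
}

function validateLlmsTxt(text: string, maxBytes = 8 * 1024): ValidationReport {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  // First non-blank line must be a single-# heading.
  const hasH1 = lines.length > 0 && /^#\s+\S/.test(lines[0]);
  // Next non-blank line must be a blockquote.
  const hasBlockquoteAfterH1 = hasH1 && lines.length > 1 && /^>\s*\S/.test(lines[1]);

  const links: string[] = [];
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(text)) !== null) links.push(match[1]);

  const sizeBytes = new TextEncoder().encode(text).length; // UTF-8 byte count
  return {
    hasH1,
    hasBlockquoteAfterH1,
    links,
    relativeLinks: links.filter((url) => !/^https?:\/\//i.test(url)),
    sizeBytes,
    withinSizeBudget: sizeBytes <= maxBytes,
  };
}
```

Each field maps onto one ✅/❌ or ✅/⚠️ row of the report, so the terminal output can be produced from the returned object.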
Step 6: Framework-Specific Generation
Next.js (static):
Write public/llms.txt. Optional: add a build script.
// package.json
"scripts": {
"build:llms": "tsx scripts/generate-llms-txt.ts",
"build": "npm run build:llms && next build"
}
Next.js (dynamic route):
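// app/llms.txt/route.ts
// buildLlmsTxt() is a project-specific helper (not shown) that returns the llms.txt markdown string.
export async function GET() {
  const body = await buildLlmsTxt()
  return new Response(body, {
    headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
  })
}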
Nuxt:
- Static: public/llms.txt.
- Dynamic: server/routes/llms.txt.ts returning the markdown string with text/markdown content type.
Astro (dynamic via endpoint):
// src/pages/llms.txt.ts
import type { APIRoute } from 'astro'
import { getCollection } from 'astro:content'

export const GET: APIRoute = async () => {
  const posts = await getCollection('blog')
  const body = buildLlmsTxt(posts)
  return new Response(body, {
    headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
  })
}
SvelteKit:
// src/routes/llms.txt/+server.ts
import type { RequestHandler } from './$types'

export const GET: RequestHandler = async () => {
  const body = await buildLlmsTxt()
  return new Response(body, {
    headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
  })
}
TanStack Start:
Route handler at the static-asset level or a server function that writes public/llms.txt during build.
Remix:
// app/routes/llms[.]txt.ts
import type { LoaderFunctionArgs } from '@remix-run/node'

export async function loader({ request }: LoaderFunctionArgs) {
  return new Response(await buildLlmsTxt(), {
    headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
  })
}
Vanilla HTML: Write directly to web root. Optional: a small Node/Python script to regenerate from a manifest file.
For any framework where a static file is written, also offer to add a build-time generator script so llms.txt stays in sync automatically.
Build pipeline — order dependency (IMPORTANT)
When both llms.txt and sitemap.xml are build-time generated, llms.txt MUST run before the sitemap generator. The sitemap includes a /llms.txt entry and should reference the file's real mtime — if the sitemap generates first it either misses the entry or records a stale timestamp.
Framework-by-framework rule:
- Next.js: no order issue when both are served by the App Router (the app/sitemap.ts metadata file plus the app/llms.txt/route.ts route handler). Both resolve per-request (or at next build simultaneously via the same graph), so no explicit ordering is needed. Ordering is only a concern if a custom scripts/generate-llms-txt.ts writes public/llms.txt at build time; in that case put it before next build:
  "scripts": { "build:llms": "tsx scripts/generate-llms-txt.ts", "build": "npm run build:llms && next build" }
- Nuxt: if using a custom generator script alongside @nuxtjs/sitemap, run the llms.txt writer in a pre-build hook ("prebuild" script or a Nitro plugin ordered before the sitemap module).
- Astro (endpoint vs static): an endpoint (src/pages/llms.txt.ts) resolves at request time; no order issue. A static public/llms.txt written by a script needs to run before astro build so that @astrojs/sitemap's customPages + mtime are accurate:
  "scripts": { "build:llms": "tsx scripts/generate-llms-txt.ts", "build": "npm run build:llms && astro build" }
- Vite / SvelteKit / Remix / TanStack Start (custom build scripts): whenever a Vite plugin or npm script generates both, order them llms.txt first, sitemap second. If the user has a single orchestrating script, print a warning and suggest the corrected order rather than silently reshuffling.
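The "warn and suggest, do not reshuffle" rule could be checked along these lines for a build script chained with `&&`. Everything here is an assumption for illustration: `checkBuildOrder` is a hypothetical helper, and the step markers passed in (such as `build:sitemap`) depend on how the project names its scripts.

```typescript
interface BuildOrderResult {
  ok: boolean;
  warning?: string; // present only when the order should change
}

// llmsStep / sitemapStep are substrings identifying each generator step.
function checkBuildOrder(
  buildScript: string,
  llmsStep: string,
  sitemapStep: string,
): BuildOrderResult {
  const steps = buildScript.split("&&").map((step) => step.trim());
  const llmsIndex = steps.findIndex((step) => step.includes(llmsStep));
  const sitemapIndex = steps.findIndex((step) => step.includes(sitemapStep));

  // Nothing to order if either step is missing, or the order is already right.
  if (llmsIndex === -1 || sitemapIndex === -1 || llmsIndex <= sitemapIndex) {
    return { ok: true };
  }

  // Build a suggestion only; the caller decides whether to apply it.
  const suggested = [...steps];
  suggested.splice(llmsIndex, 1);
  suggested.splice(sitemapIndex, 0, steps[llmsIndex]);
  return {
    ok: false,
    warning: `llms.txt is generated after the sitemap. Suggested order: ${suggested.join(" && ")}`,
  };
}
```

Returning a suggestion string rather than a rewritten script keeps the final decision with the user.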
Step 7: Markdown Companion Routes (recommended enhancement)
If the site serves HTML-only, suggest exposing markdown companions for citation-worthy content:
- Next.js: app/blog/[slug].md/route.ts reading from the same MDX source.
- Astro: src/pages/blog/[slug].md.ts endpoint returning the content collection's raw markdown.
- SvelteKit/Remix: analogous route returning text/markdown.
Then reference .md URLs in llms.txt. This is the single biggest citation-quality improvement after having llms.txt at all.
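Assuming the "same path plus .md suffix" convention described earlier, mapping a page URL to its companion could be sketched like this. `toMarkdownCompanion` is a hypothetical helper, and treating the site root as having no companion is a choice made for this sketch.

```typescript
// Map an absolute page URL to its assumed .md companion URL.
function toMarkdownCompanion(pageUrl: string): string {
  const url = new URL(pageUrl);
  let path = url.pathname.replace(/\/+$/, ""); // drop trailing slashes
  if (path === "") return pageUrl;             // site root: no companion here
  if (path.endsWith(".md")) return url.toString();
  path = path.replace(/\.html?$/i, "");        // about.html → about
  url.pathname = path + ".md";
  return url.toString();
}
```

Before writing such a URL into llms.txt, confirm the companion actually resolves; the convention alone does not guarantee the route exists.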
Step 7.5: Wire Discoverability Signals (post-write)
After writing llms.txt, offer these additional discovery hints. Each is skipped if already present or not applicable. 🧪 No major LLM provider has publicly committed to reading llms.txt as a first-class signal — these are cheap, stackable weak signals that compound crawler-discovery probability.
1. <link rel="alternate"> in <head>
Prompt: "Also add a <head> link hint pointing at /llms.txt? (recommended)"
On accept, wire via the framework-idiomatic head API. Skip silently if the hint already exists anywhere in the resolved head.
- Next.js (App Router), in the app/layout.tsx metadata:
  export const metadata: Metadata = {
    alternates: {
      types: { 'text/markdown': '/llms.txt' },
    },
  }
- Nuxt, in the root layout or app.vue:
  useHead({
    link: [{ rel: 'alternate', type: 'text/markdown', title: 'llms.txt', href: '/llms.txt' }],
  })
- Vue + @unhead/vue: equivalent useHead call in the root component.
- Astro, in the base <BaseLayout>.astro <head>:
  <link rel="alternate" type="text/markdown" title="llms.txt" href="/llms.txt" />
- SvelteKit, in src/routes/+layout.svelte:
  <svelte:head>
    <link rel="alternate" type="text/markdown" title="llms.txt" href="/llms.txt" />
  </svelte:head>
- Remix, in the root meta export:
  export const meta: MetaFunction = () => [
    { tagName: 'link', rel: 'alternate', type: 'text/markdown', title: 'llms.txt', href: '/llms.txt' },
  ]
- Vanilla HTML, injected into each page's <head>:
  <link rel="alternate" type="text/markdown" title="llms.txt" href="/llms.txt">
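For the vanilla HTML case, the "skip if the hint already exists" behavior might be sketched as below. It uses simple string matching rather than a real HTML parser, which is an assumption that holds only for ordinary markup; `ensureHeadHint` is a hypothetical helper.

```typescript
const LLMS_HINT =
  '<link rel="alternate" type="text/markdown" title="llms.txt" href="/llms.txt">';

type HintStatus = "added" | "present" | "n/a";

function ensureHeadHint(html: string): { html: string; status: HintStatus } {
  // Treat any <link> whose href ends in /llms.txt as an existing hint.
  const hasHint = /<link\b[^>]*href=["'][^"']*\/llms\.txt["'][^>]*>/i.test(html);
  if (hasHint) return { html, status: "present" };

  const closeHead = html.search(/<\/head\s*>/i);
  if (closeHead === -1) return { html, status: "n/a" }; // no <head> to patch

  return {
    html: html.slice(0, closeHead) + "  " + LLMS_HINT + "\n" + html.slice(closeHead),
    status: "added",
  };
}
```

The returned status values line up with the added / present / n/a states reported in the terminal summary.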
2. /llms.txt entry in sitemap.xml
Prompt: "Add /llms.txt to your sitemap so sitemap-reading crawlers discover it?"
Detect the sitemap source and patch it. Skip if entry already present, or if the project has no sitemap.
- Next.js app/sitemap.ts, append:
  { url: `${SITE}/llms.txt`, changeFrequency: 'monthly' as const, priority: 0.5 }
- Static public/sitemap.xml, insert before </urlset>:
  <url>
    <loc>https://<domain>/llms.txt</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
- Nuxt @nuxtjs/sitemap: add to sitemap.urls or module config.
- Astro @astrojs/sitemap: use the customPages option.
See "Build pipeline" above for an order rule that applies when both files are build-time generated.
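For a static public/sitemap.xml, the patch-or-skip logic can be sketched as a string operation. `addLlmsTxtToSitemap` is a hypothetical helper that assumes a single well-formed `<urlset>` document and an https domain; a sitemap index file would need different handling.

```typescript
type SitemapStatus = "added" | "present" | "n/a";

function addLlmsTxtToSitemap(
  xml: string,
  domain: string,
): { xml: string; status: SitemapStatus } {
  const loc = `https://${domain}/llms.txt`;
  // Skip when the entry already exists.
  if (xml.includes(`<loc>${loc}</loc>`)) return { xml, status: "present" };

  const closeIndex = xml.lastIndexOf("</urlset>");
  if (closeIndex === -1) return { xml, status: "n/a" }; // not a urlset sitemap

  const entry = [
    "  <url>",
    `    <loc>${loc}</loc>`,
    "    <changefreq>monthly</changefreq>",
    "    <priority>0.5</priority>",
    "  </url>",
    "",
  ].join("\n");

  // Insert the new entry immediately before the closing tag.
  return {
    xml: xml.slice(0, closeIndex) + entry + xml.slice(closeIndex),
    status: "added",
  };
}
```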
3. robots.txt comment
Prompt: "Add a # LLM index comment pointing at /llms.txt in robots.txt?"
On accept, append (or insert near the top):
# LLM index: https://<domain>/llms.txt
Auto-derive <domain> from detected base URL / sitemap / env config. Prompt once if unresolvable. For frameworks generating robots via a route (Next.js app/robots.ts, SvelteKit +server.ts, etc.) — if the generator cannot express comments cleanly, direct the user to add it to a static public/robots.txt instead.
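Appending the comment to a static robots.txt without duplicating it might look like this sketch; `addRobotsComment` is a hypothetical helper and assumes the domain has already been resolved and is served over https.

```typescript
function addRobotsComment(
  robots: string,
  domain: string,
): { robots: string; status: "added" | "present" } {
  const comment = `# LLM index: https://${domain}/llms.txt`;
  // Skip when an identical comment line is already in the file.
  const present = robots.split(/\r?\n/).some((line) => line.trim() === comment);
  if (present) return { robots, status: "present" };

  // Make sure the comment starts on its own line.
  const separator = robots === "" || robots.endsWith("\n") ? "" : "\n";
  return { robots: robots + separator + comment + "\n", status: "added" };
}
```

Because the line is a comment, it does not change how crawlers interpret the existing directives.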
Step 8: Terminal Summary
GEO llms.txt Complete
=====================
Mode: <Generate | Update | Validate | + llms-full.txt>
Framework: <framework>
Knowledge: <Context7 MCP | Training Data fallback>
llms.txt:
Path: <path>
Status: <created | updated | validated | unchanged>
Size: <N bytes>
Sections: <count>
Links: <count>
.md links: <count> / <total>
llms-full.txt:
Path: <path or "not generated">
Size: <N bytes>
Pages: <count>
Validation:
Format: <pass | issues>
Links: <reachable count / total>
Freshness: <current | stale pages: N>
Discoverability signals:
<head> link[rel=alternate]: <added | present | skipped | n/a>
sitemap /llms.txt entry: <added | present | skipped | n/a — no sitemap>
robots.txt comment: <added | present | skipped | n/a — no robots.txt>
Build-order (llms.txt → sitemap): <ok | warning: reorder <script>>
Manual next step — submit to public directories:
- https://llmstxt.site/submit
- https://directory.llmstxt.cloud
(No major LLM provider reads llms.txt as a first-class signal yet 🧪.
Directory submission + these head/sitemap/robots hints are the current
weak-signal stack for discovery. Web forms, manual action.)
Next:
- Verify the site serves /llms.txt at your production URL.
- Consider exposing markdown companion routes (see report).
- Re-run /geo-audit to confirm the llms.txt finding clears.
If --dry-run: emit the would-be file contents to terminal, write nothing.
Examples
Example 1: Generated llms.txt for a dev portfolio (Next.js blog)
# Charles Jones — charlesjones.dev
> Independent full-stack developer. Portfolio, technical writing on TypeScript / React / TanStack / .NET, and open-source Claude Code plugins.
Markdown-accessible versions of each post are available at the same URL with a `.md` suffix.
## About
- [About Charles](https://charlesjones.dev/about.md): bio, credentials, and contact.
## Portfolio
- [Claude Code Plugins Marketplace](https://charlesjones.dev/projects/claude-code-plugins.md): curated plugins for accessibility, security, SEO, and more.
- [AccessHawk.ai](https://charlesjones.dev/projects/accesshawk.md): runtime WCAG 2.2 testing service.
## Writing
- [Getting started with TanStack Start](https://charlesjones.dev/blog/tanstack-start-intro.md): SSR, routing, and data patterns.
- [Why GEO ≠ SEO](https://charlesjones.dev/blog/geo-vs-seo.md): how AI answer engines differ from traditional search.
## Optional
- [Changelog](https://charlesjones.dev/changelog.md): site and plugin updates.
Example 2: Validation output for a stale llms.txt
llms.txt Validation Report
==========================
Path: public/llms.txt
Size: 2,147 bytes
✅ H1 present: "# Example Site"
✅ Blockquote description present
✅ Valid markdown syntax
⚠️ Broken link: https://example.com/blog/old-post (404)
⚠️ 3 HTML links could point to .md companions instead
❌ Stale: 7 content files newer than llms.txt mtime (2025-11-02).
- src/content/blog/new-post.md
- src/content/blog/feature-announcement.md
- ...
Recommendation: run /geo-llms-txt with Update mode.
Example 3: Next.js llms-full.txt route
// app/llms-full.txt/route.ts
import { getAllContent } from '@/lib/content'
export async function GET() {
const pages = await getAllContent()
const body = [
'# Example Site — Full Content Export',
'',
`> Comprehensive markdown export of example.com. Generated on ${new Date().toISOString()}.`,
'',
'---',
'',
...pages.map(p => [
`## ${p.title}`,
'',
`Source: ${p.url}`,
`Last modified: ${p.modifiedTime}`,
'',
p.markdown,
'',
'---',
'',
].join('\n')),
].join('\n')
return new Response(body, {
headers: { 'Content-Type': 'text/markdown; charset=utf-8' },
})
}
Quality Assurance Checklist
Before finalizing:
- Context7 mode stated
- Framework + path determined
- Base domain resolved or prompted for
- H1 title + blockquote description present in generated llms.txt
- Links are absolute URLs
- .md companion URLs used where detected
- File size ≤ ~8KB for index; content moved to llms-full.txt if larger
- Validation report includes broken-link and staleness checks when applicable
- Framework-idiomatic placement (static file or route handler)
- --dry-run writes nothing
- Post-write: offered <head> link[rel=alternate] hint (skipped if present)
- Post-write: offered sitemap /llms.txt entry (skipped if present or no sitemap)
- Post-write: offered robots.txt # LLM index comment (skipped if present or no robots.txt)
- Build-order rule surfaced when both files are build-time generated (llms.txt → sitemap)
- Directory submission URLs printed in terminal summary (llmstxt.site, directory.llmstxt.cloud)
- User directed back to /geo-audit to verify the finding clears