# audit-site
Perform a comprehensive site audit: crawl, capture enrichment, analyze observations, and produce a findings report.
## When to Use
Use this skill when the user wants a thorough analysis of a website — not just downloading it, but understanding its technology stack, identifying issues, and producing actionable findings.
## Full Audit Workflow
### Phase 1: Configure for the Target
First, understand the site and configure appropriately:
```js
update_settings({
  settings: {
    maxConcurrent: 4,
    crawlDelay: 0.5,
    stripTrackingParams: true
  },
  policy: {
    scopeMode: "sameDomain",
    maxDepth: 5,
    respectRobotsTxt: true,
    includeSupportingFiles: true,
    downloadCrossDomainAssets: true,
    autoUpgradeHTTP: true
  }
})
```
Adjust based on site size (a large-site sketch follows the list):
- Small site (<100 pages): `maxDepth: 10`, `maxConcurrent: 8`
- Medium site (100–1000 pages): `maxDepth: 5`, `maxConcurrent: 4`
- Large site (1000+ pages): `maxDepth: 3`, `maxPagesPerCrawl: 500`
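For example, a large-site profile might be applied like this. This is a sketch that reuses only the keys shown above and assumes `update_settings` accepts partial updates; if it does not, pass the full `settings` and `policy` objects from Phase 1:

```js
update_settings({
  settings: { maxConcurrent: 4 },
  policy: {
    scopeMode: "sameDomain",
    maxDepth: 3,
    maxPagesPerCrawl: 500  // cap total pages so the crawl stays bounded
  }
})
```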
### Phase 2: Crawl
```js
start_crawl({ url: "https://example.com" })
```
Monitor until complete:
```js
get_crawl_status()
// Repeat with `since` parameter for efficient polling
```
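A polling loop might look like the following sketch. The `timestamp` and `state` fields are assumptions about the response shape, and whether `since` takes a timestamp or an opaque cursor is not specified here, so adapt to what `get_crawl_status` actually returns:

```js
// First call returns a full status snapshot
let status = get_crawl_status()

// Later calls pass `since` so only new activity comes back
// (field names below are illustrative, not documented)
status = get_crawl_status({ since: status.timestamp })

// Stop polling once the crawl reports completion
if (status.state === "complete") {
  // proceed to Phase 3
}
```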
### Phase 3: Capture Enrichment (if crawlio-agent available)
If the Chrome extension is running, enrichment data (framework detection, network requests, console logs) is captured automatically during the crawl and appended to the observation log.
### Phase 4: Analyze Observations
Query the observation timeline for patterns:
```js
// Get all observations for the target
get_observations({ host: "example.com", limit: 200 })

// Check framework detection
get_observations({ host: "example.com", source: "extension", limit: 50 })

// Review crawl results
get_observations({ op: "crawl_done" })

// Check for errors
get_errors()
get_failed_urls()
```
### Phase 5: Create Findings
For each issue or insight identified, create a finding with evidence:
```js
create_finding({
  title: "...",
  url: "https://example.com",
  evidence: ["obs_xxx", "obs_yyy"],
  synthesis: "..."
})
```
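A filled-in call might look like this sketch; the observation IDs and wording are illustrative, so substitute the real IDs returned by `get_observations` in Phase 4:

```js
create_finding({
  title: "Mixed content on product pages",
  url: "https://example.com/products",
  // IDs are placeholders; use real ones from get_observations()
  evidence: ["obs_a3f7b2c1", "obs_b4e8c3d2"],
  synthesis: "Product pages served over HTTPS load several images over plain HTTP, triggering mixed-content warnings in the browser."
})
```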
### Phase 6: Generate Report
Compile findings into a structured report:
```js
get_findings({ host: "example.com" })
```
Present as a summary with:
- Technology Stack: Framework, rendering mode, CDN
- Issues Found: Grouped by category (security, performance, SEO, errors)
- Site Structure: Tree overview, orphaned pages, broken links
- Recommendations: Prioritized action items
## Audit Checklist
Use this checklist to ensure thorough coverage (a sketch mapping a few items onto query tools follows the list):
**Technology**
- Framework detected (React, Vue, Next.js, etc.)
- Rendering mode identified (SSR, SSG, CSR, ISR)
- CDN/hosting identified
- Third-party services cataloged
**Performance**
- Page count and crawl duration noted
- Large files identified (images, videos, scripts)
- External resource dependencies mapped
**Security**
- HTTPS enforcement checked
- Mixed content identified
- Security headers reviewed (if captured in network data)
**Content**
- Failed URLs reviewed and categorized
- Redirect chains identified
- Missing resources noted
**Structure**
- Site tree reviewed for organization
- Depth distribution analyzed
- Cross-domain assets cataloged
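Several of these items map directly onto the Phase 4 query tools. A minimal sketch, using only the calls shown above with illustrative filter values:

```js
// Technology: framework and CDN signals come from extension-sourced observations
get_observations({ host: "example.com", source: "extension", limit: 50 })

// Content: review and categorize failures (auth-required, timeouts, broken links)
get_failed_urls()

// Errors that feed the Security and Content items
get_errors()
```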
## Example Report Structure

```markdown
# Site Audit: example.com

## Summary
- **Pages crawled**: 142
- **Failed**: 4 (3 auth-required API routes, 1 timeout)
- **Framework**: Next.js (App Router, ISR)
- **Total size**: 23.4 MB

## Findings (5)

### Security
1. **Mixed content on 3 pages** — HTTP images loaded on HTTPS pages
   Evidence: obs_a3f7b2c1, obs_b4e8c3d2

### Performance
2. **Unoptimized images** — 8 images over 500KB without modern formats
   Evidence: obs_c5f9d4e3

### SEO
3. **Missing meta descriptions** — 12 pages have no meta description
4. **Duplicate titles** — 3 pairs of pages share identical titles

### Errors
5. **Console errors on product pages** — TypeError in cart widget
   Evidence: obs_d6g0e5f4

## Recommendations
1. Convert images to WebP/AVIF format
2. Fix mixed content by updating CDN URLs to HTTPS
3. Add unique meta descriptions to all pages
4. Fix TypeError in cart widget JavaScript
```
## Tips
- Save the project after crawling so you can revisit: `save_project({ name: "example.com audit" })`
- For recurring audits, load the project and recrawl to compare (see the sketch after this list): `load_project({ id: "..." })`
- Use `export_site({ format: "folder" })` to archive the downloaded content
- Combine with crawlio-agent browser interaction tools for authenticated areas
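A recurring re-audit might chain the calls like this sketch; comparing runs is a manual step here, since no diff tool is documented above:

```js
// Reload a previously saved audit project (keep the id returned by save_project)
load_project({ id: "..." })

// Recrawl the same target and monitor as in Phase 2
start_crawl({ url: "https://example.com" })
get_crawl_status()

// Pull findings again and compare them with the previous run by hand
get_findings({ host: "example.com" })
```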