www.facebook.com
Metrics
Diagnostics
97% of content requires JS · 3% of rendered content recovered (rest is placeholder/wrong)
Fix: Server-render or statically generate the main content so a non-JS agent still receives it; make client rendering a progressive enhancement, not the source of truth.
signal 0.81 · JSON-LD 0/1 · missing: structured-data
Fix: Wrap the real content in <main>/<article>, cut repeated nav/boilerplate, and keep the primary content dense and early in the DOM.
Rendered profile: headless
Metrics
Diagnostics
86% of content requires JS · 14% of rendered content recovered (rest is placeholder/wrong)
Fix: Server-render or statically generate the main content so a non-JS agent still receives it; make client rendering a progressive enhancement, not the source of truth.
signal 1.00 · JSON-LD 0/1 · missing: structured-data
Fix: Wrap the real content in <main>/<article>, cut repeated nav/boilerplate, and keep the primary content dense and early in the DOM.
Rendered profile: headless
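The "rendered content recovered" figures above compare what a non-JS fetch receives with what the fully rendered page shows. The audit's exact formula is not stated here, so the following is only a rough sketch of one plausible approximation: the share of rendered words that also appear in the raw HTML text. The function names `words` and `recovered_fraction` are hypothetical.

```python
import re


def words(text: str) -> list[str]:
    """Lowercase alphanumeric tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def recovered_fraction(raw_text: str, rendered_text: str) -> float:
    """Share of rendered words that a non-JS agent would also see.

    Assumption: both arguments are already plain text (tags stripped).
    This is a word-overlap heuristic, not the audit's own metric.
    """
    rendered = words(rendered_text)
    if not rendered:
        return 1.0  # nothing was rendered, so nothing is missing
    raw = set(words(raw_text))
    return sum(1 for w in rendered if w in raw) / len(rendered)
```

A value near 0 suggests that almost all visible content is produced by client-side JavaScript, which is the situation the server-rendering fix is meant to address.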
Access & discovery checks (separate from the gated CAV metrics above).
Agent files & endpoints
Issues (6)
✗ robots.txt allows AI bots · high impact · Blocks: * (all)
Business impact: If robots.txt blocks AI crawlers, you are invisible to ChatGPT, Claude and Perplexity: they skip you and recommend a competitor instead.
What we measured: We read /robots.txt and test it against 16 AI user-agents (GPTBot, ClaudeBot, PerplexityBot, …) for a Disallow that blocks them.
How to fix: Allow major AI bots to reach public content; restrict only private paths (/admin, /api).
User-agent: GPTBot
Allow: /
Disallow: /admin/
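A check like the one described can be sketched with Python's standard `urllib.robotparser`. The agent list below is only an illustrative subset (the full list of 16 is not given in this report), and `blocked_agents` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset; the audit's full list of 16 agents is not shown here.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_agents(robots_txt: str, path: str = "/", agents=AI_AGENTS) -> list[str]:
    """Return the agents that the given robots.txt text forbids from `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


if __name__ == "__main__":
    # A blanket block, as reported above ("Blocks: * (all)").
    print(blocked_agents("User-agent: *\nDisallow: /\n"))
```

Note that Python's parser applies rules in file order, so for paths matched by both an Allow and a Disallow line its answer may differ from crawlers that use longest-match precedence.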
✗ No CAPTCHA wall · high impact · Detected: recaptcha
Business impact: CAPTCHAs stop bots, including the AI agents your customers send to shop or book. Content behind a challenge is unreachable.
What we measured: We fingerprint reCAPTCHA, hCaptcha and Cloudflare Turnstile in the page.
How to fix: Reserve CAPTCHA for login/checkout flows, never public content pages.
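Fingerprinting of this kind can be approximated by searching the HTML for markers each widget commonly leaves behind. The marker strings below are assumptions based on the widgets' usual script hosts and CSS class names, not the audit's actual rules, and `detect_captchas` is a hypothetical helper.

```python
# Assumed markers: typical script hosts and container class names.
CAPTCHA_MARKERS = {
    "recaptcha": ("google.com/recaptcha", "g-recaptcha"),
    "hcaptcha": ("hcaptcha.com", "h-captcha"),
    "turnstile": ("challenges.cloudflare.com/turnstile", "cf-turnstile"),
}


def detect_captchas(html: str) -> list[str]:
    """Return the names of CAPTCHA providers whose markers appear in `html`."""
    lowered = html.lower()
    return [
        name
        for name, markers in CAPTCHA_MARKERS.items()
        if any(marker in lowered for marker in markers)
    ]
```

A substring match only shows that the widget is referenced on the page; it does not tell whether the challenge actually gates the public content, so a hit is best treated as a prompt for manual review.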
~ No content-blocking cookie wall · medium impact · Consent platform: cookieconsent (verify it doesn't block content)
Business impact: A consent wall that hides content until a click makes that content invisible to agents, because they cannot click 'Accept'.
What we measured: We fingerprint OneTrust, Cookiebot, Usercentrics and similar managers and flag content-blocking ones.
How to fix: Use an overlay banner that leaves content in the DOM, not a blocking interstitial.
Spec: https://gdpr.eu/cookies/
~ llms.txt present + valid · high impact · Found at /llms.txt but missing H1/blockquote
Business impact: llms.txt plays a role for AI agents loosely similar to robots.txt for crawlers: it tells agents what your site is, what matters, and where to find it. Without it, AI guesses, and guessing means inaccurate recommendations and lost visibility.
What we measured: We fetch /llms.txt and /.well-known/llms.txt and validate the spec (H1 title + a one-line blockquote summary). We also note /llms-full.txt (your full content as Markdown).
How to fix: Create /llms.txt with an H1 title, a short blockquote summary, and key pages; optionally add /llms-full.txt with full content in Markdown.
# Your Site
> One-line description for AI agents.
## Key pages
- /products — catalog
- /pricing — plans
- /docs — documentation
Spec: https://llmstxt.org
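The two structural checks named above (an H1 title and a blockquote summary) can be sketched as follows. This is a minimal approximation, not a full validator for the llms.txt proposal, and `validate_llms_txt` is a hypothetical name.

```python
def validate_llms_txt(text: str) -> list[str]:
    """Return a list of structural problems; an empty list means both checks pass."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    problems = []
    # The first non-empty line should be a single-# Markdown heading.
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 title")
    # Some line should be a Markdown blockquote carrying the summary.
    if not any(line.startswith(">") for line in lines):
        problems.append("missing blockquote summary")
    return problems
```

Run against the sample file above, it reports no problems; a file consisting only of a link list would trigger both messages, which matches the "missing H1/blockquote" finding.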
✗ Structured data (JSON-LD) · medium impact · No JSON-LD found
Business impact: Schema.org JSON-LD tells agents what a page IS (product, article, business) with typed fields (price, rating, hours). Without it, agents extract less reliably.
What we measured: We parse <script type="application/ld+json">, validate it, and check for populated @type fields.
How to fix: Add JSON-LD: Organization/LocalBusiness on the homepage, Product on product pages, Article on posts.
<script type="application/ld+json">{"@context":"https://schema.org","@type":"Organization","name":"Your Co","url":"https://example.com"}</script>
Spec: https://schema.org/
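The measurement described (find JSON-LD script blocks, parse them, check for a populated @type) can be sketched with the standard library alone. This is an assumption about how such a check might work, not the audit's code; `JsonLdCollector` and `jsonld_types` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def jsonld_types(html: str) -> list:
    """Return the @type value of every parseable top-level JSON-LD item."""
    collector = JsonLdCollector()
    collector.feed(html)
    types = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON counts as no usable structured data
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type"):
                types.append(item["@type"])
    return types
```

An empty result corresponds to the "No JSON-LD found" finding. The sketch does not validate fields against Schema.org vocabularies; it only confirms that typed items are present.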
~ XML sitemap present · medium impact · Sitemap found but no <url>/<loc> entries
Business impact: A sitemap is your table of contents for AI crawlers. Without it, agents follow homepage links and miss deep pages (products, docs, pricing), which shrinks what they can recommend.
What we measured: We fetch /sitemap.xml (and /sitemap_index.xml), confirm valid XML with <loc> entries, and check <lastmod> freshness.
How to fix: Generate an XML sitemap of all public pages with current lastmod dates and reference it in robots.txt.
# robots.txt
Sitemap: https://example.com/sitemap.xml
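The "no <url>/<loc> entries" finding can be reproduced with a short parser over the sitemap XML. The sketch below assumes the standard sitemaps.org namespace and handles only a plain urlset, not a sitemap index; `sitemap_entries` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_entries(xml_text: str) -> list[tuple]:
    """Return (loc, lastmod) pairs for every <url> that has a non-empty <loc>."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.iter(f"{SITEMAP_NS}url"):
        loc = url.findtext(f"{SITEMAP_NS}loc")
        if loc and loc.strip():
            lastmod = url.findtext(f"{SITEMAP_NS}lastmod")
            entries.append((loc.strip(), lastmod.strip() if lastmod else None))
    return entries
```

An empty list for a file that otherwise parses as XML matches the warning above; entries whose lastmod is None or old would be candidates for the freshness check.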
Passed audits (6)
A deeper scan (a second render, ~30–60s): network waterfall, unused JavaScript, long tasks, and prioritized fixes. Runs only when you ask; the result is cached so it never re-runs.