Methodology

How we measure agent readability

The CovenAI v2.0.1 score quantifies how well a website is structured, citable, discoverable, accessible to AI agents, and transactable by them. We measure these signals; we don't transact for you. Open rubric, paid tooling. This document describes what we measure, how scores are constructed, and what the corpus data reveals.

Updated 5 May 2026 · 5 dimensions · 938-site corpus

What the v2.0.1 score measures

As AI systems — large language models, autonomous agents, shopping assistants, and search overviews — become primary intermediaries between people and the web, a site's ability to be read and cited by these systems matters as much as its ability to rank in a traditional search index.

A site can rank well in conventional search results and still be effectively unreadable to AI-driven discovery. The reasons are structural: AI systems evaluate content differently from keyword-matching algorithms. They assess whether content is machine-readable, structurally coherent, semantically clear, demonstrably authoritative, and accessible to the crawlers that feed their training and retrieval pipelines.

The CovenAI v2.0.1 score was built to make these signals measurable. A high score indicates that a site is well-positioned to appear in AI-generated responses, be cited in LLM outputs, and be transacted with by purchasing agents. A low score reveals specific, fixable gaps in how the site presents itself to non-human systems. The score is available free via the scan tool and updated continuously in Agent Analytics for monitored sites.


The five scored dimensions

Each site is evaluated across five independent scored dimensions. Scores for each dimension range from 0 to 100. The composite score is the weighted sum of all five and ranges from 0 to 100. Weights reflect the relative influence each dimension has on observed AI citation and transaction behaviour.

A sixth signal, Agent Correlation, is also measured and reported below for transparency, but contributes 0% to the public score. We treat it as a research signal because it can be inflated with synthetic agent traffic; using it in the headline number would reward gaming over genuine agent-readiness.
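
As a minimal sketch of the weighted sum described above, the Python below combines five dimension scores using the percentages listed for each dimension in this document (25/25/15/20/15). The function name and dictionary keys are illustrative assumptions, not CovenAI's implementation, and the sketch ignores any corpus normalisation applied to the published score.

```python
# Weights as listed for each scored dimension in this document (sum to 1.0).
# Agent Correlation is deliberately absent: it contributes 0% to the score.
WEIGHTS = {
    "structure": 0.25,
    "citability": 0.25,
    "discoverability": 0.15,
    "agent_interface": 0.20,
    "transactability": 0.15,
}


def composite_score(dimension_scores: dict[str, float]) -> float:
    """Return the weighted sum of the five 0-100 dimension scores."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        # A site with an unevaluated dimension has no complete score.
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= dimension_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    # Extra keys (for example a research signal) are ignored.
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
```

For example, dimension scores of 40, 30, 70, 10 and 0 (in the order above) combine to 10 + 7.5 + 10.5 + 2 + 0 = 30.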

Dimension 01 — 25%
Structure
Evaluates how well the page's information architecture serves automated comprehension. Content that is logically ordered, properly headed, and semantically marked up is far easier for AI systems to parse, summarise, and excerpt accurately. Structure is consistently the most surprising low-score driver: pages with genuinely good content often score poorly because of markup issues that are invisible to human readers but critical to AI agents.
  • Real heading hierarchy — <h2> tags, not styled <p> elements or <details> accordions
  • Structured lists — <ul>/<ol> elements, not comma-separated prose
  • Semantic wrappers — <section> and <article> to define content boundaries
  • Answer-first pattern — direct answer in the opening paragraph
  • Content depth — minimum word count for citation eligibility
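
The markup checks in the list above can be approximated with Python's standard-library HTML parser. This is a rough sketch that only counts tags; the class name, function name, and result keys are hypothetical, and it does not reproduce how the scanner weighs these signals.

```python
from html.parser import HTMLParser


class StructureCounter(HTMLParser):
    """Counts the structural tags mentioned in the Structure checklist."""

    TRACKED = {"h1", "h2", "h3", "h4", "h5", "h6",
               "ul", "ol", "section", "article", "details"}

    def __init__(self) -> None:
        super().__init__()
        self.counts: dict[str, int] = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag in self.TRACKED:
            self.counts[tag] = self.counts.get(tag, 0) + 1


def structure_signals(html: str) -> dict[str, int]:
    """Summarise heading, list, and wrapper usage for one HTML document."""
    parser = StructureCounter()
    parser.feed(html)
    parser.close()
    c = parser.counts
    return {
        "headings": sum(c.get(f"h{n}", 0) for n in range(1, 7)),
        "lists": c.get("ul", 0) + c.get("ol", 0),
        "semantic_wrappers": c.get("section", 0) + c.get("article", 0),
        "details_accordions": c.get("details", 0),
    }
```

An FAQ built only from <details> accordions yields zero headings under this count, while the same content wrapped in <article> with an <h2> and a <ul> registers one of each.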
Dimension 02 — 25%
Citability
Measures the signals that AI citation systems use to determine whether a source is trustworthy and worth referencing. Combines structured data richness, author attribution, freshness signals, and E-E-A-T markers into a single citability score. A page can be well-structured and still score poorly on citability if it lacks the provenance signals that LLMs use to assess reliability.
  • JSON-LD presence, type specificity, and entity completeness
  • datePublished and dateModified in structured data
  • Author attribution and credentials
  • About and contact information
  • Transparency signals — methodology, data sourcing, external references
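
To illustrate the structured-data checks above, the following sketch extracts JSON-LD blocks from a page and reports whether the date and author properties are present. It uses only the Python standard library; the names are hypothetical, it inspects top-level objects only (no @graph traversal), and it is not the scanner's actual logic.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed <script type="application/ld+json"> blocks."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Invalid JSON-LD is treated as absent.


def citability_signals(html: str) -> dict:
    """Report which provenance properties appear in the page's JSON-LD."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    items = []
    for block in parser.blocks:
        items.extend(block if isinstance(block, list) else [block])
    items = [item for item in items if isinstance(item, dict)]
    return {
        "has_json_ld": bool(items),
        "types": sorted(str(item["@type"]) for item in items if "@type" in item),
        "has_date_published": any("datePublished" in item for item in items),
        "has_date_modified": any("dateModified" in item for item in items),
        "has_author": any("author" in item for item in items),
    }
```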
Dimension 03 — 15%
Discoverability
Covers the foundational technical requirements that determine whether AI crawlers can reliably access and process a site. Strong discoverability scores are necessary but not sufficient — they enable the other dimensions to be evaluated at all. The most common discoverability issue is unintentionally blocking AI crawlers via robots.txt.
  • robots.txt — all nine major AI crawlers permitted
  • llms.txt presence and validity
  • Sitemap availability and freshness
  • HTTPS and canonical tag hygiene
  • Crawl accessibility — no login walls or JS-only rendering on key pages
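
The robots.txt check above can be sketched with Python's urllib.robotparser, using the nine agent names this document lists under Agent Correlation. The function name is an assumption, and the standard-library parser's matching rules may differ from how each crawler actually interprets robots.txt, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser

# The nine AI agent systems named in this document.
AI_CRAWLERS = [
    "GPTBot",
    "OAI-SearchBot",
    "ChatGPT-User",
    "ClaudeBot",
    "Claude-SearchBot",
    "Claude-User",
    "PerplexityBot",
    "Google-Extended",
    "Gemini-Deep-Research",
]


def blocked_crawlers(robots_txt: str, path: str = "/") -> list[str]:
    """Return the AI crawlers that robots.txt forbids from fetching `path`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, path)]
```

An empty list means all nine crawlers are permitted for that path; any names returned point to the rules worth reviewing.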
Dimension 04 — 20%
Agent Interface
Measures whether the site exposes machine-readable protocol surfaces designed for AI agent interaction. As agentic systems move from passive reading to active querying, sites that publish predictable, discoverable interfaces have a structural advantage. This is the dimension with no analogue in traditional SEO — and where most sites have nothing at all.
  • MCP server card (Model Context Protocol discovery)
  • Agent Skills index
  • API Catalog (RFC 9727)
  • OAuth Authorisation Server metadata (RFC 8414)
  • Markdown content negotiation, Web Bot Auth, RFC 8288 link headers
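
As a hedged illustration of probing some of these surfaces, the sketch below builds (but does not send) requests for the well-known URIs defined by RFC 9727 and RFC 8414, plus a request that asks for Markdown via the Accept header. The function and dictionary key names are hypothetical. The MCP server card and Agent Skills index are omitted because this document does not state their locations.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Well-known paths registered by the RFCs cited in the list above.
WELL_KNOWN_PATHS = {
    "api_catalog": "/.well-known/api-catalog",                       # RFC 9727
    "oauth_as_metadata": "/.well-known/oauth-authorization-server",  # RFC 8414
}


def build_probes(origin: str) -> dict[str, Request]:
    """Prepare probe requests for one origin without performing any I/O."""
    probes = {
        name: Request(urljoin(origin, path), method="GET")
        for name, path in WELL_KNOWN_PATHS.items()
    }
    # Markdown content negotiation: same page, different Accept header.
    probes["markdown"] = Request(
        urljoin(origin, "/"),
        headers={"Accept": "text/markdown"},
        method="GET",
    )
    return probes
```

Each prepared request could then be passed to urllib.request.urlopen; a 200 response with a suitable content type would suggest the surface exists, though what counts as passing is the rubric's decision rather than this sketch's.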
Dimension 05 — 15%
Transactability
Measures whether agents can buy from your site: whether the site exposes the machine-readable commerce surfaces that purchasing agents look for, namely an agent policy, structured offers, an x402 payment shape, MCP payment-capable tools, and DNS records announcing payment endpoints. We measure these signals; we don't transact for you. Open rubric, paid tooling: the rubric is published here, and the tooling that helps you build these surfaces lives in Agent Analytics.
  • Agent policy file declaring transaction posture
  • Structured offers endpoint (catalogue discoverable by agents)
  • HTTP 402 response shape on price-gated resources
  • MCP manifest declaring payment-capable tools
  • DNS TXT record announcing the payment endpoint
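
One way to picture the HTTP 402 check is a small predicate over a response's status and body. This is a loose sketch under stated assumptions: the function name is hypothetical, and the "accepts" key used for the list of payment options is an assumption about the response body, not something this rubric specifies.

```python
import json


def looks_like_payment_required(status: int, body: str,
                                options_key: str = "accepts") -> bool:
    """Heuristic: a 402 whose JSON body lists at least one payment option.

    `options_key` defaults to "accepts", which is an ASSUMED field name.
    """
    if status != 402:
        return False
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return False  # Non-JSON bodies give agents nothing to act on.
    if not isinstance(payload, dict):
        return False
    options = payload.get(options_key)
    return isinstance(options, list) and len(options) > 0
```

A plain HTML paywall page returned with status 200, or a 402 with an empty body, would both fail this check.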

Agent Correlation · research signal · 0% to public score

We also observe live agent traffic against scanned sites: visit recency, agent-type diversity, and dwell signals across the nine AI agent systems we track (GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-SearchBot, Claude-User, PerplexityBot, Google-Extended, Gemini-Deep-Research). This data is reported alongside the score for transparency, but contributes 0% to the public number. Including it would reward sites with synthetic agent traffic over sites with genuine agent-readiness; we treat it as a confidence signal we use to validate the rubric, not as a public scoring input.


Score bands and what they mean

The composite score runs from 0 to 100. Scores are normalised against observed performance across the CovenAI corpus, which is updated as new sites are added. The bands below reflect meaningful thresholds in real-world agent readability outcomes.

Band | Score range | What it means
Bad | 0 – 39 | Significant structural or technical barriers to AI discovery. A site in this band is poorly read by most AI systems and will not surface reliably in AI-generated responses. High-priority remediation recommended.
Needs Work | 40 – 59 | Partial readability. Some AI systems may encounter the site, but inconsistent signals reduce citation likelihood. Targeted improvements to the lowest-scoring dimensions will have the highest impact.
Good with caveats | 60 – 79 | Solid foundation. The site is legible to most AI systems and reasonably likely to appear in relevant AI-generated responses. One or more dimensions still have meaningful gaps worth addressing.
Good | 80 – 100 | Strong agent readability. The site is well-structured, citable, discoverable, and accessible to AI systems. Well-positioned to appear in AI-generated responses and agentic discovery pipelines.
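
The band thresholds can be expressed as a simple lookup. In this sketch the function name is hypothetical, and treating each band's lower bound as inclusive for fractional scores (so 39.5 falls in Bad) is an assumption, since the ranges are published as whole numbers.

```python
# Lower bound of each band, highest first, as published in the bands table.
BANDS = [
    (80, "Good"),
    (60, "Good with caveats"),
    (40, "Needs Work"),
    (0, "Bad"),
]


def score_band(score: float) -> str:
    """Map a 0-100 composite score to its published band name."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, name in BANDS:
        if score >= floor:
            return name
    raise AssertionError("unreachable: the final band starts at 0")
```

Applied to figures quoted in this document, a score of 47.4 maps to Needs Work and a score of 76 maps to Good with caveats.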

Global corpus: April 2026

In April 2026, CovenAI ran the v2.0.1 scoring methodology across 938 sites spanning a range of industries, regions, and site types. Of those, 488 returned a complete score across all five scored dimensions. The remainder had one or more dimensions that could not be evaluated — typically due to crawl access restrictions or insufficient content depth. The corpus continues to grow as sites connect to Agent Analytics monitoring; the figures below reflect the April 2026 snapshot.

The results show that most sites are partially readable by AI systems but fall short on the signals that drive citation: structured data, clear authorship, fresh date markup, and permissive AI crawler access.

Sites in corpus: 938
Average score (0–100): 47.4
Median score: 50
Highest score recorded: 76

The average score of 47.4 falls in the Needs Work band. The median of 50 sits close to the average, which suggests the distribution is not heavily skewed towards either extreme. The highest score recorded was 76, in the upper part of the Good with caveats band and short of the Good band (80–100): a reminder that even well-optimised sites tend to have meaningful gaps in at least one dimension.

The most consistent low-score drivers across the corpus are absent citability signals (missing structured data, no author attribution, no date metadata) and restrictive robots.txt configurations that block one or more major AI crawlers.


What we analyse

Scores are derived from live page analysis conducted by CovenAI’s scanning infrastructure — including Coven-Citability-Bot, our own web agent — at the time of assessment. Scores reflect the state of a site at scan time and will change as the site evolves. Sites connected to Agent Analytics receive updated scores on a regular cadence.

The analysis draws on signals from the publicly accessible version of each page as seen by a standard web client, structured data validators, the Agent Diagnostic Layer (ADL) for agent identity and traffic classification, crawl behaviour logs (for the Agent Correlation research signal), Transactability surface probes, and heuristic evaluation of content quality signals. No proprietary or authenticated data is used; scores reflect only what AI systems themselves can observe.

Industry benchmark data is aggregated and anonymised. Individual site scores are not disclosed in public reports.


How to improve your score

Because scores decompose into five independent dimensions, improvement is systematic rather than speculative. The highest-impact actions are almost always in the lowest-scoring dimensions. Across the corpus, the most consistent low-score dimensions are Citability and Discoverability — both addressable with focused, low-effort changes.

Structure is consistently the most surprising high-impact area. Pages with genuinely good content often score poorly on structure because of invisible markup issues: FAQ sections built with <details> accordions register zero headings to an AI agent; section titles styled as <p> tags are indistinguishable from body copy; comma-separated item lists score nothing where a <ul> would score full points. These are low-effort fixes with outsized score impact.

Run a free scan to see your site's current scores across all five dimensions. Agent Analytics provides continuous monitoring and updated scores as your site evolves.
