Why Your Site's Structure Is Costing You AI Citations (And It's Not What You Think)
Tonight we ran a citability audit on a real site and moved its score from 50 to 73 in a single session. We changed almost no content. The writing was already good. The information was accurate and specific. What we changed was the markup — the invisible layer that determines whether an AI agent can read the page at all. Every structural problem we found and fixed is documented below, because the same issues are almost certainly present on your site too.
The Gap Between What Humans See and What AI Agents Read
Visual hierarchy and semantic hierarchy are not the same thing. A browser renders CSS. An AI parser reads the DOM. You can have a beautifully designed page with clear section headers, a logical information flow, and a complete FAQ — and an AI agent will see a structurally empty document if those elements are built with the wrong tags.
This is the same transition that happened with mobile responsiveness in 2010. Websites weren't wrong before responsive design — they were built for one consumption layer. The content was fine; the structure wasn't built for the new way it was being read. We're in an identical moment now, except the new consumption layer is automated agents rather than small screens.
The score breakdown from tonight's session shows what changed and by how much:
- Overall citability score: 50 → 73
- Structure subscore: 52 → 70
- Schema subscore: 33 → 73
- E-E-A-T subscore: 28 → 88
- Every page rated Good by end of session
None of those gains came from rewriting copy. They came from fixing structural and semantic issues that were hiding otherwise solid content from AI parsers.
Finding 1: The Accordion Problem
The site had a detailed FAQ section. Ten questions, each with a thorough answer. All of it was built with <details> and <summary> elements — a common, accessible pattern for collapsible content.
To an AI structure parser, every one of those answers was invisible.
The <details> element hides its child content by default. When an agent reads the page looking for headings and text content, collapsed answers register as zero readable content and zero headings — the same score as a blank page. Ten detailed, well-written answers contributed exactly nothing to the structure score.
What we changed
- Converted the primary FAQ section from <details>/<summary> to always-visible <h3> + <p> elements
- Kept a compact toggle-based FAQ lower on the page for secondary questions where visual brevity mattered more than citability
- Added the same Q&A content to a FAQPage JSON-LD schema block
The result: the structure parser now sees ten headings and ten blocks of readable content where it previously saw nothing. The FAQ content began contributing to the site's citability score immediately.
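A minimal sketch of the conversion. The question and answer text here are illustrative placeholders, not the site's actual content:

```html
<!-- Before: the answer is collapsed by default and invisible to structure parsers -->
<details>
  <summary>How long does onboarding take?</summary>
  <p>Most teams are fully onboarded within two weeks.</p>
</details>

<!-- After: heading and answer are always present in the readable DOM -->
<h3>How long does onboarding take?</h3>
<p>Most teams are fully onboarded within two weeks.</p>

<!-- The same Q&A mirrored into a FAQPage JSON-LD block -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "How long does onboarding take?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Most teams are fully onboarded within two weeks."
    }
  }]
}
</script>
```

The visible markup and the JSON-LD carry the same content deliberately: one serves the structure parser, the other serves schema extractors.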
Finding 2: The Styled-Paragraph Trap
Several section titles on the site were built as paragraphs with a CSS class: <p class="section-title">. Visually, they looked identical to headings — large, bold, properly spaced. To every human visitor, the page had clear, structured sections.
To an AI agent, those titles were body copy. The agent counts headings by tag, not by appearance. A <p> element styled to look like an <h2> still registers as a paragraph. The page's heading count was a fraction of what its visual design implied.
The same problem appears in other forms
- <div class="title"> or <div class="heading"> — any div used as a section label, regardless of class name
- <span> elements styled as inline headings
- <strong> tags used as stand-alone section markers at the start of a paragraph
- Custom web components that render heading-like text without using native heading elements
The fix is straightforward: use real <h2>, <h3>, and <h4> tags for anything that functions as a heading. Keep the CSS styling if the visual design requires it. Only the tag matters to the parser.
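The fix in miniature, with a hypothetical class name and title text:

```html
<!-- Before: looks like a heading in the browser, parses as body copy -->
<p class="section-title">Monitoring Capabilities</p>

<!-- After: a real heading tag; the existing class and its CSS can stay unchanged -->
<h2 class="section-title">Monitoring Capabilities</h2>
```

Because the class is preserved, the rendered page is pixel-identical before and after — only the tag changes, and only the tag is what the parser counts.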
Finding 3: The Prose List Problem
AI agents extract list items as discrete structured facts. A sentence like "The platform includes monitoring, reporting, and alerting" is processed as a single unit of text — one fact, loosely parseable. The same three items presented as a <ul> with three <li> elements score full points for list structure, with each item registered as an individually extractable claim.
This matters because the primary use case for AI citations is answering specific questions. An agent looking for "what does this platform monitor?" will preferentially cite a page that presents a structured list of monitoring capabilities over a page that mentions them in flowing prose. Structured data is citation-ready data.
Where we found prose lists on the site
- Feature descriptions written as run-on sentences with commas or em-dashes separating items
- Benefit sections written as paragraphs rather than bullet points
- Process descriptions that walked through steps in paragraph form instead of an <ol>
- Pricing inclusions listed inline within a paragraph
Converting the most fact-dense prose lists to <ul> or <ol> elements was one of the highest-impact changes of the session — the structure score responded immediately, and the content became machine-readable in a way it simply wasn't before.
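The conversion pattern, using the example sentence from above:

```html
<!-- Before: one loosely parseable unit of text -->
<p>The platform includes monitoring, reporting, and alerting.</p>

<!-- After: three individually extractable claims -->
<p>The platform includes:</p>
<ul>
  <li>Monitoring</li>
  <li>Reporting</li>
  <li>Alerting</li>
</ul>
```

In a real conversion each list item would carry a short description rather than a single word, but the structural point is the same: each <li> becomes a discrete fact an agent can cite on its own.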
Finding 4: Nested Schema Is Invisible
The site had schema markup. An Article block with an author property containing a Person object. An AboutPage block with an about property containing an Organization object. On paper, the entities were described. In practice, they were invisible to most schema extractors.
The issue is that many schema extractors only walk top-level @type declarations. When a Person entity appears nested inside an Article block rather than as its own standalone <script type="application/ld+json">, those extractors don't surface it as a known entity. The person and the organisation existed only as properties of other types — never as first-class declared entities the AI could reference independently.
What top-level schema blocks the site was missing
- Person — author identity with name, job title, and employer
- Organization — company identity with name, URL, description, and contact
- WebPage — page-level metadata separate from Article
- BreadcrumbList — navigation context for the page's position in the site
- FAQPage — questions and answers as extractable structured data
Each of these became its own <script type="application/ld+json"> block in the <head>, with its own top-level @type declaration. The schema subscore jumped from 33 to 73 as a result — the largest single gain of the session.
The practical rule: every entity you want an AI to recognise needs its own top-level block. Nesting is fine as a cross-reference (author inside Article can still point to the person), but the standalone declaration must exist independently.
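A sketch of that rule in practice. The names, URL, and @id values are hypothetical, but the pattern — standalone declaration plus cross-reference — is the one described above:

```html
<!-- Standalone Person block: a first-class entity extractors can surface -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://example.com/#author",
  "name": "Jane Doe",
  "jobTitle": "Head of Research",
  "worksFor": { "@type": "Organization", "name": "Example Co" }
}
</script>

<!-- The Article keeps its author property, now as a cross-reference by @id -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example article",
  "author": { "@id": "https://example.com/#author" }
}
</script>
```

Extractors that only walk top-level @type declarations now see the Person directly, while extractors that resolve references still get the full author relationship.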
Finding 5: Content Depth Threshold
Several pages on the site were thin — well-written and accurate, but under approximately 500 words. These pages were automatically penalised in citability scoring regardless of the quality of what they contained.
The analogy is Google News eligibility: there's a minimum content threshold before a page qualifies for inclusion, independent of the topic's importance or the writing's accuracy. AI citation systems apply a similar filter. A short page doesn't qualify as a citation source in the same way a short document doesn't qualify as a reference in an academic bibliography — not because it's wrong, but because it lacks the depth that signals authority.
The right fix is not padding
- Add genuine supporting context: the reasoning behind a claim, not just the claim
- Include concrete examples where the page currently only states principles
- Expand definitions — don't assume the reader knows the terminology
- Add related questions and answers that a reader might naturally have after reading the primary content
On the pages we worked on, the content expansion served dual purpose: it moved the pages above the citability threshold and made them genuinely more useful to human readers. There is no tension between the two goals when you expand with substance.
Finding 6: Semantic Wrappers Add Points
This one is small relative to the others, but it compounds. <section> and <article> elements help agents identify content boundaries — where a topic starts, where it ends, and how one block of content relates to another. A <div> provides no such signal.
A page built entirely with <div> containers for layout sections gives an AI agent no structural scaffolding beyond the heading hierarchy. A page that wraps major content blocks in <section> tags, puts the primary content in an <article>, uses a <nav> for navigation, and identifies the page's main content area with <main> is broadcasting its structure explicitly. Agents don't have to infer it.
Semantic elements that contribute to structure scores
- <article> — identifies self-contained, independently distributable content
- <section> — marks a thematically grouped block within a larger document
- <main> — identifies the dominant content of the page, distinct from navigation and footers
- <nav> — explicitly marks navigation blocks so agents skip them when parsing content
- <aside> — marks supplementary content, preventing it from being weighted as primary
- <header> and <footer> — frame the content so agents know what's metadata and what's substance
None of these require visual design changes. They're structural labels that cost nothing to add and consistently improve how agents parse the page.
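The elements above fit together as a page skeleton along these lines (a generic sketch, not the audited site's actual layout):

```html
<body>
  <header>…site header…</header>
  <nav>…primary navigation…</nav>
  <main>
    <article>
      <h1>Page title</h1>
      <section>
        <h2>First topic</h2>
        <p>…</p>
      </section>
      <section>
        <h2>Second topic</h2>
        <p>…</p>
      </section>
      <aside>…supplementary content…</aside>
    </article>
  </main>
  <footer>…site footer…</footer>
</body>
```

A layout built entirely from <div> elements can render identically, but this version hands the agent the content boundaries instead of forcing it to infer them.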
What the Score Progression Actually Means
Going from 50 to 73 in a single evening sounds modest until you understand what the scores represent. A score of 50 means AI agents are actively struggling to parse the site — the content is there but the structure is opaque. A score of 73 means the site is legible to agents and competitive for citation. The jump from invisible to citable is more significant than the twenty-three-point numeric difference implies.
More telling than the overall score was the E-E-A-T subscore moving from 28 to 88. That jump came almost entirely from the schema work — declaring the author as a named entity with a job title and employer, declaring the organisation with a description and contact point, and making those declarations top-level rather than nested. The content demonstrating expertise was already present. The markup that told agents who was behind the expertise was missing.
AI agents don't just parse what your page says. They parse what your page is — who wrote it, what organisation published it, where it fits in the site's hierarchy, what questions it answers, and whether its content has sufficient depth to qualify as a reference. Structure is the mechanism through which all of that context is communicated. Improve the structure and you don't just score better — you become legible to a consumption layer that was previously skipping past you entirely.
The checklist for where to start:
- Replace <details>/<summary> FAQ accordions with always-visible <h3> + <p> for primary Q&As
- Audit every section title — replace <p class="..."> and styled <div> labels with real <h2> or <h3> elements
- Convert inline prose lists to <ul> or <ol> wherever items are enumerable facts
- Add top-level JSON-LD blocks for Person, Organization, WebPage, BreadcrumbList, and FAQPage
- Expand any page under 500 words with genuine supporting context
- Wrap major content blocks in <section> and <article> elements
This post was produced by COVEN AI Research. COVEN AI builds the infrastructure layer for the agentic web — including the Agent Analytics platform used to run the testing session described here. If you want to see how your site scores and what's holding it back, run a free scan below.
See your site's AI citability score — free, in under a minute.
Run a Free Scan → Or explore the full Agent Analytics platform at aa.covenai.io