Why Your Site's Structure Is Costing You AI Citations (And It's Not What You Think)
Tonight we ran a citability audit on a real site and moved its score from 50 to 73 in a single session. We changed almost no content. The writing was already good. The information was accurate and specific. What we changed was the markup — the invisible layer that determines whether an AI agent can read the page at all. Every structural problem we found and fixed is documented below, because the same issues are almost certainly present on your site too.
The Gap Between What Humans See and What AI Agents Read
Visual hierarchy and semantic hierarchy are not the same thing. A browser renders CSS. An AI parser reads the DOM. You can have a beautifully designed page with clear section headers, a logical information flow, and a complete FAQ — and an AI agent will see a structurally empty document if those elements are built with the wrong tags.
This is the same transition that happened with mobile responsiveness in 2010. Websites weren't wrong before responsive design — they were built for one consumption layer. The content was fine; the structure wasn't built for the new way it was being read. We're in an identical moment now, except the new consumption layer is automated agents rather than small screens.
The score breakdown from tonight's session shows what changed and by how much:
- Overall citability score: 50 → 73
- Structure subscore: 52 → 70
- Schema subscore: 33 → 73
- E-E-A-T subscore: 28 → 88
- Every page rated Good by end of session
None of those gains came from rewriting copy. They came from fixing structural and semantic issues that were hiding otherwise solid content from AI parsers.
Finding 1: The Accordion Problem
The site had a detailed FAQ section. Ten questions, each with a thorough answer. All of it was built with <details> and <summary> elements — a common, accessible pattern for collapsible content.
To an AI structure parser, every one of those answers was invisible.
The <details> element hides its child content by default. When an agent reads the page looking for headings and text content, collapsed answers register as zero readable content and zero headings — the same score as a blank page. Ten detailed, well-written answers contributed exactly nothing to the structure score.
What we changed
- Converted the primary FAQ section from <details>/<summary> to always-visible <h3> + <p> elements
- Kept a compact toggle-based FAQ lower on the page for secondary questions where visual brevity mattered more than citability
- Added the same Q&A content to a FAQPage JSON-LD schema block
The result: the structure parser now sees ten headings and ten blocks of readable content where it previously saw nothing. The FAQ content began contributing to the site's citability score immediately.
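A minimal sketch of the conversion. The question and answer text here are illustrative placeholders, not the site's actual content:

```html
<!-- Before: the answer is collapsed by default and invisible to structure parsers -->
<details>
  <summary>How long does onboarding take?</summary>
  <p>Most teams are fully onboarded within two weeks.</p>
</details>

<!-- After: heading and answer are always present in the readable DOM -->
<h3>How long does onboarding take?</h3>
<p>Most teams are fully onboarded within two weeks.</p>

<!-- The same Q&A mirrored into a FAQPage JSON-LD block -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "How long does onboarding take?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Most teams are fully onboarded within two weeks."
    }
  }]
}
</script>
```

The visible markup and the JSON-LD carry the same content deliberately: one serves the structure parser, the other serves schema extractors.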
Finding 2: The Styled-Paragraph Trap
Several section titles on the site were built as paragraphs with a CSS class: <p class="section-title">. Visually, they looked identical to headings — large, bold, properly spaced. To every human visitor, the page had clear, structured sections.
To an AI agent, those titles were body copy. The agent counts headings by tag, not by appearance. A <p> element styled to look like an <h2> still registers as a paragraph. The page's heading count was a fraction of what its visual design implied.
The same problem appears in other forms
- <div class="title"> or <div class="heading"> — any div used as a section label, regardless of class name
- <span> elements styled as inline headings
- <strong> tags used as stand-alone section markers at the start of a paragraph
- Custom web components that render heading-like text without using native heading elements
The fix is straightforward: use real <h2>, <h3>, and <h4> tags for anything that functions as a heading. Keep the CSS styling if the visual design requires it. Only the tag matters to the parser.
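The fix in miniature, with a hypothetical class name and title text:

```html
<!-- Before: looks like a heading in the browser, parses as body copy -->
<p class="section-title">Monitoring Capabilities</p>

<!-- After: a real heading tag; the existing class and its CSS can stay unchanged -->
<h2 class="section-title">Monitoring Capabilities</h2>
```

Because the class is preserved, the rendered page is pixel-identical before and after — only the tag changes, and only the tag is what the parser counts.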
Finding 3: The Prose List Problem
AI agents extract list items as discrete structured facts. A sentence like "The platform includes monitoring, reporting, and alerting" is processed as a single unit of text — one fact, loosely parseable. The same three items presented as a <ul> with three <li> elements score full points for list structure, with each item registered as an individually extractable claim.
This matters because the primary use case for AI citations is answering specific questions. An agent looking for "what does this platform monitor?" will preferentially cite a page that presents a structured list of monitoring capabilities over a page that mentions them in flowing prose. Structured data is citation-ready data.
Where we found prose lists on the site
- Feature descriptions written as run-on sentences with commas or em-dashes separating items
- Benefit sections written as paragraphs rather than bullet points
- Process descriptions that walked through steps in paragraph form instead of an <ol>
- Pricing inclusions listed inline within a paragraph
Converting the most fact-dense prose lists to <ul> or <ol> elements was one of the highest-impact changes of the session — the structure score responded immediately, and the content became machine-readable in a way it simply wasn't before.
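The conversion pattern, using the example sentence from above:

```html
<!-- Before: one loosely parseable unit of text -->
<p>The platform includes monitoring, reporting, and alerting.</p>

<!-- After: three individually extractable claims -->
<p>The platform includes:</p>
<ul>
  <li>Monitoring</li>
  <li>Reporting</li>
  <li>Alerting</li>
</ul>
```

In a real conversion each list item would carry a short description rather than a single word, but the structural point is the same: each <li> becomes a discrete fact an agent can cite on its own.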
Finding 4: Nested Schema Is Invisible
The site had schema markup. An Article block with an author property containing a Person object. An AboutPage block with an about property containing an Organization object. On paper, the entities were described. In practice, they were invisible to most schema extractors.
The issue is that many schema extractors only walk top-level @type declarations. When a Person entity appears nested inside an Article block rather than as its own standalone <script type="application/ld+json">, those extractors don't surface it as a known entity. The person and the organisation existed only as properties of other types — never as first-class declared entities the AI could reference independently.
What top-level schema blocks the site was missing
- Person — author identity with name, job title, and employer
- Organization — company identity with name, URL, description, and contact
- WebPage — page-level metadata separate from Article
- BreadcrumbList — navigation context for the page's position in the site
- FAQPage — questions and answers as extractable structured data
Each of these became its own <script type="application/ld+json"> block in the <head>, with its own top-level @type declaration. The schema subscore jumped from 33 to 73 as a result — the largest single gain of the session.
The practical rule: every entity you want an AI to recognise needs its own top-level block. Nesting is fine as a cross-reference (author inside Article can still point to the person), but the standalone declaration must exist independently.
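A sketch of that rule in practice. The names, URL, and @id values are hypothetical, but the pattern — standalone declaration plus cross-reference — is the one described above:

```html
<!-- Standalone Person block: a first-class entity extractors can surface -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://example.com/#author",
  "name": "Jane Doe",
  "jobTitle": "Head of Research",
  "worksFor": { "@type": "Organization", "name": "Example Co" }
}
</script>

<!-- The Article keeps its author property, now as a cross-reference by @id -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example article",
  "author": { "@id": "https://example.com/#author" }
}
</script>
```

Extractors that only walk top-level @type declarations now see the Person directly, while extractors that resolve references still get the full author relationship.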
Finding 5: Content Depth Threshold
Several pages on the site were thin — well-written and accurate, but under approximately 500 words. These pages were automatically penalised in citability scoring regardless of the quality of what they contained.
The analogy is Google News eligibility: there's a minimum content threshold before a page qualifies for inclusion, independent of the topic's importance or the writing's accuracy. AI citation systems apply a similar filter. A short page doesn't qualify as a citation source in the same way a short document doesn't qualify as a reference in an academic bibliography — not because it's wrong, but because it lacks the depth that signals authority.
The right fix is not padding
- Add genuine supporting context: the reasoning behind a claim, not just the claim
- Include concrete examples where the page currently only states principles
- Expand definitions — don't assume the reader knows the terminology
- Add related questions and answers that a reader might naturally have after reading the primary content
On the pages we worked on, the content expansion served dual purpose: it moved the pages above the citability threshold and made them genuinely more useful to human readers. There is no tension between the two goals when you expand with substance.
Finding 6: Semantic Wrappers Add Points
This one is small relative to the others, but it compounds. <section> and <article> elements help agents identify content boundaries — where a topic starts, where it ends, and how one block of content relates to another. A <div> provides no such signal.
A page built entirely with <div> containers for layout sections gives an AI agent no structural scaffolding beyond the heading hierarchy. A page that wraps major content blocks in <section> tags, puts the primary content in an <article>, uses a <nav> for navigation, and identifies the page's main content area with <main> is broadcasting its structure explicitly. Agents don't have to infer it.
Semantic elements that contribute to structure scores
- <article> — identifies self-contained, independently distributable content
- <section> — marks a thematically grouped block within a larger document
- <main> — identifies the dominant content of the page, distinct from navigation and footers
- <nav> — explicitly marks navigation blocks so agents skip them when parsing content
- <aside> — marks supplementary content, preventing it from being weighted as primary
- <header> and <footer> — frame the content so agents know what's metadata and what's substance
None of these require visual design changes. They're structural labels that cost nothing to add and consistently improve how agents parse the page.
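The elements above fit together as a page skeleton along these lines (a generic sketch, not the audited site's actual layout):

```html
<body>
  <header>…site header…</header>
  <nav>…primary navigation…</nav>
  <main>
    <article>
      <h1>Page title</h1>
      <section>
        <h2>First topic</h2>
        <p>…</p>
      </section>
      <section>
        <h2>Second topic</h2>
        <p>…</p>
      </section>
      <aside>…supplementary content…</aside>
    </article>
  </main>
  <footer>…site footer…</footer>
</body>
```

A layout built entirely from <div> elements can render identically, but this version hands the agent the content boundaries instead of forcing it to infer them.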
What the Score Progression Actually Means
Going from 50 to 73 in a single evening sounds modest until you understand what the scores represent. A score of 50 means AI agents are actively struggling to parse the site — the content is there but the structure is opaque. A score of 73 means the site is legible to agents and competitive for citation. The jump from invisible to citable is more significant than the twenty-three-point numeric difference implies.
More telling than the overall score was the E-E-A-T subscore moving from 28 to 88. That jump came almost entirely from the schema work — declaring the author as a named entity with a job title and employer, declaring the organisation with a description and contact point, and making those declarations top-level rather than nested. The content demonstrating expertise was already present. The markup that told agents who was behind the expertise was missing.
AI agents don't just parse what your page says. They parse what your page is — who wrote it, what organisation published it, where it fits in the site's hierarchy, what questions it answers, and whether its content has sufficient depth to qualify as a reference. Structure is the mechanism through which all of that context is communicated. Improve the structure and you don't just score better — you become legible to a consumption layer that was previously skipping past you entirely.
The checklist for where to start:
- Replace <details>/<summary> FAQ accordions with always-visible <h3> + <p> for primary Q&As
- Audit every section title — replace <p class="..."> and styled <div> labels with real <h2> or <h3> elements
- Convert inline prose lists to <ul> or <ol> wherever items are enumerable facts
- Add top-level JSON-LD blocks for Person, Organization, WebPage, BreadcrumbList, and FAQPage
- Expand any page under 500 words with genuine supporting context
- Wrap major content blocks in <section> and <article> elements
This post was produced by COVEN AI Research. COVEN AI builds the infrastructure layer for the agentic web — including the Agent Analytics platform used to run the testing session described here. If you want to see how your site scores and what's holding it back, run a free scan below.
See your site's AI citability score — free, in under a minute.
Run a Free Scan → Or explore the full Agent Analytics platform at aa.covenai.io