AI SEARCH OPTIMIZATION

Feature Story

How AI Agents Read Your Website (And How To Optimize For Them)

Okay, fair warning: this one's a bit more technical than usual. I considered softening it, maybe burying the HTML talk behind three more paragraphs of analogies about restaurants and confused tourists (don't worry, those are still coming). But honestly? This is the kind of detail that's going to separate the sites AI agents can actually use from the ones they silently skip — and right now, while AI-driven traffic is still ramping up, is exactly the time to lay the foundation properly. Not in six months when everyone's panicking. Now, while it's still a competitive advantage rather than desperate catch-up.

So with that caveat firmly in place: I need to talk about something that's been quietly bothering me for weeks. And no, it's not the fact that I still haven't figured out why my kitchen tap drips at exactly 3 AM (though that mystery remains unsolved). It's about who — or more accurately, what — is visiting your website right now.

According to Imperva's 2025 Bad Bot Report, automated traffic surpassed human traffic last year. 51% of all web interactions. Let that marinate for a second. The majority of your website's audience is no longer human. Your carefully chosen brand colours, your hero image that took three rounds of revisions, your lifestyle photography where everyone looks suspiciously happy holding your product? More than half your visitors literally cannot see any of it.

And here's where it gets properly uncomfortable: they don't care.

The Audience That Doesn't Have Eyes

Every major AI platform — ChatGPT, Perplexity, Gemini, Claude — can now browse websites autonomously. They scroll, click, fill forms, and in some cases complete transactions. But they don't experience your site the way a person does. They experience something called the accessibility tree — a stripped-down structural map of your page that exposes only what matters: buttons, links, form fields, headings, and their relationships to each other.

Think of it like this: you've spent months perfecting a beautifully designed restaurant. Gorgeous lighting, curated playlist, Instagram-worthy plating. And your fastest-growing customer segment is ordering through a text-only phone line where all they get is someone reading the menu in monotone. If your menu doesn't make sense read aloud by a bored robot, you've got a problem.

Three methods are emerging for how AI agents interpret web pages. The first is vision-based — taking screenshots and analysing pixels (Anthropic's Computer Use does this). It works, but it's computationally expensive and fragile. Like trying to read a book by squinting at it from across the room.

The second — and increasingly dominant — method is the accessibility tree. This is what OpenAI chose for ChatGPT Atlas, and what Microsoft's Playwright MCP uses. Rather than processing rendered pixels, these systems query a simplified structural representation of your page.

The third is a hybrid. But the trend is clear: even platforms that started vision-first are incorporating accessibility data. The industry is voting with its architecture, and the accessibility tree is winning.

Why Your Beautifully Designed Site Might Be Invisible

Here's where I started losing sleep (in addition to the tap situation). The accessibility tree only exposes what's properly marked up in your HTML. If your page is structurally sound — semantic headings, labelled form fields, native HTML buttons — the tree presents everything cleanly and agents navigate with confidence.

If it's not? If your "Get Started" button is actually a <div> with some JavaScript taped to it? If your pricing is hidden behind a tab that requires a click to reveal? If your contact form fields don't have labels? The agent sees something between incomplete and incomprehensible. It might fail silently. It might misinterpret your page entirely. Either way, you've just become invisible to the fastest-growing segment of web traffic.

Research from UC Berkeley and the University of Michigan quantifies the damage. They tested Claude Sonnet 4.5 on 60 real-world web tasks under different accessibility conditions. Under standard conditions, the agent succeeded nearly 80% of the time. Restrict it to keyboard-only interaction — simulating how screen readers and many agents navigate — and success dropped to 42%. Restrict the viewport further: 28%.

Those aren't abstract percentages. That's the difference between an AI assistant successfully finding your law firm's consultation booking page versus giving up and recommending your competitor. Between an AI research agent citing your SaaS product's pricing page in a comparison versus skipping you entirely. Between existing in AI-mediated discovery and... not.

The Stuff Most Sites Get Embarrassingly Wrong

The encouraging news (and yes, I am trying to end on encouraging notes more often, personal growth is a journey) is that building a strong accessibility tree doesn't require exotic technology. It requires using HTML the way it was designed to be used — something the web standards community has been politely screaming about for two decades.

Use native HTML elements. A <button> automatically appears in the accessibility tree with the role "button." A <div onclick="doSomething()"> does not. The agent doesn't know it's clickable. This sounds trivial. It is absolutely not. Across major websites, styled divs masquerading as buttons remain embarrassingly common. Every one of them is invisible to an accessibility-tree-first agent.
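The contrast is stark in markup. A minimal sketch (the class and handler names are illustrative):

```html
<!-- Invisible to an accessibility-tree-first agent: no role, not focusable -->
<div class="btn" onclick="startSignup()">Get Started</div>

<!-- Exposed automatically: role "button", keyboard-focusable, activatable -->
<button type="button" onclick="startSignup()">Get Started</button>
```

With the right CSS, both render identically to a human. Only the second one exists as a button in the tree.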

Label every form input. Contact forms, search bars, newsletter signups, booking widgets — every input needs an associated label. Agents read labels to understand what data a field expects. The autocomplete attribute is particularly important: it tells agents exactly what type of data a field expects using standardised values. When an AI agent fills out a form on someone's behalf, autocomplete attributes make the difference between confident field mapping and wild guessing. (And nobody wants an AI agent wildly guessing at your checkout page. That's how you end up with a shipping address of "undefined, null, NaN.")
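Here's what that looks like in practice — a minimal sketch of a labelled newsletter form (field names and the action URL are illustrative):

```html
<form action="/subscribe" method="post">
  <!-- Explicit label: the agent knows this field wants an email address -->
  <label for="email">Email address</label>
  <input id="email" name="email" type="email" autocomplete="email" required>

  <!-- autocomplete uses standardised tokens: "name", "email", "postal-code", etc. -->
  <label for="name">Full name</label>
  <input id="name" name="name" type="text" autocomplete="name">

  <button type="submit">Subscribe</button>
</form>
```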

Establish heading hierarchy. Use h1 through h6 in logical order. Agents use headings to understand page structure. A service page with a clear h1 (your offering), h2 sections (Features, Pricing, FAQ), and properly nested subheadings gives an agent a reliable map. A page where everything is a styled <div> with font-size: 24px gives an agent an existential crisis.
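A sketch of the reliable version (the service and section names are placeholders):

```html
<h1>Managed Cloud Hosting</h1>

<h2>Features</h2>
<h3>Automatic daily backups</h3>
<h3>24/7 monitoring</h3>

<h2>Pricing</h2>

<h2>FAQ</h2>
```

The nesting, not the font size, is what the agent reads.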

Use landmark regions. HTML5 elements like <nav>, <main>, and <footer> tell agents where they are on the page. For sites with complex layouts — mega menus, sidebar filters, multi-section landing pages — landmarks are what prevent an agent from getting lost in your navigation and wandering around your footer like a confused tourist.
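A skeleton of those landmark regions (the aria-label values are illustrative — they're most useful when a page has more than one nav or aside):

```html
<body>
  <header>Site logo and banner</header>
  <nav aria-label="Primary">Main menu links</nav>
  <main>
    <h1>Page title</h1>
    The content the agent should actually focus on
  </main>
  <aside aria-label="Filters">Sidebar filters</aside>
  <footer>Contact details and legal links</footer>
</body>
```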

Don't hide critical information behind interactions. Pricing, specifications, availability, contact details — if the information matters to a decision, it should be in the initial HTML. Not behind an accordion. Not in a tab that requires a click. Not loaded dynamically after a scroll event. Microsoft's own guidance is blunt: AI systems may not render hidden content, so key details get skipped. (Imagine spending months on your pricing strategy only for AI agents to literally not know your prices exist. The comedy writes itself, except it's not funny when it's your pipeline.)
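Compare the two patterns, sketched here with invented prices:

```html
<!-- Risky: the panel starts empty and is filled by JavaScript on tab click -->
<div id="pricing-panel"></div>

<!-- Safer: the numbers live in the initial HTML, visible to any crawler -->
<section id="pricing">
  <h2>Pricing</h2>
  <p>Starter: £29/month. Pro: £79/month.</p>
</section>
```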

Server-Side Rendering: No Longer a Nice-to-Have

There's a distinction worth understanding between browser-based agents and AI crawlers. Browser-based agents like ChatGPT Atlas execute JavaScript — they can render a single-page application.

AI crawlers — PerplexityBot, OAI-SearchBot, ClaudeBot — often can't. If your website is a blank <div id="root"> until React hydrates, these crawlers see an empty page. Your content, your structured data — all invisible to the AI search ecosystem.
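The difference shows up in the raw HTTP response. A sketch with an invented product, showing what a non-JavaScript crawler receives in each case:

```html
<!-- Client-side rendering: the crawler sees an empty shell -->
<body>
  <div id="root"></div>
  <script src="/bundle.js"></script>
</body>

<!-- Server-side rendering: the content is in the HTML before any script runs -->
<body>
  <div id="root">
    <main>
      <h1>Acme Analytics</h1>
      <p>Dashboards for small teams. Plans from £19/month.</p>
    </main>
  </div>
  <script src="/bundle.js"></script>
</body>
```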

The logic chain is unforgiving: if your content isn't in the initial HTML, it doesn't get indexed by AI systems. If it doesn't get indexed, it doesn't get cited. If it doesn't get cited, you don't exist in the AI discovery layer. It's like printing a beautiful brochure and leaving it locked in a filing cabinet. In a basement. With the lights off.

Frameworks like Next.js, Nuxt, and Astro make server-side rendering straightforward. If you're evaluating platforms or rebuilding a site, SSR capability should be a non-negotiable requirement. Not a "nice-to-have-for-later." Now.

ARIA: Helpful When Used Right, Chaos When Used Wrong

OpenAI recommends ARIA — the W3C standard for making dynamic web content accessible. But here's the thing: ARIA is a supplement, not a substitute. The W3C's own first rule of ARIA is, effectively, "don't use ARIA if a native HTML element will do the job."

Why? Because according to WebAIM's annual survey of the top million websites, sites that use ARIA are generally less accessible. Not because ARIA is bad, but because it's frequently applied incorrectly — slapped over poor HTML structure like wallpaper over a crack in the wall. The crack is still there. Now it's just harder to diagnose.

And here's where my SEO brain starts twitching: the same instinct that led to meta keyword stuffing in early SEO will absolutely lead to aria-label abuse if left unchecked. I can already picture the agency pitch deck: "We'll optimise your ARIA labels for AI visibility!" No. Stop. Use ARIA for dynamic components that don't have native HTML equivalents. Keep labels descriptive and honest. Don't stuff them.
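The line between the two uses, sketched below — the tab markup follows the W3C's ARIA Authoring Practices pattern, and the IDs are illustrative:

```html
<!-- Wrong: ARIA wallpapered over a fake button, with a stuffed label -->
<div role="button" aria-label="Buy now best-value AI-optimised checkout deal">Buy</div>

<!-- Right: native <button> elements, with ARIA only for the tab semantics
     that HTML has no native element for -->
<div role="tablist" aria-label="Plan details">
  <button role="tab" id="tab-pricing" aria-selected="true" aria-controls="panel-pricing">Pricing</button>
  <button role="tab" id="tab-faq" aria-selected="false" aria-controls="panel-faq">FAQ</button>
</div>
<div role="tabpanel" id="panel-pricing" aria-labelledby="tab-pricing">Pricing details</div>
<div role="tabpanel" id="panel-faq" aria-labelledby="tab-faq" hidden>FAQ content</div>
```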

Your Action Points

  1. Run a screen reader through your key user flows. If VoiceOver, NVDA, or TalkBack can navigate your site successfully — identifying buttons, reading form labels, following the content structure — agents can likely do the same. Both audiences rely on the same accessibility tree. Budget 30 minutes. You'll be horrified and grateful in equal measure.

  2. Generate accessibility snapshots with Playwright MCP. It strips away visual presentation and shows you exactly what an agent works with: roles, names, states, hierarchy. If your primary CTA doesn't appear in the snapshot, or appears without a useful name, you have a problem that no amount of A/B testing your button colour will fix.
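For a well-structured page, the snapshot reads roughly like this (page content invented for illustration):

```yaml
- banner:
  - heading "Acme Analytics" [level=1]
- navigation "Primary":
  - link "Pricing"
  - link "Contact"
- main:
  - heading "Dashboards for small teams" [level=2]
  - button "Get Started"
```

If "Get Started" shows up as an anonymous generic node — or doesn't show up at all — you've found your div-pretending-to-be-a-button.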

  3. View your page source and confirm critical information is in the HTML. Pricing, service descriptions, contact details, product specifications — check they're present before JavaScript loads. If they're not, AI crawlers can't see them.

  4. Audit your buttons, links, and form fields. Replace any <div> or <span> elements pretending to be interactive with native HTML equivalents. Yes, your developer might sigh. They'll get over it.

  5. Implement server-side rendering if you haven't already. If your site relies on client-side JavaScript to render content, you're invisible to the majority of AI crawlers. This isn't a future problem. This is a now problem.

  6. Review your ARIA usage. If you're using ARIA, check it's supplementing good HTML structure, not papering over bad structure. If you're not using ARIA at all, start with semantic HTML first — you'll solve 80% of the problem before you write a single aria-label.

The Bottom Line

The non-human majority of your web traffic is already here. The work required to make your site legible to AI agents is the same work that improves human accessibility, strengthens traditional SEO, and increases the likelihood of AI citation. Semantic HTML, structured data, server-side rendering, proper form labelling — none of this is new. What's new is that the audience demanding it has expanded from a subset of human users to every AI system that touches your website.

You spent years optimising for humans who judge your site in three seconds. Now optimise for agents who judge it in three milliseconds — and can't even see your logo.

Behind The Writing

ABOUT THE WRITER

Jo Lambadjieva is an entrepreneur and AI expert in the e-commerce industry. She is the founder and CEO of Amazing Wave, an agency specializing in AI-driven solutions for e-commerce businesses. With over 13 years of experience in digital marketing, agency work, and e-commerce, Jo has established herself as a thought leader in integrating AI technologies for business growth.
