AI fetchers like GPTBot, ClaudeBot, and PerplexityBot don't execute JavaScript when they fetch a page. For these crawlers, what arrives over HTTP is what gets read -- no rendering, no second pass. When you build a React app, every page starts as a near-empty HTML shell. The actual content only appears after JavaScript boots up, fetches data from APIs, and renders the UI in the browser. To any tool reading raw HTML, that process never happens. The page is blank.
Google Search is a partial exception worth understanding. Google does render JavaScript, but rendering happens in a separate wave that can lag initial crawling by days or weeks. Until a page is rendered, Google indexes the HTML-only version. For new sites, low-traffic pages, or pages that change frequently, that rendered version may never catch up. Google's own documentation acknowledges that JavaScript-based content "might not be indexed or might be indexed with a delay." The risk is real and avoidable. For AI fetchers, there is no rendering pipeline at all.
-
Google: JavaScript SEO basics ↗
Google's documentation on how Googlebot handles JavaScript rendering and indexing delays.
What a bot actually sees
Fetch any React SPA directly without running JavaScript and you'll get something like this: a <div id="root"></div> and a bundle of script tags. No headings. No body copy. No FAQ answers. No pricing. Just the skeleton the browser needs to mount the app. That's what ChatGPT, Googlebot on its first pass, Perplexity, and any other tool reading raw HTML see when they visit your site.
To check what a bot sees: curl -s https://yoursite.com | grep 'text you expect to find'
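The same check can be scripted so you can run it across many pages at once. This is a minimal sketch using Node 18+'s built-in fetch; the URL, the expected phrases, and the simplified User-Agent token are placeholders, not Stackra's actual tooling.

```javascript
// Fetch raw HTML the way a crawler does (no JavaScript execution)
// and report which expected phrases are missing from the response.

// Return the phrases that do not appear anywhere in the served HTML.
function missingPhrases(html, phrases) {
  return phrases.filter((p) => !html.includes(p));
}

async function checkPage(url, phrases) {
  const res = await fetch(url, {
    // Optional: identify as an AI crawler. Real crawler UA strings are
    // longer; 'GPTBot' is a simplified token for illustration.
    headers: { 'User-Agent': 'GPTBot' },
  });
  const html = await res.text();
  const missing = missingPhrases(html, phrases);
  if (missing.length) {
    console.log(`THIN: ${url} missing: ${missing.join(', ')}`);
  } else {
    console.log(`OK: ${url}`);
  }
}

// Usage (placeholder URL and phrases):
// checkPage('https://yoursite.com/about', ['text you expect to find']);
```

If a phrase you know appears on the rendered page comes back missing here, the content only exists after JavaScript runs.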
The standard fix: static prerendering
Static prerendering solves this by capturing what a real browser sees at build time, before any visitor arrives. A headless browser visits every page, waits for content to fully render, and saves the resulting HTML to disk. The production server then serves that pre-captured HTML to bots. Real users get the same HTML first, then React hydrates on top of it to restore interactivity. The result: bots see exactly what users see.
-
Playwright (headless browser) ↗
Open-source browser automation library used to capture fully-rendered HTML at build time.
What Stackra had in place
Stackra already used this approach. At deploy time, a Playwright-powered build script visits every marketing page (the homepage, blog, benchmarks, about, and each blog article), waits for network activity to settle, captures the HTML, and writes it to the dist/prerendered/ directory. On every incoming request, the Express server first checks for a matching prerendered file. If one exists, it is served; if not, the SPA shell is served as a fallback for app routes.
-
Stackra Benchmarks →
See real performance data across small business websites by industry.
-
Stackra About & FAQs →
Learn how Stackra works and get answers to common questions.
-
Express.js ↗
Node.js web framework used to serve both the API and the prerender static file server.
The system worked. Two pages didn't.
When we fetched the site's pages directly, the same way an AI tool or crawler would, most came back with full content. But two came back thin:
- The Benchmarks page: 18KB of shell HTML with no rendered data. Grade bars, pillar averages, and industry comparisons were all missing.
- The About/FAQ page: 9 question headings present, zero answers anywhere in the document.
The two problems had different root causes and needed different fixes. Parts 2 and 3 of this series cover each one in detail, including two approaches we tried on the benchmarks problem before finding what actually worked.