
On March 31, 2026, Google published a detailed article on the Search Central Blog revealing Googlebot's inner workings — and a technical limit most websites are unaware of: Googlebot only downloads the first 2 megabytes of each HTML page. Everything beyond that cutoff is simply ignored.

What Google just confirmed

Gary Illyes, a Google analyst, details the precise crawling mechanisms for the first time in a post accompanied by episode 105 of the Search Off the Record podcast.

  • 2MB HTML limit per URL (headers included)
  • 64MB limit for PDF files
  • 15MB default limit for other Google crawlers

Key takeaways:

  • Googlebot isn't a single program. It's a centralized infrastructure used by Google Search, Shopping, AdSense, and dozens of other products — each with its own settings.
  • Partial fetch. If your HTML exceeds 2MB, Googlebot cuts off at exactly 2MB. It doesn't reject the page: it indexes what it retrieved as if it were the complete file.
  • Unseen bytes don't exist. Any content, structured data, or meta tag located beyond the cutoff is ignored. Not crawled, not rendered, not indexed.
  • External resources have their own counter. Externally loaded CSS and JavaScript don't count toward the parent page's 2MB — but each has its own separate 2MB limit.
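The partial-fetch behavior described above can be sketched in a few lines. This is an illustrative model, not Google's code: `visible_to_googlebot` and `tag_survives` are hypothetical helper names, and the only fact taken from the post is the 2MB cap itself.

```python
# Model of Googlebot's partial fetch: the crawler keeps the first 2MB
# of HTML and silently ignores everything after the cutoff.
GOOGLEBOT_HTML_CAP = 2 * 1024 * 1024  # 2MB, headers included per the post

def visible_to_googlebot(html: bytes, cap: int = GOOGLEBOT_HTML_CAP) -> bytes:
    """Return the slice of the document Googlebot would actually index."""
    return html[:cap]

def tag_survives(html: bytes, tag: bytes, cap: int = GOOGLEBOT_HTML_CAP) -> bool:
    """True only if `tag` begins before the cutoff."""
    pos = html.find(tag)
    return 0 <= pos < cap
```

For example, a canonical tag placed after 3MB of markup fails `tag_survives`, while the same tag in the `<head>` passes, which is exactly the risk the bullet points describe.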

Why this matters for your website

For most sites, 2MB of HTML is a comfortable threshold. But some common architectures exceed it without teams even realizing:

  • Inline base64 images — a single encoded image can weigh 500KB to 1MB
  • Massive inline CSS and JavaScript — frameworks that inject everything into the HTML (some SSR configs) bloat the document
  • Mega-menus and duplicated footers — a navigation menu with 200+ links before the main content pushes your critical elements down
  • E-commerce catalog pages — hundreds of products with their schema markup injected into the HTML
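The base64 point is worth quantifying: base64 turns every 3 raw bytes into 4 ASCII characters, so an inlined image costs about a third more of the 2MB budget than its file size. A quick sketch (the 600KB figure is an arbitrary stand-in):

```python
import base64
import os

# Base64 inflates binary data by one third: every 3 raw bytes become
# 4 ASCII characters. A "500KB to 1MB" image inlined as a data: URI
# therefore consumes even more of the 2MB HTML budget than its file size.
raw = os.urandom(600 * 1024)          # stand-in for a 600KB JPEG
encoded = base64.b64encode(raw)

print(len(raw))       # 614400 bytes on disk
print(len(encoded))   # 819200 bytes once inlined (+33%)
```

Two or three such images are enough to burn over half the crawl budget before any actual content appears.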

The concrete risk: your E-E-A-T signals, structured data, main content, or internal links end up past the cutoff — and Google never sees them.

3 actions to take now

  1. Measure your HTML page weight. Not the total weight (with images), but the raw HTML. In Chrome DevTools: Network tab → filter by Doc → check the "Size" column. If any page exceeds 1.5MB, you're in the danger zone.
  2. Externalize CSS and JS. Move inline styles and scripts to external files. Each external file gets its own 2MB counter — you're freeing up space in the main document for actual content.
  3. Order your tags. Google says it explicitly: place critical elements — meta tags, canonical, structured data, hreflang tags — higher up in the HTML document. If you have a 300-line mega-menu before your <main>, move it after the content or load it via JS.
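Steps 1 and 3 can be automated with a small script. This is a minimal sketch, not an official tool: `audit_html`, `audit_page`, and the tag list are assumptions chosen for illustration, and the 1.5MB threshold is the danger zone named in step 1.

```python
import urllib.request

DANGER_ZONE = int(1.5 * 1024 * 1024)   # step 1's warning threshold
CRITICAL_TAGS = [b'rel="canonical"', b"application/ld+json", b"hreflang", b"<main"]

def audit_html(html: bytes) -> dict:
    """Report raw HTML weight and the byte offset of each critical tag
    (-1 if absent). Offsets beyond 2MB mean Googlebot never sees the tag."""
    return {
        "bytes": len(html),
        "danger": len(html) > DANGER_ZONE,
        "offsets": {tag.decode(): html.find(tag) for tag in CRITICAL_TAGS},
    }

def audit_page(url: str) -> dict:
    """Fetch the raw document only (no images, CSS, or JS) and audit it."""
    with urllib.request.urlopen(url) as resp:
        return audit_html(resp.read())
```

Run `audit_page` over a sitemap's URLs and flag any page where `danger` is true or a critical tag's offset lands past 2,097,152 bytes.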

Key takeaway: Google notes this limit "is not set in stone and may change over time." But until then, every byte counts. Optimize your HTML the way you optimize your load time.

Our take

This announcement formalizes what technical SEO specialists have suspected for years. The real news is the transparency: Google is publishing its limits in black and white. For businesses using CMS platforms with generated content or heavy product pages, the signal is clear — slim down your HTML or accept that Google only sees part of your site. A thorough SEO audit can identify at-risk pages in hours.

Is your HTML too heavy for Google?

We audit your site for free: page weight, structured data, indexation issues.

Alexis Dollé
CEO & Founder

A growth and SEO content strategist, I founded Cicéro to help businesses build lasting organic visibility — on Google and in AI-generated answers alike. Every piece of content we produce is designed to convert, not just to exist.
