Representation of an entity graph linking a brand to its products, audiences and locations, generated by a large language model

TL;DR

On June 22, 2026, Search Engine Land surfaced a Google patent, "Data extraction using LLMs" (WO2025063948A1), describing how its large language models build a holistic characterization of a business from scattered sources (website, listings, press, job postings). The GEO lesson: your homepage no longer defines your brand in the eyes of AI; the consistency of everything that exists about you does. Take control of that signal, or let AI decide for you.

Google has documented, in black and white, how an AI can decide who you are, without ever asking you. On June 22, 2026, analyst Rich Sanger published a detailed read of a Google LLC patent on Search Engine Land: "Data extraction using LLMs" (WO2025063948A1), published March 27, 2025, authored by researchers Aarthi Ramachandran and Nidhi Gupta.

The document describes a system in which a large language model crawls multiple pages and sources within a domain, then generates "an interpretation of the extracted content rather than a verbatim duplication." Translation: AI doesn't copy your site; it forms an opinion of it, and that opinion becomes the official version of your brand inside generative answers.

What the patent actually says

The patented system follows four steps, per Search Engine Land's analysis:

  1. Collection: the AI aggregates information from your website, maps data, directories, business listings, job postings and third-party sources.
  2. Interpretation: it reads content not structured for machines, going beyond simple keyword extraction.
  3. Attribute extraction: services, reputation, principles, social media sentiment, relationships between entities.
  4. Graph organization: attributes are arranged in a hierarchical structure linking products, services, audiences and locations.
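
To make the four steps concrete, here is a minimal Python sketch of what a hierarchical entity profile could look like. Everything in it is an illustrative assumption rather than anything taken from the patent: the EntityNode class, the sample brand "Acme Bikes", the source snippets, and above all the keyword table, which is only a stand-in for the LLM interpretation step.

```python
from dataclasses import dataclass, field


@dataclass
class EntityNode:
    """One node of a hypothetical entity graph (brand, product, audience, location)."""
    name: str
    kind: str
    attributes: dict[str, str] = field(default_factory=dict)
    children: list["EntityNode"] = field(default_factory=list)

    def add(self, child: "EntityNode") -> "EntityNode":
        self.children.append(child)
        return child

    def render(self, depth: int = 0) -> list[str]:
        """Return the hierarchy as indented text lines."""
        lines = [f"{'  ' * depth}{self.kind}: {self.name}"]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


# Step 1 (collection): hypothetical snippets gathered from several sources.
SOURCES = {
    "website": "Acme Bikes sells electric bikes to urban commuters in Lyon.",
    "business_listing": "Acme Bikes, bike shop, Lyon.",
    "job_posting": "Acme Bikes is hiring a mechanic for its Lyon workshop.",
}

# Steps 2-3 (interpretation, attribute extraction): a real system would rely on
# an LLM here; this keyword table is a deliberately crude placeholder.
KNOWN_TERMS = {
    "electric bikes": "product",
    "urban commuters": "audience",
    "Lyon": "location",
}


def build_profile(brand: str, sources: dict[str, str]) -> EntityNode:
    """Step 4 (graph organization): attach each detected attribute to the brand."""
    root = EntityNode(brand, "brand")
    for term, kind in KNOWN_TERMS.items():
        seen_in = sorted(
            name for name, text in sources.items() if term.lower() in text.lower()
        )
        if seen_in:
            root.add(EntityNode(term, kind, {"seen_in": ", ".join(seen_in)}))
    return root


profile = build_profile("Acme Bikes", SOURCES)
print("\n".join(profile.render()))
```

The point of the sketch is the shape of the result: the brand sits at the root, and each attribute records which sources support it. An attribute seen in a single source is a weaker signal than one repeated everywhere.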

In other words, Google describes a machine that manufactures an "identity file" of your business. That file isn't fed by what you say you are, but by what the whole web suggests you are. It's exactly the entity logic we already see in how AI Overviews cite sources and sometimes recommend your competitors.

Keep in mind: a patent protects a method, it doesn't prove deployment. There's no evidence this exact system powers AI Mode or AI Overviews today. But it confirms the direction: AI reasons in entities, not isolated pages.

Why it matters for your visibility

For twenty years, SEO was played page by page: one keyword, one URL, one ranking. Entity logic changes the rules. If an AI builds your profile from dozens of sources, then a single inconsistency (a different job title on LinkedIn, an old address in a directory, a marketing promise missing from your site) becomes noise that dilutes the "characterization" the model retains.

It's the direct continuation of something we've been documenting for months: as Search Console rolls out generative AI performance reports, the question is no longer just "do I rank?" but "does AI describe me correctly when asked about my industry?"

Want to know how ChatGPT and Google describe your business today? We'll show you in a free GEO audit.

What to do now

You don't need to wait for the system to ship to act. The recommendations that follow from the patent are GEO fundamentals you can apply immediately:

Lever: Concrete action
Multi-source consistency: Align your description (who you are, what you do, for whom) across your website, Google Business Profile, social media, press and job postings.
Brand attributes: Define the 3 to 5 attributes you want associated with your entity (reliability, local expertise, innovation) and surface them everywhere.
Evidence: Back every claim with customer reviews, case studies, awards and author expertise signals.
Entity relationships: Clarify the links between your products, locations, audiences and use cases; that's what the hierarchical graph tries to reconstruct.
Footprint audit: Assess how an AI would describe your business by combining every available source, then fix the gaps.
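
A footprint audit can start as a simple comparison of what each source says about you. The Python sketch below is one possible way to do it, under stated assumptions: the FOOTPRINT data, the field names and the normalize and find_inconsistencies helpers are all hypothetical, and the comparison is plain string matching, not a model of how any AI system actually weighs sources.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def find_inconsistencies(
    footprint: dict[str, dict[str, str]],
) -> dict[str, dict[str, list[str]]]:
    """For each field with conflicting values, group the sources by the value they state."""
    by_field: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for source, fields in footprint.items():
        for field_name, value in fields.items():
            by_field[field_name][normalize(value)].append(source)
    # Keep only the fields where sources disagree (more than one distinct value).
    return {
        field_name: dict(values)
        for field_name, values in by_field.items()
        if len(values) > 1
    }


# Hypothetical footprint: what each source currently states about the business.
FOOTPRINT = {
    "website": {
        "address": "12 Rue de la Paix, Lyon",
        "tagline": "Electric bikes for commuters",
    },
    "business_listing": {
        "address": "12 rue de la  Paix, Lyon",
        "tagline": "Electric bikes for commuters",
    },
    "old_directory": {"address": "3 Avenue Foch, Lyon"},
}

issues = find_inconsistencies(FOOTPRINT)
for field_name, values in issues.items():
    print(f"Conflict on '{field_name}':")
    for value, sources in values.items():
        print(f"  {value!r} stated by {', '.join(sources)}")
```

Here the website and the business listing agree on the address once case and spacing are normalized, so only the outdated directory entry is flagged. Descriptions and taglines rarely match word for word, so a real audit would need a fuzzier comparison than this exact match.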

The habit to drop: believing that optimizing your homepage is enough. The habit to build: treating your brand as a distributed entity where every mention counts. It's the same shift that makes certain "chunking" tactics obsolete against AI semantic reading.

What this article doesn't cover

This patent doesn't say Google already uses it in production, nor how heavily it weights each source. We don't have access to the model's internal signals, and no one, Google included, publishes a verifiable "entity score." This article describes a direction confirmed by an official document, not a guaranteed ranking recipe. Traffic figures specific to each industry depend on your competition and queries; estimating them is the job of a dedicated audit, not a generality.

Our take

At Cicéro, we've been saying the same thing since generative engines arrived: AI visibility is won upstream, on the consistency of your identity, not by stuffing keywords. This patent simply puts Google's words on a reality we already observe with our clients. Take back control of what AI remembers about you; otherwise, it will invent a version of your brand you never chose.

FAQ

What does Google patent WO2025063948A1 describe?
The patent "Data extraction using LLMs", filed by Google LLC, describes a system in which a large language model collects information across multiple pages and sources within a domain, then generates a synthesized characterization of an entity. The LLM interprets content rather than copying it verbatim, and organizes attributes into a hierarchical graph linking products, services, audiences and locations.
Does the patent mean Google already uses this system?
No. A patent describes a protected method, not a deployed product. There's no evidence this exact system currently powers AI Mode or AI Overviews. It does, however, reveal Google's research direction and the entity logic that AI engines already apply visibly in their citations.
How do I optimize my business for this kind of AI characterization?
Align your business description across every source (website, Google Business Profile, social media, press, job postings), define the attributes you want associated with your brand, back every claim with evidence (reviews, case studies, awards), and clarify the relationships between your products, audiences and locations.

What if we looked at how AI talks about you?

Free GEO audit: we'll show you how Google, ChatGPT and Perplexity describe your business, and what to fix to take back control.

Alexis Dollé, founder of Cicéro
Alexis Dollé
CEO & Founder

A growth and SEO content strategist, I founded Cicéro to help businesses build lasting organic visibility, on Google and in AI-generated answers alike. Every piece of content we produce is designed to convert, not just to exist.
