TL;DR
On June 22, 2026, Search Engine Land surfaced a Google patent, "Data extraction using LLMs" (WO2025063948A1), describing how Google's large language models build a holistic characterization of a business from scattered sources (website, listings, press, job postings). The GEO lesson: your homepage no longer defines your brand in the eyes of AI; the consistency of everything that exists about you does. Take control of that signal, or let AI decide for you.
Google has documented, in black and white, how an AI can decide who you are, without ever asking you. On June 22, 2026, analyst Rich Sanger published a detailed read of a Google LLC patent on Search Engine Land: "Data extraction using LLMs" (WO2025063948A1), published March 27, 2025, authored by researchers Aarthi Ramachandran and Nidhi Gupta.
The document describes a system in which a large language model crawls multiple pages and sources within a domain, then generates "an interpretation of the extracted content rather than a verbatim duplication." Translation: AI doesn't copy your site. It forms an opinion of it, and that opinion can become the version of your brand that shows up inside generative answers.
What the patent actually says
The patented system follows four steps, per Search Engine Land's analysis:
- Collection: the AI aggregates information from your website, maps data, directories, business listings, job postings and third-party sources.
- Interpretation: it reads content not structured for machines, going beyond simple keyword extraction.
- Attribute extraction: services, reputation, principles, social media sentiment, relationships between entities.
- Graph organization: attributes are arranged in a hierarchical structure linking products, services, audiences and locations.
In other words, Google describes a machine that manufactures an "identity file" of your business. That file isn't fed by what you say you are, but by what the whole web suggests you are. It's exactly the entity logic we already see in how AI Overviews cite sources and sometimes recommend your competitors.
Keep in mind: a patent protects a method, it doesn't prove deployment. There's no evidence this exact system powers AI Mode or AI Overviews today. But it confirms the direction: AI reasons in entities, not isolated pages.
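To make the four steps above concrete, here is a minimal Python sketch of the last one, graph organization. It is a toy illustration, not the patent's implementation: the `EntityNode` class, the `build_profile` function and the sample observations are all hypothetical, and the attributes an LLM would extract from unstructured text are simply supplied by hand.

```python
from dataclasses import dataclass, field


@dataclass
class EntityNode:
    """One node of the hierarchy: a business, a service, a location, etc."""
    name: str
    kind: str
    children: list["EntityNode"] = field(default_factory=list)
    sources: set[str] = field(default_factory=set)


def build_profile(business: str, observations: list[tuple[str, str, str]]) -> EntityNode:
    """Group (source, kind, value) observations under a single business node.

    Each observation stands in for an attribute that a model would have
    extracted from a page or listing (assumption: extraction already happened).
    """
    root = EntityNode(name=business, kind="business")
    by_key: dict[tuple[str, str], EntityNode] = {}
    for source, kind, value in observations:
        # Treat values that differ only by case or outer whitespace as the same attribute.
        key = (kind, value.strip().lower())
        node = by_key.get(key)
        if node is None:
            node = EntityNode(name=value.strip(), kind=kind)
            by_key[key] = node
            root.children.append(node)
        # Record every source that supports this attribute.
        node.sources.add(source)
    return root


# Hypothetical sample data for an imaginary business.
profile = build_profile("Example Bakery", [
    ("website", "service", "Wedding cakes"),
    ("business listing", "service", "wedding cakes"),
    ("job posting", "location", "Lyon"),
])
for child in profile.children:
    print(child.kind, child.name, sorted(child.sources))
```

The point of the sketch is the `sources` set: an attribute confirmed by several independent sources is a stronger part of the profile than one that appears only on your own site.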
Why it matters for your visibility
For twenty years, SEO was played page by page: one keyword, one URL, one ranking. Entity logic changes the rules. If an AI builds your profile from dozens of sources, then a single inconsistency (a different job title on LinkedIn, an old address in a directory, a marketing promise missing from your site) becomes noise that dilutes the "characterization" the model retains.
It's the direct continuation of something we've been documenting for months: as Search Console rolls out generative AI performance reports, the question is no longer just "do I rank?" but "does AI describe me correctly when asked about my industry?"
Want to know how ChatGPT and Google describe your business today? We'll show you in a free GEO audit.
What to do now
You don't need to wait for the system to ship to act. The recommendations that follow from the patent are GEO fundamentals you can apply immediately:
| Lever | Concrete action |
|---|---|
| Multi-source consistency | Align your description (who you are, what you do, for whom) across your website, Google Business Profile, social media, press and job postings. |
| Brand attributes | Define the 3 to 5 attributes you want associated with your entity (reliability, local expertise, innovation) and surface them everywhere. |
| Evidence | Back every claim: customer reviews, case studies, awards, author expertise signals. |
| Entity relationships | Clarify the links between your products, locations, audiences and use cases; that's what the hierarchical graph tries to reconstruct. |
| Footprint audit | Assess how an AI would describe your business combining every available source, then fix the gaps. |
The habit to drop: believing that optimizing your homepage is enough. The habit to build: treating your brand as a distributed entity where every mention counts. It's the same shift that makes certain "chunking" tactics obsolete against AI semantic reading.
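A first pass at the footprint audit from the table above can be as simple as comparing the same fields across sources. The sketch below is one possible starting point under stated assumptions: the function names, the field names and the sample footprint are hypothetical, you collect the text from each source yourself, and the normalization is ASCII-only for brevity (accented characters are treated as separators).

```python
import re


def normalize(value: str) -> str:
    """Lowercase and collapse punctuation/whitespace so trivial differences are ignored."""
    return re.sub(r"[^a-z0-9]+", " ", value.lower()).strip()


def find_inconsistencies(
    footprint: dict[str, dict[str, str]]
) -> dict[str, dict[str, list[str]]]:
    """Return, per field, each distinct normalized value and the sources using it.

    Only fields with more than one distinct value are reported.
    """
    by_field: dict[str, dict[str, list[str]]] = {}
    for source, fields in footprint.items():
        for field_name, value in fields.items():
            variants = by_field.setdefault(field_name, {})
            variants.setdefault(normalize(value), []).append(source)
    return {name: v for name, v in by_field.items() if len(v) > 1}


# Hypothetical footprint: the same two fields as found on three sources.
footprint = {
    "website": {"address": "12 Rue de la Paix, Paris", "tagline": "Local SEO experts"},
    "business_profile": {"address": "12 rue de la paix Paris", "tagline": "Local SEO experts"},
    "old_directory": {"address": "4 Avenue Foch, Paris", "tagline": "Local SEO experts"},
}
print(find_inconsistencies(footprint))
```

With this sample data, only `address` is reported: the website and the business profile agree once punctuation and case are ignored, while the old directory still carries a different address. An exact-match check like this will miss paraphrased descriptions, so treat it as a way to catch hard facts (names, addresses, titles), not tone or positioning.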
What this article doesn't cover
This patent doesn't say that Google already uses the system in production, nor how heavily it weights each source. We don't have access to the model's internal signals, and no one, Google included, publishes a verifiable "entity score." This article describes a direction confirmed by an official document, not a guaranteed ranking recipe. Traffic figures specific to each industry depend on your competition and queries; estimating them is the job of a dedicated audit, not of a general article.
Our take
At Cicero, we've been saying the same thing since generative engines arrived: AI visibility is won upstream, on the consistency of your identity, not by stuffing keywords. This patent simply puts Google's words on a reality we already observe with our clients. Take back control of what AI remembers about you; otherwise it will invent a version of your brand you never chose.
FAQ
What does Google patent WO2025063948A1 describe?
Does the patent mean Google already uses this system?
How do I optimize my business for this kind of AI characterization?
Related reading
- Search Console rolls out generative AI performance reports
- AI Overviews cite listicles and recommend your competitors
- Why "chunking" is becoming useless against AI
Sources
- Google Patents, "Data extraction using LLMs" (WO2025063948A1), Google LLC patent, published March 27, 2025.
- Search Engine Land, analysis by Rich Sanger, June 22, 2026.
As a growth and SEO content strategist, I founded Cicéro to help businesses build lasting organic visibility, on Google and in AI-generated answers alike. Every piece of content we produce is designed to convert, not just to exist.