The rapid growth of AI search has produced a wave of tools that promise to track brand visibility in ChatGPT, Perplexity, Gemini, and other large language model (LLM) platforms. These tools are marketed as the next step in search engine optimization, often under the term “AI Engine Optimization,” or AEO.

A recent LinkedIn discussion led by Ahrefs CMO Tim Soulo, referencing Nathan Gotch’s YouTube video “AI Search Rank Trackers Are Lying to You,” sheds light on what is fact versus hype in the AI chatbot space. Many of these products create misconceptions about how much visibility data they can really provide.

But what are AI search rank trackers not saying? Soulo provides answers.

No company has access to real user prompts

Soulo explains that one of the most important facts is that no tool has direct access to the actual prompts people type into ChatGPT, Perplexity, Gemini, or other LLMs. These companies do not share their user prompt data.

According to him, marketers can only use “clickstream data” to get some insight into what people are searching for, but even the largest clickstream panels only capture a small slice of user behavior.

This means the visibility reports that advertisers see are not based on true user prompts. Instead, they are stitched together from assumptions. Any platform claiming to show “real” AI search visibility is likely overstating its data access.

Most tools depend on synthetic prompts

Another issue is that most AI visibility trackers use “synthetic” prompts. This means they rely on AI to generate lists of possible questions customers might ask. For example, a platform might scan a website and then create potential prompts such as “What is the best running shoe for flat feet?” But these are generated predictions, not verified customer inputs.
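To make the pattern Soulo describes concrete, here is a minimal sketch of what such a synthetic-prompt pipeline could look like. This is an assumption about how a tracker might work, not any vendor’s actual code: the model name, the site summary, and the brand are all hypothetical placeholders, and the sketch assumes the openai Python package (v1+) with an OPENAI_API_KEY set in the environment.

```python
# Hypothetical sketch of a "synthetic prompt" visibility check.
# Assumes the openai package (v1+) and an OPENAI_API_KEY env var;
# the model name, summary, and brand are illustrative only.
from openai import OpenAI

client = OpenAI()

def generate_synthetic_prompts(site_summary: str, n: int = 5) -> list[str]:
    """Ask an LLM to guess questions customers *might* type.

    These are generated predictions, not real user prompts --
    exactly the limitation the article describes.
    """
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any chat model works
        messages=[{
            "role": "user",
            "content": (
                f"Based on this site summary, list {n} questions a "
                f"potential customer might ask an AI assistant, one "
                f"per line:\n\n{site_summary}"
            ),
        }],
    )
    lines = resp.choices[0].message.content.splitlines()
    return [line.strip() for line in lines if line.strip()][:n]

def brand_mentioned(prompt: str, brand: str) -> bool:
    """Run one synthetic prompt and check the answer for the brand."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return brand.lower() in resp.choices[0].message.content.lower()

if __name__ == "__main__":
    prompts = generate_synthetic_prompts(
        "An online store selling running shoes for flat feet."
    )
    for p in prompts:
        print(brand_mentioned(p, "Acme Running"), "-", p)
```

Note that any visibility score such a pipeline produces inherits the model’s guesses twice over: once when inventing the prompts and once when answering them.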

Why reliable AI visibility data is hard to get

Even when tools try to simulate AI search visibility, the process is full of obstacles. According to Soulo, several factors create major inconsistencies:

  • People rarely use short keyword-style prompts. Instead, they type longer, more conversational sentences that differ from one another.
  • Personalization is turned on by default in tools like ChatGPT, meaning two users can see different answers for the same query.
  • Different AI models, such as GPT-3.5, GPT-4, or Gemini, return different results for the same input.
  • Even repeating the same prompt can produce slightly different answers, shifting brand positions in the response (a quick test of this appears after the list).
  • LLMs run internal “query fan-outs” to fetch supporting information, and these internal calls vary constantly.
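The repeatability problem in particular is easy to demonstrate. The sketch below, again assuming the openai package and an illustrative model name and brand, sends an identical prompt several times and records where the brand lands in each answer; the positions typically drift from run to run.

```python
# Hypothetical repeatability test: same prompt, several runs.
# Assumes the openai package (v1+) and OPENAI_API_KEY; the model
# name, prompt, and brand are placeholders, not from the article.
from openai import OpenAI

client = OpenAI()

PROMPT = "What are the best running shoes for flat feet?"
BRAND = "Acme Running"
RUNS = 5

for run in range(1, RUNS + 1):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # assumed model
        temperature=1.0,       # default-like sampling; answers vary
        messages=[{"role": "user", "content": PROMPT}],
    )
    answer = resp.choices[0].message.content.lower()
    # Character offset is a crude stand-in for "rank" in the answer.
    pos = answer.find(BRAND.lower())
    status = f"mentioned at offset {pos}" if pos >= 0 else "not mentioned"
    print(f"run {run}: {status}")
```

Averaging over many runs smooths some of this noise, but it cannot recover personalization effects or the model’s internal fan-out queries, which are invisible from the outside.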

What advertisers and marketers should take away

The push for AI search visibility tracking is still in its early stages. The industry does not yet have the reliable data sources or standardized measurement that traditional SEO is built on. If advertisers treat these tools’ outputs as definitive rankings, they risk making decisions based on flawed inputs.

The most important step right now is for advertisers and brands to look closely at how each tool collects its data and understand what its reports are really showing. Knowing where the data comes from, how prompts are generated, and what limitations exist is key before using AI visibility reports to guide ad spend or content strategies.
