Why Reporting On LLM Traffic Is So Hard

This article explains the difficulties in accurately reporting on traffic generated by Large Language Models (LLMs). It details how data from AI tools is often an educated guess due to privacy laws and the conversational nature of AI search, contrasting it with keyword-based traditional search. The piece also offers strategies employed by Geeky Tech to derive insights from the limited data available, such as using regex for tracking and making inferences based on SEO experience.

Q&A

Q: Why is reporting on LLM traffic difficult?

Reporting on LLM traffic is difficult for three reasons. First, LLM data is proprietary, so direct measurement is impossible and reporting has to rely on indirect methods such as referral traffic and scraped data. Second, data privacy laws (such as GDPR and CCPA) reduce the visibility of referral traffic by up to 50% because of cookie opt-outs. Third, keyword-based search (Google) and conversational search (LLMs) are fundamentally different: user journeys in conversational search are complex and varied, which makes it hard to attribute actions to specific prompts or starting points.

Q: How does Geeky Tech attempt to report on LLM traffic?

Geeky Tech employs several strategies to navigate the challenges of LLM traffic reporting. They set up precise filtering in Google Analytics 4 using regex to capture AI referrals from major LLMs. They also leverage their SEO experience to make educated inferences from the limited session data, such as determining if a user query was branded or informational based on the landing page. Finally, they adopt a 'wait and see' attitude, continuously monitoring metrics and adapting strategies as the field evolves.
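The two techniques mentioned above, a regex filter for AI referrers and a landing-page inference, can be sketched roughly as follows. This is a minimal illustration and not Geeky Tech's actual configuration: the hostname list, the `BRANDED_PATHS` set, the `/blog/` prefix rule, and the function names are all assumptions made for the example. GA4 evaluates regular expressions with its own engine, so a pattern should be checked there before it is relied on.

```python
import re
from urllib.parse import urlparse

# Assumed list of AI referrer hostnames; verify it against your own referral reports.
AI_REFERRER_PATTERN = re.compile(
    r"(^|\.)(chatgpt\.com|chat\.openai\.com|perplexity\.ai|"
    r"gemini\.google\.com|claude\.ai|copilot\.microsoft\.com)$",
    re.IGNORECASE,
)

# Hypothetical mapping: pages a visitor usually reaches by asking for the brand by name.
BRANDED_PATHS = {"/", "/about", "/contact", "/pricing"}


def is_ai_referral(referrer_url: str) -> bool:
    """Return True if the referrer's hostname matches a known AI tool."""
    host = urlparse(referrer_url).hostname or ""
    return bool(AI_REFERRER_PATTERN.search(host))


def infer_query_type(landing_path: str) -> str:
    """Guess whether the originating prompt was branded or informational.

    This is an inference from the landing page only; the prompt itself is not visible.
    """
    path = landing_path.rstrip("/") or "/"
    if path in BRANDED_PATHS:
        return "branded"
    if path.startswith("/blog/"):  # assumed location of informational content
        return "informational"
    return "unknown"


if __name__ == "__main__":
    sessions = [
        ("https://chatgpt.com/", "/"),
        ("https://www.perplexity.ai/search?q=x", "/blog/llm-traffic/"),
        ("https://www.google.com/", "/pricing"),
    ]
    for referrer, landing in sessions:
        if is_ai_referral(referrer):
            print(referrer, "->", infer_query_type(landing))
```

Anchoring the pattern at the end of the hostname, with a leading dot or start-of-string, keeps lookalike domains from being counted as AI referrals. The "unknown" bucket is deliberate: with session data this limited, leaving a session unclassified is safer than forcing a guess.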

Questions not yet answered

Q: What are the specific technical limitations of accessing LLM data?

A thorough answer would detail the API restrictions, data formatting issues, and the lack of standardized protocols that prevent direct access to LLM interaction logs. It would also cover the computational resources required to process such vast amounts of data, and how current infrastructure may not be equipped for real-time, comprehensive analysis.

Q: How do different LLMs compare in terms of the data they make available or the difficulty of tracking their traffic?

This would involve a comparative analysis of major LLMs like ChatGPT, Gemini, Claude, and Perplexity, detailing any unique features or limitations each presents for traffic analysis. It would explore whether certain models are more 'opaque' or 'transparent' regarding user interactions and referral data, and how this impacts reporting accuracy for each.

Q: What are the ethical implications of trying to track or infer user behavior from LLM interactions?

An ideal answer would discuss the privacy concerns related to user data when interacting with LLMs, even indirectly. It would explore the balance between marketers' need for data and users' right to privacy, touching upon consent, anonymization, and the potential for misinterpretation of user intent, as well as the ethical considerations of companies attempting to 'reverse-engineer' user journeys.

Follow-up questions

Q: How can businesses proactively optimize their content for conversational AI search?

This would involve strategies like focusing on long-tail keywords, creating content that directly answers complex questions, structuring content in a way that's easily digestible by AI (e.g., using clear headings, summaries, and FAQs), and understanding the different types of queries users might pose to LLMs when searching for information related to a business's products or services.

Q: What emerging tools or technologies are being developed to improve LLM traffic reporting?

This would explore potential future solutions such as advancements in AI analytics platforms, new browser features or standards for referral tracking that accommodate AI interactions, or direct partnerships between LLM providers and analytics companies. It might also touch upon innovations in synthetic data generation or privacy-preserving analysis techniques.

Q: Beyond referral traffic, what other metrics can indicate the effectiveness of LLM visibility for a brand?

An answer would explore alternative KPIs such as brand mentions within LLM-generated content, the sentiment of AI-driven responses related to the brand, the conversion rates of users who interact with the brand after an LLM-guided search, or even qualitative feedback from customers about how they discovered the brand through AI.
