A New York Times analysis using OpenAI's SimpleQA benchmark found that Google's AI Overviews gives incorrect answers roughly 10% of the time, which translates to millions of errors per day at Google's scale. Accuracy improved from 85% to 91% between the Gemini 2.5 and 3.0 updates, but the remaining error rate is still problematic at that volume. Examples include the feature confidently citing contradictory or irrelevant sources for straightforward factual queries.
Background
Google's AI Overviews is a Gemini-powered feature that appears atop search results, designed to summarize information but criticized for accuracy issues since its 2024 launch. The SimpleQA benchmark is a standardized test with 4,000+ verifiable questions used to evaluate AI factuality.
- Source: Ars Technica
- Published: Apr 8, 2026 at 12:53 AM
- Score: 7.0 / 10