Supposed expert reviews of Google Gemini outputs are coming from non-experts

External testers are now required to address prompts outside of their "domain knowledge."
By Cecily Mauran
Google Gemini might have an accuracy problem, and not because of hallucination. Credit: Jaque Silva / NurPhoto / Getty Images

Like any genAI model, Google Gemini can sometimes give inaccurate responses, but in this case it might be because testers don't have the expertise to fact-check them.

According to TechCrunch, the firm hired to improve accuracy for Gemini is now making its testers evaluate responses even when they lack the relevant "domain knowledge."

The report raises questions about the rigor and standards Google says it applies to testing Gemini for accuracy. In the "Building responsibly" section of the Gemini 2.0 announcement, Google said it is "working with trusted testers and external experts and performing extensive risk assessments and safety and assurance evaluations." There's a reasonable focus on evaluating responses for sensitive and harmful content, but less attention is paid to responses that aren't necessarily dangerous but just inaccurate.


Google seems to disregard the hallucination and error problem by simply adding a disclaimer that "Gemini can make mistakes, so double-check it," which effectively absolves it of any responsibility. But that doesn't account for the humans doing the work behind the scenes.

Previously, GlobalLogic, a subsidiary of Hitachi, instructed its prompt engineers and analysts to skip any Gemini response they didn't fully understand. "If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task," said the guidelines viewed by the outlet.

But last week, GlobalLogic changed its instructions, saying, "You should not skip prompts that require specialized domain knowledge." Contractors are instead told to "rate the parts of the prompt you understand" and to note in their analysis that they lack the required expertise. Expertise, in other words, is not being treated as a prerequisite for this work.

Contractors can now only skip prompts that are "completely missing information," according to TechCrunch, or those that contain sensitive content that requires a consent form.

Cecily Mauran
Tech Reporter

Cecily is a tech reporter at Mashable who covers AI, Apple, and emerging tech trends. Before getting her master's degree at Columbia Journalism School, she spent several years working with startups and social impact businesses for Unreasonable Group and B Lab. Before that, she co-founded a startup consulting business for emerging entrepreneurial hubs in South America, Europe, and Asia. You can find her on X at @cecily_mauran.
