Researcher at Google DeepMind
Co-author of the SimpleQA Verified benchmark paper
How media typically covers Giovanni D'Antonio
Giovanni D'Antonio as author
SimpleQA Verified, an improved factuality benchmark for LLMs that measures parametric knowledge and hallucination mitigation, shows Gemini 2.5 Pro achieving a state-of-the-art F1 score of 55.6, outperforming GPT-5.