
Author of "AI Models Are Already as Good as Experts at Half of Tasks, New OpenAI Benchmark GDPval Suggests" in Fortune
How this journalist typically writes
Jeremy Kahn as author
An MIT report finding that 95% of AI pilot projects fail to deliver financial returns points to enterprise adoption challenges that run deeper than the concerns that triggered market sell-offs.
Author of "An MIT report that 95% of AI pilots fail spooked investors. But it's the reason why those pilots fai"
OpenAI's GDPval benchmark shows Claude Opus 4.1 achieving expert-level performance on 47.6% of real-world professional tasks across 44 professions. Performance varies significantly across models, with GPT-4o performing worst at only 10%.
Author of "AI Models Are Already as Good as Experts at Half of Tasks, New OpenAI Benchmark GDPval Suggests" in Fortune