Researcher

Jeremy Berman

Achieved the highest score on ARC-AGI v1 (79.6%) and v2 (29.4%) using multi-agent collaboration with evolutionary test-time compute.

Coverage Patterns

How media typically covers Jeremy Berman

Article Types

research: 1

Media Angles

technical: 1

Narrative Framing

breakthrough: 1

Associated AI Models

Grok-4: 1
DeepSeek R1: 1

Articles

Mentioned In

Referenced in coverage

How I Got the Highest Score on ARC-AGI Again Swapping Python for English

Jeremy Berman Substack · research · positive · Sep 17, 2025

Jeremy Berman achieved a new state-of-the-art score of 79.6% on ARC-AGI v1 and 29.4% on v2 using an evolutionary test-time compute architecture that works with English instructions instead of Python, with 25× greater efficiency than OpenAI's o3.

“Achieved the highest score on ARC-AGI v1 (79.6%) and v2 (29.4%) using multi-agent collaboration with evolutionary test-time compute.”

AI Research, Foundation Models, LLMs, Reinforcement Learning