
Marc Brooker wrote about LLMs as parts of systems, and about how systems built with LLMs and tools can do things that LLMs alone cannot.
How media typically covers Marc Brooker
Marc Brooker as author
Pass@k is an exponentially forgiving metric that misrepresents agent performance and should rarely be used in AI evaluation, because human users need every step to succeed and do not accept occasional successes.
“Author of "Pass@k Is Mostly Bunk"”
Referenced in coverage
LLMs are most powerful when integrated into broader systems with code interpreters, databases, browsers, and formal reasoning tools rather than used in isolation, enabling capabilities and cost efficiencies unattainable by LLMs alone.
“Wrote about LLMs as parts of systems and how systems built with LLMs and tools can do things that LLMs alone cannot do.”