
Co-authored the How2Everything framework for evaluating LLM procedural generation capabilities.
How media typically covers Kyle Lo
Referenced in coverage
How2Everything is a new framework for mining 351K how-to procedures from the web. It enables scalable evaluation and improvement of LLM procedural generation through an LLM judge that achieves 80.5% agreement with human judgments and yields measurable performance gains via reinforcement learning.