Author of "DualPath for High-Throughput Agentic LLM Inference" on arXiv
How media typically covers Yongtong Wu
Yongtong Wu as author
DualPath is an inference system that breaks the KV-Cache storage-bandwidth bottleneck in agentic LLM inference. Its dual-path loading improves throughput by up to 1.87× offline and 1.96× online.