Author of "DFlash Speeds Up Speculative Decoding" in Z-Lab
Yesheng Liang as author
DFlash achieves up to 6× lossless speedup for LLM inference by using block diffusion for speculative decoding, making it nearly 2.5× faster than EAGLE-3, the current state of the art.