Equal contribution author on Depth Anything 3 for monocular and multi-view depth estimation.
ByteDance released Depth Anything 3, a unified transformer-based model that uses a single depth-ray representation to predict spatially consistent geometry from arbitrary visual inputs, supporting monocular depth estimation, multi-view depth estimation, pose estimation, and 3D Gaussian generation.