Title: Perceiving the 4D World from Any Video

Abstract: Perceiving the dynamic 3D world from visual inputs is essential for interacting with the physical environment. While recent advancements in data-driven methods have significantly improved models' ability to interpret 3D scenes, much of this progress has focused on static scenes or specific categories of dynamic objects. How can we effectively model general dynamic scenes in the wild? How can we achieve online perception with human-like capabilities? In this talk, I will first discuss holistic representations for 4D scenes and then present a framework for online dense perception that continuously refines scene understanding as new observations arrive. Finally, I will conclude with a discussion of future opportunities and challenges in developing robust, scalable systems for perceiving and understanding dynamic 3D environments in real-world settings.

Bio: Qianqian Wang is a postdoctoral researcher at UC Berkeley, working with Prof. Angjoo Kanazawa and Prof. Alyosha Efros. She received her PhD in Computer Science from Cornell University in 2023, advised by Prof. Noah Snavely and Prof. Bharath Hariharan. She is a recipient of the ICCV Best Student Paper Award and the Google PhD Fellowship, and was selected as an EECS Rising Star.