Wenhao Chai

Wenhao Chai is a first-year Ph.D. student in Computer Science at Princeton University, advised by Professor Karthik Narasimhan, and a student researcher at Google DeepMind. He received his master's degree from the University of Washington and his bachelor's degree from Zhejiang University.

His research spans computer vision and machine learning, with a focus on long-context multimodal modeling and reasoning. He has interned at Pika Labs, working with Professor Christopher D. Manning, and at Microsoft Research Asia.

He leads MovieChat, one of the first large multimodal models and benchmarks for hour-long video understanding with a memory mechanism. He co-leads LiveCodeBench Pro, which has been listed as an evaluation benchmark for frontier models such as Google Gemini and Meta Muse Spark. He has organized workshops and competitions at CVPR 2024, CVPR 2025, and CVPR 2026. His work has been featured by MIT Technology Review.

News & Highlights

  • Jun 2026 (Paper): One paper accepted by IJCV.
  • Jun 2026 (Paper): Two papers accepted by ECCV 2026.
  • Jun 2026 (Event): We organize the 2nd Workshop on Knowledge-Intensive Multimodal Reasoning at CVPR 2026.
  • May 2026 (Role): I join Google DeepMind as a student researcher.
  • May 2026 (Paper): Two papers accepted by ICML 2026.
  • Feb 2026 (Paper): Two papers accepted by CVPR 2026.
  • Jan 2026 (Paper): Five papers accepted by ICLR 2026.
  • Dec 2025 (Paper): One paper accepted by IEEE TIP.
  • Dec 2025 (Talk): I give an oral presentation at NeurIPS 2025 about Benchmarking Reasoning-Informed Visual Editing. Slides.
  • Oct 2025 (Paper): Video-MMLU received the Outstanding Paper Award at ICCV 2025 Workshop @ Knowledge-Intensive Multimodal Reasoning with Travel Grant.
  • Sep 2025 (Paper): One paper accepted by NeurIPS 2025, two papers accepted by NeurIPS 2025 Datasets and Benchmarks Track with one Oral.
  • Sep 2025 (Talk): Invited talk at Abaka AI and 2077AI titled Better and Longer Video Understanding. Slides.
  • Sep 2025 (Milestone): I join Princeton University as a CS Ph.D. student.
  • Aug 2025 (Paper): One paper accepted by IEEE TPAMI.
  • Aug 2025 (Talk): LiveCodeBench Pro presented at the Open AGI Symposium at University of California, Berkeley. Slides.
  • Jul 2025 (Press): Interviewed by DeepTech and MIT Technology Review China. Post.
  • Jun 2025 (Paper): One paper accepted by ICCV 2025.
  • Jun 2025 (Press): Featured in MIT Technology Review as one of the lead authors of LiveCodeBench Pro.
  • Jun 2025 (Paper): One paper accepted by IROS 2025.
  • May 2025 (Paper): One paper accepted by ACL 2025.
  • Apr 2025 (Event): We host CVPR 2025 Video Understanding Challenge @ LOVEU sponsored by Lambda.
  • Mar 2025 (Milestone): Graduated from the University of Washington with a Master's thesis on Large Multimodal Models for Video Captioning, nominated for the Distinguished Thesis Award by the ECE Department.
  • Feb 2025 (Paper): Three papers accepted by CVPR 2025.
  • Jan 2025 (Paper): Two papers accepted by ICLR 2025.
  • Jul 2024 (Paper): Two papers accepted by ECCV 2024.
  • Jun 2024 (Role): I work with Pika Labs as an intern to develop next-generation video understanding and generation models.
  • Apr 2024 (Event): We host CVPR 2024 Long-form Video Understanding Challenge @ LOVEU.
  • Apr 2024 (Talk): Invited talk at AgentX seminar about our STEVE series of works.
  • Feb 2024 (Paper): Two papers accepted by CVPR 2024 with one highlight (2.81%).
  • Feb 2024 (Talk): Invited talk at AAAI 2024 workshop @ IMAGEOMICS.
  • Dec 2023 (Paper): One paper accepted by AAAI 2024.
  • Jul 2023 (Paper): Two papers accepted by ICCV 2023.
  • Feb 2023 (Role): I become a research intern at Microsoft Research Asia (MSRA), advised by principal researcher Xun Guo.

Office Hours

FAQ for juniors

To junior master's and undergraduate students: if you would like to chat about life, career plans, or research ideas related to AI/ML, I dedicate at least 30 minutes every week to such meetings. I encourage students from underrepresented groups to reach out. Also check my calendar.

AuroraCap. Efficient, performant video detailed captioning, and a new benchmark.

Introduction to AuroraCap and the VDC benchmark. ICLR 2025, in collaboration with Pika Labs, Stanford, MIT, Harvard, and NYU.
