Blogs and Slides

Flow Matching Variant for Denoising Diffusion Codebook Models
Mar 16, 2025
In this blog, we introduce Denoising Diffusion Codebook Models (DDCM) and extend them to the flow matching framework.
Introducing muP
Mar 13, 2025
In this blog, we introduce muP (Maximal Update Parametrization), which studies how hyperparameters transfer across model scales.
What is the Intrinsic Dimension of Your Data?
Jan 15, 2025
In this blog, we introduce the concept of intrinsic dimension and provide a method to estimate it. Remarkably, ImageNet has an intrinsic dimension of only 50.
Posters
† Indicates in-person attendance.
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
ICLR 2025, Singapore
Ego3DT: Tracking Every 3D Object in Ego-centric Videos
ACM MM 2024, Melbourne, Australia
† MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
CVPR 2024, Seattle, WA
† STEVE Series: Step-by-Step Construction of Agent Systems in Minecraft
CVPR 2024 workshop, Seattle, WA
† UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning
AAAI 2024, Vancouver, Canada
† StableVideo: Text-driven Consistency-aware Diffusion Video Editing
ICCV 2023, Paris, France
See and Think: Embodied Agent in Virtual Environment
ECCV 2024, Milano, Italy
† Learning Diffusion Texture Priors for Image Restoration
CVPR 2024, Seattle, WA
Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation
ICLR 2024 workshop, Vienna, Austria
† Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation
WACV 2024, Waikoloa, Hawaii
† Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation
ICCV 2023, Paris, France