CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models

submitted by
Style Pass
2024-11-29 07:00:03

Given an input monocular video, we generate multi-view videos at novel viewpoints using our multi-view video diffusion model. These generated videos are then used to reconstruct the dynamic 3D scene as deforming 3D Gaussians.

Click on the images below to render 4D scenes in real time in your browser, powered by Brush! Note that this viewer is experimental and rendering quality may be reduced.

At the core of CAT4D is a multi-view video diffusion model that disentangles control of camera motion from control of scene motion. We demonstrate this by generating three types of output sequences given three input images (with camera poses): 1) fixed viewpoint and varying time, 2) varying viewpoint and fixed time, and 3) varying viewpoint and varying time.
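
To make the three sequence types concrete, here is a minimal sketch that enumerates the (viewpoint, time) targets each one would request. It is not CAT4D's code: the function name `output_sequence`, the mode strings, and the integer indexing of viewpoints and timestamps are all hypothetical, and the third mode assumes viewpoint and time advance in lockstep, which the page does not specify.

```python
from typing import List, Tuple

# Hypothetical convention: a target is (viewpoint index, time index).
Target = Tuple[int, int]


def output_sequence(
    mode: str,
    num_frames: int,
    fixed_view: int = 0,
    fixed_time: int = 0,
) -> List[Target]:
    """Return the (viewpoint, time) targets for one output sequence type."""
    if mode == "fixed_view_varying_time":
        # 1) Camera stays put while the scene plays forward.
        return [(fixed_view, t) for t in range(num_frames)]
    if mode == "varying_view_fixed_time":
        # 2) Scene is frozen at one instant while the camera moves.
        return [(v, fixed_time) for v in range(num_frames)]
    if mode == "varying_view_varying_time":
        # 3) Camera and scene both change; assumed here to advance together.
        return [(i, i) for i in range(num_frames)]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in (
        "fixed_view_varying_time",
        "varying_view_fixed_time",
        "varying_view_varying_time",
    ):
        print(mode, output_sequence(mode, num_frames=4))
```

Separating the two indices this way mirrors the disentanglement claim: each target names a camera and a moment independently, so any combination can in principle be requested.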
