

Submitted by Style Pass, 2024-11-18 01:00:07

This is the official codebase for the paper "Wavelet Latent Diffusion (WaLa): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings".

This model uses a voxelized representation of the object with a resolution of 16³. The voxel file is a JSON file containing the following keys: resolution, occupancy, and color.
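To make the voxel file layout concrete, here is a minimal sketch of a loader that checks the three keys named above. The source does not say how occupancy and color are laid out, so the sketch assumes (hypothetically) that occupancy is a list of [x, y, z] indices and color is a parallel list of RGB triples; `load_voxels` is an illustrative helper, not part of the codebase.

```python
import json

REQUIRED_KEYS = {"resolution", "occupancy", "color"}


def load_voxels(text, expected_resolution=16):
    """Parse a voxel JSON string and return a dict mapping (x, y, z) -> color.

    Assumption (not confirmed by the source): "occupancy" is a list of
    [x, y, z] integer indices and "color" is a parallel list of colors.
    """
    data = json.loads(text)
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"voxel file is missing keys: {sorted(missing)}")

    res = data["resolution"]
    if res != expected_resolution:
        raise ValueError(f"expected resolution {expected_resolution}, got {res}")
    if len(data["occupancy"]) != len(data["color"]):
        raise ValueError("occupancy and color must have the same length")

    voxels = {}
    for (x, y, z), rgb in zip(data["occupancy"], data["color"]):
        # Every index must fall inside the res x res x res grid.
        if not all(0 <= v < res for v in (x, y, z)):
            raise ValueError(f"voxel index {(x, y, z)} is outside the grid")
        voxels[(x, y, z)] = tuple(rgb)
    return voxels


# Hypothetical two-voxel example at the documented 16^3 resolution.
sample = json.dumps({
    "resolution": 16,
    "occupancy": [[0, 0, 0], [15, 15, 15]],
    "color": [[255, 0, 0], [0, 0, 255]],
})
voxels = load_voxels(sample)
```

If the real files store occupancy as a dense grid instead, only the loop over `zip(...)` would need to change; the key and resolution checks stay the same.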

For multi-view input, the model uses multiple images of the same object captured from different camera angles. These images should be named according to the index of the camera view parameters, as described in Data Formats.
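As a rough illustration of the naming rule, the sketch below orders a set of image files by the view index in their names. It assumes, hypothetically, that each filename stem is just the integer index (for example `3.png`); the helper names `view_index` and `order_views` are illustrative and not taken from the repository.

```python
import re
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def view_index(filename):
    """Return the integer view index encoded in the filename stem, or None."""
    stem = Path(filename).stem
    return int(stem) if re.fullmatch(r"\d+", stem) else None


def order_views(filenames):
    """Map view index -> filename, sorted by index.

    Files that are not images, or whose stem is not a plain integer, are skipped.
    """
    views = {}
    for name in filenames:
        idx = view_index(name)
        if idx is None or Path(name).suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if idx in views:
            raise ValueError(f"duplicate view index {idx}: {views[idx]} and {name}")
        views[idx] = name
    return dict(sorted(views.items()))


# Numeric ordering matters: a plain string sort would put "10.png" before "3.png".
views = order_views(["10.png", "3.png", "notes.txt", "0.png"])
```

Each index can then be looked up against the camera parameters listed in Data Formats.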

For 4-view depth-map input, the model uses 4 depth-map images of the same object, captured from different camera angles, to create a 3D object.

For 6-view depth-map input, the model uses 6 depth-map images of the same object, captured from different camera angles, to create a 3D object.
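Since the depth-map models expect either 4 or 6 views, a small pre-flight check can catch a wrong image count before inference. This is only a sketch; `check_depth_views` is a hypothetical helper and not a function from the codebase.

```python
SUPPORTED_DEPTH_VIEW_COUNTS = (4, 6)


def check_depth_views(depth_maps):
    """Return the number of depth maps if it matches a supported count."""
    count = len(depth_maps)
    if count not in SUPPORTED_DEPTH_VIEW_COUNTS:
        raise ValueError(
            f"got {count} depth maps, expected one of {SUPPORTED_DEPTH_VIEW_COUNTS}"
        )
    return count


# Hypothetical filenames, used only to exercise the check.
n_views = check_depth_views(["0.png", "1.png", "2.png", "3.png"])
```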

Multi-View Input: A set of image files taken from different camera angles. The filenames correspond to specific camera parameters. Below is a table that maps the index of each image to its corresponding camera rotation and elevation:
