
Score Distillation via Reparametrized DDIM

NeurIPS 2024

¹Massachusetts Institute of Technology, ²University of Oxford, ³MIT‑IBM Watson AI Lab, ⁴Toyota Research Institute, ⁵Meta Reality Labs Research

TL;DR

Score Distillation Sampling (SDS) is a promising technique that makes it possible to use pre-trained 2D diffusion models for 3D generation. However, the quality of the generated 3D assets is limited.

In this work we:

  • 🔎 Theoretically show that SDS ≈ 2D Diffusion
  • 🚨 Reveal that the noise term in SDS is the reason for over-smoothing
  • 🛠️ Suggest a fix
  • ✅ Improve the quality of 3D generation

“A photograph of a ninja”

Main principle of Score Distillation

How does Score Distillation work?

Proposed in DreamFusion and Score Jacobian Chaining, Score Distillation is a method for generating 3D shapes using a pre-trained and frozen 2D diffusion model.

  1. Initialize a differentiable 3D representation
  2. Sample a random camera pose 📷
  3. Render a view of the object 🖼️
  4. Add noise to the rendering 🌫️
  5. Denoise the image with the 2D diffusion model
  6. Optimize the parameters of the 3D representation to match the denoised image 🎯
  7. Repeat steps 2-6 until convergence 🔁 (a minimal code sketch of this loop follows below)
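
A minimal sketch of one iteration of this loop, in PyTorch-style pseudocode. It assumes a differentiable renderer render(theta, camera), a frozen noise-predicting 2D model unet(x_t, t, prompt_emb), and precomputed cumulative ᾱ values in alphas_cumprod; all names and signatures are illustrative, not the exact API of any particular codebase.

    import torch

    def sds_step(theta, optimizer, render, sample_camera, unet, prompt_emb,
                 alphas_cumprod, w=1.0):
        camera = sample_camera()                        # 2. random camera pose
        x0 = render(theta, camera)                      # 3. differentiable rendering, (B, 3, H, W)

        t = torch.randint(20, 980, (1,), device=x0.device)    # random diffusion timestep
        a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
        eps = torch.randn_like(x0)
        x_t = a_t.sqrt() * x0 + (1 - a_t).sqrt() * eps  # 4. add noise to the rendering

        with torch.no_grad():                           # 5. denoise with the frozen 2D model
            eps_pred = unet(x_t, t, prompt_emb)

        # 6. SDS update: push the gradient w(t) * (eps_pred - eps) through the renderer only
        grad = w * (eps_pred - eps)
        loss = (grad.detach() * x0).sum()               # so that d(loss)/d(x0) == grad
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                # 7. repeat until convergence

Backpropagating the detached gradient through the rendering applies the SDS gradient to the 3D parameters without differentiating through the diffusion model itself.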

Often the generated shapes are over-smoothed and over-saturated.

What do we propose?

In this work we start from the first principles of image generation with diffusion models. First, we consider the steps of DDIM, which gradually remove noise from the image. Each update can be seen as denoising the image all the way with a single-step prediction and then adding a portion of the predicted noise back. By reshuffling the order of these updates, we can define a dual process that operates on the space of noise-free images.
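
For reference, the standard deterministic DDIM step can be written exactly in this "denoise fully, then re-noise" form; the LaTeX below uses the common cumulative \alpha_t notation, which may differ from the symbols used in the paper:

    \hat{x}_0(x_t) = \frac{x_t - \sqrt{1-\alpha_t}\,\epsilon_\theta(x_t, t)}{\sqrt{\alpha_t}}
    \qquad \text{(single-step denoising)}

    x_{t-1} = \sqrt{\alpha_{t-1}}\,\hat{x}_0(x_t) + \sqrt{1-\alpha_{t-1}}\,\epsilon_\theta(x_t, t)
    \qquad \text{(add a portion of the predicted noise back)}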

For a formal derivation, see the full paper.

“Pumpkin head zombie, skinny, highly detailed”

Good noise is all you need

Depending on the choice of the noise term, the dual process becomes identical either to Score Distillation or to DDIM.

The dual process is equivalent to SDS when the noise is sampled randomly:

Kappa in SDS
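
In the familiar DreamFusion-style notation (a paraphrase, not necessarily the exact symbols of the figure above), SDS draws a fresh Gaussian sample at every iteration and applies the gradient below, where x_0 is the current rendering:

    \epsilon \sim \mathcal{N}(0, I), \qquad
    x_t = \sqrt{\alpha_t}\,x_0 + \sqrt{1-\alpha_t}\,\epsilon

    \nabla_\theta \mathcal{L}_{\mathrm{SDS}}
    = \mathbb{E}_{t,\epsilon}\!\left[ w(t)\,\big(\epsilon_\theta(x_t, t) - \epsilon\big)\,
      \frac{\partial x_0}{\partial \theta} \right]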

Thus each SDS update step corresponds to a different DDIM trajectory. This inconsistency averages the final result across multiple trajectories and leads to blurriness.

SDS mixing trajectories

From the reparametrization intuition we know that the noise term should follow a specific structure:

Kappa in DDIM
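
One way to state this structure (a paraphrase of the consistency requirement, not necessarily the paper's exact formula): the noise paired with the current rendering x_0 should agree with the model's own prediction at the corresponding noisy point, i.e. it should solve a fixed-point equation in \epsilon:

    \epsilon_\theta\!\big(\sqrt{\alpha_t}\,x_0 + \sqrt{1-\alpha_t}\,\epsilon,\; t\big) = \epsilon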

In practice, this is hard to solve exactly, as it involves inverting the trained diffusion model. We suggest using DDIM inversion to find an approximate solution.

SDS mixing trajectories

We obtain the noise term with DDIM inversion conditioned on the current rendering:

Kappa in ours

This anchors the score distillation process to DDIM trajectories that are consistent with each other both over time and across the different rendering angles.

SDS mixing trajectories
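
Below is a minimal sketch of how such a noise term could be obtained by DDIM inversion of the current rendering. It assumes a noise-predicting unet(x, t, prompt_emb), cumulative ᾱ values in alphas_cumprod, and an increasing list of timesteps; the names, schedule, and guidance handling are illustrative rather than the paper's exact implementation.

    import torch

    @torch.no_grad()
    def inverted_noise(x0, unet, prompt_emb, alphas_cumprod, timesteps):
        """Approximate the DDIM-consistent noise for the clean rendering x0.

        Runs DDIM inversion from x0 up to the last timestep in `timesteps`,
        then reads off the noise eps for which
        sqrt(a_t) * x0 + sqrt(1 - a_t) * eps equals the inverted point.
        """
        x = x0
        a_prev = torch.ones((), device=x0.device)        # clean image corresponds to a = 1
        for t in timesteps:                               # increasing, e.g. [20, 40, ..., t_target]
            a_t = alphas_cumprod[t]
            eps = unet(x, t, prompt_emb)                  # noise prediction at the current point
            x0_hat = (x - (1 - a_prev).sqrt() * eps) / a_prev.sqrt()   # single-step denoise
            x = a_t.sqrt() * x0_hat + (1 - a_t).sqrt() * eps           # re-noise to level t
            a_prev = a_t
        # Noise term implied by the inverted point; it replaces the random sample in SDS
        return (x - a_prev.sqrt() * x0) / (1 - a_prev).sqrt()

The returned noise is then used in place of the random sample in the score distillation update, so every iteration is anchored to a DDIM trajectory for the current view.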

Results

“A DSLR photograph of a freshly baked round loaf of sourdough bread”

“Robotic bee high detail”

“Pumpkin head zombie, skinny, highly detailed”

“A ripe strawberry”

“An ice cream sundae”

“A photograph of a knight”

See the paper for detailed comparisons with ProlificDreamer, Noise-Free SD, HiFA, Lucid Dreamer, and other amazing works in Score Distillation.

BibTeX

If you find this work useful, please consider citing:


@misc{lukoianov2024score,
    title={Score Distillation via Reparametrized DDIM},
    author={Artem Lukoianov and Haitz Sáez de Ocáriz Borde and Kristjan Greenewald and
            Vitor Campagnolo Guizilini and Timur Bagautdinov and Vincent Sitzmann and
            Justin Solomon},
    year={2024},
    eprint={2405.15891},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
For any questions, please contact arteml@mit.edu.