Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation


Diffusion models are a powerful generative framework, but come with expensive inference. Existing acceleration methods often compromise image quality or fail under complex conditioning when operating in an extremely low-step regime. In this work, we propose a novel distillation framework tailored to enable high-fidelity, diverse sample generation using just one to three steps. Our approach comprises three key components: (i) Backward Distillation, which mitigates training-inference discrepancies by calibrating the student on its own backward trajectory; (ii) Shifted Reconstruction Loss, which dynamically adapts knowledge transfer based on the current time step; and (iii) Noise Correction, an inference-time technique that enhances sample quality by addressing singularities in noise prediction. Through extensive experiments, we demonstrate that our method outperforms existing competitors in quantitative metrics and human evaluations. Remarkably, it achieves performance comparable to the teacher model using only three denoising steps, enabling efficient high-quality generation.
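To make the Backward Distillation idea concrete, here is a minimal PyTorch-style sketch of the training loop it implies: instead of supervising the student on forward-noised real images, the student's own few-step backward trajectory from pure noise supplies the training states, so the distributions seen at training time match those seen at inference time. All names (`student`, `teacher`, `ddim_step`, the alpha schedule) are illustrative assumptions, not the paper's actual code, and a plain epsilon-matching loss stands in for the paper's Shifted Reconstruction Loss.

```python
# Hedged sketch of Backward Distillation, assuming epsilon-prediction
# models and a deterministic DDIM-style sampler. Not the paper's code.
import torch

def ddim_step(x_t, eps, alpha_t, alpha_prev):
    """One deterministic DDIM update under an epsilon-prediction model."""
    x0_pred = (x_t - (1 - alpha_t).sqrt() * eps) / alpha_t.sqrt()
    return alpha_prev.sqrt() * x0_pred + (1 - alpha_prev).sqrt() * eps

def backward_distillation_loss(student, teacher, x_T, alphas, loss_fn):
    """Calibrate the student on its own backward trajectory.

    student, teacher: callables mapping (x_t, alpha_t) -> predicted noise.
    x_T: a batch of pure Gaussian noise, as used at inference.
    alphas: cumulative alpha-bar values for the K student steps, ordered
            from high noise to low noise (e.g. K = 3 for 3-step sampling).
    """
    x_t = x_T                       # start from pure noise, as at inference
    loss = x_T.new_zeros(())
    for k in range(len(alphas) - 1):
        a_t, a_prev = alphas[k], alphas[k + 1]
        # Key point: x_t came from the student's OWN previous predictions,
        # never from forward-noising real data -> no train/test mismatch.
        eps_student = student(x_t, a_t)
        with torch.no_grad():
            eps_teacher = teacher(x_t, a_t)  # teacher scores the same state
        loss = loss + loss_fn(eps_student, eps_teacher)
        x_t = ddim_step(x_t, eps_student.detach(), a_t, a_prev)
    return loss
```

The singularity that Noise Correction targets would live at the very first update in this loop, where `x_t` is pure noise; one plausible reading is to substitute the known input noise for the model's prediction at that step, though the abstract alone does not pin down the exact mechanism.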

