Extracting Training Data From Fine-Tuned Stable Diffusion Models

submitted by
Style Pass
2024-10-07 12:30:02

This could potentially provide legal evidence in cases where an artist's style has been copied, or where copyrighted images have been used to train generative models of public figures, IP-protected characters, or other content.

From the new paper: original training images are seen in the row above, and the extracted images are depicted in the row below. Source: https://arxiv.org/pdf/2410.03039

Fine-tuned models of this kind are widely and freely available on the internet, primarily through the enormous user-contributed archives of civit.ai and, to a lesser extent, on the Hugging Face repository platform.

The new framework developed by the researchers is called FineXtract, and the authors contend that it achieves state-of-the-art results in this task.

‘[Our framework] effectively addresses the challenge of extracting fine-tuning data from publicly available DM fine-tuned checkpoints. By leveraging the transition from pretrained DM distributions to fine-tuning data distributions, FineXtract accurately guides the generation process toward high-probability regions of the fine-tuned data distribution, enabling successful data extraction.’
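
One way to picture the 'transition' the authors describe is as a guidance-style extrapolation between two noise predictions. At each denoising step, the pretrained and fine-tuned checkpoints are both queried, and the result is pushed further in the direction that fine-tuning moved it. The sketch below is only an illustration of that general idea, not the paper's exact formulation: the function name, the parameter names and the linear form of the update are all assumptions made for this example.

```python
import numpy as np


def extrapolate_noise(eps_pretrained, eps_finetuned, guidance_scale):
    """Hypothetical helper: push a noise prediction along the
    pretrained-to-fine-tuned direction.

    guidance_scale = 0 returns the pretrained prediction,
    guidance_scale = 1 returns the fine-tuned prediction,
    guidance_scale > 1 extrapolates beyond the fine-tuned prediction.
    The linear form is an assumption for illustration only.
    """
    eps_pretrained = np.asarray(eps_pretrained, dtype=float)
    eps_finetuned = np.asarray(eps_finetuned, dtype=float)
    # Direction in which fine-tuning shifted the prediction.
    shift = eps_finetuned - eps_pretrained
    return eps_pretrained + guidance_scale * shift


# Toy vectors standing in for the two checkpoints' noise predictions.
pre = np.array([1.0, 1.0])
fine = np.array([2.0, 3.0])
guided = extrapolate_noise(pre, fine, guidance_scale=2.0)
```

In a real sampler, the two inputs would come from the pretrained and fine-tuned denoising networks evaluated on the same noisy latent; here they are toy vectors so that the arithmetic is easy to follow.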

Far right, the original image used in training. Second from right, the image extracted via FineXtract. The other columns represent alternative, prior methods. Please refer to the source paper for better resolution.
