Let's dive deep into the world of randomness, shall we? Specifically, the randomness that governs the diffusion process, and the wonders of controlling it. This post explores how tiny seeds of noise grow into the lush, vibrant gardens of generated images, drawing on the paper at https://arxiv.org/abs/2405.14828
Introduction: The Seed of the Matter
The diffusion process, or I should say the reverse diffusion process, behind text-to-image models such as Stable Diffusion 2.0 or SDXL Turbo essentially refines random noise into a coherent image. At the heart of this process lies the seed: a seemingly insignificant number that determines the initial state of the latents and subsequently influences the entire generative trajectory.
The Diffusion Process: A Sparse Overview
Note: there are very good guides online on how diffusion works; this is just a sparse introduction to the diffusion process to build up the intuition.
The diffusion process is an iterative mechanism that converts random noise into an image. This process can be broadly divided into two stages: forward diffusion and reverse diffusion. Let's delve into these stages to understand their technical intricacies.
Forward Diffusion: Adding Noise
The forward diffusion process gradually corrupts an image by adding Gaussian noise over several steps. Mathematically, it is described by the following equation:
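In the standard DDPM notation (an assumption here, since the symbols are not defined elsewhere in this post: x_t is the image after t noising steps and beta_t is the noise variance added at step t), a single forward step is usually written as:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)
```

Each step slightly shrinks the previous image and mixes in a small amount of fresh Gaussian noise.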
Over time, the image is transformed into pure noise, which serves as the starting point for the reverse diffusion process.
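The drift toward pure noise can be checked numerically. Below is a minimal NumPy sketch, assuming the common DDPM closed form x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps with a linear beta schedule. The schedule values, the helper names `make_alpha_bar` and `forward_diffuse`, and the all-ones stand-in image are illustrative assumptions, not something taken from the paper.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t directly from x_0: sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

alpha_bar = make_alpha_bar()
rng = np.random.default_rng(0)

x0 = np.ones((64, 64))                               # stand-in "image"
x_early = forward_diffuse(x0, 10, alpha_bar, rng)    # still close to x0
x_late = forward_diffuse(x0, 999, alpha_bar, rng)    # almost pure noise

print("signal kept at t=10: ", np.sqrt(alpha_bar[10]))
print("signal kept at t=999:", np.sqrt(alpha_bar[999]))
```

Early on almost all of the original signal is retained; by the last step the retained fraction is close to zero, which is why sampling can start from plain Gaussian noise.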
Reverse Diffusion: Removing Noise
The reverse diffusion process, guided by a pre-trained model, aims to denoise the noisy latent variables step-by-step, eventually reconstructing a high-quality image. This is expressed as:
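In the usual DDPM notation (again an assumption: mu_theta and Sigma_theta stand for the mean and covariance predicted by the trained network with parameters theta), one reverse step is commonly written as:

```latex
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)
```

Sampling starts from Gaussian noise x_T and applies this step repeatedly until it reaches x_0.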
Seeds: The Architects of Randomness
Moving on to the core focus of this blog: seeds play a crucial role in this process. Each seed initializes the random number generator that produces the initial noise and the noise added at each reverse diffusion step. Consequently, the choice of seed can lead to vastly different generated images even for the same textual description.
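To make the seed-to-noise link concrete, here is a small NumPy sketch. It is only an illustration: the helper `initial_latents` is hypothetical, and the latent shape (4, 64, 64) is an assumed example rather than a value stated in this post. PyTorch-based pipelines typically do the same thing with a seeded `torch.Generator`.

```python
import numpy as np

def initial_latents(seed, shape=(4, 64, 64)):
    """Draw the starting Gaussian latents from a generator seeded with `seed`."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

a = initial_latents(42)
b = initial_latents(42)
c = initial_latents(43)

same_seed_matches = np.array_equal(a, b)        # same seed, identical noise
different_seed_matches = np.array_equal(a, c)   # different seed, different noise

print("same seed reproduces latents:", same_seed_matches)
print("different seed reproduces latents:", different_seed_matches)
```

The same seed always reproduces the same starting latents, so everything downstream of them is reproducible too, while a neighbouring seed starts the trajectory from an unrelated point.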
Exploring the Impact of Seeds
To illustrate the profound impact of seeds, let's consider two examples:
These seeds were identified by the paper's authors through extensive experimentation and analysis. The FID score, where lower values indicate higher quality, highlights how significantly the choice of seed can affect the output quality.
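For reference, FID (Fréchet Inception Distance) compares the statistics of Inception-network features extracted from real and generated images. In the notation assumed here, (mu_r, Sigma_r) and (mu_g, Sigma_g) are the feature mean and covariance of the real and generated sets:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
```

The score is zero when both sets of statistics coincide, which is why lower is better.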
Seed and Intermediate Noise
An interesting facet of the seed's influence is its effect on the intermediate noise injected during the reverse diffusion process. Our experiments and those in the paper revealed that the initial noise predominantly determines the final image, while the noise added at each step, which is also controlled by the seed, has little effect on quality and does not substantially change the final output.
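One way to probe this separation is to give the initial noise and the per-step noise their own random generators, so each can be varied independently. The sketch below only shows that setup: `toy_denoiser` is a hypothetical stand-in for a trained model, and `sample`, its parameters, and the noise scale are assumptions for illustration, so the numbers it produces say nothing about real image quality.

```python
import numpy as np

def toy_denoiser(x, t):
    """Hypothetical stand-in for a trained model: shrinks x slightly each step."""
    return 0.9 * x

def sample(init_seed, step_seed, num_steps=20, shape=(4, 8, 8), step_noise_scale=0.05):
    """Toy stochastic sampler with separate RNGs for initial and per-step noise."""
    init_rng = np.random.default_rng(init_seed)   # controls the starting latents
    step_rng = np.random.default_rng(step_seed)   # controls noise injected per step
    x = init_rng.standard_normal(shape)
    for t in reversed(range(num_steps)):
        x = toy_denoiser(x, t)
        if t > 0:  # no noise is added on the final step
            x = x + step_noise_scale * step_rng.standard_normal(shape)
    return x

baseline = sample(init_seed=1, step_seed=2)
new_step_noise = sample(init_seed=1, step_seed=3)   # same start, new step noise
new_init_noise = sample(init_seed=4, step_seed=2)   # new start, same step noise

print("change from new step noise:", np.abs(baseline - new_step_noise).mean())
print("change from new init noise:", np.abs(baseline - new_init_noise).mean())
```

With a real pipeline, the same two-generator split would let you hold the starting latents fixed and resample only the intermediate noise, then compare the resulting images.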
Visual Fingerprints: The Seeds' Distinguishing Marks
Each seed imprints a unique "visual fingerprint" on the generated images. The paper's authors trained a 1,024-way classifier to predict the seed from an image with over 99.9% accuracy, demonstrating the distinct influence of seeds. This classifier revealed that even subtle variations in noise, dictated by the seed, result in distinguishable visual features. These findings suggest that seeds may encode unique visual features, prompting us to explore their impact across several interpretable dimensions.
The authors reduced the dimensionality in order to group the seeds by the features they produce. Some seeds turned out to consistently yield particular traits in the output, for example a frame around the image even when none was prompted, a white sky instead of a blue one, or a grayscale result.
By leveraging these insights, we can steer the generation process toward visually stunning, high-quality images simply by picking the right seeds. Happy seeding!