Balancing Act: Distribution-Guided Debiasing in Diffusion Models

¹Indian Institute of Science, ²Meta Reality Labs

Abstract

Diffusion Models (DMs) have emerged as powerful generative models with unprecedented image generation capabilities. These models are widely used for data augmentation and creative applications. However, DMs reflect the biases present in their training datasets. This is especially concerning in the context of faces, where a DM may prefer one demographic subgroup over others (e.g., female over male).

In this work, we present a method for debiasing DMs without relying on additional data or model retraining. Specifically, we propose Distribution Guidance, which constrains the generated images to follow a prescribed attribute distribution. To realize this, we build on the key insight that the latent features of the denoising UNet hold rich demographic semantics, and these can be leveraged to guide debiased generation. We train an Attribute Distribution Predictor (ADP), a small MLP that maps the latent features to the distribution of attributes. The ADP is trained with pseudo labels generated from existing attribute classifiers.

The proposed Distribution Guidance with the ADP enables fair generation. Our method reduces bias across single and multiple attributes and outperforms the baselines by a significant margin. Further, we present a downstream task of training a fair attribute classifier by rebalancing the training set with our generated data.
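The rebalancing idea behind this downstream task can be sketched as follows: count each attribute subgroup in the real training set and generate enough synthetic images (with the appropriate reference distribution) to bring every subgroup up to the size of the largest one. This is a toy sketch of the idea, not the paper's exact recipe; the function name is our own.

```python
from collections import Counter

def rebalance_plan(labels):
    """Return how many synthetic images to generate per subgroup so
    the training set becomes uniform across attribute values.
    A toy sketch of the rebalancing step, not the paper's exact recipe."""
    counts = Counter(labels)
    target = max(counts.values())
    return {group: target - n for group, n in counts.items()}

# e.g. a gender-imbalanced face dataset
print(rebalance_plan(["female"] * 300 + ["male"] * 700))
# -> {'female': 400, 'male': 0}
```

The fair attribute classifier is then trained on the union of the real data and the generated, balance-filling images.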

Method

1. Training H-space classifier (ADP)

Images of each class are embedded into the H-space to obtain their corresponding H-vectors. A linear classifier is trained on these H-vectors, conditioned on the timestep of each H-vector.
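A minimal sketch of this training step is below, assuming pooled UNet bottleneck features as H-vectors and pseudo labels from an off-the-shelf attribute classifier. The layer sizes and the simple learned timestep embedding are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ADP(nn.Module):
    """Attribute Distribution Predictor: a small timestep-conditioned
    MLP on pooled h-space features. Dimensions are assumptions."""
    def __init__(self, h_dim=512, t_dim=64, n_classes=2):
        super().__init__()
        self.t_embed = nn.Sequential(nn.Linear(1, t_dim), nn.SiLU())
        self.mlp = nn.Sequential(
            nn.Linear(h_dim + t_dim, 256), nn.SiLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, h, t):
        # h: (B, h_dim) pooled H-vectors; t: (B,) diffusion timesteps
        te = self.t_embed(t.float().unsqueeze(-1) / 1000.0)
        return self.mlp(torch.cat([h, te], dim=-1))

# One training step with pseudo labels from an attribute classifier
adp = ADP()
opt = torch.optim.Adam(adp.parameters(), lr=1e-4)
h = torch.randn(8, 512)             # pooled H-vectors from the UNet
t = torch.randint(0, 1000, (8,))    # timesteps at which they were extracted
pseudo = torch.randint(0, 2, (8,))  # pseudo labels (e.g. gender)
loss = nn.functional.cross_entropy(adp(h, t), pseudo)
loss.backward()
opt.step()
```

In practice one such classifier is trained (or shared with timestep conditioning) across the noise levels seen during sampling.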


2. Distribution Guidance in the H-space

The H-vectors of a batch of images are passed through the ADP to obtain a prediction of the batch's attribute distribution. A loss between the predicted distribution and the reference distribution is then used to guide the H-vectors toward the reference distribution.
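One guidance step can be sketched as follows: predict per-image attribute probabilities with the ADP, average them into a batch-level distribution, and move the H-vectors down the gradient of a divergence to the reference distribution. The KL loss and fixed step scale are illustrative assumptions; the stand-in linear predictor in the usage example takes the place of a trained ADP.

```python
import torch
import torch.nn as nn

def distribution_guidance(adp, h, t, ref_dist, scale=1.0):
    """Nudge a batch of H-vectors so the ADP's predicted attribute
    distribution over the batch matches ref_dist. The KL objective
    and the scale are assumptions for illustration."""
    h = h.detach().requires_grad_(True)
    probs = torch.softmax(adp(h, t), dim=-1)   # (B, n_classes)
    batch_dist = probs.mean(dim=0)             # predicted batch distribution
    # KL(predicted batch distribution || reference distribution)
    loss = torch.sum(batch_dist * (batch_dist.clamp_min(1e-8).log()
                                   - ref_dist.clamp_min(1e-8).log()))
    (grad,) = torch.autograd.grad(loss, h)
    return (h - scale * grad).detach()         # guided H-vectors

# Usage with a stand-in predictor (the real ADP is timestep-conditioned)
predictor = nn.Linear(512, 2)
adp = lambda h, t: predictor(h)
h = torch.randn(8, 512)
t = torch.randint(0, 1000, (8,))
ref = torch.tensor([0.5, 0.5])                 # e.g. balanced gender target
guided = distribution_guidance(adp, h, t, ref)
```

Because the guidance acts on the whole batch's distribution rather than per-image labels, it can balance generations without forcing any individual sample toward a particular attribute.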


Stable Diffusion Results

Attribute Debiasing

Gender Balancing Results: We specify different professions as prompts and balance the generations across gender.


Race and Age Balancing Results:


WaterBird Dataset
