Diffusion Models as Tools for Data Mining

Our supplementary material contains:

  1. Full clusters (Fig. 3-6 in the main paper) of:
  2. Ten images per country from the parallel dataset (see Fig. 9 in the main paper), either randomly selected or selected in order of descending typicality.
  3. Mining a dataset of generated images using the finetuned model and DDPM, across multiple countries and multiple guidance values.
  4. Formal connection of our typicality measure to the previous literature.
  5. Sensitivity to range and seed.
  6. Comparison with our implementation of "What Makes Paris Look like Paris?" on G^3.
  7. Comparison with a mining algorithm based solely on CLIP, which is very close to ours but operates in CLIP token space.
To display the images on any linked page, select the configuration you want to visualize using the radio buttons at the top of that page.