Generative models

synthesizing-original.jpg GenerativeModels.png class-synthesis-deepgen.png

In the context of neural networks, generative models are networks that output images. We’ve already seen DeepDream and style transfer, which can also be regarded as generative, but those images are produced by an optimization process in which convolutional neural networks serve only as a sort of analytical tool. In generative models like autoencoders and generative adversarial networks, by contrast, the convnets output the images themselves. This chapter looks at those two specifically.


So far, we’ve mostly interpreted neural networks as being predictive, i.e. given some inputs, what is the output: what something is, where it’s going, etc. But this is just a special case of a much more general capacity they have.


  • neural nets really more interesting in general capacity – learning how to map desired x to y
  • encoder -> decoder
    • why??? world’s most expensive identity function
  • “compression” of latent variable
  • denoising vs variational
  • variational = assume a probabilistic model interpretation (KL divergence)
    • useful because it makes the latent space move around
  • face examples (VRAE)
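The encoder -> decoder idea in the notes above can be sketched in a few lines of NumPy. This is only a toy under stated assumptions: the data is synthetic, both networks are single linear layers, and the names (`w_enc`, `w_dec`, `reconstruct`, `gaussian_kl`) are made up for illustration. The last function is the closed-form KL term that a variational autoencoder with a diagonal Gaussian code and a standard normal prior would add to the reconstruction loss; it is shown on its own and is not used in the training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): 200 points in 8-D that really live on a 2-D subspace.
latent_true = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
x = latent_true @ mixing + 0.01 * rng.normal(size=(200, 8))

# Encoder and decoder are single linear layers; the 2-unit code is the bottleneck.
w_enc = 0.1 * rng.normal(size=(8, 2))
w_dec = 0.1 * rng.normal(size=(2, 8))

def reconstruct(inputs):
    """Encode to the 2-D code, then decode back to 8-D."""
    code = inputs @ w_enc
    return code, code @ w_dec

def mse(a, b):
    """Mean squared error over all elements."""
    return float(np.mean((a - b) ** 2))

def gaussian_kl(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over the code dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

_, x_hat = reconstruct(x)
loss_before = mse(x, x_hat)

# Plain gradient descent on the squared reconstruction error.
lr = 0.01
for step in range(2000):
    code, x_hat = reconstruct(x)
    err = (x_hat - x) / len(x)            # d(loss)/d(x_hat), averaged over samples
    grad_dec = code.T @ err               # gradient for the decoder weights
    grad_enc = x.T @ (err @ w_dec.T)      # gradient for the encoder weights
    w_dec -= lr * grad_dec
    w_enc -= lr * grad_enc

_, x_hat = reconstruct(x)
loss_after = mse(x, x_hat)
print(loss_before, loss_after)
```

The network is trained to copy its input, which is the "expensive identity function" of the notes; what makes it useful is that the copy has to pass through the 2-number code. A denoising variant would corrupt `x` before encoding while still comparing the output against the clean `x`.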

mnist_VAE.png autoencoder_net.png autoencoder.png autoencoders_mnist_reconstruction.png


GenerativeModels.png GANs.png GANpublications.png

interesting property of DCGANs

  • generate tons of labeled images to fill out the image class manifold
  • then a nearest-neighbors classifier on that outperforms a RBF-SVM

hardmaru - GAN + VRAE making

img: generative models: what’s wrong with autoencoders

unreasonable confusion of VAEs

text to image

GANS explained

deep image completion seeing beyond edges of image

transfiguring portraits:

soumith + yann

end to end neural style with gans

eyescream

generating faces with torch



Fast Scene Understanding with Generative Models (nice video)

hardmaru images from latent vectors (GAN + VAE)

autoencoders book chapter

gen models - describe probability distributions and data manifolds: a line manifold in 2d, or a plane in 3d, then go to eigenfaces. the neural net is modeling a probability distribution that is very sparse
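A small sketch of the "plane in 3d" picture, assuming synthetic data and hypothetical variable names: points that are described with three coordinates but only ever vary along two directions. A singular value decomposition of the centered data shows that essentially all of the variance sits in two directions. Eigenfaces apply the same kind of decomposition to face images, where the number of coordinates is the number of pixels.

```python
import numpy as np

rng = np.random.default_rng(0)

# 500 points on a 2-D plane embedded in 3-D: each point mixes two basis vectors.
basis = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
coords = rng.normal(size=(500, 2))       # position of each point within the plane
points = coords @ basis                  # the same points written with 3 coordinates

# Center the data and look at how much variance each principal direction carries.
centered = points - points.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)
variance_share = singular_values ** 2 / np.sum(singular_values ** 2)
print(variance_share)
```

The third entry of `variance_share` is zero up to floating-point error: the data occupies only a thin slice of the space it is written in, which is the sense in which the distribution is sparse.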

we sometimes use the word astronomical to describe very large quantities. but no number associated with astronomy, like the number of atoms in the universe, even begins to approach ___

GAN papers


lsun_bedrooms_five_epoch_samples.png lsun_bedrooms_five_epochs_interps.png lsun_bedrooms_one_epoch_samples.png faces_128_filter_samples.png faces_arithmetic_collage1.png faces_arithmetic_collage2.png dcgan_soumith_bedrooms.png albums_128px.png

Image-to-image translation

Style transfer is a special case of the more general task of image-to-image translation.

Colorization, deblurring/superresolution

superresolution.png example_results.png

Conditional GANs (pix2pix)

pix2pix paper jasper, brannon, mario, invisible cities link to guide
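For reference, the conditional GAN objective as commonly written for pix2pix, with x the input image, y the target image, z a noise vector, G the generator, D the discriminator, and λ a weight on the L1 term. The notation here is an assumption of this sketch and should be checked against the paper:

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\left[\log D(x, y)\right]
  + \mathbb{E}_{x,z}\left[\log\left(1 - D(x, G(x, z))\right)\right]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\left[\lVert y - G(x, z) \rVert_1\right]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The discriminator sees the input image alongside the real or generated output, which is what makes the GAN conditional; the L1 term keeps the generated image close to the target.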


cyclegan.jpg horse2zebra.mp4 putin_zebra.jpeg

GAN zoo


*InfoGAN InfoGAN1.png InfoGAN2.png

*DiscoGAN DiscoGAN-gender1.png DiscoGAN-gender2.png DiscoGAN2.png

*Art GANS (CreativeGAN + ArtGAN) ArtGAN-art.png ArtGAN-caption.png GANGogh.png Creative-GAN.png

*StackGAN + TAC-GAN + PPGN StackGAN-bird_interp.png StackGAN-bird1.jpg StackGAN-bird2.jpg StackGAN-flower1.jpg StackGAN-flower2.jpg ppgn_image_captioning.jpg

GANs in other domains

Wavenets + SampleRNN

skip-thought vectors skip-thought-vectors.png

PointClouds pointcloud-GAN.png

*Sketch-RNN Kanjis sketchrnn.png

Progressive growing gans