Generative models

In the context of neural networks, generative models are networks that output new data, which in this chapter mostly means images. We’ve already seen DeepDream and style transfer, which can also be regarded as generative, but those images are produced by an optimization process in which convolutional neural networks serve merely as a sort of analytical tool. In generative models like autoencoders and generative adversarial networks, the convnets output the images themselves. This chapter looks at those two specifically.


So far, we’ve mostly interpreted neural networks as being predictive, i.e. given some inputs, what is the output (where it’s going, etc.). But this is just a special case of a much more general capacity they have.


  • neural nets are really more interesting in their general capacity: learning how to map a desired x to y
  • encoder -> decoder
    • why??? world’s most expensive identity function
  • “compression” of latent variable
  • denoising vs variational
  • variational = assume a probabilistic model interpretation (KL divergence)
    • useful because it makes the latent space move around
  • face examples (VRAE)
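
To make the "variational" bullet a little more concrete, here is a minimal sketch of the two pieces it usually adds on top of a plain autoencoder: a KL divergence penalty on the latent code and a sampling step. It assumes a diagonal Gaussian for the encoder output and a standard normal prior, which is a common choice but not something these notes specify; the function names and the example numbers are made up for illustration.

```python
import math
import random

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0
        for m, lv in zip(mu, log_var)
    )

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]

# Hypothetical encoder outputs for one input (assumed values, 2 latent dims).
rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, -2.0]

z = sample_latent(mu, log_var, rng)      # the code that would go to the decoder
kl = kl_to_standard_normal(mu, log_var)  # penalty added to the reconstruction loss
```

The KL term is zero only when the encoder outputs exactly the prior, so it pulls codes toward a shared region of latent space while the reconstruction loss pulls them apart; that tension is one way to read "makes the latent space move around".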

interesting property of DCGANs

  • generate tons of labeled images to fill out the image class manifold
  • then a nearest-neighbors classifier on that outperforms an RBF-SVM
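
The classifier half of that idea is simple enough to sketch. The snippet below is a bare 1-nearest-neighbor lookup in plain Python; the 2-D points and class names are hypothetical stand-ins for generated images and their labels, not anything produced by a DCGAN.

```python
import math

def nearest_neighbor_label(query, points, labels):
    """Return the label of the stored point closest to query (Euclidean distance)."""
    best = min(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return labels[best]

# Hypothetical stand-ins for generated samples: 2-D points, not real images.
generated = [(0.0, 0.1), (0.2, -0.1), (1.0, 1.1), (0.9, 0.8)]
classes = ["cat", "cat", "dog", "dog"]

prediction = nearest_neighbor_label((0.1, 0.1), generated, classes)
```

The denser the stored samples cover each class, the more often the nearest stored point shares the query's class, which is the intuition behind "filling out" the manifold.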

hardmaru - GAN + VRAE making

img: generative models: what’s wrong with autoencoders

unreasonable confusion of VAEs

text to image

GANs explained
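
The adversarial game behind GANs can be sketched as two loss functions computed from the discriminator's outputs. This is a rough illustration rather than any particular paper's code: it uses binary cross-entropy for the discriminator and the commonly used non-saturating form for the generator, and the probabilities fed in are hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(x) toward 1 and D(G(z)) toward 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: push D(G(z)) toward 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that each input is real).
d_real = [0.9, 0.8]   # on real images
d_fake = [0.2, 0.1]   # on generated images

d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake)
```

With these numbers the discriminator is winning, so its loss is low and the generator's is high; training alternates gradient steps on the two losses.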

deep image completion seeing beyond edges of image

transfiguring portraits:

soumith + yann

end to end neural style with GANs

eyescream

generating faces with torch



Fast Scene Understanding with Generative Models (nice video)

hardmaru images from latent vectors (GAN + VAE)

autoencoders book chapter

gen models - describe probability distributions and data manifolds: a line manifold in 2d, or a plane in 3d; then go to eigenfaces; neural net modeling a very sparse prob distribution
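
The line-manifold case is small enough to try directly. The sketch below trains a linear autoencoder with a single latent unit on points that lie on the line y = 2x, using plain gradient descent; the data, initial weights, learning rate, and step count are all assumptions picked for illustration.

```python
# Points on a 1-D manifold (the line y = 2x) embedded in 2-D.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

# Linear autoencoder with one latent unit and no biases:
# encoder z = w . x, decoder x_hat = v * z.
w = [0.1, 0.1]
v = [0.1, 0.1]
learning_rate = 0.02

def encode(x):
    return w[0] * x[0] + w[1] * x[1]

def decode(z):
    return (v[0] * z, v[1] * z)

for step in range(2000):
    grad_w = [0.0, 0.0]
    grad_v = [0.0, 0.0]
    for x in data:
        z = encode(x)
        x_hat = decode(z)
        err = (x_hat[0] - x[0], x_hat[1] - x[1])
        # Gradient of ||x_hat - x||^2 with respect to the latent value z.
        back = 2.0 * (err[0] * v[0] + err[1] * v[1])
        for i in range(2):
            grad_v[i] += 2.0 * err[i] * z / len(data)
            grad_w[i] += back * x[i] / len(data)
    for i in range(2):
        w[i] -= learning_rate * grad_w[i]
        v[i] -= learning_rate * grad_v[i]

code = encode((0.3, 0.6))        # a single number describing a 2-D point
reconstruction = decode(code)    # should land back near (0.3, 0.6)
```

One number is enough to reconstruct each point because the data only ever varies along one direction. Eigenfaces apply the same idea to faces with a handful of linear directions, and a nonlinear network can follow manifolds that are curved rather than flat.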

GAN papers


Image-to-image translation

Style transfer is a special case of the more general task of image-to-image translation.

Colorization, deblurring/superresolution

superresolution.png example_results.png

Conditional GANs (pix2pix)

pix2pix paper jasper, brannon, mario, invisible cities link to guide
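
For reference, the objective in the pix2pix paper combines a conditional adversarial loss with an L1 reconstruction term. Here x is the input image, y the target image, z a noise vector, and \lambda a weighting hyperparameter:

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The discriminator sees the input alongside the real or generated output, so it judges whether the pair is plausible rather than whether the output looks realistic on its own.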


cyclegan.jpg horse2zebra.mp4 putin_zebra.jpeg

GAN zoo


*InfoGAN InfoGAN1.png InfoGAN2.png

*DiscoGAN DiscoGAN-gender1.png DiscoGAN-gender2.png DiscoGAN2.png

*Art GANs (CreativeGAN + ArtGAN) ArtGAN-art.png ArtGAN-caption.png GANGogh.png Creative-GAN.png

*StackGAN + TAC-GAN + PPGN StackGAN-bird_interp.png StackGAN-bird1.jpg StackGAN-bird2.jpg StackGAN-flower1.jpg StackGAN-flower2.jpg ppgn_image_captioning.jpg

GANs in other domains

Wavenets + SampleRNN

skip-thought vectors skip-thought-vectors.png

PointClouds pointcloud-GAN.png

*Sketch-RNN Kanjis sketchrnn.png

Progressive growing of GANs