Variational Autoencoder
A generative model that learns compressed representations for creating new data.

Compression as Intelligence
A VAE learns to compress data into a small "latent space" and then reconstruct it. The magic: you can sample new points from latent space to generate entirely new data that looks like the training data but never existed.
VAEs are the backbone of video generation models like OpenAI's Sora, where they compress video frames into a manageable latent space for the diffusion model to work with.
Encoder: compresses the input into the parameters of a latent distribution (a mean and a standard deviation), mapping data to a compact representation.
Latent space: the compressed representation itself. Nearby points decode to similar outputs, so you can interpolate smoothly between them.
Decoder: reconstructs data from a latent vector. Decoding random latent points generates new data.
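The encode, sample, decode pipeline can be sketched with plain numpy. This is a minimal illustration of the data flow only: the weights are random and untrained, and the dimensions (8-dimensional input, 2-dimensional latent space) are arbitrary assumptions, not from any real model. The sampling step is the "reparameterization trick" real VAEs use to keep sampling differentiable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from any specific model).
D_IN, D_LATENT = 8, 2

# Random, untrained weights stand in for a learned encoder/decoder.
W_enc_mu = rng.normal(size=(D_IN, D_LATENT))
W_enc_logvar = rng.normal(size=(D_IN, D_LATENT))
W_dec = rng.normal(size=(D_LATENT, D_IN))

def encode(x):
    """Map input to the parameters of a latent Gaussian."""
    return x @ W_enc_mu, x @ W_enc_logvar

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps (the reparameterization trick)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def decode(z):
    """Map a latent vector back to data space."""
    return z @ W_dec

x = rng.normal(size=(1, D_IN))
mu, log_var = encode(x)
z = reparameterize(mu, log_var)
x_hat = decode(z)
print(z.shape, x_hat.shape)   # (1, 2) (1, 8)
```

In a trained VAE the weights are learned by minimizing reconstruction error plus a term that keeps the latent distribution close to a standard Gaussian, which is what makes random latent points decode to plausible data.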
Explore the Latent Space
Drag the slider to move through latent space. Watch how the encoded representation and decoded output change smoothly:
In a real VAE, this slider would smoothly morph between generated faces, digits, or other data types.
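The slider corresponds to linear interpolation between two latent codes: pick the latent vectors of two examples and blend them, decoding each intermediate point. A minimal sketch (the two latent vectors here are illustrative, not from a trained model):

```python
import numpy as np

def interpolate(z_a, z_b, t):
    """Linearly blend two latent vectors; t=0 gives z_a, t=1 gives z_b."""
    return (1 - t) * z_a + t * z_b

z_a = np.array([-1.0, 0.5])   # latent code of one example (illustrative)
z_b = np.array([1.0, -0.5])   # latent code of another

# A "slider" sweep across latent space; each z would then be decoded.
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(t, interpolate(z_a, z_b, t))
```

Because nearby latent points decode to similar outputs, each decoded frame along this path changes gradually, which is why the morph looks smooth.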
VAE Architecture
VAEs in the Wild
OpenAI Sora: uses a VAE to compress video frames into latent space before applying diffusion for generation.
Image generation (e.g., Stable Diffusion): the VAE encoder compresses images to latent space; the VAE decoder converts the result back to pixel space.
Drug discovery: VAEs generate novel molecular structures by sampling from a learned chemical latent space.
Test Your Understanding
Q1. What are the two main components of a VAE?
Q2. What does the latent space represent?
Q3. How is a VAE used in OpenAI's Sora?