
Create a first VAE (Variational Autoencoder) template #47

@blkdmr

Description


We want to provide a beginner-friendly, end-to-end Variational Autoencoder (VAE) example for fenn users, but without adding new core utilities yet.
Therefore, all VAE code should live inside the template (model, loss, training, sampling, plotting). No changes to the fenn core repository are required for this issue.

Goal

Add a new template to the templates repository that trains a VAE on a canonical dataset (MNIST preferred) and demonstrates:

  • correct VAE implementation (encoder/decoder + reparameterization)
  • ELBO loss (reconstruction + KL divergence)
  • a reproducible training run
  • basic qualitative outputs (reconstructions and samples)

Repository / Location

Deliverables

1) Minimal VAE implementation (PyTorch)

Inside the template, implement the following (a minimal sketch is included after the architecture notes below):

  • class VAE(nn.Module)
    • encode(x) -> (mu, logvar)
    • reparameterize(mu, logvar) -> z
    • decode(z) -> x_hat
    • forward(x) -> (x_hat, mu, logvar, z)

Architecture (keep it simple):

  • MNIST: either an MLP or a small conv net; prefer the MLP for readability.
  • Document expected input shape, e.g. (B, 1, 28, 28).
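
A minimal MLP sketch of what this class could look like (layer widths, the latent dimension of 20, and the Sigmoid output are illustrative choices, not requirements):

```python
import torch
import torch.nn as nn


class VAE(nn.Module):
    """Minimal MLP VAE; expects MNIST batches of shape (B, 1, 28, 28)."""

    def __init__(self, input_dim: int = 784, hidden_dim: int = 400, latent_dim: int = 20):
        super().__init__()
        # Encoder body with separate heads for the posterior mean and log-variance
        self.enc = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)
        # Decoder maps z back to pixel space; Sigmoid keeps outputs in [0, 1] for BCE
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.enc(x.flatten(start_dim=1))
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I), so sampling stays differentiable
        std = torch.exp(0.5 * logvar)
        return mu + torch.randn_like(std) * std

    def decode(self, z):
        return self.dec(z)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar, z
```

Keeping `mu` and `logvar` as separate heads on a shared encoder body keeps the reparameterization step easy to read for beginners.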

2) Loss (ELBO)

Implement the ELBO as explicit, readable functions (a sketch follows the logging list below):

  • reconstruction loss: BCE (typical for MNIST) or MSE (configurable)
  • KL divergence for diagonal Gaussian
  • total: loss = recon + beta * kl where beta defaults to 1.0

The template should log:

  • total loss
  • recon term
  • KL term
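
One way these functions could look, assuming the decoder outputs flattened values in [0, 1] and using sum reduction (function names and the reduction choice are illustrative):

```python
import torch
import torch.nn.functional as F


def reconstruction_loss(x_hat, x, kind: str = "bce"):
    # x_hat is the flattened decoder output in [0, 1]; sum-reduce so the scale
    # matches the KL term (mean over the batch also works -- just be consistent)
    x = x.flatten(start_dim=1)
    if kind == "bce":
        return F.binary_cross_entropy(x_hat, x, reduction="sum")
    return F.mse_loss(x_hat, x, reduction="sum")


def kl_divergence(mu, logvar):
    # KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian, summed over batch and latent dims
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())


def vae_loss(x_hat, x, mu, logvar, beta: float = 1.0):
    recon = reconstruction_loss(x_hat, x)
    kl = kl_divergence(mu, logvar)
    return recon + beta * kl, recon, kl
```

Returning the individual terms alongside the total makes logging all three values straightforward.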

3) Training + evaluation script

Provide a runnable entrypoint (e.g., train.py) that does the following (a sketch follows the list):

  • downloads/loads MNIST (torchvision is fine)
  • trains the VAE with Adam
  • evaluates on validation/test split
  • saves model checkpoints (optional but recommended)
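
A rough shape for train.py, assuming the VAE class and vae_loss function sketched above live in hypothetical model.py and losses.py files inside the template (seed, epochs, lr, and batch size are placeholders):

```python
# train.py -- rough sketch; module names and hyperparameters are illustrative
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

from model import VAE        # hypothetical module holding the VAE sketch above
from losses import vae_loss  # hypothetical module holding the ELBO sketch above


def main():
    torch.manual_seed(0)  # fixed seed for a reproducible run
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tfm = transforms.ToTensor()
    train_dl = DataLoader(datasets.MNIST("data", train=True, download=True, transform=tfm),
                          batch_size=128, shuffle=True)
    test_dl = DataLoader(datasets.MNIST("data", train=False, download=True, transform=tfm),
                         batch_size=128)

    model = VAE().to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for epoch in range(10):
        model.train()
        for x, _ in train_dl:
            x = x.to(device)
            x_hat, mu, logvar, _ = model(x)
            loss, recon, kl = vae_loss(x_hat, x, mu, logvar)
            opt.zero_grad()
            loss.backward()
            opt.step()

        # Simple held-out evaluation of the same objective
        model.eval()
        test_loss = 0.0
        with torch.no_grad():
            for x, _ in test_dl:
                x = x.to(device)
                x_hat, mu, logvar, _ = model(x)
                test_loss += vae_loss(x_hat, x, mu, logvar)[0].item()
        print(f"epoch {epoch}: test loss per sample = {test_loss / len(test_dl.dataset):.3f}")

        torch.save(model.state_dict(), "vae.pt")  # optional checkpoint


if __name__ == "__main__":
    main()
```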

4) Qualitative outputs

At minimum, produce one of the following:

  • a grid image of reconstructions (input vs output)
  • a grid image of samples from the prior (z ~ N(0, I))

Saving artifacts under something like outputs/ is recommended.
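
Both grids can be produced with torchvision.utils.save_image; a sketch, assuming the MLP VAE above with latent dimension 20 (output paths and grid sizes are arbitrary):

```python
import torch
from pathlib import Path
from torchvision.utils import save_image


@torch.no_grad()
def save_qualitative_outputs(model, test_dl, device, out_dir="outputs"):
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    model.eval()

    # Reconstructions: first test images on the top row, their reconstructions below
    x, _ = next(iter(test_dl))
    x = x[:8].to(device)
    x_hat, _, _, _ = model(x)
    save_image(torch.cat([x, x_hat.view_as(x)]), f"{out_dir}/reconstructions.png", nrow=8)

    # Samples from the prior z ~ N(0, I)
    z = torch.randn(64, 20, device=device)  # 20 = latent dim assumed in the model sketch
    save_image(model.decode(z).view(-1, 1, 28, 28), f"{out_dir}/samples.png", nrow=8)
```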

5) README (must-have)

The README should include:

  • how to run the template
  • minimal explanation of VAE objective (ELBO = recon + KL)
  • expected shapes and key hyperparameters:
    • latent dim
    • beta
    • lr, batch size, epochs
  • what artifacts to expect (where recon/sample grids are saved)

Acceptance criteria

  • Template runs end-to-end on a clean environment.
  • Training loss decreases over epochs.
  • Reconstructions become visually meaningful after a short run.
  • At least one qualitative artifact (recon or samples) is produced.
  • No modifications to fenn core repo (this issue is template-only).

Out of scope (for this issue)

  • β-VAE schedules, KL annealing (optional follow-up)
  • CIFAR-10 / complex conv architectures
  • IWAE / flows / VQ-VAE
  • extensive experiment tracking/benchmarking

References (external)

How to contribute

Comment here to claim the issue, then open a PR against the templates repository with the new template folder.
Please use the existing templates (excluding dev-only ones) as a reference.
