Dataset

Kiwhan Song edited this page Feb 11, 2025 · 1 revision

List of Datasets

Name             External Conditioning    Latent Diffusion
RealEstate10k    ✔︎                        X
Kinetics-600     X                        ✔︎
Minecraft        ✔︎                        ✔︎
Robot Swapping   ✔︎                        X

Video Dataset Implementation

SimpleVideoDataset vs. AdvancedVideoDataset

Video datasets fall into two categories: SimpleVideoDataset and AdvancedVideoDataset.

  • SimpleVideoDataset loads entire videos and is used for preprocessing them into latents.
  • AdvancedVideoDataset loads fixed-length clips (with frame_skip) from videos and is used for training VAEs and Diffusion models.
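To make the clip-based loading concrete, here is a minimal sketch of how a fixed-length clip could be indexed inside a longer video. It assumes that frame_skip is the stride between consecutive sampled frames; the repository may define it differently. The function names clip_frame_indices and count_clips are hypothetical and not part of the codebase.

```python
from typing import List


def clip_frame_indices(start: int, n_frames: int, frame_skip: int) -> List[int]:
    """Frame indices of one clip: n_frames frames, taking every frame_skip-th frame.

    Assumption: frame_skip is the stride between sampled frames.
    """
    return [start + i * frame_skip for i in range(n_frames)]


def count_clips(video_length: int, n_frames: int, frame_skip: int) -> int:
    """Number of valid clip start positions in a video of video_length frames."""
    span = (n_frames - 1) * frame_skip + 1  # raw frames covered by one clip
    return max(video_length - span + 1, 0)


if __name__ == "__main__":
    print(clip_frame_indices(start=3, n_frames=4, frame_skip=2))
    print(count_clips(video_length=10, n_frames=4, frame_skip=2))
```

Under this assumption, a clip of 4 frames with frame_skip 2 spans 7 raw frames, so videos shorter than 7 frames contribute no clips.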

Using Latents

Depending on dataset.latent.type, latents are either precomputed for the entire dataset or computed on the fly during diffusion training. In the former case, the latents are stored in a separate folder named {dataset_dir}_latent_{latent_resolution}_{latent_suffix} (e.g., minecraft_latent_32_1cd9pgpb).
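The naming pattern for the precomputed latent folder can be sketched as a small helper. The function latent_dir is hypothetical and only mirrors the pattern stated above; it assumes the latent folder is a sibling of the dataset directory, which the pattern implies but the codebase may handle differently.

```python
from pathlib import Path
from typing import Union


def latent_dir(
    dataset_dir: Union[str, Path],
    latent_resolution: int,
    latent_suffix: str,
) -> Path:
    """Build {dataset_dir}_latent_{latent_resolution}_{latent_suffix}.

    Assumption: the latent folder sits next to the dataset directory.
    """
    dataset_dir = Path(dataset_dir)
    new_name = f"{dataset_dir.name}_latent_{latent_resolution}_{latent_suffix}"
    return dataset_dir.with_name(new_name)


if __name__ == "__main__":
    # Reproduces the example name from the text for a dataset at data/minecraft.
    print(latent_dir("data/minecraft", 32, "1cd9pgpb"))
```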

Adding a New Video Dataset

To add a new video dataset, structure the dataset directory as follows:

data/
├── {dataset_name}/
│   ├── training/
│   │   ├── video_xxx.mp4
│   │   ├── ...
│   ├── validation/
│   │   ├── video_xxx.mp4
│   │   ├── ...
│   ├── test/
│   │   ├── video_xxx.mp4
│   │   ├── ...
│   ├── metadata/
│   │   ├── training.pt
│   │   ├── validation.pt
│   │   ├── test.pt

Next, implement the dataset class in datasets/video/{dataset_name}.py, inheriting from BaseVideoDataset, BaseSimpleVideoDataset, and BaseAdvancedVideoDataset, and override the necessary methods. Additionally, define the dataset configuration in configurations/dataset/{dataset_name}.yaml. Not all three splits (training, validation, test) are required, and each split folder can be structured arbitrarily as long as the metadata files correctly reference the video files.
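Before writing the dataset class, it can help to sanity-check the directory layout. The sketch below is a hypothetical helper, not part of the codebase. It assumes videos use the .mp4 extension shown in the tree above and only checks that each split present has a matching metadata file. It does not open the .pt files, since their contents are not specified here.

```python
from pathlib import Path
from typing import Dict, List, Union

SPLITS = ("training", "validation", "test")


def check_layout(dataset_root: Union[str, Path]) -> Dict[str, List[str]]:
    """Return, for each split that is present, a list of problems (empty if none).

    A split with neither a folder nor a metadata file is skipped, because
    having all three splits is not required.
    """
    root = Path(dataset_root)
    report: Dict[str, List[str]] = {}
    for split in SPLITS:
        split_dir = root / split
        metadata_file = root / "metadata" / f"{split}.pt"
        if not split_dir.is_dir() and not metadata_file.exists():
            continue  # split is absent entirely, which is allowed
        problems: List[str] = []
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split_dir}")
        elif not any(split_dir.rglob("*.mp4")):
            # rglob searches recursively, since split folders may be nested arbitrarily
            problems.append(f"no .mp4 files under: {split_dir}")
        if not metadata_file.is_file():
            problems.append(f"missing metadata: {metadata_file}")
        report[split] = problems
    return report


if __name__ == "__main__":
    for split, problems in check_layout("data/my_dataset").items():
        print(split, "OK" if not problems else problems)
```

Here data/my_dataset is a placeholder path; substitute the actual {dataset_name} directory.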