Dataset
| Name | External Conditioning | Latent Diffusion |
|---|---|---|
| RealEstate10k | ✔︎ | X |
| Kinetics-600 | X | ✔︎ |
| Minecraft | ✔︎ | ✔︎ |
| Robot Swapping | ✔︎ | X |
Video datasets fall into two categories: SimpleVideoDataset and AdvancedVideoDataset.
- SimpleVideoDataset loads entire videos and is used for preprocessing them into latents.
- AdvancedVideoDataset loads fixed-length clips (with frame_skip) from videos and is used for training VAEs and diffusion models.
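To make the idea of fixed-length clips with frame_skip concrete, here is a minimal sketch. It assumes that frame_skip is the stride between consecutive sampled frames (so frame_skip = 1 means adjacent frames); the function names and the n_frames parameter are hypothetical and not taken from the repository.

```python
def clip_frame_indices(start: int, n_frames: int, frame_skip: int) -> list[int]:
    """Frame indices of one clip: n_frames frames, taking every frame_skip-th frame.

    Assumption: frame_skip is the stride between sampled frames.
    """
    return [start + i * frame_skip for i in range(n_frames)]


def valid_clip_starts(video_length: int, n_frames: int, frame_skip: int) -> range:
    """All start positions for which the whole clip fits inside the video."""
    span = (n_frames - 1) * frame_skip + 1  # number of raw frames one clip covers
    return range(max(video_length - span + 1, 0))


if __name__ == "__main__":
    print(clip_frame_indices(start=3, n_frames=4, frame_skip=2))
    print(list(valid_clip_starts(video_length=10, n_frames=4, frame_skip=2)))
```

Under this assumption, a video shorter than the clip span yields no valid start positions, so such videos would simply contribute no clips.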
Depending on dataset.latent.type, we either precompute latents for the entire dataset or compute them on-the-fly during diffusion training. If the former, the latents are stored in a separate folder named: {dataset_dir}_latent_{latent_resolution}_{latent_suffix} (e.g., minecraft_latent_32_1cd9pgpb).
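The naming pattern for the precomputed latent folder can be sketched as follows. The helper name latent_dir is hypothetical, and the sketch assumes the latent folder sits next to the dataset directory, which is what appending a suffix to {dataset_dir} implies.

```python
from pathlib import Path


def latent_dir(dataset_dir: str, latent_resolution: int, latent_suffix: str) -> Path:
    """Build {dataset_dir}_latent_{latent_resolution}_{latent_suffix}.

    Hypothetical helper; the repository may construct this path differently.
    """
    dataset_path = Path(dataset_dir)
    folder_name = f"{dataset_path.name}_latent_{latent_resolution}_{latent_suffix}"
    return dataset_path.with_name(folder_name)  # sibling of the dataset directory


if __name__ == "__main__":
    print(latent_dir("data/minecraft", 32, "1cd9pgpb").as_posix())
```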
To add a new video dataset, structure the dataset directory as follows:
data/
├── {dataset_name}/
│ ├── training/
│ │ ├── video_xxx.mp4
│ │ ├── ...
│ ├── validation/
│ │ ├── video_xxx.mp4
│ │ ├── ...
│ ├── test/
│ │ ├── video_xxx.mp4
│ │ ├── ...
│ ├── metadata/
│ │ ├── training.pt
│ │ ├── validation.pt
│ │ ├── test.pt
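Before writing the dataset class, it can help to check that a new dataset directory matches the layout above. The following is a small sketch, not part of the repository: the function name summarize_dataset is hypothetical, it assumes videos are .mp4 files, and it only checks that each metadata file exists without reading it, since the metadata format is not specified here.

```python
from pathlib import Path

SPLITS = ("training", "validation", "test")


def summarize_dataset(root: str) -> dict[str, dict]:
    """Collect, per split, the videos found and whether metadata/{split}.pt exists."""
    root_path = Path(root)
    summary = {}
    for split in SPLITS:
        split_dir = root_path / split
        if not split_dir.is_dir():
            continue  # a missing split is skipped rather than treated as an error
        # Search recursively, so nested folders inside a split are also covered.
        videos = sorted(
            p.relative_to(root_path).as_posix() for p in split_dir.rglob("*.mp4")
        )
        summary[split] = {
            "n_videos": len(videos),
            "has_metadata": (root_path / "metadata" / f"{split}.pt").is_file(),
            "videos": videos,
        }
    return summary


if __name__ == "__main__":
    # "data/my_dataset" is a placeholder path for illustration.
    for split, info in summarize_dataset("data/my_dataset").items():
        print(split, info["n_videos"], "videos, metadata present:", info["has_metadata"])
```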
Next, implement the dataset class in datasets/video/{dataset_name}.py, inheriting from BaseVideoDataset, BaseSimpleVideoDataset, and BaseAdvancedVideoDataset, and override the necessary methods. Additionally, define the dataset configuration in configurations/dataset/{dataset_name}.yaml. Not all three splits (training, validation, test) are required, and each split folder can be structured arbitrarily as long as the metadata files correctly reference the video files.