Custom implementation of "Training Agents Inside of Scalable World Models" originally published by Google DeepMind. Trained on 8x 16GB V100 GPUs using data from zhwang4ai/OpenAI-Minecraft-Contractor.
- The tokenizer is trained with MSE loss only (no dynamic LPIPS integration as of yet)
- Tokenizer, world model, and imagination training are all overfit on a single video to verify that the pipeline works with limited compute resources.
- Due to limited computing resources, tokenizer and world model outputs are grainier than ideal reconstructions.
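The MSE-only tokenizer objective mentioned above can be sketched as a plain pixel-wise mean squared error between reconstructed and ground-truth frames, with no LPIPS perceptual term. This is an illustrative sketch, not the repo's actual training code; the function name and tensor shapes are assumptions.

```python
import numpy as np

def mse_loss(recon, target):
    """Pixel-wise MSE between reconstructed and ground-truth frames.

    Illustrative stand-in for the tokenizer's reconstruction objective:
    mean squared error only, with no LPIPS perceptual loss added.
    """
    return float(np.mean((recon - target) ** 2))

# Toy example: a batch of 2 RGB frames at 64x64 (hypothetical shapes).
target = np.zeros((2, 64, 64, 3))
recon = np.full_like(target, 0.1)  # constant 0.1 error per pixel
loss = mse_loss(recon, target)     # ≈ 0.01
```

In practice a perceptual term such as LPIPS would be added to this pixel loss to reduce the graininess noted above.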
Example #1: Reconstruction from Tokenizer
reconstructed_output.mp4
Example #2: Reconstruction from Tokenizer
v2_reconstructed_output.mp4
Single-step inference (shows that the overfitting worked)
one_step_inference.mp4
Multi-step inference (simulates actual world model performance)
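The distinction between the two inference modes above can be sketched as follows: single-step inference conditions each prediction on the ground-truth previous frame (teacher forcing), while multi-step inference feeds the model's own predictions back in, so errors compound over the rollout. The toy model and names below are illustrative assumptions, not the repo's API.

```python
import numpy as np

def rollout(model, frames, teacher_forced):
    """Run a one-step predictor over a frame sequence.

    teacher_forced=True  -> single-step: condition on ground truth each step.
    teacher_forced=False -> multi-step: feed predictions back autoregressively.
    `model` is a stand-in for the world model's one-step prediction.
    """
    preds = []
    prev = frames[0]
    for t in range(1, len(frames)):
        pred = model(prev)
        preds.append(pred)
        prev = frames[t] if teacher_forced else pred
    return np.array(preds)

# Toy "model" with a small systematic error to show drift.
model = lambda x: x * 1.1
frames = np.ones(6)  # ground-truth sequence of scalar stand-in "frames"

single = rollout(model, frames, teacher_forced=True)   # constant ≈1.1 error
multi = rollout(model, frames, teacher_forced=False)   # error compounds each step
```

This is why the multi-step rollout better simulates actual world model performance: it exposes the error accumulation that teacher-forced evaluation hides.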