Feature: Render a Preview Video of the Trained Gaussian Splat
Context
splatter-server is a GPU-accelerated compute node running on the Real World Web. It has the /splatter/colmap/v1 capability and receives training jobs from DMS (Domain Management Service). When a job arrives, the Rust server downloads input assets from DDS (Domain Data Service), runs a Python-based training pipeline, and uploads the resulting .splat file back to the domain.
Current Pipeline
The training pipeline (run.py) executes the following steps in order:
| Step | Tool / Script | Description |
|---|---|---|
| 1 | `extract_mp4.py` | Extracts individual JPEG frames from input `Frames.mp4` video recordings |
| 2 | `ns-process-data` | Converts COLMAP binary data (`cameras.bin`, `images.bin`, `points3D.bin`) into nerfstudio format |
| 3 | `ns-train splatfacto` | Trains a 3D Gaussian Splatting model using nerfstudio's `splatfacto` method |
| 4 | `ns-export gaussian-splat` | Exports the trained model as a `.ply` point cloud |
| 5 | `rotate_ply.py` | Applies a coordinate-system transformation (rotation) to the exported PLY |
| 6 | `convert_ply2splat.py` | Converts the rotated `.ply` to the final `.splat` binary format |
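The step sequence above can be sketched as a simple fail-fast chain. This is an illustrative sketch only: the exact CLI flags and paths passed by the real `run.py` are omitted, and the command lists here are placeholders.

```python
import subprocess

# Ordered pipeline steps from the table above; flags/paths are placeholders --
# the real run.py passes job-specific arguments to each tool.
PIPELINE_STEPS = [
    ["python", "extract_mp4.py"],        # 1. MP4 -> JPEG frames
    ["ns-process-data"],                 # 2. COLMAP binaries -> nerfstudio format
    ["ns-train", "splatfacto"],          # 3. train the Gaussian Splatting model
    ["ns-export", "gaussian-splat"],     # 4. export trained model as .ply
    ["python", "rotate_ply.py"],         # 5. coordinate-system rotation
    ["python", "convert_ply2splat.py"],  # 6. rotated .ply -> final .splat
]

def run_pipeline(steps=PIPELINE_STEPS):
    """Run each step in order; check=True aborts the job on the first failure."""
    for cmd in steps:
        subprocess.run(cmd, check=True)
```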
Inputs
The job receives a CID pointing to a refined manifest JSON containing dataIDs that reference:
- Video recordings — `dmt_recording_*.mp4` files (one or more MP4 videos of the scene)
- COLMAP binaries — pre-computed SfM data (`cameras.bin`, `images.bin`, `points3D.bin`, `frames.bin`, `rigs.bin`)
```
{job_root_path}/
├── datasets/
│   └── {dataset}/
│       └── Frames.mp4
└── refined/
    └── global/
        └── refined_sfm_combined/
            ├── cameras.bin
            ├── images.bin
            └── points3D.bin
```
Current Outputs
```
{job_root_path}/
└── refined/
    ├── nerfstudio-data/          # converted nerfstudio dataset
    └── splatter/
        ├── splat.ply             # raw exported Gaussian Splat
        ├── splat_rot.ply         # coordinate-transformed splat
        ├── splat_rot.splat       # final .splat file (uploaded to domain as "splat_data")
        └── splatfacto/
            └── config.yml        # nerfstudio model config + checkpoint
```
Only `splat_rot.splat` is uploaded back to the domain as a `splat_data` artifact. There is currently no visual preview — neither image nor video — of the trained result.
Feature Request
Goal
After training completes, render a short preview video of the trained Gaussian Splat and include it in the pipeline output. The video should be uploaded to the domain alongside the .splat file so that users and downstream services can visually assess training quality and spatial coverage without loading the full splat in an interactive viewer.
Proposed Approach
Insert a new step after training and export (after Step 4 or after Step 6) that renders a preview video:
- **Use nerfstudio's built-in rendering** — `ns-render` supports rendering along camera paths from a trained model checkpoint. Since the `splatfacto` config and checkpoint are already on disk at `{job_root_path}/refined/splatter/splatfacto/config.yml`, the renderer can be invoked directly.
- **Generate an orbital camera path around the splat** — The preview video should give a complete overview of the trained scene:
  - Compute the bounding box (or centroid + extent) of the splat's point cloud from the exported `splat.ply`.
  - Generate a 360° orbit (or 270° arc) camera path that circles the centroid at an elevated angle (~30–45° above the horizontal plane), with a radius large enough to frame the full scene.
  - Keep the orbit smooth, with the camera always looking at the centroid.
  - Target duration: 5–10 seconds at 30 fps (150–300 frames). Keep it short to limit GPU render time and file size.
  - Use nerfstudio's camera-path JSON format, or programmatically construct the path as a sequence of `Cameras` poses.
- **Encode the video** — `ns-render` outputs individual frames to a directory. Encode them into an MP4 using `ffmpeg` (already available in the nerfstudio Docker image):

  ```
  ffmpeg -y -framerate 30 -i {frames_dir}/%05d.png \
    -c:v libx264 -pix_fmt yuv420p -crf 23 \
    -movflags +faststart \
    {job_root_path}/refined/splatter/preview.mp4
  ```

  Use `-crf 23` for a good quality/size balance and `-movflags +faststart` for web-friendly streaming.
- **Save the preview video** — Write the encoded video to the output directory: `{job_root_path}/refined/splatter/preview.mp4`
- **Upload the preview** — In the Rust runner (`runner/src/lib.rs`), after uploading `splat_rot.splat`, upload the preview video as an additional domain artifact (e.g., with `data_type: "splat_preview_video"` and name `refined_splat_preview_video{suffix}`).
- **Clean up intermediate frames** — Delete the temporary rendered frame images after encoding to avoid bloating the task workspace.
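The orbital camera-path step could be sketched as follows. This is a sketch under assumptions: the JSON layout mimics nerfstudio's camera-path format (a `camera_path` list of flattened 4×4 `camera_to_world` matrices plus `fps`/`seconds` fields — verify against the installed nerfstudio version), and the function names are hypothetical.

```python
import math

import numpy as np

def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    """Build a 4x4 camera-to-world matrix with the camera at `eye` looking at `target`."""
    eye, target, up = np.asarray(eye, float), np.asarray(target, float), np.asarray(up, float)
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    c2w = np.eye(4)
    c2w[:3, 0] = right
    c2w[:3, 1] = true_up
    c2w[:3, 2] = -forward  # OpenGL-style convention: camera looks down -Z
    c2w[:3, 3] = eye
    return c2w

def orbit_camera_path(center, radius, n_frames=150, elevation_deg=35.0, fps=30, arc_deg=360.0):
    """Generate a smooth orbit at the given elevation, always looking at `center`."""
    elev = math.radians(elevation_deg)
    frames = []
    for i in range(n_frames):
        theta = math.radians(arc_deg) * i / n_frames
        eye = np.asarray(center, float) + radius * np.array([
            math.cos(theta) * math.cos(elev),
            math.sin(theta) * math.cos(elev),
            math.sin(elev),
        ])
        frames.append({"camera_to_world": look_at(eye, center).flatten().tolist(),
                       "fov": 60.0, "aspect": 16 / 9})
    return {"camera_type": "perspective", "render_height": 1080, "render_width": 1920,
            "fps": fps, "seconds": n_frames / fps, "camera_path": frames}

# The centroid/radius could come from the exported splat.ply, e.g. via the
# `plyfile` package (an assumption -- any PLY reader works):
#   verts = PlyData.read("splat.ply")["vertex"]
#   xyz = np.stack([verts["x"], verts["y"], verts["z"]], axis=1)
#   center, radius = xyz.mean(0), 1.5 * np.abs(xyz - xyz.mean(0)).max()
```

The resulting dict would be written to a JSON file and handed to `ns-render` together with the `splatfacto` `config.yml`; the exact `ns-render` subcommand and flag names vary between nerfstudio versions, so check the CLI of the pinned Docker image.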
Expected Changes
- `run.py` — Add a new pipeline step that:
  - Computes the splat bounding box / centroid from `splat.ply`.
  - Generates an orbital camera-path JSON.
  - Invokes `ns-render` to render frames along that path.
  - Encodes frames into `preview.mp4` via `ffmpeg`.
  - Cleans up temporary frame images.
- `runner/src/lib.rs` — After the existing `put_domain_artifact` call for the `.splat` file, add another upload for the preview video. Include the preview video path in the final `progress` JSON payload.
- `README.md` — Update the "Output Files" section to document the new `preview.mp4` artifact.
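Since the acceptance criteria call for preview rendering to be best-effort, the new `run.py` step could be wrapped so any failure logs a warning instead of failing the job. A minimal sketch; `render_preview` stands in for the (hypothetical) preview-step function:

```python
import logging

logger = logging.getLogger("splatter.preview")

def try_render_preview(render_preview, *args, **kwargs):
    """Run the preview step best-effort: a failure logs a warning, never fails the job."""
    try:
        return render_preview(*args, **kwargs)
    except Exception:
        logger.warning("Preview video rendering failed; continuing without preview",
                       exc_info=True)
        return None
```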
Updated Output Structure
```
{job_root_path}/
└── refined/
    └── splatter/
        ├── splat.ply
        ├── splat_rot.ply
        ├── splat_rot.splat       # uploaded as "splat_data"
        ├── preview.mp4           # NEW — uploaded as "splat_preview_video"
        └── splatfacto/
            └── config.yml
```
Performance Considerations
- Rendering 150–300 frames of a Gaussian Splat is fast on GPU (typically a few minutes), but should be profiled to ensure it doesn't significantly extend total job time.
- The resulting MP4 at 1080p with CRF 23 should be in the range of 2–10 MB — acceptable for upload.
- If render time is a concern, reduce resolution to 720p or lower the frame count (e.g., 90 frames = 3 seconds).
Acceptance Criteria
- After a successful training run, a preview video (MP4) is rendered from the trained Gaussian Splat model.
- The camera follows a smooth orbital path around the splat at an elevated angle (~30–45°), looking at the scene centroid, covering at least 270° of rotation.
- The video is 5–10 seconds long at 30 fps, encoded as H.264 MP4 with web-friendly flags.
- The preview video is saved to
{job_root_path}/refined/splatter/preview.mp4. - The preview video is uploaded to the domain as an artifact with a distinguishable name and
data_type(e.g.,splat_preview_video). - The final
progressJSON emitted by the runner includes the preview video artifact path/name. - Temporary rendered frames are cleaned up after encoding.
- The pipeline does not fail the entire job if preview video rendering or encoding fails — this should be best-effort with a warning log.
- README and output documentation are updated to reflect the new artifact.