
[Bounty | 50 USD] Generate preview images as additional output artifacts for the splatter node #5

@arshak93

Description

Feature: Render a Preview Image of the Trained Gaussian Splat

Context

splatter-server is a GPU-accelerated compute node running on the Real World Web. It has the /splatter/colmap/v1 capability and receives training jobs from DMS (Domain Management Service). When a job arrives, the Rust server downloads input assets from DDS (Domain Data Service), runs a Python-based training pipeline, and uploads the resulting .splat file back to the domain.

Current Pipeline

The training pipeline (run.py) executes the following steps in order:

| Step | Tool / Script | Description |
|------|---------------|-------------|
| 1 | extract_mp4.py | Extracts individual JPEG frames from the input Frames.mp4 video recordings |
| 2 | ns-process-data | Converts COLMAP binary data (cameras.bin, images.bin, points3D.bin) into nerfstudio format |
| 3 | ns-train splatfacto | Trains a 3D Gaussian Splatting model using nerfstudio's splatfacto method |
| 4 | ns-export gaussian-splat | Exports the trained model as a .ply point cloud |
| 5 | rotate_ply.py | Applies a coordinate-system transformation (rotation) to the exported PLY |
| 6 | convert_ply2splat.py | Converts the rotated .ply to the final .splat binary format |
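In run.py terms, the six rows above amount to a simple subprocess chain. The sketch below is illustrative only: the command names come from the table, but every real invocation carries input/output paths and training flags that are omitted here.

```python
import subprocess

# Command skeletons mirroring the six pipeline steps from the table.
# Real invocations carry additional flags (paths, training options).
PIPELINE_STEPS = [
    ["python", "extract_mp4.py"],          # 1. frames from Frames.mp4
    ["ns-process-data"],                   # 2. COLMAP -> nerfstudio format
    ["ns-train", "splatfacto"],            # 3. train the Gaussian Splat
    ["ns-export", "gaussian-splat"],       # 4. export splat.ply
    ["python", "rotate_ply.py"],           # 5. coordinate-system rotation
    ["python", "convert_ply2splat.py"],    # 6. .ply -> .splat
]

def run_pipeline() -> None:
    """Run each step in order, aborting the job on the first failure."""
    for cmd in PIPELINE_STEPS:
        subprocess.run(cmd, check=True)
```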

Inputs

The job receives a CID pointing to a refined manifest JSON containing dataIDs that reference:

  • Video recordings — dmt_recording_*.mp4 files (one or more MP4 videos of the scene)
  • COLMAP binaries — pre-computed SfM data (cameras.bin, images.bin, points3D.bin, frames.bin, rigs.bin)

On disk, the inputs are laid out as:
{job_root_path}/
├── datasets/
│   └── {dataset}/
│       └── Frames.mp4
└── refined/
    └── global/
        └── refined_sfm_combined/
            ├── cameras.bin
            ├── images.bin
            └── points3D.bin
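A pre-flight check against this layout could catch missing COLMAP binaries before training starts. The helper name and return convention below are illustrative, not part of the existing pipeline:

```python
from pathlib import Path

# The three binaries ns-process-data consumes, per the layout above.
REQUIRED_SFM = ("cameras.bin", "images.bin", "points3D.bin")

def check_inputs(job_root: str) -> list:
    """Return the list of missing SfM files (empty when the job can proceed)."""
    sfm_dir = Path(job_root) / "refined" / "global" / "refined_sfm_combined"
    return [name for name in REQUIRED_SFM if not (sfm_dir / name).exists()]
```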

Current Outputs

{job_root_path}/
└── refined/
    ├── nerfstudio-data/          # converted nerfstudio dataset
    └── splatter/
        ├── splat.ply             # raw exported Gaussian Splat
        ├── splat_rot.ply         # coordinate-transformed splat
        ├── splat_rot.splat       # final .splat file (uploaded to domain as "splat_data")
        └── splatfacto/
            └── config.yml        # nerfstudio model config + checkpoint

Only splat_rot.splat is uploaded back to the domain as a splat_data artifact. There is currently no visual preview of the trained result.


Feature Request

Goal

After training completes, render two preview images of the trained Gaussian Splat and include them in the pipeline output. Both preview images should be uploaded to the domain alongside the .splat file so that users and downstream services can quickly assess training quality without loading the full splat.

The two views are:

| Preview | Camera Placement | Purpose |
|---------|------------------|---------|
| Top-down | Directly above the centroid, looking straight down | Shows the spatial footprint / floor-plan layout of the scene |
| Angled (¾-view) | Elevated corner (~45° above horizontal), looking at the centroid | Shows depth, vertical structure, and overall 3D coverage |
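Both placements can be derived from the splat's bounding box alone. A minimal sketch, assuming a Y-up convention (swap axes for Z-up data) and an illustrative 1.2× framing margin:

```python
import math

def preview_cameras(bbox_min, bbox_max, margin=1.2):
    """Place the two preview cameras from an axis-aligned bounding box.
    Assumes Y-up; returns (top_down_eye, angled_eye, target) as xyz tuples."""
    target = tuple((lo + hi) / 2.0 for lo, hi in zip(bbox_min, bbox_max))
    # Camera distance: largest extent plus margin, so the scene fits in frame.
    radius = margin * max(hi - lo for lo, hi in zip(bbox_min, bbox_max))

    # Top-down: directly above the centroid, looking straight down (-Y).
    top_down_eye = (target[0], target[1] + radius, target[2])

    # Angled 3/4 view: diagonal corner offset with vertical rise equal to the
    # horizontal offset, giving a 45-degree elevation at distance == radius.
    h = radius / 2.0
    angled_eye = (target[0] + h, target[1] + h * math.sqrt(2.0), target[2] + h)
    return top_down_eye, angled_eye, target
```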

Proposed Approach

Insert a new step between the current Step 4 (export) and Step 5 (rotate), or after Step 6 (convert), that renders previews:

  1. Use nerfstudio's built-in rendering — ns-render can render camera paths or individual viewpoints from a trained model checkpoint. Since the splatfacto config and checkpoint are already on disk at {job_root_path}/refined/splatter/splatfacto/config.yml, the renderer can be invoked directly.

  2. Compute the splat bounding box — Read the exported splat.ply point cloud positions to determine the centroid and extent (axis-aligned bounding box). These are used to place both cameras.

  3. Top-down camera — Place a virtual camera directly above the centroid, at a height sufficient to frame the full XZ extent of the splat with some margin. The camera looks straight down (−Y or −Z, depending on the coordinate convention). This produces a plan / floor-plan style overview.

  4. Angled corner camera — Place a virtual camera at one of the upper corners of the bounding box, offset diagonally above the scene, looking inward at the centroid. A ~45° elevation angle gives a ¾-view / isometric-style shot that reveals depth and vertical structure. Set the field of view wide enough to frame the full extent of the splat.

  5. Render both views — Use nerfstudio's camera JSON format (or programmatically construct Cameras objects) to define two single-frame render poses and invoke ns-render for each.

  6. Save the preview images — Write the rendered images as JPEG or PNG to the output directory:

    {job_root_path}/refined/splatter/preview_top.jpg
    {job_root_path}/refined/splatter/preview_angle.jpg
    
  7. Upload both previews — In the Rust runner (runner/src/lib.rs), after uploading splat_rot.splat, upload both preview images as additional domain artifacts:

    • Top-down: data_type: "splat_preview_top", name refined_splat_preview_top{suffix}
    • Angled: data_type: "splat_preview_angle", name refined_splat_preview_angle{suffix}
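Steps 2 and 5 might look like the sketch below. splat_bbox is a hypothetical helper operating on already-parsed point positions (reading the binary splat.ply itself would need a PLY library such as plyfile), and the ns-render camera-path flags should be verified against the installed nerfstudio version:

```python
import subprocess
from pathlib import Path

def splat_bbox(points):
    """Axis-aligned bounding box of an iterable of (x, y, z) positions."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def render_preview(config_yml: Path, camera_path_json: Path, out_dir: Path):
    """Render one preview via ns-render from the on-disk checkpoint.
    Flag names are assumptions; check them against the installed version."""
    subprocess.run(
        ["ns-render", "camera-path",
         "--load-config", str(config_yml),
         "--camera-path-filename", str(camera_path_json),
         "--output-format", "images",
         "--output-path", str(out_dir)],
        check=True,
    )
```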

Expected Changes

  • run.py — Add a new pipeline step that:
    1. Reads the splat bounding box / centroid from splat.ply.
    2. Computes two camera poses (top-down and angled corner).
    3. Invokes ns-render (or equivalent) twice to produce both preview images.
  • runner/src/lib.rs — After the existing put_domain_artifact call for the .splat file, add two more uploads for the preview images. Include both preview paths in the final progress JSON payload.
  • README.md — Update the "Output Files" section to document the new preview image artifacts.

Updated Output Structure

{job_root_path}/
└── refined/
    └── splatter/
        ├── splat.ply
        ├── splat_rot.ply
        ├── splat_rot.splat           # uploaded as "splat_data"
        ├── preview_top.jpg           # NEW — top-down view, uploaded as "splat_preview_top"
        ├── preview_angle.jpg         # NEW — angled ¾-view, uploaded as "splat_preview_angle"
        └── splatfacto/
            └── config.yml

Acceptance Criteria

  • After a successful training run, two preview images (JPEG or PNG) are rendered from the trained Gaussian Splat model.
  • Top-down preview: camera is placed directly above the centroid looking straight down, framing the full spatial footprint.
  • Angled preview: camera is placed at an elevated corner (~45° above horizontal) looking at the centroid, framing the entire scene with depth visible.
  • The preview images are saved to {job_root_path}/refined/splatter/preview_top.jpg and preview_angle.jpg.
  • Both preview images are uploaded to the domain as artifacts with distinguishable names and data_type values (e.g., splat_preview_top and splat_preview_angle).
  • The final progress JSON emitted by the runner includes both preview artifact paths/names.
  • The pipeline does not fail the entire job if preview rendering fails — this should be best-effort with a warning log.
  • README and output documentation are updated to reflect the new artifacts.
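The best-effort criterion could be satisfied with a small wrapper in run.py around the preview steps (names below are illustrative):

```python
import logging

log = logging.getLogger("splatter")

def best_effort(step_name, fn, *args, **kwargs):
    """Run a non-critical step; on failure, log a warning and continue
    instead of failing the whole training job."""
    try:
        return fn(*args, **kwargs)
    except Exception:
        log.warning("optional step %r failed; continuing without it",
                    step_name, exc_info=True)
        return None
```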
