
Dynamic‑eDiTor: Training‑Free Text‑Driven 4D Scene Editing with Multimodal Diffusion Transformer [CVPR 2026]

Dong In Lee<sup>1,2</sup>*, Hyungjun Doh<sup>1</sup>*, Seunggeun Chi<sup>1</sup>, Runlin Duan<sup>1</sup>,
Sangpil Kim<sup>2</sup>†, Karthik Ramani<sup>1</sup>†

<sup>1</sup>Purdue University,   <sup>2</sup>Korea University

⚙️ Installation

Tested on Python 3.10 + CUDA 12.1.

conda create -n editor python=3.10
conda activate editor

# CUDA 12.1
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 xformers --index-url https://download.pytorch.org/whl/cu121
pip install diffusers==0.35.1 transformers==4.55.4 accelerate==1.10.1
pip install "huggingface-hub>=0.34.0,<1.0"
pip install bitsandbytes peft

cd dynamic_editor
# Method-specific dependencies
pip install -r requirements_multiview.txt   # for multi-view scenes
pip install -r requirements_mono.txt       # for monocular scenes
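After installing, a quick sanity check (a minimal sketch; assumes the `editor` conda env is active and checks only the packages pinned above) confirms everything is importable:

```shell
# Report which of the pinned packages resolved in the active environment
python - <<'PY'
import importlib.util as u
for pkg in ("torch", "torchvision", "diffusers", "transformers", "accelerate"):
    print(f"{pkg}: {'OK' if u.find_spec(pkg) is not None else 'MISSING'}")
PY
```

Any `MISSING` line means the corresponding `pip install` step above did not complete.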

If you encounter:

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

Install the missing system library:

sudo apt-get install -y libgl1

📂 Datasets and Pre‑trained Scenes

Recommended layout:

data/
 ├─ DyCheck/
 └─ DyNeRF/
dynamic_editor/

🎨 Editing

  1. Update all local paths in script/run_editor.sh.
  2. Run editing + optimization + rendering:
cd script
bash run_editor.sh
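The paths in `run_editor.sh` are machine-specific. A hypothetical sketch of the kind of edits step 1 expects (the variable names below are illustrative, not the script's actual names — check `script/run_editor.sh` for the real ones):

```shell
# Illustrative only: point placeholder paths at your local layout
DATA_ROOT=/path/to/data/DyNeRF        # dataset root, matching the layout above
OUTPUT_ROOT=/path/to/outputs          # where edited and rendered scenes are written
```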

For ~48GB GPUs (e.g., RTX A6000), enable local caching:

cd script
bash run_editor_local.sh

Monocular Scenes

Update paths in src/Deformable-3D-Gaussians/script/run.sh:

  • DATA_DIR: path to the data for scene reconstruction
  • BASE_OUTPUT_NAME: pre‑trained scene name (e.g., "mochi-high-five")
  • BASE_OUTPUT_ROOT: path to the pre‑trained scene
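For example, the three variables might be set as follows (the paths are hypothetical; `mochi-high-five` is the sample scene name given above):

```shell
# Hypothetical local paths — adjust to your machine
DATA_DIR=/path/to/data/DyCheck/mochi-high-five
BASE_OUTPUT_NAME="mochi-high-five"
BASE_OUTPUT_ROOT=/path/to/pretrained_scenes
```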

Run editing + optimization + rendering:

cd src/Deformable-3D-Gaussians/script
bash run.sh

🎬 Rendering

All commands above include a visualization/rendering step. After completion, inspect the generated results under the corresponding output directories created by each script. You can render additional views using the rendering utilities provided by 4DGS or the scripts supplied in this repository.

🧰 Tips

  • Ensure your dataset is correctly pre‑processed with 4DGS (multi‑view) or the provided monocular setup.
  • Use the local‑caching script on 48GB GPUs to avoid OOM.
  • Allocate sufficient local storage if caching is enabled.

📜 Citation

If you find this work useful, please cite:

@article{lee2025dynamiceditor,
  title   = {Dynamic-eDiTor: Training-Free Text-Driven 4D Scene Editing with Multimodal Diffusion Transformer},
  author  = {Lee, Dong In and Doh, Hyungjun and Chi, Seunggeun and Duan, Runlin and Kim, Sangpil and Ramani, Karthik},
  journal = {arXiv preprint arXiv:2512.00677},
  year    = {2025}
}

🙏 Acknowledgements

We thank the authors and contributors of 4DGS, DyNeRF, Diffusers, and related open‑source projects that made this work possible.

About

Official code of "Dynamic-eDiTor: Training-Free Text-Driven 4D Scene Editing with Multimodal Diffusion Transformer"
