AIR-Embodied: An Efficient Active 3DGS-based Interaction and Reconstruction Framework with Embodied Large Language Model

QZH-00/AIR-Embodied

AIR-Embodied

The replication code for the experiments has been open-sourced; the full system will be released once the paper is accepted.

Recent advancements in 3D reconstruction and neural rendering have enabled the creation of high-quality digital assets, yet existing methods struggle to generalize across varying object shapes, textures, and occlusions. While Next Best View (NBV) planning and learning-based approaches offer partial solutions, they are often limited by predefined criteria and fail to handle occlusions effectively. We present AIR-Embodied, a novel framework that integrates embodied AI agents with large-scale pretrained multi-modal language models (MLLMs) to improve active reconstruction. AIR-Embodied uses a three-stage process: understanding the current reconstruction state via multi-modal prompts, planning tasks with viewpoint selection and interactive actions, and employing closed-loop reasoning to ensure accurate execution. The agent dynamically refines its actions based on discrepancies between planned and actual outcomes. Experimental evaluations in virtual and real-world environments show that AIR-Embodied significantly improves reconstruction efficiency and quality, providing a robust solution to key challenges in active 3D reconstruction.
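The three-stage process above can be sketched as a simple closed loop. This is a minimal illustration only, not the paper's implementation: `query_mllm`, `execute`, and the fixed set of eight viewpoints are hypothetical stand-ins for the real multi-modal model, robot execution, and 3DGS quality metrics.

```python
from dataclasses import dataclass, field

@dataclass
class ReconstructionState:
    covered_views: set = field(default_factory=set)  # viewpoints captured so far
    quality: float = 0.0                             # proxy for reconstruction quality

NUM_VIEWS = 8  # assumed discrete viewpoint budget for this toy example

def query_mllm(prompt: str, state: ReconstructionState):
    """Hypothetical stand-in for prompting an MLLM with the current state.

    Here it simply proposes the first uncovered viewpoint; the real system
    reasons over multi-modal prompts (renders, coverage maps, task context).
    """
    candidates = [v for v in range(NUM_VIEWS) if v not in state.covered_views]
    return candidates[0] if candidates else None

def execute(view: int, state: ReconstructionState) -> None:
    """Stand-in for moving the agent, capturing images, and updating the 3DGS model."""
    state.covered_views.add(view)
    state.quality = len(state.covered_views) / NUM_VIEWS

def active_reconstruction(state: ReconstructionState,
                          target_quality: float = 1.0,
                          max_steps: int = 20) -> ReconstructionState:
    """Understand -> plan -> execute-and-verify loop."""
    for _ in range(max_steps):
        if state.quality >= target_quality:
            break
        # Stages 1-2: understand the current state and plan the next action.
        action = query_mllm("Which viewpoint or interaction next?", state)
        if action is None:
            break
        expected_quality = state.quality + 1 / NUM_VIEWS
        # Stage 3: execute, then compare the outcome against the plan.
        execute(action, state)
        if abs(state.quality - expected_quality) > 1e-6:
            # Discrepancy between planned and actual outcome: re-plan next iteration.
            continue
    return state

final = active_reconstruction(ReconstructionState())
print(final.quality)  # 1.0 once all toy viewpoints are covered
```

The closed-loop check in the last stage mirrors the paper's idea of refining actions when execution deviates from the plan; here the "discrepancy" branch is trivial because the toy executor is deterministic.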
