**source/en/user_guide/internnav/quick_start/installation.md** (35 additions, 75 deletions)

@@ -256,35 +256,32 @@ To get started, we need to prepare the data and checkpoints.
Please download our latest pretrained [checkpoint](https://huggingface.co/InternRobotics/InternVLA-N1) of InternVLA-N1 and run the following script to run inference with visualization results. Move the checkpoint to the `checkpoints` directory.
2. **DepthAnything v2 Checkpoints**
Please download the DepthAnything v2 pretrained [checkpoint](https://huggingface.co/Ashoka74/Placement/resolve/main/depth_anything_v2_vits.pth). Move the checkpoint to the `checkpoints` directory.
-3. **Matterport3D Scenes**
-Download the MP3D scenes from the [official project page](https://niessner.github.io/Matterport/) and place them under `data/scene_datasets/mp3d`.
+Download the [InternData-N1](https://huggingface.co/datasets/InternRobotics/InternData-N1) for `vln-ce`. Extract them into the `data/vln_ce/` directory.
+4. **Scene-N1**
+Download the [SceneData-N1](https://huggingface.co/datasets/InternRobotics/Scene-N1) for `mp3d_ce`. Extract them into the `data/scene_data/` directory.
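The downloads above can also be scripted. Below is a minimal sketch assuming `huggingface_hub` is installed (`pip install huggingface_hub`); the target directories are the ones named in this guide, and any archives inside the dataset snapshots still need extracting as described.

```python
# Sketch: fetch the checkpoints and datasets listed above with huggingface_hub.
# Target paths follow the directories named in this guide; adjust to your setup.
from huggingface_hub import hf_hub_download, snapshot_download

# 1. InternVLA-N1 checkpoint -> checkpoints/
snapshot_download(repo_id="InternRobotics/InternVLA-N1",
                  local_dir="checkpoints/InternVLA-N1")

# 2. DepthAnything v2 (ViT-S) checkpoint -> checkpoints/
hf_hub_download(repo_id="Ashoka74/Placement",
                filename="depth_anything_v2_vits.pth",
                local_dir="checkpoints")

# 3./4. Trajectory and scene datasets -> data/ (extract archives afterwards)
snapshot_download(repo_id="InternRobotics/InternData-N1",
                  repo_type="dataset", local_dir="data/vln_ce")
snapshot_download(repo_id="InternRobotics/Scene-N1",
                  repo_type="dataset", local_dir="data/scene_data")
```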
Find the IP address of the node allocated by Slurm, then change `BACKEND_URL` in the Gradio client (`navigation_ui.py`) to the server's IP address and start the Gradio client.
```bash
-python navigation_ui.py
+python scripts/eval/navigation_ui.py
```
-Note that it's better to run the Gradio client on a machine with a graphical user interface (GUI), but ensure there is proper network connectivity between the client and the server. Then open a browser and enter the Gradio address (such as http://0.0.0.0:5700). We can see the interface as shown below.
+Note that it's better to run the Gradio client on a machine with a graphical user interface (GUI), but ensure there is proper network connectivity between the client and the server. Download the Gradio scene assets from [Hugging Face](https://huggingface.co/datasets/InternRobotics/Scene-N1) and extract them into the `scene_assets` directory of the client. Then open a browser and enter the Gradio address (such as http://0.0.0.0:5700). We can see the interface as shown below.
-Click the 'Start Navigation Simulation' button to send a VLN request to the backend. The backend will submit a task to the Ray server and simulate the VLN task with the InternVLA-N1 models. Wait about 3 minutes and the VLN task will finish and return a result video. We can see the result video in Gradio like this.
+Click the 'Start Navigation Simulation' button to send a VLN request to the backend. The backend will submit a task to the Ray server and simulate the VLN task with the InternVLA-N1 models. Wait about 1 minute and the VLN task will finish and return a result video. We can see the result video in Gradio like this.
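To make the client-side change concrete, the sketch below shows the kind of edit involved: `BACKEND_URL` is the constant named above in `scripts/eval/navigation_ui.py`, while the IP, port, and connectivity check are placeholders, not the script's actual contents.

```python
# Sketch: point the Gradio client at the backend server. BACKEND_URL is the
# constant named in the text above; the IP and port here are placeholders
# for the actual address of the Slurm-allocated node.
import requests

BACKEND_URL = "http://10.140.0.25:8000"

def backend_reachable(url: str = BACKEND_URL, timeout: float = 5.0) -> bool:
    """Cheap connectivity check between the GUI machine and the server."""
    try:
        requests.get(url, timeout=timeout)
        return True
    except requests.RequestException:
        return False

if __name__ == "__main__":
    print("backend reachable:", backend_reachable())
```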
@@ -343,10 +340,14 @@ After downloading, organize the datasets into the following structure:
```
data/
├── scene_data/
│ ├── mp3d_pe/
-│ │ ├──17DRP5sb8fy/
+│ │ ├── 17DRP5sb8fy/
│ │ ├── 1LXtFkjw3qL/
│ │ └── ...
│ ├── mp3d_ce/
+│ │ ├── mp3d/
+│ │ │ ├── 17DRP5sb8fy/
+│ │ │ ├── 1LXtFkjw3qL/
+│ │ │ └── ...
│ └── mp3d_n1/
├── vln_pe/
│ ├── raw_data/
@@ -362,55 +363,14 @@ data/
│ └── ...
├── vln_ce/
│ ├── raw_data/
+│ │ ├── r2r
+│ │ │ ├── train
+│ │ │ ├── val_seen
+│ │ │ │ └── val_seen.json.gz
+│ │ │ └── val_unseen
+│ │ │     └── val_unseen.json.gz
│ └── traj_data/
└── vln_n1/
    └── traj_data/
```

-If you want to evaluate in the Habitat environment and have finished the data preparation mentioned [above](#DataCheckpoints-Preparation), the final data structure should look like this:
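Once everything is extracted, a short script can sanity-check the layout. This is a sketch, not part of the repo; the paths are read directly off the tree above.

```python
# Sketch: verify the dataset layout shown above. Paths are copied from the
# tree in this guide; extend the list for the entries you actually use.
from pathlib import Path

EXPECTED = [
    "data/scene_data/mp3d_pe/17DRP5sb8fy",
    "data/scene_data/mp3d_ce/mp3d",
    "data/scene_data/mp3d_n1",
    "data/vln_pe/raw_data",
    "data/vln_ce/raw_data/r2r/val_seen/val_seen.json.gz",
    "data/vln_ce/raw_data/r2r/val_unseen/val_unseen.json.gz",
    "data/vln_ce/traj_data",
    "data/vln_n1/traj_data",
]

missing = [p for p in EXPECTED if not Path(p).exists()]
print("missing:", missing if missing else "none -- layout looks complete")
```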
-The evaluation results will be saved in the `eval_results.log` file in the `output_dir` of the config file. The whole evaluation process takes about 3 hours on an RTX 4090 platform.
+The evaluation results will be saved in the `eval_results.log` file in the `output_dir` of the config file. The whole evaluation process takes about 10 hours on an RTX 4090 platform.

#### Evaluation on Habitat
@@ -49,7 +49,7 @@ For multi-gpu inference, currently we only support inference on SLURM.
### Training
-Download the training data from [Hugging Face](https://huggingface.co/datasets/InternRobotics/InternData-N1/), and extract them into the `data/datasets/` directory.
+Download the training data from [Hugging Face](https://huggingface.co/datasets/InternRobotics/InternData-N1/), and organize them in the form mentioned in [installation](./installation.md).
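If the training data arrives as archives, unpacking them into the documented layout might look like the sketch below; the `downloads/` staging directory is an assumption, not part of the repo.

```python
# Sketch: unpack downloaded archives into data/ so the layout matches the
# installation guide. The downloads/ staging directory is hypothetical.
import tarfile
from pathlib import Path

for archive in sorted(Path("downloads").glob("*.tar.gz")):
    with tarfile.open(archive) as tar:
        tar.extractall(path="data")  # top-level folders should match the documented tree
    print(f"extracted {archive.name} -> data/")
```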
**source/en/user_guide/internnav/tutorials/model.md** (6 additions, 6 deletions)

@@ -1,20 +1,20 @@
# Model

-This tutorial introduces the structure and implementation of both System 1 (navdp) and System 2 (rdp) policy models in the InternNav framework.
+This tutorial introduces the structure and implementation of both the System 1 (NavDP) and whole-system (InternVLA-N1) policy models in the InternNav framework.

---

-## System 1: Navdp
+## System 1: NavDP

-<!--navdp content start -->
+<!--NavDP content start -->

-This tutorial introduces the structure and implementation of the navdp policy model in the InternNav framework, helping you understand and customize each module.
+This tutorial introduces the structure and implementation of the NavDP policy model in the InternNav framework, helping you understand and customize each module.

---

### Model Structure Overview

-The navdp policy model in InternNav mainly consists of the following parts:
+The NavDP policy model in InternNav mainly consists of the following parts:
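The individual parts are covered in the tutorial itself. As a purely illustrative mental model (not the actual NavDP code), a System 1 policy of this kind wires an observation encoder into an action head that emits a short horizon of waypoints:

```python
# Illustrative-only sketch of how a System 1 policy is typically composed
# (encoder -> action head). The real NavDP modules are defined in the
# InternNav code base and may differ substantially.
import torch
import torch.nn as nn

class ToyNavPolicy(nn.Module):
    def __init__(self, obs_dim: int = 512, action_dim: int = 3, horizon: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
        self.head = nn.Linear(256, action_dim * horizon)  # waypoint sequence
        self.horizon, self.action_dim = horizon, action_dim

    def forward(self, obs_feat: torch.Tensor) -> torch.Tensor:
        x = self.encoder(obs_feat)
        return self.head(x).view(-1, self.horizon, self.action_dim)

policy = ToyNavPolicy()
print(policy(torch.randn(2, 512)).shape)  # torch.Size([2, 8, 3])
```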
**source/en/user_guide/internnav/tutorials/training.md** (9 additions, 9 deletions)

@@ -1,22 +1,22 @@
# Training

-This tutorial provides a detailed guide for training both System 1 (navdp) and System 2 (rdp) policy models within the InternNav framework.
+This tutorial provides a detailed guide for training both the System 1 (NavDP) and whole-system (InternVLA-N1-S2) policy models within the InternNav framework.

---

-## System 1: Navdp
+## System 1: NavDP

-<!--navdp content start -->
+<!--NavDP content start -->

-This tutorial provides a detailed guide for training the navdp policy model within the InternNav framework. It covers the **training workflow**, **configuration and parameters**, **command-line usage**, and **troubleshooting**.
+This tutorial provides a detailed guide for training the NavDP policy model within the InternNav framework. It covers the **training workflow**, **configuration and parameters**, **command-line usage**, and **troubleshooting**.

---

### Overview of the Training Process

-The navdp training process in InternNav includes the following steps:
+The NavDP training process in InternNav includes the following steps (a toy end-to-end sketch follows the list):
-1. **Model Initialization**: Load navdp configuration and initialize model structure and parameters.
+1. **Model Initialization**: Load the NavDP configuration and initialize the model structure and parameters.
2. **Dataset Loading**: Configure dataset paths and preprocessing, build the DataLoader.
3. **Training Parameter Setup**: Set batch size, learning rate, optimizer, and other hyperparameters.
4. **Distributed Training Environment Initialization**: Multi-GPU training is supported out of the box.
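The toy sketch below walks through those four steps end to end; it is self-contained stand-in code, not the real NavDP training entry point, which is driven by the config file and launched via `torchrun` as described later.

```python
# Toy, self-contained illustration of the four steps above. The real NavDP
# training uses the config at scripts/train/configs/navdp.py and torchrun.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# 1. Model initialization (a stand-in for the NavDP model)
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 3))

# 2. Dataset loading (random tensors stand in for trajectory data)
dataset = TensorDataset(torch.randn(256, 64), torch.randn(256, 3))
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# 3. Training parameter setup
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# 4. (Distributed init omitted here; launched via torchrun in practice.)
for epoch in range(2):
    for obs, target in loader:
        loss = loss_fn(model(obs), target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```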
@@ -32,7 +32,7 @@ Ensure you have installed InternNav and its dependencies, and have access to a mu
#### 2. Configuration Check
-The navdp training configuration file is located at:
+The NavDP training configuration file is located at:
```bash
InternNav/scripts/train/configs/navdp.py
```
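As a rough idea of what a Python config file like this tends to hold — the field names below are guesses for illustration, not the actual contents of `navdp.py`:

```python
# Hypothetical excerpt: illustrates the shape of a Python config file.
# Field names are placeholders; consult the real scripts/train/configs/navdp.py.
batch_size = 64
lr = 1e-4
epochs = 100
num_workers = 8
dataset_root = "data/vln_pe/traj_data"
output_dir = "checkpoints/navdp_runs"
```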
@@ -72,7 +72,7 @@ torchrun \
### Training Parameters and Configuration
-The main training parameters for navdp are set in `scripts/train/configs/navdp.py`. Common parameters include:
+The main training parameters for NavDP are set in `scripts/train/configs/navdp.py`. Common parameters include: