@kellyiss commented on Dec 4, 2025

✨ InternVLA-N1 Dual-System Variants

  1. Added training code for internvla-n1-w-navdp
  2. Added training code for internvla-n1-dualvln
  3. Both variants support full training/inference pipeline

📊 Extended Evaluation Support

  1. Added support for RXR evaluation
  2. Decoupled the System 2 and Dual System evaluation functions
  3. Released new model checkpoints
  4. Updated the README: download links, performance metrics, and some content reorganization
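
For readers unfamiliar with the change, here is a minimal sketch of what decoupling the two evaluation paths could look like. Every name in it (`eval_system2`, `eval_dual_system`, `EVALUATORS`, `run_eval`, and the mode strings) is hypothetical and chosen for illustration; the functions in this PR may be named and structured differently.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the two evaluation paths. The real functions
# in the PR may have different names, arguments, and return values.
def eval_system2(episodes: List[str]) -> Dict[str, object]:
    """Evaluate System 2 on its own."""
    return {"mode": "system2", "num_episodes": len(episodes)}


def eval_dual_system(episodes: List[str]) -> Dict[str, object]:
    """Evaluate the full dual-system pipeline."""
    return {"mode": "dual_system", "num_episodes": len(episodes)}


# One entry per evaluation mode, so neither path needs to know about the other.
EVALUATORS: Dict[str, Callable[[List[str]], Dict[str, object]]] = {
    "system2": eval_system2,
    "dual_system": eval_dual_system,
}


def run_eval(mode: str, episodes: List[str]) -> Dict[str, object]:
    """Dispatch to the evaluator registered for `mode`."""
    try:
        evaluator = EVALUATORS[mode]
    except KeyError:
        raise ValueError(f"unknown eval mode: {mode!r}") from None
    return evaluator(episodes)
```

With a table like this, adding a third evaluation mode would only require registering one more function.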

import copy
import itertools
import json
import os
Collaborator

@kew6688 Check the conflicts between this file and the refactored evaluator.

Collaborator Author

Acknowledged. Conflicts are marked. I will wait for @kew6688's adaptation before merging.

Collaborator

kellyiss#1

The adaptation is ready to merge into this PR:

  • Fixed the vlnpe internvla_n1 agent to work with the updated n1 model
  • Merged the refactored Habitat Evaluator with the decoupled System 2 and Dual System evaluation functions
  • Downgraded diffusers from 0.33.1 to 0.32.2
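
Since the diffusers version is pinned, a small check along these lines could catch a mismatched environment early. This is only a sketch: `parse_version` and `matches_pin` are hypothetical helpers, and the parser only handles plain dotted release strings such as `0.32.2`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain dotted release string such as '0.32.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def matches_pin(installed: str, pinned: str = "0.32.2") -> bool:
    """Return True when the installed version equals the pinned one."""
    return parse_version(installed) == parse_version(pinned)


# In a real environment the installed version string could be read with
# importlib.metadata.version("diffusers") and passed to matches_pin().
```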

@Tai-Wang changed the title from "[feat] Add training code for InternVLA-N1" to "[Feature] Add training code for InternVLA-N1" on Dec 9, 2025
@kew6688 changed the base branch from main to dev on December 10, 2025, 06:11
@yuqiang-yang self-assigned this on Dec 10, 2025
self.dual_forward_step += 1

- # print('Output action:', output, self.dual_forward_step)
+ print('Output action:', output, self.dual_forward_step)
Collaborator

Do we need to keep this print?

Collaborator

It's only for debugging, and it's noisy. I will remove it.
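
If the message is still useful while debugging, one option is to route it through the standard `logging` module at DEBUG level, so it stays silent by default. A minimal sketch, assuming a hypothetical logger name and a hypothetical helper `log_action`:

```python
import logging

# Hypothetical logger name; the project may already define its own logger.
logger = logging.getLogger("internvla_n1.agent")


def log_action(output, step: int) -> None:
    """Emit the action only when DEBUG logging is enabled for this logger."""
    logger.debug("Output action: %s (dual_forward_step=%d)", output, step)
```

The message is formatted only when the logger is enabled for DEBUG, so leaving the call in place costs little in normal runs.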

server_port=8023,
model_name='internvla_n1',
ckpt_path='',
model_settings={
Collaborator

@kew6688 Is this the default configuration file referenced in the documentation? If so, please check the results and tutorial steps in the documentation.

Collaborator

Yes, this is the default configuration. The download instructions for the new checkpoints need to be updated in the documentation.

BTW, this config is used for both the flash and non-flash modes. Should the two modes have separate cfg files?
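
One way to keep the two modes from drifting apart, whether they live in one file or two, is to derive both from a shared base. The sketch below is illustrative only: the `flash` key and the `make_settings` helper are hypothetical, and only `model_name`, `server_port`, and `ckpt_path` come from the snippet under review.

```python
import copy

# Values taken from the snippet under review.
BASE_SETTINGS = {
    "server_port": 8023,
    "model_name": "internvla_n1",
    "ckpt_path": "",
}


def make_settings(flash: bool) -> dict:
    """Return a copy of the shared settings with the mode flag set."""
    settings = copy.deepcopy(BASE_SETTINGS)
    settings["flash"] = flash  # hypothetical key, not from the PR
    return settings


flash_cfg = make_settings(flash=True)
default_cfg = make_settings(flash=False)
```

Two thin cfg files could then each call the helper with a different flag, so shared values are defined once.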

),
task=TaskCfg(
task_name='rdp_eval',
task_settings={
Collaborator

Why do we change the configuration file of RDP in this PR?

Collaborator

I changed this cfg to default to 1 GPU instead of a vectorized env with 4 GPUs.

# shift_labels = shift_labels.view(-1)
# # Enable model parallelism
# shift_labels = shift_labels.to(shift_logits.device)
# loss = loss_fct(shift_logits, shift_labels)
Collaborator

The unused code can be removed.

Collaborator Author

Removed.

env_settings={
# habitat sim specifications - agent, sensors, tasks, measures etc. are defined in the habitat config file
- 'config_path': 'scripts/eval/configs/vln_r2r.yaml',
+ 'config_path': 'scripts/eval/configs/vln_rxr.yaml',
Collaborator

Should this be r2r? Please also check the file: the env there is r2r, not rxr.
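
A small guard could catch this kind of r2r/rxr mismatch before an evaluation run. The sketch below assumes the `vln_<dataset>.yaml` naming seen in the two paths in the snippet; `dataset_from_config_path` and `check_dataset` are hypothetical helpers, not part of the PR.

```python
from pathlib import PurePosixPath


def dataset_from_config_path(config_path: str) -> str:
    """Extract the dataset tag, e.g. 'vln_r2r.yaml' -> 'r2r'.

    Assumes the file name follows the 'vln_<dataset>.yaml' pattern.
    """
    stem = PurePosixPath(config_path).stem  # e.g. 'vln_r2r'
    return stem.rsplit("_", 1)[-1]


def check_dataset(config_path: str, expected: str) -> None:
    """Raise ValueError if the config path points at a different dataset."""
    actual = dataset_from_config_path(config_path)
    if actual != expected:
        raise ValueError(
            f"config {config_path!r} targets {actual!r}, expected {expected!r}"
        )
```

Calling `check_dataset` at the top of an r2r eval script would fail fast if the path were switched to the rxr file.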

# hidden_states, mask, img_size, image_rotary_emb = self.patch_embedder(hidden_states, image_rotary_emb)
# torch.Size([16, 256, 1792]) torch.Size([16, 256])
# image_rotary_emb = image_rotary_emb.to(hidden_states.device)
# breakpoint()
Collaborator

Can this unused code be removed?

# hidden_states = hidden_states[:, :sequence_length].view(
# batch_size, height // height_tokens, width // width_tokens, height_tokens, width_tokens, self.out_channels
# )
# output = hidden_states.permute(0, 5, 1, 3, 2, 4).flatten(4, 5).flatten(2, 3)
Collaborator

Can this be removed?

4 participants