[Feature] Add training code for InternVLA-N1 #184
base: dev
| import copy | ||
| import itertools | ||
| import json | ||
| import os |
@kew6688 Check the conflicts between this file and the refactored evaluator.
Acknowledged. Conflicts are marked. Will wait for @kew6688's adaptation before merging.
The adaptation is ready to merge in this PR:
- Fixed the vlnpe internvla_n1 agent to work with the updated n1 model
- Merged the refactored Habitat Evaluator, with decoupled System 2 and Dual System evaluation functions
- Downgraded diffusers from 0.33.1 to 0.32.2
… in generate_traj
| self.dual_forward_step += 1 | ||
| # print('Output action:', output, self.dual_forward_step) | ||
| print('Output action:', output, self.dual_forward_step) |
Do we need to keep this print?
It's only for debugging, and it's noisy. I will remove it.
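If the step output is still useful when debugging, one option is to route it through `logging` at DEBUG level instead of deleting it. A minimal sketch, assuming a hypothetical stand-in class and logger name (neither is taken from this PR):

```python
import logging

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("internvla_n1_agent")


class DualSystemAgentSketch:
    """Hypothetical stand-in for the agent; only the counter and log call matter."""

    def __init__(self):
        self.dual_forward_step = 0

    def step(self, output):
        self.dual_forward_step += 1
        # Emitted only when the logger is set to DEBUG, so normal runs stay quiet.
        logger.debug("Output action: %s (step %d)", output, self.dual_forward_step)
        return output
```

Calling `logging.basicConfig(level=logging.DEBUG)` in a debug session would bring the message back without touching the agent code.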
| server_port=8023, | ||
| model_name='internvla_n1', | ||
| ckpt_path='', | ||
| model_settings={ |
@kew6688 Is this the default configuration file referenced in the documentation? If so, check the results and tutorial steps in the documentation.
Yes, this is the default configuration. The instructions for downloading the new checkpoints need to be updated in the documentation.
BTW, this config is used for both the flash and non-flash modes. Should these two modes have two separate cfg files?
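An alternative to two fully separate cfg files is a shared base with a small per-mode override. The sketch below is only an illustration: `AgentCfgSketch`, `make_agent_cfg`, and the `use_flash` key are hypothetical names, not the project's actual config API; only the field names and default values visible in the diff above are reused.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class AgentCfgSketch:
    """Hypothetical mirror of the fields shown in the diff."""
    server_port: int = 8023
    model_name: str = 'internvla_n1'
    ckpt_path: str = ''
    model_settings: dict = field(default_factory=dict)


BASE = AgentCfgSketch()


def make_agent_cfg(flash: bool) -> AgentCfgSketch:
    # Copy the shared settings so the base config is never mutated.
    settings = dict(BASE.model_settings)
    settings['use_flash'] = flash  # hypothetical key, for illustration only
    return replace(BASE, model_settings=settings)


flash_cfg = make_agent_cfg(flash=True)
plain_cfg = make_agent_cfg(flash=False)
```

Both modes then share one source of defaults, and only the differing setting lives in the mode-specific entry point.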
| ), | ||
| task=TaskCfg( | ||
| task_name='rdp_eval', | ||
| task_settings={ |
Why do we change the configuration file of RDP in this PR?
I changed this cfg to use 1 GPU by default instead of a vec env with 4 GPUs.
| # shift_labels = shift_labels.view(-1) | ||
| # # Enable model parallelism | ||
| # shift_labels = shift_labels.to(shift_logits.device) | ||
| # loss = loss_fct(shift_logits, shift_labels) |
The unused code can be removed.
Removed
| env_settings={ | ||
| # habitat sim specifications - agent, sensors, tasks, measures etc. are defined in the habitat config file | ||
| 'config_path': 'scripts/eval/configs/vln_r2r.yaml', | ||
| 'config_path': 'scripts/eval/configs/vln_rxr.yaml', |
Should this be r2r? Please also check the file, where the env is r2r instead of rxr.
| # hidden_states, mask, img_size, image_rotary_emb = self.patch_embedder(hidden_states, image_rotary_emb) | ||
| # torch.Size([16, 256, 1792]) torch.Size([16, 256]) | ||
| # image_rotary_emb = image_rotary_emb.to(hidden_states.device) | ||
| # breakpoint() |
Remove the unused code?
| # hidden_states = hidden_states[:, :sequence_length].view( | ||
| # batch_size, height // height_tokens, width // width_tokens, height_tokens, width_tokens, self.out_channels | ||
| # ) | ||
| # output = hidden_states.permute(0, 5, 1, 3, 2, 4).flatten(4, 5).flatten(2, 3) |
Can this be removed?
✨ InternVLA-N1 Dual-System Variants
📊 Extended Evaluation Support