Closed
Description
While running BenchMARL with Meltingpot, I've been seeing unexpectedly slow performance. It takes around a minute to train on only 1000 frames of the Commons Harvest Open environment: 34 seconds are spent in the training step and 19 seconds in the collection step. The training device is a Tesla V100 GPU, and sampling runs on the CPU. I've attached my config below, which is from #78. I'm not exactly sure what's happening and would appreciate some pointers!
config.yaml:
defaults:
  - experiment: base_experiment
  - algorithm: mappo
  - task: meltingpot/commons_harvest__open
  - model: layers/cnn
  - model@critic_model: layers/cnn
  - _self_
hydra:
  searchpath:
    # Tells hydra to add the default benchmarl configuration to its path
    - pkg://benchmarl/conf
seed: 0
task:
  max_steps: 100
model:
  mlp_num_cells: [ 256, 256 ]
  cnn_num_cells: [ 16, 32, 256 ]
  cnn_kernel_sizes: [ 8, 4, 11 ]
  cnn_strides: [ 4, 2, 1 ]
  cnn_paddings: [ 2, 1, 5 ]
  cnn_activation_class: torch.nn.ReLU
critic_model:
  mlp_num_cells: [ 256, 256 ]
  cnn_num_cells: [ 16, 32, 256 ]
  cnn_kernel_sizes: [ 8, 4, 11 ]
  cnn_strides: [ 4, 2, 1 ]
  cnn_paddings: [ 2, 1, 5 ]
  cnn_activation_class: torch.nn.ReLU
algorithm:
  entropy_coef: 0.001
  use_tanh_normal: True
experiment:
  sampling_device: "cpu"
  train_device: "cuda"
  share_policy_params: True
  gamma: 0.99
  adam_eps: 0.000001
  lr: 0.00025
  clip_grad_norm: True
  clip_grad_val: 5
  max_n_iters: null
  max_n_frames: 1_000 # 10_000_000
  on_policy_collected_frames_per_batch: 500
  on_policy_n_envs_per_worker: 1
  on_policy_n_minibatch_iters: 45
  on_policy_minibatch_size: 500
  evaluation: True
  render: True
  evaluation_interval: 1_000
  evaluation_episodes: 1
  evaluation_deterministic_actions: False
  loggers: ["wandb"]
  create_json: False
  save_folder: null
  restore_file: null
  checkpoint_interval: 0
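
For context on the numbers above, here is a rough back-of-the-envelope sketch of how many gradient updates this config implies. It rests on two assumptions that are not confirmed here: that each collected batch is passed over `on_policy_n_minibatch_iters` times and split into minibatches of `on_policy_minibatch_size`, and that the reported 34 seconds is the total training time for the whole 1000-frame run. The variable names are illustrative only and are not BenchMARL API.

```python
import math

# Values copied from the experiment section of the config above.
max_n_frames = 1_000
frames_per_batch = 500   # on_policy_collected_frames_per_batch
n_minibatch_iters = 45   # on_policy_n_minibatch_iters
minibatch_size = 500     # on_policy_minibatch_size

# Assumption: one training iteration per collected batch, and each batch is
# passed over n_minibatch_iters times, split into minibatches.
n_iterations = math.ceil(max_n_frames / frames_per_batch)
minibatches_per_pass = math.ceil(frames_per_batch / minibatch_size)
total_updates = n_iterations * n_minibatch_iters * minibatches_per_pass

# Assumption: the reported 34 s covers training for the whole run.
train_seconds = 34
seconds_per_update = train_seconds / total_updates

print(f"iterations: {n_iterations}")
print(f"gradient updates: {total_updates}")
print(f"seconds per update: {seconds_per_update:.2f}")
```

Under those assumptions the run performs 90 gradient updates, each on a 500-frame minibatch, so the training time works out to a bit under 0.4 seconds per update.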