
Unexpected Slowdown with Meltingpot #217

@ImNotRog

While running BenchMARL with Meltingpot, I've been seeing unexpectedly slow performance. Training just 1,000 frames of the Commons Harvest Open environment takes around a minute: 34 seconds in the training step and 19 seconds in the collection step. Training runs on a Tesla V100 GPU and sampling runs on the CPU. I've attached my config below, which is taken from #78. I'm not sure what's happening and would appreciate some pointers!
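
For scale, here is a rough back-of-the-envelope breakdown of those timings, using the batch settings from the config below. It is only a sketch: it assumes each collected batch is split into `frames_per_batch / minibatch_size` minibatches per iteration, and that the full 34 seconds is spent in optimizer steps. `timing_breakdown` is a hypothetical helper, not part of BenchMARL.

```python
import math


def timing_breakdown(total_frames, frames_per_batch, minibatch_iters,
                     minibatch_size, train_seconds, collect_seconds):
    """Estimate per-step costs from the totals reported in this issue."""
    # Number of collect/train cycles needed to reach total_frames.
    n_batches = math.ceil(total_frames / frames_per_batch)
    # Assumption: one pass over a batch uses this many minibatches.
    minibatches_per_iter = math.ceil(frames_per_batch / minibatch_size)
    optimizer_steps = n_batches * minibatch_iters * minibatches_per_iter
    return {
        "optimizer_steps": optimizer_steps,
        "seconds_per_optimizer_step": train_seconds / optimizer_steps,
        "collection_frames_per_second": total_frames / collect_seconds,
    }


stats = timing_breakdown(
    total_frames=1_000,       # max_n_frames
    frames_per_batch=500,     # on_policy_collected_frames_per_batch
    minibatch_iters=45,       # on_policy_n_minibatch_iters
    minibatch_size=500,       # on_policy_minibatch_size
    train_seconds=34.0,       # reported training time
    collect_seconds=19.0,     # reported collection time
)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the run performs 90 optimizer steps at roughly 0.38 seconds each, and collection runs at roughly 53 frames per second.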

config.yaml:

defaults:
  - experiment: base_experiment
  - algorithm: mappo
  - task: meltingpot/commons_harvest__open
  - model: layers/cnn
  - model@critic_model: layers/cnn
  - _self_

hydra:
  searchpath:
    # Tells hydra to add the default benchmarl configuration to its path
    - pkg://benchmarl/conf

seed: 0

task:
  max_steps: 100

model:
  mlp_num_cells: [ 256, 256 ]

  cnn_num_cells: [ 16, 32, 256 ]
  cnn_kernel_sizes: [ 8, 4, 11 ]
  cnn_strides: [4, 2, 1]
  cnn_paddings: [2, 1, 5]
  cnn_activation_class: torch.nn.ReLU

critic_model:
  mlp_num_cells: [ 256, 256 ]

  cnn_num_cells: [ 16, 32, 256 ]
  cnn_kernel_sizes: [ 8, 4, 11 ]
  cnn_strides: [ 4, 2, 1 ]
  cnn_paddings: [ 2, 1, 5 ]
  cnn_activation_class: torch.nn.ReLU

algorithm:
  entropy_coef: 0.001
  use_tanh_normal: True

experiment:
  sampling_device: "cpu"
  train_device: "cuda"

  share_policy_params: True
  gamma: 0.99

  adam_eps: 0.000001
  lr: 0.00025
  clip_grad_norm: True
  clip_grad_val: 5

  max_n_iters: null
  max_n_frames: 1_000 # 10_000_000

  on_policy_collected_frames_per_batch: 500
  on_policy_n_envs_per_worker: 1
  on_policy_n_minibatch_iters: 45
  on_policy_minibatch_size: 500

  evaluation: True
  render: True
  evaluation_interval: 1_000
  evaluation_episodes: 1
  evaluation_deterministic_actions: False

  loggers: ["wandb"]
  create_json: False

  save_folder: null
  restore_file: null
  checkpoint_interval: 0
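
To get a feel for how heavy the CNN in this config is, the sketch below applies the standard convolution output-size formula, `floor((n + 2p - k) / s) + 1`, to the `cnn_*` settings above and counts multiply-accumulates per image. The 88x88 RGB input is an assumption about the observation size, not something stated in the config. The sketch also assumes square inputs and dilation 1. `conv_out` and `cnn_cost` are hypothetical helpers.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of one conv layer (square input, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def cnn_cost(input_size, in_channels, num_cells, kernels, strides, paddings):
    """Return per-layer output sizes and multiply-accumulates per image."""
    sizes, macs = [], []
    size, c_in = input_size, in_channels
    for c_out, k, s, p in zip(num_cells, kernels, strides, paddings):
        size = conv_out(size, k, s, p)
        sizes.append(size)
        # Each output element needs k * k * c_in multiply-accumulates.
        macs.append(size * size * k * k * c_in * c_out)
        c_in = c_out
    return sizes, macs


sizes, macs = cnn_cost(
    input_size=88,            # assumed observation height/width
    in_channels=3,            # assumed RGB input
    num_cells=[16, 32, 256],  # cnn_num_cells
    kernels=[8, 4, 11],       # cnn_kernel_sizes
    strides=[4, 2, 1],        # cnn_strides
    paddings=[2, 1, 5],       # cnn_paddings
)
for i, (size, mac) in enumerate(zip(sizes, macs), start=1):
    print(f"layer {i}: {size}x{size} output, {mac:,} MACs per image")
print(f"flattened features: {sizes[-1] ** 2 * 256:,}")
```

If the 88x88 assumption holds, the third layer (an 11x11 kernel mapping 32 to 256 channels) accounts for nearly all of the convolution cost, and the flattened output passed to the MLP has about 31,000 features. The actor and the critic each run their own copy of this network for every agent observation in a minibatch.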
