SB2 vs SB3 - Performance difference #1124

@MatPoliquin

❓ Question

EDIT: After doing some more digging, I updated the post title and added more details using a newer version of SB3 (1.6.2).

I am using the OpenAI gym-retro env to train on games and migrated from SB2 to SB3 1.6.2. After the migration, training throughput dropped from 1300 FPS to 900 FPS (roughly 30% lower).

  • gym-retro env: Pong-Atari2600
  • num_env==24
  • PPO
  • CnnPolicy

I profiled both versions with Nvidia Nsight. The reports are in the Google Drive folder below (you need Nsight to open them):
https://drive.google.com/drive/folders/1Lqxf-qKXTj__Hp8WUXgNHejZaJGy8oct?usp=sharing

Here are the parameters I use for PPO with SB3 (with SB2 I just use the default parameters provided by SB):
PPO(policy=args.nn, env=env, verbose=1, n_steps=128, n_epochs=4, batch_size=256, learning_rate=2.5e-4, clip_range=0.2, vf_coef=0.5, ent_coef=0.01, max_grad_norm=0.5, clip_range_vf=None)
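The amount of optimizer work per rollout follows from these settings, and it is worth checking when comparing the two versions. The sketch below works through the arithmetic for the SB3 call above with num_env = 24. The SB2 side is an assumption, not something stated in the post: it supposes the SB2 run used PPO2 with its default nminibatches=4 and noptepochs=4.

```python
# Rollout arithmetic for the SB3 settings quoted above.
num_env = 24
n_steps = 128
batch_size = 256
n_epochs = 4

rollout_size = num_env * n_steps                      # transitions per rollout
minibatches_per_epoch = rollout_size // batch_size    # SB3 splits by batch_size
sb3_gradient_steps = minibatches_per_epoch * n_epochs

# ASSUMPTION: SB2 run used PPO2 defaults (nminibatches=4, noptepochs=4).
sb2_nminibatches = 4
sb2_noptepochs = 4
sb2_minibatch_size = rollout_size // sb2_nminibatches
sb2_gradient_steps = sb2_nminibatches * sb2_noptepochs

print("rollout size:", rollout_size)
print("SB3 gradient steps per rollout:", sb3_gradient_steps)
print("SB2 (assumed PPO2 defaults) minibatch size:", sb2_minibatch_size)
print("SB2 (assumed PPO2 defaults) gradient steps per rollout:", sb2_gradient_steps)
```

Under that assumption the SB3 configuration performs 48 gradient steps per rollout on minibatches of 256, against 16 steps on minibatches of 768 for SB2, so the two runs may not be doing the same amount of optimizer work per collected frame.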

My specs:

  • Dual Xeon 2666v3
  • RTX 2060 Super 8g
  • Ubuntu 20.04
  • stable-baselines3 1.6.2
  • gym 0.26.2

Code I use to wrap the retro env (same for both SB2 and SB3 cases):

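import os

import retro
# Imports were omitted in the post; these are the SB3 locations of the names
# used below (assumed). The SB2 run would import from stable_baselines instead.
from stable_baselines3.common.atari_wrappers import ClipRewardEnv, WarpFrame
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import SubprocVecEnv, VecFrameStack, VecTransposeImage
# StochasticFrameSkip is not provided by SB3; it is assumed to be defined elsewhere in the project.


def make_retro(*, game, state=None, num_players, max_episode_steps=4500, **kwargs):
    # max_episode_steps is accepted but not used here.
    if state is None:
        state = retro.State.DEFAULT
    env = retro.make(game, state, **kwargs, players=num_players)
    return env


def init_env(output_path, num_env, state, num_players, args, use_frameskip=True, use_display=False):
    seed = 0
    start_index = 0
    start_method = None
    allow_early_resets = True

    def make_env(rank):
        # Return a thunk so each subprocess builds its own env.
        def _thunk():
            env = make_retro(game=args.env, use_restricted_actions=retro.Actions.FILTERED, state=state, num_players=num_players)

            env.seed(seed + rank)
            env = Monitor(env, output_path and os.path.join(output_path, str(rank)), allow_early_resets=allow_early_resets)
            if use_frameskip:
                env = StochasticFrameSkip(env, n=4, stickprob=0.25)

            env = WarpFrame(env)
            env = ClipRewardEnv(env)

            return env
        return _thunk
    #set_global_seeds(seed)

    # One subprocess per env, then stack 4 frames and move channels first.
    env = SubprocVecEnv([make_env(i + start_index) for i in range(num_env)], start_method=start_method)
    env = VecFrameStack(env, n_stack=4)
    env = VecTransposeImage(env)

    return env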
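To check whether the slowdown comes from environment stepping or from the learner, the wrapped env can be timed on its own, outside of training. The following is a minimal sketch of such a timer. `measure_fps` and `dummy_step` are hypothetical names introduced here for illustration, and the dummy workload only stands in for a real vectorized step.

```python
import time


def measure_fps(step_fn, n_calls, num_envs):
    """Call step_fn n_calls times and return env steps per second.

    Each call is assumed to advance num_envs environments by one step.
    """
    start = time.perf_counter()
    for _ in range(n_calls):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_calls * num_envs / elapsed


def dummy_step():
    # Stand-in workload; replace with a real vectorized env step.
    sum(range(1000))


fps = measure_fps(dummy_step, n_calls=200, num_envs=24)
print(f"{fps:.0f} env steps/s")
```

With the env returned by init_env, step_fn could be something like a lambda that calls env.step with one sampled action per sub-environment, after an initial env.reset(). Running the same measurement under both library versions would show whether the env pipeline alone accounts for the difference.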

Checklist

  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • If code there is, it is minimal and working
  • If code there is, it is formatted using the markdown code blocks for both code and stack traces.
