Skip to content

Conversation

@m2kulkarni
Copy link

No description provided.

@greptile-apps
Copy link

greptile-apps bot commented Jan 31, 2026

Greptile Overview

Greptile Summary

This PR fixes a bug in ego agent indexing when using batch_size > 1 in population play mode. The old code incorrectly used global ego_ids to index into the reshaped observation tensor, which would cause IndexError or select wrong agents. The new code converts global indices to local per-environment indices before indexing.

Key Changes:

  • Adds logic to count how many ego agents belong to the first environment
  • Converts global ego IDs to local indices using modulo operation
  • Uses these local indices to correctly select ego agents from all environments in the batch

Critical Assumption:
The fix assumes all environments in a batch have (1) the same number of ego agents and (2) ego agents at identical local indices. For example, if batch has 2 environments with 3 agents each, and env 0 has ego at local index [1] and env 1 has ego at local index [2], the code will incorrectly select only index [1] from both environments.

The .gitignore change simply adds TODO.md for local development tracking.

Confidence Score: 3/5

  • Safe to merge if ego agents are uniformly structured across environments, but may fail in heterogeneous scenarios
  • The fix correctly addresses the original bug where global indices were used directly on reshaped tensors. However, it makes a strong assumption about uniform ego agent structure across environments that may not always hold. The code should work correctly for the common case where all environments have identical agent layouts, but could silently produce incorrect results if environments have different ego agent configurations.
  • pufferlib/pufferl.py requires attention - verify the uniform ego structure assumption holds for your use case

Important Files Changed

Filename Overview
.gitignore Added TODO.md to gitignore for local task tracking
pufferlib/pufferl.py Fixed ego_ids indexing for batch_size > 1 by converting global indices to local per-environment indices, but assumes uniform ego structure across environments

Sequence Diagram

sequenceDiagram
    participant VecEnv as Vector Environment
    participant PuffeRL as PuffeRL.evaluate()
    participant Drive as Drive Environment

    Drive->>Drive: Initialize ego_ids (global, flattened)
    Drive->>VecEnv: Return observations with info[-1] = ego_ids
    VecEnv->>PuffeRL: recv() returns (o, r, d, t, info, env_id, mask)
    
    alt batch_size > 1
        PuffeRL->>PuffeRL: Get ego_ids from info[-1]
        PuffeRL->>PuffeRL: Calculate num_agents_per_env = total_agents // batch_size
        PuffeRL->>PuffeRL: Count num_ego_per_env (ego IDs in first environment)
        PuffeRL->>PuffeRL: Convert to local_ego_ids using modulo
        PuffeRL->>PuffeRL: Reshape to (batch_size, num_agents_per_env, ...)
        PuffeRL->>PuffeRL: Index with local_ego_ids across all batches
        PuffeRL->>PuffeRL: Reshape back to (batch_size * num_ego_per_env, ...)
    else batch_size == 1
        PuffeRL->>PuffeRL: Use ego_ids directly to filter observations
    end
    
    PuffeRL->>PuffeRL: Continue with ego-only observations
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines +308 to +309
num_ego_per_env = sum(1 for eid in ego_ids if eid < num_agents_per_env)
local_ego_ids = [eid % num_agents_per_env for eid in ego_ids[:num_ego_per_env]]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assumes all environments have the same number of ego agents at identical local indices. If env 0 has ego at local index [0, 1] and env 1 has ego at local index [2, 3], this will fail and only select indices [0, 1] from both environments. verify this assumption holds in your use case, or consider using the pre-computed local_ego_ids from drive.py instead

Prompt To Fix With AI
This is a comment left during a code review.
Path: pufferlib/pufferl.py
Line: 308:309

Comment:
assumes all environments have the same number of ego agents at identical local indices. If env 0 has ego at local index [0, 1] and env 1 has ego at local index [2, 3], this will fail and only select indices [0, 1] from both environments. verify this assumption holds in your use case, or consider using the pre-computed `local_ego_ids` from `drive.py` instead

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants