local_ego_id changes for bsz > 1 #18

m2kulkarni · 2026-01-31T11:16:40Z

No description provided.

greptile-apps · 2026-01-31T11:21:03Z

Greptile Overview

Greptile Summary

This PR fixes a bug in ego agent indexing when using batch_size > 1 in population play mode. The old code incorrectly used global ego_ids to index into the reshaped observation tensor, which would cause IndexError or select wrong agents. The new code converts global indices to local per-environment indices before indexing.

Key Changes:

Adds logic to count how many ego agents belong to the first environment
Converts global ego IDs to local indices using modulo operation
Uses these local indices to correctly select ego agents from all environments in the batch

Critical Assumption:
The fix assumes all environments in a batch have (1) the same number of ego agents and (2) ego agents at identical local indices. For example, if batch has 2 environments with 3 agents each, and env 0 has ego at local index [1] and env 1 has ego at local index [2], the code will incorrectly select only index [1] from both environments.

The .gitignore change simply adds TODO.md for local development tracking.

Confidence Score: 3/5

Safe to merge if ego agents are uniformly structured across environments, but may fail in heterogeneous scenarios
The fix correctly addresses the original bug where global indices were used directly on reshaped tensors. However, it makes a strong assumption about uniform ego agent structure across environments that may not always hold. The code should work correctly for the common case where all environments have identical agent layouts, but could silently produce incorrect results if environments have different ego agent configurations.
pufferlib/pufferl.py requires attention - verify the uniform ego structure assumption holds for your use case

Important Files Changed

Filename	Overview
.gitignore	Added `TODO.md` to gitignore for local task tracking
pufferlib/pufferl.py	Fixed ego_ids indexing for batch_size > 1 by converting global indices to local per-environment indices, but assumes uniform ego structure across environments

Sequence Diagram

sequenceDiagram
    participant VecEnv as Vector Environment
    participant PuffeRL as PuffeRL.evaluate()
    participant Drive as Drive Environment

    Drive->>Drive: Initialize ego_ids (global, flattened)
    Drive->>VecEnv: Return observations with info[-1] = ego_ids
    VecEnv->>PuffeRL: recv() returns (o, r, d, t, info, env_id, mask)
    
    alt batch_size > 1
        PuffeRL->>PuffeRL: Get ego_ids from info[-1]
        PuffeRL->>PuffeRL: Calculate num_agents_per_env = total_agents // batch_size
        PuffeRL->>PuffeRL: Count num_ego_per_env (ego IDs in first environment)
        PuffeRL->>PuffeRL: Convert to local_ego_ids using modulo
        PuffeRL->>PuffeRL: Reshape to (batch_size, num_agents_per_env, ...)
        PuffeRL->>PuffeRL: Index with local_ego_ids across all batches
        PuffeRL->>PuffeRL: Reshape back to (batch_size * num_ego_per_env, ...)
    else batch_size == 1
        PuffeRL->>PuffeRL: Use ego_ids directly to filter observations
    end
    
    PuffeRL->>PuffeRL: Continue with ego-only observations

greptile-apps

_{2 files reviewed, 1 comment}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-01-31T11:21:06Z

pufferlib/pufferl.py

+                    num_ego_per_env = sum(1 for eid in ego_ids if eid < num_agents_per_env)
+                    local_ego_ids = [eid % num_agents_per_env for eid in ego_ids[:num_ego_per_env]]


assumes all environments have the same number of ego agents at identical local indices. If env 0 has ego at local index [0, 1] and env 1 has ego at local index [2, 3], this will fail and only select indices [0, 1] from both environments. verify this assumption holds in your use case, or consider using the pre-computed local_ego_ids from drive.py instead

Prompt To Fix With AI

This is a comment left during a code review. Path: pufferlib/pufferl.py Line: 308:309 Comment: assumes all environments have the same number of ego agents at identical local indices. If env 0 has ego at local index [0, 1] and env 1 has ego at local index [2, 3], this will fail and only select indices [0, 1] from both environments. verify this assumption holds in your use case, or consider using the pre-computed `local_ego_ids` from `drive.py` instead How can I resolve this? If you propose a fix, please make it concise.

local_ego_id changes for bsz > 1

95569d9

greptile-apps bot reviewed Jan 31, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

local_ego_id changes for bsz > 1 #18

local_ego_id changes for bsz > 1 #18

Uh oh!

m2kulkarni commented Jan 31, 2026

Uh oh!

greptile-apps bot commented Jan 31, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Jan 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		num_ego_per_env = sum(1 for eid in ego_ids if eid < num_agents_per_env)
		local_ego_ids = [eid % num_agents_per_env for eid in ego_ids[:num_ego_per_env]]

local_ego_id changes for bsz > 1 #18

Are you sure you want to change the base?

local_ego_id changes for bsz > 1 #18

Uh oh!

Conversation

m2kulkarni commented Jan 31, 2026

Uh oh!

greptile-apps bot commented Jan 31, 2026

Greptile Overview

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 31, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants