Mismatched number of values to unpack when running TarGF.

When evaluating (`--mode test_policy`) the Ball Rearrangement (`Circle_Score`) task of TarGF, there is an error in function `collect_trajectories_ball()` when updating the state of the environment.

It occurs in file TarGF/runners/eval_policy.py, line 132:
```python3
...
    130 |        while not done:
    131 |            action = eval_policy.select_action(np.array(state), sample=False)
--> 132 |            new_state, _, done, infos = eval_env.step(action)
...
```

The Python traceback is:
```
Traceback (most recent call last):
  File "/data/josslei/github/TarGF/main.py", line 47, in <module>
    app.run(main)
  File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/data/josslei/github/TarGF/main.py", line 41, in main
    test_policy(FLAGS.config, FLAGS.workdir)
  File "/data/josslei/github/TarGF/runners/eval_policy.py", line 62, in test_policy
    trajs_result = collect_trajectories(eval_env, policy, configs.test_num)
  File "/data/josslei/github/TarGF/runners/eval_policy.py", line 132, in collect_trajectories_ball
    new_state, _, done, infos = eval_env.step(action)
  File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/gym/wrappers/time_limit.py", line 50, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)
```
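
The last frame suggests that the installed gym's `TimeLimit` wrapper unpacks five values from `step()` (`observation, reward, terminated, truncated, info`), while the wrapped environment returns only four. A minimal sketch of the same mismatch without gym installed; `FourValueEnv` and `FiveValueWrapper` are hypothetical stand-ins, not classes from gym or EbOR:

```python
class FourValueEnv:
    """Hypothetical stand-in for an env whose step() returns 4 values."""

    def step(self, action):
        # (state, reward, done, info), the older 4-value convention
        return [0.0], 0.0, False, {}


class FiveValueWrapper:
    """Hypothetical stand-in for a wrapper that unpacks 5 values from step()."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        # Same unpacking pattern as the last frame of the traceback
        observation, reward, terminated, truncated, info = self.env.step(action)
        return observation, reward, terminated, truncated, info


def reproduce():
    """Return the error message raised by the mismatched unpacking, or None."""
    try:
        FiveValueWrapper(FourValueEnv()).step(None)
    except ValueError as err:
        return str(err)
    return None


print(reproduce())
```
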

I think the problem might come from this file in EbOR, EbOR/ebor/Envs/rearrangement.py, line 191:
```python3
...
    187 |        # update prev_state
    188 |        self.prev_state = cur_state
    189 |        if self.auto_flatten:
    190 |            cur_state = self.flatten_states([cur_state])[0]
--> 191 |        return cur_state, r, is_done, infos
...
```

If I change line 191 to this, it'll work:
```python3
...
191 |         return cur_state, r, is_done, None, infos
...
```
I also changed the corresponding line in TarGF/runners/eval_policy.py:
```python3
...
132 |            new_state, _, done, _, infos = eval_env.step(action)
...
```

I don't know whether this function (`step()`) is used elsewhere, so I'm submitting this issue to ask whether this change is necessary.
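
If other callers of `step()` exist, one option is for them to tolerate both return shapes. A minimal sketch, assuming callers only ever see 4- or 5-value tuples; `normalize_step` is a hypothetical helper, not part of TarGF, EbOR, or gym. It fills the missing `truncated` slot with `False` rather than `None`, on the assumption that the five-value convention treats it as a boolean. Note that this only helps on the caller side; it does not change the unpacking inside `TimeLimit`, which still needs the environment itself to return five values.

```python
def normalize_step(result):
    """Return (state, reward, terminated, truncated, info) from a 4- or 5-value step() result.

    Hypothetical helper: a 4-value result is treated as (state, reward, done, info)
    and padded with truncated=False.
    """
    if len(result) == 5:
        return tuple(result)
    if len(result) == 4:
        state, reward, done, info = result
        return state, reward, done, False, info
    raise ValueError(f"unexpected step() result length: {len(result)}")


# Usage with plain tuples standing in for eval_env.step(action):
old_style = ([0.0], 1.0, True, {"k": 1})
new_style = ([0.0], 1.0, False, True, {"k": 1})

state, _, terminated, truncated, infos = normalize_step(old_style)
done = terminated or truncated  # the episode ends in either case
```
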
