Mismatched number of values to unpack when running TarGF. #1

@josslei

When evaluating (--mode test_policy) the Ball Rearrangement (Circle_Score) task of TarGF, there is an error in function collect_trajectories_ball() when updating the state of the environment.

It occurs in file TarGF/runners/eval_policy.py, line 132:

...
    130 |        while not done:
    131 |            action = eval_policy.select_action(np.array(state), sample=False)
--> 132 |            new_state, _, done, infos = eval_env.step(action)
...

Python reports the following traceback:

Traceback (most recent call last):
  File "/data/josslei/github/TarGF/main.py", line 47, in <module>
    app.run(main)
  File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/data/josslei/github/TarGF/main.py", line 41, in main
    test_policy(FLAGS.config, FLAGS.workdir)
  File "/data/josslei/github/TarGF/runners/eval_policy.py", line 62, in test_policy
    trajs_result = collect_trajectories(eval_env, policy, configs.test_num)
  File "/data/josslei/github/TarGF/runners/eval_policy.py", line 132, in collect_trajectories_ball
    new_state, _, done, infos = eval_env.step(action)
  File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/gym/wrappers/time_limit.py", line 50, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)
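
The mismatch can be reproduced without gym or EbOR. The sketch below is only an illustration: `old_style_step` and `wrapper_step` are hypothetical stand-ins. The first returns four values like the `step()` in rearrangement.py, and the second unpacks five values the way time_limit.py line 50 does in the traceback.

```python
def old_style_step(action):
    # Hypothetical stand-in for an environment whose step() returns
    # four values: (state, reward, done, infos).
    return [0.0, 0.0], 0.0, False, {}


def wrapper_step(action):
    # Mimics the five-value unpacking shown in the traceback
    # (gym/wrappers/time_limit.py, line 50).
    observation, reward, terminated, truncated, info = old_style_step(action)
    return observation, reward, terminated, truncated, info


try:
    wrapper_step(0)
except ValueError as err:
    message = str(err)
    print(message)
```

So the failure happens inside the wrapper, before the result ever reaches line 132 of eval_policy.py.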

The problem seems to come from EbOR, in the file EbOR/ebor/Envs/rearrangement.py, line 191, where step() returns four values instead of the five that the gym wrapper expects:

...
    187 |        # update prev_state
    188 |        self.prev_state = cur_state
    189 |        if self.auto_flatten:
    190 |            cur_state = self.flatten_states([cur_state])[0]
--> 191 |        return cur_state, r, is_done, infos
...

If I change line 191 to the following, it works:

...
191 |         return cur_state, r, is_done, None, infos
...

I also changed the corresponding line in TarGF/runners/eval_policy.py:

...
132 |            new_state, _, done, _, infos = eval_env.step(action)
...
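
An alternative that leaves rearrangement.py untouched would be a small adapter placed between the environment and the gym wrapper. This is a rough sketch under my own assumptions, not code from TarGF or EbOR: `FiveValueStepAdapter` and `_OldStyleEnv` are hypothetical names, and the adapter assumes the wrapped `step()` returns exactly `(state, reward, done, infos)`. It fills the extra slot with `False` instead of `None` so that the value stays a boolean.

```python
class FiveValueStepAdapter:
    """Hypothetical adapter: converts a 4-value step() result to 5 values."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        # Assumes the wrapped env returns (state, reward, done, infos).
        state, reward, done, infos = self.env.step(action)
        # Insert truncated=False; the env itself never reports truncation.
        return state, reward, done, False, infos


class _OldStyleEnv:
    """Dummy environment used only to exercise the adapter."""

    def step(self, action):
        return [0.0, 0.0], 1.0, True, {"note": "dummy"}


adapted = FiveValueStepAdapter(_OldStyleEnv())
new_state, _, terminated, truncated, infos = adapted.step(0)
# Callers that only need one flag can combine the two.
done = terminated or truncated
```

The caller in eval_policy.py would still need to unpack five values, as in the changed line 132 above.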

I don't know whether step() is used elsewhere with the four-value return, so I'm submitting this issue to ask whether this change is necessary.
