Open
Description
When evaluating (--mode test_policy) the Ball Rearrangement (Circle_Score) task of TarGF, the function collect_trajectories_ball() raises a ValueError when it steps the environment.
It occurs in file TarGF/runners/eval_policy.py, line 132:
...
130 | while not done:
131 | action = eval_policy.select_action(np.array(state), sample=False)
--> 132 | new_state, _, done, infos = eval_env.step(action)
...
The Python error output is:
Traceback (most recent call last):
File "/data/josslei/github/TarGF/main.py", line 47, in <module>
app.run(main)
File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/data/josslei/github/TarGF/main.py", line 41, in main
test_policy(FLAGS.config, FLAGS.workdir)
File "/data/josslei/github/TarGF/runners/eval_policy.py", line 62, in test_policy
trajs_result = collect_trajectories(eval_env, policy, configs.test_num)
File "/data/josslei/github/TarGF/runners/eval_policy.py", line 132, in collect_trajectories_ball
new_state, _, done, infos = eval_env.step(action)
File "/home/josslei/Miniconda3/envs/targf/lib/python3.9/site-packages/gym/wrappers/time_limit.py", line 50, in step
observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)
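The last frame of the traceback suggests the cause: the TimeLimit wrapper in the installed gym unpacks five values from the wrapped environment's step(), while the environment returns four. Below is a minimal sketch that reproduces the same mismatch without gym. The classes OldStyleEnv and FiveTupleWrapper are hypothetical stand-ins, not code from TarGF, EbOR, or gym.

```python
class OldStyleEnv:
    """Hypothetical stand-in for an env whose step() returns a 4-tuple."""

    def step(self, action):
        # (observation, reward, done, info)
        return [0.0], 0.0, False, {}


class FiveTupleWrapper:
    """Hypothetical stand-in for a wrapper that expects a 5-tuple."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        # Same unpacking pattern as the failing line in the traceback.
        observation, reward, terminated, truncated, info = self.env.step(action)
        return observation, reward, terminated, truncated, info


def reproduce():
    """Return the error message raised by the mismatched unpacking."""
    try:
        FiveTupleWrapper(OldStyleEnv()).step(None)
    except ValueError as exc:
        return str(exc)
    return None


print(reproduce())
```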
The problem seems to come from EbOR, in the file EbOR/ebor/Envs/rearrangement.py, line 191:
...
187 | # update prev_state
188 | self.prev_state = cur_state
189 | if self.auto_flatten:
190 | cur_state = self.flatten_states([cur_state])[0]
--> 191 | return cur_state, r, is_done, infos
...
If I change line 191 to the following, it works:
...
191 | return cur_state, r, is_done, None, infos
...
I also changed the corresponding line in TarGF/runners/eval_policy.py:
...
132 | new_state, _, done, _, infos = eval_env.step(action)
...
I don't know whether step() is used elsewhere, so I'm submitting this issue to ask whether this change is necessary.
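If other callers still depend on the 4-tuple return of step(), one possible alternative to editing rearrangement.py is a thin adapter placed between the environment and any wrapper that expects five values. The sketch below is only an illustration: FourToFiveAdapter and DummyOldEnv are hypothetical names, and where such an adapter would be inserted depends on how the environment is constructed, which is not shown in this issue. It reports truncated as False rather than None, on the assumption that downstream code treats it as a boolean.

```python
class FourToFiveAdapter:
    """Hypothetical adapter: exposes a 5-tuple step() over a 4-tuple env."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        result = self.env.step(action)
        if len(result) == 5:
            # Already (obs, reward, terminated, truncated, info).
            return tuple(result)
        obs, reward, done, info = result
        # Treat the old `done` flag as `terminated`; report no truncation.
        return obs, reward, done, False, info

    def __getattr__(self, name):
        # Delegate everything else (reset, render, ...) to the wrapped env.
        return getattr(self.env, name)


class DummyOldEnv:
    """Hypothetical env returning the old 4-tuple, for demonstration only."""

    def step(self, action):
        return [0.0, 1.0], 1.0, True, {"step": 1}


adapted = FourToFiveAdapter(DummyOldEnv())
new_state, _, terminated, truncated, infos = adapted.step(None)
done = terminated or truncated
```

With this approach the caller in eval_policy.py would still need to unpack five values, as in the changed line 132 above.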