Hi, thanks for such an all-around repo for working with 3DSG planning!
I would like to reproduce the benchmarking results under the benchmark folder of your repo to make sure everything runs properly before testing my own planners. However, in my tests the planners behave quite differently from what is reported.
As of 07/20/2023, I ran every planner available in pddlgym_planners/__init__.py on the PDDL domain taskographyv2tiny1 with the command python scripts/benchmark/plan.py --domain-name $DOMAIN_NAME --planner $PLANNER. The results are as follows:
FF: error while running
gcc -o ff main.o memory.o output.o parse.o inst_pre.o inst_easy.o inst_hard.o inst_final.o orderings.o relax.o search.o scan-fct_pddl.tab.o scan-ops_pddl.tab.o -Wall -g -std=gnu99 -O6 -lm
/usr/bin/ld: search.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/search.c:110: multiple definition of `lcurrent_goals'; relax.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/relax.c:111: first defined here
/usr/bin/ld: scan-fct_pddl.tab.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/lex-fct_pddl.l:9: multiple definition of `gbracket_count'; main.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/main.c:147: first defined here
collect2: error: ld returned 1 exit status
make: *** [makefile:74: ff] Error 1
FF-X: the same error as FF
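A possible lead on the FF and FF-X build error, offered as a guess rather than a diagnosis: "multiple definition" link errors of this kind commonly appear when older C code that defines globals in shared headers is compiled with GCC 10 or newer, which defaults to -fno-common; adding -fcommon to the compiler flags is the usual workaround. This has not been confirmed for the FF-v2.3 sources. The sketch below only extracts the duplicated symbol names from the linker output. LD_OUTPUT is an abridged copy of the log above (directory prefixes replaced with "..."), and duplicated_symbols is a hypothetical helper name.

```python
import re

# Abridged copy of the linker output from the failed FF build.
# Directory prefixes are replaced with "..." to keep the lines short.
LD_OUTPUT = """\
/usr/bin/ld: search.o:.../FF-v2.3/search.c:110: multiple definition of `lcurrent_goals'; relax.o:.../FF-v2.3/relax.c:111: first defined here
/usr/bin/ld: scan-fct_pddl.tab.o:.../FF-v2.3/lex-fct_pddl.l:9: multiple definition of `gbracket_count'; main.o:.../FF-v2.3/main.c:147: first defined here
"""

# GNU ld quotes symbols as `name'; the opening quote is optional here
# because it is sometimes lost when logs are copied around.
PATTERN = re.compile(r"multiple definition of [`']?(\w+)'")


def duplicated_symbols(ld_output):
    """Return the sorted, de-duplicated symbol names reported by ld."""
    return sorted(set(PATTERN.findall(ld_output)))


print(duplicated_symbols(LD_OUTPUT))
# -> ['gbracket_count', 'lcurrent_goals']
```

If the list is long and every entry is a plain global variable, that would be consistent with the -fno-common explanation, though it would not prove it.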
FD-lama-first: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
Cerberus-seq-sat: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
Cerberus-seq-agl: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
DecStar-agl-decoupled: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
lapkt-bfws: slightly different behavior from benchmark/taskographyv2tiny1_bfws. My results:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [03:21<00:00, 5.04s/it]
{'failure_rate': 0.0,
'num_node_expansions': 468.48387096774195,
'num_node_expansions_std': 192.6469059835003,
'plan_length': 14.709677419354838,
'plan_length_std': 3.828530825661262,
'search_time': 0.4536315483870968,
'search_time_std': 0.3696494008728636,
'success_rate': 0.775,
'timeout_rate': 0.225,
'total_time': 0.4536315483870968,
'total_time_std': 0.3696494008728636}
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 55/55 [05:57<00:00, 6.51s/it]
{'failure_rate': 0.0,
'num_node_expansions': 573.3225806451613,
'num_node_expansions_std': 338.3147405651472,
'plan_length': 15.32258064516129,
'plan_length_std': 4.394917128465223,
'search_time': 0.5754497419354839,
'search_time_std': 0.8765903350261305,
'success_rate': 0.5636363636363636,
'timeout_rate': 0.43636363636363634,
'total_time': 0.5754497419354839,
'total_time_std': 0.8765903350261305}
reported in benchmark/taskographyv2tiny1_bfws/taskographyv2tiny1_bfws_test.json:
{
"failure_rate": 0.0,
"num_node_expansions": 609.6279069767442,
"num_node_expansions_std": 339.64208406455214,
"plan_length": 15.55813953488372,
"plan_length_std": 4.15570398469826,
"search_time": 0.8969197023255813,
"search_time_std": 1.3382104019851668,
"success_rate": 0.7818181818181819,
"timeout_rate": 0.21818181818181817,
"total_time": 0.8969197023255813,
"total_time_std": 1.3382104019851668
}
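To make the gap easier to read, the rates above can be converted back into problem counts. This is a minimal sketch: the run sizes 40 and 55 come from the progress bars above, while the assumption that the reported test JSON was also computed over 55 problems is mine (it is consistent with the rate 0.7818... = 43/55, but the file does not state it). The names RUNS and to_counts are illustrative only.

```python
# (number of problems, success_rate, timeout_rate) for each run.
RUNS = {
    "mine, 40 problems": (40, 0.775, 0.225),
    "mine, 55 problems": (55, 0.5636363636363636, 0.43636363636363634),
    # Assumption: the reported test JSON also covers 55 problems.
    "reported test JSON": (55, 0.7818181818181819, 0.21818181818181817),
}


def to_counts(num_problems, success_rate, timeout_rate):
    """Convert rates into (solved, timed_out) problem counts."""
    solved = round(num_problems * success_rate)
    timed_out = round(num_problems * timeout_rate)
    return solved, timed_out


COUNTS = {name: to_counts(*values) for name, values in RUNS.items()}

for name, (solved, timed_out) in COUNTS.items():
    print(f"{name}: {solved} solved, {timed_out} timed out")
# mine, 40 problems: 31 solved, 9 timed out
# mine, 55 problems: 31 solved, 24 timed out
# reported test JSON: 43 solved, 12 timed out
```

Under that assumption, my 55-problem run solves 12 fewer problems than the reported one, and all 12 show up as timeouts rather than failures.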
FD-seq-opt-lmcut: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
Delfi: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
DecStar-opt-decoupled: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
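For anyone who wants to repeat the sweep, the runs above can be generated with a short script. This is only a sketch: the planner names are the ones listed in this report and the command line is the one quoted at the top, but PLANNERS and build_command are hypothetical names, and the script prints the commands instead of executing them.

```python
# Planner names as listed in this report.
PLANNERS = [
    "FF",
    "FF-X",
    "FD-lama-first",
    "Cerberus-seq-sat",
    "Cerberus-seq-agl",
    "DecStar-agl-decoupled",
    "lapkt-bfws",
    "FD-seq-opt-lmcut",
    "Delfi",
    "DecStar-opt-decoupled",
]


def build_command(domain_name, planner):
    """Return the benchmark command for one planner as an argument list."""
    return [
        "python",
        "scripts/benchmark/plan.py",
        "--domain-name",
        domain_name,
        "--planner",
        planner,
    ]


# Print one command per planner; run from the repository root.
for planner in PLANNERS:
    print(" ".join(build_command("taskographyv2tiny1", planner)))
```

Each argument list can be passed to subprocess.run unchanged if the commands should be executed rather than printed.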
I followed the installation instructions at https://github.com/taskography/taskography-api#installation, with only a few changes to fix some errors:
0. Ubuntu 22.04.
1. Create an empty conda env with python=3.7.
2. Add a comma at the end of line 26 of taskography-api/setup.py (commit bcb47fc) to separate the two lines.
3. Run pip install -e . and pip install -r requirements.txt.
4. Downgrade importlib-metadata from 6.7.0 to 4.12.0 to avoid the error 'EntryPoints' object has no attribute 'get'. Source: https://stackoverflow.com/questions/73929564/entrypoints-object-has-no-attribute-get-digital-ocean
5. Move from __future__ import annotations to the first line to avoid the error "from __future__ imports must occur at the beginning of the file". Source: https://stackoverflow.com/questions/38688504/from-future-imports-must-occur-at-the-beginning-of-the-file-what-defines
6. Run scripts/validate/loader.py and scripts/validate/taskography_env.py; both pass.
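On the importlib-metadata downgrade: the 'EntryPoints' object has no attribute 'get' error is consistent with importlib-metadata 5.0 and later having removed the dict-style access on the result of entry_points(). Which package in the dependency chain makes that call is not identified here. As an illustration only, a caller that works with both the old and the new interface could look like the sketch below; entry_points_for_group is a hypothetical helper, not something taken from this repo.

```python
try:
    # Standard library location (Python 3.8 and newer).
    from importlib.metadata import entry_points
except ImportError:
    # Backport package, needed on Python 3.7.
    from importlib_metadata import entry_points


def entry_points_for_group(group):
    """Return the entry points of one group under either interface."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer interface: selection by keyword.
        return list(eps.select(group=group))
    # Older interface: dict mapping group name to entry points.
    return list(eps.get(group, []))


print(len(entry_points_for_group("console_scripts")))
```

Pinning importlib-metadata to 4.12.0, as done in the steps above, avoids the error without touching any calling code.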
I'm happy to provide more details if needed. I would highly appreciate any help, since a solid benchmark is a prerequisite for any future research. Thanks in advance!