Thank you for your excellent work. I have encountered some difficulties reproducing the results of ASearcher-Web-QwQ on GAIA. I am only able to achieve 41.7% accuracy, which is significantly lower than the 52.8% reported in the paper.
Here is my eval script:
MODEL_PATH=ASearcher-Web-QwQ_inclusionAI
DATA_NAMES=GAIA
AGENT_TYPE=asearcher-reasoning
PROMPT_TYPE=asearcher-reasoning
SEARCH_CLIENT_TYPE=async-web-search-access
python3 search_eval_async.py \
    --data_names ${DATA_NAMES} \
    --model_name_or_path ${MODEL_PATH} \
    --output_dir results \
    --data_dir ${DATA_DIR} \
    --prompt_type ${PROMPT_TYPE} \
    --agent-type ${AGENT_TYPE} \
    --search-client-type ${SEARCH_CLIENT_TYPE} \
    --tensor_parallel_size 4 \
    --temperature 0.6 \
    --parallel-mode seed \
    --seed 1 \
    --aggregate-only \
    --use-jina \
    --llm_as_judge \
    --use-openai True \
    --pass-at-k 1
evaluation results:
The JINA and SERPER APIs are configured correctly. Thank you for your help!
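
To help rule out a configuration problem, here is a minimal sketch of a pre-flight check that could run before the eval command. It only verifies that each variable the command relies on is non-empty. `DATA_DIR` is included because the script above uses it without defining it. The helper name `check_vars` is hypothetical, and `SERPER_API_KEY` and `JINA_API_KEY` are assumed variable names for the API credentials, not names confirmed by the repository.

```shell
# Pre-flight check (sketch): report every required variable that is unset or empty.
# check_vars is a hypothetical helper, not part of the ASearcher repository.
check_vars() {
    cv_missing=0
    for cv_name in "$@"; do
        # Look up the value of the variable whose name is stored in cv_name.
        eval "cv_value=\${${cv_name}:-}"
        if [ -z "${cv_value}" ]; then
            echo "missing: ${cv_name}" >&2
            cv_missing=1
        fi
    done
    # Returns 0 when everything is set, 1 when at least one variable is missing.
    return "${cv_missing}"
}

# Example call before launching the evaluation.
# SERPER_API_KEY and JINA_API_KEY are assumed names; adjust them to your setup.
# check_vars MODEL_PATH DATA_NAMES DATA_DIR AGENT_TYPE PROMPT_TYPE \
#     SEARCH_CLIENT_TYPE SERPER_API_KEY JINA_API_KEY || exit 1
```

The example call is commented out so the snippet can be sourced without side effects. Uncommenting it makes the script stop before the evaluation starts if anything is missing.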