Unable to reproduce results of ASearcher-Web-QwQ on GAIA. #22

Thank you for your excellent work. I am having trouble reproducing the results of ASearcher-Web-QwQ on GAIA: I only reach 41.7% accuracy, which is 11.1 points below the 52.8% reported in the paper.
Here is my eval script:
```
MODEL_PATH=ASearcher-Web-QwQ_inclusionAI
DATA_NAMES=GAIA
AGENT_TYPE=asearcher-reasoning
PROMPT_TYPE=asearcher-reasoning
SEARCH_CLIENT_TYPE=async-web-search-access

python3 search_eval_async.py \
    --data_names ${DATA_NAMES} \
    --model_name_or_path ${MODEL_PATH} \
    --output_dir results \
    --data_dir ${DATA_DIR} \
    --prompt_type ${PROMPT_TYPE} \
    --agent-type ${AGENT_TYPE} \
    --search-client-type ${SEARCH_CLIENT_TYPE} \
    --tensor_parallel_size 4 \
    --temperature 0.6 \
    --parallel-mode seed \
    --seed 1 \
    --aggregate-only \
    --use-jina \
    --llm_as_judge \
    --use-openai True \
    --pass-at-k 1
```

Evaluation results:

<img width="314" height="180" alt="Image" src="https://github.com/user-attachments/assets/837b2b64-7b81-452b-8212-c6506431c8db" />
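
To put the gap in context, here is a rough sketch of how much a single-seed accuracy could move from sampling noise alone. It is only an illustration: the helper names `acc_gap` and `acc_stderr` are made up for this example, and `N=100` is a placeholder rather than the real number of GAIA questions in the eval set, so it needs to be replaced with the actual count. It treats each question as an independent pass/fail trial (a binomial approximation), which may not match how the paper aggregates its numbers.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the ASearcher repo).

# Difference between two accuracies, in percentage points.
# Usage: acc_gap <reported_percent> <observed_percent>
acc_gap() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a - b }'
}

# Binomial standard error of an accuracy, in percentage points.
# Usage: acc_stderr <accuracy_percent> <num_questions>
acc_stderr() {
    awk -v acc="$1" -v n="$2" 'BEGIN {
        p = acc / 100
        printf "%.1f\n", 100 * sqrt(p * (1 - p) / n)
    }'
}

# ASSUMPTION: placeholder question count; replace with the real size of the eval set.
N=100

echo "gap to paper: $(acc_gap 52.8 41.7) points"
echo "stderr of one run at 41.7% on ${N} questions: $(acc_stderr 41.7 "${N}") points"
```

If the gap is several times larger than the standard error, seed noise alone is unlikely to explain it; if they are comparable, averaging over more seeds would be worth trying first.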

The JINA and SERPER APIs are configured correctly. Thank you for your help!
