[Bug] Accuracy results on the mmlu_pro dataset are all 0 #125

@zhangguinan

Description

Operating system and version

Ubuntu 22.04.5 LTS

Python environment where the tool is installed

A Python virtual environment created with anaconda/miniconda

Python version

3.11

AISBench tool version

3.11.10

AISBench command executed

ais_bench --models vllm_api_general_chat --datasets mmlu_pro_gen_5_shot_str --debug

Contents of the model configuration file or custom configuration file

root@glmTester:/home# cat /home/benchmark/ais_bench/benchmark/configs/models/vllm_api/vllm_api_general_chat.py
from ais_bench.benchmark.models import VLLMCustomAPIChat
from ais_bench.benchmark.utils.model_postprocessors import extract_non_reasoning_content

models = [
    dict(
        attr="service",
        type=VLLMCustomAPIChat,
        abbr='vllm-api-general-chat',
        path="/data/Qwen3-32B",
        model="qwen3",
        request_rate = 0,
        retry = 2,
        host_ip = "100.100.*.**",
        host_port = 8011,
        max_out_len = 8000,
        batch_size=16,
        trust_remote_code=False,
        generation_kwargs = dict(
            temperature = 0,
            ignore_eos = False
        )
    )
]

Expected behavior

The evaluation produces normal results

Actual behavior

All results are 0
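
One way to narrow this down is to look at the raw predictions and check whether an answer letter can be recovered from them at all. The config above imports `extract_non_reasoning_content` but never references it, so one possibility (an assumption, not a confirmed cause) is that the served model emits `<think>...</think>` reasoning text that confuses answer extraction, or that generation is cut off at `max_out_len` before a final answer appears. The sketch below is a standalone check: `strip_reasoning` and `extract_choice` are hypothetical helpers, not AISBench functions, and the "answer is (X)" pattern is an assumed output format.

```python
import re

# Hypothetical helpers for inspecting raw predictions; NOT part of AISBench.
# Assumes reasoning is wrapped in <think>...</think> and that the final answer
# is phrased like "The answer is (X)" with X in A-J (MMLU-Pro has up to 10 options).
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
ANSWER = re.compile(r"(?i:answer is)\s*\(?([A-J])\b")


def strip_reasoning(text: str) -> str:
    """Remove complete <think> blocks, plus any unterminated one at the end."""
    text = THINK_BLOCK.sub("", text)
    # A leftover <think> means the output was truncated mid-reasoning.
    cut = text.find("<think>")
    if cut != -1:
        text = text[:cut]
    return text.strip()


def extract_choice(text: str):
    """Return the last answer letter found outside the reasoning, or None."""
    matches = ANSWER.findall(strip_reasoning(text))
    return matches[-1] if matches else None


if __name__ == "__main__":
    complete = "<think>maybe the answer is (A)</think>The answer is (C)."
    truncated = "<think>still thinking, the answer is (A)"
    print(extract_choice(complete))
    print(extract_choice(truncated))
```

If most raw predictions behave like the truncated case (no answer outside the reasoning), every sample would be scored as wrong, which would be consistent with an accuracy of 0.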

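Pre-submission checks

  • I have read and understood the Quick Start in the homepage documentation, and it did not solve the problem
  • I have searched the FAQ and found no duplicate question
  • I have searched the existing issues and found no duplicate
  • I have updated to the latest version and the problem persists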
