Description
Question description
ais_bench --models vllm_api_general_stream --datasets gsm8k_gen_0_shot_cot_str_perf --debug --summarizer default_perf --num-prompts 1
from ais_bench.benchmark.models import VLLMCustomAPIStream

models = [
    dict(
        attr="service",
        type=VLLMCustomAPIStream,
        abbr='vllm-api-general-stream',
        path="/data/Qwen25-72B",
        model="qwen3",
        request_rate = 0,
        retry = 2,
        host_ip = "100.100.135.161",
        host_port = 8000,
        max_out_len = 1024,
        batch_size=1,
        trust_remote_code=False,
        generation_kwargs = dict(
            temperature = 0,
            ignore_eos = True,
            #top_k = 10,
            #top_p = 0.95,
            #seed = None,
            #repetition_penalty = 1.03,
        )
    )
]
02/11 17:53:04 - AISBench - ERROR - /data/benchmark/ais_bench/benchmark/clients/base_client.py - raise_error - 35 - [AisBenchClientException] Request failed: HTTP status 400. Server response: {"error":{"message":"async scheduling with spec decoding doesn't yet support penalties, bad words or structured outputs in sampling parameters.","type":"BadRequestError","param":null,"code":400}}
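The server message names three kinds of sampling parameters: penalties, bad words, and structured outputs. As a debugging aid, the sketch below scans a request payload for fields in those categories. The helper `find_flagged_params` is hypothetical and is not part of ais_bench or vLLM. The field names are assumptions based on vLLM's commonly used sampling parameters and may differ between versions, and treating penalties at their neutral values (1.0 for `repetition_penalty`, 0.0 for the others) as harmless is also an assumption.

```python
# Hypothetical helper, not part of ais_bench or vLLM.
# Penalty fields and the values assumed to mean "no penalty".
NEUTRAL_PENALTIES = {
    "repetition_penalty": 1.0,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
}

# Assumed names for bad-words and structured-output fields;
# the exact set depends on the vLLM version in use.
OTHER_FLAGGED_FIELDS = (
    "bad_words",
    "guided_json",
    "guided_regex",
    "guided_choice",
    "guided_grammar",
    "structured_outputs",
    "response_format",
)


def find_flagged_params(payload):
    """Return the payload fields that fall into the categories named in the error."""
    flagged = {}
    for name, neutral in NEUTRAL_PENALTIES.items():
        value = payload.get(name)
        # A penalty is reported only when it is set and differs from its neutral value.
        if value is not None and value != neutral:
            flagged[name] = value
    for name in OTHER_FLAGGED_FIELDS:
        value = payload.get(name)
        # Empty lists, empty dicts and None are treated as "not set".
        if value:
            flagged[name] = value
    return flagged


# Example payload: the generation_kwargs above with the commented-out
# repetition_penalty enabled.
example_payload = {"temperature": 0, "ignore_eos": True, "repetition_penalty": 1.03}
print(find_flagged_params(example_payload))
```

Running the helper on the request body that is actually sent to the server (for example, one captured from the `--debug` output) may show whether a default value outside `generation_kwargs` is responsible for the 400 response.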
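Pre-submission checks
- I have read and understood the Quick Start in the homepage documentation, and it does not answer my question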