We’re attempting to reproduce the simulation results, and when comparing against vLLM 0.9.1 benchmarks we observe that the P50 latency differs by roughly 700%. Could you clarify whether vLLM v1 is supported by Vidur? If not, which framework and version does Vidur use to reproduce the published results?
Specifically, when running the example command in the README (shown below), which LLM engine should we use to validate the simulation output? Is Vidur’s simulation based on vLLM or Sarathi-Serve?
When using vLLM 0.9.1, the mooncake_conversation_trace.csv trace fails because some requests’ total token length exceeds the max_model_len = 8192 limit for Meta-Llama-3-8B. Even after scaling the token lengths down to fit and rerunning (see the clipping sketch after the command below), the simulated latency still does not match vLLM’s measurements. Which framework does Vidur currently support, and what trace/configuration settings would you recommend for reproducing the results accurately?

python -m vidur.main \
--time_limit 10800 \
--replica_config_model_name meta-llama/Meta-Llama-3-8B \
--replica_config_device h100 \
--replica_config_network_device h100_dgx \
--cluster_config_num_replicas 8 \
--replica_config_tensor_parallel_size 1 \
--replica_config_num_pipeline_stages 1 \
--request_generator_config_type synthetic \
--synthetic_request_generator_config_num_requests 128 \
--length_generator_config_type trace \
--trace_request_length_generator_config_trace_file ./data/processed_traces/mooncake_conversation_trace.csv \
--interval_generator_config_type poisson \
--poisson_request_interval_generator_config_qps 8.0 \
--global_scheduler_config_type round_robin \
--replica_scheduler_config_type vllm_v1 \
--vllm_v1_scheduler_config_chunk_size 512 \
--vllm_v1_scheduler_config_batch_size_cap 512 \
--cache_config_enable_prefix_caching
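
On the token-length point above, here is a minimal sketch of the kind of trace clipping we mean before replaying the trace against vLLM. The column names num_prefill_tokens and num_decode_tokens, as well as the output path, are assumptions about the processed trace schema and may need to be adjusted to match the actual CSV:

# clip_trace.py -- sketch only; column names are assumed, adjust to the real schema.
import pandas as pd

MAX_MODEL_LEN = 8192  # Meta-Llama-3-8B context limit

df = pd.read_csv("./data/processed_traces/mooncake_conversation_trace.csv")

# Keep only requests whose total (prefill + decode) token count fits the context window.
total_tokens = df["num_prefill_tokens"] + df["num_decode_tokens"]
clipped = df[total_tokens <= MAX_MODEL_LEN].copy()

print(f"kept {len(clipped)}/{len(df)} requests under max_model_len={MAX_MODEL_LEN}")
clipped.to_csv("./data/processed_traces/mooncake_conversation_trace_clipped.csv", index=False)

Even with a clipped trace along these lines, the simulated and measured latencies still diverge, which is why we are asking which engine and configuration the published numbers were validated against.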