sanic server script #380

tukwila · 2025-09-30T03:06:50Z

Summary

#341

Details

[ ]

Test Plan

Start the Sanic server, with or without parameters:

python tests/unit/sanic_server.py

python tests/unit/sanic_server.py --host=0.0.0.0 --port=8000 --workers=4 --debug
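The script itself is not reproduced in this description, so the following is only a minimal sketch of how the four flags used above (--host, --port, --workers, --debug) could be parsed with argparse. The function names and all default values are assumptions made for illustration and may not match tests/unit/sanic_server.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four flags shown in the test plan.

    NOTE: the defaults here are illustrative assumptions, not the
    values used by the real script.
    """
    parser = argparse.ArgumentParser(
        description="Launch a mock server (illustrative sketch)"
    )
    parser.add_argument("--host", default="127.0.0.1", help="interface to bind")
    parser.add_argument("--port", type=int, default=8000, help="TCP port to listen on")
    parser.add_argument("--workers", type=int, default=1, help="number of worker processes")
    parser.add_argument("--debug", action="store_true", help="enable debug mode")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse an explicit argument list (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


# Usage: the same arguments as the second command in the test plan.
args = parse_args(["--host=0.0.0.0", "--port=8000", "--workers=4", "--debug"])
print(args.host, args.port, args.workers, args.debug)
```

Keeping the parsing in a function that accepts an explicit list makes it possible to exercise the flags without starting a server.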

Run the guidellm benchmark against it:

(myenv) guidellm % guidellm benchmark \
 --target "http://localhost:8000/" \
 --model "mock-qwen-2.5" \
 --rate-type "synchronous" \
 --processor "${local_path}/Qwen2.5-1.5B-Instruct" \
 --data "prompt_tokens=512,output_tokens=256, samples=10"
Creating backend...
Backend openai_http connected to http://localhost:8000/ for model mock-qwen-2.5.
Creating request loader...
Created loader with 10 unique requests from prompt_tokens=512,output_tokens=256, samples=10.


╭─ Benchmarks ─────────────────────────────────────────────────────────────────╮
│ [0… syn… (c… Req:    0.0 req/s,    0.76s Lat,     0.0 Conc,      10 Comp,  … │
│              Tok:    9.6 gen/s,   28.9 tot/s,   8.5ms TTFT,    2.8ms ITL,  … │
╰──────────────────────────────────────────────────────────────────────────────╯
Generating... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ (1/1) [ 0:15:02 < 0:00:00 ]


Benchmarks Metadata:
    Run id:01c96eea-9e56-467b-8e39-35c67087f6ea
    Duration:903.3 seconds
    Profile:type=synchronous, strategies=['synchronous']
    Args:max_number=10, max_duration=None, warmup_number=None,
    warmup_duration=None, cooldown_number=None, cooldown_duration=None
    Worker:type_='generative_requests_worker' backend_type='openai_http'
    backend_target='http://localhost:8000' backend_model='mock-qwen-2.5'
    backend_info={'max_output_tokens': 16384, 'timeout': 300, 'http2': True,
    'follow_redirects': True, 'headers': {}, 'text_completions_path':
    '/v1/completions', 'chat_completions_path': '/v1/chat/completions'}
    Request Loader:type_='generative_request_loader'
    data='prompt_tokens=512,output_tokens=256, samples=10' data_args=None
    processor='${local_path}/Qwen2.5-1.5B-Instruct'
    processor_args=None
    Extras:None


Benchmarks Info:
======================================================================================================================================================
Metadata                                    |||| Requests Made  ||| Prompt Tok/Req  ||| Output Tok/Req  ||| Prompt Tok Total  ||| Output Tok Total  ||
  Benchmark| Start Time| End Time| Duration (s)|  Comp|  Inc|  Err|   Comp|  Inc|  Err|   Comp|  Inc|  Err|   Comp|   Inc|   Err|   Comp|   Inc|   Err
-----------|-----------|---------|-------------|------|-----|-----|-------|-----|-----|-------|-----|-----|-------|------|------|-------|------|------
synchronous|   04:37:43| 04:42:09|        265.8|    10|    0|    0|  512.0|  0.0|  0.0|  256.0|  0.0|  0.0|   5120|     0|     0|   2560|     0|     0
======================================================================================================================================================


Benchmarks Stats:
===============================================================================================================================================
Metadata   | Request Stats         || Out Tok/sec| Tot Tok/sec| Req Latency (sec)  ||| TTFT (ms)       ||| ITL (ms)       ||| TPOT (ms)      ||
  Benchmark| Per Second| Concurrency|        mean|        mean|  mean|  median|   p99| mean| median|  p99| mean| median| p99| mean| median| p99
-----------|-----------|------------|------------|------------|------|--------|------|-----|-------|-----|-----|-------|----|-----|-------|----
synchronous|       0.04|        0.03|         9.6|        28.9|  0.76|    0.76|  0.88|  8.5|    4.7| 41.5|  2.8|    2.8| 3.3|  2.8|    2.8| 3.3
===============================================================================================================================================

Saving benchmarks report...
Benchmarks report saved to ${local_path}/guidellm/benchmarks.json

Benchmarking complete.
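As a quick sanity check on the run above, the aggregate columns can be re-derived from the per-request figures. The sketch below assumes the rates are simple totals divided by the 265.8 s benchmark duration and that concurrency is approximately completed requests times mean latency over that duration. These formulas are an interpretation of the table, not guidellm's documented definitions, and the function name is hypothetical.

```python
def summarize(completed: int, prompt_tok: int, output_tok: int,
              duration_s: float, mean_latency_s: float) -> dict:
    """Re-derive aggregate benchmark columns from per-request values.

    Assumption: rates are plain totals over the benchmark duration;
    guidellm may compute them differently.
    """
    prompt_total = completed * prompt_tok
    output_total = completed * output_tok
    return {
        "prompt_tok_total": prompt_total,
        "output_tok_total": output_total,
        "requests_per_sec": round(completed / duration_s, 2),
        "output_tok_per_sec": round(output_total / duration_s, 1),
        "total_tok_per_sec": round((prompt_total + output_total) / duration_s, 1),
        # Average number of in-flight requests: busy time over wall time.
        "concurrency": round(completed * mean_latency_s / duration_s, 2),
    }


# Values taken from the Benchmarks Info and Benchmarks Stats tables.
stats = summarize(completed=10, prompt_tok=512, output_tok=256,
                  duration_s=265.8, mean_latency_s=0.76)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Under these assumptions the derived values agree with the reported ones: 5120 and 2560 total tokens, 0.04 req/s, 9.6 output tok/s, 28.9 total tok/s, and a concurrency of 0.03.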

Related Issues

Resolves #341 ([Refactor] implement the guidellm.mock_server package)

"I certify that all code in this PR is my own, except as noted below."

Use of AI

Includes AI-assisted code completion
Includes code generated by an AI application
Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: guangli.bao <guangli.bao@daocloud.io>

markurtz · 2025-10-01T12:13:27Z

Thanks for the contribution @tukwila! There's a very large refactor ongoing currently that introduces some of this. Could you take a look at adapting this PR on top of the refactor branch and fixing anything that's missing there? #351

tukwila marked this pull request as draft September 30, 2025 03:07

tukwila mentioned this pull request Sep 30, 2025

[Refactor] implement the guidellm.mock_server package #341

Open

sanic server script

48792f9

Signed-off-by: guangli.bao <guangli.bao@daocloud.io>

tukwila force-pushed the sanic_server branch from 25ad1ac to 48792f9 Compare September 30, 2025 04:18
