From 830558a0e7d33aafcfb2c0393ded9b482701c276 Mon Sep 17 00:00:00 2001
From: Sam Pagon
Date: Sat, 1 Mar 2025 15:18:50 -0500
Subject: [PATCH 1/2] update vllm config to use serve

---
 README.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 0e73f96..4569c92 100644
--- a/README.md
+++ b/README.md
@@ -235,11 +235,10 @@ We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To ac
 
 #### Start an OpenAI API Service
 
-Run the command below to start an OpenAI-compatible API service. It is recommended to set the tensor parallel size `-tp=1` for 7B models and `-tp=4` for 72B models.
+Run the command below to start an OpenAI-compatible API service. It is recommended to set the tensor parallel size `--tensor-parallel-size 1` for 7B models and `--tensor-parallel-size 4` for 72B models.
 
 ```bash
-python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars \
-    --model <path to your model> --limit-mm-per-prompt image=5 -tp <tp>
+vllm serve "<path to your model>" --served-model-name ui-tars --limit-mm-per-prompt image=5 --tensor-parallel-size <tp>
 ```
 
 Then you can use the chat API as below with the gui prompt (choose from mobile or computer) and base64-encoded local images (see [OpenAI API protocol document](https://platform.openai.com/docs/guides/vision/uploading-base-64-encoded-images) for more details), you can also use it in [UI-TARS-desktop](https://github.com/bytedance/UI-TARS-desktop):

From 8537e8d08fb509d4e9a378cd538b21ed1bfb56bc Mon Sep 17 00:00:00 2001
From: Sam Pagon
Date: Sat, 1 Mar 2025 15:20:33 -0500
Subject: [PATCH 2/2] update vllm config to use serve

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 4569c92..3f3271d 100644
--- a/README.md
+++ b/README.md
@@ -235,7 +235,7 @@ We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To ac
 
 #### Start an OpenAI API Service
 
-Run the command below to start an OpenAI-compatible API service. It is recommended to set the tensor parallel size `--tensor-parallel-size 1` for 7B models and `--tensor-parallel-size 4` for 72B models.
+Run the command below to start an OpenAI-compatible API service. It is recommended to set `--tensor-parallel-size 1` for 7B models and `--tensor-parallel-size 4` for 72B models.
 
 ```bash
 vllm serve "<path to your model>" --served-model-name ui-tars --limit-mm-per-prompt image=5 --tensor-parallel-size
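For context on what the patched README section describes: the served endpoint speaks the OpenAI chat-completions protocol with base64-encoded inline images. A minimal sketch of building such a request payload, using only the standard library (the helper `build_chat_payload` is hypothetical; the model name `ui-tars` matches the `--served-model-name` flag in the diff, and the PNG MIME type is an assumption about the screenshot format):

```python
import base64
import json


def build_chat_payload(prompt: str, image_bytes: bytes, model: str = "ui-tars") -> dict:
    """Build an OpenAI-compatible /v1/chat/completions request body with a
    base64-encoded inline image, following the data-URL convention from the
    OpenAI vision docs linked in the README."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "model": model,  # must match --served-model-name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                ],
            }
        ],
    }


# This body would be POSTed as JSON to the vLLM server
# (http://localhost:8000/v1/chat/completions by default).
payload = build_chat_payload("Describe this screenshot.", b"\x89PNG fake bytes")
print(json.dumps(payload)[:40])
```

The same payload can be sent through the official `openai` Python client instead of raw HTTP; only the base URL and model name need to point at the local vLLM server.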