
Commit 690807f

Update llm-scaler-omni b2 (#122)
* init
* update
1 parent 0015d67 commit 690807f

20 files changed (+6613 -93 lines)

omni/README.md

Lines changed: 21 additions & 12 deletions
@@ -13,7 +13,12 @@
 
 ## Getting Started with Omni Docker Image
 
-Build docker image:
+Pull docker image from dockerhub:
+
+```bash
+docker pull intel/llm-scaler-omni:0.1-b2
+```
+
+Or build docker image:
 
 ```bash
 bash build.sh
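
As a quick check that the pull or build above succeeded, the standard Docker CLI can list the local tags (this command is not part of the diff itself):

```bash
# The 0.1-b2 tag should appear after `docker pull` or `bash build.sh` completes.
docker images intel/llm-scaler-omni
```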
@@ -22,7 +27,7 @@ bash build.sh
 Run docker image:
 
 ```bash
-export DOCKER_IMAGE=intel/llm-scaler-omni:0.1-b1
+export DOCKER_IMAGE=intel/llm-scaler-omni:0.1-b2
 export CONTAINER_NAME=comfyui
 export MODEL_DIR=<your_model_dir>
 export COMFYUI_MODEL_DIR=<your_comfyui_model_dir>
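
The `docker run` command itself is outside this hunk. The sketch below only illustrates how these variables are commonly combined for an Intel GPU container; the device flag, mounts, and port mapping are assumptions rather than the README's exact invocation:

```bash
# Illustrative only: run the b2 image with the variables exported above.
# /dev/dri exposes the Intel GPUs; the mount targets are hypothetical paths.
docker run -itd \
  --name $CONTAINER_NAME \
  --device /dev/dri \
  -v $MODEL_DIR:/llm/models \
  -v $COMFYUI_MODEL_DIR:/llm/ComfyUI/models \
  -p 8188:8188 \
  $DOCKER_IMAGE
```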
@@ -45,24 +50,30 @@ docker exec -it comfyui bash
 ```bash
 cd /llm/ComfyUI
 
-MODEL_PATH=<your_comfyui_models_path>
-rm -rf /llm/ComfyUI/models
-ln -s $MODEL_PATH /llm/ComfyUI/models
-echo "Symbolic link created from $MODEL_PATH to /llm/ComfyUI/models"
-
 export http_proxy=<your_proxy>
 export https_proxy=<your_proxy>
 export no_proxy=localhost,127.0.0.1
 
 python3 main.py
 ```
 
-Then you can access the webUI at `http://<your_local_ip>:8188/`. On the left side,
+Then you can access the webUI at `http://<your_local_ip>:8188/`.
+
+### (Optional) Preview settings for ComfyUI
+
+Click the button on the top-right corner to launch ComfyUI Manager.
+![comfyui_manager_logo](./assets/comfyui_manager_logo.png)
+
+Modify the `Preview method` to show the preview image during sampling iterations.
+
+![comfyui_manager_preview](./assets/comfyui_manager_preview.png)
 
-![workflow image](./assets/confyui_workflow.png)
 
 ### ComfyUI workflows
 
+On the left side of the web UI, you can find the workflows logo.
+![workflow image](./assets/confyui_workflow.png)
+
 Currently, the following workflows are supported on B60:
 - Qwen-Image (refer to https://raw.githubusercontent.com/Comfy-Org/example_workflows/main/image/qwen/image_qwen_image_distill.json)
 - Qwen-Image-Edit (refer to https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/image_qwen_image_edit.json)
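
If the web UI does not load, a minimal reachability check against the port mentioned above (assuming `curl` is available on the host):

```bash
# Expect 200 once `python3 main.py` has finished starting ComfyUI.
curl -s -o /dev/null -w "%{http_code}\n" http://<your_local_ip>:8188/
```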
@@ -109,7 +120,6 @@ Set the `GPU` and `ulysses_degree` in `Ray Init Actor` node to GPU nums you want
 ## XInference
 
 ```bash
-export ZE_AFFINITY_MASK=0 # In multi XPU environment, clearly select GPU index to avoid issues.
 xinference-local --host 0.0.0.0 --port 9997
 ```
 Supported models:
@@ -139,10 +149,9 @@ Supported models:
 You can select model and launch service via WebUI (refer to [here](#1-access-xinference-web-ui)) or by command:
 
 ```bash
-export ZE_AFFINITY_MASK=0 # In multi XPU environment, clearly select GPU index to avoid issues.
 xinference-local --host 0.0.0.0 --port 9997
 
-xinference launch --model-name sd3.5-medium --model-type image --model-path /llm/models/stable-diffusion-3.5-medium/
+xinference launch --model-name sd3.5-medium --model-type image --model-path /llm/models/stable-diffusion-3.5-medium/ --gpu-idx 0
 ```
 
 #### 2. Post request in OpenAI API format
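
The body of that section is outside this hunk; the request below is only a hedged sketch of an OpenAI-style image-generation call against the Xinference service started above, with an assumed endpoint path and payload:

```bash
# Hypothetical request shape; adjust the endpoint, model name, and size to the
# README's actual example.
curl http://localhost:9997/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"model": "sd3.5-medium", "prompt": "an astronaut riding a horse", "size": "1024x1024"}'
```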

omni/build.sh

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ set -x
 export HTTP_PROXY=<your_http_proxy>
 export HTTPS_PROXY=<your_https_proxy>
 
-docker build -f ./docker/Dockerfile . -t intel/llm-scaler-omni:0.1-b1 --build-arg https_proxy=$HTTPS_PROXY --build-arg http_proxy=$HTTP_PROXY
+docker build -f ./docker/Dockerfile . -t intel/llm-scaler-omni:0.1-b2 --build-arg https_proxy=$HTTPS_PROXY --build-arg http_proxy=$HTTP_PROXY

omni/docker/Dockerfile

Lines changed: 20 additions & 2 deletions
@@ -11,6 +11,10 @@ ENV LD_LIBRARY_PATH="/usr/local/lib:/usr/local/lib/python3.10/dist-packages/torc
 COPY ./patches/yunchang_for_multi_arc.patch /tmp/
 COPY ./patches/xdit_for_multi_arc.patch /tmp/
 COPY ./patches/raylight_for_multi_arc.patch /tmp/
+COPY ./patches/xinference_device_utils.patch /tmp/
+COPY ./patches/comfyui_for_multi_arc.patch /tmp/
+COPY ./patches/comfyui_voxcpm_for_xpu.patch /tmp/
+
 
 # Add Intel oneAPI repo and PPA for GPU support
 RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null && \
@@ -50,7 +54,8 @@ RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRO
     cd /llm && \
     git clone https://github.com/comfyanonymous/ComfyUI.git && \
     cd ComfyUI && \
-    git checkout 72212fef660bcd7d9702fa52011d089c027a64d8 && \
+    git checkout 51696e3fdcdfad657cb15854345fbcbbe70eef8d && \
+    git apply /tmp/comfyui_for_multi_arc.patch && \
     pip install -r requirements.txt && \
     cd custom_nodes && \
     git clone https://github.com/ltdrdata/ComfyUI-Manager.git comfyui-manager && \
@@ -60,19 +65,32 @@ RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRO
     cd .. && \
     git clone https://github.com/komikndr/raylight.git && \
     cd raylight && \
-    git checkout 290c934cdd498b003fbf083e74e91ffc8edb961a && \
+    git checkout ff8e90ba1f2c2d23e3ac23746910ddfb523fc8f1 && \
     git apply /tmp/raylight_for_multi_arc.patch && \
     pip install -r requirements.txt && \
     cd .. && \
     git clone https://github.com/yolain/ComfyUI-Easy-Use.git comfyui-easy-use && \
     cd comfyui-easy-use && \
     pip install -r requirements.txt && \
+    cd .. && \
+    git clone https://github.com/Fannovel16/comfyui_controlnet_aux.git && \
+    cd comfyui_controlnet_aux && \
+    apt install libcairo2-dev pkg-config python3-dev -y && \
+    pip install -r requirements.txt && \
+    cd .. && \
+    git clone https://github.com/wildminder/ComfyUI-VoxCPM.git comfyui-voxcpm && \
+    cd comfyui-voxcpm && \
+    git checkout 044dd93c0effc9090fb279117de5db4cd90242a0 && \
+    git apply /tmp/comfyui_voxcpm_for_xpu.patch && \
+    pip install -r requirements.txt && \
     # Install Xinference
     pip install "xinference[transformers]" && \
+    patch /usr/local/lib/python3.10/dist-packages/xinference/device_utils.py < /tmp/xinference_device_utils.patch && \
     pip install kokoro Jinja2==3.1.6 jieba ordered-set pypinyin cn2an pypinyin-dict && \
     # Clean
     rm -rf /tmp/*
 
 COPY ./workflows/* /llm/ComfyUI/user/default/workflows/
+COPY ./example_inputs/* /llm/ComfyUI/input/
 
 WORKDIR /llm/ComfyUI
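
Since the build now pins new commits and applies several patches, it can help to confirm a patch still applies before rebuilding the image. A hedged sketch using standard git tooling, with an illustrative patch path (the patches live under the repository's `omni/docker/patches/` directory):

```bash
# Dry-run check in a local ComfyUI checkout at the pinned commit;
# `git apply --check` validates the patch without modifying the tree.
git clone https://github.com/comfyanonymous/ComfyUI.git && cd ComfyUI
git checkout 51696e3fdcdfad657cb15854345fbcbbe70eef8d
git apply --check <path_to>/omni/docker/patches/comfyui_for_multi_arc.patch
```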
omni/docker/patches/comfyui_for_multi_arc.patch

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
diff --git a/comfy/model_management.py b/comfy/model_management.py
index 709ebc40..c43e8eab 100644
--- a/comfy/model_management.py
+++ b/comfy/model_management.py
@@ -148,6 +148,90 @@ def is_intel_xpu():
             return True
     return False
 
+import os
+if is_intel_xpu() and os.environ.get("_LLM_SCALER_DISABLE_INTERPOLATE_FIX") != "1":
+    import torch
+    import torch.nn.functional as F
+    import functools  # Used to preserve function metadata like docstrings
+
+    # Global variables to store the original function and patch status
+    _original_interpolate_func = None
+    _is_interpolate_patched = False
+
+
+    def patch_xpu_interpolate_to_cpu():
+        """
+        patches torch.nn.functional.interpolate. If an input tensor is on an XPU device,
+        it will be moved to CPU for interpolation, and the result will be moved back
+        to the original XPU device.
+        """
+        global _original_interpolate_func, _is_interpolate_patched
+
+        if _is_interpolate_patched:
+            print("torch.nn.functional.interpolate is already patched for XPU. Skipping.")
+            return
+
+        # Store the original function
+        _original_interpolate_func = F.interpolate
+
+        @functools.wraps(_original_interpolate_func)
+        def _custom_interpolate(input_tensor, *args, **kwargs):
+            """
+            Custom wrapper for interpolate. Moves XPU tensors to CPU for computation.
+            """
+
+            if input_tensor.device.type == "xpu":
+                # print(
+                #     f"Intercepted interpolate call for XPU tensor at device {input_tensor.device}. Moving to CPU for computation."
+                # )
+                original_device = input_tensor.device
+
+                # Move input to CPU
+                input_on_cpu = input_tensor.to("cpu")
+
+                # Call the original interpolate function on CPU
+                result_on_cpu = _original_interpolate_func(input_on_cpu, *args, **kwargs)
+
+                # Move the result back to the original XPU device
+                result_on_xpu = result_on_cpu.to(original_device)
+                # print(
+                #     f"Interpolation completed on CPU, result moved back to {original_device}."
+                # )
+                return result_on_xpu
+            else:
+                # If not an XPU tensor, just call the original function directly
+                return _original_interpolate_func(input_tensor, *args, **kwargs)
+
+        # Replace the original function with our custom one
+        F.interpolate = _custom_interpolate
+        _is_interpolate_patched = True
+        print(
+            "Successfully patched torch.nn.functional.interpolate to handle XPU tensors on CPU."
+        )
+
+
+    def unpatch_xpu_interpolate_to_cpu():
+        """
+        Restores the original torch.nn.functional.interpolate function if it was patched.
+        """
+        global _original_interpolate_func, _is_interpolate_patched
+
+        if not _is_interpolate_patched:
+            print(
+                "torch.nn.functional.interpolate is not currently patched. Skipping unpatch."
+            )
+            return
+
+        if _original_interpolate_func is not None:
+            F.interpolate = _original_interpolate_func
+            _original_interpolate_func = None
+            _is_interpolate_patched = False
+            print("Successfully unpatched torch.nn.functional.interpolate.")
+        else:
+            print("Error: Could not unpatch. Original function reference missing.")
+
+
+    patch_xpu_interpolate_to_cpu()
 def is_ascend_npu():
     global npu_available
     if npu_available:
@@ -720,7 +804,6 @@ def cleanup_models_gc():
             logging.warning("WARNING, memory leak with model {}. Please make sure it is not being referenced from somewhere.".format(cur.real_model().__class__.__name__))
 
 
-
 def cleanup_models():
     to_delete = []
     for i in range(len(current_loaded_models)):
@@ -1399,7 +1482,7 @@ def unload_all_models():
     free_memory(1e30, get_torch_device())
 
 
-#TODO: might be cleaner to put this somewhere else
+# TODO: might be cleaner to put this somewhere else
 import threading
 
 class InterruptProcessingException(Exception):
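
The patch guards itself behind an environment variable, so the CPU fallback for `interpolate` can be switched off without rebuilding; a minimal usage sketch based on the check at the top of the patched block:

```bash
# Opt out of the XPU interpolate-on-CPU workaround added by this patch,
# then start ComfyUI as usual.
export _LLM_SCALER_DISABLE_INTERPOLATE_FIX=1
python3 main.py
```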
