content/blog/2025-11-18-1763464399.md
Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ tags:

 Successfully compiled the VAE of Stable Diffusion 1.5 using [graph-compiler](https://github.com/cmdr2/graph-compiler).

-The compiled model is terribly slow because I haven't written any performance optimizations, and it (conservatively) converts a lot of intermediate tensors to contiguous copies. But we don't need a lot of clever optimizations to get to decent performance.
+The compiled model is terribly slow because I haven't written any performance optimizations, and it (conservatively) converts a lot of intermediate tensors to contiguous copies. But we don't need any clever optimizations to get to decent performance, just basic ones.

 It's pretty exciting because I was able to bypass the need to port the model to C++ manually. Instead, I was able to just compile the exported ONNX model and get the same output values as the original PyTorch implementation (given the same input and weights). I could compile to any platform supported by ggml by just changing one flag (e.g. CPU, CUDA, ROCm, Vulkan, Metal etc).
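
To make the "contiguous copies" cost mentioned above concrete, here is a minimal pure-Python sketch of what such a copy involves. It is not taken from graph-compiler or ggml: the function names are hypothetical, and it assumes row-major strides counted in elements, which may differ from how either project actually represents tensors.

```python
from itertools import product

def contiguous_strides(shape):
    """Row-major element strides for a densely packed tensor of this shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

def is_contiguous(shape, strides):
    """Strict check: the strides match the dense row-major layout exactly."""
    return tuple(strides) == contiguous_strides(shape)

def make_contiguous(data, shape, strides):
    """Gather every element through its strides into a new dense buffer."""
    dense = [
        data[sum(i * s for i, s in zip(index, strides))]
        for index in product(*(range(d) for d in shape))
    ]
    return dense, contiguous_strides(shape)

# A 2x3 tensor stored densely, then viewed as its 3x2 transpose without copying.
data = [0, 1, 2, 3, 4, 5]
t_shape, t_strides = (3, 2), (1, 3)

# Hypothetical "basic" optimization: copy only when the view is not already dense.
if is_contiguous(t_shape, t_strides):
    dense, dense_strides = data, t_strides
else:
    dense, dense_strides = make_contiguous(data, t_shape, t_strides)
```

The copy touches every element of the tensor, so doing it unconditionally for many intermediates adds up. Skipping it when the layout is already dense is the kind of basic change that could plausibly help, though this sketch does not measure that.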