Inaccurate Coordinate Outputs for MLX-Quantized UI-TARS-1.5 (4bit/6bit) #330
Comments
@francedot this should be similar to #319
@francedot can you take a pull from my fork and validate if it works as expected?
i pulled from your fork (commit b355cf1 from today), but i am still noticing a bug with the coordinate outputs:
mlx-vlm ui-tars: Clipboard-20250505-145258-678.mp4
torch ui-tars: Clipboard-20250505-145856-783.mp4
misc environment details:
hey @prncvrm any fix for the above? keen on trying to get the mlx model to work
yes, the new UITars1.5 uses qwen2.5VL, while the fix i've raised was for Qwen2VL
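For readers following along, here is a minimal Python sketch of why a fix written for one base model might not carry over to the other. It rests on an assumption this thread does not confirm: that Qwen2-VL-style grounding reports points on a normalized 0-1000 grid, while Qwen2.5-VL-style grounding reports absolute pixels in the resized image the model actually saw. The function names and the resized dimensions below are hypothetical, chosen only for illustration.

```python
# Sketch only: both coordinate conventions are assumptions, not confirmed
# by this issue. All names here are hypothetical.

def from_normalized_grid(x, y, screen_w, screen_h, grid=1000):
    """Treat (x, y) as a point on a grid x grid normalized canvas."""
    return round(x / grid * screen_w), round(y / grid * screen_h)


def from_resized_pixels(x, y, resized_w, resized_h, screen_w, screen_h):
    """Treat (x, y) as absolute pixels in the resized model input image."""
    return round(x * screen_w / resized_w), round(y * screen_h / resized_h)


if __name__ == "__main__":
    screen = (1920, 1080)    # original screenshot size
    resized = (1288, 728)    # hypothetical size after preprocessing
    model_point = (644, 364)  # hypothetical raw model output

    # The same raw output lands in different places depending on
    # which convention the post-processing code assumes.
    print(from_resized_pixels(*model_point, *resized, *screen))
    print(from_normalized_grid(*model_point, *screen))
```

If post-processing applies the wrong convention, clicks land at a consistent offset from the intended target, which would be one plausible (unverified) explanation for the symptom described here.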
Awesome, feel free to ping me when it's ready! I also left a comment on your current PR.
We've tested the quantized version of the UI-TARS-1.5 model (4-bit and 6-bit quantization) implemented with MLX. The work-in-progress implementation can be found here:
https://github.com/trycua/cua/tree/feature/agent/uitars-mlx
Problems Observed:
Testing Setup:
Artifacts:
cua_uitars_trajectories.zip
Environment Details: