While testing the 'modelscope.cn/Qwen/Qwen3-Embedding-8B-GGUF: Q8_0.gguf' model via LM Studio, I saw incredibly high RAM and VRAM usage, and LM Studio eventually gave up loading it and threw an unknown error.
But via Ollama, the same model loads without any problem.
I'm not clear on what LM Studio does internally when loading an embedding model, but my guess is that a parameter called `parallel` is causing this, and I don't know how to set it.
Does anyone know how to solve this? Thanks so much.