While testing the 'modelscope.cn/Qwen/Qwen3-Embedding-8B-GGUF: Q8_0.gguf' model via LM Studio, I saw incredibly high RAM and VRAM usage, and LM Studio eventually gave up loading it and threw an unknown error.
But via Ollama, the same model loads without any problem.
I'm not clear on what LM Studio does internally when loading an embedding model, but my guess is that a parameter called `parallel` is causing this, and I don't know how to set it.
Does anyone know how to solve this? Thanks so much.