Very slow on my NVIDIA RTX 3090: 44 tok/s

I don't understand why Qwen3-8B runs so slowly on my NVIDIA RTX 3090. Am I doing something wrong? I tried the Transformers solution.
DFlash results:

| Metric | Value |
|---|---|
| Speed | 44.4 tok/s |
| Time | 54.48 s |
| Generated | 2419 tokens |
| Input | 23 tokens |
| Block size | 16 |

By comparison, LM Studio's speculative decoding with Qwen3 8B plus Qwen3 1.7B as the draft model reaches 83 tok/s.
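
To sanity-check the numbers above, here is a small sketch that recomputes the throughput from the generated token count and the elapsed time, then compares it with the LM Studio figure. The helper name `tokens_per_second` is made up for illustration, and it assumes the reported 54.48 s covers the whole generation run.

```python
def tokens_per_second(generated_tokens: int, elapsed_s: float) -> float:
    """Return decoding throughput in tokens per second."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return generated_tokens / elapsed_s


# Figures taken from the DFlash run reported above.
dflash_tps = tokens_per_second(generated_tokens=2419, elapsed_s=54.48)

# Figure reported for LM Studio (Qwen3 8B + Qwen3 1.7B).
lm_studio_tps = 83.0

# How many times faster the LM Studio run is than the DFlash run.
ratio = lm_studio_tps / dflash_tps

print(f"DFlash:    {dflash_tps:.1f} tok/s")
print(f"LM Studio: {lm_studio_tps:.1f} tok/s")
print(f"LM Studio / DFlash: {ratio:.2f}x")
```

The recomputed value matches the reported 44.4 tok/s, so the gap is in the measured run itself rather than in how the speed was calculated.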