Container with a model and inference code
GPT-OSS:20b tokens/sec for various Apple M-series and NVIDIA Ampere, Ada, and Grace Blackwell GPUs
20260120 - ObrienlabsDev/blog#160 and https://github.com/ObrienlabsDev/gcp-infrastructure-as-code/tree/main/m10-3rdparty/m11-nvidia-gce-terraform-ubuntu
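
A minimal sketch of how a tokens/sec figure like the ones compared above could be derived from a generated-token count and a generation time in nanoseconds. The function name `tokens_per_second` and the sample values are hypothetical, and the `eval_count` / `eval_duration` field names assume an Ollama-style response; the source does not state which runtime produced its numbers.

```python
def tokens_per_second(token_count: int, duration_ns: int) -> float:
    """Return generation throughput in tokens per second.

    token_count: number of tokens generated
    duration_ns: time spent generating them, in nanoseconds
    """
    if duration_ns <= 0:
        raise ValueError("duration_ns must be positive")
    return token_count / (duration_ns / 1e9)


# Hypothetical sample values for illustration only; not a measurement
# from any of the GPUs listed above.
sample = {"eval_count": 256, "eval_duration": 2_000_000_000}

rate = tokens_per_second(sample["eval_count"], sample["eval_duration"])
print(f"{rate:.1f} tokens/sec")
```

Dividing by generation time only (rather than total request time) keeps model load and prompt processing out of the figure, which may make results more comparable across hardware.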