
feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.3 ) #1722

Open
nerdz-bot[bot] wants to merge 1 commit into main from renovate/quay.io-go-skynet-local-ai-4.x

Conversation


@nerdz-bot nerdz-bot bot commented Mar 14, 2026

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package                    | Update | Change
quay.io/go-skynet/local-ai | major  | v3.12.1-gpu-nvidia-cuda-12 ➔ v4.1.3-gpu-nvidia-cuda-12
quay.io/go-skynet/local-ai | major  | v3.12.1-gpu-intel ➔ v4.1.3-gpu-intel

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

mudler/LocalAI (quay.io/go-skynet/local-ai)

v4.1.3

Compare Source

What's Changed
Bug fixes 🐛
👒 Dependencies
Other Changes

Full Changelog: mudler/LocalAI@v4.1.2...v4.1.3

v4.1.2

Compare Source

What's Changed
Bug fixes 🐛
Exciting New Features 🎉
Other Changes

Full Changelog: mudler/LocalAI@v4.1.1...v4.1.2

v4.1.1

Compare Source

This is a patch release that addresses a few regressions from the last release and prepares for the upcoming Gemma 4. Most importantly:

  • Fixes Gemma 4 tokenization with llama.cpp
  • Shows the login page in API-key-only mode
  • Small fixes to improve Anthropic API compatibility
What's Changed
Other Changes
New Contributors

Full Changelog: mudler/LocalAI@v4.1.0...v4.1.1

v4.1.0

Compare Source

🎉 LocalAI 4.1.0 Release! 🚀




LocalAI 4.1.0 is out! 🔥

Just weeks after the landmark 4.0, we're back with another massive drop. This release turns LocalAI into a production-grade AI platform: spin up a distributed cluster with smart routing and autoscaling, lock it down with built-in auth and per-user quotas, fine-tune models without leaving the UI, and much more. If 4.0 was the foundation, 4.1 is the control tower.

Feature Summary
  • 🌐 Distributed Mode: Run LocalAI as a cluster with smart routing, node groups, drain/resume, and min/max autoscaling.
  • 🔐 Users & Auth: Built-in user management with OIDC, invite mode, API keys, and admin impersonation.
  • 📊 Quota System: Per-user usage quotas with predictive analytics and breakdown dashboards.
  • 🧪 Fine-Tuning (experimental): Fine-tune models with TRL, auto-export to GGUF, and import the result back, all from the UI.
  • ⚗️ Quantization (experimental): New backend for on-the-fly model quantization.
  • 🔧 Pipeline Editor: Visual model pipeline editor in the React UI.
  • 🤖 Standalone Agents: Run agents from the CLI with local-ai agent run.
  • 🧠 Smart Inferencing: Auto inference defaults from Unsloth, tool parsing fallback, and min_p support.
  • 🎬 Media History: Browse past generated images and media in Studio pages.

Full setup walkthrough (long version): https://www.youtube.com/watch?v=cMVNnlqwfw4
🚀 Key Features
🌐 Distributed Mode: scaling LocalAI horizontally

Run LocalAI as a distributed cluster and let it figure out where to send your requests. No more single-node bottlenecks.

  • Smart Routing: Requests are routed to nodes ordered by available VRAM — the beefiest, free GPU gets the job.
  • Node Groups: Pin models to specific node groups for workload isolation (e.g., "gpu-heavy" vs "cpu-light").
  • Autoscaling: Built-in min/max autoscaler with a node reconciler that manages the lifecycle automatically.
  • Drain & Resume: Gracefully drain nodes for maintenance and bring them back with a single API call.
  • Cluster Dashboard: See your entire cluster status at a glance from the home page.
  • Smart Model Transfer: Transfer models via S3 or peer-to-peer.
distributed-mode.mp4

🔐 Users, Authentication & Quotas

LocalAI now ships with a complete multi-user platform — perfect for teams, classrooms, or any shared deployment.

  • User Management: Create, edit, and manage users from the React UI.
  • OIDC/OAuth: Plug in your identity provider for SSO — Google, Keycloak, Authentik, you name it.
  • Invite Mode: Restrict registration to invite-only with admin approval.
  • API Keys: Per-user API key management.
  • Admin Powers: Admins can impersonate users for debugging.
  • Quota System: Set per-user usage quotas and enforce limits.
  • Usage Analytics: Predictive usage dashboard with per-user breakdown statistics.
Users and quota:
usersquota-1775167475876.mp4
Usage metrics per user:
usage.mp4

🧪 Fine-Tuning & Quantization

No more juggling external tools. Fine-tune and quantize directly inside LocalAI.

  • Fine-Tuning with TRL (Experimental): Train LoRA adapters with Hugging Face TRL, auto-export to GGUF, and import the result straight back into LocalAI. Includes a built-in evals framework to validate your work.
  • Quantization Backend: Spin up the new quantization backend to create optimized model variants on-the-fly.
quantize-fine-tune.mp4

🎨 UI

The React UI keeps getting better. This release adds serious power-user features:

  • Model Pipeline Editor: Visually wire up model pipelines — no YAML editing required.
  • Per-Model Backend Logs: Drill into logs scoped to individual models for laser-focused debugging.
  • Media History: Studio pages now remember your past generations — images, audio, and more.
  • Searchable Model/Backend Selector: Quickly find models and backends with inline search and filtering.
  • Structured Error Toasts: Errors now link directly to traces — one click from "something broke" to "here's why."
  • Tracing Settings: Inline tracing config restored with a cleaner UI.
talk.mp4

🤖 Agents & Inference
  • Standalone Agent Mode: Run agents straight from the terminal with local-ai agent run. Supports single-turn --prompt mode and pool-based configurations from pool.json.
  • Streaming Tool Calls: Agent mode tool calls now stream in real-time, with interleaved thinking fixed.
  • Inferencing Defaults: Automatic inference parameters sourced from Unsloth and applied to all endpoints and gallery models; your models just work better out of the box.
  • Tool Parsing Fallback: When native tool call parsing fails, an iterative fallback parser kicks in automatically.
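As a concrete illustration of the inference notes above, the sketch below builds a single-turn request against LocalAI's OpenAI-compatible chat endpoint with min_p set. The endpoint path follows the OpenAI convention; passing min_p as a top-level sampling parameter (alongside temperature/top_p) is an assumption based on the release notes, which only say min_p is now supported.

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str,
                       min_p: float = 0.05) -> urllib.request.Request:
    """Build a single-turn chat completion request for a LocalAI instance.

    Assumes the OpenAI-compatible /v1/chat/completions endpoint; min_p as a
    top-level request field is an assumption, not confirmed by the notes.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "min_p": min_p,  # assumption: passed through like temperature/top_p
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Usage (requires a running LocalAI instance and an installed model):
# req = build_chat_request("http://localhost:8080", "my-model", "Hello!")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```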

🛠️ Under the Hood
  • Repeated Log Merging: Noisy terminals? Repeated log lines are now collapsed automatically.
  • Jetson/Tegra GPU Detection: First-class NVIDIA Jetson/Tegra platform detection.
  • Intel SYCL Fix: Auto-disables mmap for SYCL backends to prevent crashes.
  • llama.cpp Portability: Bundled libdl, librt, libpthread for improved cross-platform support.
  • HF_ENDPOINT Mirror: Downloader now rewrites HuggingFace URIs with HF_ENDPOINT for corporate/mirror setups.
  • Transformers >5.0: Bumped to HuggingFace Transformers >5.0 with generic model loading.
  • API Improvements: Proper 404s for missing models, unescaped model names, unified inferencing paths with automatic retry on transient errors.

🐞 Fixes & Improvements
  • Embeddings: Implemented encoding_format=base64 for the embeddings endpoint.
  • Kokoro TTS: Fixed phonemization model not downloading during installation.
  • Realtime API: Fixed Opus codec backend selection alias in development mode.
  • Gallery Filtering: Fixed exact tag matching for model gallery filters.
  • Open Responses: Fixed required ORItemParam.Arguments field being omitted; ORItemParam.Summary now always populated.
  • Tracing: Fixed settings not loading from runtime_settings.json.
  • UI: Fixed watchdog field mapping, model list refresh on deletion, backend display in model config, MCP button ordering.
  • Downloads: Fixed directory removal during fallback attempts; improved retry logic.
  • Model Paths: Fixed baseDir assignment to use ModelPath correctly.
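For the encoding_format=base64 fix above, a client decodes the embedding by base64-decoding the payload into a raw little-endian float32 array, following the OpenAI convention; LocalAI's wire format is assumed to match.

```python
import base64
import struct

def decode_base64_embedding(b64: str) -> list[float]:
    """Decode an embedding returned with encoding_format="base64".

    Assumes the OpenAI convention: the payload is the raw little-endian
    float32 vector, base64-encoded.
    """
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip check with a vector whose values are exact in float32:
vec = [0.5, -1.25, 2.0]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode("ascii")
assert decode_base64_embedding(encoded) == vec
```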

❤️ Thank You

LocalAI is a community-powered FOSS movement. Every star, every PR, every bug report matters.

If you believe in privacy-first, self-hosted AI:

  • Star the repo — it helps more than you think
  • 🛠️ Contribute code, docs, or feedback
  • 📣 Share with your team, your community, your world

Let's keep building the future of open AI — together. 💪


✅ Full Changelog
What's Changed
Bug fixes 🐛
Exciting New Features 🎉
👒 Dependencies
Other Changes
New Contributors

Full Changelog: mudler/LocalAI@v4.0.0...v4.1.0

v4.0.0

Compare Source


🎉 LocalAI 4.0.0 Release! 🚀




LocalAI 4.0.0 is out!

This major release transforms LocalAI into a complete AI orchestration platform. We’ve embedded agentic and hybrid search capabilities directly into the core, completely overhauled the user interface with React for a modern experience, and are thrilled to introduce Agenthub, a brand-new community hub for easily sharing and importing agents. Alongside these massive updates, we've introduced powerful new features like Canvas mode for code artifacts, MCP apps, and full MCP client-side support.

Feature Summary
  • Agentic Orchestration & Agenthub: Native agent management with memory, skills, and the new Agenthub for community sharing.
  • Revamped React UI: Complete frontend rewrite for lightning-fast performance and modern UX.
  • Canvas Mode: Preview code blocks and artifacts side-by-side in the chat interface.
  • MCP Client-Side: Full Model Context Protocol support, MCP Apps, and tool streaming in chat.
  • WebRTC Realtime: WebRTC support for low-latency realtime audio conversations.
  • New Backends: Added experimental MLX Distributed, fish-speech, ace-step.cpp, and faster-qwen3-tts.
  • Infrastructure: Podman documentation, shell completion, and persistent data path separation.

🚀 Key Features
🤖 Native Agentic Orchestration & Agenthub

LocalAI now includes agentic capabilities embedded directly in the core. You can manage, import, start, and stop agents via the new UI.

  • 🌐 Agenthub: We are launching Agenthub! This is a centralized community space to share common agents and import them effortlessly into your LocalAI instance.
  • Agent Management: Full lifecycle management via the React UI. Create Agents, connect them to Slack, configure MCP servers and skills.
  • Skills Management: Centralized skill database for AI agents.
  • Memory: Agents can utilize memory with Hybrid search (PostgreSQL) or embedded in-memory storage (Chromem).
  • Observability: New "Events" column in the Agents list to track observables and status.
  • 📚 Documentation: Dive into the new capabilities in our official Agents documentation.
agents.mp4
🎨 Revamped UI & Canvas Mode

The Web interface has been completely migrated to React, bringing a smoother experience and powerful new capabilities:

  • Canvas Mode: Enable "canvas mode" in the chat to see code blocks and artifacts generated by the LLM in a dedicated preview bar on the right.
  • System View: Tabbed navigation separating Models and Backends for better organization.
  • Model Size Warnings: Visual warnings when model storage exceeds system RAM to prevent lockups.
  • Traces: Improved trace display using accordions for better readability.
model-fit-canvas-mode.mp4
🔌 MCP Apps & Client-Side Support

We’ve expanded support for the Model Context Protocol (MCP):

  • MCP Apps: Select which servers to enable for the chat directly from the UI.
  • Tool Streaming: Tools from MCP servers are automatically injected into the standard chat interface.
  • Client-Side Support: Full client-side integration for MCP tools and streaming.
  • Disable Option: Set the LOCALAI_DISABLE_MCP environment variable to completely disable MCP support for security.
mcp apps
🎵 New Backends, Audio & Video Enhancements
  • MLX Distributed (Experimental): We've added an experimental backend for running distributed workloads using Apple's MLX framework! Check out the docs here.
  • New Audio Backends: Introduced fish-speech, ace-step.cpp, and faster-qwen3-tts (CUDA-only).
  • WebRTC Realtime: WebRTC support added to the Realtime API and Talk page for better low-latency audio handling.
  • TTS Improvements: Added sample_rate support via post-processing and multi-voice support for Qwen TTS.
  • Video Generation: Fixed model selection dropdown sync and added vllm-omni backend detection.
🛠️ Infrastructure & Developer Experience
  • Data Separation: New --data-path CLI flag and LOCALAI_DATA_PATH env var to separate persistent data (agents, skills) from configuration.
  • Shell Completion: Dynamic completion scripts for bash, zsh, and fish.
  • Podman Support: Dedicated documentation for Podman installation and rootless configuration.
  • Gallery & Models: Model storage size display with RAM warnings, and fallback URI resolution for backend installation failures.
  • Deprecations: HuggingFace backend support removed, and AIO images dropped to focus on main images.

🐞 Fixes & Improvements
  • Logging: Fixed watchdog spamming logs when no interval was configured; downgraded health check logs to debug.
  • CUDA Detection: Improved GPU vendor checks to prevent false CUDA detection on CPU-only hosts with runtime libs.
  • Compatibility: Renamed json_verbose to verbose_json for OpenAI spec compliance (fixes Nextcloud integration).
  • Embedding: Fixed embedding dimension truncation to return full native dimensions.
  • Permissions: Changed model install file permissions to 0644 to ensure server readability.
  • Windows Docker: Added named volumes to Docker Compose files for Windows compatibility.
  • Model Reload: Models now reload automatically after editing YAML config (e.g., context_size).
  • Chat: Fixed issue where thinking/reasoning blocks were sent to the LLM.
  • Audio: Fixed img2img pipeline in diffusers backend and Qwen TTS duplicate argument error.
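The verbose_json rename above affects how clients request transcription output. The sketch below builds the non-file form fields for an OpenAI-style /v1/audio/transcriptions request and rejects the old misspelled value; the helper name is hypothetical, and the allowed values follow the OpenAI spec (the audio file itself travels as a separate multipart part).

```python
def transcription_fields(model: str,
                         response_format: str = "verbose_json") -> dict[str, str]:
    """Form fields for an OpenAI-style /v1/audio/transcriptions request.

    As of 4.0.0 the spec-compliant value is "verbose_json"; the earlier
    misspelling "json_verbose" is no longer accepted (per the notes above).
    Helper name and validation are illustrative, not LocalAI API.
    """
    allowed = {"json", "text", "srt", "verbose_json", "vtt"}  # OpenAI spec values
    if response_format not in allowed:
        raise ValueError(f"unsupported response_format: {response_format}")
    return {"model": model, "response_format": response_format}
```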
Known issues
  • The diffusers backend currently fails to build (due to CI limit exhaustion) and is not part of this release; the previous version is still available. We are looking into it, but if you want to help and know someone at GitHub who could help us get better ARM runners, please reach out!

❤️ Thank You

LocalAI is a true FOSS movement — built by contributors, powered by community.

If you believe in privacy-first AI:

  • Star the repo
  • 💬 Contribute code, docs, or feedback
  • 📣 Share with others

Your support keeps this stack alive.


✅ Full Changelog
What's Changed
Breaking Changes 🛠
Bug fixes 🐛
Exciting New Features 🎉
🧠 Models
📖 Documentation and examples
👒 Dependencies

Configuration

📅 Schedule: (in timezone America/New_York)

  • Branch creation
    • At any time (no schedule defined)
  • Automerge
    • At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever this PR becomes conflicted, or when you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

@github-actions

no HelmRelease objects found in cluster

@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 21 times, most recently from ec5ccf1 to 76d9c71 Compare March 22, 2026 03:06
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 3 times, most recently from 39d32f9 to a1267b3 Compare March 23, 2026 07:34
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 10 times, most recently from 0e3a1e0 to b5bf8c2 Compare April 2, 2026 07:36
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 2 times, most recently from df7eb08 to 2d7e0f9 Compare April 2, 2026 23:16
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.0.0 ) feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.0 ) Apr 2, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 5 times, most recently from d2a4d3b to ea2dbfa Compare April 5, 2026 01:37
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.0 ) feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.1 ) Apr 5, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 2 times, most recently from c9efcb8 to 6553c5e Compare April 6, 2026 10:27
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.1 ) feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.2 ) Apr 6, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 2 times, most recently from 00d3ff3 to 2a4a55f Compare April 7, 2026 00:20
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.2 ) feat(container)!: Update image quay.io/go-skynet/local-ai Apr 7, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch from 2a4a55f to f509bb1 Compare April 7, 2026 01:38
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.3 ) Apr 7, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch from f509bb1 to 739dd5b Compare April 8, 2026 08:30
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch from 739dd5b to 1ac4dab Compare April 9, 2026 08:37
