
feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.3 ) #1722

Open
nerdz-bot[bot] wants to merge 1 commit into main from renovate/quay.io-go-skynet-local-ai-4.x

Conversation


@nerdz-bot nerdz-bot bot commented Mar 14, 2026

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package                    | Update | Change
quay.io/go-skynet/local-ai | major  | v3.12.1-gpu-nvidia-cuda-12 ➔ v4.1.3-gpu-nvidia-cuda-12
quay.io/go-skynet/local-ai | major  | v3.12.1-gpu-intel ➔ v4.1.3-gpu-intel

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

mudler/LocalAI (quay.io/go-skynet/local-ai)

v4.1.3

Compare Source

What's Changed
Bug fixes 🐛
👒 Dependencies
Other Changes

Full Changelog: mudler/LocalAI@v4.1.2...v4.1.3

v4.1.2

Compare Source

What's Changed
Bug fixes 🐛
Exciting New Features 🎉
Other Changes

Full Changelog: mudler/LocalAI@v4.1.1...v4.1.2

v4.1.1

Compare Source

This is a patch release that addresses a few regressions from the last release and prepares for the upcoming Gemma 4. Most importantly:

  • Fixes Gemma 4 tokenization with llama.cpp
  • Shows the login page in API-key-only mode
  • Small fixes to improve Anthropic API compatibility
What's Changed
Other Changes
New Contributors

Full Changelog: mudler/LocalAI@v4.1.0...v4.1.1

v4.1.0

Compare Source

🎉 LocalAI 4.1.0 Release! 🚀




LocalAI 4.1.0 is out! 🔥

Just weeks after the landmark 4.0, we're back with another massive drop. This release turns LocalAI into a production-grade AI platform: spin up a distributed cluster with smart routing and autoscaling, lock it down with built-in auth and per-user quotas, fine-tune models without leaving the UI, and much more. If 4.0 was the foundation, 4.1 is the control tower.

Feature Summary
  • 🌐 Distributed Mode: Run LocalAI as a cluster with smart routing, node groups, drain/resume, and min/max autoscaling.
  • 🔐 Users & Auth: Built-in user management with OIDC, invite mode, API keys, and admin impersonation.
  • 📊 Quota System: Per-user usage quotas with predictive analytics and breakdown dashboards.
  • 🧪 Fine-Tuning (experimental): Fine-tune models with TRL, auto-export to GGUF, and import the result back, all from the UI.
  • ⚗️ Quantization (experimental): New backend for on-the-fly model quantization.
  • 🔧 Pipeline Editor: Visual model pipeline editor in the React UI.
  • 🤖 Standalone Agents: Run agents from the CLI with local-ai agent run.
  • 🧠 Smart Inferencing: Auto inference defaults from Unsloth, tool parsing fallback, and min_p support.
  • 🎬 Media History: Browse past generated images and media in Studio pages.

Full setup walkthrough (long version): https://www.youtube.com/watch?v=cMVNnlqwfw4
🚀 Key Features
🌐 Distributed Mode: scaling LocalAI horizontally

Run LocalAI as a distributed cluster and let it figure out where to send your requests. No more single-node bottlenecks.

  • Smart Routing: Requests are routed to nodes ordered by available VRAM — the beefiest, free GPU gets the job.
  • Node Groups: Pin models to specific node groups for workload isolation (e.g., "gpu-heavy" vs "cpu-light").
  • Autoscaling: Built-in min/max autoscaler with a node reconciler that manages the lifecycle automatically.
  • Drain & Resume: Gracefully drain nodes for maintenance and bring them back with a single API call.
  • Cluster Dashboard: See your entire cluster status at a glance from the home page.
  • Smart Model Transfer: Transfer models via S3 or peer-to-peer.
distributed-mode.mp4

🔐 Users, Authentication & Quotas

LocalAI now ships with a complete multi-user platform — perfect for teams, classrooms, or any shared deployment.

  • User Management: Create, edit, and manage users from the React UI.
  • OIDC/OAuth: Plug in your identity provider for SSO — Google, Keycloak, Authentik, you name it.
  • Invite Mode: Restrict registration to invite-only with admin approval.
  • API Keys: Per-user API key management.
  • Admin Powers: Admins can impersonate users for debugging.
  • Quota System: Set per-user usage quotas and enforce limits.
  • Usage Analytics: Predictive usage dashboard with per-user breakdown statistics.
Users and quota:
usersquota-1775167475876.mp4
Usage metrics per user:
usage.mp4

🧪 Fine-Tuning & Quantization

No more juggling external tools. Fine-tune and quantize directly inside LocalAI.

  • Fine-Tuning with TRL (Experimental): Train LoRA adapters with Hugging Face TRL, auto-export to GGUF, and import the result straight back into LocalAI. Includes a built-in evals framework to validate your work.
  • Quantization Backend: Spin up the new quantization backend to create optimized model variants on-the-fly.
quantize-fine-tune.mp4

🎨 UI

The React UI keeps getting better. This release adds serious power-user features:

  • Model Pipeline Editor: Visually wire up model pipelines — no YAML editing required.
  • Per-Model Backend Logs: Drill into logs scoped to individual models for laser-focused debugging.
  • Media History: Studio pages now remember your past generations — images, audio, and more.
  • Searchable Model/Backend Selector: Quickly find models and backends with inline search and filtering.
  • Structured Error Toasts: Errors now link directly to traces — one click from "something broke" to "here's why."
  • Tracing Settings: Inline tracing config restored with a cleaner UI.
talk.mp4

🤖 Agents & Inference
  • Standalone Agent Mode: Run agents straight from the terminal with local-ai agent run. Supports single-turn --prompt mode and pool-based configurations from pool.json.
  • Streaming Tool Calls: Agent mode tool calls now stream in real-time, with interleaved thinking fixed.
  • Inferencing Defaults: Automatic inference parameters sourced from Unsloth and applied to all endpoints and gallery models; your models just work better out of the box.
  • Tool Parsing Fallback: When native tool call parsing fails, an iterative fallback parser kicks in automatically.
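As a concrete illustration of the inference notes above, the sketch below builds a single-turn request against LocalAI's OpenAI-compatible chat endpoint with min_p set. The endpoint path follows the OpenAI convention; passing min_p as a top-level sampling parameter (alongside temperature/top_p) is an assumption based on the release notes, which only say min_p is now supported.

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str,
                       min_p: float = 0.05) -> urllib.request.Request:
    """Build a single-turn chat completion request for a LocalAI instance.

    Assumes the OpenAI-compatible /v1/chat/completions endpoint; min_p as a
    top-level request field is an assumption, not confirmed by the notes.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "min_p": min_p,  # assumption: passed through like temperature/top_p
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Usage (requires a running LocalAI instance and an installed model):
# req = build_chat_request("http://localhost:8080", "my-model", "Hello!")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```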

🛠️ Under the Hood
  • Repeated Log Merging: Noisy terminals? Repeated log lines are now collapsed automatically.
  • Jetson/Tegra GPU Detection: First-class NVIDIA Jetson/Tegra platform detection.
  • Intel SYCL Fix: Auto-disables mmap for SYCL backends to prevent crashes.
  • llama.cpp Portability: Bundled libdl, librt, libpthread for improved cross-platform support.
  • HF_ENDPOINT Mirror: Downloader now rewrites HuggingFace URIs with HF_ENDPOINT for corporate/mirror setups.
  • Transformers >5.0: Bumped to HuggingFace Transformers >5.0 with generic model loading.
  • API Improvements: Proper 404s for missing models, unescaped model names, unified inferencing paths with automatic retry on transient errors.

🐞 Fixes & Improvements
  • Embeddings: Implemented encoding_format=base64 for the embeddings endpoint.
  • Kokoro TTS: Fixed phonemization model not downloading during installation.
  • Realtime API: Fixed Opus codec backend selection alias in development mode.
  • Gallery Filtering: Fixed exact tag matching for model gallery filters.
  • Open Responses: Fixed required ORItemParam.Arguments field being omitted; ORItemParam.Summary now always populated.
  • Tracing: Fixed settings not loading from runtime_settings.json.
  • UI: Fixed watchdog field mapping, model list refresh on deletion, backend display in model config, MCP button ordering.
  • Downloads: Fixed directory removal during fallback attempts; improved retry logic.
  • Model Paths: Fixed baseDir assignment to use ModelPath correctly.
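For the encoding_format=base64 fix above, a client decodes the embedding by base64-decoding the payload into a raw little-endian float32 array, following the OpenAI convention; LocalAI's wire format is assumed to match.

```python
import base64
import struct

def decode_base64_embedding(b64: str) -> list[float]:
    """Decode an embedding returned with encoding_format="base64".

    Assumes the OpenAI convention: the payload is the raw little-endian
    float32 vector, base64-encoded.
    """
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip check with a vector whose values are exact in float32:
vec = [0.5, -1.25, 2.0]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode("ascii")
assert decode_base64_embedding(encoded) == vec
```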

❤️ Thank You

LocalAI is a community-powered FOSS movement. Every star, every PR, every bug report matters.

If you believe in privacy-first, self-hosted AI:

  • Star the repo — it helps more than you think
  • 🛠️ Contribute code, docs, or feedback
  • 📣 Share with your team, your community, your world

Let's keep building the future of open AI — together. 💪


✅ Full Changelog
What's Changed
Bug fixes 🐛
Exciting New Features 🎉
👒 Dependencies
Other Changes
New Contributors

Full Changelog: mudler/LocalAI@v4.0.0...v4.1.0

v4.0.0

Compare Source


🎉 LocalAI 4.0.0 Release! 🚀




LocalAI 4.0.0 is out!

This major release transforms LocalAI into a complete AI orchestration platform. We’ve embedded agentic and hybrid search capabilities directly into the core, completely overhauled the user interface with React for a modern experience, and are thrilled to introduce Agenthub, a brand-new community hub for easily sharing and importing agents. Alongside these massive updates, we've introduced powerful new features like Canvas mode for code artifacts, MCP apps, and full MCP client-side support.

Feature Summary
  • Agentic Orchestration & Agenthub: Native agent management with memory, skills, and the new Agenthub for community sharing.
  • Revamped React UI: Complete frontend rewrite for lightning-fast performance and modern UX.
  • Canvas Mode: Preview code blocks and artifacts side-by-side in the chat interface.
  • MCP Client-Side: Full Model Context Protocol support, MCP Apps, and tool streaming in chat.
  • WebRTC Realtime: WebRTC support for low-latency realtime audio conversations.
  • New Backends: Added experimental MLX Distributed, fish-speech, ace-step.cpp, and faster-qwen3-tts.
  • Infrastructure: Podman documentation, shell completion, and persistent data path separation.

🚀 Key Features
🤖 Native Agentic Orchestration & Agenthub

LocalAI now includes agentic capabilities embedded directly in the core. You can manage, import, start, and stop agents via the new UI.

  • 🌐 Agenthub: We are launching Agenthub! This is a centralized community space to share common agents and import them effortlessly into your LocalAI instance.
  • Agent Management: Full lifecycle management via the React UI. Create Agents, connect them to Slack, configure MCP servers and skills.
  • Skills Management: Centralized skill database for AI agents.
  • Memory: Agents can utilize memory with Hybrid search (PostgreSQL) or embedded in-memory storage (Chromem).
  • Observability: New "Events" column in the Agents list to track observables and status.
  • 📚 Documentation: Dive into the new capabilities in our official Agents documentation.
agents.mp4
🎨 Revamped UI & Canvas Mode

The Web interface has been completely migrated to React, bringing a smoother experience and powerful new capabilities:

  • Canvas Mode: Enable "canvas mode" in the chat to see code blocks and artifacts generated by the LLM in a dedicated preview bar on the right.
  • System View: Tabbed navigation separating Models and Backends for better organization.
  • Model Size Warnings: Visual warnings when model storage exceeds system RAM to prevent lockups.
  • Traces: Improved trace display using accordions for better readability.
model-fit-canvas-mode.mp4
🔌 MCP Apps & Client-Side Support

We’ve expanded support for the Model Context Protocol (MCP):

  • MCP Apps: Select which servers to enable for the chat directly from the UI.
  • Tool Streaming: Tools from MCP servers are automatically injected into the standard chat interface.
  • Client-Side Support: Full client-side integration for MCP tools and streaming.
  • Disable Option: Set the LOCALAI_DISABLE_MCP environment variable to completely disable MCP support for security.
mcp apps
🎵 New Backends, Audio & Video Enhancements
  • MLX Distributed (Experimental): We've added an experimental backend for running distributed workloads using Apple's MLX framework! Check out the docs here.
  • New Audio Backends: Introduced fish-speech, ace-step.cpp, and faster-qwen3-tts (CUDA-only).
  • WebRTC Realtime: WebRTC support added to the Realtime API and Talk page for better low-latency audio handling.
  • TTS Improvements: Added sample_rate support via post-processing and multi-voice support for Qwen TTS.
  • Video Generation: Fixed model selection dropdown sync and added vllm-omni backend detection.
🛠️ Infrastructure & Developer Experience
  • Data Separation: New --data-path CLI flag and LOCALAI_DATA_PATH env var to separate persistent data (agents, skills) from configuration.
  • Shell Completion: Dynamic completion scripts for bash, zsh, and fish.
  • Podman Support: Dedicated documentation for Podman installation and rootless configuration.
  • Gallery & Models: Model storage size display with RAM warnings, and fallback URI resolution for backend installation failures.
  • Deprecations: HuggingFace backend support removed, and AIO images dropped to focus on main images.

🐞 Fixes & Improvements
  • Logging: Fixed watchdog spamming logs when no interval was configured; downgraded health check logs to debug.
  • CUDA Detection: Improved GPU vendor checks to prevent false CUDA detection on CPU-only hosts with runtime libs.
  • Compatibility: Renamed json_verbose to verbose_json for OpenAI spec compliance (fixes Nextcloud integration).
  • Embedding: Fixed embedding dimension truncation to return full native dimensions.
  • Permissions: Changed model install file permissions to 0644 to ensure server readability.
  • Windows Docker: Added named volumes to Docker Compose files for Windows compatibility.
  • Model Reload: Models now reload automatically after editing YAML config (e.g., context_size).
  • Chat: Fixed issue where thinking/reasoning blocks were sent to the LLM.
  • Audio: Fixed img2img pipeline in diffusers backend and Qwen TTS duplicate argument error.
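The verbose_json rename above affects how clients request transcription output. The sketch below builds the non-file form fields for an OpenAI-style /v1/audio/transcriptions request and rejects the old misspelled value; the helper name is hypothetical, and the allowed values follow the OpenAI spec (the audio file itself travels as a separate multipart part).

```python
def transcription_fields(model: str,
                         response_format: str = "verbose_json") -> dict[str, str]:
    """Form fields for an OpenAI-style /v1/audio/transcriptions request.

    As of 4.0.0 the spec-compliant value is "verbose_json"; the earlier
    misspelling "json_verbose" is no longer accepted (per the notes above).
    Helper name and validation are illustrative, not LocalAI API.
    """
    allowed = {"json", "text", "srt", "verbose_json", "vtt"}  # OpenAI spec values
    if response_format not in allowed:
        raise ValueError(f"unsupported response_format: {response_format}")
    return {"model": model, "response_format": response_format}
```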
Known issues
  • The diffusers backend currently fails to build (due to CI limit exhaustion) and is not part of this release; the previous version is still available. We are looking into it, but if you want to help and know someone at GitHub who could help us get better ARM runners, please reach out!

❤️ Thank You

LocalAI is a true FOSS movement — built by contributors, powered by community.

If you believe in privacy-first AI:

  • Star the repo
  • 💬 Contribute code, docs, or feedback
  • 📣 Share with others

Your support keeps this stack alive.


✅ Full Changelog
What's Changed
Breaking Changes 🛠
Bug fixes 🐛
Exciting New Features 🎉
🧠 Models
📖 Documentation and examples
👒 Dependencies

Configuration

📅 Schedule: (in timezone America/New_York)

  • Branch creation
    • At any time (no schedule defined)
  • Automerge
    • At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever this PR becomes conflicted, or when you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

@github-actions

no HelmRelease objects found in cluster

@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 21 times, most recently from ec5ccf1 to 76d9c71 Compare March 22, 2026 03:06
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 3 times, most recently from 39d32f9 to a1267b3 Compare March 23, 2026 07:34
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 10 times, most recently from 0e3a1e0 to b5bf8c2 Compare April 2, 2026 07:36
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 2 times, most recently from df7eb08 to 2d7e0f9 Compare April 2, 2026 23:16
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.0.0 ) feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.0 ) Apr 2, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 5 times, most recently from d2a4d3b to ea2dbfa Compare April 5, 2026 01:37
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.0 ) feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.1 ) Apr 5, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 2 times, most recently from c9efcb8 to 6553c5e Compare April 6, 2026 10:27
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.1 ) feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.2 ) Apr 6, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch 2 times, most recently from 00d3ff3 to 2a4a55f Compare April 7, 2026 00:20
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.2 ) feat(container)!: Update image quay.io/go-skynet/local-ai Apr 7, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch from 2a4a55f to f509bb1 Compare April 7, 2026 01:38
@nerdz-bot nerdz-bot bot changed the title feat(container)!: Update image quay.io/go-skynet/local-ai feat(container)!: Update image quay.io/go-skynet/local-ai ( v3.12.1 ➔ v4.1.3 ) Apr 7, 2026
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch from f509bb1 to 739dd5b Compare April 8, 2026 08:30
@nerdz-bot nerdz-bot bot force-pushed the renovate/quay.io-go-skynet-local-ai-4.x branch from 739dd5b to 1ac4dab Compare April 9, 2026 08:37
