Update dependency com.microsoft.onnxruntime:onnxruntime to v1.25.1 by solrbot · Pull Request #3790 · apache/solr

solrbot · 2025-10-17T16:38:36Z

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| com.microsoft.onnxruntime:onnxruntime (source) | dependencies | minor | `1.24.3` -> `1.25.1` |
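
The "minor" label in the Update column follows from which version component changes between `1.24.3` and `1.25.1`. A minimal sketch of that classification, assuming plain `MAJOR.MINOR.PATCH` version strings (the `UpdateType` class and its methods are hypothetical names for illustration, not Renovate's implementation):

```java
import java.util.Arrays;

public final class UpdateType {
    /** Parses "MAJOR.MINOR.PATCH" (optionally prefixed with 'v') into three ints. */
    static int[] parse(String version) {
        String v = version.startsWith("v") ? version.substring(1) : version;
        String[] parts = v.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected MAJOR.MINOR.PATCH: " + version);
        }
        return Arrays.stream(parts).mapToInt(Integer::parseInt).toArray();
    }

    /** Names the first component that differs: "major", "minor", "patch", or "none". */
    static String classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return "major";
        if (a[1] != b[1]) return "minor";
        if (a[2] != b[2]) return "patch";
        return "none";
    }

    public static void main(String[] args) {
        // The update in this PR: the second component changes, so it is a minor update.
        System.out.println(classify("1.24.3", "1.25.1"));
    }
}
```

Pre-release suffixes and build metadata are deliberately not handled here.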

Release Notes

microsoft/onnxruntime (com.microsoft.onnxruntime:onnxruntime)

`v1.25.1`: ONNX Runtime v1.25.1

n.b. This changelog is LLM generated. Only the contributor listing has been verified.

ONNX Runtime Release 1.25.1

📢 Announcements & Breaking Changes

ONNX Op Updates

Enhanced ONNX operator support with new opset versions: Reshape (opset 25), Transpose (opset 24) (#27752)

✨ New Features

📊 New ONNX Ops & Model Support

LinearAttention and CausalConvState operators for Qwen3.5 model support (#27907)
RotaryEmbedding (RotEMB) and RMSNorm operators added (#27752)
Linear Attention signature support (#27842)

🌐 Web & JavaScript

WebGPU EP

Qwen3.5 model support on WebGPU execution provider (#27996)
QMoE 1-token decode path optimization — fused operations to reduce GPU dispatches for improved performance (#27998)

🐛 Bug Fixes

Core Runtime Fixes

Improved filesystem error messages during Linux device discovery for better debugging experience (#27289)
Fixed missing include for SetRawDataInTensorProto in NVIDIA TensorRT RTX tests (#28065)

🙏 Contributors

Thanks to our 7 contributors for this release:
@guschmue, @sanaa-hamel-microsoft, @apsonawane, @eserscor, @ishwar-raut1, @qjia7, @theHamsta

Full Changelog: microsoft/onnxruntime@v1.25.0...v1.25.1

`v1.25.0`: ONNX Runtime v1.25.0

📢 Announcements & Breaking Changes

Build & Platform

C++20 is now required to build ONNX Runtime from source. Minimum toolchains: MSVC 19.29+, GCC 10+, Clang 10+. Users of prebuilt packages are unaffected. (#27178)
CUDA minimum version raised to 12.0 — CUDA 11.x is no longer supported. Users pinned to CUDA 11.x should stay on ORT 1.24.x or upgrade their CUDA toolkit/driver. (#27570)
ONNX upgraded to 1.21.0 (#27601)
sympy is now an optional dependency for Python builds. (#27200)

Execution Provider Changes

ArmNN EP has been removed. Users should remove any --use_armnn build flags and migrate to the MLAS/KleidiAI-backed CPU EP or QNN EP for Qualcomm hardware. (#27447)

API Version

ORT_API_VERSION updated to 25. (#27280)

🔒 Security Fixes

Fixed potential integer truncation leading to heap out-of-bounds read/write (#27544)
Addressed Pad Reflect vulnerability (#27652)
Security fix for transpose optimizer (#27555)
Upgraded minimatch 3.1.2 → 3.1.4 for CVE-2026-27904 (#27667)
Hardened shell command handling for constant strings (#27840)
Added validation of onnx::TensorProto data size before allocation (#27547)
Cleaned up external data path validation (#27539)
Fixed misaligned address reads for tensor attributes from raw data buffers (#27312)
Fixed CPU Attention overflow issue (#27822)
Fixed CPU LRN integer overflow issues (#27886)
Additional input validation hardening:
- Tile kernel dim overflow (#27566)
- Out-of-bounds read in cross entropy (#27568)
- TreeEnsembleClassifier attributes (#27571)
- AffineGrid (#27572)
- EmbedLayerNorm position_ids (#27573)
- RotaryEmbedding position_ids (#27597)
- RoiAlign batch_indices (#27603)
- MaxUnpool indices (#27432)
- QMoECPU swiglu OOB (#27748)
- SVMClassifier initializer (#27699)
- Col2Im SafeInt (#27625)
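
Several of the fixes above concern size or index arithmetic that overflows before a bounds check or allocation. The general pattern can be sketched in Java as follows; this is an illustration of overflow-checked size validation only, not ONNX Runtime's C++ code, and the `TensorSizeCheck` class and its methods are hypothetical:

```java
public final class TensorSizeCheck {
    /**
     * Returns the byte size implied by a shape. Throws IllegalArgumentException
     * for a negative dimension and ArithmeticException if the product overflows a long.
     */
    static long checkedByteSize(long[] dims, int bytesPerElement) {
        long count = 1L;
        for (long d : dims) {
            if (d < 0) {
                throw new IllegalArgumentException("negative dimension: " + d);
            }
            count = Math.multiplyExact(count, d); // throws instead of silently wrapping
        }
        return Math.multiplyExact(count, (long) bytesPerElement);
    }

    /** Checks that a declared payload length matches the shape before anything is allocated. */
    static void validate(long[] dims, int bytesPerElement, long declaredBytes) {
        long expected = checkedByteSize(dims, bytesPerElement);
        if (expected != declaredBytes) {
            throw new IllegalArgumentException(
                "declared " + declaredBytes + " bytes, shape implies " + expected);
        }
    }

    public static void main(String[] args) {
        // 2 x 3 x 4 elements of 4 bytes each.
        System.out.println(checkedByteSize(new long[] {2, 3, 4}, 4));
    }
}
```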

✨ New Features

🔌 Execution Provider Plugin API & CUDA Plugin EP

ORT 1.25.0 introduces the CUDA Plugin EP — the first core implementation that enables third-party CUDA-backed EPs to be delivered as dynamically loaded plugins without rebuilding ORT.

CUDA Plugin EP: Core implementation (#27816)
CUDA Plugin EP: BFC-style arena and CUDA mempool allocators for stream-aware memory management (#27931)
Plugin EP Sync API for synchronous execution (#27538)
Plugin EP event profiling APIs (#27649)
Plugin EP APIs to retrieve ONNX operator schemas (#27713)
Annotation-based graph partitioning with resource accounting (#27595, #27972)
EP API adapter improvements: header-only adapter, OpKernelInfo::GetConfigOptions, LoggingManager::HasDefaultLogger() (#26879, #26919, #27540, #27541, #27587)
WebGPU EP made compatible with EP API (#26907)

🔧 Core APIs

Per-session thread pool work callbacks API (#27253)
enable_profiling in RunOptions (#26846)
KernelInfo string-array attribute APIs for C and C++ (#27599)
OrtModel input support for Compile API (#27332)
Session config to create weightless EPContext models during compilation (#27197)
Compiled model compatibility APIs in example plugin EP (#27088)
Model Package support (preview): Initial infrastructure for automatically selecting compiled EPContext model variants from a packaged collection based on EP, device, and hardware constraints. The directory structure is not yet finalized. (#27786)

📊 New ONNX Ops & Opset Coverage

Attention opset 23 on CUDA with GQA, boolean masks, softcap, and softmax precision (#26466, #27030, #27082, #27428, #27714)
Attention opset 24 on CUDA, disjoint from contrib op (#27542); nonpad KV seqlen on CPU (#27384)
TensorScatter-24 for CPU and CUDA (#27389, #27446)
DeformConv for CPU/CUDA (#27393)
LpNormalization-22 (#27164)
CUDA opset gap fills:
- Control flow & misc: Flatten, Identity, If, Loop, Scan, ConstantOfShape, Size (opset 21/23) (#27728)
- Pooling: GlobalAveragePool/GlobalMaxPool (→22) (#27733)
- Shape ops: Shape (→25), Squeeze/Unsqueeze (→25) (#27734, #27739)
- TopK (→24, BF16) (#27735), GRU (→22) (#27738)
- Pad (→25, wrap mode) (#27774), Resize v19 (#27415), RoiAlign v16/v22 (#27646)

🖥️ Execution Provider Updates

NVIDIA CUDA EP

GQA with XQA and quantized KV cache, including FP8 (E4M3) KV cache support (#27246, #27321)
CUDA graph capture compatibility for LLM ops and pre-compiled paths (#27484, #27477)
Volumetric (3-D) GridSample support (#27201)
Optimized 3D nearest resize kernel for 5D tensors (#27578)
Optional router_weights input to QMoE (#27687)

NVIDIA TensorRT RTX EP

D3D12 external resource import support (#26948)

Qualcomm QNN EP

Disabled file mapping for embedded cache (#27627)
Fixed use-after-free of logger object (#27804)
Fixed wheel build issues on WSL and Linux SDK version propagation (#27730, #27800)

Other EPs

VitisAI EP: Added PE version info to provider DLL (#27626)
DML EP: Fixed overflow in DmlGraphFusionHelper::ProcessInputData (#27815), fixed new-delete mismatch in QuantizeLinear (#27823)

🌐 Web & JavaScript

WebGPU EP — Performance

Gemm/MatMul optimization using subgroup features (#26433)
MatMulNBits: 2-bit zero-point support (#27285, #27325), higher K-parallelism (#27834), DP4A SmallM tiling (#27910)
Flash Attention: head_sink support (#27410), configurable multi rotary cache concat offset (#27434)
Optimized 4D Transpose (#26942), string stream optimization (#27223)

WebGPU EP — New Op Support

Added TopK (#27560), Softplus (#27457), Identity (#27067)
Added Conv3D support (#27917), LpNorm support (#27876)
int64/bool support for Range, Expand, Flatten, Gather, Unsqueeze (#26673, #27478, #27561)
DequantizeLinear fixes (#27706), Einsum 5D tensor fixes (#27779)

WebGPU EP — Stability

Fixed device destroyed on session release breaking recreation (#27634)
Fixed static destruction crash on exit (#27470, #27569)
Backward compat: Legacy WebGPU/WebNN memory info names are now accepted again (#27637)
Deterministic Split-K handling (#27086), buffer segment alignment fix (#27853)
Binary size reduction for WebAssembly builds (#27370, #27371)

WebNN EP

Broader GQA support and improved MultiHeadAttention (#27234, #27494)
Added DepthToSpace support (#27508)

Node.js & React Native

Fixed float16 tensor support in Node.js and React Native (#27327, #27549)
Added 16KB page size alignment for Android (required for Android 15+) (#27523)

🧠 CPU & Core Optimizations

MLAS / KleidiAI / Quantization

KleidiAI BF16 SME2 kernel integration (#26773), asymmetric 4-bit MatMulNBits on ARM64 (#27751)
Fused Silu and Gelu kernels for AVX512 (#27690)
Depthwise conv kernel for NCHW on AVX512 (#27874)
ARM64 NCHWc NEON asm kernels (#27099, #27788), BF16 KAI SBGemm on NCHWc ARM (#27703)
POWER10 Sgemm PackA optimization (#27575)
Improved pre-packing for 2-bit LUT kernels (#27131)
Backend kernel selector config in MLAS, allowing users to opt out of KleidiAI kernels on ARM platforms (#27136)

DQ→MatMulNBits Fusion

Extended to cover significantly more quantized LLM inference scenarios on CPU:

2-bit and 8-bit weights with Cast(fp16→fp32) patterns (#27614)
FP16 models on CPU EP (#27640), fp16 8-bit on ARM64 (#27692)
Gemm + per-tensor/per-channel quantization (#27769)
FP16 quantized weight compatibility: models with HQNBIT quantized weights now route through the FP32 MLAS path for broader CPU compatibility (#27820)

Model Optimizer & Fusions

Qwen3 model type support and RotaryEmbedding fusion for Qwen3 RoPE patterns (#27556, #27590)
MobileClip attention fusion for both attention block patterns (#27883)
Nemotron speech conformer encoder MHA fusion (#27764)
Fixed GPT-2 no-past attention fusion for transformers ≥ 4.27 (#27449)
Fixed BART attention fusion for SDPA pattern from transformers ≥ 4.49 (#27458)
Pre-layer normalization support in attention fusion (#27418)
SkipLayerNorm fusion with bias Add (#27765), broadcasting skip shapes (#27489)
SpaceToDepth fusion pattern (#27747)
NCHWc transformer: more patterns and ONNX-domain Gelu/HardSigmoid activations (#27691, #27821)
Optimized qMoE code path for single-token execution (#27383)
ONNX Attention KV cache optimization with ConcatNewToPast (#27613)

🔌 Language Bindings

Python

Exposed OrtDeviceVendorId enum for vendor-aware OrtDevice aliases (#27594)
Added bindings for GetCompatibilityInfoFromModel / GetCompatibilityInfoFromModelBytes (#27565)
Fixed OrtValue.from_dlpack rejecting zero-size tensors as non-contiguous (#27451)

C#

Added bindings for GetCompatibilityInfoFromModel / GetCompatibilityInfoFromModelBytes (#27565)

Java

Avoid provider resource extraction when library already exists in onnxruntime.native.path (#27668)
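
For a Java consumer such as Solr, the Java change applies when the `onnxruntime.native.path` system property points at a directory that already holds the native libraries. A minimal sketch of setting that property, assuming it is set before ONNX Runtime's native libraries are first loaded; the `NativePathConfig` class and the directory shown are hypothetical:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public final class NativePathConfig {
    static final String NATIVE_PATH_PROPERTY = "onnxruntime.native.path";

    /** Sets the property only when the directory exists; returns whether it was set. */
    static boolean configure(Path nativeDir) {
        if (!Files.isDirectory(nativeDir)) {
            return false; // leave the default behaviour in place
        }
        System.setProperty(NATIVE_PATH_PROPERTY, nativeDir.toAbsolutePath().toString());
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical location; pass the real directory as the first argument.
        Path dir = Path.of(args.length > 0 ? args[0] : "/opt/onnxruntime/native");
        boolean set = configure(dir);
        System.out.println(set
            ? "Using native libraries from " + System.getProperty(NATIVE_PATH_PROPERTY)
            : "Directory not found; property left unset");
        // Any ONNX Runtime usage (e.g. creating the environment) would follow here.
    }
}
```

The same property can also be passed on the command line as `-Donnxruntime.native.path=<dir>`.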

🐛 Bug Fixes

Critical Fixes

Fixed CPU Attention overflow issue (#27822)
Fixed CPU LRN integer overflow issues (#27886)
Fixed incorrect pad indices in AveragePool count_include_pad computation — silent correctness issue (#27375)
Fixed integer division/modulo by zero in CPU EP Div and Mod operators (#27693, #27833)
Fixed non-ASCII Unicode model path crash (#27724)
Fixed arithmetic overflow in Det operator (#27070)
Fixed narrow-to-wide string conversion bugs in DLL load error reporting (#27777)

Operator & Graph Fixes

Fixed 3D attention mask broadcasting in MHA (#27464)
Fixed GQA shape inference for present outputs (#27250)
Fixed Einsum bugs for reduction and empty input cases (#27225, #27226)
Prevented cross-EP Cast fusion in RemoveDuplicateCastTransformer (#27363)
Fixed ConvTranspose bias input validation on CPU/CUDA (#27209)
Fixed Cast node naming collisions in float16 conversion (#27469)
Fixed concat/slice elimination and unsqueeze elimination against optional attrs and invalid models (#27638)
Improved EPContext error message when node is not assigned to an EP (#27474)

EP-Specific Fixes

Fixed MiGraphX EP double allocation (#27551)
Fixed MLAS qgemm dispatch and kernel regressions in quantized conv tests (#27671)
Fixed run-level profiling for subgraph operators (#27870)
Fixed --build_wasm_static_lib implicitly enabling --build_wasm (#27342)

🙏 Contributors

Thanks to our 72 contributors for this release!

@tianleiwu, @fs-eire, @edgchen1, @titaiwangms, @hariharans29, @eserscor, @Rishi-Dave, @guschmue, @adrianlizarraga, @jambayk, @qjia7, @skottmckay, @adrastogi, @sanaa-hamel-microsoft, @yuslepukhin, @ingyukoh, @Jiawei-Shao, @vraspar, @xhcao, @chilo-ms, @Honry, @JonathanC-ARM, @kunal-vaishnavi, @ShirasawaSama, @chaya2350, @derdeljan-msft, @gedoensmax, @HectorSVC, @milpuz01, @quic-calvnguy, @xenova, @akholodnamdcom, @AlekseiNikiforovIBM, @amd-genmingz, @ashrit-ms, @bachelor-dou, @BODAPATIMAHESH, @Colm-in-Arm, @daijh, @dodokw, @fanchenkong1, @ivarusic-amd, @JanSellner, @jchen10, @jiafatom, @jnagi-intel, @johannes-rehm-snkeos, @justinchuby, @keshavv27, @Kevin-Taha, @kevinlam92, @kpkbandi, @Laan33, @melkap01-Arm, @miaobin, @n-v-k, @nico-martin, @patryk-kaiser-ARM, @praneshgo, @prathikr, @qc-tbhardwa, @sagarbhure-msft, @sdotpeng, @the0cp, @TsofnatMaman, @umangb-09, @walidbr, @wenqinI, @xadupre, @xhan65, @xiaofeihan1

Full Changelog: v1.24.4...v1.25.0

Configuration

📅 Schedule: Branch creation - Only on Sunday ( * * * * 0 ) (UTC), Automerge - At any time (no schedule defined).
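
The schedule expression `* * * * 0` uses the standard five cron fields (minute, hour, day of month, month, day of week); only the last field is constrained, and `0` means Sunday. A small sketch of the equivalent check in UTC (the `SundayWindow` class is a hypothetical helper, not part of Renovate):

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.ZoneOffset;

public final class SundayWindow {
    /** True when the instant falls on a Sunday in UTC, i.e. matches "* * * * 0". */
    static boolean isInWindow(Instant now) {
        return now.atZone(ZoneOffset.UTC).getDayOfWeek() == DayOfWeek.SUNDAY;
    }

    public static void main(String[] args) {
        System.out.println(isInWindow(Instant.now()));
    }
}
```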

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot

solrbot added the exempt-stale label (Prevent a PR from going stale) Oct 17, 2025

github-actions bot added the dependencies (Dependency upgrades), tool:build, and module:sql labels Oct 17, 2025

solrbot force-pushed the renovate/onnx branch 3 times, most recently from ed07788 to 4c47e95 on October 23, 2025 02:05

solrbot force-pushed the renovate/onnx branch from 4c47e95 to f9f69d8 on October 30, 2025 04:36

solrbot changed the title ~~Update dependency com.microsoft.onnxruntime:onnxruntime to v1.23.1~~ Update dependency com.microsoft.onnxruntime:onnxruntime to v1.23.2 Oct 30, 2025

solrbot force-pushed the renovate/onnx branch from f9f69d8 to 71a8ca9 on November 8, 2025 02:05

solrbot force-pushed the renovate/onnx branch 3 times, most recently from d77242d to b4ec439 on December 18, 2025 20:38

solrbot force-pushed the renovate/onnx branch from b335648 to a7abd3e on January 3, 2026 02:13

solrbot force-pushed the renovate/onnx branch from bb257fd to 84ecba0 on February 10, 2026 20:56

solrbot changed the title ~~Update dependency com.microsoft.onnxruntime:onnxruntime to v1.23.2~~ Update dependency com.microsoft.onnxruntime:onnxruntime to v1.24.1 Feb 10, 2026

solrbot force-pushed the renovate/onnx branch 2 times, most recently from 6e75ae5 to d588b69 on February 24, 2026 20:50

solrbot changed the title ~~Update dependency com.microsoft.onnxruntime:onnxruntime to v1.24.1~~ Update dependency com.microsoft.onnxruntime:onnxruntime to v1.24.2 Feb 24, 2026

solrbot force-pushed the renovate/onnx branch from 291e4f1 to 5d3d784 on March 11, 2026 05:23

solrbot force-pushed the renovate/onnx branch from 98644e4 to f9c3dbb on April 20, 2026 23:58

solrbot changed the title ~~Update dependency com.microsoft.onnxruntime:onnxruntime to v1.24.2~~ Update dependency com.microsoft.onnxruntime:onnxruntime to v1.24.3 Apr 20, 2026

solrbot force-pushed the renovate/onnx branch from 9e24d44 to 849a513 on April 23, 2026 02:11

solrbot changed the title ~~Update dependency com.microsoft.onnxruntime:onnxruntime to v1.24.3~~ Update dependency com.microsoft.onnxruntime:onnxruntime to v1.25.0 Apr 23, 2026

solrbot force-pushed the renovate/onnx branch 3 times, most recently from 3278039 to 8d883c9 on April 28, 2026 11:00

solrbot force-pushed the renovate/onnx branch from 59ed14e to 177383b on May 3, 2026 16:55

solrbot changed the title ~~Update dependency com.microsoft.onnxruntime:onnxruntime to v1.25.0~~ Update dependency com.microsoft.onnxruntime:onnxruntime to v1.25.1 May 3, 2026

Update dependency com.microsoft.onnxruntime:onnxruntime to v1.25.1

45be597

solrbot force-pushed the renovate/onnx branch from 8afacda to 45be597 on May 4, 2026 03:02

Update changelog entry

14b08ba

solrbot wants to merge 2 commits into apache:main from solrbot:renovate/onnx
