Releases: Azure/GPT-RAG
v2.6.6
Added
- Multimodal figure/image extraction for Content Understanding (Azure/GPT-RAG#446): When using Content Understanding as the document analysis backend (`USE_DOCUMENT_INTELLIGENCE=false`), the multimodal chunker now extracts figures from documents, uploads them to the `documents-images` blob container, generates captions using a vision-capable model, and populates the `relatedImages`, `imageCaptions`, and `captionVector` fields in the search index, achieving full multimodal parity with the Document Intelligence path. Supports PDF (PyMuPDF page rendering with bounding-box crop), DOCX (`word/media/` ZIP extraction), and PPTX (`ppt/media/` ZIP extraction). The `ContentUnderstandingClient` now parses and returns figure and page metadata from the API response instead of discarding it. New dependencies: `PyMuPDF`, `python-docx`, `python-pptx`.
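The DOCX path described above relies on the fact that a `.docx` file is a ZIP archive with embedded images under `word/media/`. The following is a minimal sketch of that idea using only the standard library; the function names `extract_docx_media` and `make_fake_docx` are hypothetical and are not taken from the `gpt-rag-ingestion` code base.

```python
import io
import zipfile


def extract_docx_media(docx_bytes: bytes) -> dict[str, bytes]:
    """Return {file name: raw bytes} for every file under word/media/ in a DOCX."""
    images: dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            # Keep only real files stored below word/media/ (skip directory entries).
            if name.startswith("word/media/") and not name.endswith("/"):
                images[name.rsplit("/", 1)[-1]] = archive.read(name)
    return images


def make_fake_docx() -> bytes:
    """Build a tiny ZIP that mimics the DOCX media layout (demonstration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("word/media/image1.png", b"\x89PNG-fake")
    return buffer.getvalue()
```

The same approach would apply to PPTX by changing the prefix to `ppt/media/`. Uploading to blob storage and captioning are out of scope for this sketch.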
Changed
- Bumped `gpt-rag-ingestion` to `v2.3.3`.
Tested Service Versions
The following component versions were validated together for this release:
| Component | Version |
|---|---|
| gpt-rag-ui | v2.3.1 |
| gpt-rag-orchestrator | v2.6.2 |
| gpt-rag-ingestion | v2.3.3 |
| infra (landing zone) | v1.0.7 |
v2.6.5
Fixed
- OpenTelemetry version pinning (orchestrator): Pinned `azure-monitor-opentelemetry==1.8.7`, `azure-monitor-opentelemetry-exporter==1.0.0b49`, `opentelemetry-instrumentation-httpx==0.61b0`, and `opentelemetry-instrumentation-fastapi==0.61b0` in `requirements.txt`. Unpinned versions caused non-deterministic Docker builds where an older exporter (referencing the removed `LogData` class) could be paired with `opentelemetry-sdk>=1.39.0`, crashing the container on startup with `ImportError: cannot import name 'LogData' from 'opentelemetry.sdk._logs'`. (#445)
- Permission trimming header format (orchestrator): Removed erroneous `Bearer` prefix from the `x-ms-query-source-authorization` header value in both the REST API path (`search.py`) and the SDK path (`search_context_provider.py`). Azure AI Search expects the raw OBO token without the prefix; including it caused `400 Invalid header` errors when `permissionFilterOption` was enabled on the search index. (#447)
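As an illustration of the header fix above, here is a minimal sketch of a helper that always sends the raw token. The function name `build_search_headers` is hypothetical; only the header name and the "no `Bearer` prefix" rule come from the release note.

```python
def build_search_headers(obo_token: str) -> dict[str, str]:
    """Build the permission-trimming header with the raw OBO token.

    Defensive assumption: if a caller passes "Bearer <token>", strip the
    prefix so the header carries only the token itself.
    """
    token = obo_token.strip()
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    return {"x-ms-query-source-authorization": token}
```

Stripping the prefix in one place means both a REST call and an SDK call can share the same helper, so the two paths are less likely to drift apart again.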
Changed
- Bumped `gpt-rag-orchestrator` to `v2.6.2`.
Tested Service Versions
The following component versions were validated together for this release:
| Component | Version |
|---|---|
| gpt-rag-ui | v2.3.1 |
| gpt-rag-orchestrator | v2.6.2 |
| gpt-rag-ingestion | v2.3.2 |
| infra (landing zone) | v1.0.7 |
v2.6.4
Fixed
- Restored missing `parent_id` field in the RAG search index template (`config/search/search.j2`), which was accidentally removed during the v2.6.0 merge. This caused `gpt-rag-ingestion` blob storage and SharePoint indexers to fail with `Could not find a property named 'parent_id'` errors.
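A regression like this one could be caught before deployment by checking the rendered index definition for the fields the indexers rely on. The sketch below assumes the rendered template is JSON with a top-level `fields` list of objects that each have a `name`; the helper name and the required-field set are hypothetical.

```python
import json

# Assumption: the fields the ingestion indexers need; only parent_id is named in the release note.
REQUIRED_FIELDS = frozenset({"id", "parent_id"})


def missing_index_fields(index_json: str, required=REQUIRED_FIELDS) -> set[str]:
    """Return the required field names that are absent from an index definition."""
    definition = json.loads(index_json)
    present = {field["name"] for field in definition.get("fields", [])}
    return set(required) - present
```

Running such a check in CI against the rendered template would fail fast instead of surfacing as an indexer error at runtime.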
Changed
- Updated `infra` submodule to bicep-ptn-aiml-landing-zone tag `v1.0.7`, fixing Log Analytics provisioning failure in Sweden Central caused by `forceCmkForQuery` default.
Tested Service Versions
The following component versions were validated together for this release:
| Service | Tested Version |
|---|---|
| gpt-rag-ui | v2.3.1 |
| gpt-rag-orchestrator | v2.6.1 |
| gpt-rag-ingestion | v2.3.2 |
| infra (landing zone) | v1.0.7 |
v2.6.3
Changed
- Updated `infra` submodule to bicep-ptn-aiml-landing-zone tag `v1.0.6`.
- Parametrized Container App CPU and memory per app entry with fallback defaults (`0.5` CPU / `1.0Gi`).
- Increased `dataingest` Container App resources to `1.0` CPU and `3.0Gi` memory.
- Increased `text-embedding-3-large` deployment capacity from `40` to `100`.
- Bumped `gpt-rag-ingestion` from `v2.2.5` to `v2.3.2`.
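The per-app resource parametrization with fallback defaults can be pictured as follows. This is a Python sketch of the logic only; the actual implementation lives in the infrastructure templates, and the `cpu`, `memory`, and `name` keys are assumed names for illustration.

```python
# Fallback defaults stated in the release note.
DEFAULT_CPU = "0.5"
DEFAULT_MEMORY = "1.0Gi"


def resolve_resources(app_entry: dict) -> dict:
    """Return the CPU/memory for one app entry, falling back to the defaults."""
    return {
        "cpu": app_entry.get("cpu", DEFAULT_CPU),
        "memory": app_entry.get("memory", DEFAULT_MEMORY),
    }


# Hypothetical app entries: dataingest overrides both values, the other app uses the defaults.
apps = [
    {"name": "dataingest", "cpu": "1.0", "memory": "3.0Gi"},
    {"name": "frontend"},
]
resolved = {app["name"]: resolve_resources(app) for app in apps}
```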
What's New in gpt-rag-ingestion (v2.3.0 → v2.3.2)
v2.3.0 — Admin Dashboard, Content Understanding & Retry Tracking
- Admin dashboard: React frontend at `/dashboard` with paginated job/file tables, search, filters, and unblock action.
- Content Understanding integration: New default document analysis path using Azure AI Foundry `prebuilt-layout` instead of Document Intelligence (~69% cost reduction per page).
- Per-file retry tracking: Files exceeding `MAX_FILE_PROCESSING_ATTEMPTS` (default 3) are automatically blocked. Applies to blob storage and SharePoint indexers.
- Scheduled log cleanup: Automatic old run-summary blob cleanup via APScheduler.
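The per-file retry tracking above might look roughly like this. It is a sketch under stated assumptions: state is kept in a plain dictionary, "exceeding" is read as "more than the maximum number of attempts", and `record_attempt` is a hypothetical name.

```python
# Default taken from the release note; the real service reads it from configuration.
MAX_FILE_PROCESSING_ATTEMPTS = 3


def record_attempt(state: dict, file_key: str,
                   max_attempts: int = MAX_FILE_PROCESSING_ATTEMPTS) -> bool:
    """Count one processing attempt; return True if the file may be processed."""
    entry = state.setdefault(file_key, {"attempts": 0, "blocked": False})
    if entry["blocked"]:
        return False  # Stays blocked until someone unblocks it explicitly.
    entry["attempts"] += 1
    if entry["attempts"] > max_attempts:
        entry["blocked"] = True
        return False
    return True


def unblock(state: dict, file_key: str) -> None:
    """Reset the counter, mirroring the dashboard's unblock action."""
    state[file_key] = {"attempts": 0, "blocked": False}
```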
v2.3.1 — Processing Insights, Large PDF Handling & Memory Safety
- Processing timings breakdown: Per-phase timing data (download, analysis, chunking + embeddings, index upload) displayed as a stacked color bar in the dashboard.
- Per-file cost estimation: Cost broken down by service (analysis per page, embeddings per token, completions per token) with configurable unit prices.
- Automatic PDF splitting: PDFs exceeding the analysis service page limit (default 300) are split automatically, preventing `InputPageCountExceeded` errors.
- Memory guard: Checks file size against available container memory before download, skipping oversized files instead of risking OOM crashes.
- Temp file download: PDFs >10 MB downloaded to disk instead of memory, keeping peak usage bounded.
- Fixed dashboard unresponsive during large file processing (async event loop was blocked).
- Fixed stale error field retained on successful re-processing.
- Fixed `_as_datetime` NameError crashing indexer runs.
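To make the automatic PDF splitting concrete, the page-range arithmetic could be sketched as below. The function name `split_page_ranges` is hypothetical, ranges are 1-based and inclusive, and the actual splitting of the PDF bytes is not shown.

```python
def split_page_ranges(total_pages: int, page_limit: int = 300) -> list[tuple[int, int]]:
    """Split 1..total_pages into consecutive inclusive ranges of at most page_limit pages."""
    if page_limit < 1:
        raise ValueError("page_limit must be at least 1")
    return [
        (start, min(start + page_limit - 1, total_pages))
        for start in range(1, total_pages + 1, page_limit)
    ]


# A 750-page PDF with the default limit of 300 yields three chunks.
print(split_page_ranges(750))  # [(1, 300), (301, 600), (601, 750)]
```

Each range can then be sent to the analysis service as a separate request, so no single call exceeds the page limit.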
v2.3.2 — Stability Fixes
- Default `INDEXER_MAX_CONCURRENCY` lowered to 2 (reduced memory pressure and rate-limit contention).
- Fixed stale running jobs stuck forever after container crash/restart (auto-detected after 2 hours).
- Fixed dashboard retries column showing inflated count.
- Fixed cost estimate displayed with excessive decimal places.
- Fixed 429 rate-limit display issues.
Tested Service Versions
The following component versions were validated together for this release:
| Component | Version |
|---|---|
| gpt-rag-ui | v2.3.1 |
| gpt-rag-orchestrator | v2.6.1 |
| gpt-rag-ingestion | v2.3.2 |
| infra (landing zone) | v1.0.6 |
v2.6.2
v2.6.1
Fixed
- Fixed Zero Trust provisioning failure caused by jumpbox Custom Script Extension using incorrect release tag. Replaced `install_script` URL field with `ailz_tag` in `manifest.json`, allowing the install script URL and release parameter to be derived from the landing zone tag.
Changed
- Updated `infra` submodule to bicep-ptn-aiml-landing-zone tag `v1.0.5`.
- Bumped `gpt-rag-ui` to `v2.3.1`.
- Bumped `gpt-rag-ingestion` to `v2.2.5`. (Fixed #436)
Tested Service Versions
The following service versions were validated together for this release:
| Service | Tested Version |
|---|---|
| gpt-rag-ui | v2.3.1 |
| gpt-rag-orchestrator | v2.6.0 |
| gpt-rag-ingestion | v2.2.5 |
v2.6.0
What's Changed
- Update component versions (orchestrator v2.6.0, UI v2.3.0, ingestion v2.2.4), bump infra submodule to v1.0.4
- Add explicit partition keys to Cosmos DB container definitions (conversations uses /principal_id)
- Add conversation-documents storage container and conversationId field to search index
- Remove standalone MCP Container App from default deployment
Tested Service Versions
The following service versions were validated together for this release:
| Service | Tested Version |
|---|---|
| gpt-rag-ui | v2.3.0 |
| gpt-rag-orchestrator | v2.6.0 |
| gpt-rag-ingestion | v2.2.4 |
Full Changelog: v2.5.3...v2.6.0
v2.5.3
What's Changed
- chore: update default chat model to gpt-5-nano, bump infra submodule to v1.0.3, update component versions, add copilot-instructions
Tested Service Versions
The following service versions were validated together for this release:
| Service | Tested Version |
|---|---|
| gpt-rag-ui | v2.2.3 |
| gpt-rag-mcp | v0.3.5 |
| gpt-rag-orchestrator | v2.5.0 |
| gpt-rag-ingestion | v2.2.3 |
Full Changelog: v2.5.2...v2.5.3
v2.5.2
What's Changed
- fix: make postProvision.sh venv cleanup non-fatal — closes #426 by @sihbher in #427
- feat: skip clone when component repo already exists (#428) by @sihbher in #429
Tested Service Versions
The following service versions were validated together for this release:
| Service | Tested Version |
|---|---|
| gpt-rag-ui | v2.2.2 |
| gpt-rag-mcp | v0.3.5 |
| gpt-rag-orchestrator | v2.4.2 |
| gpt-rag-ingestion | v2.2.3 |
Full Changelog: v2.5.1...v2.5.2
v2.5.1
Changed
- Updated `infra` submodule to external bicep-ptn-aiml-landing-zone tag `v1.0.1`.
- Improved runtime performance by upgrading the Orchestrator and UI components to `v2.4.2` and `v2.2.2`, respectively:
  - Bumped `gpt-rag-orchestrator` to `v2.4.2`.
  - Bumped `gpt-rag-ui` to `v2.2.2`.
- Bumped `gpt-rag-ingestion` to `v2.2.3`.
Tested Service Versions
The following service versions were validated together for this release:
| Service | Tested Version |
|---|---|
| gpt-rag-ui | v2.2.2 |
| gpt-rag-mcp | v0.3.5 |
| gpt-rag-orchestrator | v2.4.2 |
| gpt-rag-ingestion | v2.2.3 |
Full Changelog: v2.5.0...v2.5.1