From cfbb1c08611fa646f9104fde82f523bfbcfb985f Mon Sep 17 00:00:00 2001
From: Sergey Yedrikov <syedriko@redhat.com>
Date: Sun, 22 Mar 2026 13:03:45 -0400
Subject: [PATCH] LCORE-1539: Add agentic hermeto simulation to the rag-content
 repo.

---
 .gitignore                                 |   7 ++
 AGENTS.md                                  | 117 +++++++++++++++++++++
 Makefile                                   |  14 +++
 scripts/gen_containerfile_hermetic_sim.sh  |  42 ++++++++
 scripts/hermeto/prefetch-konflux-cpu.json  |  34 ++++++
 scripts/hermeto/prefetch-konflux-cuda.json |  33 ++++++
 scripts/hermeto/prefetch-pip-cpu.json      |  21 ++++
 scripts/hermeto/prefetch-pip-cuda.json     |  21 ++++
 scripts/run_hermeto_fetch_deps.sh          |  81 ++++++++++++++
 scripts/simulate_hermetic_build.sh         |  80 ++++++++++++++
 scripts/stage_hermetic_build_context.sh    |  94 +++++++++++++++++
 11 files changed, 544 insertions(+)
 create mode 100644 AGENTS.md
 create mode 100755 scripts/gen_containerfile_hermetic_sim.sh
 create mode 100644 scripts/hermeto/prefetch-konflux-cpu.json
 create mode 100644 scripts/hermeto/prefetch-konflux-cuda.json
 create mode 100644 scripts/hermeto/prefetch-pip-cpu.json
 create mode 100644 scripts/hermeto/prefetch-pip-cuda.json
 create mode 100755 scripts/run_hermeto_fetch_deps.sh
 create mode 100755 scripts/simulate_hermetic_build.sh
 create mode 100755 scripts/stage_hermetic_build_context.sh
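
Reviewer note (ignored by `git am`, not part of the change): AGENTS.md in this patch says Hermeto uses the intersection of requirements-file and PyPI-reported checksums and drops any wheel whose digest falls outside it. The sketch below is only a rough model of that rule, assuming one flat set of hashes per source. `filter_wheels`, the digests, and the torchvision filename are hypothetical; none of this is Hermeto's code.

```python
# Illustrative model of the documented filtering rule; not Hermeto's actual code.
from typing import Iterable, List, Set, Tuple

Wheel = Tuple[str, str]  # (filename, "sha256:<digest>")


def filter_wheels(candidates: Iterable[Wheel],
                  lockfile_hashes: Set[str],
                  pypi_hashes: Set[str]) -> List[str]:
    """Keep only wheels whose digest appears in both hash sets."""
    allowed = lockfile_hashes & pypi_hashes
    return [name for name, digest in candidates if digest in allowed]


# Hypothetical, shortened digests for readability.
rebuild = ("torch-2.9.0-7-cp312-cp312-linux_x86_64.whl", "sha256:aaa")
pypi_wheel = ("torchvision-example-manylinux_x86_64.whl", "sha256:bbb")

lockfile = {"sha256:aaa", "sha256:bbb"}  # hashes pinned in the requirements file
pypi = {"sha256:bbb", "sha256:ccc"}      # hashes PyPI reports for its own builds

kept = filter_wheels([rebuild, pypi_wheel], lockfile, pypi)
print(kept)
```

In this model the RHOAI rebuild is filtered out because PyPI never reports its digest. When every candidate for a package is filtered this way, the result corresponds to the `PackageRejected: No wheels found` case that the direct-URL plus `--hash` pins are meant to avoid.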
diff --git a/.gitignore b/.gitignore
index 87ad2ac5..a7307529 100644
--- a/.gitignore
+++ b/.gitignore
@@ -173,3 +173,10 @@ cython_debug/
 
 # PyPI configuration file
 .pypirc
+
+# Local Hermeto prefetch (scripts/run_hermeto_fetch_deps.sh)
+.hermeto-output/
+.hermeto-output-verify-cpu/
+.hermeto-output-verify-cuda/
+# Local hermetic build simulation (scripts/simulate_hermetic_build.sh)
+.hermetic-staging/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000..0a3f70a3
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,117 @@
+# Agent notes: Konflux, Hermeto, and hermetic Python lockfiles
+
+This file captures lessons from debugging **prefetch-dependencies / Hermeto** failures (checksum mismatch, “No wheels found”, `pybuild-deps` errors) and aligning **CPU** and **CUDA** requirement generators. Use it when changing `pyproject.toml`, RHOAI indices, or `.tekton` prefetch inputs.
+
+## Hermeto behavior that breaks naive lockfiles
+
+- **Konflux invocation** (shape to mirror locally):  
+  `hermeto --log-level debug --mode strict fetch-deps '<json>' --sbom-output-type spdx --source <repo> --output <dir>`  
+  The pip slice of `<json>` matches `.tekton` `prefetch-input` (see `scripts/hermeto/*.json` and `scripts/run_hermeto_fetch_deps.sh`).
+
+- **PyPI intersection for wheels**  
+  For many binary packages, Hermeto logs lines like:  
+  `using intersection of requirements-file and PyPI-reported checksums`  
+  It then **drops** any wheel whose digest is not in that intersection.
+
+- **Implication for RHOAI**  
+  If a line uses **`package==version`** with **`--index-url`** pointing at RHOAI but the **actual wheel file** is a RHOAI rebuild (e.g. `torch-2.9.0-7-cp312-cp312-linux_x86_64.whl`), PyPI usually **does not** expose the same filename/digest. **Every** candidate can be filtered → `PackageRejected: No wheels found`.
+
+- **What fixes it**  
+  - Prefer **`name @ https://…/exact-file.whl`** plus a **single `--hash=sha256:…`** for bytes you control (pulp or `files.pythonhosted.org`), so Hermeto fetches by URL and verifies the hash **without** requiring a PyPI match for that RHOAI rebuild name.  
+  - For **CPU multi-arch**, the prefetch lists **both** arch-specific requirement fragments; **`Containerfile`** installs **one** fragment according to `TARGETARCH` (see below).
+
+- **Local Hermeto**  
+  **`full-cpu` / `full-cuda`** may fail without **RHSM** client cert paths—Konflux injects `options.ssl` on the **rpm** input; laptops typically do not have `/etc/pki/entitlement/…`. Prefer the **pip slice** for routine validation (see below).
+
+## Local Hermeto validation (pip slice)
+
+**CPU green is not CUDA green.** Konflux runs **separate** prefetch inputs for CPU vs CUDA images. The committed files differ (`requirements.hashes.*` vs `requirements.hashes.*.cuda*`, **`requirements.overrides.txt`** vs **`requirements.overrides.cuda.txt`**), and the generator can put the same package in **wheel** lists for one pipeline and **source** for the other. Validating only **`pip-cpu`** misses failures that appear only in **`pip-cuda`** (and the reverse is possible in principle).
+
+**Hermeto runs `cargo vendor --locked` on extracted Python sdists** that contain Rust (e.g. under `deps/pip/<name>-<ver>/`). That step is **independent of whether you build a CUDA image**—it triggers when the **pip prefetch** delivers an sdist with a `Cargo.toml`/`Cargo.lock` mismatch (example: historical **`jiter` 0.12.x** → `PackageWithCorruptLockfileRejected`). A package resolved as a **manylinux wheel** in CPU can still be an **sdist** on the CUDA requirement split if you do not run **`pip-cuda`**.
+
+**What to run before pushing prefetch or hashed-requirement changes**
+
+| Change touched | Run |
+|----------------|-----|
+| CPU only (`konflux_requirements.sh`, `requirements.hashes.*` not `.cuda`) | `make hermeto-verify-pip-cpu` |
+| CUDA only (`konflux_requirements_cuda.sh`, `*.cuda*`, `requirements.overrides.cuda.txt`) | `make hermeto-verify-pip-cuda` |
+| Shared: `pyproject.toml`, `uv.lock`, both overrides files, or both generators | `make hermeto-verify-pip` (CPU **and** CUDA) |
+
+Commands use **strict** Hermeto (same shape as Konflux) and separate output dirs so results are not overwritten:
+
+- `make hermeto-verify-pip-cpu` → **`.hermeto-output-verify-cpu/`**  
+- `make hermeto-verify-pip-cuda` → **`.hermeto-output-verify-cuda/`**  
+
+Ad-hoc: `HERMETO_OUT=/path ./scripts/run_hermeto_fetch_deps.sh pip-cuda`. Generic **`make hermeto-fetch-deps`** still defaults to **`HERMETO_MODE=pip-cpu`**—do not treat that alone as sufficient when CUDA inputs changed.
+
+## CPU pipeline (`scripts/konflux_requirements.sh`)
+
+- **Regenerate; do not hand-edit** hashed requirements. Use `make konflux-requirements` (or the script).
+
+- **`--extra-index-url` in lockfiles**  
+  Hermeto does not support that line in committed files. The script passes PyPI as an extra index **during** `uv pip compile`, then **`sed`** removes `--extra-index-url` from the generated file.
+
+- **Torch / triton (RHOAI) vs Hermeto**  
+  `torch==` / `triton==` under the RHOAI simple index **fail** Hermeto’s PyPI intersection. The generator **strips** those stanzas from `requirements.hashes.wheel.txt` and writes:  
+  - `requirements.hashes.wheel.cpu.x86_64.txt` — pulp URLs + hashes for **torch** and **triton**; **torchvision** from **PyPI manylinux** URLs (so PyPI intersection succeeds).  
+  - `requirements.hashes.wheel.cpu.aarch64.txt` — same pattern for aarch64.  
+  **`Containerfile`** selects one of these via `TARGETARCH` (`amd64` → `x86_64`, `arm64` → `aarch64`).
+
+- **`pylatexenc` on the PyPI wheel file**  
+  Same intersection issue: PyPI digest ≠ RHOAI rebuild `*-8-py3-none-any.whl`. After the PyPI-wheel compile, the script rewrites **`pylatexenc==…`** to a **pulp direct URL** (currently the **cuda12.9-ubi9** artifact; the **3.2/cpu-ubi9** pulp path returned 404 for that filename when checked).
+
+- **`pybuild-deps`**  
+  It cannot use sdists for **`nvidia-*`**, **`torch` / `torchvision` / `triton`**, **`faiss-cpu`** in this layout. The script feeds a filtered temp file to `pybuild-deps` (see script comments).
+
+- **CPU `requirements.hashes.source.txt` vs transitive CUDA wheels**  
+  The resolver can still list **`nvidia-*`** (and related pins) on “PyPI source” lines even when the image installs **CPU** torch from RHOAI. Those packages are **wheel-only / not fetchable as sdists** for Hermeto’s pip input → **`No distributions found`**. **`konflux_requirements.sh`** filters the same set out **before** the `uv pip compile` that writes **`requirements.hashes.source.txt`**, not only before `pybuild-deps`.
+
+- **Tekton / JSON sync**  
+  When `prefetch-input` **`requirements_files`** or **`binary.packages`** change, update **`.tekton/rag-tool-*.yaml`**, **`.tekton/lightspeed-core-rag-content-cpu-f176b-*.yaml`**, and **`scripts/hermeto/prefetch-*.json`** together. The konflux script only rewrites the **`"packages"`** string in some Tekton files via `sed`; it does **not** auto-insert new requirement filenames everywhere.
+
+## CUDA pipeline (`scripts/konflux_requirements_cuda.sh`)
+
+- **Regenerate** with `make konflux-requirements-cuda`.
+
+- **Policy**  
+  RHOAI **CUDA** `torch` is canonical; do not assume PyPI CUDA `torch`’s `nvidia-*` graph applies. See `README.md` (CUDA / RHOAI / `list_wheel_requires_dist.py`).
+
+- **`hf-xet` (CUDA x86 and hermetic install)**  
+  **`hf-xet` 1.3+ / 1.4.x** sdists use **Rust `edition2024`**, which **Cargo ~1.84** in UBI-based build images does not support. If the CUDA wheel requirement file resolves **`hf-xet>=1.2.0`** to **1.4.x** and pip ever uses the **sdist**, metadata/build fails with *“feature `edition2024` is required”*. **Do not try to build it:** pin **`hf-xet==1.2.0`** in **`requirements.overrides.cuda.txt`**, keep **`hf-xet`** in **`PYPI_WHEELS`**, and run the same **force `1.2.0`** step before the PyPI **`--only-binary`** compile as in **`konflux_requirements.sh`** (`requirements.hashes.wheel.pypi.cuda.base.txt` must only carry **1.2.0** wheel hashes). **`huggingface_hub`** remains usable with the older wheel.
+
+- **Never install PyPI `nvidia-*` packages in the CUDA image**  
+  Do **not** add **`nvidia-cublas-cu12`**, **`nvidia-cudnn-cu12`**, or any other **`nvidia-*`** wheels from PyPI to hashed requirements or prefetch. RHOAI **`torch`** already ships the CUDA stack it expects; pulling the separate PyPI **`nvidia-*`** graph causes **version skew, duplicate libraries, broken `torch`, and Hermeto/prefetch failures** (`No distributions found` when strict mode disagrees with the lockfile shape). The CUDA generator strips every **`nvidia-*==…`** stanza from **`requirements.hashes.wheel.pypi.cuda.base.txt`** after the PyPI wheel compile (even when `uv` expands PyPI CUDA **`torch`** and injects those lines).
+
+- **CUDA-specific mechanics**  
+  - Strip **`torch` / `torchvision` / `triton`** from the **second** PyPI-only compile input so PyPI CUDA torch does not pull **`nvidia-*`** into that file; any **`nvidia-*`** that still appear in the compiled wheel hash file are removed before emitting **`.base.txt` / arch fragments**.  
+  - **Sdist-only on PyPI**: pins are moved back to the RHOAI wheel file so `--only-binary :all:` can still run.  
+  - **`antlr4-python3-runtime`**: PyPI has no usable wheel for omegaconf’s constraints; inject **pulp URL** + fixed stanza/hash.  
+  - **`pylatexenc`**: pulp URL + hash on the appropriate file (same Hermeto intersection issue as CPU).  
+  - **`pybuild-deps`**: filtered input excludes wheel-only / problematic packages (see script).  
+  - **Wheel layout**: `requirements.hashes.wheel.pypi.cuda.base.txt` plus **`.x86_64.txt` / `.aarch64.txt`** — **not** a single `requirements.hashes.wheel.pypi.cuda.txt` (some older Tekton snippets may still be wrong; **c0ec3** YAMLs are the reference).  
+  - **`jiter` / Hermeto `cargo vendor --locked`**: older **`jiter==0.12.x`** sdists shipped a **`Cargo.lock`** out of sync with **`Cargo.toml`**, so prefetch fails with **`PackageWithCorruptLockfileRejected`**. Fix: pin **`jiter==0.13.0`** in **`requirements.overrides.txt`** and **`requirements.overrides.cuda.txt`**, and list **`jiter`** in **`PYPI_WHEELS`** (CPU and CUDA scripts) so the resolver emits **manylinux wheels** instead of sdists.
+
+- **rag-tool-cuda Tekton**  
+  If prefetch lists the wrong CUDA wheel filenames, fix them to match **`Containerfile-cuda`** and **`lightspeed-core-rag-content-c0ec3-*`**.
+
+## Operational checklist after dependency changes
+
+1. Run **`make konflux-requirements`** and/or **`make konflux-requirements-cuda`**.  
+2. Commit generated **`requirements.hashes.*`**, **`requirements-build*.txt`**, new **`requirements.hashes.wheel.cpu.*.txt`** when CPU script emits them, **`Containerfile`** if install paths change, and **`.tekton` / `scripts/hermeto`** if prefetch inputs change.  
+3. Run Hermeto pip slice checks per the **Local Hermeto validation** table (**`make hermeto-verify-pip`** when both CPU and CUDA inputs may have moved).  
+4. If RHOAI **rebuilds** wheels (new `-*-` segment in filenames), update **pulp URLs / hashes** in the generator constants (and re-run Hermeto).
+
+## Local hermetic container simulation
+
+Konflux mounts prefetched content at **`/cachi2`** and applies Hermeto **`project_files`** (substituting **`file://${output_dir}/…`** paths) before the image build. To approximate that with **`Containerfile`** / **`Containerfile-cuda`**:
+
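+1. Run Hermeto (**`make hermeto-verify-pip-cpu`** / **`make hermeto-verify-pip-cuda`**, or **`scripts/run_hermeto_fetch_deps.sh`** with **`HERMETO_OUT`** set) so outputs land in **`.hermeto-output-verify-*`** or a directory of your choice. Export **`HERMETO_OUT`** to that directory for the steps below; the staging scripts default to **`.hermeto-output`**.  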
+2. Run **`./scripts/stage_hermetic_build_context.sh`**, which copies **`deps/`** to **`.hermetic-staging/cachi2/output/`**, writes **`cachi2.env`** (**`PIP_FIND_LINKS=/cachi2/output/deps/pip`**, **`PIP_NO_INDEX=true`**), and writes **`.hermetic-staging/patched-requirements/`** from the **`project_files`** templates in **`.build-config.json`**, with **`${output_dir}`** replaced by **`/cachi2/output`**.  
+3. **`./scripts/simulate_hermetic_build.sh cpu`** (or **`cuda`**) — generates **`.hermetic-staging/Containerfile.sim.*`** (early **`COPY`** of **`cachi2`**, overlay **`COPY`** of patched requirements) and runs **`podman`/`docker` `build`**.  
+4. Set **`NETWORK_MODE=none`** only if early **`RUN`** layers are already cached or RPMs/gems are prefetched like on Konflux; **`pip-*`** Hermeto runs do not ship **`deps/generic/model.safetensors`** — use **`full-cpu`**, **`--model`**, **`embeddings_model/`**, or **`ALLOW_PLACEHOLDER_HERMETIC_MODEL=1`** for a build-only stub.  
+5. For **CUDA**, point **`HERMETO_OUT`** at a directory produced by **`pip-cuda`** before **`simulate_hermetic_build.sh cuda`**.
+
+## References
+
+- [Hermeto](https://github.com/hermetoproject/hermeto) — prefetch CLI and container image.  
+- In-repo: `scripts/konflux_requirements.sh`, `scripts/konflux_requirements_cuda.sh`, `scripts/run_hermeto_fetch_deps.sh`, **`Makefile`** targets **`hermeto-verify-pip-*`**, `scripts/stage_hermetic_build_context.sh`, `scripts/simulate_hermetic_build.sh`, `scripts/hermeto/*.json`, `README.md` (Konflux / CUDA sections).
diff --git a/Makefile b/Makefile
index a0d797c6..9317d03e 100644
--- a/Makefile
+++ b/Makefile
@@ -105,6 +105,20 @@ konflux-requirements:	## generate hermetic requirements.*.txt file and gemfile.l
 	./scripts/konflux_requirements.sh
 	bundle _2.2.33_ lock --add-platform aarch64-linux
 
+HERMETO_MODE ?= pip-cpu
+hermeto-fetch-deps:	## run Hermeto prefetch (HERMETO_MODE=pip-cpu|pip-cuda|full-cpu|full-cuda); podman/docker + network
+	@./scripts/run_hermeto_fetch_deps.sh "$(HERMETO_MODE)"
+
+# Pip-slice checks against committed lockfiles (mirrors Konflux prefetch-dependencies for Python).
+# CPU success does not imply CUDA: different requirements.*.cuda.txt, overrides.cuda.txt, and wheel/source splits
+# (e.g. Hermeto runs cargo vendor --locked on some sdists only seen in one pipeline).
+.PHONY: hermeto-fetch-deps hermeto-verify-pip-cpu hermeto-verify-pip-cuda hermeto-verify-pip
+hermeto-verify-pip-cpu: ## Hermeto strict fetch-deps pip-cpu → .hermeto-output-verify-cpu
+	HERMETO_OUT="$(CURDIR)/.hermeto-output-verify-cpu" ./scripts/run_hermeto_fetch_deps.sh pip-cpu
+hermeto-verify-pip-cuda: ## Hermeto strict fetch-deps pip-cuda → .hermeto-output-verify-cuda
+	HERMETO_OUT="$(CURDIR)/.hermeto-output-verify-cuda" ./scripts/run_hermeto_fetch_deps.sh pip-cuda
+hermeto-verify-pip: hermeto-verify-pip-cpu hermeto-verify-pip-cuda ## both pip slices (run before pushing Konflux prefetch changes)
+
 konflux-rpm-lock:	## generate rpm.lock.yaml file for konflux build
 	./scripts/generate-rpm-lock.sh
 
diff --git a/scripts/gen_containerfile_hermetic_sim.sh b/scripts/gen_containerfile_hermetic_sim.sh
new file mode 100755
index 00000000..9d5ea152
--- /dev/null
+++ b/scripts/gen_containerfile_hermetic_sim.sh
@@ -0,0 +1,42 @@
+#!/usr/bin/env bash
+# Generate a Containerfile that injects staged /cachi2 and Hermeto-patched requirement fragments.
+#
+# Usage:
+#   ./scripts/gen_containerfile_hermetic_sim.sh cpu|cuda > .hermetic-staging/Containerfile.sim.<cpu|cuda>
+#
+set -euo pipefail
+
+ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+# Paths must be relative to build context (repo root); keep staging under the repo.
+STAGING_REL=".hermetic-staging"
+
+[[ "${1:-}" == cpu || "${1:-}" == cuda ]] || {
+	echo "usage: $0 cpu|cuda" >&2
+	exit 1
+}
+
+if [[ "$1" == cpu ]]; then
+	base="$ROOT/Containerfile"
+else
+	base="$ROOT/Containerfile-cuda"
+fi
+
+awk -v staging="$STAGING_REL" '
+/^USER root$/ {
+	print
+	print ""
+	print "# Injected by gen_containerfile_hermetic_sim.sh (local hermetic simulation)"
+	print "COPY " staging "/cachi2 /cachi2"
+	next
+}
+{ print }
+' "$base" | awk -v staging="$STAGING_REL" '
+$0 ~ /^COPY LICENSE \/licenses\/LICENSE$/ {
+	print
+	print ""
+	print "# Hermeto project_files (file:// wheel paths) overlay"
+	print "COPY " staging "/patched-requirements/ ./"
+	next
+}
+{ print }
+'
diff --git a/scripts/hermeto/prefetch-konflux-cpu.json b/scripts/hermeto/prefetch-konflux-cpu.json
new file mode 100644
index 00000000..e0c5e511
--- /dev/null
+++ b/scripts/hermeto/prefetch-konflux-cpu.json
@@ -0,0 +1,34 @@
+[
+  {
+    "type": "generic",
+    "path": ".",
+    "lockfile": "artifacts.lock.yaml"
+  },
+  {
+    "type": "rpm",
+    "path": ".",
+    "include_summary_in_sbom": true
+  },
+  {
+    "type": "pip",
+    "path": ".",
+    "requirements_files": [
+      "requirements.hashes.wheel.txt",
+      "requirements.hashes.wheel.cpu.x86_64.txt",
+      "requirements.hashes.wheel.cpu.aarch64.txt",
+      "requirements.hashes.wheel.pypi.txt",
+      "requirements.hashes.source.txt",
+      "requirements.hermetic.txt"
+    ],
+    "requirements_build_files": ["requirements-build.txt"],
+    "binary": {
+      "packages": "torch,torchvision,triton,uv-build,uv,pip,maturin,opencv-python,omegaconf,rapidocr,sqlite-vec,griffe,griffecli,griffelib,pyclipper,tree-sitter-typescript,hf-xet,psycopg2-binary,docling-parse,pypdf,pypdfium2,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,beautifulsoup4,cffi,click,colorama,cryptography,dataclasses-json,defusedxml,distro,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,greenlet,h11,httpcore,httpx,httpx-sse,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,mypy-extensions,nest-asyncio,networkx,numpy,openpyxl,packaging,pandas,pillow,pluggy,prompt-toolkit,propcache,pydantic,pydantic-core,pygments,pylatexenc,python-dateutil,python-docx,python-pptx,pyyaml,referencing,requests,rpds-py,rtree,safetensors,scikit-learn,scipy,setuptools,shapely,shellingham,six,sniffio,sympy,threadpoolctl,tiktoken,tokenizers,transformers,typing-extensions,typing-inspect,typing-inspection,tzdata,xlsxwriter,zipp",
+      "os": "linux",
+      "arch": "x86_64,aarch64",
+      "py_version": 312
+    }
+  },
+  {
+    "type": "bundler"
+  }
+]
diff --git a/scripts/hermeto/prefetch-konflux-cuda.json b/scripts/hermeto/prefetch-konflux-cuda.json
new file mode 100644
index 00000000..a0394651
--- /dev/null
+++ b/scripts/hermeto/prefetch-konflux-cuda.json
@@ -0,0 +1,33 @@
+[
+  {
+    "type": "generic",
+    "path": ".",
+    "lockfile": "artifacts.lock.yaml"
+  },
+  {
+    "type": "rpm",
+    "path": "cuda"
+  },
+  {
+    "type": "pip",
+    "path": ".",
+    "requirements_files": [
+      "requirements.hashes.wheel.cuda.txt",
+      "requirements.hashes.wheel.pypi.cuda.base.txt",
+      "requirements.hashes.wheel.pypi.cuda.x86_64.txt",
+      "requirements.hashes.wheel.pypi.cuda.aarch64.txt",
+      "requirements.hashes.source.cuda.txt",
+      "requirements.hermetic.txt"
+    ],
+    "requirements_build_files": ["requirements-build.cuda.txt"],
+    "binary": {
+      "packages": "triton,pylatexenc,uv-build,uv,pip,maturin,cmake,opencv-python,omegaconf,rapidocr,sqlite-vec,griffe,griffecli,griffelib,pyclipper,tree-sitter-typescript,hf-xet,docling-parse,torch,torchvision,psycopg2-binary,faiss-cpu,llama-index-vector-stores-faiss,pypdf,pypdfium2,jiter,aiohappyeyeballs,aiohttp,aiosignal,beautifulsoup4,click,defusedxml,distro,filetype,frozenlist,h11,httpx,idna,jinja2,jsonschema,lxml,marko,networkx,numpy,openpyxl,pandas,pillow,pluggy,prompt-toolkit,propcache,pydantic,python-docx,python-pptx,pyyaml,requests,rtree,scipy,setuptools,sniffio,sympy,termcolor,tiktoken,tomlkit,typing-extensions,urllib3",
+      "os": "linux",
+      "arch": "x86_64,aarch64",
+      "py_version": 312
+    }
+  },
+  {
+    "type": "bundler"
+  }
+]
diff --git a/scripts/hermeto/prefetch-pip-cpu.json b/scripts/hermeto/prefetch-pip-cpu.json
new file mode 100644
index 00000000..6f79f46f
--- /dev/null
+++ b/scripts/hermeto/prefetch-pip-cpu.json
@@ -0,0 +1,21 @@
+[
+  {
+    "type": "pip",
+    "path": ".",
+    "requirements_files": [
+      "requirements.hashes.wheel.txt",
+      "requirements.hashes.wheel.cpu.x86_64.txt",
+      "requirements.hashes.wheel.cpu.aarch64.txt",
+      "requirements.hashes.wheel.pypi.txt",
+      "requirements.hashes.source.txt",
+      "requirements.hermetic.txt"
+    ],
+    "requirements_build_files": ["requirements-build.txt"],
+    "binary": {
+      "packages": "torch,torchvision,triton,uv-build,uv,pip,maturin,opencv-python,omegaconf,rapidocr,sqlite-vec,griffe,griffecli,griffelib,pyclipper,tree-sitter-typescript,hf-xet,psycopg2-binary,docling-parse,pypdf,pypdfium2,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,beautifulsoup4,cffi,click,colorama,cryptography,dataclasses-json,defusedxml,distro,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,greenlet,h11,httpcore,httpx,httpx-sse,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,mypy-extensions,nest-asyncio,networkx,numpy,openpyxl,packaging,pandas,pillow,pluggy,prompt-toolkit,propcache,pydantic,pydantic-core,pygments,pylatexenc,python-dateutil,python-docx,python-pptx,pyyaml,referencing,requests,rpds-py,rtree,safetensors,scikit-learn,scipy,setuptools,shapely,shellingham,six,sniffio,sympy,threadpoolctl,tiktoken,tokenizers,transformers,typing-extensions,typing-inspect,typing-inspection,tzdata,xlsxwriter,zipp",
+      "os": "linux",
+      "arch": "x86_64,aarch64",
+      "py_version": 312
+    }
+  }
+]
diff --git a/scripts/hermeto/prefetch-pip-cuda.json b/scripts/hermeto/prefetch-pip-cuda.json
new file mode 100644
index 00000000..dfa0cde0
--- /dev/null
+++ b/scripts/hermeto/prefetch-pip-cuda.json
@@ -0,0 +1,21 @@
+[
+  {
+    "type": "pip",
+    "path": ".",
+    "requirements_files": [
+      "requirements.hashes.wheel.cuda.txt",
+      "requirements.hashes.wheel.pypi.cuda.base.txt",
+      "requirements.hashes.wheel.pypi.cuda.x86_64.txt",
+      "requirements.hashes.wheel.pypi.cuda.aarch64.txt",
+      "requirements.hashes.source.cuda.txt",
+      "requirements.hermetic.txt"
+    ],
+    "requirements_build_files": ["requirements-build.cuda.txt"],
+    "binary": {
+      "packages": "triton,pylatexenc,uv-build,uv,pip,maturin,cmake,opencv-python,omegaconf,rapidocr,sqlite-vec,griffe,griffecli,griffelib,pyclipper,tree-sitter-typescript,hf-xet,docling-parse,torch,torchvision,psycopg2-binary,faiss-cpu,llama-index-vector-stores-faiss,pypdf,pypdfium2,jiter,aiohappyeyeballs,aiohttp,aiosignal,beautifulsoup4,click,defusedxml,distro,filetype,frozenlist,h11,httpx,idna,jinja2,jsonschema,lxml,marko,networkx,numpy,openpyxl,pandas,pillow,pluggy,prompt-toolkit,propcache,pydantic,python-docx,python-pptx,pyyaml,requests,rtree,scipy,setuptools,sniffio,sympy,termcolor,tiktoken,tomlkit,typing-extensions,urllib3",
+      "os": "linux",
+      "arch": "x86_64,aarch64",
+      "py_version": 312
+    }
+  }
+]
diff --git a/scripts/run_hermeto_fetch_deps.sh b/scripts/run_hermeto_fetch_deps.sh
new file mode 100755
index 00000000..a92e6490
--- /dev/null
+++ b/scripts/run_hermeto_fetch_deps.sh
@@ -0,0 +1,81 @@
+#!/usr/bin/env bash
+# Run Hermeto like Konflux prefetch-dependencies (see pipeline log):
+#   hermeto --log-level debug --mode strict fetch-deps '<json>' --sbom-output-type spdx --source ... --output ...
+# Image: https://github.com/hermetoproject/hermeto  (ghcr.io/hermetoproject/hermeto)
+#
+# Usage:
+#   ./scripts/run_hermeto_fetch_deps.sh pip-cpu    # pip input only (matches Konflux pip slice)
+#   ./scripts/run_hermeto_fetch_deps.sh pip-cuda
+#   ./scripts/run_hermeto_fetch_deps.sh full-cpu # generic + rpm + pip + bundler (no RHSM certs — rpm may fail locally)
+#   ./scripts/run_hermeto_fetch_deps.sh full-cuda
+#
+# Environment:
+#   HERMETO_IMAGE      default ghcr.io/hermetoproject/hermeto:latest
+#   HERMETO_OUT        output dir (default: <repo>/.hermeto-output)
+#   CONTAINER_ENGINE   podman | docker
+#
+set -euo pipefail
+
+ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+HERMETO_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/hermeto" && pwd)"
+HERMETO_IMAGE="${HERMETO_IMAGE:-ghcr.io/hermetoproject/hermeto:latest}"
+OUT="${HERMETO_OUT:-$ROOT/.hermeto-output}"
+
+usage() {
+	echo "usage: $0 {pip-cpu|pip-cuda|full-cpu|full-cuda}" >&2
+	exit 1
+}
+
+[[ "${1:-}" ]] || usage
+
+engine="${CONTAINER_ENGINE:-}"
+if [[ -z "$engine" ]]; then
+	if command -v podman >/dev/null 2>&1; then
+		engine=podman
+	elif command -v docker >/dev/null 2>&1; then
+		engine=docker
+	else
+		echo "error: install podman or docker, or set CONTAINER_ENGINE" >&2
+		exit 1
+	fi
+fi
+
+case "$1" in
+pip-cpu) json="$HERMETO_DIR/prefetch-pip-cpu.json" ;;
+pip-cuda) json="$HERMETO_DIR/prefetch-pip-cuda.json" ;;
+full-cpu) json="$HERMETO_DIR/prefetch-konflux-cpu.json" ;;
+full-cuda) json="$HERMETO_DIR/prefetch-konflux-cuda.json" ;;
+*) usage ;;
+esac
+
+[[ -f "$json" ]] || { echo "error: missing $json" >&2; exit 1; }
+
+mkdir -p "$OUT"
+
+vol_mount=(-v "$ROOT:$ROOT")
+if [[ "$engine" == podman ]]; then
+	vol_mount=(-v "$ROOT:$ROOT:z")
+fi
+
+input_json="$(cat -- "$json")"
+if [[ -z "${input_json//[$'\t\n\r ']/}" ]]; then
+	echo "error: empty JSON in $json" >&2
+	exit 1
+fi
+
+echo "Running (Konflux-shaped): $engine run ... $HERMETO_IMAGE \\"
+echo "  --log-level debug --mode strict fetch-deps ... --sbom-output-type spdx --source $ROOT --output $OUT"
+echo "  mode=$1"
+
+# JSON is the fetch-deps positional argument; options follow (same order as Konflux HermetoCli log).
+exec "$engine" run --rm \
+	"${vol_mount[@]}" \
+	-w "$ROOT" \
+	"$HERMETO_IMAGE" \
+	--log-level debug \
+	--mode strict \
+	fetch-deps \
+	"$input_json" \
+	--sbom-output-type spdx \
+	--source "$ROOT" \
+	--output "$OUT"
diff --git a/scripts/simulate_hermetic_build.sh b/scripts/simulate_hermetic_build.sh
new file mode 100755
index 00000000..e64485d8
--- /dev/null
+++ b/scripts/simulate_hermetic_build.sh
@@ -0,0 +1,80 @@
+#!/usr/bin/env bash
+# Simulate a Konflux-style hermetic image build: stage Hermeto output as /cachi2, apply
+# project_files overlays, then run podman/docker build (with --network=none only when NETWORK_MODE=none).
+#
+# Usage:
+#   ./scripts/simulate_hermetic_build.sh cpu|cuda
+#
+# Environment:
+#   HERMETO_OUT         Hermeto output dir (default: .hermeto-output)
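+#   HERMETIC_STAGING    Staging dir (default: <repo>/.hermetic-staging). gen_containerfile_hermetic_sim.sh hard-codes that default in its COPY lines, so an override will not match the generated Containerfile.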
+#   CONTAINER_ENGINE    podman | docker
+#   EXTRA_BUILD_ARGS    extra args to build (e.g. --build-arg BASE_IMAGE=registry.access.redhat.com/ubi9/ubi)
+#   NETWORK_MODE        unset = engine default (dnf/bundle work on a normal laptop).
+#                       Set to "none" for a strict no-network build (needs cached base
+#                       layers and Konflux-style RPM prefetch for a cold build).
+#   PULL_POLICY         podman/docker --pull value (default: never). Use "missing" or "always"
+#                       when the base image is not local yet.
+#
+# Caveats:
+#   - Base image: set PULL_POLICY=missing or always until the tag exists locally; default is never.
+#   - Containerfile early RUNs (dnf, pip install uv, bundle install) are unchanged. Use
+#     NETWORK_MODE=none only when those layers are already cached or RPMs/gems are prefetched
+#     like on Konflux; otherwise omit NETWORK_MODE so dnf/bundle can reach the network.
+#   - Hermeto pip-* output does not include deps/generic/model.safetensors; use full-cpu,
+#     --model, embeddings_model/, or ALLOW_PLACEHOLDER_HERMETIC_MODEL=1 for a build-only stub.
+#
+set -euo pipefail
+
+ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+STAGING="${HERMETIC_STAGING:-$ROOT/.hermetic-staging}"
+PULL_POLICY="${PULL_POLICY:-never}"
+
+[[ "${1:-}" == cpu || "${1:-}" == cuda ]] || {
+	echo "usage: $0 cpu|cuda" >&2
+	exit 1
+}
+kind="$1"
+
+out="${HERMETO_OUT:-$ROOT/.hermeto-output}"
+if [[ "$kind" == cuda ]]; then
+	if ! python3 -c "import json, pathlib, sys; p=pathlib.Path(sys.argv[1])/'.build-config.json'; d=json.loads(p.read_text()); names=[pathlib.Path(x['abspath']).name for x in d.get('project_files',[])]; raise SystemExit(0 if any('cuda' in n for n in names) else 1)" "$out"; then
+		echo "error: HERMETO_OUT ($out) has no CUDA project_files in .build-config.json (file missing or CPU prefetch); run pip-cuda into this directory first" >&2
+		exit 1
+	fi
+fi
+
+engine="${CONTAINER_ENGINE:-}"
+if [[ -z "$engine" ]]; then
+	if command -v podman >/dev/null 2>&1; then
+		engine=podman
+	elif command -v docker >/dev/null 2>&1; then
+		engine=docker
+	else
+		echo "error: install podman or docker" >&2
+		exit 1
+	fi
+fi
+
+"$ROOT/scripts/stage_hermetic_build_context.sh" --hermeto-out "${HERMETO_OUT:-$ROOT/.hermeto-output}"
+
+mkdir -p "$STAGING"
+gen="$STAGING/Containerfile.sim.$kind"
+"$ROOT/scripts/gen_containerfile_hermetic_sim.sh" "$kind" >"$gen"
+
+tag="rag-content-hermetic-sim:$kind"
+ctx="$ROOT"
+
+echo "Generated $gen"
+echo "Building $tag (network=${NETWORK_MODE:-default}) with $engine..."
+
+build_args=(-f "$gen" -t "$tag" "$ctx" --pull="$PULL_POLICY")
+# Only pass --network for an explicit offline build. Podman rootless often breaks on
+# `--network=default`; omit the flag to use the engine's normal build networking.
+if [[ "${NETWORK_MODE:-}" == none ]]; then
+	build_args+=(--network=none)
+fi
+
+# shellcheck disable=SC2206
+extra=(${EXTRA_BUILD_ARGS:-})
+exec "$engine" build ${extra[@]+"${extra[@]}"} "${build_args[@]}"
diff --git a/scripts/stage_hermetic_build_context.sh b/scripts/stage_hermetic_build_context.sh
new file mode 100755
index 00000000..edaf7ae3
--- /dev/null
+++ b/scripts/stage_hermetic_build_context.sh
@@ -0,0 +1,94 @@
+#!/usr/bin/env bash
+# Stage Konflux-style /cachi2 tree and pip requirement overlays from Hermeto output.
+#
+# Hermeto writes .build-config.json with PIP_FIND_LINKS=${output_dir}/deps/pip and
+# project_files templates using file://${output_dir}/deps/pip/.... Konflux substitutes
+# output_dir at build time; locally we use /cachi2/output.
+#
+# Usage:
+#   ./scripts/stage_hermetic_build_context.sh [--model PATH] [--hermeto-out DIR]
+#
+# Model: prefers deps/generic/model.safetensors from Hermeto; else --model; else
+# embeddings_model/model.safetensors in the repo; else ALLOW_PLACEHOLDER_HERMETIC_MODEL=1
+# (writes a tiny stub so the image build can finish — not runnable for RAG).
+#
+set -euo pipefail
+
+ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+HERMETO_OUT="${HERMETO_OUT:-$ROOT/.hermeto-output}"
+STAGING="${HERMETIC_STAGING:-$ROOT/.hermetic-staging}"
+MODEL_SRC=""
+
+while [[ "${1:-}" ]]; do
+	case "$1" in
+	--hermeto-out)
+		HERMETO_OUT="$2"
+		shift 2
+		;;
+	--model)
+		MODEL_SRC="$2"
+		shift 2
+		;;
+	*)
+		echo "usage: $0 [--hermeto-out DIR] [--model PATH]" >&2
+		exit 1
+		;;
+	esac
+done
+
+BUILD_CONFIG="$HERMETO_OUT/.build-config.json"
+[[ -f "$BUILD_CONFIG" ]] || {
+	echo "error: missing $BUILD_CONFIG (run scripts/run_hermeto_fetch_deps.sh pip-cpu|pip-cuda|full-cpu|full-cuda first)" >&2
+	exit 1
+}
+[[ -d "$HERMETO_OUT/deps/pip" ]] || {
+	echo "error: missing $HERMETO_OUT/deps/pip" >&2
+	exit 1
+}
+
+CACHI2="$STAGING/cachi2"
+PATCHED="$STAGING/patched-requirements"
+rm -rf "$CACHI2" "$PATCHED"
+mkdir -p "$CACHI2/output" "$PATCHED"
+
+cp -a "$HERMETO_OUT/deps" "$CACHI2/output/"
+
+mkdir -p "$CACHI2/output/deps/generic"
+if [[ -f "$HERMETO_OUT/deps/generic/model.safetensors" ]]; then
+	cp -a "$HERMETO_OUT/deps/generic/model.safetensors" "$CACHI2/output/deps/generic/"
+elif [[ -n "$MODEL_SRC" ]]; then
+	cp -a "$MODEL_SRC" "$CACHI2/output/deps/generic/model.safetensors"
+elif [[ -f "$ROOT/embeddings_model/model.safetensors" ]]; then
+	cp -a "$ROOT/embeddings_model/model.safetensors" "$CACHI2/output/deps/generic/model.safetensors"
+elif [[ "${ALLOW_PLACEHOLDER_HERMETIC_MODEL:-}" == 1 ]]; then
+	printf '\0' >"$CACHI2/output/deps/generic/model.safetensors"
+	echo "warning: using 1-byte placeholder model (ALLOW_PLACEHOLDER_HERMETIC_MODEL=1)" >&2
+else
+	echo "error: no model.safetensors (Hermeto generic, --model, embeddings_model/, or set ALLOW_PLACEHOLDER_HERMETIC_MODEL=1)" >&2
+	exit 1
+fi
+
+# Konflux-style env (Hermeto uses ${output_dir}/deps/pip → /cachi2/output/deps/pip).
+cat >"$CACHI2/cachi2.env" <<'EOF'
+export PIP_FIND_LINKS=/cachi2/output/deps/pip
+export PIP_NO_INDEX=true
+EOF
+
+python3 - <<'PY' "$BUILD_CONFIG" "$PATCHED"
+import json
+import sys
+from pathlib import Path
+
+build_config = Path(sys.argv[1])
+out_dir = Path(sys.argv[2])
+data = json.loads(build_config.read_text())
+for pf in data.get("project_files", []):
+    path = Path(pf["abspath"])
+    name = path.name
+    text = pf["template"].replace("${output_dir}", "/cachi2/output")
+    (out_dir / name).write_text(text)
+    print("patched", name)
+PY
+
+echo "Staged: $CACHI2"
+echo "Patched requirements: $PATCHED"
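
Reviewer note (trailing text, not part of the diff): the embedded Python in `stage_hermetic_build_context.sh` rewrites each `project_files` template by replacing `${output_dir}` with `/cachi2/output` and writing the result under the file's base name. The standalone sketch below shows the same substitution on an in-memory config so it can be read without running Hermeto. `render_project_files` and the sample entry (path and wheel filename) are hypothetical; only the `project_files`, `abspath`, and `template` keys come from the script.

```python
# Minimal sketch of the project_files overlay step, using made-up sample data.
from pathlib import PurePosixPath
from typing import Dict


def render_project_files(build_config: dict,
                         output_dir: str = "/cachi2/output") -> Dict[str, str]:
    """Map each project file's base name to its template with output_dir filled in."""
    rendered = {}
    for pf in build_config.get("project_files", []):
        name = PurePosixPath(pf["abspath"]).name
        rendered[name] = pf["template"].replace("${output_dir}", output_dir)
    return rendered


# Hypothetical .build-config.json content; real entries come from Hermeto.
sample_config = {
    "project_files": [
        {
            "abspath": "/src/requirements.hashes.wheel.cpu.x86_64.txt",
            "template": "torch @ file://${output_dir}/deps/pip/torch-example.whl\n",
        }
    ]
}

patched = render_project_files(sample_config)
print(patched["requirements.hashes.wheel.cpu.x86_64.txt"], end="")
```

Returning a dict instead of writing files keeps the sketch free of side effects; the script itself writes each entry into `.hermetic-staging/patched-requirements/`.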