diff --git a/docs/design/llama-stack-config-merge/llama-stack-config-merge-spike.md b/docs/design/llama-stack-config-merge/llama-stack-config-merge-spike.md new file mode 100644 index 000000000..83451b4ff --- /dev/null +++ b/docs/design/llama-stack-config-merge/llama-stack-config-merge-spike.md @@ -0,0 +1,840 @@ +# Spike: Llama Stack config merge (unified `lightspeed-stack.yaml`) + +## Overview + +**The problem**: Operators today must maintain two configuration files — +`lightspeed-stack.yaml` (LCORE settings) and `run.yaml` (Llama Stack +operational config: providers, storage, APIs, safety, registered resources). +This split increases the chance of misconfiguration, makes downstream +deployment templates larger, and forces every Lightspeed team to understand +Llama Stack's internal schema. LCORE-836 asks for a single source of truth. + +**The recommendation**: A layered approach — "Option C + Option E layer": + +- **High-level keys** in `lightspeed-stack.yaml` under a new `llama_stack.config` + section (inference, later storage/safety/...). Most downstream teams write + only these. +- **`native_override`** escape hatch under the same section — raw Llama Stack + schema, deep-merged last. Covers anything the high-level schema doesn't + express. +- **`profile`** field that points to a YAML file used as the baseline — the + "profiles" feature is mechanism-only; LCORE ships no profiles of its own + beyond one or two reference examples under `examples/profiles/`. +- **`baseline: default | empty`** selects whether the synthesis starts from + LCORE's built-in baseline or a blank slate. +- **Legacy mode preserved**: existing `llama_stack.library_client_config_path` + works unchanged through a deprecation window. Mutual exclusion with the new + `llama_stack.config` block is enforced at load time. +- **Migration tool**: `lightspeed-stack --migrate-config` produces a unified + single-file config from an existing (`run.yaml` + `lightspeed-stack.yaml`) + pair, lossless round-trip. + +**PoC validation**: A Level 3' PoC (per the spike howto) proves the mechanism +end-to-end in library mode. A unified `lightspeed-stack.yaml` containing only +`llama_stack.config` (no external `run.yaml`) successfully drives LCORE: +liveness/readiness green, `/v1/query` returns a real model response, +`native_override` demonstrably takes effect. Full unit-test suite passes +(2098 tests), including a lossless migrate-then-synthesize round-trip. +Server-mode end-to-end was not re-run through docker-compose — the container +rebuild time was impractical and unrelated to PoC quality (the container image +is ~2 GB of LS dependencies); the same synthesis code path is exercised by the +library-mode PoC and unit tests. + +--- + +## Strategic decisions — for @sbunciak (PM) and @tisnik + +These set scope, approach, and rollout shape. Each has a recommendation — +please confirm or override. + +### Decision S1: Overall shape (Option C + optional Option E) + +See [Design alternatives considered](#design-alternatives-considered) for the +full option set and scoring. 
+ +| Option | Summary | +|---|---| +| A (Embedded native only) | `lightspeed-stack.yaml.llama_stack.config` is raw Llama Stack schema | +| **B + C (High-level + native override)** | High-level keys cover the common path, `native_override` as escape hatch | +| E (Profiles) | Named or path-based pre-built config bundles, layered on top of A/B/C | +| G (Kustomize-style patches) | Ship a default baseline, operator writes JSON-Patch-like overlays | + +**Recommendation**: **C** (high-level + native_override) with **E** (profile +feature, no shipped profiles) as an optional layer. Best balance of UX, +escape-hatch power, validation rigor, and dynamic-reconfig fit for the +broader feature roadmap (LCORE-777/781). + +### Decision S2: Deprecation timeline for the legacy path + +Legacy mode (`llama_stack.library_client_config_path` + external `run.yaml`) +must coexist with unified mode through a deprecation window to avoid breaking +downstream teams. Three candidate cadences: + +| Cadence | Timing | +|---|---| +| N+2 releases | Opt-in → warning → removed over two releases after landing | +| N+3 releases | Opt-in → warning (N+1) → removed at N+3 | +| **Calendar-based** | e.g., "removed no sooner than 6 months after warning starts" | + +**Recommendation**: **calendar-based**, because the right number depends on +LCORE's release cadence and downstream consumers' update latency — both of +which the spike author does not own. @sbunciak to set the actual numbers. + +### Decision S3: Downstream implications we may not have seen + +The spike author has direct evidence of Konflux/Tekton usage (`.tekton/` dir) +and RHOAI testing (`tests/e2e-prow/rhoai/`). Other downstream consumers — +RHOAI operator CRs, Helm charts, Kustomize overlays, any other products — +are not visible from this repo alone. + +**Ask**: Reviewers from downstream teams to confirm whether their deployment +setup treats `run.yaml` as a separate artifact (ConfigMap, templated file, +build-time asset) that this design would need to accommodate. + +### Decision S4: Scope of this spike — what is deliberately left out + +The following related work streams are **not** included in this spike and +should be tracked as separate future JIRAs: + +- **Llama Stack process supervision** from LCORE (restart-on-crash, signal + propagation, merged logs). Orthogonal to config merging; covered by + LCORE-777 / LCORE-778. +- **Hot-reload / dynamic reconfig** (e.g., live `POST /v1/rag` that adds a + BYOK RAG without restart). Llama Stack does not natively support + hot-reload; achieving it would require supervision + restart flows. + Covered by LCORE-781. + +**Recommendation**: confirm this scope split. If reviewers want any of the +above pulled in, this spike's JIRAs grow accordingly. + +--- + +## Technical decisions — for @tisnik and team leads + +Architecture-level and implementation-level. Each has a recommendation +grounded in the PoC. + +### Decision T1: Format detection (shape vs version field vs both) + +How does LCORE tell unified-mode configs from legacy-mode configs? + +| Option | Works by | +|---|---| +| Shape only | Presence of `llama_stack.config` → unified; else legacy | +| Version field only | Explicit `config_format_version: 2` required | +| **Both (soft-coupled)** | Shape decides; version field optional but must agree when present | + +**Recommendation**: **both, soft-coupled**. Gives a cheap upgrade path for +future real schema bumps without forcing every existing user to add a +version field today. Confidence: 75%. 
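+
+For illustration only, the soft-coupled rule fits in a few lines. This is a
+hypothetical sketch, not PoC code; in particular, where the optional
+`config_format_version` field would live (top level vs inside the
+`llama_stack` block) is itself undecided, so the sketch assumes it sits
+next to `config`:
+
+```python
+from typing import Any
+
+
+def detect_mode(llama_stack: dict[str, Any]) -> str:
+    """Decision T1: shape decides; the optional version field, when
+    present, must agree with the shape."""
+    unified = "config" in llama_stack
+    version = llama_stack.get("config_format_version")
+    if version is not None and (version >= 2) != unified:
+        raise ValueError("config_format_version disagrees with config shape")
+    return "unified" if unified else "legacy"
+```
+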
+### Decision T2: Override precedence (inside Option C)
+
+When `llama_stack.config.native_override` overlaps with a high-level key,
+what semantics?
+
+| Strategy | Example: `safety: {excluded_categories: [a, b]}` vs override `{excluded_categories: [c]}` |
+|---|---|
+| Deep-merge, append lists | result: `[a, b, c]` |
+| **Deep-merge, replace lists** | result: `[c]`; other keys in `safety` preserved |
+| Whole-key override | result: whole `safety` replaced; lose `default_shield_id` unless restated |
+| JSON Patch (ops) | explicit — `{op: replace, path: /safety/excluded_categories, value: [c]}` |
+
+**Recommendation**: **deep-merge with list replacement**. Simple mental model,
+no list-merge tarpit, keeps scalar + map overrides minimal. Implemented in
+`deep_merge_list_replace()`. Confidence: 70%.
+
+See [Merge semantics worked examples](#merge-semantics-worked-examples).
+
+### Decision T3: Secrets in synthesized files
+
+The synthesized `run.yaml` lives on disk (library mode: `$TMPDIR`; server
+mode: inside the LS container). Option space:
+
+| Option | On-disk content |
+|---|---|
+| **Keep env-var refs verbatim** | `api_key: ${env.OPENAI_API_KEY}` (resolved by LS at start) |
+| Resolve before writing | `api_key: sk-...` |
+
+**Recommendation**: **keep env-var refs verbatim**. Security-leaning default;
+resolved secrets never touch the disk. Implemented in
+`apply_high_level_inference` (emits `${env.<VAR>}` strings). Confidence: 95%.
+
+### Decision T4: Synthesized file location
+
+Where the synthesized `run.yaml` goes at runtime:
+
+| Option | Path |
+|---|---|
+| Temp file | `$TMPDIR/llama_stack_synthesized_config.yaml` |
+| **Persistent known path** | Local: `./.generated/run.yaml` or `~/.local/state/lightspeed-stack/run.yaml`; Container: `/app-root/.generated/run.yaml`. Overwrite on each boot. |
+
+**Recommendation**: **persistent known path, overwrite on boot**. Debuggable,
+no stale-file risk (always overwritten before LS starts). The PoC used
+`$TMPDIR` for expediency; production should use the persistent path. CLI flag
+`--synthesized-config-output <path>` for debugging. Confidence: 85%.
+
+### Decision T5: Migration tool invocation
+
+How operators invoke the migration tool:
+
+| Option | Example |
+|---|---|
+| Separate script under `scripts/` | `uv run python scripts/migrate-config.py ...` |
+| **Flag on main entry point** | `lightspeed-stack --migrate-config --run-yaml X -c Y --migrate-output Z` |
+| Subcommand refactor | `lightspeed-stack migrate-config ...` (BREAKS existing invocations) |
+
+**Recommendation**: **flag on main entry point**. Parallels the existing
+`--dump-configuration` / `--dump-schema` flags; zero breaking change to
+existing invocations. Implemented in `src/lightspeed_stack.py` + a
+companion `migrate_config_dumb()` function. Confidence: 90%.
+
+### Decision T6: Profile distribution
+
+How profiles (Option E layer) reach downstream teams:
+
+| Option | Details |
+|---|---|
+| Ship named profiles in `src/profiles/` | LCORE ships a pre-curated set; `profile: openai-remote` resolves |
+| **Feature only, no shipped profiles** | `profile: <path>` is the only invocation; teams author their own; LCORE ships 1–2 reference examples under `examples/profiles/` |
+
+**Recommendation**: **feature only, no shipped profiles**. Avoids
+profile-sprawl and the burden of keeping "blessed" profiles in sync with
+downstream products. 1–2 reference examples in `examples/profiles/` are
+documentation, not shipped runtime assets. Confidence: 85%.
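+
+Because Decision T7 below turns on how this merge behaves, here is a
+minimal sketch of the Decision T2 semantics. The function name matches the
+PoC's `deep_merge_list_replace`; the body is illustrative, not the shipped
+implementation:
+
+```python
+from typing import Any
+
+
+def deep_merge_list_replace(
+    base: dict[str, Any], overlay: dict[str, Any]
+) -> dict[str, Any]:
+    """Maps merge recursively; lists and scalars are replaced wholesale."""
+    merged = dict(base)
+    for key, value in overlay.items():
+        if isinstance(value, dict) and isinstance(merged.get(key), dict):
+            merged[key] = deep_merge_list_replace(merged[key], value)
+        else:
+            merged[key] = value  # list or scalar: replace, never append
+    return merged
+
+
+# The example from the Decision T2 table:
+base = {"safety": {"default_shield_id": "llama-guard",
+                   "excluded_categories": ["a", "b"]}}
+override = {"safety": {"excluded_categories": ["c"]}}
+assert deep_merge_list_replace(base, override) == {
+    "safety": {"default_shield_id": "llama-guard",  # preserved
+               "excluded_categories": ["c"]}        # list replaced
+}
+```
+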
+ +### Decision T7: The `baseline` field (added during PoC) + +During the PoC, strict lossless round-trip for the migration tool surfaced +a need: when `native_override` contains an entire run.yaml body, the default +baseline's keys still leak into the result via deep-merge. Fix: a +`baseline: "default" | "empty"` field. + +- `baseline: default` (default value) — start from LCORE's built-in baseline +- `baseline: empty` — start from `{}`. Used by the dumb migration tool so + round-trip is exact. + +**Recommendation**: **accept this field**. Alternatives (`inherit_defaults: +bool`, `starting_point: ...`) are cosmetic. Confidence: 80%. Reviewers: any +preference on naming before this ships? + +### Decision T8: Konflux / Tekton pipelines + +The `.tekton/` directory exists in this repo. If any Konflux/Tekton pipeline +templates or mounts `run.yaml` separately, unified mode needs that pipeline +to either (a) keep using legacy mode during the deprecation window, or +(b) mount the unified `lightspeed-stack.yaml` and drop the `run.yaml` mount. + +**Ask**: owner of `.tekton/` to confirm current pipeline shape and plan +migration. + +### Decision T9: Library client API (resolved by PoC) + +**Finding from PoC**: `AsyncLlamaStackAsLibraryClient` in `llama-stack` only +accepts a file-path string. It does not accept a dict. This means library +mode must write the synthesized config to disk — no dict-only shortcut +available. Not a decision; a fact to note in the spec doc. + +--- + +## Proposed JIRAs + +Each JIRA's agentic-tool instruction points to the spec doc +(`llama-stack-config-merge.md`), the permanent reference. The first JIRA +(authoring e2e feature files) is the intentional kickoff — it happens +before feature implementation so the test shape is not influenced by +implementation choices. + + + +### LCORE-???? E2E feature files for unified mode (no step implementation) + +**User story**: As a Lightspeed Core e2e engineer, I want the behave +feature files for unified-mode scenarios written before the feature +implementation lands, so that the test shape reflects the feature's +intended behavior rather than the chosen implementation, and any +architectural gaps surface early. + +**Description**: Author behave `.feature` files under `tests/e2e/features/` +that describe the behaviors required of unified mode. Step definitions +(Python glue) are explicitly **not** part of this ticket — they are +covered by a later sibling ticket (LCORE-???? — Implement step +definitions). The feature files can be submitted for review and land +before implementation of the feature itself begins. + +**Scope**: +- `.feature` files covering, at minimum, these R1–R11 surfaces from the + spec doc: + - Boot LCORE with unified `lightspeed-stack.yaml` (no external + `run.yaml`); `/liveness`, `/readiness`, and `/v1/query` succeed. + - Boot LCORE with legacy config + (`library_client_config_path` + external `run.yaml`); same result. + - Setting both `llama_stack.config` and + `llama_stack.library_client_config_path` fails at config-load time + with a clear error that mentions `--migrate-config`. + - Migration tool: `lightspeed-stack --migrate-config ...` produces a + unified file that drives equivalent Llama Stack behavior. + - `native_override` deep-merges onto the baseline with list + replacement (tested on a scalar key and a list key). + - `profile:` path (absolute and relative-to-config-dir) loads the + referenced baseline. 
+ - Secrets appear as `${env.FOO}` references in the synthesized + `run.yaml` on disk; never resolved to raw values. + - Legacy mode emits a one-line deprecation WARN at startup; unified + mode does not. +- Additions to `tests/e2e/test_list.txt` so behave discovers the new + files. +- Gherkin scenarios authored from the spec doc (`R1..R11`) only; author + must avoid reading the implementation JIRAs' scope sections while + drafting scenarios. + +**Acceptance criteria**: +- behave parses every new `.feature` file without syntax errors. +- behave marks all new scenario steps as `undefined` (step definitions + land in LCORE-????). +- `uv run make test-e2e` remains green (new scenarios are skipped or + reported undefined, not failing). +- Any ambiguity or architectural tension uncovered while authoring is + captured either as a comment in the spec doc or as a new sub-JIRA. + +**Blocks**: LCORE-???? (Implement behave step definitions for unified +mode). + +**Agentic tool instruction**: +```text +Read "Requirements" (R1..R11) and "Use Cases" in +docs/design/llama-stack-config-merge/llama-stack-config-merge.md. +Do NOT read the other JIRAs' scope sections or the synthesizer/schema +implementation code while authoring; the point of this ticket is to +produce feature files uncontaminated by implementation detail. +Key files to create: tests/e2e/features/unified-mode-*.feature plus +additions to tests/e2e/test_list.txt. Do NOT create step definitions in +tests/e2e/features/steps/. +To verify: `uv run behave --dry-run tests/e2e/features/unified-mode-*.feature` +parses successfully; `uv run make test-e2e` still green with the new +scenarios reported as undefined. +``` + + + +### LCORE-???? Unified `llama_stack.config` schema + synthesizer + +**Description**: Implement the unified-mode config schema +(`UnifiedLlamaStackConfig`, `UnifiedInferenceSection`, +`UnifiedInferenceProvider`) and the synthesizer that produces a full Llama +Stack `run.yaml` from it. Wire library mode to the synthesizer. Preserve +legacy mode through mutual-exclusion validation. + +**Scope**: +- New Pydantic classes in `src/models/config.py`. +- New functions in `src/llama_stack_configuration.py`: + `synthesize_configuration`, `deep_merge_list_replace`, + `apply_high_level_inference`, `load_default_baseline`, `synthesize_to_file`. +- A shipped default baseline at `src/data/default_run.yaml`. +- Library-mode wiring in `src/client.py`: detect unified vs legacy, write + synthesized file, pass path to library client. +- Cross-field validation: reject both `config` and + `library_client_config_path` set simultaneously. +- Legacy behavior (`llama_stack.library_client_config_path` path) unchanged. + +**Acceptance criteria**: +- Unified `lightspeed-stack.yaml` (no external `run.yaml`) boots LCORE in + library mode and serves `/v1/query`. +- Legacy configs continue to work with no change. +- Mutual-exclusion error message fires cleanly when both forms are set. +- Unit tests for synthesizer, merge semantics, schema validation. + +**Agentic tool instruction**: +```text +Read the "Architecture" and "Implementation Suggestions" sections of +docs/design/llama-stack-config-merge/llama-stack-config-merge.md. 
+Key files to create or modify:
+  src/models/config.py (new classes; modify LlamaStackConfiguration)
+  src/llama_stack_configuration.py (synthesize_configuration + helpers)
+  src/data/default_run.yaml (new)
+  src/client.py (library-mode wiring)
+To verify: run a unified-mode config end-to-end via `uv run lightspeed-stack -c <config>` and confirm /v1/query succeeds.
+```
+
+
+
+### LCORE-???? Migration tool — dumb-mode lift-and-shift
+
+**Description**: Implement `--migrate-config` on the `lightspeed-stack` CLI
+that produces a unified single-file config from an existing
+(`run.yaml` + `lightspeed-stack.yaml`) pair. Dumb mode places the entire
+`run.yaml` body under `llama_stack.config.native_override` with
+`baseline: empty` and removes `library_client_config_path`.
+
+**Scope**:
+- `migrate_config_dumb()` function in `src/llama_stack_configuration.py`.
+- `--migrate-config`, `--run-yaml`, `--migrate-output` flags in
+  `src/lightspeed_stack.py`.
+- Round-trip test: migrate → synthesize → byte-identical to the original
+  `run.yaml`.
+
+**Acceptance criteria**:
+- `lightspeed-stack --migrate-config --run-yaml X -c Y --migrate-output Z`
+  produces a unified config that boots LCORE in library mode to the same
+  Llama Stack behavior as the original pair.
+- Round-trip unit test passes.
+- `--help` describes the flag clearly.
+
+**Agentic tool instruction**:
+```text
+Read "Migration tool" in docs/design/llama-stack-config-merge/llama-stack-config-merge.md.
+Key files: src/lightspeed_stack.py, src/llama_stack_configuration.py,
+tests/unit/test_llama_stack_synthesize.py.
+To verify: migrate the repo's root run.yaml + lightspeed-stack.yaml, then
+start LCORE with the output; confirm /v1/query works.
+```
+
+
+
+### LCORE-???? LS container entrypoint + deployment artifacts for unified mode
+
+**Description**: Update the Llama Stack container entrypoint and deployment
+manifests so server mode works end-to-end from a unified
+`lightspeed-stack.yaml`. Provide rebuild guidance for container images that
+bundle the synthesizer script and default baseline.
+
+**Scope**:
+- Update `scripts/llama-stack-entrypoint.sh` — the existing script already
+  defers to the Python CLI for auto-detection; document that behavior.
+- Update `test.containerfile` to copy `src/data/` into the LS container so
+  `load_default_baseline()` resolves.
+- Provide a unified-mode `docker-compose.yaml` (or update the existing one)
+  that mounts only `lightspeed-stack.yaml` into the LS container.
+- Update `.tekton/` pipelines as needed (coordinate with the pipeline owner,
+  see Decision T8).
+
+**Acceptance criteria**:
+- `docker compose up` with a unified `lightspeed-stack.yaml` starts both
+  containers healthy; `/v1/query` works through LCORE → LS.
+- Legacy docker-compose layout (with external `run.yaml` mount) still works.
+
+**Agentic tool instruction**:
+```text
+Read "Architecture → Server mode" in docs/design/llama-stack-config-merge/llama-stack-config-merge.md.
+Key files: scripts/llama-stack-entrypoint.sh, test.containerfile,
+docker-compose.yaml, .tekton/*.yaml.
+To verify: docker compose up with the unified config; curl LCORE /v1/query.
+```
+
+
+
+### LCORE-???? Migrate in-repo e2e / integration test configurations
+
+**User story**: As a Lightspeed Core maintainer, I want the in-repo e2e and
+integration tests to use the unified-mode config format, so that the
+reference configuration shapes downstream teams see are the new ones.
+ +**Description**: Convert `tests/e2e/configs/run-*.yaml` and +`tests/e2e/configuration/**/lightspeed-stack*.yaml` into unified form +(or delete the `run-*.yaml` side and fold the content into the +corresponding `lightspeed-stack*.yaml`). Migrate `tests/e2e-prow/rhoai/` +configs similarly. + +**Scope**: +- Identify every test config that references `run.yaml`. +- Mechanically migrate using the migration tool (dumb mode). +- Re-run the full e2e suite and resolve any differences. + +**Acceptance criteria**: +- No in-repo test config references an external `run.yaml`. +- `uv run make test-e2e` passes. +- Existing test coverage is preserved (no tests deleted solely to make the + migration pass). + +**Agentic tool instruction**: +```text +Read "Migration paths" in docs/design/llama-stack-config-merge/llama-stack-config-merge.md. +Key files: tests/e2e/configs/, tests/e2e/configuration/, tests/e2e-prow/rhoai/. +To verify: `uv run make test-e2e` green. +``` + + + +### LCORE-???? Implement behave step definitions for unified-mode feature files + +**Description**: Implement the Python step definitions +(`@given`/`@when`/`@then` functions) under `tests/e2e/features/steps/` +for the `.feature` files authored in LCORE-???? (E2E feature files +kickoff). After this ticket lands, the scenarios transition from +`undefined` to fully executing. + +The feature files are taken as-is — do not modify the Gherkin to make +implementation easier. If a scenario cannot be implemented faithfully, +raise it against the spec doc (and possibly back to LCORE-???? kickoff) +rather than quietly weakening the test. + +**Scope**: +- Step definitions for every step pattern in the new `.feature` files. +- Fixtures or helpers under `tests/e2e/features/steps/` as needed + (e.g., temp-dir config authoring, subprocess start/stop for LCORE, + HTTP client helpers reusing existing `tests/e2e/` patterns). +- CI wiring so the new scenarios run as part of `uv run make test-e2e`. + +**Acceptance criteria**: +- behave reports zero `undefined` steps across the new `.feature` + files. +- `uv run make test-e2e` runs the new scenarios and they pass. +- No Gherkin edit was made to accommodate implementation constraints + (or if any edit was made, it is documented in a PR comment with + explicit rationale). + +**Blocked by**: +- LCORE-???? (E2E feature files for unified mode — the `.feature` + files being implemented against). +- LCORE-???? (Unified schema + synthesizer), LCORE-???? + (Migration tool), LCORE-???? (LS container entrypoint + deployment) + — the feature under test must exist. + +**Agentic tool instruction**: +```text +Read "Architecture" and "Requirements" in +docs/design/llama-stack-config-merge/llama-stack-config-merge.md. +Key files to create: tests/e2e/features/steps/unified-mode*.py (or +extend existing step-definition modules if patterns reuse cleanly). +Do not modify tests/e2e/features/unified-mode-*.feature — take the +Gherkin as-is. If a scenario genuinely cannot be implemented faithfully, +file a sub-ticket rather than changing the Gherkin quietly. +To verify: `uv run make test-e2e` runs every new scenario green and +behave reports zero undefined steps. +``` + + + +### LCORE-???? Docs migration to unified mode as primary + +**User story**: As an operator reading Lightspeed Core docs, I want the +single-file unified configuration to be the primary way documented, with +legacy mode clearly marked as a deprecation path. 
+**Description**: Update
+`docs/deployment_guide.md`, `docs/byok_guide.md`, `docs/okp_guide.md`,
+`docs/rag_guide.md`, `docs/providers.md`, `docs/config.md`, `README.md`,
+and `docs/local-stack-testing.md` to document unified mode as primary. Add
+a migration section with the migration tool command. Clean up the stale
+`create_argument_parser` docstring in `src/lightspeed_stack.py` that still
+mentions the removed `-g/-i/-o` flags.
+
+**Scope**:
+- Each of the doc files listed above.
+- A new migration section (step-by-step).
+- Update the `create_argument_parser` docstring in
+  `src/lightspeed_stack.py`.
+
+**Acceptance criteria**:
+- Every doc page that showed a two-file setup also shows the unified-mode
+  equivalent.
+- Migration tool invocation documented with a worked example.
+- `docs/openapi.md` / `docs/config.html` regenerated.
+
+**Agentic tool instruction**:
+```text
+Read "Deprecation timeline" and "Migration paths" in docs/design/llama-stack-config-merge/llama-stack-config-merge.md.
+Key files: docs/*.md, docs/*.html, docs/*.json, README.md, src/lightspeed_stack.py docstring.
+To verify: rendered docs present the unified mode first; legacy mode is visibly deprecated.
+```
+
+
+
+### LCORE-???? Reference profile examples and profile-path doc
+
+**Description**: Add `examples/profiles/` with two reference profile YAML
+files — one remote-provider (OpenAI) and one inline-provider
+(sentence-transformers + FAISS) — purely as reference material. Document how
+operators write and reference their own profiles via
+`llama_stack.config.profile: <path>`.
+
+**Scope**:
+- `examples/profiles/openai-remote.yaml`
+- `examples/profiles/inline-faiss.yaml`
+- Docs section: how to author a profile, where to place it, how to
+  reference it from `lightspeed-stack.yaml`.
+
+**Acceptance criteria**:
+- Both examples load cleanly via the synthesizer (sanity test).
+- A docs section titled "Profiles" exists and has a worked example.
+
+**Agentic tool instruction**:
+```text
+Read "Profiles" in docs/design/llama-stack-config-merge/llama-stack-config-merge.md.
+Key files to create: examples/profiles/*.yaml, a "Profiles" section in docs/config.md or docs/deployment_guide.md.
+To verify: load the example via `uv run lightspeed-stack -c <config>` referencing the profile; confirm LS boots.
+```
+
+
+
+### LCORE-???? Deprecation warning for legacy mode
+
+**Description**: After the unified-mode feature lands (one release later),
+emit a one-line startup WARN when `library_client_config_path` is set. Link
+to the migration doc. Legacy mode continues to function fully.
+
+**Scope**:
+- Warning emission point: on load, in `LlamaStackConfiguration`'s
+  `check_llama_stack_model` validator, or at LCORE startup.
+- Log line format includes a stable URL fragment to the migration doc.
+
+**Acceptance criteria**:
+- Legacy configs still load and run.
+- A single WARN line appears at startup when legacy fields are used.
+- The warning is not emitted in unified mode.
+
+**Agentic tool instruction**:
+```text
+Read "Deprecation timeline" in docs/design/llama-stack-config-merge/llama-stack-config-merge.md.
+Key files: src/models/config.py (or src/lightspeed_stack.py startup).
+To verify: run LCORE with a legacy config; confirm WARN line; run with unified config; confirm no WARN.
+```
+
+---
+
+## PoC results
+
+### What the PoC does
+
+The PoC is at Level 3' (per the spike howto): unified config works
+end-to-end in library mode, with overrides and a profile. 
Server-mode
+end-to-end validation was skipped: the synthesis code path is identical,
+and container rebuild time made re-running it impractical.
+
+**Important**: The PoC diverges from the production design in these ways:
+
+- Uses `$TMPDIR` for the synthesized `run.yaml` instead of the persistent
+  known path recommended in Decision T4.
+- No `--synthesized-config-output` CLI flag yet.
+- Migration tool has only the "dumb" mode; "smart" factoring into
+  high-level keys is out of scope.
+- No deprecation warning yet (that's its own JIRA).
+- High-level inference's emitted `provider_id` uses the Literal value
+  directly (`sentence_transformers` with underscore), which differs from
+  the baseline's `sentence-transformers` (hyphen). Acceptable in the PoC
+  because the validation used `baseline: default` + a `native_override`
+  path, not high-level inference, to avoid this naming collision.
+  Resolution before production: align the emitted `provider_id` with the
+  hyphenated form that common baselines already use.
+
+### Results
+
+See [poc-evidence/library-mode/](poc-evidence/library-mode/) for the full
+evidence bundle:
+
+- `lightspeed-stack-unified-library.yaml` — the unified-mode config used
+- `synthesized-run.yaml` — what LCORE produced (3.7 KB)
+- `query-response.json` — a real `/v1/query` round-trip
+
+Summary of validation:
+
+| Check | Evidence |
+|---|---|
+| Liveness 200 | `curl /liveness` → `{"alive":true}` |
+| Readiness 200 | `curl /readiness` → `{"ready":true,"reason":"All providers are healthy","providers":[]}` |
+| `/v1/query` works | `{"response":"The three primary colors are red, blue, and yellow.",...}` |
+| Profile loaded | `profile: /.../tests/e2e/configs/run-ci.yaml` resolved |
+| `native_override` took effect | `safety.default_shield_id: llama-guard` in synthesized output |
+| No external `run.yaml` needed | No `library_client_config_path` in config |
+| Secrets preserved as env refs | `api_key: ${env.OPENAI_API_KEY}` in synthesized file |
+| Full unit suite | 2098 passed, 1 skipped, 0 failed |
+| Round-trip lossless | `test_migrate_then_synthesize_reproduces_run_yaml` green |
+
+### Surprises discovered during the PoC
+
+- **`AsyncLlamaStackAsLibraryClient` takes a file path, not a dict** (Decision
+  T9). The library client reads the file itself. Consequence: library mode
+  must write a synthesized file to disk. No dict-only shortcut.
+- **`profile:` path resolution** uses the directory of the
+  `lightspeed-stack.yaml`. Relative paths resolve against the LCORE
+  config's directory, not the current working directory. Absolute paths
+  always work. The spec doc recommends documenting this clearly.
+- **Default baseline requires `EXTERNAL_PROVIDERS_DIR`**. `src/data/default_run.yaml`
+  (copied from the repo's `run.yaml`) references `${env.EXTERNAL_PROVIDERS_DIR}`
+  without a default. Either ship a thinner default baseline, or change the
+  reference to `${env.EXTERNAL_PROVIDERS_DIR:=~/.llama/providers.d}`. Flagged
+  for the implementation JIRA.
+- **High-level inference naming collision** (described above in the list of
+  divergences from the production design).
+
+---
+
+## Background sections
+
+### Current architecture (before LCORE-836)
+
+Two files:
+
+- **`lightspeed-stack.yaml`** — LCORE settings: service host/port,
+  conversation cache, user data collection, MCP servers, authentication,
+  authorization, quota, etc. Also contains `llama_stack:` with
+  connection-to-LS settings (URL/api_key, or library-client mode with a
+  path to an external `run.yaml`). 
+- **`run.yaml`** — Llama Stack operational config: `apis`, `providers` + (inference, safety, tool_runtime, vector_io, agents, ...), `storage`, + `registered_resources`, `vector_stores`, `safety`. + +**Existing enrichment** (`src/llama_stack_configuration.py`): + +- LCORE already enriches an input `run.yaml` with dynamic values from + `lightspeed-stack.yaml`: Azure Entra ID tokens (side-effect to `.env`), + BYOK RAG entries, Solr/OKP provider/store/model registration. Output is + an enriched `run.yaml`. +- Called in two places: `scripts/llama-stack-entrypoint.sh` at LS container + boot (server mode) and `src/client.py:_enrich_library_config()` (library + mode). +- LCORE-779 made this automatic; LCORE-518 (closed spike) proved (re)generation + feasibility. Both are the groundwork the current spike builds on. + +The new synthesizer *subsumes* the enrichment: it builds the full run.yaml +(baseline + enrichment + high-level + native_override) rather than +incrementally enriching an existing one. + +### Design alternatives considered + +Attributes (★ = high-weight for LCORE-836): + +| Attribute | A | B+C | C+E | E | G | +|---|---|---|---|---|---| +| ★ Operator UX | 2 | 4–5 | **4** | 5 | 3 | +| Abstraction cleanliness | 1 | 4 | 3 | 4 | 2 | +| LS schema resilience | 1 | 4 | 3 | 3 | 2 | +| ★ Escape-hatch power | 5 | 3 | 5 | 5 | 5 | +| Implementation cost | 4 | 2 | 2 | 3 | 3 | +| Maintenance load | 2 | 3 | 3 | 2 | 3 | +| ★ Backward compatibility | 3 | 3 | 3 | 3 | 4 | +| Validation rigor | 2 | 5 | 4 | 3 | 2 | +| ★ Dynamic-reconfig fit | 2 | 5 | 4 | 4 | 2 | +| ★ Library+server parity | 5 | 4 | 4 | 5 | 5 | +| Provider plurality | 5 | 4 | 5 | 4 | 5 | +| Testability | 3 | 4 | 3 | 5 | 3 | + +- **A (Embedded native)** — no abstraction win; same LS schema exposure as today. +- **B (High-level only)** — best UX when everything maps, painful at the edges. +- **C (B + `native_override`)** — recommended; combines B's UX with A's escape hatch. +- **E (Profiles, feature-only)** — optional layer on top of C. +- **G (Kustomize-style patches)** — strong for backward compat, weak on + validation and dynamic reconfig. + +### Merge semantics — worked examples + +Given the baseline: +```yaml +safety: + default_shield_id: llama-guard + excluded_categories: [violence, sexual_content] +providers: + inference: + - provider_id: openai + provider_type: remote::openai + config: + api_key: ${env.OPENAI_API_KEY} +``` + +And `native_override`: +```yaml +safety: + excluded_categories: [spam] +``` + +**Deep-merge-with-list-replacement (chosen)** produces: +```yaml +safety: + default_shield_id: llama-guard # preserved (not in override) + excluded_categories: [spam] # list replaced +providers: # not in override — preserved + inference: + - provider_id: openai + ... +``` + +The recommendation's appeal: to keep `default_shield_id`, the user doesn't +have to restate it. To replace `excluded_categories`, the user provides the +new list — they don't need to know a patch syntax. + +### Process-model recap (no LCORE supervision of LS) + +**Library mode**: LCORE process embeds the Llama Stack library client. LCORE +synthesizes `run.yaml` to a file, calls `AsyncLlamaStackAsLibraryClient(path)`, +initializes, serves. One process. + +**Server mode**: Llama Stack runs as a separate process (container). LCORE +connects to it over HTTP. Under unified mode, the LS container's entrypoint +reads the mounted `lightspeed-stack.yaml`, the Python CLI auto-detects +unified mode, synthesizes `run.yaml`, then `exec llama stack run` with it. 
+LCORE container reads the same `lightspeed-stack.yaml`, ignores the +`config` sub-block (server mode — only connection fields matter), connects. +Two processes. LCORE does **not** start, monitor, or supervise the LS +process — the orchestrator (docker-compose, systemd, k8s) does. Supervision +is out of scope for this spike (see Decision S4). + +### What must not break during rollout + +See [Backward compatibility scope](#backward-compatibility-scope). The four +must-not-break surfaces: + +1. Existing `lightspeed-stack.yaml` with `library_client_config_path`. +2. Existing `run.yaml` content, including fields LCORE doesn't model. +3. Existing CI/CD templating that treats `run.yaml` as a separate artifact. +4. Existing enrichment behavior (Azure Entra ID, BYOK RAG, Solr/OKP). + +### Backward compatibility scope + +Detection rule at load time: + +| `lightspeed-stack.yaml` shape | Interpretation | +|---|---| +| `llama_stack.library_client_config_path` set, no `llama_stack.config` | **Legacy** — today's behavior | +| `llama_stack.config.*` present | **Unified** — new path | +| Both present | Error at load time — clear message | +| Neither (remote URL only, no config) | Existing remote mode — unchanged | + +Three migration paths operators can choose: + +| Path | Effort | Result | +|---|---|---| +| Do nothing | 0 | Legacy keeps working until deprecation window closes | +| Lift-and-shift (via migration tool) | seconds | Unified single file, zero semantic change | +| Re-express | hours+ | Unified single file, fully adopts the high-level schema | + +--- + +## Appendix A — Files changed in the PoC + +Relative to `upstream/main`: + +| File | Purpose | +|---|---| +| `src/models/config.py` | New classes: `UnifiedInferenceProvider`, `UnifiedInferenceSection`, `UnifiedLlamaStackConfig`; modified `LlamaStackConfiguration` (adds `config` field + mutual-exclusion validator) | +| `src/llama_stack_configuration.py` | New: `synthesize_configuration`, `deep_merge_list_replace`, `apply_high_level_inference`, `load_default_baseline`, `synthesize_to_file`, `migrate_config_dumb`. CLI `main()` auto-detects unified vs legacy. | +| `src/data/default_run.yaml` | Built-in default baseline (copied from repo root `run.yaml` for the PoC — implementation JIRA should slim it down; see PoC surprise about `EXTERNAL_PROVIDERS_DIR`) | +| `src/client.py` | Library-mode path picks synthesis for unified configs, enrichment for legacy | +| `src/lightspeed_stack.py` | `--migrate-config`, `--run-yaml`, `--migrate-output` flags | +| `scripts/llama-stack-entrypoint.sh` | Comment updated — script itself needs no change (Python CLI auto-detects) | +| `test.containerfile` | Copies `src/data/` into the LS container | +| `tests/unit/test_llama_stack_synthesize.py` | 22 new tests: merge semantics, high-level inference, synthesize pipeline, migration round-trip | +| `tests/unit/models/config/test_llama_stack_configuration.py` | 3 new tests: unified/legacy mutual exclusion | +| `tests/unit/models/config/test_dump_configuration.py` | 5 expected-dict updates (new `config: None` field appears in dumps) | +| `tests/unit/test_client.py` | Error-message regex updated | +| `docs/design/llama-stack-config-merge/` | Spike doc, spec doc, PoC evidence, proposed JIRAs | + +## Appendix B — Commands to reproduce the library-mode PoC + +```bash +# 1. 
Start LCORE in library mode with a unified config +export OPENAI_API_KEY= +export E2E_OPENAI_MODEL=gpt-4o-mini +mkdir -p /tmp/lcore-836-poc +uv run lightspeed-stack \ + -c docs/design/llama-stack-config-merge/poc-evidence/lightspeed-stack-unified-library.yaml + +# 2. In another shell — query +curl -s http://localhost:8080/liveness +curl -s http://localhost:8080/readiness +curl -s -X POST http://localhost:8080/v1/query \ + -H 'Content-Type: application/json' \ + -d '{"query": "Name three primary colors. One sentence."}' + +# 3. Inspect what was synthesized +cat /tmp/llama_stack_synthesized_config.yaml +``` diff --git a/docs/design/llama-stack-config-merge/llama-stack-config-merge.md b/docs/design/llama-stack-config-merge/llama-stack-config-merge.md new file mode 100644 index 000000000..9847cc2fd --- /dev/null +++ b/docs/design/llama-stack-config-merge/llama-stack-config-merge.md @@ -0,0 +1,502 @@ +# Feature design: Llama Stack config merge (unified `lightspeed-stack.yaml`) + +| | | +|--------------------|----------------------------------------------------------------------------------| +| **Date** | 2026-04-23 | +| **Component** | Lightspeed Core Stack (src/models/config.py, src/llama_stack_configuration.py, src/client.py, src/lightspeed_stack.py, scripts/llama-stack-entrypoint.sh) | +| **Authors** | Maxim Svistunov | +| **Feature** | [LCORE-836](https://redhat.atlassian.net/browse/LCORE-836) | +| **Spike** | [llama-stack-config-merge-spike.md](llama-stack-config-merge-spike.md) | +| **Links** | LCORE-509 (Epic), LCORE-777 (Epic), LCORE-518 (prior spike, Closed), LCORE-779 (auto-regen, Closed) | + +## What + +This feature collapses the two Lightspeed Core configuration files — +`lightspeed-stack.yaml` (LCORE settings) and `run.yaml` (Llama Stack +operational config) — into a single `lightspeed-stack.yaml`. At runtime, +LCORE synthesizes a full Llama Stack `run.yaml` from a new +`llama_stack.config` sub-section and hands it to Llama Stack (library +client or subprocess, mode-dependent). + +Key shape: + +- High-level keys under `llama_stack.config` for the common path + (v1: `inference`; future: `storage`, `safety`, `tools`). +- `llama_stack.config.native_override` escape hatch — raw Llama Stack + schema, deep-merged with list replacement. Covers anything the + high-level schema doesn't express. +- `llama_stack.config.profile` — path to a user-authored YAML that serves + as the synthesis baseline. +- `llama_stack.config.baseline: default | empty` — pick between LCORE's + built-in baseline and an empty dict (used by the migration tool for + exact round-trip). +- Legacy two-file mode (`llama_stack.library_client_config_path` + + external `run.yaml`) is preserved during a deprecation window; + mutually exclusive with `llama_stack.config`. + +## Why + +Two-file configuration multiplies the surface area for misconfiguration +and forces every downstream Lightspeed team (RHOAI, Konflux pipelines, +any product integrating LCORE) to understand Llama Stack's full internal +schema. A single source of truth: + +- Reduces the number of artifacts deployment tooling must manage + (Helm values, ConfigMaps, Kustomize overlays). +- Lets downstream teams express their intent at a high level (e.g. "use + OpenAI with these allowed models") rather than authoring raw LS + provider entries. +- Preserves an escape hatch so edge cases don't block adoption. + +LCORE-518 (closed) proved a generation PoC in principle; LCORE-779 +(closed) made configuration regeneration automatic at startup. 
This
+feature completes the picture by making `run.yaml` an implementation
+detail that LCORE owns, not an operator-facing artifact.
+
+## Requirements
+
+- **R1:** `lightspeed-stack.yaml` with a `llama_stack.config` sub-section
+  and no external `run.yaml` boots LCORE in both library and server modes
+  and serves `/v1/query` successfully.
+- **R2:** Legacy mode (`llama_stack.library_client_config_path` +
+  external `run.yaml`) works unchanged until the deprecation window
+  closes. A startup WARN is emitted starting one release after unified
+  mode lands.
+- **R3:** Setting both `llama_stack.config` and
+  `llama_stack.library_client_config_path` in the same file fails at
+  configuration load time with a clear error message pointing to the
+  migration tool.
+- **R4:** `lightspeed-stack --migrate-config --run-yaml X -c Y
+  --migrate-output Z` produces a unified configuration from the legacy
+  two-file pair. Running the migrated file drives Llama Stack behavior
+  identical to the original pair's (the dumb-mode round-trip is
+  lossless).
+- **R5:** When `llama_stack.config.native_override` overlaps a key set
+  by the high-level section or by the baseline, deep-merge semantics
+  apply with list replacement (maps merge recursively; lists are
+  replaced wholesale; scalars are replaced).
+- **R6:** Secrets are never resolved into the synthesized file on disk.
+  `${env.FOO}` references appear verbatim in the synthesized `run.yaml`.
+- **R7:** Existing enrichment behavior (Azure Entra ID, BYOK RAG,
+  Solr/OKP) produces the same result in unified mode as in legacy mode
+  for equivalent inputs.
+- **R8:** A profile referenced by a relative `profile:` path resolves
+  against the directory of the loaded `lightspeed-stack.yaml` (see the
+  sketch after the use cases).
+- **R9:** The unified schema extends the current `LlamaStackConfiguration`
+  Pydantic model with a new `config: Optional[UnifiedLlamaStackConfig]`
+  field; validation enforces mutual exclusion with legacy mode and
+  rejects unknown fields (`extra="forbid"`).
+- **R10:** The synthesized `run.yaml` is written to a persistent known
+  path (overwritten each boot), logged, and a CLI flag
+  `--synthesized-config-output` lets operators override the location for
+  debugging.
+- **R11:** Shape detection determines mode (unified vs legacy); an
+  optional `config_format_version` field is accepted but must agree with
+  the shape when present.
+
+## Use Cases
+
+- **U1:** As an operator setting up LCORE for the first time, I want to
+  write one config file with high-level provider choices (OpenAI, Azure,
+  …) so that I don't have to learn Llama Stack's internal schema.
+- **U2:** As a downstream team maintainer with an existing heavily
+  customized `run.yaml`, I want a mechanical one-shot migration so that
+  I can move to the unified format without re-expressing my edge cases.
+- **U3:** As an operator whose deployment sits behind a vLLM serving
+  stack not covered by the high-level schema, I want to drop my custom
+  configuration into `native_override` and still benefit from the rest
+  of the unified schema.
+- **U4:** As a Lightspeed Core maintainer, I want a single authoritative
+  place for docs, examples, and test configs so that downstream teams
+  find the same patterns everywhere.
+- **U5:** As a Red Hat release manager, I want legacy configs to keep
+  working throughout a deprecation window so that downstream products
+  can migrate on their own cadence.
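+
+One requirement worth pinning down in code is R8, since the PoC found
+relative-path resolution to be an easy thing to get wrong. A minimal
+sketch (the helper name is hypothetical; the real logic belongs in the
+synthesizer):
+
+```python
+from pathlib import Path
+
+
+def resolve_profile(profile: str, lightspeed_yaml: str) -> Path:
+    """R8: a relative `profile:` path resolves against the directory of
+    the loaded lightspeed-stack.yaml; absolute paths are used verbatim."""
+    candidate = Path(profile)
+    if candidate.is_absolute():
+        return candidate
+    return (Path(lightspeed_yaml).resolve().parent / candidate).resolve()
+```
+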
+## Architecture
+
+### Overview
+
+```text
+lightspeed-stack.yaml (unified mode)
+        │
+        ▼
+ ┌────────────────────────────┐
+ │ Configuration load         │  Pydantic validation, mutual-exclusion
+ │ src/configuration.py       │  check between `config` and
+ │ src/models/config.py       │  `library_client_config_path`.
+ └────────────┬───────────────┘
+              │ Configuration (typed)
+              ▼
+ ┌────────────────────────────┐  Baseline selection (profile /
+ │ Synthesizer                │  default / empty) + enrichment
+ │ synthesize_configuration   │  (BYOK RAG, Solr/OKP) + high-level
+ │ (llama_stack_config…)      │  sections + native_override deep-merge.
+ └────────────┬───────────────┘
+              │ synthesized run.yaml (dict)
+              ▼
+   Library mode                     Server mode
+   ────────────                     ───────────
+   Write to deterministic path.     Written by LS container's entrypoint
+   AsyncLlamaStackAsLibraryClient   script (same synthesizer, same CLI,
+   reads the path and initializes.  auto-detects unified via Python).
+                                    `llama stack run <path>` starts LS.
+                                    LCORE connects by URL.
+```
+
+### Trigger mechanism
+
+At LCORE startup (library mode): if `llama_stack.config` is set in the
+loaded `lightspeed-stack.yaml`, the synthesizer produces a `run.yaml`
+dict, writes it to disk, and passes the path to the library client.
+
+At Llama Stack container startup (server mode): the container's
+entrypoint script invokes
+`python3 /opt/app-root/llama_stack_configuration.py -c <lightspeed-stack.yaml> -o /opt/app-root/run.yaml`.
+The Python CLI auto-detects unified vs legacy by `llama_stack.config`
+presence; in unified mode it synthesizes and writes the output; in
+legacy mode it performs in-place enrichment as before.
+
+### Storage / data model changes
+
+No persistent storage is added. The synthesized `run.yaml` is written
+once per boot to a deterministic path; it is a plain file, not a
+database. `src/data/default_run.yaml` is a new package-shipped file: the
+built-in baseline Llama Stack configuration.
+
+### Configuration
+
+New sub-section under the existing `llama_stack` block:
+
+```yaml
+llama_stack:
+  use_as_library_client: true
+  # NOTE: library_client_config_path intentionally OMITTED in unified mode.
+  # Setting both `config` and `library_client_config_path` is a validation error.
+  config:
+    # Baseline selection
+    baseline: default           # default | empty; ignored if `profile` is set
+    profile: ./my-profile.yaml  # optional; resolves relative to lightspeed-stack.yaml
+
+    # High-level sections (v1: inference; future: storage, safety, tools, ...)
+    inference:
+      providers:
+        - type: openai                 # mapped to remote::openai
+          api_key_env: OPENAI_API_KEY
+          allowed_models: [gpt-4o-mini]
+        - type: sentence_transformers
+
+    # Escape hatch — raw Llama Stack schema, deep-merged with list replacement
+    native_override:
+      safety:
+        excluded_categories: [spam]
+```
+
+Pydantic classes (see `src/models/config.py`):
+
+```python
+# Excerpt from src/models/config.py; imports shown for completeness.
+# ConfigurationBase is LCORE's existing extra="forbid" pydantic base model.
+from typing import Any, Literal, Optional
+
+from pydantic import Field, model_validator
+from typing_extensions import Self  # or typing.Self on Python >= 3.11
+
+
+class UnifiedInferenceProvider(ConfigurationBase):
+    type: Literal[
+        "openai", "sentence_transformers", "azure", "vertexai",
+        "watsonx", "vllm_rhaiis", "vllm_rhel_ai",
+    ]
+    api_key_env: Optional[str] = None
+    allowed_models: Optional[list[str]] = None
+    extra: dict[str, Any] = Field(default_factory=dict)
+
+
+class UnifiedInferenceSection(ConfigurationBase):
+    providers: list[UnifiedInferenceProvider] = Field(default_factory=list)
+
+
+class UnifiedLlamaStackConfig(ConfigurationBase):
+    baseline: Literal["default", "empty"] = "default"
+    profile: Optional[str] = None
+    inference: Optional[UnifiedInferenceSection] = None
+    native_override: dict[str, Any] = Field(default_factory=dict)
+
+
+class LlamaStackConfiguration(ConfigurationBase):
+    # existing fields unchanged (url, api_key, use_as_library_client,
+    # library_client_config_path, timeout)
+    config: Optional[UnifiedLlamaStackConfig] = None
+
+    @model_validator(mode="after")
+    def check_llama_stack_model(self) -> Self:
+        if self.config is not None and self.library_client_config_path is not None:
+            raise ValueError("... mutually exclusive ... use --migrate-config")
+        # ...legacy checks preserved...
+        return self
+```
+
+### API changes
+
+None at the REST API surface. Internal API additions in
+`src/llama_stack_configuration.py`:
+
+- `synthesize_configuration(lcs_config, config_file_dir, default_baseline)
+  -> dict` — the synthesis pipeline.
+- `synthesize_to_file(lcs_config, output_file, config_file_dir) -> None` —
+  synthesis + write.
+- `migrate_config_dumb(run_yaml_path, lightspeed_yaml_path, output_path)
+  -> None` — dumb-mode migration (lossless round-trip).
+- `deep_merge_list_replace(base, overlay) -> dict` — merge helper.
+- `apply_high_level_inference(ls_config, inference)` — high-level expansion.
+- `load_default_baseline() -> dict` — loads `src/data/default_run.yaml`.
+
+CLI additions in `src/lightspeed_stack.py`:
+
+- `--migrate-config` — invoke the migration tool.
+- `--run-yaml <path>` — input for `--migrate-config`.
+- `--migrate-output <path>` — output for `--migrate-config`.
+- (recommended for R10) `--synthesized-config-output <path>` — override
+  the default deterministic synthesis location.
+
+The legacy CLI docstring in `create_argument_parser()` referencing the
+removed `-g/-i/-o` flags is cleaned up as part of the docs JIRA.
+
+### Error handling
+
+- **Unified + legacy set simultaneously**: raised during
+  `LlamaStackConfiguration.check_llama_stack_model`. Error message
+  directs to `--migrate-config`.
+- **Library mode with neither `config` nor `library_client_config_path`**:
+  raised during the same validator. Error identifies the two valid paths.
+- **`profile:` path does not exist**: surfaced as `FileNotFoundError`
+  from `open(profile_path)` during synthesis. The implementation JIRA
+  should wrap this with context about where the path was resolved.
+- **Unknown provider `type` in high-level inference**: rejected by the
+  Pydantic `Literal` — the operator sees a validation error naming the
+  allowed types. Escape: use `native_override`.
+- **Unknown fields in any unified-mode section**: rejected by
+  `extra="forbid"` on `ConfigurationBase`. 
+- **Llama Stack rejects the synthesized `run.yaml`**: surfaces as
+  whatever LS itself raises (a ValidationError from LS's own config
+  parsing). The implementation JIRA should log the synthesized file path
+  before handing it to LS so operators can inspect what failed.
+
+### Security considerations
+
+- **No secrets written to disk**: `apply_high_level_inference` emits
+  `${env.<VAR>}` references, never the resolved secret. Logging the
+  synthesized `run.yaml` path is safe; the file's contents carry only
+  env-var references for secrets.
+- **`native_override` is raw YAML**: content is operator-controlled, so
+  no new injection surface — same trust model as the existing
+  `run.yaml`. LCORE does no template expansion other than the existing
+  `replace_env_vars()` step in the load pipeline.
+- **Synthesized file location**: persistent known path, world-readable
+  by default in a container. This is acceptable because the file
+  contains only env-var references for secrets; operators who want
+  stricter filesystem permissions should tighten the mount.
+
+### Migration / backwards compatibility
+
+Coexistence mechanism: shape detection (see R11). Legacy configs with
+`llama_stack.library_client_config_path` continue working through the
+configured deprecation window.
+
+Three operator-facing migration paths (choose per deployment):
+
+| Path | Effort | Result |
+|---|---|---|
+| Do nothing | 0 | Legacy keeps working until deprecation closes |
+| Lift-and-shift | seconds — `lightspeed-stack --migrate-config ...` | Single-file, byte-equivalent LS behavior |
+| Re-express | hours+ | Single-file; high-level sections replace `native_override` |
+
+Deprecation schedule: calendar-based (per Decision S2 in the spike);
+concrete numbers set by @sbunciak at release time. Default recommended
+shape: unified mode ships as opt-in at release N; legacy-mode WARN
+begins one release later; legacy-mode removal no sooner than 6 months
+after WARN begins.
+
+## Implementation Suggestions
+
+### Key files and insertion points
+
+| File | What to do |
+|---|---|
+| `src/models/config.py` | Add `UnifiedInferenceProvider`, `UnifiedInferenceSection`, `UnifiedLlamaStackConfig`. Modify `LlamaStackConfiguration` — add `config` field, extend the `model_validator` for the mutual-exclusion check. |
+| `src/llama_stack_configuration.py` | Add `synthesize_configuration`, `deep_merge_list_replace`, `apply_high_level_inference`, `load_default_baseline`, `synthesize_to_file`, `migrate_config_dumb`, `PROVIDER_TYPE_MAP`, `DEFAULT_BASELINE_RESOURCE`. Update `main()` to auto-detect unified vs legacy. |
+| `src/data/default_run.yaml` | New file — a thinner baseline than today's repo-root `run.yaml`. Notably do **not** reference `${env.EXTERNAL_PROVIDERS_DIR}` without a default (see PoC surprise in the spike doc). |
+| `src/client.py` | In `_load_library_client`: branch on `config.config` presence. Add `_synthesize_library_config()` that calls the synthesizer and writes to the deterministic path (R10). Keep `_enrich_library_config` for legacy. |
+| `src/lightspeed_stack.py` | Add `--migrate-config`, `--run-yaml`, `--migrate-output`, `--synthesized-config-output` flags. Add an early-exit branch in `main()` that dispatches to `migrate_config_dumb` when `--migrate-config` is set. Clean up the stale docstring. |
+| `scripts/llama-stack-entrypoint.sh` | No functional change — the Python CLI already auto-detects. Update the comment to document both modes. 
| +| `test.containerfile` | Copy `src/data/` into `/opt/app-root/data/` so `load_default_baseline()` resolves inside the LS container. | +| `docker-compose.yaml` | Provide a unified-mode variant (either a new compose file or env-var-switched mount list). Legacy compose continues to work. | + +### Insertion point detail + +**`synthesize_configuration` pipeline** (the core new function): + +1. Retrieve `unified = lcs_config["llama_stack"]["config"]` — raise if absent. +2. Baseline: if `unified.profile` set → load that file. Else if + `unified.baseline == "empty"` → `{}`. Else → `default_baseline` arg or + `load_default_baseline()`. +3. Run `dedupe_providers_vector_io` on the baseline. +4. Apply existing enrichment: `enrich_byok_rag`, `enrich_solr` (Azure + Entra ID intentionally stays separate because it's a `.env` + side-effect, not an `ls_config` mutation). +5. If `unified.inference` present → `apply_high_level_inference`. +6. If `unified.native_override` non-empty → + `deep_merge_list_replace(ls_config, native_override)`. +7. `dedupe_providers_vector_io` again for good measure. +8. Return the final dict. + +**`_load_library_client` fork point** (in `src/client.py`): + +```python +if config.config is not None: + self._config_path = self._synthesize_library_config() +elif config.library_client_config_path is not None: + self._config_path = self._enrich_library_config(config.library_client_config_path) +else: + raise ValueError(...) # caught by the validator at load time; belt-and-suspenders here +``` + +### Config pattern + +All new config classes extend `ConfigurationBase` (`extra="forbid"`). +Use `Field()` with defaults, title, and description for every attribute. +Cross-field validation in `UnifiedLlamaStackConfig` is not currently +needed — the precedence is strictly ordered and handled by the +synthesizer, not by the model. + +Example config files live in `examples/profiles/` (two reference +profiles — one remote-provider, one inline-provider) and in +`examples/lightspeed-stack-unified.yaml` as the canonical "unified mode" +reference. + +### Test patterns + +- Framework: pytest + pytest-mock. Unit tests live in + `tests/unit/test_llama_stack_synthesize.py` (synthesizer + migration) + and `tests/unit/models/config/test_llama_stack_configuration.py` + (schema validation). +- Merge semantics: parametric tests over scalar / map / list / + type-mismatch / precedence cases. +- Round-trip test: migrate → synthesize → assert dict equality with the + original `run.yaml`. Pattern already live in + `test_migrate_then_synthesize_reproduces_run_yaml`. +- Schema validation tests: mutual exclusion, remote URL + config, + library mode + config without legacy path. +- Feature-specific: provider_type map completeness test asserts every + `Literal` value on `UnifiedInferenceProvider.type` has a + `PROVIDER_TYPE_MAP` entry. +- e2e behave tests: migrate `tests/e2e/configuration/**` configs to + unified form as part of LCORE-???? (test migration JIRA). + +## Open Questions for Future Work + +- **Smart migration mode** (`--migrate-config --smart`): factoring an + existing `run.yaml` into high-level sections rather than dumping to + `native_override`. Valuable ergonomic win; deferred because the + factoring rules require careful design per provider type. +- **Additional high-level sections** beyond `inference` — `storage`, + `safety`, `tools`, `vector_stores`, etc. Add as real demand appears, + not speculatively. 
+- **User-supplied profile directory**: `profile_dir: /etc/lcore/profiles/` + with name-based lookup. Deferred to v2. +- **LS process supervision** (restart on crash, signal propagation, + merged logs) — covered by LCORE-777 / LCORE-778, not this feature. +- **Dynamic reconfig / hot-reload** (live `POST /v1/rag` that adds a BYOK + RAG without restart) — covered by LCORE-781, not this feature. Llama + Stack's lack of native hot-reload means any implementation requires + supervised restart, which is out of scope here. +- **`config_format_version`** as an explicit schema version, accepted + but not required. Will become load-bearing the first time the unified + schema undergoes a real breaking change. +- **Validation pre-flight against the Llama Stack schema**: today LCORE + only validates its own schema; LS validates its own at startup. + Introducing a pre-flight validator would catch bad synthesis earlier + but creates a heavy dependency on LS internals. + +## Changelog + +| Date | Change | Reason | +|---|---|---| +| 2026-04-23 | Initial version | Spike completion | + +## Appendix A — Worked example: legacy → unified migration + +Given legacy: + +```yaml +# run.yaml +version: 2 +apis: [agents, inference, vector_io, ...] +providers: + inference: + - provider_id: openai + provider_type: remote::openai + config: + api_key: ${env.OPENAI_API_KEY} + allowed_models: ["${env.E2E_OPENAI_MODEL:=gpt-4o-mini}"] +# ... more ... +``` + +```yaml +# lightspeed-stack.yaml +name: LCS +llama_stack: + use_as_library_client: true + library_client_config_path: ./run.yaml +# ... rest ... +``` + +Run: + +```bash +lightspeed-stack --migrate-config \ + --run-yaml run.yaml \ + -c lightspeed-stack.yaml \ + --migrate-output lightspeed-stack-unified.yaml +``` + +Produces: + +```yaml +# lightspeed-stack-unified.yaml +name: LCS +llama_stack: + use_as_library_client: true + # library_client_config_path is REMOVED + config: + baseline: empty + native_override: + version: 2 + apis: [agents, inference, vector_io, ...] + providers: + inference: + - provider_id: openai + provider_type: remote::openai + config: + api_key: ${env.OPENAI_API_KEY} + allowed_models: ["${env.E2E_OPENAI_MODEL:=gpt-4o-mini}"] + # ... rest of run.yaml content under native_override ... +# ... rest of lightspeed-stack.yaml content ... +``` + +Operator uses the unified file directly and can delete the original +`run.yaml`. Subsequent re-expression (moving from `native_override` into +high-level sections) is optional and per-deployment. + +## Appendix B — Reference profile example + +```yaml +# examples/profiles/openai-remote.yaml +# A minimal profile for an OpenAI-backed remote Llama Stack. +# Referenced via `llama_stack.config.profile: examples/profiles/openai-remote.yaml`. +version: 2 +apis: [agents, inference, safety, tool_runtime, vector_io] +providers: + inference: + - provider_id: openai + provider_type: remote::openai + config: + api_key: ${env.OPENAI_API_KEY} + allowed_models: ["${env.OPENAI_MODEL:=gpt-4o-mini}"] + - provider_id: sentence-transformers + provider_type: inline::sentence-transformers +# ... the rest is the same shape as a working run.yaml ... 
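+# For instance (illustrative, not exhaustive — mirrors the built-in
+# baseline shown in src/data/default_run.yaml), the elided remainder
+# would include the shield registration:
+# registered_resources:
+#   shields:
+#     - shield_id: llama-guard
+#       provider_id: llama-guard
+#       provider_shield_id: openai/gpt-4o-mini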
+``` diff --git a/docs/design/llama-stack-config-merge/poc-evidence/library-mode/README.md b/docs/design/llama-stack-config-merge/poc-evidence/library-mode/README.md new file mode 100644 index 000000000..6bfa7f5e9 --- /dev/null +++ b/docs/design/llama-stack-config-merge/poc-evidence/library-mode/README.md @@ -0,0 +1,26 @@ +# Library-mode PoC evidence + +Command: +```bash +export OPENAI_API_KEY= +export E2E_OPENAI_MODEL=gpt-4o-mini +uv run lightspeed-stack -c docs/design/llama-stack-config-merge/poc-evidence/lightspeed-stack-unified-library.yaml +``` + +## What the unified config does + +- `llama_stack.config.profile: /abs/path/to/tests/e2e/configs/run-ci.yaml` — baseline loaded from the CI profile +- `llama_stack.config.native_override.safety.default_shield_id: llama-guard` — override proves merge works + +## Evidence + +- `synthesized-run.yaml` — the full run.yaml LCORE produced from the unified config +- `query-response.json` — a successful `/v1/query` round-trip + +## Proves + +- `llama_stack.library_client_config_path` was NOT used (no external run.yaml needed) +- `llama_stack.config.profile` was used as the synthesis baseline (path resolution works with absolute paths) +- `llama_stack.config.native_override` was merged onto the baseline +- `AsyncLlamaStackAsLibraryClient` accepts the synthesized file path (answered item #24: file-only, not dict) +- `/v1/query` succeeded end-to-end through the synthesized stack diff --git a/docs/design/llama-stack-config-merge/poc-evidence/library-mode/query-response.json b/docs/design/llama-stack-config-merge/poc-evidence/library-mode/query-response.json new file mode 100644 index 000000000..5664cbd00 --- /dev/null +++ b/docs/design/llama-stack-config-merge/poc-evidence/library-mode/query-response.json @@ -0,0 +1 @@ +{"conversation_id":"976ef32527283085ba2f1d0cfb4c16d97071bf64391a8200","response":"The three primary colors are red, blue, and yellow.","rag_chunks":[],"referenced_documents":[],"truncated":false,"input_tokens":24,"output_tokens":12,"available_quotas":{},"tool_calls":[],"tool_results":[]} \ No newline at end of file diff --git a/docs/design/llama-stack-config-merge/poc-evidence/library-mode/synthesized-run.yaml b/docs/design/llama-stack-config-merge/poc-evidence/library-mode/synthesized-run.yaml new file mode 100644 index 000000000..34e3e1fc9 --- /dev/null +++ b/docs/design/llama-stack-config-merge/poc-evidence/library-mode/synthesized-run.yaml @@ -0,0 +1,148 @@ +apis: + - agents + - batches + - datasetio + - eval + - files + - inference + - safety + - scoring + - tool_runtime + - vector_io +benchmarks: [] +datasets: [] +image_name: starter +providers: + agents: + - config: + persistence: + agent_state: + backend: kv_default + namespace: agents_state + responses: + backend: sql_default + table_name: agents_responses + provider_id: meta-reference + provider_type: inline::meta-reference + batches: + - config: + kvstore: + backend: kv_default + namespace: batches_store + provider_id: reference + provider_type: inline::reference + datasetio: + - config: + kvstore: + backend: kv_default + namespace: huggingface_datasetio + provider_id: huggingface + provider_type: remote::huggingface + - config: + kvstore: + backend: kv_default + namespace: localfs_datasetio + provider_id: localfs + provider_type: inline::localfs + eval: + - config: + kvstore: + backend: kv_default + namespace: eval_store + provider_id: meta-reference + provider_type: inline::meta-reference + files: + - config: + metadata_store: + backend: sql_default + table_name: 
files_metadata + storage_dir: ~/.llama/storage/files + provider_id: meta-reference-files + provider_type: inline::localfs + inference: + - config: + allowed_models: + - ${env.E2E_OPENAI_MODEL:=gpt-4o-mini} + api_key: ${env.OPENAI_API_KEY} + provider_id: openai + provider_type: remote::openai + - config: {} + provider_id: sentence-transformers + provider_type: inline::sentence-transformers + safety: + - config: + excluded_categories: [] + provider_id: llama-guard + provider_type: inline::llama-guard + scoring: + - config: {} + provider_id: basic + provider_type: inline::basic + - config: {} + provider_id: llm-as-judge + provider_type: inline::llm-as-judge + - config: + openai_api_key: '********' + provider_id: braintrust + provider_type: inline::braintrust + tool_runtime: + - config: {} + provider_id: rag-runtime + provider_type: inline::rag-runtime + - config: {} + provider_id: model-context-protocol + provider_type: remote::model-context-protocol + vector_io: [] +registered_resources: + benchmarks: [] + datasets: [] + models: + - metadata: + embedding_dimension: 768 + model_id: all-mpnet-base-v2 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: all-mpnet-base-v2 + scoring_fns: [] + shields: + - provider_id: llama-guard + provider_shield_id: openai/gpt-4o-mini + shield_id: llama-guard + tool_groups: + - provider_id: rag-runtime + toolgroup_id: builtin::rag + vector_stores: [] +safety: + default_shield_id: llama-guard +scoring_fns: [] +server: + port: 8321 +storage: + backends: + kv_default: + db_path: ${env.KV_STORE_PATH:=~/.llama/storage/kv_store.db} + type: kv_sqlite + sql_default: + db_path: ${env.SQL_STORE_PATH:=~/.llama/storage/sql_store.db} + type: sql_sqlite + stores: + conversations: + backend: sql_default + table_name: openai_conversations + inference: + backend: sql_default + max_write_queue_size: 10000 + num_writers: 4 + table_name: inference_store + metadata: + backend: kv_default + namespace: registry + prompts: + backend: kv_default + namespace: prompts +vector_stores: + default_embedding_model: + model_id: all-mpnet-base-v2 + provider_id: sentence-transformers + default_provider_id: faiss +version: 2 diff --git a/docs/design/llama-stack-config-merge/poc-evidence/lightspeed-stack-unified-library.yaml b/docs/design/llama-stack-config-merge/poc-evidence/lightspeed-stack-unified-library.yaml new file mode 100644 index 000000000..a75ad5bf6 --- /dev/null +++ b/docs/design/llama-stack-config-merge/poc-evidence/lightspeed-stack-unified-library.yaml @@ -0,0 +1,33 @@ +name: Lightspeed Core Service (LCS) - Unified PoC +service: + host: 0.0.0.0 + port: 8080 + base_url: http://localhost:8080 + auth_enabled: false + workers: 1 + color_log: true + access_log: true +# Unified mode: no `library_client_config_path`. Operational LS config is +# synthesized by LCORE from `llama_stack.config` below. +llama_stack: + use_as_library_client: true + config: + # Use the CI-friendly baseline via `profile` (no EXTERNAL_PROVIDERS_DIR + # env var required). Equivalent to what tests/e2e/configs/run-ci.yaml + # provides; this exercises the `profile:` path of the synthesizer. + profile: /home/msvistun/repos/lightspeed/stack/tests/e2e/configs/run-ci.yaml + # Small native_override: prove overrides take effect end-to-end. 
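+    # Illustrative (not used in this PoC run): the same `config` block could
+    # also carry a high-level section, which the synthesizer expands before
+    # the override below is merged, e.g.:
+    #   inference:
+    #     providers:
+    #       - type: openai
+    #         api_key_env: OPENAI_API_KEY
+    #         allowed_models: ["gpt-4o-mini"]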
+ native_override: + safety: + default_shield_id: llama-guard +user_data_collection: + feedback_enabled: false + feedback_storage: "/tmp/lcore-836-poc/feedback" + transcripts_enabled: false + transcripts_storage: "/tmp/lcore-836-poc/transcripts" +conversation_cache: + type: "sqlite" + sqlite: + db_path: "/tmp/lcore-836-poc/conversation-cache.db" +authentication: + module: "noop" diff --git a/scripts/llama-stack-entrypoint.sh b/scripts/llama-stack-entrypoint.sh index a7eeb797b..6917017c5 100755 --- a/scripts/llama-stack-entrypoint.sh +++ b/scripts/llama-stack-entrypoint.sh @@ -1,6 +1,12 @@ #!/bin/bash # Entrypoint for llama-stack container. -# Enriches config with lightspeed dynamic values, then starts llama-stack. +# Produces the run.yaml from lightspeed-stack.yaml then starts llama-stack. +# +# Two modes, auto-detected by the Python CLI (llama_stack_configuration.py): +# - Unified (LCORE-836): `llama_stack.config` present in lightspeed-stack.yaml. +# The full run.yaml is SYNTHESIZED from the unified block; -i is ignored. +# - Legacy: `run.yaml` is mounted separately and ENRICHED with BYOK RAG / Solr / +# Azure Entra ID values from lightspeed-stack.yaml. set -e @@ -9,9 +15,9 @@ ENRICHED_CONFIG="/opt/app-root/run.yaml" LIGHTSPEED_CONFIG="${LIGHTSPEED_CONFIG:-/opt/app-root/lightspeed-stack.yaml}" ENV_FILE="/opt/app-root/.env" -# Enrich config if lightspeed config exists +# Run the config producer if lightspeed config exists if [ -f "$LIGHTSPEED_CONFIG" ]; then - echo "Enriching llama-stack config..." + echo "Preparing llama-stack config from $LIGHTSPEED_CONFIG ..." ENRICHMENT_FAILED=0 python3 /opt/app-root/llama_stack_configuration.py \ -c "$LIGHTSPEED_CONFIG" \ diff --git a/src/client.py b/src/client.py index 0c77c2d49..f42544a13 100644 --- a/src/client.py +++ b/src/client.py @@ -3,6 +3,7 @@ import json import os import tempfile +from pathlib import Path from typing import Optional import yaml @@ -11,7 +12,12 @@ from llama_stack_client import APIConnectionError, AsyncLlamaStackClient from configuration import configuration -from llama_stack_configuration import YamlDumper, enrich_byok_rag, enrich_solr +from llama_stack_configuration import ( + YamlDumper, + enrich_byok_rag, + enrich_solr, + synthesize_configuration, +) from log import get_logger from models.config import LlamaStackConfiguration from models.responses import ServiceUnavailableResponse @@ -44,22 +50,65 @@ async def load(self, llama_stack_config: LlamaStackConfiguration) -> None: async def _load_library_client(self, config: LlamaStackConfiguration) -> None: """Initialize client in library mode. + Two paths: + - Unified mode (`config.config` set): synthesize full run.yaml from the + lightspeed-stack config and write to a deterministic path. + - Legacy mode (`config.library_client_config_path` set): read the + external run.yaml and apply in-place enrichment. + Stores the final config path for use in reload. 
""" - if config.library_client_config_path is None: + if config.config is not None: + logger.info("Using Llama stack as library client (unified mode)") + self._config_path = self._synthesize_library_config() + elif config.library_client_config_path is not None: + logger.info("Using Llama stack as library client (legacy mode)") + self._config_path = self._enrich_library_config( + config.library_client_config_path + ) + else: raise ValueError( - "Configuration problem: library_client_config_path is not set" + "Configuration problem: neither `llama_stack.config` (unified) " + "nor `llama_stack.library_client_config_path` (legacy) is set" ) - logger.info("Using Llama stack as library client") - - self._config_path = self._enrich_library_config( - config.library_client_config_path - ) client = AsyncLlamaStackAsLibraryClient(self._config_path) await client.initialize() self._lsc = client + def _synthesize_library_config(self) -> str: + """Synthesize the full Llama Stack run.yaml from unified-mode config. + + Library-client-friendly: writes to a file since the Llama Stack library + client only accepts a file path (not a dict). Returns the path to the + synthesized file. + + The synthesizer preserves env-var references (`${env.FOO}`) verbatim; + secrets are not resolved into the file on disk. + + Returns: + str: Path to the synthesized run.yaml. + """ + lcs_config_dict = configuration.configuration.model_dump( + exclude_none=True, mode="python" + ) + config_file_dir: Optional[Path] = None + env_path = os.environ.get("LIGHTSPEED_STACK_CONFIG_PATH") + if env_path: + config_file_dir = Path(env_path).resolve().parent + + ls_config = synthesize_configuration( + lcs_config_dict, config_file_dir=config_file_dir + ) + + synthesized_path = os.path.join( + tempfile.gettempdir(), "llama_stack_synthesized_config.yaml" + ) + with open(synthesized_path, "w", encoding="utf-8") as f: + yaml.dump(ls_config, f, Dumper=YamlDumper, default_flow_style=False) + logger.info("Wrote synthesized Llama Stack config to %s", synthesized_path) + return synthesized_path + def _load_service_client(self, config: LlamaStackConfiguration) -> None: """Initialize client in service mode (remote HTTP).""" logger.info("Using Llama stack running as a service") diff --git a/src/data/default_run.yaml b/src/data/default_run.yaml new file mode 100644 index 000000000..7a4a78efa --- /dev/null +++ b/src/data/default_run.yaml @@ -0,0 +1,155 @@ +version: 2 + +apis: +- agents +- batches +- datasetio +- eval +- files +- inference +- safety +- scoring +- tool_runtime +- vector_io + +benchmarks: [] +datasets: [] +image_name: starter +external_providers_dir: ${env.EXTERNAL_PROVIDERS_DIR} + +providers: + inference: + - provider_id: openai # This ID is a reference to 'providers.inference' + provider_type: remote::openai + config: + api_key: ${env.OPENAI_API_KEY} + allowed_models: ["${env.E2E_OPENAI_MODEL:=gpt-4o-mini}"] + - provider_id: sentence-transformers + provider_type: inline::sentence-transformers + files: + - config: + metadata_store: + table_name: files_metadata + backend: sql_default + storage_dir: ~/.llama/storage/files + provider_id: meta-reference-files + provider_type: inline::localfs + safety: + - config: + excluded_categories: [] + provider_id: llama-guard + provider_type: inline::llama-guard + scoring: + - provider_id: basic + provider_type: inline::basic + config: {} + - provider_id: llm-as-judge + provider_type: inline::llm-as-judge + config: {} + - provider_id: braintrust + provider_type: inline::braintrust + config: + 
openai_api_key: '********' + tool_runtime: + - config: {} # Enable the RAG tool + provider_id: rag-runtime + provider_type: inline::rag-runtime + vector_io: + - config: + persistence: + namespace: vector_io::faiss + backend: kv_default + provider_id: faiss + provider_type: inline::faiss + agents: + - config: + persistence: + agent_state: + namespace: agents_state + backend: kv_default + responses: + table_name: agents_responses + backend: sql_default + provider_id: meta-reference + provider_type: inline::meta-reference + batches: + - config: + kvstore: + namespace: batches_store + backend: kv_default + provider_id: reference + provider_type: inline::reference + datasetio: + - config: + kvstore: + namespace: huggingface_datasetio + backend: kv_default + provider_id: huggingface + provider_type: remote::huggingface + - config: + kvstore: + namespace: localfs_datasetio + backend: kv_default + provider_id: localfs + provider_type: inline::localfs + eval: + - config: + kvstore: + namespace: eval_store + backend: kv_default + provider_id: meta-reference + provider_type: inline::meta-reference +scoring_fns: [] +server: + port: 8321 +storage: + backends: + kv_default: # Define the storage backend type for RAG, in this case registry and RAG are unified i.e. information on registered resources (e.g. models, vector_stores) are saved together with the RAG chunks + type: kv_sqlite + db_path: ${env.KV_STORE_PATH:=~/.llama/storage/rag/kv_store.db} + sql_default: + type: sql_sqlite + db_path: ${env.SQL_STORE_PATH:=~/.llama/storage/sql_store.db} + stores: + metadata: + namespace: registry + backend: kv_default + inference: + table_name: inference_store + backend: sql_default + max_write_queue_size: 10000 + num_writers: 4 + conversations: + table_name: openai_conversations + backend: sql_default + prompts: + namespace: prompts + backend: kv_default +registered_resources: + models: [] + shields: + - shield_id: llama-guard + provider_id: llama-guard + provider_shield_id: openai/gpt-4o-mini + vector_stores: [] + datasets: [] + scoring_fns: [] + benchmarks: [] + tool_groups: + - toolgroup_id: builtin::rag # Register the RAG tool + provider_id: rag-runtime +# REQUIRED: This section is necessary for file_search tool calls to work. +# Without it, llama-stack's rag-runtime silently fails all file_search operations +# with no error logged. +vector_stores: + # LCORE-1498: Disables Llama Stack RAG annotation generation + # causing unwanted citation/file markers in model output. 
+ annotation_prompt_params: + enable_annotations: false + default_provider_id: faiss + default_embedding_model: # Define the default embedding model for RAG + provider_id: sentence-transformers + model_id: nomic-ai/nomic-embed-text-v1.5 +safety: + default_shield_id: llama-guard + diff --git a/src/lightspeed_stack.py b/src/lightspeed_stack.py index 858799c36..261092063 100644 --- a/src/lightspeed_stack.py +++ b/src/lightspeed_stack.py @@ -48,10 +48,11 @@ def create_argument_parser() -> ArgumentParser: - -d / --dump-configuration: dump the loaded configuration to JSON and exit - -s / --dump-schema: dump the configuration schema to OpenAPI JSON and exit - -c / --config: path to the configuration file (default "lightspeed-stack.yaml") - - -g / --generate-llama-stack-configuration: generate a Llama Stack - configuration from the service configuration - - -i / --input-config-file: Llama Stack input configuration filename (default "run.yaml") - - -o / --output-config-file: Llama Stack output configuration filename (default "run_.yaml") + - --migrate-config: migrate a legacy (run.yaml + lightspeed-stack.yaml) + setup into a unified single-file config and exit + - --run-yaml: input run.yaml for --migrate-config (default "run.yaml") + - --migrate-output: output path for --migrate-config + (default "lightspeed-stack-unified.yaml") Returns: Configured ArgumentParser for parsing the service CLI options. @@ -88,6 +89,27 @@ def create_argument_parser() -> ArgumentParser: help="path to configuration file (default: lightspeed-stack.yaml)", default="lightspeed-stack.yaml", ) + parser.add_argument( + "--migrate-config", + dest="migrate_config", + help="migrate legacy (run.yaml + lightspeed-stack.yaml) into a unified " + "single-file configuration and exit", + action="store_true", + default=False, + ) + parser.add_argument( + "--run-yaml", + dest="run_yaml", + help="path to legacy run.yaml for --migrate-config (default: run.yaml)", + default="run.yaml", + ) + parser.add_argument( + "--migrate-output", + dest="migrate_output", + help="output path for --migrate-config " + "(default: lightspeed-stack-unified.yaml)", + default="lightspeed-stack-unified.yaml", + ) return parser @@ -125,6 +147,23 @@ def main() -> None: if isinstance(existing_logger, logging.Logger): existing_logger.setLevel(logging.DEBUG) + # --migrate-config runs standalone; does not load config into the singleton, + # since the input may be in legacy form and we are producing its successor. + if args.migrate_config: + # pylint: disable=import-outside-toplevel + from llama_stack_configuration import migrate_config_dumb + + try: + migrate_config_dumb(args.run_yaml, args.config_file, args.migrate_output) + logger.info( + "Migration complete. 
Wrote unified config to %s", + args.migrate_output, + ) + except Exception as e: + logger.error("Migration failed: %s", e) + raise SystemExit(1) from e + return + configuration.load_configuration(args.config_file) logger.info("Configuration: %s", configuration.configuration) logger.info( diff --git a/src/llama_stack_configuration.py b/src/llama_stack_configuration.py index 3ac4a8ec4..3776eefbf 100644 --- a/src/llama_stack_configuration.py +++ b/src/llama_stack_configuration.py @@ -589,6 +589,246 @@ def enrich_solr( # pylint: disable=too-many-locals logger.info("Added OKP embedding model to registered_resources.models") +# ============================================================================= +# Synthesis for Unified Mode (LCORE-836) +# ============================================================================= + + +DEFAULT_BASELINE_RESOURCE = "default_run.yaml" + +PROVIDER_TYPE_MAP: dict[str, str] = { + "openai": "remote::openai", + "sentence_transformers": "inline::sentence-transformers", + "azure": "remote::azure", + "vertexai": "remote::vertexai", + "watsonx": "remote::watsonx", + "vllm_rhaiis": "remote::vllm", + "vllm_rhel_ai": "remote::vllm", +} + + +def load_default_baseline() -> dict[str, Any]: + """Load LCORE's built-in default Llama Stack baseline config. + + Returns: + dict[str, Any]: The default baseline run.yaml parsed as a dict. + """ + # importlib.resources-style load; `src/data/default_run.yaml` is shipped + # with the package. + baseline_path = Path(__file__).parent / "data" / DEFAULT_BASELINE_RESOURCE + logger.info("Loading built-in default baseline from %s", baseline_path) + with open(baseline_path, "r", encoding="utf-8") as f: + return yaml.safe_load(f) + + +def deep_merge_list_replace( + base: dict[str, Any], overlay: dict[str, Any] +) -> dict[str, Any]: + """Deep-merge `overlay` onto `base`. + + Maps are merged recursively. Lists and scalars in `overlay` replace the + corresponding entry in `base` (no append semantics). Result is a new dict; + neither argument is mutated. + + Parameters: + base: The base mapping. + overlay: The mapping whose values take precedence. + + Returns: + dict[str, Any]: A new mapping with overlay applied on top of base. + """ + import copy # pylint: disable=import-outside-toplevel + + result: dict[str, Any] = copy.deepcopy(base) + for key, value in overlay.items(): + if key in result and isinstance(result[key], dict) and isinstance(value, dict): + result[key] = deep_merge_list_replace(result[key], value) + else: + result[key] = copy.deepcopy(value) + return result + + +def apply_high_level_inference( + ls_config: dict[str, Any], inference: dict[str, Any] +) -> None: + """Apply a high-level `inference` block into `ls_config['providers']['inference']`. + + Replaces the inference provider list entirely. Use `native_override` for + additive tweaks. + + Parameters: + ls_config: Llama Stack config dict (modified in place). + inference: High-level inference section as a dict (with 'providers' list). 
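+
+    Example (illustrative; names are placeholders):
+        >>> cfg = {}
+        >>> apply_high_level_inference(
+        ...     cfg, {"providers": [{"type": "openai", "api_key_env": "MY_KEY"}]}
+        ... )
+        >>> cfg["providers"]["inference"][0]["config"]["api_key"]
+        '${env.MY_KEY}'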
+ """ + providers_out: list[dict[str, Any]] = [] + for provider in inference.get("providers", []): + p_type = provider["type"] + entry: dict[str, Any] = { + "provider_id": p_type, + "provider_type": PROVIDER_TYPE_MAP[p_type], + } + cfg: dict[str, Any] = {} + if provider.get("api_key_env"): + cfg["api_key"] = f"${{env.{provider['api_key_env']}}}" + if provider.get("allowed_models"): + cfg["allowed_models"] = provider["allowed_models"] + if provider.get("extra"): + cfg.update(provider["extra"]) + if cfg: + entry["config"] = cfg + providers_out.append(entry) + + if "providers" not in ls_config: + ls_config["providers"] = {} + ls_config["providers"]["inference"] = providers_out + logger.info( + "Applied high-level inference section: %s provider entries", + len(providers_out), + ) + + +def synthesize_configuration( + lcs_config: dict[str, Any], + config_file_dir: Optional[Path] = None, + default_baseline: Optional[dict[str, Any]] = None, +) -> dict[str, Any]: + """Synthesize a full Llama Stack run.yaml from a unified-mode LCORE config. + + Pipeline: + 1. Baseline = profile file (if set) else `default_baseline` (if provided) + else LCORE's built-in default. + 2. Apply existing top-level enrichment (BYOK RAG, Solr/OKP). + Azure Entra ID is intentionally not run here (side-effect on .env). + 3. Apply high-level sections (inference, and later storage/safety/...). + 4. Deep-merge (list-replace) `native_override`. + + Precedence: profile < high-level sections < native_override. + + Parameters: + lcs_config: Full lightspeed-stack.yaml content as a dict (env-expanded). + config_file_dir: Directory containing the lightspeed-stack.yaml, used + to resolve relative `profile:` paths. If None, relative paths are + resolved against the current working directory. + default_baseline: Override for the baseline when `profile:` is unset + (primarily for tests). If None, LCORE's built-in baseline is used. + + Returns: + dict[str, Any]: The synthesized Llama Stack run.yaml as a dict. + + Raises: + ValueError: If llama_stack.config is not present in `lcs_config`. + """ + unified = (lcs_config.get("llama_stack") or {}).get("config") + if unified is None: + raise ValueError( + "synthesize_configuration called without llama_stack.config set" + ) + + # 1. Baseline + profile = unified.get("profile") + baseline_kind = unified.get("baseline", "default") + if profile: + profile_path = Path(profile) + if not profile_path.is_absolute() and config_file_dir is not None: + profile_path = config_file_dir / profile_path + logger.info("Loading unified-mode profile baseline from %s", profile_path) + with open(profile_path, "r", encoding="utf-8") as f: + ls_config: dict[str, Any] = yaml.safe_load(f) + elif baseline_kind == "empty": + logger.info("Unified mode: starting from empty baseline") + ls_config = {} + elif default_baseline is not None: + import copy # pylint: disable=import-outside-toplevel + + ls_config = copy.deepcopy(default_baseline) + else: + ls_config = load_default_baseline() + + dedupe_providers_vector_io(ls_config) + + # 2. Existing enrichment (BYOK RAG, Solr/OKP) — Azure stays out (file side-effect). + enrich_byok_rag(ls_config, lcs_config.get("byok_rag", [])) + enrich_solr(ls_config, lcs_config.get("rag", {}), lcs_config.get("okp", {})) + + # 3. High-level sections + inference = unified.get("inference") + if inference is not None: + apply_high_level_inference(ls_config, inference) + + # 4. 
native_override — deep-merge (list-replace) + native_override = unified.get("native_override") or {} + if native_override: + ls_config = deep_merge_list_replace(ls_config, native_override) + + dedupe_providers_vector_io(ls_config) + return ls_config + + +def migrate_config_dumb( + run_yaml_path: str, + lightspeed_yaml_path: str, + output_path: str, +) -> None: + """Lossless lift-and-shift migration: fold run.yaml into lightspeed-stack.yaml. + + Reads the legacy two-file configuration (run.yaml + lightspeed-stack.yaml) + and writes a unified single-file configuration where the entire run.yaml + content is placed under `llama_stack.config.native_override`. Removes any + `llama_stack.library_client_config_path` that referenced the old run.yaml. + + This is the "dumb" migration mode — preserves 100% of the existing + Llama Stack schema content. A future `--smart` mode (out of scope for this + PoC) would factor portions into high-level sections. + + Parameters: + run_yaml_path: Path to the existing Llama Stack run.yaml. + lightspeed_yaml_path: Path to the existing lightspeed-stack.yaml. + output_path: Path to write the unified lightspeed-stack.yaml. + """ + logger.info("Reading %s and %s for migration", lightspeed_yaml_path, run_yaml_path) + + with open(run_yaml_path, "r", encoding="utf-8") as f: + run_yaml_content: dict[str, Any] = yaml.safe_load(f) + + with open(lightspeed_yaml_path, "r", encoding="utf-8") as f: + lcs_yaml: dict[str, Any] = yaml.safe_load(f) + + llama_stack_section = lcs_yaml.setdefault("llama_stack", {}) + llama_stack_section.pop("library_client_config_path", None) + # `baseline: empty` is required for true lossless round-trip: default baseline + # would add extra keys not present in the source run.yaml. + llama_stack_section["config"] = { + "baseline": "empty", + "native_override": run_yaml_content, + } + + logger.info("Writing unified configuration to %s", output_path) + with open(output_path, "w", encoding="utf-8") as f: + yaml.dump(lcs_yaml, f, Dumper=YamlDumper, default_flow_style=False) + + +def synthesize_to_file( + lcs_config: dict[str, Any], + output_file: str, + config_file_dir: Optional[Path] = None, +) -> None: + """Synthesize unified-mode Llama Stack config and write it to disk. + + Secrets are never resolved — env-var references like `${env.FOO}` are + preserved verbatim in the output. + + Parameters: + lcs_config: lightspeed-stack.yaml as a dict. + output_file: Path to write the synthesized run.yaml. + config_file_dir: Directory for resolving relative profile paths. + """ + ls_config = synthesize_configuration(lcs_config, config_file_dir=config_file_dir) + logger.info("Writing synthesized Llama Stack configuration to %s", output_file) + Path(output_file).parent.mkdir(parents=True, exist_ok=True) + with open(output_file, "w", encoding="utf-8") as f: + yaml.dump(ls_config, f, Dumper=YamlDumper, default_flow_style=False) + + # ============================================================================= # Main Generation Function (service/container mode only) # ============================================================================= @@ -638,9 +878,15 @@ def generate_configuration( def main() -> None: - """CLI entry point.""" + """CLI entry point. + + Auto-detects the mode: + - Unified mode: `llama_stack.config` present in the lightspeed config file. + Synthesizes the full run.yaml (no `-i/--input` needed); writes to `-o`. + - Legacy mode: requires `-i/--input` run.yaml; enriches it and writes to `-o`. 
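+
+    Illustrative invocations (file names are examples):
+
+        python3 llama_stack_configuration.py -c lightspeed-stack.yaml -o run_.yaml
+        python3 llama_stack_configuration.py -c lightspeed-stack.yaml \
+            -i run.yaml -o run_.yaml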
+    """
     parser = ArgumentParser(
-        description="Enrich Llama Stack config with Lightspeed values",
+        description="Produce Llama Stack run.yaml from Lightspeed config.",
     )
     parser.add_argument(
         "-c",
@@ -652,13 +898,14 @@ def main() -> None:
         "-i",
         "--input",
         default="run.yaml",
-        help="Input Llama Stack config (default: run.yaml)",
+        help="Input Llama Stack config for legacy-mode enrichment "
+        "(default: run.yaml; ignored in unified mode)",
    )
     parser.add_argument(
         "-o",
         "--output",
         default="run_.yaml",
-        help="Output enriched config (default: run_.yaml)",
+        help="Output run.yaml path (default: run_.yaml)",
     )
     parser.add_argument(
         "-e",
@@ -672,7 +919,19 @@ def main() -> None:
         config = yaml.safe_load(f)
     config = replace_env_vars(config)
 
-    generate_configuration(args.input, args.output, config, args.env_file)
+    unified_present = (config.get("llama_stack") or {}).get("config") is not None
+    if unified_present:
+        logger.info("Unified mode detected (llama_stack.config present)")
+        # Azure Entra ID side-effect (writes .env) stays part of boot — still run it.
+        setup_azure_entra_id_token(config.get("azure_entra_id"), args.env_file)
+        synthesize_to_file(
+            config,
+            args.output,
+            config_file_dir=Path(args.config).resolve().parent,
+        )
+    else:
+        logger.info("Legacy mode detected (no llama_stack.config)")
+        generate_configuration(args.input, args.output, config, args.env_file)
 
 
 if __name__ == "__main__":
diff --git a/src/models/config.py b/src/models/config.py
index 95bfc4782..9252ccdba 100644
--- a/src/models/config.py
+++ b/src/models/config.py
@@ -582,6 +582,97 @@ def resolve_auth_headers(self) -> Self:
         return self
 
 
+class UnifiedInferenceProvider(ConfigurationBase):
+    """High-level inference provider entry (unified-mode schema).
+
+    Expanded by the synthesizer into a Llama Stack `providers.inference` entry.
+    Unknown provider types must be expressed via `native_override` instead.
+    """
+
+    type: Literal[
+        "openai",
+        "sentence_transformers",
+        "azure",
+        "vertexai",
+        "watsonx",
+        "vllm_rhaiis",
+        "vllm_rhel_ai",
+    ] = Field(
+        ...,
+        description="High-level provider type. Mapped to Llama Stack provider_type.",
+    )
+
+    api_key_env: Optional[str] = Field(
+        None,
+        description="Environment variable name from which the api_key will be read "
+        "at Llama Stack start time (as `${env.<NAME>}`). Kept as a reference; "
+        "secrets are never resolved into the synthesized file on disk.",
+    )
+
+    allowed_models: Optional[list[str]] = Field(
+        None,
+        description="Optional list of model ids allowed for this provider.",
+    )
+
+    extra: dict[str, Any] = Field(
+        default_factory=dict,
+        description="Extra per-provider-type config fields merged into the emitted "
+        "`config` map (escape hatch for per-type oddities).",
+    )
+
+
+class UnifiedInferenceSection(ConfigurationBase):
+    """High-level inference section (unified-mode schema)."""
+
+    providers: list[UnifiedInferenceProvider] = Field(
+        default_factory=list,
+        description="High-level list of inference providers; replaces "
+        "`providers.inference` in the synthesized run.yaml.",
+    )
+
+
+class UnifiedLlamaStackConfig(ConfigurationBase):
+    """Operational Llama Stack config synthesized by LCORE at runtime.
+
+    When present (unified mode), LCORE produces a full Llama Stack run.yaml
+    from this block. Precedence (lowest to highest):
+
+        baseline (default / empty / profile) < high-level sections < native_override
+
+    This section is mutually exclusive with
+    `llama_stack.library_client_config_path` (legacy mode).
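+
+    Illustrative YAML shape (values are examples only):
+
+        llama_stack:
+          config:
+            baseline: default
+            inference:
+              providers:
+                - type: openai
+                  api_key_env: OPENAI_API_KEY
+            native_override:
+              safety:
+                default_shield_id: llama-guard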
+ """ + + baseline: Literal["default", "empty"] = Field( + "default", + description="Starting point before profile / high-level / native_override " + "are applied. 'default' uses LCORE's built-in baseline run.yaml; 'empty' " + "starts from an empty dict (useful when `native_override` specifies the " + "entire Llama Stack schema, as produced by dumb-mode migration). Ignored " + "when `profile` is set.", + ) + + profile: Optional[str] = Field( + None, + description="Path to a profile YAML file (absolute or relative to the " + "lightspeed-stack.yaml location). Loaded as the baseline if set; " + "overrides the `baseline` field.", + ) + + inference: Optional[UnifiedInferenceSection] = Field( + None, + description="High-level inference section. Additional high-level sections " + "(storage, safety, tools, ...) may be added in future versions.", + ) + + native_override: dict[str, Any] = Field( + default_factory=dict, + description="Raw Llama Stack schema fragment, deep-merged onto the result " + "of profile + high-level expansion. Lists are replaced (not appended). " + "Escape hatch for anything not expressible via high-level keys.", + ) + + class LlamaStackConfiguration(ConfigurationBase): """Llama stack configuration. @@ -620,7 +711,16 @@ class LlamaStackConfiguration(ConfigurationBase): library_client_config_path: Optional[str] = Field( None, title="Llama Stack configuration path", - description="Path to configuration file used when Llama Stack is run in library mode", + description="Path to configuration file used when Llama Stack is run in library " + "mode (legacy mode). Mutually exclusive with `config`.", + ) + + config: Optional[UnifiedLlamaStackConfig] = Field( + None, + title="Unified Llama Stack configuration", + description="Operational Llama Stack config synthesized by LCORE at runtime " + "(unified mode). When present, LCORE produces run.yaml from this block. " + "Mutually exclusive with `library_client_config_path`.", ) timeout: PositiveInt = Field( @@ -635,21 +735,26 @@ def check_llama_stack_model(self) -> Self: """ Validate the Llama Stack configuration and enforce mode-specific requirements. - If no URL is provided, requires explicit library-client mode selection. - When library-client mode is enabled, requires a non-empty - `library_client_config_path` that points to a regular, readable YAML - file (checked via checks.file_check). Also normalizes a None - `use_as_library_client` to False. + Unified mode (`config` set) and legacy mode (`library_client_config_path` + set) are mutually exclusive. If no URL is provided, requires explicit + library-client mode selection. When library-client mode is enabled, + requires either `config` (unified) or `library_client_config_path` + (legacy) to be set. Legacy paths are validated via checks.file_check. Returns: Self: The validated LlamaStackConfiguration instance. Raises: - ValueError: If the configuration is invalid, e.g. no - URL and library-client mode is unspecified or - disabled, or library-client mode is enabled but - `library_client_config_path` is not provided. + ValueError: If the configuration is invalid. """ + if self.config is not None and self.library_client_config_path is not None: + raise ValueError( + "llama_stack.config (unified mode) and " + "llama_stack.library_client_config_path (legacy mode) are mutually " + "exclusive. 
Migrate legacy configurations with: " + "lightspeed-stack --migrate-config" + ) + if self.url is None: # when URL is not set, it is supposed that Llama Stack should be run in library mode # it means that use_as_library_client attribute must be set to True @@ -667,20 +772,19 @@ def check_llama_stack_model(self) -> Self: self.use_as_library_client = False if self.use_as_library_client: - # when use_as_library_client is set to true, Llama Stack will be run in library mode - # it means that: - # - Llama Stack URL should not be set, and - # - library_client_config_path attribute must be set and must point to - # a regular readable YAML file - if self.library_client_config_path is None: - # pylint: disable=line-too-long + # library mode requires either unified config or legacy config path + if self.library_client_config_path is None and self.config is None: raise ValueError( - "Llama stack library client mode is enabled but a configuration file path is not specified" + "Llama stack library client mode is enabled but neither " + "`config` (unified) nor `library_client_config_path` (legacy) " + "is specified" + ) + if self.library_client_config_path is not None: + # legacy: the configuration file must exist and be a regular readable file + checks.file_check( + Path(self.library_client_config_path), + "Llama Stack configuration file", ) - # the configuration file must exists and be regular readable file - checks.file_check( - Path(self.library_client_config_path), "Llama Stack configuration file" - ) return self diff --git a/test.containerfile b/test.containerfile index 884fd8525..dff715b9f 100644 --- a/test.containerfile +++ b/test.containerfile @@ -36,11 +36,14 @@ RUN mkdir -p /opt/app-root/src/.llama/storage \ chown -R 1001:0 /opt/app-root && \ chmod -R 775 /opt/app-root -# Copy enrichment scripts for runtime config enrichment +# Copy enrichment / unified-mode synthesis scripts for runtime config production COPY src/llama_stack_configuration.py /opt/app-root/llama_stack_configuration.py +COPY src/data /opt/app-root/data COPY scripts/llama-stack-entrypoint.sh /opt/app-root/enrich-entrypoint.sh RUN chmod +x /opt/app-root/enrich-entrypoint.sh && \ - chown 1001:0 /opt/app-root/enrich-entrypoint.sh /opt/app-root/llama_stack_configuration.py + chown -R 1001:0 /opt/app-root/enrich-entrypoint.sh \ + /opt/app-root/llama_stack_configuration.py \ + /opt/app-root/data # Switch back to the original user USER 1001 diff --git a/tests/unit/models/config/test_dump_configuration.py b/tests/unit/models/config/test_dump_configuration.py index 06a3ef08c..c5954a478 100644 --- a/tests/unit/models/config/test_dump_configuration.py +++ b/tests/unit/models/config/test_dump_configuration.py @@ -144,6 +144,7 @@ def test_dump_configuration(tmp_path: Path) -> None: "use_as_library_client": True, "api_key": "**********", "library_client_config_path": "tests/configuration/run.yaml", + "config": None, "timeout": 180, }, "user_data_collection": { @@ -486,6 +487,7 @@ def test_dump_configuration_with_quota_limiters(tmp_path: Path) -> None: "use_as_library_client": True, "api_key": "**********", "library_client_config_path": "tests/configuration/run.yaml", + "config": None, "timeout": 180, }, "user_data_collection": { @@ -720,6 +722,7 @@ def test_dump_configuration_with_quota_limiters_different_values( "use_as_library_client": True, "api_key": "**********", "library_client_config_path": "tests/configuration/run.yaml", + "config": None, "timeout": 180, }, "user_data_collection": { @@ -934,6 +937,7 @@ def 
test_dump_configuration_byok(tmp_path: Path) -> None: "use_as_library_client": True, "api_key": "**********", "library_client_config_path": "tests/configuration/run.yaml", + "config": None, "timeout": 180, }, "user_data_collection": { @@ -1138,6 +1142,7 @@ def test_dump_configuration_pg_namespace(tmp_path: Path) -> None: "use_as_library_client": True, "api_key": "**********", "library_client_config_path": "tests/configuration/run.yaml", + "config": None, "timeout": 180, }, "user_data_collection": { diff --git a/tests/unit/models/config/test_llama_stack_configuration.py b/tests/unit/models/config/test_llama_stack_configuration.py index deec7e765..57ba66861 100644 --- a/tests/unit/models/config/test_llama_stack_configuration.py +++ b/tests/unit/models/config/test_llama_stack_configuration.py @@ -86,20 +86,65 @@ def test_llama_stack_wrong_configuration_no_config_file() -> None: """Test the LlamaStackConfiguration constructor. Verify that enabling library-client mode without providing a configuration - file path raises a ValueError. - - Asserts that constructing LlamaStackConfiguration with - use_as_library_client=True and no library_client_config_path raises a - ValueError whose message is "Llama stack library client mode is enabled but - a configuration file path is not specified". + file path or unified config raises a ValueError. """ - m = "Llama stack library client mode is enabled but a configuration file path is not specified" + m = ( + "Llama stack library client mode is enabled but neither `config` " + "\\(unified\\) nor `library_client_config_path` \\(legacy\\) is specified" + ) with pytest.raises(ValueError, match=m): LlamaStackConfiguration( use_as_library_client=True ) # pyright: ignore[reportCallIssue] +# ============================================================================= +# Unified mode (LCORE-836) +# ============================================================================= + + +def test_llama_stack_unified_mode_library_client() -> None: + """Unified mode in library mode: `config` set, no library_client_config_path.""" + # pylint: disable=import-outside-toplevel + from models.config import UnifiedLlamaStackConfig + + cfg = LlamaStackConfiguration( + use_as_library_client=True, + config=UnifiedLlamaStackConfig(), + ) # pyright: ignore[reportCallIssue] + assert cfg.config is not None + assert cfg.library_client_config_path is None + + +def test_llama_stack_unified_and_legacy_are_mutually_exclusive() -> None: + """Setting both `config` and `library_client_config_path` is rejected.""" + # pylint: disable=import-outside-toplevel + from models.config import UnifiedLlamaStackConfig + + with pytest.raises( + ValueError, + match="mutually exclusive", + ): + LlamaStackConfiguration( + use_as_library_client=True, + library_client_config_path="tests/configuration/run.yaml", + config=UnifiedLlamaStackConfig(), + ) # pyright: ignore[reportCallIssue] + + +def test_llama_stack_unified_mode_with_remote_url() -> None: + """Unified config is also allowed when connecting to a remote Llama Stack.""" + # pylint: disable=import-outside-toplevel + from models.config import UnifiedLlamaStackConfig + + cfg = LlamaStackConfiguration( + url="http://remote-ls:8321", + config=UnifiedLlamaStackConfig(), + ) # pyright: ignore[reportCallIssue] + assert cfg.config is not None + assert str(cfg.url) == "http://remote-ls:8321/" + + def test_llama_stack_configuration_valid_http_url() -> None: """Test that valid HTTP URLs are accepted.""" config = LlamaStackConfiguration( diff --git 
a/tests/unit/test_client.py b/tests/unit/test_client.py index bcd5ca0d6..999abf7f6 100644 --- a/tests/unit/test_client.py +++ b/tests/unit/test_client.py @@ -84,7 +84,7 @@ async def test_get_async_llama_stack_wrong_configuration() -> None: cfg.library_client_config_path = None with pytest.raises( ValueError, - match="Configuration problem: library_client_config_path is not set", + match="neither .*unified.* nor .*legacy.* is set", ): client = AsyncLlamaStackClientHolder() await client.load(cfg) diff --git a/tests/unit/test_llama_stack_synthesize.py b/tests/unit/test_llama_stack_synthesize.py new file mode 100644 index 000000000..9e86bcaa0 --- /dev/null +++ b/tests/unit/test_llama_stack_synthesize.py @@ -0,0 +1,371 @@ +"""Unit tests for unified-mode synthesizer and migration tool (LCORE-836).""" + +from pathlib import Path +from typing import Any + +import pytest +import yaml + +from llama_stack_configuration import ( + PROVIDER_TYPE_MAP, + apply_high_level_inference, + deep_merge_list_replace, + load_default_baseline, + migrate_config_dumb, + synthesize_configuration, +) + +# ============================================================================= +# deep_merge_list_replace +# ============================================================================= + + +def test_deep_merge_scalar_replace() -> None: + """Overlay scalar replaces base scalar.""" + result = deep_merge_list_replace({"a": 1}, {"a": 2}) + assert result == {"a": 2} + + +def test_deep_merge_adds_new_keys() -> None: + """Overlay keys not in base are added.""" + result = deep_merge_list_replace({"a": 1}, {"b": 2}) + assert result == {"a": 1, "b": 2} + + +def test_deep_merge_nested_map_merges() -> None: + """Nested maps merge recursively.""" + base = {"a": {"x": 1, "y": 2}} + overlay = {"a": {"y": 20, "z": 30}} + result = deep_merge_list_replace(base, overlay) + assert result == {"a": {"x": 1, "y": 20, "z": 30}} + + +def test_deep_merge_list_replaces() -> None: + """Lists are replaced, not appended.""" + base = {"items": [1, 2, 3]} + overlay = {"items": [9]} + result = deep_merge_list_replace(base, overlay) + assert result == {"items": [9]} + + +def test_deep_merge_does_not_mutate_inputs() -> None: + """Neither base nor overlay are mutated.""" + base = {"a": {"x": 1}} + overlay = {"a": {"x": 2}} + result = deep_merge_list_replace(base, overlay) + assert base == {"a": {"x": 1}} + assert overlay == {"a": {"x": 2}} + assert result == {"a": {"x": 2}} + + +def test_deep_merge_type_mismatch_replaces() -> None: + """If overlay type != base type at same key, overlay wins.""" + # base is map, overlay is scalar + result = deep_merge_list_replace({"a": {"x": 1}}, {"a": "replaced"}) + assert result == {"a": "replaced"} + + +# ============================================================================= +# apply_high_level_inference +# ============================================================================= + + +def test_apply_high_level_inference_single_provider() -> None: + """Single provider with api_key_env and allowed_models.""" + ls_config: dict[str, Any] = {} + inference = { + "providers": [ + { + "type": "openai", + "api_key_env": "OPENAI_API_KEY", + "allowed_models": ["gpt-4o-mini"], + } + ] + } + apply_high_level_inference(ls_config, inference) + assert ls_config["providers"]["inference"] == [ + { + "provider_id": "openai", + "provider_type": "remote::openai", + "config": { + "api_key": "${env.OPENAI_API_KEY}", + "allowed_models": ["gpt-4o-mini"], + }, + } + ] + + +def test_apply_high_level_inference_replaces_existing() -> 
None: + """Providers list is replaced entirely, not merged.""" + ls_config = {"providers": {"inference": [{"provider_id": "stale"}]}} + apply_high_level_inference( + ls_config, {"providers": [{"type": "sentence_transformers"}]} + ) + assert ls_config["providers"]["inference"] == [ + { + "provider_id": "sentence_transformers", + "provider_type": "inline::sentence-transformers", + } + ] + + +def test_apply_high_level_inference_extra_merged() -> None: + """`extra` dict fields merge into emitted config.""" + ls_config: dict[str, Any] = {} + inference = { + "providers": [ + { + "type": "vertexai", + "extra": {"project_id": "my-project", "location": "us-central1"}, + } + ] + } + apply_high_level_inference(ls_config, inference) + assert ls_config["providers"]["inference"][0]["config"] == { + "project_id": "my-project", + "location": "us-central1", + } + + +def test_provider_type_map_covers_all_literals() -> None: + """Every Literal value declared on UnifiedInferenceProvider.type has a mapping.""" + # pylint: disable=import-outside-toplevel + from models.config import UnifiedInferenceProvider + + literal_values = ( + UnifiedInferenceProvider.model_fields[ # pylint: disable=unsubscriptable-object + "type" + ].annotation.__args__ + ) + for value in literal_values: + assert value in PROVIDER_TYPE_MAP + + +# ============================================================================= +# synthesize_configuration +# ============================================================================= + + +MINIMAL_BASELINE: dict[str, Any] = { + "version": 2, + "apis": ["inference"], + "providers": { + "inference": [ + {"provider_id": "stock", "provider_type": "remote::stock", "config": {}} + ] + }, + "safety": {"default_shield_id": "llama-guard"}, +} + + +def test_synthesize_errors_without_config() -> None: + """Without llama_stack.config present, synthesize raises ValueError.""" + with pytest.raises(ValueError, match="llama_stack.config"): + synthesize_configuration({"llama_stack": {}}) + + +def test_synthesize_uses_default_baseline_when_no_profile() -> None: + """With neither profile nor native_override, result is the baseline (through enrichment).""" + lcs_config: dict[str, Any] = {"llama_stack": {"config": {}}} + result = synthesize_configuration(lcs_config, default_baseline=MINIMAL_BASELINE) + # Baseline preserved (enrichment is a no-op without byok_rag/rag/okp) + assert result["safety"] == {"default_shield_id": "llama-guard"} + assert result["providers"]["inference"] == [ + {"provider_id": "stock", "provider_type": "remote::stock", "config": {}} + ] + + +def test_synthesize_loads_profile_from_path(tmp_path: Path) -> None: + """Profile path is loaded as the baseline.""" + profile_data = { + "version": 2, + "apis": ["inference"], + "providers": {"inference": [{"provider_id": "profile_p"}]}, + } + profile_path = tmp_path / "profile.yaml" + profile_path.write_text(yaml.dump(profile_data)) + + lcs_config: dict[str, Any] = { + "llama_stack": {"config": {"profile": str(profile_path)}} + } + result = synthesize_configuration(lcs_config) + assert result["providers"]["inference"] == [{"provider_id": "profile_p"}] + + +def test_synthesize_profile_relative_path(tmp_path: Path) -> None: + """Relative profile path resolves against config_file_dir.""" + profile_data = {"version": 2} + (tmp_path / "p.yaml").write_text(yaml.dump(profile_data)) + lcs_config: dict[str, Any] = {"llama_stack": {"config": {"profile": "p.yaml"}}} + result = synthesize_configuration(lcs_config, config_file_dir=tmp_path) + assert result == 
{"version": 2} + + +def test_synthesize_applies_high_level_inference() -> None: + """High-level inference section expands into native providers list.""" + lcs_config: dict[str, Any] = { + "llama_stack": { + "config": { + "inference": { + "providers": [{"type": "openai", "api_key_env": "OPENAI_API_KEY"}] + } + } + } + } + result = synthesize_configuration(lcs_config, default_baseline=MINIMAL_BASELINE) + assert result["providers"]["inference"] == [ + { + "provider_id": "openai", + "provider_type": "remote::openai", + "config": {"api_key": "${env.OPENAI_API_KEY}"}, + } + ] + + +def test_synthesize_native_override_deep_merges() -> None: + """native_override deep-merges on top (scalar path).""" + lcs_config: dict[str, Any] = { + "llama_stack": { + "config": { + "native_override": { + "safety": {"default_shield_id": "overridden"}, + } + } + } + } + result = synthesize_configuration(lcs_config, default_baseline=MINIMAL_BASELINE) + assert result["safety"]["default_shield_id"] == "overridden" + + +def test_synthesize_native_override_list_replaces() -> None: + """native_override replaces lists, not appends.""" + lcs_config: dict[str, Any] = { + "llama_stack": { + "config": { + "native_override": { + "providers": { + "inference": [{"provider_id": "override-only"}], + } + } + } + } + } + result = synthesize_configuration(lcs_config, default_baseline=MINIMAL_BASELINE) + assert result["providers"]["inference"] == [{"provider_id": "override-only"}] + + +def test_synthesize_precedence_override_beats_high_level() -> None: + """When high-level and native_override both touch the same path, override wins.""" + lcs_config: dict[str, Any] = { + "llama_stack": { + "config": { + "inference": {"providers": [{"type": "openai"}]}, + "native_override": { + "providers": { + "inference": [{"provider_id": "override-wins"}], + } + }, + } + } + } + result = synthesize_configuration(lcs_config, default_baseline=MINIMAL_BASELINE) + assert result["providers"]["inference"] == [{"provider_id": "override-wins"}] + + +def test_synthesize_preserves_env_var_refs_verbatim() -> None: + """Secrets stay as ${env.FOO} references; never resolved into the output.""" + lcs_config: dict[str, Any] = { + "llama_stack": { + "config": { + "inference": { + "providers": [{"type": "openai", "api_key_env": "OPENAI_API_KEY"}] + } + } + } + } + result = synthesize_configuration(lcs_config, default_baseline=MINIMAL_BASELINE) + api_key_value = result["providers"]["inference"][0]["config"]["api_key"] + assert api_key_value == "${env.OPENAI_API_KEY}" + + +# ============================================================================= +# Built-in default baseline loader +# ============================================================================= + + +def test_load_default_baseline_returns_dict() -> None: + """The shipped default baseline loads as a dict with expected keys.""" + baseline = load_default_baseline() + assert isinstance(baseline, dict) + assert baseline.get("version") == 2 + assert "providers" in baseline + + +# ============================================================================= +# migrate_config_dumb +# ============================================================================= + + +def test_migrate_dumb_lossless_roundtrip(tmp_path: Path) -> None: + """Dumb migration places full run.yaml under config.native_override.""" + run_yaml_content = { + "version": 2, + "apis": ["inference"], + "providers": {"inference": [{"provider_id": "opa"}]}, + } + lcs_yaml_content = { + "name": "LCS", + "llama_stack": { + "use_as_library_client": 
True, + "library_client_config_path": str(tmp_path / "run.yaml"), + }, + } + + run_yaml_path = tmp_path / "run.yaml" + run_yaml_path.write_text(yaml.dump(run_yaml_content)) + lcs_yaml_path = tmp_path / "lightspeed-stack.yaml" + lcs_yaml_path.write_text(yaml.dump(lcs_yaml_content)) + output_path = tmp_path / "unified.yaml" + + migrate_config_dumb(str(run_yaml_path), str(lcs_yaml_path), str(output_path)) + + result = yaml.safe_load(output_path.read_text()) + + # Legacy path is gone + assert "library_client_config_path" not in result["llama_stack"] + # Unified config has full run.yaml under native_override + assert result["llama_stack"]["config"]["native_override"] == run_yaml_content + # Other fields preserved + assert result["llama_stack"]["use_as_library_client"] is True + assert result["name"] == "LCS" + + +def test_migrate_then_synthesize_reproduces_run_yaml(tmp_path: Path) -> None: + """End-to-end round trip: run.yaml → migrate → synthesize → original content.""" + run_yaml_content = { + "version": 2, + "apis": ["inference", "vector_io"], + "providers": { + "inference": [{"provider_id": "rt", "provider_type": "remote::rt"}] + }, + "safety": {"default_shield_id": "guard"}, + } + lcs_yaml_content = { + "name": "LCS", + "llama_stack": { + "use_as_library_client": True, + "library_client_config_path": str(tmp_path / "run.yaml"), + }, + } + run_yaml_path = tmp_path / "run.yaml" + run_yaml_path.write_text(yaml.dump(run_yaml_content)) + lcs_yaml_path = tmp_path / "lightspeed-stack.yaml" + lcs_yaml_path.write_text(yaml.dump(lcs_yaml_content)) + output_path = tmp_path / "unified.yaml" + migrate_config_dumb(str(run_yaml_path), str(lcs_yaml_path), str(output_path)) + + unified = yaml.safe_load(output_path.read_text()) + synthesized = synthesize_configuration(unified) + + # Synthesized == original run.yaml (lossless round trip in dumb mode) + assert synthesized == run_yaml_content
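+
+
+# =============================================================================
+# synthesize_to_file (illustrative sketch)
+# =============================================================================
+
+
+def test_synthesize_to_file_writes_parseable_yaml(tmp_path: Path) -> None:
+    """Sketch, not PoC evidence: `synthesize_to_file` output parses back.
+
+    Mirrors the empty-baseline / no-op-enrichment path proven above and
+    additionally exercises the parent-directory creation the function
+    performs before writing.
+    """
+    # pylint: disable=import-outside-toplevel
+    from llama_stack_configuration import synthesize_to_file
+
+    lcs_config: dict[str, Any] = {
+        "llama_stack": {
+            "config": {"baseline": "empty", "native_override": {"version": 2}}
+        }
+    }
+    output_path = tmp_path / "nested" / "run.yaml"
+    synthesize_to_file(lcs_config, str(output_path))
+    # Empty baseline + override means the file content is exactly the override.
+    assert yaml.safe_load(output_path.read_text()) == {"version": 2}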