diff --git a/docs/baremetal/README.md b/docs/baremetal/README.md new file mode 100644 index 00000000..351a78a2 --- /dev/null +++ b/docs/baremetal/README.md @@ -0,0 +1,109 @@ +# Baremetal README +**Please Note**: The code in the "baremetal" directory provides reference PAL implementations meant to be adapted to real platforms; only the RD-family FVPs are regression-tested on the supplied reference models, and all new platforms must validate their own PAL changes on the relevant hardware or model before relying on them. +The `baremetal` directory contains the reference code for the platform-specific PAL APIs. +Each directory is described below: + +## Directory Structure +  - `base`: The implementation for all modules is present in this directory.\ +    - `include`: Consists of the include files\ +    - `src`: Source files for all modules which do not require user modification.\ +      Eg: Info tables parsing, PCIe enumeration code, etc. + +  - `target`: Contains platform-specific code. The details in this folder need to be modified for the platform\ +    - ``: Info tables parsing, PCIe enumeration code, etc\ +      - `include`: Consists of the include files for the platform-specific information\ +      - `src`: Source files for all modules which require user modification. + +## Build Steps + +### Pre-requisite + +>Note: If you want to build ACS for the reference model(s) provided as part of ACS (for example, RDN2), you can directly follow the build steps below. +If you want to build ACS for your own platform, you must first complete the PAL porting for your platform (refer to the Baremetal [PAL Porting Overview](porting-pal/overview.md)) and fill in the platform-specific knobs using the [Platform override guides](porting-pal/platform-override-guides/README.md). Once the PAL porting is complete, execute the build steps below. + +> Note: .bin stands for either bsa.bin or sbsa.bin or pc_bsa.bin. 
Any platform-specific changes can be made using the `TARGET_BAREMETAL` macro definition. The baremetal reference code is located in [baremetal](../../pal/baremetal/). + +Run the following commands: +- `cd sysarch-acs` +- `python tools/scripts/generate.py ` + +> Eg: `python tools/scripts/generate.py RDN2` +> This command creates a folder `RDN2` under the `pal/target` folder path and the files `pal_bsa.c` and `pal_sbsa.c` within the `RDN2/src` folder. + +1. To compile BSA, perform the following steps\ +  1.1 `cd sysarch-acs`\ +  1.2 `export CROSS_COMPILE=/bin/aarch64-none-elf-`\ +  1.3 `cmake --preset bsa -DTARGET="Target platform"`\ +  1.4 `cmake --build --preset bsa` + +2. To compile SBSA, perform the following steps\ +  2.1 `cd sysarch-acs`\ +  2.2 `export CROSS_COMPILE=/bin/aarch64-none-elf-`\ +  2.3 `cmake --preset sbsa -DTARGET="Target platform"`\ +  2.4 `cmake --build --preset sbsa` + +3. To compile PC_BSA, perform the following steps\ +  3.1 `cd sysarch-acs`\ +  3.2 `export CROSS_COMPILE=/bin/aarch64-none-elf-`\ +  3.3 `cmake --preset pc_bsa -DTARGET="Target platform"`\ +  3.4 `cmake --build --preset pc_bsa` + +
+ +> **Note:** +> You can check the available presets using `cmake --list-presets`. +> If you do not provide `-DTARGET`, it defaults to `RDN2`. +> If you prefer to use make, run `cmake --preset acs_all; cd build; make bsa` (to build all baremetal ACS, run `make acs_all`). +> Recommended: CMake v3.21 (minimum version that supports `--preset`), GCC v12.2 + +
+ +``` +CMake Command Line Options: + `-DARM_ARCH_MAJOR` = Arch major version. Default value is 9. + `-DARM_ARCH_MINOR` = Arch minor version. Default value is 0. + `-DCROSS_COMPILE` = Cross compiler path + `-DTARGET` = Target platform. Should be same as folder under baremetal/target/ + `-DACS` = To compile ACS +``` + +
+ +> On a successful build, *.bin, *.elf, *.img and debug binaries are generated in the `build/_build/output` directory. The output library files are generated in the `build/_build/tools/cmake` directory. + +## Running ACS with Bootwrapper on RDN2 + +**1. In the RDN2 software stack, make the following change:** + + In `/build-scripts/build-target-bins.sh`, replace `uefi.bin` with `acs_latest.bin` + +```bash + if [ "${!tfa_tbbr_enabled}" == "1" ]; then + $TOP_DIR/$TF_A_PATH/tools/cert_create/cert_create \ + ${cert_tool_param} \ +- ${bl33_param_id} ${OUTDIR}/${!uefi_out}/uefi.bin ++ ${bl33_param_id} ${OUTDIR}/${!uefi_out}/acs_latest.bin + fi + + ${fip_tool} update \ + ${fip_param} \ +- ${bl33_param_id} ${OUTDIR}/${!uefi_out}/uefi.bin \ ++ ${bl33_param_id} ${OUTDIR}/${!uefi_out}/acs_latest.bin \ + ${PLATDIR}/${!target_name}/fip-uefi.bin + +``` + +**2. Repackage the FIP image with this new binary** +- `cp /build/_build/output/.bin /output/rdn2/components/css-common/acs_latest.bin` +- `cd ` +- `./build-scripts/rdinfra/build-test-acs.sh -p rdn2 package` +- `export MODEL=` +- `cd /model-scripts/rdinfra/platforms/rdn2` +- `./run_model.sh` + + +For more details on how to port the reference code to a specific platform, and for further customisation, refer to the [User Guide](porting-pal/overview.md). + +----------------- + +*Copyright (c) 2026, Arm Limited and Contributors. All rights reserved.* diff --git a/docs/baremetal/porting-pal/checklists/new-pal-readiness-checklist.md b/docs/baremetal/porting-pal/checklists/new-pal-readiness-checklist.md new file mode 100644 index 00000000..4f056d69 --- /dev/null +++ b/docs/baremetal/porting-pal/checklists/new-pal-readiness-checklist.md @@ -0,0 +1,28 @@ +# Release readiness checklist (Baremetal) + +Use this checklist before you upstream/check-in a new platform port. 
+ +## Documentation +- [ ] `docs/baremetal/README.md` links are valid +- [ ] Platform-specific bring-up notes are captured somewhere discoverable +- [ ] Override guides used are referenced/linked + +## Build reproducibility +- [ ] Clean build works from scratch (documented steps) +- [ ] Toolchain requirements documented +- [ ] Platform selection mechanism documented + +## Functional coverage +- [ ] Smoke run passes (define what smoke is for your suite set) +- [ ] No hard faults / aborts in normal run +- [ ] Logs contain platform ID and build ID for traceability + +## Platform override correctness +- [ ] Counts match number of populated entries for each block +- [ ] Memory map has no overlaps +- [ ] ECAM and MMIO windows validated +- [ ] IORT ID mappings validated (unit test or sanity dump) + +## Artifact hygiene +- [ ] Example logs attached to PR (or referenced) +- [ ] Known limitations documented (what doesn’t work yet) diff --git a/docs/baremetal/porting-pal/checklists/porting-checklist.md b/docs/baremetal/porting-pal/checklists/porting-checklist.md new file mode 100644 index 00000000..15967b7e --- /dev/null +++ b/docs/baremetal/porting-pal/checklists/porting-checklist.md @@ -0,0 +1,46 @@ +# Porting checklist (Baremetal) + +Use this checklist while bringing up a new platform. 
+ +## Boot & console +- [ ] Can boot a minimal payload +- [ ] UART console prints reliably +- [ ] UART config matches actual IP and base +- [ ] UART region is marked DEVICE in memory map + +## Memory map +- [ ] DRAM regions are correct and marked NORMAL +- [ ] MMIO windows are marked DEVICE +- [ ] Reserved carveouts marked RESERVED +- [ ] Holes/unbacked regions marked NOT_POPULATED (if required) + +## Timers & watchdog +- [ ] Per-CPU timer GSIVs correct (PPIs) +- [ ] Timer flags (mode/polarity/always-on) correct +- [ ] CNTFRQ correct +- [ ] Platform timer frames correct (if used) +- [ ] Watchdog bases + GSIVs correct (if used) + +## PCIe (if used) +- [ ] ECAM base + bus ranges correct +- [ ] Enumeration limits correct +- [ ] PCIe MMIO resource windows don’t overlap +- [ ] Static device list (if used) matches enumeration + +## IORT / SMMU (if used) +- [ ] SMMU base addresses correct +- [ ] ITS count/grouping consistent with firmware +- [ ] RID→StreamID mappings correct +- [ ] StreamID→DeviceID mappings correct +- [ ] No overlapping StreamID windows + +## Optional blocks (as required by suites) +- [ ] Cache/topology overrides +- [ ] HMAT bandwidth hints +- [ ] PMU nodes +- [ ] RAS / RAS2 blocks + +## Repro artifacts +- [ ] Save binary + git SHA +- [ ] Save override header +- [ ] Save UART log diff --git a/docs/baremetal/porting-pal/overview.md b/docs/baremetal/porting-pal/overview.md new file mode 100644 index 00000000..c05a6d62 --- /dev/null +++ b/docs/baremetal/porting-pal/overview.md @@ -0,0 +1,264 @@ +# System Bring-up and ACS Debug – Consolidated Guide + +> **Purpose** +> This guide provides a single, structured reference for Arm-based platform bring-up and +> BSA/SBSA/RME ACS debugging, covering PCIe, IORT/IOMMU, MMIO, interrupts, and common failure patterns. 
+ +## Table of Contents +- Overview +- Platform Bring-up +- PCIe Enumeration +- IORT and IOMMU +- Memory Map and MMIO +- Interrupts, Timers, and UART +- Troubleshooting and Debugging + + +## Overview + +### What PAL is + +PAL (Platform Abstraction Layer) is where sysarch-acs obtains **platform-specific information** and implements platform hooks that cannot be discovered automatically in a baremetal environment. + +Typical PAL responsibilities: +- memory map and MMIO regions +- console UART +- timer/watchdog properties +- PCIe config access (ECAM) and enumerator limits +- IOMMU/SMMU topology (IORT) and ID mapping helpers +- optional: RAS/PMU topology if required by the suite + +### What “platform overrides” are + +Many baremetal flows use a `platform_override.h`-style header (or generator input) that provides values used to build ACPI-like tables or to drive tests directly. + +In this repo, the override fields are documented under: +- [Platform override guides](platform-override-guides/README.md) + +### Where to implement platform support + +Common patterns: +- `pal/baremetal/target//...` +- `pal/platform_override.h` or generated override headers +- a platform selection CMake option or build-time macro + +> Keep platform code minimal. Prefer configuration in overrides + shared code in PAL common helpers. + + +## Platform Bringup + +This section is the **recommended workflow** for enabling a *new* platform for baremetal execution. + +### Step 1 — Boot and UART first + +1. Ensure you can boot *any* payload on your platform. +2. Bring up UART logging (correct base, IP type, baud/clock). +3. Confirm you can run a “hello world” style payload and capture logs reliably. + +See: [UART / console override guide](platform-override-guides/uart.md). + +### Step 2 — Memory map sanity + +1. Define DRAM ranges as `NORMAL`. +2. Define key MMIO windows as `DEVICE`. +3. Mark firmware carveouts as `RESERVED`. +4. 
Mark holes/unbacked regions as `NOT_POPULATED` (if the tests intentionally probe them). + +See: [Memory map override guide](platform-override-guides/memory.md). + +### Step 3 — Timers and watchdogs + +Populate: +- per-CPU generic timer PPIs (GSIVs/flags) +- counter frequency (CNTFRQ) +- platform GT frames (if used) +- watchdog frames (if used) + +See: [Timers & watchdog override guide](platform-override-guides/timers-watchdog.md). + +### Step 4 — PCIe (if applicable) + +Populate: +- ECAM window(s) +- bus ranges and enumeration limits +- PCIe MMIO resource windows +- optional static device list used by tests + +See: [PCIe ECAM and device hierarchy guide](platform-override-guides/pcie.md). + +### Step 5 — IOMMU / IORT (if applicable) + +Populate: +- ITS count + grouping +- SMMU instances +- RC/named component nodes +- ID mappings (RID→StreamID, StreamID→DeviceID) + +See: [IORT/IOVIRT guide](platform-override-guides/iort.md). + +### Step 6 — Optional: cache/PMU/RAS/HMAT + +Depending on which suites and tests you run, you may need: +- cache/topology overrides +- PMU nodes (APMT) +- RAS/RAS2 +- HMAT bandwidth hints + +See: [Platform override guides](platform-override-guides/README.md). + +### Step 7 — Run, triage, iterate + +- Run a minimal smoke set +- Fix hard faults first (MMIO to wrong region, timer interrupts wrong, ECAM mapping wrong) +- Then iterate on correctness (IORT mapping, feature flags, expectations) + +Use the checklists: +- [Porting checklist](checklists/porting-checklist.md) +- [New PAL readiness checklist](checklists/new-pal-readiness-checklist.md) + + +## PCIe Enumeration + +This section explains what you need to get PCIe working in baremetal runs. 
+ +### Minimum you need + +- ECAM base + segment + bus range +- enumeration limits (max bus/dev/func) used by your baremetal enumerator +- MMIO resource windows (BAR apertures) that don't overlap DRAM/MMIO +- (optional) static inventory list of expected endpoints/bridges used by tests + +### Primary reference + +See: [PCIe ECAM and device hierarchy guide](platform-override-guides/pcie.md). + +### Debug tips + +- If cfg reads return `0xFFFF_FFFF`: + - ECAM base mapping or bus range is wrong +- If devices enumerate but BAR programming fails: + - MMIO apertures too small or overlapping +- If DMA tests fail: + - cross-check IORT/SMMU mapping + + +## IORT and IOMMU + +This section explains what you need to wire up **SMMU / IORT** for baremetal test correctness. + +### When you need this + +You generally need IORT/IOVIRT configuration if: +- devices are behind SMMU translation (typical on Arm servers) +- tests expect correct StreamID / DeviceID mapping +- ATS/PRI expectations are validated +- MSI routing via ITS relies on DeviceID width/mapping + +### Primary reference + +See: [IOVIRT/IORT guide](platform-override-guides/iort.md). + +### Debug tips + +- Off-by-one in `ID_COUNT` is the #1 issue. +- Keep `DeviceID == StreamID` when possible to simplify debug. +- Ensure ITS IDs match MADT GIC ITS entries in firmware (if ACPI is generated). + + +## Memory Map and MMIO + +This section explains **what to decide** for memory and MMIO when porting a new platform, and points to the concrete override macros. + +### What you must get right + +- **DRAM ranges** (Normal memory) must be correct and non-overlapping. +- **MMIO ranges** must be marked as Device to avoid speculative/cached accesses. +- **Reserved carveouts** must not be used by tests as RAM. +- **Holes / not-populated regions** should be explicitly marked when tests probe abort behavior. 
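
The non-overlap requirement in the list above can be checked mechanically on the host before a first boot. The following is a minimal sketch, not the PAL implementation: the `mem_region_t` structure, the `region_type_t` enum and both function names are hypothetical and only mirror the NORMAL/DEVICE/RESERVED/NOT_POPULATED labels used in this guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region labels; the real PAL enum names may differ. */
typedef enum {
    REGION_NORMAL,
    REGION_DEVICE,
    REGION_RESERVED,
    REGION_NOT_POPULATED
} region_type_t;

/* Hypothetical memory-map entry: half-open range [base, base + size). */
typedef struct {
    uint64_t      base;
    uint64_t      size;
    region_type_t type;
} mem_region_t;

/* Return 1 if the two ranges share at least one byte.
 * Assumes base + size does not wrap past the top of the 64-bit space. */
int regions_overlap(const mem_region_t *a, const mem_region_t *b)
{
    assert(a != NULL && b != NULL);
    return a->base < b->base + b->size && b->base < a->base + a->size;
}

/* Return the index of the first entry that overlaps an earlier entry,
 * or -1 if the whole table is overlap-free. */
int first_overlap(const mem_region_t *map, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        for (size_t j = 0; j < i; j++) {
            if (regions_overlap(&map[i], &map[j]))
                return (int)i;
        }
    }
    return -1;
}
```

Running a check like this on the override table as part of the build gives an early signal for the "synchronous abort" class of failures described in the debugging steps, since an overlapping DEVICE/NORMAL pair is flagged before any MMIO access happens.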
+ +### Primary references + +- Memory map override values: [Memory Platform Config Guide](platform-override-guides/memory.md) +- Bandwidth and latency details (HMAT): [HMAT guide](platform-override-guides/hmat.md) + +### Recommended debugging approach + +1. Start with *only* DRAM + essential MMIO (UART, GIC, timers, ECAM if needed). +2. Boot and log. +3. Add secondary MMIO windows incrementally. +4. If you see synchronous aborts: + - verify the address falls inside a defined region + - verify the region type is correct (DEVICE vs NORMAL) + - verify alignment and size fields + + +## Interrupts, Timers, and UART + +This section groups the common “early boot essentials” for baremetal porting. + +### UART (console) + +Your first milestone is a stable UART console: +- correct UART IP type (PL011 vs 16550 etc.) +- correct base address + access width +- correct baud/clock (or choose “as-is” if firmware configured it) +- optional: correct GSIV if interrupt-driven console is used + +See: [UART guide](platform-override-guides/uart.md). + +### Timers + +Baremetal tests depend on timer correctness for: +- timeouts +- delays +- watchdog interactions +- performance measurement (in some suites) + +See: [Timers & watchdog guide](platform-override-guides/timers-watchdog.md). + +### Interrupt routing + +Many failures are caused by: +- wrong GSIV values (SPI vs PPI mixups) +- incorrect trigger mode/polarity bits +- using a forbidden GSIV range for SPCR/GTDT + +When in doubt: +1. validate against GIC integration documentation +2. 
confirm in firmware logs which interrupts fire + + +## Troubleshooting and Debugging + +### No UART output +- Verify base address + IP type (PL011 vs 16550) +- Verify clock/baud (or set “as-is”) +- Verify memory map marks UART range as DEVICE +- Verify you are connected to the correct UART instance + +### Early synchronous abort / hang +- Check the faulting address (if printed) is within a defined memory/MMIO entry +- Ensure MMIO is DEVICE, not NORMAL +- Ensure reserved regions are not accessed as RAM + +### Timer-related timeouts / flakiness +- Verify timer PPIs (GSIV) and flags (mode/polarity) +- Verify CNTFRQ matches what firmware programs +- If using platform GT frames, verify CNTBASE addresses and GSIVs + +### PCIe not enumerating +- ECAM base mapping wrong (bus stride mismatch) +- Start/end bus range too small +- Segment mismatch +- ECAM region not marked DEVICE in memory map +- Overlapping MMIO addresses used for BAR programming + +### DMA / SMMU / ATS failures +- IORT ID mappings wrong (RID→StreamID / StreamID→DeviceID) +- Wrong “behind SMMU” expectations for endpoints +- ATS marked supported but firmware doesn’t enable it + +### RAS / RAS2 / HMAT / PMU failures +- Start with minimal publication and add blocks incrementally +- Ensure counts match the number of blocks populated +- Keep domain/instance identifiers consistent with SRAT/NUMA view (if applicable) diff --git a/docs/baremetal/porting-pal/platform-override-guides/README.md b/docs/baremetal/porting-pal/platform-override-guides/README.md new file mode 100644 index 00000000..cbb664e8 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/README.md @@ -0,0 +1,24 @@ +# Platform override guides + +These guides explain how to populate **platform override values** used by the baremetal PAL / table generation. 
+ +## Core bring-up +- [PE & GIC (MADT)](pe-gic.md) +- [UART / SPCR](uart.md) +- [Timers & watchdogs / GTDT](timers-watchdog.md) +- [Memory map entries](memory.md) +- [IOVIRT / IORT (SMMU)](iort.md) +- [PCIe ECAM & device inventory](pcie.md) + + +## Optional / suite-dependent +- [Cache / topology (PPTT-style)](cache.md) +- [PMU nodes (APMT)](pmu.md) +- [RAS nodes](ras.md) +- [RAS2 blocks](ras2.md) +- [HMAT bandwidth hints](hmat.md) + +## Advanced / topology-dependent +- [MPAM & PCC](mpam-pcc.md) +- [SRAT / NUMA locality](srat.md) + diff --git a/docs/baremetal/porting-pal/platform-override-guides/cache.md b/docs/baremetal/porting-pal/platform-override-guides/cache.md new file mode 100644 index 00000000..29d40a52 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/cache.md @@ -0,0 +1,324 @@ +# Processor Topology & Cache Configuration Guide (Platform Override) + +This guide explains how to populate **processor cache topology** configuration in the platform override header, using the **RD-N2 reference values** as a worked example. + +--- + +## 1) What this configuration is used for + +The platform override cache/topology data is used to describe: + +- **Which caches exist** (L1I, L1D, L2, L3…) +- **How caches are chained** (e.g., L1 → L2 → end) +- **Which caches are private vs shared** +- **Cache identifiers (Cache IDs)** that allow the firmware/OS tooling to correlate “cache instances” to CPU nodes and sharing groups + +In the RD-N2 example (see `pal/baremetal/target/RDN2/include/platform_override_fvp.h`), the cache database is built as a **set of cache descriptors** (`PLATFORM_CACHE*_*`) plus a **per-processor mapping** (`PLATFORM_PPTT*_*`) that selects the cache IDs for each processing element / leaf node. + +> Key idea: you define caches once (with IDs and next-level links), then reference those IDs from CPU nodes. + +--- + +## 2) High-level structure + +Your override is split into two layers: + +### A. 
Cache Descriptor List (the “cache database”) +A flat list of cache descriptors: +- `PLATFORM_OVERRIDE_CACHE_CNT` +- `PLATFORM_CACHE_FLAGS` +- `PLATFORM_CACHE_OFFSET` +- `PLATFORM_CACHE_NEXT_LEVEL_INDEX` +- `PLATFORM_CACHE_SIZE` +- `PLATFORM_CACHE_CACHE_ID` +- `PLATFORM_CACHE_IS_PRIVATE` +- `PLATFORM_CACHE_TYPE` + +These describe *what caches exist* and *how they link together*. + +### B. Per-PE Cache Mapping (CPU → Cache IDs) +A list mapping each CPU leaf node to its cache IDs: +- `PLATFORM_PPTT_CACHEID0` +- `PLATFORM_PPTT_CACHEID1` + +These describe *which cache IDs belong to each PE* (typically L1D and L1I, or other “head” caches). From those head caches, the **NEXT_LEVEL_INDEX** chain describes downstream caches (e.g., L2). + +--- + +## 3) Fields in the cache descriptors + +### 3.1 `PLATFORM_OVERRIDE_CACHE_CNT` +**Total number of cache descriptors** you define. + +Example: +```c +#define PLATFORM_OVERRIDE_CACHE_CNT 0x30 +``` + +This must be: +- ≥ the highest `PLATFORM_CACHE` index you define + 1 +- consistent with any internal array sizing assumptions in the platform override implementation + +--- + +### 3.2 `PLATFORM_CACHE_CACHE_ID` +A **non-zero unique ID** for the cache descriptor. + +Example: +```c +#define PLATFORM_CACHE0_CACHE_ID 0x1 +#define PLATFORM_CACHE2_CACHE_ID 0x2 +#define PLATFORM_CACHE1_CACHE_ID 0x3 +``` + +**Rules of thumb** +- Cache IDs must be globally unique. ACPI tables such as APMT or MPAM may reference caches by ID, so duplicates break cross-table linking and are not permitted. +- If you need **per-core cache identification**, give each cache instance its own descriptor and ID; never reuse a descriptor across multiple cores even if the caches are identical. + +In RD-N2, you can see a pattern: +- CPU0 uses cache IDs `0x1` and `0x2` +- CPU1 uses `0x1001` and `0x1002` +- CPU2 uses `0x2001` and `0x2002` +…which implies **unique cache instances per CPU** for L1 caches (and likely unique L2 as well). 
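
The two rules for cache IDs (non-zero, globally unique) are easy to verify on the host once the IDs are collected into an array. The sketch below is illustrative only: `cache_ids_valid` is a hypothetical helper, not part of the ACS code base, and the array would have to be filled from your own `*_CACHE_ID` macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if every cache ID is non-zero and appears exactly once,
 * otherwise return 0. Hypothetical helper for host-side validation. */
int cache_ids_valid(const uint32_t *ids, size_t count)
{
    assert(ids != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (ids[i] == 0)
            return 0;               /* zero is not a valid cache ID */
        for (size_t j = 0; j < i; j++) {
            if (ids[j] == ids[i])
                return 0;           /* duplicate breaks cross-table links */
        }
    }
    return 1;
}
```

With the RD-N2 values quoted above for the first two CPUs (`0x1`, `0x3`, `0x2`, `0x1001`, `0x1003`, `0x1002`) the helper reports a valid set; reusing `0x1` for a second descriptor would be rejected.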
+ +--- + +### 3.3 `PLATFORM_CACHE_TYPE` +Cache “kind” (data, instruction, unified). + +Examples in RD-N2: +```c +#define PLATFORM_CACHE0_TYPE 0 // Data +#define PLATFORM_CACHE2_TYPE 1 // Instruction +#define PLATFORM_CACHE1_TYPE 2 // Unified +``` + +Interpretation used in your reference: +- `0` = Data cache (L1D) +- `1` = Instruction cache (L1I) +- `2` = Unified cache (e.g., L2) + +If your platform has separate L2I/L2D (rare on modern Arm servers), you’d model them as separate chains; otherwise, use unified for L2/L3. + +--- + +### 3.4 `PLATFORM_CACHE_SIZE` +Cache size in bytes. + +Examples: +```c +#define PLATFORM_CACHE0_SIZE 0x10000 // 64 KB (L1D) +#define PLATFORM_CACHE2_SIZE 0x10000 // 64 KB (L1I) +#define PLATFORM_CACHE1_SIZE 0x100000 // 1 MB (L2) +``` + +Populate from: +- SoC TRM / core integration guide +- actual CPU cache registers (if you can probe) during bring-up (and then freeze into override) + +--- + +### 3.5 `PLATFORM_CACHE_NEXT_LEVEL_INDEX` +Defines the “linked list” relationship to the **next cache level**. + +Examples (CPU0 chain): +```c +#define PLATFORM_CACHE0_NEXT_LEVEL_INDEX 1 // L1D -> cache1 (L2) +#define PLATFORM_CACHE2_NEXT_LEVEL_INDEX 1 // L1I -> cache1 (L2) +#define PLATFORM_CACHE1_NEXT_LEVEL_INDEX -1 // L2 -> end +``` + +**How to use** +- For L1 caches, point to the L2 cache descriptor index. +- For L2, point to L3 if it is private to the same node; otherwise `-1` if it ends there. +- Use `-1` (or the implementation’s “null” marker) to terminate. + +> If your system has a shared L3 at a cluster/package level, represent L3 as a separate descriptor and ensure the chain points to it at the right level (depending on how your consuming code models “shared”). + +--- + +### 3.6 `PLATFORM_CACHE_IS_PRIVATE` +Indicates whether the cache is private to the associated CPU node. + +Example: +```c +#define PLATFORM_CACHE1_IS_PRIVATE 0x1 +``` + +In RD-N2, all shown caches are marked private (`0x1`). 
That suggests: +- either the model only describes *per-CPU private caches*, or +- shared caches are modeled differently elsewhere, or +- sharing is not expressed in this particular override format. + +**For a new platform** +- If you have shared L3, decide how your platform override expects you to represent it: + - Option A: shared cache descriptor referenced by multiple nodes, `IS_PRIVATE = 0` + - Option B: per-cluster cache descriptor referenced by a “cluster node” (if your format supports non-leaf nodes) + - Option C: not modeled here (and discovered elsewhere), in which case keep private caches only + +--- + +### 3.7 `PLATFORM_CACHE_FLAGS` +A bitfield indicating which cache properties are valid in the descriptor. + +RD-N2 uses: +```c +#define PLATFORM_CACHE_FLAGS 0xFF +``` + +Meaning: “all relevant properties are valid and should be used”. + +For new platforms: +- `0xFF` is a safe baseline if your code expects explicit properties. +- If your consuming stack can discover some properties dynamically, you *could* reduce flags, but that is typically not needed unless required. + +--- + +### 3.8 `PLATFORM_CACHE_OFFSET` +This is an implementation detail used by the firmware generator/consumer as a reference into a serialized structure blob (or to compute relative pointers). + +Example: +```c +#define PLATFORM_CACHE0_OFFSET 0x68 +#define PLATFORM_CACHE1_OFFSET 0xA0 +... +``` + +**Important:** These offsets are **not arbitrary**. +They usually must match: +- the byte offset within the built structure blob, or +- a specific packing layout expected by the codebase. + +**For a new platform** +- Do not “invent” offsets. +- Derive them the same way RD-N2 does: + - either by using the same macro generator logic (recommended), + - or by following the same layout rules (size of each structure, alignment constraints). +- If your project has a generator C file that emits the structures, ensure it computes these offsets consistently. 
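
If offsets ever have to be reviewed by hand, a host-side sanity check can at least catch misaligned or colliding entries. This is a rough sketch under stated assumptions: `cache_offsets_valid` is a hypothetical helper, and `entry_size` and `alignment` are values you must take from your own structure layout; they are not defined by this guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if every offset is a multiple of `alignment` and no two
 * entries of `entry_size` bytes share any byte, otherwise return 0.
 * entry_size and alignment are caller-supplied assumptions. */
int cache_offsets_valid(const uint32_t *offsets, size_t count,
                        uint32_t entry_size, uint32_t alignment)
{
    assert(entry_size > 0 && alignment > 0);

    for (size_t i = 0; i < count; i++) {
        if (offsets[i] % alignment != 0)
            return 0;                       /* misaligned entry */
        for (size_t j = 0; j < i; j++) {
            uint32_t lo = offsets[i] < offsets[j] ? offsets[i] : offsets[j];
            uint32_t hi = offsets[i] < offsets[j] ? offsets[j] : offsets[i];
            if (hi - lo < entry_size)
                return 0;                   /* entries would collide */
        }
    }
    return 1;
}
```

For the two RD-N2 offsets shown above (`0x68` and `0xA0`), the spacing is `0x38` bytes, so the check passes for any assumed entry size up to `0x38` and fails for a larger one. That spacing is only what the two example values imply; it does not confirm the actual structure size.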
+ +If you don’t have a generator, a typical pattern is: +- cache descriptor structures are fixed size (e.g., 28 bytes), +- each entry offset increments by that size (plus alignment), +but you should confirm against your actual code. + +--- + +## 4) Per-processor cache mapping (`PLATFORM_PPTT_CACHEID*`) + +These macros associate each CPU leaf node with the “head” cache IDs it owns. + +Example: +```c +#define PLATFORM_PPTT0_CACHEID0 0x1 +#define PLATFORM_PPTT0_CACHEID1 0x2 +``` + +In RD-N2, the pattern repeats per CPU: +- `CACHEID0` looks like **L1D** +- `CACHEID1` looks like **L1I** + +For CPU1: +```c +#define PLATFORM_PPTT1_CACHEID0 0x1001 +#define PLATFORM_PPTT1_CACHEID1 0x1002 +``` + +This implies: +- each CPU has distinct L1 caches (different IDs), +- and each CPU’s L1 caches point to its L2 via the descriptor chain. + +### How to populate for a new platform +1. Decide how many CPU leaf nodes you have (matches your PE count). +2. For each CPU `i`, assign: + - `CACHEID0` → its data-cache head (usually L1D) + - `CACHEID1` → its instruction-cache head (usually L1I) +3. Ensure those cache IDs exist in the cache descriptor list. +4. Ensure each of those descriptors links correctly via `NEXT_LEVEL_INDEX`. + +--- + +## 5) Reading the RD-N2 example as a template + +### CPU0 caches +Descriptors: +- `CACHE0` (ID 0x1, Data, 64KB) → next level = index 1 +- `CACHE2` (ID 0x2, Instr, 64KB) → next level = index 1 +- `CACHE1` (ID 0x3, Unified, 1MB) → next level = -1 + +CPU0 mapping: +```c +PLATFORM_PPTT0_CACHEID0 = 0x1 // L1D head +PLATFORM_PPTT0_CACHEID1 = 0x2 // L1I head +``` + +This implies an L1D/L1I pair feeding a private L2. + +### CPU1 caches +Descriptors: +- `CACHE3` (ID 0x1001) → next level = index 4 +- `CACHE5` (ID 0x1002) → next level = index 4 +- `CACHE4` (ID 0x1003) → next level = -1 + +CPU1 mapping: +```c +PLATFORM_PPTT1_CACHEID0 = 0x1001 +PLATFORM_PPTT1_CACHEID1 = 0x1002 +``` + +…and so on for each CPU. 
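
The way a head cache ID resolves to a chain of levels can be sketched as a small lookup plus a walk over `NEXT_LEVEL_INDEX`. This is a hedged illustration rather than the consuming code in ACS: `cache_desc_t`, `CACHE_CHAIN_END`, `find_cache_index` and `cache_chain_depth` are hypothetical names, and only the two fields needed for the walk are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_CHAIN_END (-1)   /* terminator used in the RD-N2 example */

/* Hypothetical, reduced view of one cache descriptor. */
typedef struct {
    uint32_t cache_id;
    int32_t  next_level_index;
} cache_desc_t;

/* Return the descriptor index that carries cache_id, or -1 if absent. */
int find_cache_index(const cache_desc_t *db, size_t count, uint32_t cache_id)
{
    assert(db != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (db[i].cache_id == cache_id)
            return (int)i;
    }
    return -1;
}

/* Count the cache levels reachable from descriptor index `head`.
 * Returns -1 if an index is out of range or the chain never terminates. */
int cache_chain_depth(const cache_desc_t *db, size_t count, int head)
{
    int depth = 0;
    int idx = head;

    while (idx != CACHE_CHAIN_END) {
        /* More hops than descriptors means the chain loops. */
        if (idx < 0 || (size_t)idx >= count || (size_t)depth >= count)
            return -1;
        depth++;
        idx = db[idx].next_level_index;
    }
    return depth;
}
```

Applied to the CPU0 descriptors listed above, both heads (`0x1` and `0x2`) resolve to a depth of 2 (L1 then L2), which matches the "L1D/L1I pair feeding a private L2" reading. A depth of -1 on a new platform points at a dangling or cyclic `NEXT_LEVEL_INDEX`.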
+ +--- + +## 6) What to change for a new platform + +### 6.1 If the CPU count differs +- Update the number of per-CPU mapping entries (`PLATFORM_PPTT_*` count). +- Decide whether to keep **unique caches per CPU** (like RD-N2) or share identical structures. + +### 6.2 If cache sizes differ +- Update `*_SIZE` for the corresponding descriptors. + +### 6.3 If cache topology differs +Examples: +- **Shared L3 per cluster**: you may want L2 → L3 and L3 → end, and reference the same L3 descriptor from all CPUs in the cluster. +- **No separate L2 (rare)**: L1 → end. +- **L2 shared across a pair**: point both CPUs’ L1 heads to the same L2 descriptor. + +The key is to keep the `NEXT_LEVEL_INDEX` chain consistent with the topology you want represented. + +### 6.4 Offsets must follow your implementation rules +If your platform generator auto-computes offsets, you should not hand-edit them. +If you must edit them, ensure: +- correct structure sizes +- correct alignment +- monotonic increasing offsets +- no collisions + +--- + +## 7) Minimal skeleton for a new platform (template) + +```c +/* Cache descriptor count */ +#define PLATFORM_OVERRIDE_CACHE_CNT + +/* Cache descriptors */ +#define PLATFORM_CACHE0_FLAGS 0xFF +#define PLATFORM_CACHE0_OFFSET +#define PLATFORM_CACHE0_NEXT_LEVEL_INDEX +#define PLATFORM_CACHE0_SIZE +#define PLATFORM_CACHE0_CACHE_ID +#define PLATFORM_CACHE0_IS_PRIVATE 0x1 +#define PLATFORM_CACHE0_TYPE 0 /* Data */ + +/* ... other cache descriptors ... */ + +/* Per-CPU cache mapping (heads) */ +#define PLATFORM_PPTT0_CACHEID0 +#define PLATFORM_PPTT0_CACHEID1 +#define PLATFORM_PPTT1_CACHEID0 +#define PLATFORM_PPTT1_CACHEID1 +/* ... 
*/ +``` + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/hmat.md b/docs/baremetal/porting-pal/platform-override-guides/hmat.md new file mode 100644 index 00000000..9270e9d3 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/hmat.md @@ -0,0 +1,227 @@ +# Memory attributes & bandwidth hints — platform override guide + +This document describes how to fill the **platform override configuration** for: + +- The number of **memory proximity domains** you want to describe +- A list of **memory domain entries** +- Per-domain **read/write peak bandwidth hints** +- The **common encoding** for the bandwidth dataset (data type, base unit, flags) + +The intent is to provide software with a consistent, complete set of *initiator→memory-domain* performance hints (here: bandwidth) for memory placement and optimization. + +--- + +## 1) What you need from the platform + +Collect the following before you set values: + +1. **Memory proximity domain IDs** + - A small integer per memory domain (0..N-1 is typical). + - These IDs must match whatever the platform uses to describe memory/NUMA domains elsewhere (e.g., SRAT/firmware topology). + +2. **Peak sustainable bandwidth per memory domain** + - Provide **read** and **write** numbers **per domain**, in a *normalized* representation. + - You can use either: + - Measured values (recommended), or + - Spec/SoC interconnect limits (acceptable for early bring-up) + +3. **A base unit** for bandwidth entries + - The base unit defines how to interpret the per-domain “entry” value. + - Represented bandwidth = `entry_value × entry_base_unit`. + +--- + +## 2) Macro groups and what they mean + +### A. Top-level counts + +```c +#define PLATFORM_OVERRIDE_NUM_OF_HMAT_PROX_DOMAIN 1 +#define PLATFORM_OVERRIDE_HMAT_MEM_ENTRIES 0x4 +``` + +- `PLATFORM_OVERRIDE_NUM_OF_HMAT_PROX_DOMAIN` + - Number of *initiator proximity domains* described in this dataset. 
+ - In many server designs, you can start with **1** initiator domain representing “the CPUs / initiators as a group”. + - Increase this if you want to publish different bandwidth numbers depending on which initiator domain is accessing memory. + +- `PLATFORM_OVERRIDE_HMAT_MEM_ENTRIES` + - Number of **memory domain entries** you will populate (i.e., how many memory proximity domains you are publishing bandwidth for). + - Must match the number of `PLATFORM_HMAT_MEMx_*` blocks you define. + + +### B. Dataset description (bandwidth encoding) + +```c +#define HMAT_NODE_MEM_SLLBIC 0x1 +#define HMAT_NODE_MEM_SLLBIC_DATA_TYPE 0x3 +#define HMAT_NODE_MEM_SLLBIC_FLAGS 0x0 +#define HMAT_NODE_MEM_SLLBIC_ENTRY_BASE_UNIT 0x64 +``` + +These macros describe the **kind of dataset** you’re publishing and how to interpret entry values. + +- `HMAT_NODE_MEM_SLLBIC` + - Selects the dataset category used by the implementation (commonly: a “system locality latency/bandwidth” dataset selector). + - In this reference, it is set to `0x1` indicating the **latency/bandwidth dataset** is enabled/selected. + +- `HMAT_NODE_MEM_SLLBIC_DATA_TYPE` + - Selects the **data type** within the dataset. + - In this reference, `0x3` means **Access Bandwidth** (i.e., a single bandwidth number used for both reads and writes *if they are the same*). + - However, your per-domain macros provide **separate read and write values**, which is also a common platform practice; treat them as read/write bandwidth entries even if the dataset is “access bandwidth”. + - If your firmware implementation supports explicit **Read Bandwidth** / **Write Bandwidth** types, prefer those when read≠write. + +- `HMAT_NODE_MEM_SLLBIC_FLAGS` + - Qualifiers for the dataset. + - `0x0` typically means **default**: no special access attribute qualifiers (e.g., not indicating “non-sequential” or “minimum transfer size”). + +- `HMAT_NODE_MEM_SLLBIC_ENTRY_BASE_UNIT` + - The scaling factor for each entry. 
+ - For bandwidth datasets, interpret as **MB/s per unit**. + - With `0x64` (decimal 100): an entry value of `0x82` (130) corresponds to `130 × 100 = 13,000 MB/s`. + +--- + +## 3) Per-memory-domain entries + +Each memory domain entry provides: + +- The **memory proximity domain ID** +- A **max write bandwidth** entry +- A **max read bandwidth** entry + +Reference pattern: + +```c +#define PLATFORM_HMAT_MEMx_PROX_DOMAIN +#define PLATFORM_HMAT_MEMx_MAX_WRITE_BW +#define PLATFORM_HMAT_MEMx_MAX_READ_BW +``` + +Where the represented bandwidth is: + +- `MaxWriteBandwidth = PLATFORM_HMAT_MEMx_MAX_WRITE_BW × HMAT_NODE_MEM_SLLBIC_ENTRY_BASE_UNIT` (MB/s) +- `MaxReadBandwidth = PLATFORM_HMAT_MEMx_MAX_READ_BW × HMAT_NODE_MEM_SLLBIC_ENTRY_BASE_UNIT` (MB/s) + + +### RD N2 example (as provided) + +```c +#define PLATFORM_OVERRIDE_NUM_OF_HMAT_PROX_DOMAIN 1 +#define PLATFORM_OVERRIDE_HMAT_MEM_ENTRIES 0x4 + +#define HMAT_NODE_MEM_SLLBIC 0x1 +#define HMAT_NODE_MEM_SLLBIC_DATA_TYPE 0x3 +#define HMAT_NODE_MEM_SLLBIC_FLAGS 0x0 +#define HMAT_NODE_MEM_SLLBIC_ENTRY_BASE_UNIT 0x64 + +#define PLATFORM_HMAT_MEM0_PROX_DOMAIN 0x0 +#define PLATFORM_HMAT_MEM0_MAX_WRITE_BW 0x82 +#define PLATFORM_HMAT_MEM0_MAX_READ_BW 0x82 + +#define PLATFORM_HMAT_MEM1_PROX_DOMAIN 0x1 +#define PLATFORM_HMAT_MEM1_MAX_WRITE_BW 0x8c +#define PLATFORM_HMAT_MEM1_MAX_READ_BW 0x8c + +#define PLATFORM_HMAT_MEM2_PROX_DOMAIN 0x2 +#define PLATFORM_HMAT_MEM2_MAX_WRITE_BW 0x96 +#define PLATFORM_HMAT_MEM2_MAX_READ_BW 0x96 + +#define PLATFORM_HMAT_MEM3_PROX_DOMAIN 0x3 +#define PLATFORM_HMAT_MEM3_MAX_WRITE_BW 0xa0 +#define PLATFORM_HMAT_MEM3_MAX_READ_BW 0xa0 +``` + +#### What these numbers represent + +With `ENTRY_BASE_UNIT = 0x64 = 100 MB/s`: + +| Memory domain | Entry (read/write) | Represented bandwidth | +|---:|---:|---:| +| 0 | 0x82 = 130 | 130 × 100 = **13,000 MB/s** (~13.0 GB/s) | +| 1 | 0x8C = 140 | 140 × 100 = **14,000 MB/s** (~14.0 GB/s) | +| 2 | 0x96 = 150 | 150 × 100 = **15,000 MB/s** (~15.0 GB/s) | +| 3 | 0xA0 = 160 | 
160 × 100 = **16,000 MB/s** (~16.0 GB/s) | + +Because read and write entries are equal in this reference, this is consistent with “Access Bandwidth”. + +--- + +## 4) Choosing good values + +### A. Picking `ENTRY_BASE_UNIT` + +Choose a base unit that: + +- Preserves ordering (higher entry ⇒ higher bandwidth) +- Avoids overflow/saturation in the firmware data structures +- Keeps entry values reasonably sized (e.g., 10–10,000, not 1–2) + +Typical options: + +- `100 MB/s` (0x64) — good for 10–200 GB/s platforms +- `1000 MB/s` (0x3E8) — good for very high bandwidth fabrics +- `10 MB/s` (0x0A) — if you need finer granularity + +### B. Deriving entry values from measured bandwidth + +If you measure peak bandwidth in GB/s: + +1. Convert to MB/s: `GB/s × 1024` (or ×1000 if your measurement tool reports decimal GB) +2. Compute entry: `entry = round(MB/s / ENTRY_BASE_UNIT)` +3. Clamp to the supported range of your implementation (commonly 16-bit entries) + +Example: target 51.2 GB/s (decimal) ≈ 51,200 MB/s with base unit 100 MB/s: + +- `entry ≈ 512` (`0x200`) + +--- + +## 5) Completeness and consistency rules + +Use this checklist to avoid common integration failures: + +- **Counts match reality** + - `PLATFORM_OVERRIDE_HMAT_MEM_ENTRIES` equals the number of `PLATFORM_HMAT_MEMx_*` blocks. + +- **Domain IDs are consistent** + - Each `*_PROX_DOMAIN` value must match the platform’s memory domain numbering. + - Do not reuse a proximity domain ID across two entries. + +- **Dataset is complete for the chosen shape** + - If you publish bandwidth for a set of initiators and targets, you must publish entries for all relevant initiator→target pairs as required by your firmware’s HMAT construction logic. + - In this simplified RD N2 style (single initiator domain), ensure you provide bandwidth numbers for **all memory domains**. + +- **Read vs write semantics** + - If read and write are materially different on your platform, prefer separate read/write dataset types (if supported). 
+ - Otherwise, keep them equal as in the reference. + +- **Units are not ambiguous** + - Document the base unit you choose and confirm consumers interpret it as MB/s. + +--- + +## 6) Quick template for a new platform + +```c +/* How many initiator proximity domains are described (often 1) */ +#define PLATFORM_OVERRIDE_NUM_OF_HMAT_PROX_DOMAIN + +/* How many memory proximity domains are described */ +#define PLATFORM_OVERRIDE_HMAT_MEM_ENTRIES + +/* Dataset selector + encoding */ +#define HMAT_NODE_MEM_SLLBIC 0x1 +#define HMAT_NODE_MEM_SLLBIC_DATA_TYPE +#define HMAT_NODE_MEM_SLLBIC_FLAGS 0x0 +#define HMAT_NODE_MEM_SLLBIC_ENTRY_BASE_UNIT + +/* Per memory domain entries */ +#define PLATFORM_HMAT_MEM0_PROX_DOMAIN +#define PLATFORM_HMAT_MEM0_MAX_WRITE_BW +#define PLATFORM_HMAT_MEM0_MAX_READ_BW + +/* ... repeat for MEM1..MEM(n-1) */ +``` + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/iort.md b/docs/baremetal/porting-pal/platform-override-guides/iort.md new file mode 100644 index 00000000..674c4c47 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/iort.md @@ -0,0 +1,341 @@ +# IOVIRT / SMMU Platform Configuration Guide (ACPI IORT) + +This document explains **how to fill IOVIRT / SMMU platform configuration** for a new Arm-based platform, using the **ACPI IORT (IO Remapping Table)** specification (DEN0049) and the **RD-N2 platform** as a concrete reference. + +The intent is to help platform and firmware engineers translate **real hardware topology** (PCIe, SMMUs, StreamIDs, ITS, DMA-capable devices) into the **platform override macros** used by SBSA/BSA ACS. + +This guide focuses **only on IOVIRT/IORT**, and is intentionally detailed. + +--- + +## 1. 
What IORT / IOVIRT Describes + +At a high level, IORT tells the OS: + +- Which **devices generate transactions** (PCIe RCs, named components) +- Which **SMMU(s)** those transactions pass through +- How **input IDs** (RIDs or implementation-defined IDs) are translated into: + - **StreamIDs** (for SMMU lookup) + - **DeviceIDs** (for MSI routing via ITS) +- Which **ITS instance(s)** ultimately receive MSIs + +Conceptually, the dataflow is: + +``` +Requester (PCIe / Named Component) + | + | Input ID (RID or device-specific ID) + v +Root Complex / Named Component Node + | + | StreamID + v +SMMUv3 Node + | + | DeviceID + v +ITS Group Node + | + v +GIC ITS (MADT) +``` + +Every mapping in IORT exists to describe **one step in this pipeline**. + +--- + +## 2. IORT Node Types You Will Typically Use + +Most Arm server platforms only need a subset of IORT node types: + +| Node Type | Purpose | +|---------|--------| +| **Root Complex (RC)** | Describes PCIe root complexes and RID → StreamID mapping | +| **SMMUv3** | Describes SMMUv3 instances and StreamID → DeviceID mapping | +| **ITS Group** | Groups one or more GIC ITS units | +| **Named Component** | Describes non-PCIe DMA-capable devices | +| **RMR (optional)** | Reserved Memory Regions for DMA devices | + +RD-N2 uses: +- 1 Root Complex +- 5 SMMUv3 instances +- 1 ITS Group (containing 5 ITS units) +- 2 Named Components + +--- + +## 3. Platform-Level IOVIRT Macros + +These macros define **table-level structure and counts**. + +```c +#define IORT_NODE_COUNT 13 +#define NUM_ITS_COUNT 5 +#define IOVIRT_ITS_COUNT 1 +#define IOVIRT_SMMUV3_COUNT 5 +#define IOVIRT_RC_COUNT 1 +#define IOVIRT_SMMUV2_COUNT 0 +#define IOVIRT_NAMED_COMPONENT_COUNT 2 +#define IOVIRT_PMCG_COUNT 0 +``` + +### How to fill these for a new platform + +1. **Count every IORT node you will emit** + - RC nodes + - SMMU nodes + - ITS Group nodes + - Named Components + - RMR nodes (if any) + +2. 
`NUM_ITS_COUNT` + - Total number of **GIC ITS units** in hardware + - Must match **MADT GIC ITS entries** + +3. `IOVIRT_ITS_COUNT` + - Number of **ITS Group nodes** (usually 1) + +4. `IOVIRT_SMMUV3_COUNT` + - Number of SMMUv3 instances + +5. `IOVIRT_RC_COUNT` + - Number of PCIe root complexes + +6. `IOVIRT_NAMED_COMPONENT_COUNT` + - Number of non-PCIe DMA-capable devices. + +--- + +## 4. SMMUv3 Nodes + +Each SMMUv3 node represents **one SMMUv3 instance**. + +```c +#define IOVIRT_SMMUV3_0_BASE_ADDRESS 0x40000000 +#define IOVIRT_SMMUV3_1_BASE_ADDRESS 0x42000000 +#define IOVIRT_SMMUV3_2_BASE_ADDRESS 0x44000000 +#define IOVIRT_SMMUV3_3_BASE_ADDRESS 0x46000000 +#define IOVIRT_SMMUV3_4_BASE_ADDRESS 0x48000000 +``` + +### How to fill for a new platform + +For each SMMUv3 instance: + +- Use the **MMIO base address** of the SMMU +- Ensure it matches: + - Hardware address map + - IORT SMMUv3 node base + - SMMU base used by firmware/OS + +Optional but important fields (not shown in RD-N2 snippet): +- Event IRQ +- PRI IRQ +- GERR IRQ +- Sync IRQ + +If your platform supports these, they **must be consistent** across: +- IORT +- Interrupt controller configuration +- Linux `arm-smmu-v3` expectations + +--- + +## 5. Root Complex (RC) Node + +The RC node describes **PCIe requesters** and how **RIDs map to StreamIDs**. + +```c +#define IOVIRT_RC_PCI_SEG_NUM 0x0 +#define IOVIRT_RC_MEMORY_PROPERTIES 0x1 +#define IOVIRT_RC_ATS_ATTRIBUTE 0x1 +``` + +### Key RC attributes + +- **PCI Segment Number** + - Typically `0` unless multi-segment PCIe is used + +- **Memory Properties** + - Bitfield describing coherency / shareability + - `0x1` typically indicates coherent access + +- **ATS Attribute** + - `1` if ATS is supported and enabled + - Must match PCIe capability exposure + +--- + +## 6. ID Mapping Fundamentals (CRITICAL SECTION) + +Almost all IORT bring-up bugs come from **incorrect ID mappings**. 
+ +Each mapping describes: + +``` +Input ID range ---> Output ID range + | | + INPUT_BASE OUTPUT_BASE + ID_COUNT OUTPUT_REF +``` + +### Mapping rule + +For an input ID `X`: + +``` +if (INPUT_BASE <= X <= INPUT_BASE + ID_COUNT) + OUTPUT_ID = OUTPUT_BASE + (X - INPUT_BASE) +``` + +⚠️ **ID_COUNT is (number_of_IDs - 1)** — this is the most common mistake. + +--- + +## 7. RC → SMMU (RID → StreamID) Mappings + +Example from RD-N2: + +```c +#define RC_MAP0_INPUT_BASE 0x0 +#define RC_MAP0_ID_COUNT 0x8FFF +#define RC_MAP0_OUTPUT_BASE 0x30000 +#define RC_MAP0_OUTPUT_REF 0x5A4 +``` + +Meaning: + +- PCIe RIDs `0x0000 – 0x8FFF` +- Map to StreamIDs starting at `0x30000` +- Output goes to **SMMUv3 node reference `0x5A4`** + +### How to fill for a new platform + +1. Determine **RID ranges** + - Often grouped by PCIe bus ranges or controllers + +2. Decide **StreamID allocation** + - Fixed window per SMMU is recommended + - Avoid overlapping StreamID ranges across SMMUs + +3. `OUTPUT_REF` + - Must reference the **correct SMMUv3 node** + - This is a node offset/reference, not a base address + +Repeat mapping entries if: +- Different PCIe buses go to different SMMUs +- You want distinct StreamID windows per bus group + +--- + +## 8. SMMU → ITS (StreamID → DeviceID) Mappings + +Each SMMUv3 maps StreamIDs to DeviceIDs consumed by an ITS Group. + +Example: + +```c +#define SMMUV3_0_ID_MAP1_INPUT_BASE 0x30000 +#define SMMUV3_0_ID_MAP1_ID_COUNT 0x8FF +#define SMMUV3_0_ID_MAP1_OUTPUT_BASE 0x30000 +#define SMMUV3_0_ID_MAP1_OUTPUT_REF 0x18 +``` + +Meaning: + +- StreamIDs `0x30000 – 0x308FF` +- Map to DeviceIDs starting at `0x30000` +- Sent to ITS Group node reference `0x18` + +### Best practices + +- Keep **DeviceID = StreamID** if possible (simplifies debug) +- Ensure DeviceID range fits within ITS DeviceID width +- All SMMUs that feed the same ITS Group must agree on DeviceID space + +--- + +## 9. Named Component Nodes + +Named Components describe **non-PCIe DMA-capable devices**. 
+ +Example: + +```c +#define IOVIRT_NAMED_0_DEVICE_NAME "\\_SB_.ETR0" +#define IOVIRT_NAMED_1_DEVICE_NAME "\\_SB_.DMA0" +``` + +These nodes still require **ID mappings**: + +```c +#define NAMED_COMP0_MAP0_INPUT_BASE 0x0 +#define NAMED_COMP0_MAP0_OUTPUT_BASE 0x10000 +#define NAMED_COMP0_MAP0_OUTPUT_REF 0xA54 +``` + +### How to fill for a new platform + +1. Identify DMA-capable ACPI devices + - DMA engines + - Trace units + - Accelerators + +2. Use the **ACPI namespace path** + - Must match DSDT/SSDT exactly + +3. Assign implementation-defined input IDs + - Often small integers (0,1,2,...) + +4. Map to StreamID or DeviceID space + - Usually directly to an SMMU + +--- + +## 10. ITS Group Nodes + +ITS Group nodes bind DeviceIDs to **actual GIC ITS instances**. + +Key rules: + +- ITS IDs **must match MADT GIC ITS IDs** +- Multiple ITS units may share one ITS Group +- Usually **one ITS Group per SoC** + +RD-N2: + +```c +#define NUM_ITS_COUNT 5 +#define IOVIRT_ITS_COUNT 1 +``` + +Meaning: +- 5 ITS units exist +- 1 ITS Group node aggregates them + +--- + +## 11. Recommended Bring-Up Order + +1. Enumerate **GIC ITS units** (MADT) +2. Decide **DeviceID width and layout** +3. Enumerate **SMMUv3 instances** +4. Assign **StreamID windows per SMMU** +5. Populate **RC RID → StreamID mappings** +6. Populate **SMMU StreamID → DeviceID mappings** +7. Add **Named Components** +8. Run ACS IOVIRT / DMA tests + +--- + +## 12. Common Failure Patterns + +- Overlapping StreamID windows across SMMUs +- Wrong `ID_COUNT` (off-by-one) +- ITS IDs mismatched between MADT and IORT +- Named Component ACPI paths incorrect +- RC mappings pointing to wrong SMMU node reference + +--- + +This document should be sufficient to derive **IOVIRT/IORT configuration for a new platform** starting from hardware topology and firmware knowledge. 
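
As a closing illustration, the mapping rule from section 6 and the overlap failure pattern from section 12 can be modelled in a few lines of C. This is a minimal sketch under stated assumptions, not ACS code: `id_map_t`, `id_map_translate`, `id_map_outputs_overlap` and `rc_map0` are hypothetical names introduced here, and the sample values are the `RC_MAP0_*` macros from section 7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one IORT ID mapping (illustrative only). */
typedef struct {
    uint32_t input_base;   /* first input ID covered by the mapping   */
    uint32_t id_count;     /* number of IDs in the range MINUS ONE    */
    uint32_t output_base;  /* output ID that input_base translates to */
    uint32_t output_ref;   /* reference to the output node            */
} id_map_t;

/* Translate input ID x; returns false when x lies outside the mapping. */
static bool id_map_translate(const id_map_t *m, uint32_t x, uint32_t *out)
{
    assert(m != NULL && out != NULL);
    if (x < m->input_base || x - m->input_base > m->id_count)
        return false;
    *out = m->output_base + (x - m->input_base);
    return true;
}

/* True when the output windows of two mappings share at least one ID. */
static bool id_map_outputs_overlap(const id_map_t *a, const id_map_t *b)
{
    uint64_t a_last = (uint64_t)a->output_base + a->id_count;
    uint64_t b_last = (uint64_t)b->output_base + b->id_count;
    return a->output_base <= b_last && b->output_base <= a_last;
}

/* RC_MAP0 values from the RD-N2 example in section 7. */
static const id_map_t rc_map0 = { 0x0, 0x8FFF, 0x30000, 0x5A4 };
```

Because `id_count` holds the number of IDs minus one, the last valid input for `rc_map0` is `0x8FFF`, and the first rejected input is `0x9000`. Running the overlap helper over every pair of mappings that target different SMMUs is one possible way to catch overlapping StreamID windows before booting.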
diff --git a/docs/baremetal/porting-pal/platform-override-guides/memory.md b/docs/baremetal/porting-pal/platform-override-guides/memory.md new file mode 100644 index 00000000..70213f46 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/memory.md @@ -0,0 +1,124 @@ +# Platform Memory Map Override Guide (Memory Entries) + +This guide explains how to populate the **platform memory configuration** macros (often used by validation firmware / ACS harnesses / platform abstraction layers) for a **new platform**, using the **RD‑N2 reference values** as an example. + + +> This is the **static memory map override** used by the platform layer to describe **physical regions**, optional **identity-mapped virtual aliases**, sizes, and semantic **memory types** (Normal/Device/Reserved/Not‑Populated). + +--- + +## 1) What these entries represent + +Each `PLATFORM_OVERRIDE_MEMORY_ENTRY_*` describes **one contiguous address range**: + +- **PHY_ADDR**: base physical address of the region +- **VIRT_ADDR**: virtual address at which the region is accessed (often identical to PHY on identity maps) +- **SIZE**: size in bytes +- **TYPE**: how the software should treat the range: + - `MEMORY_TYPE_NORMAL` – normal cacheable DRAM (usable RAM) + - `MEMORY_TYPE_DEVICE` – MMIO / Device-nGnRnE / strongly-ordered / non-cacheable region + - `MEMORY_TYPE_RESERVED` – address range that exists but must **not** be used as general memory (firmware carveout, secure region, MMIO window, etc.) + - `MEMORY_TYPE_NOT_POPULATED` – address range that *could exist architecturally* but is **not backed by memory** on this platform (holes) + +> **Key idea:** the OS/firmware/test-harness uses these entries to know **what is RAM**, **what is MMIO**, and **what must not be touched**. 
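
The key idea can be sketched in C as a table lookup. This is only an illustration under stated assumptions: `mem_entry_t`, `mem_type_t`, `mem_classify` and `rdn2_entries` are hypothetical names, the `MEM_*` enumerators stand in for the real `MEMORY_TYPE_*` constants (whose numeric values are not shown in this guide), and the table holds the RD‑N2 reference values listed in section 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for MEMORY_TYPE_*; MEM_UNKNOWN means "no entry covers it". */
typedef enum {
    MEM_NORMAL, MEM_DEVICE, MEM_RESERVED, MEM_NOT_POPULATED, MEM_UNKNOWN
} mem_type_t;

/* Hypothetical in-memory form of one PLATFORM_OVERRIDE_MEMORY_ENTRYn_*. */
typedef struct {
    uint64_t   phy_addr;   /* base physical address          */
    uint64_t   virt_addr;  /* virtual alias (identity here)  */
    uint64_t   size;       /* size in bytes                  */
    mem_type_t type;       /* how software may treat it      */
} mem_entry_t;

/* RD-N2 reference values from section 3. */
static const mem_entry_t rdn2_entries[] = {
    { 0x1050000000ULL, 0x1050000000ULL, 0x4000000ULL,  MEM_DEVICE        },
    { 0xFF600000ULL,   0xFF600000ULL,   0x10000ULL,    MEM_RESERVED      },
    { 0x80000000ULL,   0x80000000ULL,   0x60000000ULL, MEM_NORMAL        },
    { 0xC030000ULL,    0xC030000ULL,    0x20000ULL,    MEM_NOT_POPULATED },
};

/* Return the type of the entry that contains physical address pa. */
static mem_type_t mem_classify(const mem_entry_t *tbl, size_t n, uint64_t pa)
{
    assert(tbl != NULL);
    for (size_t i = 0; i < n; i++) {
        if (pa >= tbl[i].phy_addr && pa - tbl[i].phy_addr < tbl[i].size)
            return tbl[i].type;
    }
    return MEM_UNKNOWN;
}
```

A caller would treat only `MEM_NORMAL` results as usable RAM and refuse to touch `MEM_RESERVED`, `MEM_NOT_POPULATED` and `MEM_UNKNOWN` addresses.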
+ +--- + +## 2) How to derive the values for a new platform + +### Step A — Enumerate *all* address regions you care about +Create a list from: +- SoC memory map / TRM (DRAM windows, MMIO apertures, firmware SRAM, PCIe ECAM/MMIO, GIC, UART, etc.) +- Firmware memory map (carveouts: secure, OP-TEE, SCP, shared memory mailboxes, crashlog, etc.) +- Any “holes” in the DRAM map (e.g., reserved interleaves, highmem gaps) + +### Step B — Decide the **entry type** for each region +Use the rule of thumb: + +- **Normal DRAM** → `MEMORY_TYPE_NORMAL` +- **Peripheral registers / MMIO apertures** → `MEMORY_TYPE_DEVICE` +- **Firmware carveouts / reserved regions** (even if physically DRAM) → `MEMORY_TYPE_RESERVED` +- **Unimplemented holes** → `MEMORY_TYPE_NOT_POPULATED` + +### Step C — Choose PHY vs VIRT mapping +Most platforms use **identity mapping** in early boot / bare-metal tests: + +- `VIRT_ADDR == PHY_ADDR` + +If your platform uses a fixed offset mapping (e.g., `VA = PA + 0xFFFF_0000_0000_0000`), then: +- `VIRT_ADDR = PHY_ADDR + offset` + +### Step D — Ensure entries are **page-aligned** and non-overlapping +Good hygiene (and often required): +- `PHY_ADDR` aligned to at least 4KB (often 64KB/2MB depending on mapping granularity) +- `SIZE` a multiple of the same alignment +- No overlaps between entries +- Prefer fewer, larger ranges unless distinct types require splitting + +--- + +## 3) RD‑N2 example (your reference) annotated + +### Provided reference macros +```c +/* Memory config */ +#define PLATFORM_OVERRIDE_MEMORY_ENTRY_COUNT 0x4 + +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_PHY_ADDR 0x1050000000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_VIRT_ADDR 0x1050000000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_SIZE 0x4000000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_TYPE MEMORY_TYPE_DEVICE + +#define PLATFORM_OVERRIDE_MEMORY_ENTRY1_PHY_ADDR 0xFF600000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY1_VIRT_ADDR 0xFF600000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY1_SIZE 0x10000 +#define 
PLATFORM_OVERRIDE_MEMORY_ENTRY1_TYPE MEMORY_TYPE_RESERVED + +#define PLATFORM_OVERRIDE_MEMORY_ENTRY2_PHY_ADDR 0x80000000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY2_VIRT_ADDR 0x80000000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY2_SIZE 0x60000000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY2_TYPE MEMORY_TYPE_NORMAL + +#define PLATFORM_OVERRIDE_MEMORY_ENTRY3_PHY_ADDR 0xC030000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY3_VIRT_ADDR 0xC030000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY3_SIZE 0x20000 +#define PLATFORM_OVERRIDE_MEMORY_ENTRY3_TYPE MEMORY_TYPE_NOT_POPULATED +``` + +### What each entry implies + +| Entry | Range (PA) | Size | Type | Practical meaning | +|------:|------------|------|------|------------------| +| 0 | `0x10_5000_0000` .. `0x10_53FF_FFFF` | 64 MB | Device | A device/MMIO region (e.g., a peripheral window, PCIe MMIO, etc.) | +| 1 | `0xFF60_0000` .. `0xFF60_FFFF` | 64 KB | Reserved | Region exists but must not be treated as normal memory | +| 2 | `0x8000_0000` .. `0xDFFF_FFFF` | 1.5 GB | Normal | DRAM usable by firmware/tests/OS | +| 3 | `0x0C03_0000` .. `0x0C04_FFFF` | 128 KB | Not populated | Hole / unbacked region; touching it should fault/abort | + +> Note: The names of the regions (what device or carveout they correspond to) should come from your platform memory map/TRM. The **types** reflect how software must access them. + +--- + +## 4) Template for a new platform + +Copy/paste and fill: + +```c +/* Memory config */ +#define PLATFORM_OVERRIDE_MEMORY_ENTRY_COUNT + +/* Entry0: */ +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_PHY_ADDR <0x...> +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_VIRT_ADDR <0x...> +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_SIZE <0x...> +#define PLATFORM_OVERRIDE_MEMORY_ENTRY0_TYPE + +/* Entry1: */ +#define PLATFORM_OVERRIDE_MEMORY_ENTRY1_PHY_ADDR <0x...> +#define PLATFORM_OVERRIDE_MEMORY_ENTRY1_VIRT_ADDR <0x...> +#define PLATFORM_OVERRIDE_MEMORY_ENTRY1_SIZE <0x...> +#define PLATFORM_OVERRIDE_MEMORY_ENTRY1_TYPE <...> + +/* ... 
*/ +``` + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/mpam-pcc.md b/docs/baremetal/porting-pal/platform-override-guides/mpam-pcc.md new file mode 100644 index 00000000..71b5f3b1 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/mpam-pcc.md @@ -0,0 +1,375 @@ +# MPAM and PCC Platform Configuration Guide (Platform Override Macros) + +This note explains how to populate MPAM- and PCC-related platform configuration macros (like those used in SBSA/BSA ACS) for a new Arm-based platform. + +The focus is on how to *think about* your platform and then translate that into ACS-style macros such as: + +```c +/* MPAM Config */ +#define MPAM_MAX_MSC_NODE +#define MPAM_MAX_RSRC_NODE +#define PLATFORM_MPAM_MSC_COUNT +... +/* PCC Config */ +#define PLATFORM_PCC_SUBSPACE_COUNT +#define PLATFORM_PCC_SUBSPACE0_... +``` + +--- + +## 1. Overview: What MPAM and PCC Describe + +### 1.1 MPAM + +MPAM (Memory System Resource Partitioning And Monitoring) is described to the OS using the **ACPI MPAM table**. Conceptually: + +- **MSCs (Memory System Components)** are logical blocks that expose MPAM registers (partitioning and monitoring controls). +- Each **MSC node** in the MPAM table describes: + - How software accesses the MSC (MMIO or PCC). + - Interrupts for overflow and error reporting (optional). + - MAX_NRDY timing for configuration changes. + - The set of **resources** managed by that MSC (cache storage, memory bandwidth, interconnect link, etc). +- Each **Resource node** describes: + - Which logical resource is controlled (cache instance, memory domain, SMMU TLB, interconnect, etc). + - Whether RIS (Resource Instance Selection) is used. + - The **locator** describing *where* the resource is in the system (PPTT cache node, SRAT proximity domain, IORT node, etc). + +The job of the platform configuration is therefore: + +1. Enumerate all MSCs you want the OS to see. +2. 
For each MSC, describe: + - How to reach it (MMIO vs PCC, base address, size). + - How to handle its interrupts. + - How many MPAM resources it contains and where those resources live in the system. + +### 1.2 PCC (Platform Communication Channel) + +PCC is a generic mechanism for **host–firmware shared-memory communication**, described in the **ACPI PCCT**. MPAM uses PCC as one of the possible **interface types** for an MSC: + +- MPAM MSC Interface Type: + - `0x00` – MMIO (SystemMemory) + - `0x0A` – PCC + +If you choose PCC for an MSC: + +- The MPAM MSC node’s **Base Address** field holds the **PCC subspace ID**, *not* an MMIO address. +- The actual shared memory region and doorbell mechanisms are defined in the **PCCT subspace** referenced by that ID. + +In ACS, PCC configuration macros describe those PCCT subspaces and the related doorbell and completion registers. + +--- + +## 2. MPAM Platform Configuration + +We’ll use the RD-N2-style macros as a reference and then generalize. + +### 2.1 High-Level Steps for MPAM + +For a new platform, go through this sequence: + +1. **Enumerate all MSCs in the SoC** + Candidates include: + - L1/L2/L3 caches and cluster-level caches. + - Memory-side caches. + - Memory controllers and channels (for memory bandwidth). + - SMMUs (IO TLBs, translation caches in the SMMU datapath). + - Inter-socket / inter-NUMA interconnects. + - SoC interconnect buffers / fabrics that implement MPAM. + - Other ACPI devices with MPAM-capable resources (implementation-specific). + +2. **Decide interface type for each MSC** + - **MMIO**: registers are in normal memory-mapped region. + - **PCC**: registers are read/written via a PCC shared memory region and doorbell. + +3. **Define MSC-level properties** + For each MSC: + - Assign an **Identifier** (must be unique across all MSCs). + - Define access information (base address or PCC subspace ID, MMIO size). + - Configure Overflow/Error interrupts and their affinities if implemented. 
+ - Set `MAX_NRDY_USEC` to match the worst-case “Not-Ready” time after configuration changes. + - Optionally link the MSC to a device (HID/UID) for power management. + +4. **Define resources inside each MSC** + - For each logical resource that the MSC controls (cache, bandwidth, interconnect link, etc), create a **resource node**. + - Choose a **locator type** (processor cache, memory, SMMU, memory-side cache, ACPI device, interconnect, or unknown). + - Fill in the locator descriptors for that type. + - Decide RIS indices if the MSC supports RIS. + +5. **Fill in ACS platform macros** based on steps 3–4: + - `MPAM_MAX_MSC_NODE`, `MPAM_MAX_RSRC_NODE`, `PLATFORM_MPAM_MSC_COUNT` + - MSCi config (ID, base, size, interrupts, MAX_NRDY, resource count) + - Per-resource config (RIS index, locator type, descriptor fields). + +### 2.2 Mapping to ACS-Style MPAM Macros + +Using your RD-N2 example as a template: + +```c +/* MPAM Config */ +#define MPAM_MAX_MSC_NODE 0x1 +#define MPAM_MAX_RSRC_NODE 0x1 +#define PLATFORM_MPAM_MSC_COUNT 0x1 + +#define PLATFORM_MPAM_MSC0_INTR_TYPE 0x0 +#define PLATFORM_MPAM_MSC0_ID 0x3 +#define PLATFORM_MPAM_MSC0_BASE_ADDR 0x1010028000 +#define PLATFORM_MPAM_MSC0_ADDR_LEN 0x2004 +#define PLATFORM_MPAM_MSC0_MAX_NRDY 10000000 +#define PLATFORM_MPAM_MSC0_RSRC_COUNT 0x1 + +#define PLATFORM_MPAM_MSC0_RSRC0_RIS_INDEX 0x0 +#define PLATFORM_MPAM_MSC0_RSRC0_LOCATOR_TYPE 0x1 +#define PLATFORM_MPAM_MSC0_RSRC0_DESCRIPTOR1 0x0 +#define PLATFORM_MPAM_MSC0_RSRC0_DESCRIPTOR2 0x0 +``` + +You can think of this as: + +- **Global limits:** + - `MPAM_MAX_MSC_NODE` – maximum MSCs the override infrastructure can hold. + - `MPAM_MAX_RSRC_NODE` – maximum resource nodes it can hold. + - `PLATFORM_MPAM_MSC_COUNT` – how many MSCs are actually described for this platform. + +- **Per-MSC macros:** + - `PLATFORM_MPAM_MSCo_ID` → MSC node *Identifier* (must match MSC Device _UID if you expose one in ACPI). 
+ - `PLATFORM_MPAM_MSCo_BASE_ADDR` → + - If Interface Type = MMIO: base of MSC’s MPAM register space. + - If Interface Type = PCC: PCC subspace ID (the ACS code must know to interpret it that way). + - `PLATFORM_MPAM_MSCo_ADDR_LEN` → + - If MMIO: size of the MMIO region for MPAM registers. + - If PCC: 0. + - `PLATFORM_MPAM_MSCo_INTR_TYPE` → describes whether you use wired interrupts, MSI, or none (exact encoding is ACS-specific, typically 0 means wired/GSIV or “not used”). + - `PLATFORM_MPAM_MSCo_MAX_NRDY` → value for `MAX_NRDY_USEC` in MSC node. + - `PLATFORM_MPAM_MSCo_RSRC_COUNT` → number of resource nodes attached to this MSC. + +- **Per-resource macros for MSC `o`, resource `r`:** + - `PLATFORM_MPAM_MSCo_RSRCr_RIS_INDEX` → RIS index (0 if RIS not used). + - `PLATFORM_MPAM_MSCo_RSRCr_LOCATOR_TYPE` → value from the location types table: + - `0x00` – Processor cache + - `0x01` – Memory + - `0x02` – SMMU + - `0x03` – Memory-side cache + - `0x04` – ACPI device + - `0x05` – Interconnect + - `0xFF` – Unknown + - `PLATFORM_MPAM_MSCo_RSRCr_DESCRIPTOR1` / `_DESCRIPTOR2` → the two parts of the locator structure (Descriptor1 is 8 bytes, Descriptor2 is 4 bytes). In ACS macros they are split into two values; how you pack them is implementation-specific, but conceptually: + - Descriptor1 → “primary” identifier (cache ID, proximity domain, IORT identifier, etc). + - Descriptor2 → “secondary” identifier (often 0 or reserved). + +### 2.3 Choosing Locator Types and Descriptors + +The locator type tells the OS *where* this resource lives. You choose it based on the component: + +- **Processor cache (0x00)** + - Use for L1/L2/L3 caches. + - `Descriptor1` must match the Identifier of the PPTT Type 1 cache structure representing that cache. + - `Descriptor2` is reserved / 0. + +- **Memory (0x01)** + - Use for memory bandwidth resources. + - `Descriptor1` = SRAT proximity domain associated with the memory range. + - `Descriptor2` = 0. 
+ +- **SMMU (0x02)** + - Use for IO TLBs or translation caches in an SMMU. + - `Descriptor1` = Identifier of the IORT node describing that SMMU interface. + - `Descriptor2` = 0. + +- **Memory-side cache (0x03)** + - Use for memory-side caches that sit in front of far memory. + - Descriptor encodes: `{Proximity domain, cache level}` (exact packing is implementation-specific in ACS; logically it’s that tuple). + +- **ACPI device (0x04)** + - Use for implementation-specific resources associated with an ACPI-described device. + - `Descriptor1` = encoded ACPI _HID (or pointer to it, depending on your implementation). + - `Descriptor2` = _UID. + +- **Interconnect (0x05)** + - Use for NUMA or processor-cluster interconnect links. + - `Descriptor1` often points to a resource-specific data block containing an interconnect descriptor table (UUID, number of descriptors, then an array of {source ID, dest ID, link type}). + +- **Unknown (0xFF)** + - Use when the resource is in a component that cannot be described via ACPI, e.g. internal fabric buffers. + +How you pack `Descriptor1` / `Descriptor2` into the ACS macros for each locator type is a pure implementation detail in the ACS platform override. The key is that they match the identifiers that the OS will see in PPTT, SRAT, HMAT, IORT and the ACPI namespace. + +### 2.4 RIS Index + +If your MSC supports RIS (Resource Instance Selection), then: + +- `PLATFORM_MPAM_MSCo_RSRCr_RIS_INDEX` must be between 0 and `MPAMF_IDR.RIS_MAX` for that MSC. +- Each resource instance of the same type (e.g., multiple channels, multiple cache slices) gets a distinct RIS index. +- If the MSC does **not** support RIS (`HAS_RIS = 0`), then: + - Set RIS index to 0 in all resource nodes. + - All controls of a given type operate on that single resource instance. 
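
The RIS rules above can be checked mechanically before the macros are committed. The following is a minimal sketch, assuming the caller has already read `HAS_RIS` and `RIS_MAX` from the MSC's `MPAMF_IDR` (the register decode is deliberately not shown); `ris_index_valid` and `ris_indices_distinct` are hypothetical helper names, not ACS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check one PLATFORM_MPAM_MSCo_RSRCr_RIS_INDEX value.
 * has_ris / ris_max are assumed to come from MPAMF_IDR of that MSC.
 */
static bool ris_index_valid(bool has_ris, uint32_t ris_max, uint32_t ris_index)
{
    if (!has_ris)
        return ris_index == 0;      /* no RIS: index must be 0      */
    return ris_index <= ris_max;    /* RIS: 0..RIS_MAX inclusive    */
}

/* True when no two entries in idx[0..n-1] share a RIS index. */
static bool ris_indices_distinct(const uint32_t *idx, size_t n)
{
    assert(idx != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (idx[i] == idx[j])
                return false;
    return true;
}
```

The distinctness check is meant for the resource nodes of a single RIS-capable MSC; for an MSC without RIS every node carries index 0, so only `ris_index_valid` applies.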
+ +### 2.5 Interrupts, MAX_NRDY, and Linked Devices + +Per MSC node, you also need to decide: + +- **Overflow interrupt / Error interrupt**: + - GSIV numbers (wired) for overflow/error interrupts, or 0 if not present. + - Interrupt flags (level/edge, processor vs processor container affinity). + - Affinity (ACPI UID of the CPU or cluster that handles the interrupt). + +- **`MAX_NRDY_USEC`**: + - Worst-case time in microseconds for “Not Ready” to be deasserted after updating configuration. + +- **Linked device (HID/UID)** (optional but recommended): + - If the MSC shares power management with some other ACPI device (CPU, cluster, memory controller, etc.), set: + - Linked device HID = _HID of that device. + - Linked device UID = _UID of that device. + - This lets OSPM coordinate power/pstate for the MSC and its associated device. + +In ACS macros, you might have extra fields for interrupt type and affinities; fill them consistently with the MPAM spec fields you intend to expose. + +### 2.6 Example: Simple Memory Bandwidth MSC + +Assume a new platform with: + +- A single MPAM MSC (ID=3) at MMIO base `0x10_1002_8000`, size `0x2004`. +- The MSC controls **memory bandwidth** for all memory in proximity domain 0. +- No overflow/error interrupts. +- No RIS (single resource instance). + +Then you could write: + +```c +#define MPAM_MAX_MSC_NODE 0x1 +#define MPAM_MAX_RSRC_NODE 0x1 +#define PLATFORM_MPAM_MSC_COUNT 0x1 + +#define PLATFORM_MPAM_MSC0_INTR_TYPE 0x0 // e.g. 
0 = no special handling / wired + none +#define PLATFORM_MPAM_MSC0_ID 0x3 +#define PLATFORM_MPAM_MSC0_BASE_ADDR 0x1010028000ULL +#define PLATFORM_MPAM_MSC0_ADDR_LEN 0x2004 +#define PLATFORM_MPAM_MSC0_MAX_NRDY 10000000 // in microseconds: 10,000,000 us = 10 s worst-case +#define PLATFORM_MPAM_MSC0_RSRC_COUNT 0x1 + +/* Resource 0: memory bandwidth for proximity domain 0 */ +#define PLATFORM_MPAM_MSC0_RSRC0_RIS_INDEX 0x0 // no RIS +#define PLATFORM_MPAM_MSC0_RSRC0_LOCATOR_TYPE 0x1 // Memory +#define PLATFORM_MPAM_MSC0_RSRC0_DESCRIPTOR1 0x0 // SRAT proximity domain 0 +#define PLATFORM_MPAM_MSC0_RSRC0_DESCRIPTOR2 0x0 // reserved +``` + +Adapting this to a more complex topology just means adding more MSCs and more resources with appropriate locator types. + +--- + +## 3. PCC Platform Configuration + +### 3.1 When Do You Need PCC Config? + +You only need to fully populate PCC-related platform macros when: + +- At least one MPAM MSC (or some other firmware interface) uses **PCC** (Interface Type = PCC / ASID=0x0A), **and** +- The ACS tests expect to reach that MSC through a PCC subspace defined in PCCT. + +If your MPAM MSCs are all MMIO-based, PCC configuration can stay as placeholders (or be unused). + +### 3.2 Relationship Between MPAM and PCC + +For an MPAM MSC with **PCC interface**: + +- In the MSC node: + - `Interface type` = PCC (0x0A). + - `Base address` = the **subspace ID** of the PCC subspace in PCCT, not a physical address. + - `MMIO size` = 0. +- In PCCT (PCC table): + - There is a subspace of **type 1 (HW-reduced communications subspace)**. + - That subspace references: + - A **shared memory region** (“Generic Communication Shared Memory Region”), where the MPAM register image lives. + - Optional **doorbell register(s)** and associated masks. + - The MPAM feature page is mapped within this shared region at offset 8, after the PCC header fields. 
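
To make the "offset 8" statement concrete, the shared memory header can be sketched as a C struct. This is an illustration under stated assumptions: the field names and widths follow the ACPI description of the Generic Communications Channel shared memory region (4-byte signature, 2-byte command, 2-byte status), and `pcc_shmem_t`, `PCC_FEATURE_PAGE_OFFSET` and `pcc_feature_page` are hypothetical names that do not come from ACS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header of the PCC shared memory region, followed by the payload. */
typedef struct {
    uint32_t signature;     /* PCC signature, includes the subspace ID */
    uint16_t command;       /* command field written by the requester  */
    uint16_t status;        /* status field updated by the platform    */
    uint8_t  comm_space[];  /* communication space: starts at offset 8 */
} pcc_shmem_t;

/* Offset at which the MPAM feature page begins within the region. */
#define PCC_FEATURE_PAGE_OFFSET offsetof(pcc_shmem_t, comm_space)

/* Return a pointer to the start of the MPAM feature page. */
static uint8_t *pcc_feature_page(pcc_shmem_t *shmem)
{
    assert(shmem != NULL);
    return shmem->comm_space;
}
```

On real hardware the region would be accessed through a suitably mapped, `volatile`-qualified pointer; that detail is omitted here to keep the layout itself in focus.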
+ +In ACS macros, the PCC side is configured with something like: + +```c +#define PLATFORM_PCC_SUBSPACE_COUNT 0x1 +#define PLATFORM_PCC_SUBSPACE0_INDEX 0x0 +#define PLATFORM_PCC_SUBSPACE0_TYPE 0x1 // typically HW-reduced subspace +#define PLATFORM_PCC_SUBSPACE0_BASE +#define PLATFORM_PCC_SUBSPACE0_MIN_REQ_TURN_TIME +// doorbell / status configuration +#define PLATFORM_PCC_SUBSPACE0_DOORBELL_PRESERVE +#define PLATFORM_PCC_SUBSPACE0_DOORBELL_WRITE +#define PLATFORM_PCC_SUBSPACE0_CMD_COMPLETE_CHK_MASK +#define PLATFORM_PCC_SUBSPACE0_CMD_UPDATE_PRESERVE +#define PLATFORM_PCC_SUBSPACE0_CMD_COMPLETE_UPDATE_SET +/* GAS-style doorbell and status registers */ +#define PLATFORM_PCC_SUBSPACE0_DOORBELL_REG {space_id, bit_width, bit_offset, access_size, address} +#define PLATFORM_PCC_SUBSPACE0_CMD_COMPLETE_UPDATE_REG { ... } +#define PLATFORM_PCC_SUBSPACE0_CMD_COMPLETE_CHK_REG { ... } +``` + +(RD-N2 example uses dummy values, which is fine if MPAM is MMIO-only.) + +### 3.3 Filling PCC Subspace Macros for a New Platform + +For each PCC subspace you actually use: + +1. **Decide subspace index and type** + - `PLATFORM_PCC_SUBSPACEi_INDEX` → index used in your firmware/PCCT. + - `PLATFORM_PCC_SUBSPACEi_TYPE` → typically `1` for HW-reduced communications subspace. + - `PLATFORM_PCC_SUBSPACE_COUNT` → total number of subspaces you want to describe in ACS override. + +2. **Shared memory region (BASE)** + - `PLATFORM_PCC_SUBSPACEi_BASE` → base physical address of the PCC shared memory region used by this subspace. + - Ensure the memory attributes are correct: + - UEFI: set in EFI memory map. + - Non-UEFI: must be mapped as Device-nGnRnE. + +3. **Doorbell register** + - Identify how software tells firmware that the command is ready: + - Typically a register (GAS) whose write triggers an interrupt or wakeup. + - Fill **Generic Address Structure** fields: + - `space_id` – usually system memory or system I/O. + - `bit_width`, `bit_offset`, `access_size` – describe the doorbell field. 
+ - `address` – base address of doorbell register. + + Example (simple memory-mapped doorbell bit 0 at address 0x2_0000_0000): + + ```c + #define PLATFORM_PCC_SUBSPACE0_DOORBELL_REG {0x00, 32, 0, 3, 0x0000000200000000ULL} + #define PLATFORM_PCC_SUBSPACE0_DOORBELL_PRESERVE 0xFFFFFFFEU // preserve bits [31:1] + #define PLATFORM_PCC_SUBSPACE0_DOORBELL_WRITE 0x00000001U // set bit 0 + ``` + +4. **Command complete check + update registers** + - Identify which status bit firmware sets when the command is complete and which register it lives in. + - `PLATFORM_PCC_SUBSPACEi_CMD_COMPLETE_CHK_REG` → GAS for that status register. + - `PLATFORM_PCC_SUBSPACEi_CMD_COMPLETE_CHK_MASK` → bit mask evaluated by the driver to know when firmware is done. + - `PLATFORM_PCC_SUBSPACEi_CMD_COMPLETE_UPDATE_REG` → register written by the OS to clear/ack the completion. + - `PLATFORM_PCC_SUBSPACEi_CMD_COMPLETE_UPDATE_SET` → bits to write to clear the completion bit. + - `PLATFORM_PCC_SUBSPACEi_CMD_UPDATE_PRESERVE` → mask of bits preserved when writing the update/ack. + + If your PCC implementation uses the standard ACPI `Command`/`Status` semantics in the shared memory region, these macros simply reflect that mapping. + +5. **Minimum request turnaround time** + - `PLATFORM_PCC_SUBSPACEi_MIN_REQ_TURN_TIME` → minimum time (in microseconds) between PCC requests. + - This should match the PCCT and your firmware’s expectations. + +6. **Hooking PCC into MPAM MSC nodes** + - For every MSC that uses this PCC subspace: + - Set its **Interface type** to PCC. + - Set its **Base address** field to the PCC subspace index. + - Set its `MMIO size` field to 0. + + In your ACS macros, this means making sure that the MPAM MSC macros and the PCC subspace macros agree on which subspace index is used. 
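The doorbell masks from step 3 combine into a read-modify-write of the doorbell register. A minimal sketch, assuming the usual PCC semantics (keep the preserved bits, then OR in the write bits) and a 32-bit memory-mapped doorbell; `pcc_doorbell_value` and `pcc_ring_doorbell` are hypothetical helpers, not ACS functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value to write back: preserved bits of the current value plus the write bits. */
static inline uint32_t pcc_doorbell_value(uint32_t current, uint32_t preserve,
                                          uint32_t write)
{
    return (current & preserve) | write;
}

/* Read-modify-write of a memory-mapped doorbell register. */
static inline void pcc_ring_doorbell(volatile uint32_t *doorbell,
                                     uint32_t preserve, uint32_t write)
{
    assert(doorbell != NULL);
    *doorbell = pcc_doorbell_value(*doorbell, preserve, write);
}
```

With the example masks above (`0xFFFFFFFEU` and `0x00000001U`), bits [31:1] keep their current value and bit 0 is set.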
+ +### 3.4 When PCC Macros Can Stay as Placeholders + +If your platform: + +- Implements all MPAM MSCs as MMIO only, and +- Does not use PCC for other ACS-tested channels, + +then you can leave PCC macros as: + +```c +#define PLATFORM_PCC_SUBSPACE_COUNT 0x0 +/* or a dummy 0th subspace with REGs set to 0 / DEADDEAD as in RD-N2 example */ +``` + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/pcie.md b/docs/baremetal/porting-pal/platform-override-guides/pcie.md new file mode 100644 index 00000000..70463520 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/pcie.md @@ -0,0 +1,311 @@ +# Platform Override Configuration Guide — PCIe ECAM and PCIe Device Hierarchy + +This guide explains how to populate **platform override macros** for: +1) **PCIe ECAM windows** (configuration-space access) +2) **PCIe device hierarchy entries** (a static inventory used by the test/validation stack) + +The examples in this guide mirror the macro style based on RD N2, but the intent is to help you fill these fields for a **new platform**. + +--- + +## 1) What you are configuring + +### 1.1 ECAM windows (config space discovery) +An **ECAM window** defines where PCIe configuration space is memory-mapped for a given: +- **Segment Group** (PCI Segment) +- **Bus range** (Start Bus .. 
End Bus) +- **Base address** (ECAM base for bus 0 of that window) + +This is what allows software (OS / firmware / ACS) to do: +- `cfg_read(seg,bus,dev,func,offset)` → MMIO at ECAM base + B:D:F:offset + +### 1.2 PCIe device hierarchy table (static inventory) +Many validation frameworks keep a **platform override list** of PCIe devices to: +- sanity-check enumeration +- apply feature expectations (DMA, coherency, ATC, behind SMMU, P2P capability) +- drive directed tests (e.g., exerciser endpoints, bridges) + + +--- + +## 2) ECAM configuration + +Two kinds of ECAM-related macros: +- A set of “ECAM window” fields (`PCIE_ECAM_BASE_ADDR_n`, segment, start/end bus) +- A set of “ECAM0 host-bridge” resource fields (BAR windows, bus range, etc.) + +Treat them as: +- **(A)** Config-space mapping windows (what address range maps to config space) +- **(B)** Host-bridge resource windows (what address ranges are assigned to PCIe MMIO regions) + +### 2.1 ECAM window macros (config-space mapping) + +Example: + +```c +#define PLATFORM_OVERRIDE_PCIE_ECAM_BASE_ADDR_0 0x1010000000 +#define PLATFORM_OVERRIDE_PCIE_SEGMENT_GRP_NUM_0 0x0 +#define PLATFORM_OVERRIDE_PCIE_START_BUS_NUM_0 0x0 +#define PLATFORM_OVERRIDE_PCIE_END_BUS_NUM_0 0x8 +``` + +#### How to fill these for a new platform + +**(1) Determine segments (PCI segment groups)** +- If you have a single PCIe domain, segment is usually `0`. +- If you have multiple independent PCIe domains (common in server SoCs), you may have segments 0..N-1. + +**(2) Determine bus ranges per segment** +- Each root complex (or host bridge) typically owns a bus range. +- Bus numbering scheme is platform/firmware dependent. +- For static platforms, you often allocate **non-overlapping bus ranges** per host bridge. + +**(3) Determine ECAM base address** +From your system memory map / integration doc: +- ECAM base is where config space for **start_bus** begins. 
+- For standard ECAM mapping, each bus consumes **1 MiB**: + - bus stride = 1 MiB + - dev stride = 32 KiB + - func stride = 4 KiB + +So the ECAM size required for a bus range is: + +``` +ecam_size = (end_bus - start_bus + 1) * 1 MiB +``` + +**(4) Multiple windows with the same base address** +In the reference example (only window 0 is reproduced above): +- Window 0 covers buses 0..8 +- Window 1 covers buses 0x40..0x7F +- Both have the same base address + +This can occur if: +- your platform uses **non-contiguous bus numbering** but maps them into one ECAM aperture, or +- the override layer expects “logical windows” for test partitioning. + +For a new platform, prefer a clean mapping: +- Each window base points to the config space of that window’s first bus. +- Use separate bases if the hardware maps them separately. + +> Practical rule: if bus numbers are non-contiguous, only do this “same base, different bus ranges” if your ECAM mapping logic explicitly supports it. + +--- + +### 2.2 Host bridge (HB) and resource window macros + +Example (ECAM0 host bridge resource config): + +```c +#define PLATFORM_OVERRIDE_PCIE_ECAM0_HB_COUNT 1 +#define PLATFORM_OVERRIDE_PCIE_ECAM0_SEG_NUM 0x0 +#define PLATFORM_OVERRIDE_PCIE_ECAM0_START_BUS_NUM 0x0 +#define PLATFORM_OVERRIDE_PCIE_ECAM0_END_BUS_NUM 0x8 + +#define PLATFORM_OVERRIDE_PCIE_ECAM0_EP_BAR64 0x4000100000 +#define PLATFORM_OVERRIDE_PCIE_ECAM0_RP_BAR64 0x4000000000 +#define PLATFORM_OVERRIDE_PCIE_ECAM0_EP_NPBAR32 0x60000000 +#define PLATFORM_OVERRIDE_PCIE_ECAM0_EP_PBAR32 0x60600000 +#define PLATFORM_OVERRIDE_PCIE_ECAM0_RP_BAR32 0x60850000 +``` + +These are not ECAM addresses.
They are **MMIO apertures** reserved for PCIe: +- **EP_BAR64**: 64-bit prefetchable region for Endpoints (MMIO64, prefetch) +- **RP_BAR64**: 64-bit region for Root Ports +- **EP_NPBAR32**: 32-bit non-prefetchable endpoint region (MMIO32 NP) +- **EP_PBAR32**: 32-bit prefetchable endpoint region (MMIO32 P) +- **RP_BAR32**: 32-bit region used for root-port internal BARs / RP regs + +#### How to fill these for a new platform + +1) From your SoC memory map / firmware resource map, identify the **PCIe MMIO apertures**: +- A 64-bit prefetchable window (preferred for high BAR devices) +- Optionally a 64-bit non-prefetchable window (less common) +- 32-bit MMIO windows if your platform supports/needs them + +2) Ensure the windows do not overlap with: +- DRAM ranges +- device MMIO ranges (UART, GIC, SMMU, etc.) +- reserved regions + +3) Ensure the windows are aligned appropriately (typical): +- 64-bit windows aligned to their size (power-of-two is ideal) +- 32-bit windows aligned to at least 1 MiB or 64 KiB depending on platform conventions + +4) Match the firmware/OS assignment strategy: +- If firmware assigns resources dynamically, your overrides should reflect the **reserved apertures** the firmware uses. +- If you are in a bare-metal test environment, you must ensure the enumerator uses these windows consistently. + +--- + +### 2.3 Bare-metal enumeration limits + +Example: + +```c +#define PLATFORM_BM_OVERRIDE_PCIE_MAX_BUS 0x9 +#define PLATFORM_BM_OVERRIDE_PCIE_MAX_DEV 32 +#define PLATFORM_BM_OVERRIDE_PCIE_MAX_FUNC 8 +``` + +Guidance: +- `MAX_BUS` should be at least `(max_end_bus + 1)` across all windows used by BM tests. +- `MAX_DEV` is typically 32 (PCI spec max devices per bus). +- `MAX_FUNC` is typically 8 (PCI spec max functions per device). + +If your platform uses bus numbers > 0xFF, that’s not standard PCI; most platforms stay within 0..255. 
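The ECAM strides from section 2.1 and the `MAX_BUS` rule above can be checked with a few lines of arithmetic. A minimal sketch, assuming standard ECAM mapping in which the window base corresponds to `start_bus`; the function names are hypothetical and are not ACS APIs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECAM_BUS_STRIDE  (1ULL << 20) /* 1 MiB per bus */
#define ECAM_DEV_STRIDE  (1ULL << 15) /* 32 KiB per device */
#define ECAM_FUNC_STRIDE (1ULL << 12) /* 4 KiB per function */

/* Address of a config-space register inside one ECAM window. */
static inline uint64_t ecam_cfg_addr(uint64_t ecam_base, uint32_t start_bus,
                                     uint32_t bus, uint32_t dev, uint32_t func,
                                     uint32_t offset)
{
    assert(bus >= start_bus && bus <= 0xFF);
    assert(dev < 32 && func < 8 && offset < 4096);
    return ecam_base + (uint64_t)(bus - start_bus) * ECAM_BUS_STRIDE
                     + dev * ECAM_DEV_STRIDE
                     + func * ECAM_FUNC_STRIDE
                     + offset;
}

/* Size of the ECAM aperture needed for an inclusive bus range. */
static inline uint64_t ecam_window_size(uint32_t start_bus, uint32_t end_bus)
{
    assert(end_bus >= start_bus);
    return (uint64_t)(end_bus - start_bus + 1) * ECAM_BUS_STRIDE;
}

/* MAX_BUS must be at least (largest end bus + 1). */
static inline bool bm_max_bus_covers(uint32_t max_bus, uint32_t max_end_bus)
{
    return max_bus >= max_end_bus + 1;
}
```

For the window shown earlier (base `0x1010000000`, buses 0x0 to 0x8), the aperture is 9 MiB and a `MAX_BUS` of 0x9 satisfies the rule.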
+ +--- + +## 3) PCIe device hierarchy table configuration + +### 3.1 What each entry represents +Each `PLATFORM_PCIE_DEVn_*` entry describes one enumerated PCI function: +- location: Segment, Bus, Device, Function +- identity: Vendor ID, Device ID, Class Code +- behavior expectations: DMA, coherency, P2P, behind SMMU, ATC, 64-bit DMA + +Example: + +```c +#define PLATFORM_PCIE_DEV5_CLASSCODE 0xED000000 +#define PLATFORM_PCIE_DEV5_VENDOR_ID 0x13B5 +#define PLATFORM_PCIE_DEV5_DEV_ID 0xED01 +#define PLATFORM_PCIE_DEV5_BUS_NUM 2 +#define PLATFORM_PCIE_DEV5_DEV_NUM 0 +#define PLATFORM_PCIE_DEV5_FUNC_NUM 0 +#define PLATFORM_PCIE_DEV5_SEG_NUM 0 +#define PLATFORM_PCIE_DEV5_DMA_SUPPORT 1 +#define PLATFORM_PCIE_DEV5_DMA_COHERENT 0 +#define PLATFORM_PCIE_DEV5_P2P_SUPPORT 0 +#define PLATFORM_PCIE_DEV5_DMA_64BIT 0 +#define PLATFORM_PCIE_DEV5_BEHIND_SMMU 1 +#define PLATFORM_PCIE_DEV5_ATC_SUPPORT 0 +``` + +### 3.2 How to build the device list for a new platform + +Set: +```c +#define PLATFORM_PCIE_NUM_ENTRIES +``` + +#### Fill capability expectations +These flags are **expectations** used by tests. They must match the platform behavior. + +**DMA_SUPPORT** +- `1` if device can perform DMA (almost all endpoints do) +- `0` for bridges / ports that don’t initiate DMA + +**DMA_COHERENT** +- `1` if the device is cache-coherent with CPU (e.g., CXL.cache capable devices, or coherent PCIe endpoints on a coherent interconnect) +- `0` otherwise + +**DMA_64BIT** +- `1` if device supports 64-bit DMA addressing (common for modern devices) +- `0` if restricted to 32-bit addressing + +**BEHIND_SMMU** +- `1` if the device’s transactions are translated/managed by an SMMU (typical on Arm servers) +- `0` if it bypasses SMMU + +How to decide: +- Use IORT/SMMU topology: if the requester ID maps through an SMMU node, it’s “behind SMMU”. +- Some devices (e.g., host bridge internal functions) may bypass translation. 
+ +**ATC_SUPPORT** +- `1` if device supports Address Translation Cache (ATS/ATC capability) and platform enables it +- `0` otherwise +Notes: +- ATC/ATS typically depends on both endpoint capability and SMMU/RC support. +- If you mark ATC supported but firmware doesn’t enable ATS, tests may fail. + +**P2P_SUPPORT** +- `1` if the device supports/permits peer-to-peer transactions in your platform configuration +- `0` otherwise +This is platform-policy dependent. Many validation setups set P2P not supported unless explicitly enabled. + +You also have a global: +```c +#define PLATFORM_PCIE_P2P_NOT_SUPPORTED 1 +``` +Treat this as “platform-level policy”: if set, you likely want `*_P2P_SUPPORT` = 0 for all entries except explicit exceptions. + +--- + +## 4) Common pitfalls and how to avoid them + +### 4.1 ECAM base doesn’t match bus numbering +Symptoms: +- config reads return 0xFFFF_FFFF +- enumeration only sees bus 0 or fails beyond a bus + +Fix: +- Validate ECAM base + bus stride mapping: + - bus X config space must be at `ecam_base + (X - start_bus) * 1MiB` + +### 4.2 End bus too small +Symptoms: +- downstream devices not discovered +Fix: +- Ensure `END_BUS_NUM` covers the full topology, including bridges/switches. + +### 4.3 Resource windows overlap or are too small +Symptoms: +- BAR assignment failures +- devices disable memory space enable (MSE), or ACS exerciser failures +Fix: +- Increase EP/RP MMIO apertures and ensure alignment / non-overlap. + +### 4.4 Wrong “behind SMMU” / ATS expectations +Symptoms: +- SMMU tests fail, DMA faults, PRI/ATS tests fail +Fix: +- Cross-check with IORT stream-ID mapping and SMMU configuration. +- Only enable ATC if both endpoint and platform enable it. 
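The overlap pitfall in 4.3 can be caught before running any test with a simple range check. A minimal sketch; the override macros shown in this guide carry only base addresses, so the aperture sizes passed in are assumptions you must take from your own memory map, and `mmio_range_t` and `ranges_overlap` are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One MMIO aperture: base address plus size in bytes. */
typedef struct {
    uint64_t base;
    uint64_t size;
} mmio_range_t;

/* True when the half-open ranges [base, base + size) intersect. */
static inline bool ranges_overlap(mmio_range_t a, mmio_range_t b)
{
    assert(a.size > 0 && b.size > 0);
    return a.base < b.base + b.size && b.base < a.base + a.size;
}
```

Running this pairwise over the EP/RP apertures, the ECAM windows, and DRAM ranges gives a quick check that no two regions collide.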
+ +--- + +## 5) Minimal templates you can copy/paste + +### 5.1 ECAM window template +```c +#define PLATFORM_OVERRIDE_PCIE_ECAM_BASE_ADDR_ +#define PLATFORM_OVERRIDE_PCIE_SEGMENT_GRP_NUM_ +#define PLATFORM_OVERRIDE_PCIE_START_BUS_NUM_ +#define PLATFORM_OVERRIDE_PCIE_END_BUS_NUM_ +``` + +### 5.2 Host bridge resource windows template +```c +#define PLATFORM_OVERRIDE_PCIE_ECAM_HB_COUNT +#define PLATFORM_OVERRIDE_PCIE_ECAM_SEG_NUM +#define PLATFORM_OVERRIDE_PCIE_ECAM_START_BUS_NUM +#define PLATFORM_OVERRIDE_PCIE_ECAM_END_BUS_NUM + +#define PLATFORM_OVERRIDE_PCIE_ECAM_EP_BAR64 +#define PLATFORM_OVERRIDE_PCIE_ECAM_RP_BAR64 +#define PLATFORM_OVERRIDE_PCIE_ECAM_EP_NPBAR32 +#define PLATFORM_OVERRIDE_PCIE_ECAM_EP_PBAR32 +#define PLATFORM_OVERRIDE_PCIE_ECAM_RP_BAR32 +``` + +### 5.3 PCIe device entry template +```c +#define PLATFORM_PCIE_DEV_CLASSCODE +#define PLATFORM_PCIE_DEV_VENDOR_ID +#define PLATFORM_PCIE_DEV_DEV_ID +#define PLATFORM_PCIE_DEV_BUS_NUM +#define PLATFORM_PCIE_DEV_DEV_NUM +#define PLATFORM_PCIE_DEV_FUNC_NUM +#define PLATFORM_PCIE_DEV_SEG_NUM + +#define PLATFORM_PCIE_DEV_DMA_SUPPORT <0_or_1> +#define PLATFORM_PCIE_DEV_DMA_COHERENT <0_or_1> +#define PLATFORM_PCIE_DEV_DMA_64BIT <0_or_1> +#define PLATFORM_PCIE_DEV_BEHIND_SMMU <0_or_1> +#define PLATFORM_PCIE_DEV_ATC_SUPPORT <0_or_1> +#define PLATFORM_PCIE_DEV_P2P_SUPPORT <0_or_1> +``` + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/pe-gic.md b/docs/baremetal/porting-pal/platform-override-guides/pe-gic.md new file mode 100644 index 00000000..c93845c5 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/pe-gic.md @@ -0,0 +1,434 @@ +# PE & GIC Platform Configuration Guide (MADT-Based, ACS Platform Override Macros) + +This note explains how to populate **Processing Element (PE)** and **GIC** platform configuration macros (like those used in SBSA/BSA ACS) for a new ARM-based platform. + +It connects the ACPI MADT definitions (GICC, GICD, GICR, ITS, etc.) 
to ACS-style overrides such as: + +```c +/* PE platform config parameters */ +#define PLATFORM_OVERRIDE_PE_CNT +#define PLATFORM_OVERRIDE_PE0_INDEX +#define PLATFORM_OVERRIDE_PE0_MPIDR +#define PLATFORM_OVERRIDE_PE0_PMU_GSIV +#define PLATFORM_OVERRIDE_PE0_GMAIN_GSIV +#define PLATFORM_OVERRIDE_PE0_TRBE_INTR +... +/* GIC platform config parameters */ +#define PLATFORM_OVERRIDE_GIC_VERSION +#define PLATFORM_OVERRIDE_CORE_COUNT +#define PLATFORM_OVERRIDE_CLUSTER_COUNT +#define PLATFORM_OVERRIDE_GICC_COUNT +#define PLATFORM_OVERRIDE_GICD_COUNT +... +``` + +Using RD-N2 as an example, we’ll walk through what each thing means and how you should fill it for a new platform. + +--- + +## 1. Conceptual Overview + +For ARM systems using the **GIC** interrupt model, ACPI MADT provides: + +- One **GICC structure** per logical processor (PE). +- One **GICD structure** describing the Distributor. +- Zero or more **GICR structures** describing redistributor discovery ranges (GICv3+). +- Zero or more **GIC ITS structures** describing ITS units. + +Your platform override **PE** config is essentially a distilled view of: + +- Which logical CPUs (PEs) exist and their **MPIDR** values. +- For each PE, which **GSIVs** correspond to: + - PMU interrupts, + - GIC main maintenance interrupts, + - Optional TRBE (Trace Buffer Extension) interrupts. + +The **GIC** config macros summarize: + +- GIC version and topology (core / cluster count). +- Count and base addresses of GICC, GICD, GICR, ITS, and any virtualization components (GICH). + +The key idea: **PE + GIC config must be consistent with MADT (and other GIC-related tables) that your platform exposes.** + +--- + +## 2. PE Configuration (PLATFORM_OVERRIDE_PE* Macros) + +### 2.1 What a “PE” is here + +In ACS tests, a **Processing Element** (PE) means a logical CPU that: + +- Has a **GICC entry** in MADT (type 0xB), and +- Has a valid **MPIDR** and is “Enabled” in the GICC Flags. 
+ +The ACS PE array is effectively: + +```c +struct { + UINT32 Index; // logical index 0..N-1 + UINT64 Mpidr; // MPIDR_EL1 value of this core + UINT32 PmuGsiv; // PMU interrupt GSIV used by this core + UINT32 GMainGsiv; // GIC maintenance / main GSIV (if relevant) + UINT32 TrbeGsiv; // TRBE interrupt GSIV (if supported) +} PlatformPe[]; +``` + +Mapped to macros like: + +```c +#define PLATFORM_OVERRIDE_PE0_INDEX 0x0 +#define PLATFORM_OVERRIDE_PE0_MPIDR 0x0 +#define PLATFORM_OVERRIDE_PE0_PMU_GSIV 0x17 +#define PLATFORM_OVERRIDE_PE0_GMAIN_GSIV 0x19 +#define PLATFORM_OVERRIDE_PE0_TRBE_INTR 0x1A +``` + +### 2.2 High-Level Steps for PE Config + +For a new platform: + +1. **Enumerate all logical CPUs (PEs) you want ACS to test** + - Typically all cores present at boot and marked Enabled in MADT GICC structures. + +2. **Determine MPIDR for each PE** + - From hardware view (cluster/core topology). + - Must match the `MPIDR` field in the GICC structure of that PE. + +3. **Determine per-PE interrupt GSIVs** + - **PMU interrupt GSIV**: the PPI/IRQ used as PMU interrupt for that PE (often same number for all cores, e.g., a PPI). + - **GMAIN GSIV**: often used for GIC virtual maintenance interrupt or similar; in RD-N2, it is the GIC maintenance PPI. + - **TRBE interrupt GSIV**: if TRBE is implemented, this is the per-core TRBE PPI value; otherwise 0. + +4. **Assign a simple index ordering** + - `PLATFORM_OVERRIDE_PE_INDEX` is typically equal to `i` (0,1,2,…). + - Order should match GICC ordering in MADT for sanity (and to match many OS expectations). + +5. **Set total PE count** + +```c +#define PLATFORM_OVERRIDE_PE_CNT +``` + +### 2.3 Mapping MPIDR from Topology + +In RD-N2 example: + +```c +#define PLATFORM_OVERRIDE_PE0_MPIDR 0x0 +#define PLATFORM_OVERRIDE_PE1_MPIDR 0x10000 +#define PLATFORM_OVERRIDE_PE2_MPIDR 0x20000 +... 
+#define PLATFORM_OVERRIDE_PE15_MPIDR 0xF0000 +``` + +This pattern corresponds to: + +- A topology in which the PE number is carried in **Aff2** (bits [23:16]) while Aff1 (bits [15:8]) and Aff0 (bits [7:0]) are zero. Which affinity level denotes a core and which denotes a cluster depends on the implementation. +- MPIDR encoding: + - On Armv8: + - Bits [39:32] Aff3 + - Bits [23:16] Aff2 + - Bits [15:8] Aff1 + - Bits [7:0] Aff0 + +Example: `0x0000000000010000` sets bit 16, so Aff2=1, Aff1=0, Aff0=0. + +For your platform: + +1. Decide which affinity fields (Aff0 to Aff3) carry the core and cluster numbers on your hardware. +2. For each PE, compute the MPIDR value that your hardware uses. +3. Ensure that **this MPIDR matches the value stored in MADT GICC.MPIDR field**. +4. Copy it into `PLATFORM_OVERRIDE_PEx_MPIDR` macros. + +### 2.4 PMU, GMAIN, TRBE GSIV values + +- **GSIV** (Global System Interrupt Vector) = GIC INTID (for SPIs/PPIs) as seen by the OS. +- These are per-core **PPIs** in most designs: + - `PMU` PPI: Standard Arm recommended PPI for PMU (e.g., INTID 23 = 0x17). + - `VGIC maintenance` PPI: typically INTID 25 = 0x19 (example). + - `TRBE` PPI: specific INTID for Trace Buffer Extension. + +You must: + +1. Look at your **GIC configuration** (or your firmware/source) to know which PPIs are used for these interrupts. +2. Ensure that **the same GSIV values** are used consistently in: + - GICC structure fields (Performance Interrupt GSIV, VGIC Maintenance GSIV, SPE/TRBE fields, etc., if used). + - Your platform override macros. + +In RD-N2, all cores share the same GSIV values: + +```c +#define PLATFORM_OVERRIDE_PE0_PMU_GSIV 0x17 +#define PLATFORM_OVERRIDE_PE0_GMAIN_GSIV 0x19 +#define PLATFORM_OVERRIDE_PE0_TRBE_INTR 0x1A +...
+``` + +For a new platform, if all cores use the following PPIs (hypothetical values chosen inside the PPI range, INTIDs 16 to 31): + +- PMU PPI INTID = 0x14 (20 decimal) +- VGIC maintenance PPI INTID = 0x15 (21 decimal) +- TRBE PPI INTID = 0x16 (22 decimal) + +then your macros would be (for each PE): + +```c +#define PLATFORM_OVERRIDE_PE0_PMU_GSIV 0x14 +#define PLATFORM_OVERRIDE_PE0_GMAIN_GSIV 0x15 +#define PLATFORM_OVERRIDE_PE0_TRBE_INTR 0x16 +``` + +If your design uses **per-core unique SPIs** for PMU (less common), you’d fill each PE’s GSIV accordingly. + +### 2.5 Worked Example: 4-Core, Single-Cluster Platform + +Assume: + +- 4 cores, single cluster (Cluster ID=0). +- MPIDR encoding: Aff1 = clusterID, Aff0 = coreID. + - CPU0: MPIDR = 0x0000_0000_0000_0000 + - CPU1: MPIDR = 0x0000_0000_0000_0001 + - CPU2: MPIDR = 0x0000_0000_0000_0002 + - CPU3: MPIDR = 0x0000_0000_0000_0003 +- PMU PPI: 0x17, GMAIN PPI: 0x19, TRBE PPI: 0x1A (same as RD-N2). + +You’d define: + +```c +#define PLATFORM_OVERRIDE_PE_CNT 4 + +#define PLATFORM_OVERRIDE_PE0_INDEX 0x0 +#define PLATFORM_OVERRIDE_PE0_MPIDR 0x0 +#define PLATFORM_OVERRIDE_PE0_PMU_GSIV 0x17 +#define PLATFORM_OVERRIDE_PE0_GMAIN_GSIV 0x19 +#define PLATFORM_OVERRIDE_PE0_TRBE_INTR 0x1A + +#define PLATFORM_OVERRIDE_PE1_INDEX 0x1 +#define PLATFORM_OVERRIDE_PE1_MPIDR 0x1 +#define PLATFORM_OVERRIDE_PE1_PMU_GSIV 0x17 +#define PLATFORM_OVERRIDE_PE1_GMAIN_GSIV 0x19 +#define PLATFORM_OVERRIDE_PE1_TRBE_INTR 0x1A + +#define PLATFORM_OVERRIDE_PE2_INDEX 0x2 +#define PLATFORM_OVERRIDE_PE2_MPIDR 0x2 +#define PLATFORM_OVERRIDE_PE2_PMU_GSIV 0x17 +#define PLATFORM_OVERRIDE_PE2_GMAIN_GSIV 0x19 +#define PLATFORM_OVERRIDE_PE2_TRBE_INTR 0x1A + +#define PLATFORM_OVERRIDE_PE3_INDEX 0x3 +#define PLATFORM_OVERRIDE_PE3_MPIDR 0x3 +#define PLATFORM_OVERRIDE_PE3_PMU_GSIV 0x17 +#define PLATFORM_OVERRIDE_PE3_GMAIN_GSIV 0x19 +#define PLATFORM_OVERRIDE_PE3_TRBE_INTR 0x1A +``` + +The corresponding MADT GICC entries must use the same MPIDR values and GSIVs. + +--- + +## 3.
GIC Configuration (PLATFORM_OVERRIDE_GIC* Macros) + +The GIC macros describe the **global interrupt controller topology** in a compressed way. MADT already provides detailed structures: + +- **GICC** (type 0xB) – per-CPU interfaces. +- **GICD** (type 0xC) – distributor. +- **GIC MSI Frame** (type 0xD) – MSI frames. +- **GICR** (type 0xE) – redistributor discovery range. +- **GIC ITS** (type 0xF) – ITS units. + +Your RD-N2 config: + +```c +#define PLATFORM_OVERRIDE_GIC_VERSION 0x3 +#define PLATFORM_OVERRIDE_CORE_COUNT 0x4 +#define PLATFORM_OVERRIDE_CLUSTER_COUNT 0x2 +#define PLATFORM_OVERRIDE_GICC_COUNT 16 +#define PLATFORM_OVERRIDE_GICD_COUNT 0x1 +#define PLATFORM_OVERRIDE_GICC_GICRD_COUNT 0x0 +#define PLATFORM_OVERRIDE_GICR_GICRD_COUNT 0x1 +#define PLATFORM_OVERRIDE_GICITS_COUNT 0x6 +#define PLATFORM_OVERRIDE_GICH_COUNT 0x1 +#define PLATFORM_OVERRIDE_GICMSIFRAME_COUNT 0x0 +#define PLATFORM_OVERRIDE_NONGIC_COUNT 0x0 + +#define PLATFORM_OVERRIDE_GICC_BASE 0x30000000 +#define PLATFORM_OVERRIDE_GICD_BASE 0x30000000 +#define PLATFORM_OVERRIDE_GICC_GICRD_BASE 0x0 +#define PLATFORM_OVERRIDE_GICR_GICRD_BASE 0x301C0000 +#define PLATFORM_OVERRIDE_GICH_BASE 0x2C010000 +#define PLATFORM_OVERRIDE_GICITS0_BASE 0x30040000 +#define PLATFORM_OVERRIDE_GICITS0_ID 0 +#define PLATFORM_OVERRIDE_GICITS1_BASE 0x30080000 +#define PLATFORM_OVERRIDE_GICITS1_ID 0x1 +#define PLATFORM_OVERRIDE_GICITS2_BASE 0x300C0000 +#define PLATFORM_OVERRIDE_GICITS2_ID 0x2 +#define PLATFORM_OVERRIDE_GICITS3_BASE 0x30100000 +#define PLATFORM_OVERRIDE_GICITS3_ID 0x3 +#define PLATFORM_OVERRIDE_GICITS4_BASE 0x30140000 +#define PLATFORM_OVERRIDE_GICITS4_ID 0x4 +#define PLATFORM_OVERRIDE_GICITS5_BASE 0x30180000 +#define PLATFORM_OVERRIDE_GICITS5_ID 0x5 +#define PLATFORM_OVERRIDE_GICCIRD_LENGTH 0x0 +#define PLATFORM_OVERRIDE_GICRIRD_LENGTH (0x20000*8) +``` + +### 3.1 GIC Version and Counts + +For a new platform: + +1. 
**Determine GIC version** + - From hardware / SoC manual or GICD version register: + - 0x01 – GICv1 + - 0x02 – GICv2 + - 0x03 – GICv3 + - 0x04 – GICv4 + - Set `PLATFORM_OVERRIDE_GIC_VERSION` accordingly. + +2. **Core and cluster counts** + - `PLATFORM_OVERRIDE_CORE_COUNT` – number of cores per cluster (or total cores; check ACS reference, but RD-N2 uses 0x4 with 16 PEs/2 clusters). + - `PLATFORM_OVERRIDE_CLUSTER_COUNT` – number of CPU clusters. + - Combined with MPIDRs, ACS uses this to derive topology. + +3. **GICC count** + - `PLATFORM_OVERRIDE_GICC_COUNT` = number of **GICC structures** in MADT (i.e., number of logical CPUs described). + - Typically equals `PLATFORM_OVERRIDE_PE_CNT` or larger if some PEs are disabled. + +4. **GICD count** + - On most ARM systems, there is **one** GIC distributor → `PLATFORM_OVERRIDE_GICD_COUNT = 0x1`. + +5. **GICR and GICRD counts** + - GICv3+ may use: + + - GICC GICRD base (GICR base per CPU interface in the GICC) or + - A separate **GICR structure** with a discovery range (as in RD-N2). + + - `PLATFORM_OVERRIDE_GICC_GICRD_COUNT` – number of redistributors described via GICC structures. + - `PLATFORM_OVERRIDE_GICR_GICRD_COUNT` – number of redistributor discovery ranges described via GICR structures. + +6. **ITS, GICH, MSI frames, non-GIC controllers** + - `PLATFORM_OVERRIDE_GICITS_COUNT` – number of GIC ITS units (MADT type 0xF). + - `PLATFORM_OVERRIDE_GICH_COUNT` – 1 if virtualization control interface exists and is mapped; 0 otherwise. + - `PLATFORM_OVERRIDE_GICMSIFRAME_COUNT` – number of GIC MSI frames (MADT type 0xD). + - `PLATFORM_OVERRIDE_NONGIC_COUNT` – used if there are additional interrupt controllers. + +### 3.2 GIC Base Addresses + +Fill these from the actual SoC memory map (and ensure consistency with MADT): + +- `PLATFORM_OVERRIDE_GICD_BASE` – **GICD Physical Base Address** from MADT GICD structure. +- `PLATFORM_OVERRIDE_GICC_BASE` – **legacy GICC base** (for GICv2 compatibility). 
On pure GICv3 systems with redistributors, may be 0 or unused. +- `PLATFORM_OVERRIDE_GICC_GICRD_BASE` – base of GICR when described in GICC. +- `PLATFORM_OVERRIDE_GICR_GICRD_BASE` – base of GICR discovery range from GICR structure. +- `PLATFORM_OVERRIDE_GICH_BASE` – GIC virtual interface control block registers. +- `PLATFORM_OVERRIDE_GICITSx_BASE` – base of each GIC ITS unit from ITS structures. +- `PLATFORM_OVERRIDE_GICITSx_ID` – ITS ID from ITS structures. +- `PLATFORM_OVERRIDE_GICRIRD_LENGTH` – length of redistributor discovery range. +- `PLATFORM_OVERRIDE_GICCIRD_LENGTH` – length if redistributors are described via GICC. + +All of these must match the addresses and IDs you program in hardware and describe in MADT. + +### 3.3 Worked Example: Simple GICv3 Platform + +Assume a new platform with: + +- 8 cores in 2 clusters (4 cores each). +- GICv3, one Distributor, one Redistributor discovery range, two ITS units. +- Memory map (example): + - GICD base: `0x2F000000` + - GICR redistributor discovery base: `0x2F100000`, length: `0x20000 * 8` + - GICH base: `0x2F200000` + - ITS0 base: `0x2F400000`, ITS0_ID = 0 + - ITS1 base: `0x2F500000`, ITS1_ID = 1 + +Then define: + +```c +#define PLATFORM_OVERRIDE_GIC_VERSION 0x3 +#define PLATFORM_OVERRIDE_CORE_COUNT 0x4 // 4 cores/cluster +#define PLATFORM_OVERRIDE_CLUSTER_COUNT 0x2 // 2 clusters +#define PLATFORM_OVERRIDE_GICC_COUNT 8 // 8 logical CPUs +#define PLATFORM_OVERRIDE_GICD_COUNT 0x1 +#define PLATFORM_OVERRIDE_GICC_GICRD_COUNT 0x0 +#define PLATFORM_OVERRIDE_GICR_GICRD_COUNT 0x1 +#define PLATFORM_OVERRIDE_GICITS_COUNT 0x2 +#define PLATFORM_OVERRIDE_GICH_COUNT 0x1 +#define PLATFORM_OVERRIDE_GICMSIFRAME_COUNT 0x0 +#define PLATFORM_OVERRIDE_NONGIC_COUNT 0x0 + +#define PLATFORM_OVERRIDE_GICC_BASE 0x0 // if not used +#define PLATFORM_OVERRIDE_GICD_BASE 0x2F000000 +#define PLATFORM_OVERRIDE_GICC_GICRD_BASE 0x0 +#define PLATFORM_OVERRIDE_GICR_GICRD_BASE 0x2F100000 +#define PLATFORM_OVERRIDE_GICH_BASE 0x2F200000 + +#define 
PLATFORM_OVERRIDE_GICITS0_BASE 0x2F400000 +#define PLATFORM_OVERRIDE_GICITS0_ID 0x0 +#define PLATFORM_OVERRIDE_GICITS1_BASE 0x2F500000 +#define PLATFORM_OVERRIDE_GICITS1_ID 0x1 + +#define PLATFORM_OVERRIDE_GICCIRD_LENGTH 0x0 +#define PLATFORM_OVERRIDE_GICRIRD_LENGTH (0x20000 * 8) +``` + +Your MADT GICD, GICR, ITS entries must use the same base addresses and IDs. + +--- + +## 4. How MADT GICC / GICD / GICR / ITS Tie Back to PE & GIC Macros + +- **GICC Structures** + - Provide: + - ACPI Processor UID (matches Device(_HID=ACPI0007, _UID=N)). + - CPU Interface number. + - Performance interrupt GSIV. + - GIC virtual maintenance interrupt GSIV. + - MPIDR. + - GICR base address (for GICv3, unless GICR discovery range is used). + - Consistency requirements: + - `PLATFORM_OVERRIDE_PEx_MPIDR` == `GICC[i].MPIDR` for that PE. + - `PLATFORM_OVERRIDE_PEx_PMU_GSIV` == `GICC[i].Performance Interrupt GSIV` (if populated). + - `PLATFORM_OVERRIDE_PEx_GMAIN_GSIV` == `GICC[i].VGIC Maintenance GSIV` (if populated). + - `PLATFORM_OVERRIDE_PEx_TRBE_INTR` == GICC[i].TRBE interrupt field (if TRBE is present). + +- **GICD Structure** + - Provides the **Distributor base address** and GIC version. + - Must match `PLATFORM_OVERRIDE_GICD_BASE` and `PLATFORM_OVERRIDE_GIC_VERSION`. + +- **GICR Structures** + - Provide **redistributor discovery base** and length. + - Must match `PLATFORM_OVERRIDE_GICR_GICRD_BASE` and `PLATFORM_OVERRIDE_GICRIRD_LENGTH`. + +- **ITS Structures** + - Provide base address and ITS ID. + - Must match `PLATFORM_OVERRIDE_GICITSx_BASE` and `PLATFORM_OVERRIDE_GICITSx_ID` for each ITS. + +When these are consistent, ACS PE/GIC tests can discover and exercise the same topology that the OS will see via ACPI. + +--- + +## 5. Implementation Checklist + +### PE + +- [ ] `PLATFORM_OVERRIDE_PE_CNT` equals number of logical CPUs described by MADT GICC entries with Enabled=1. +- [ ] For each PE `i`: + - [ ] `PLATFORM_OVERRIDE_PEi_INDEX` is unique (usually i). 
+ - [ ] `PLATFORM_OVERRIDE_PEi_MPIDR` equals GICC[i].MPIDR. + - [ ] `PLATFORM_OVERRIDE_PEi_PMU_GSIV` equals Performance Interrupt GSIV from GICC (if populated). + - [ ] `PLATFORM_OVERRIDE_PEi_GMAIN_GSIV` equals VGIC Maintenance GSIV from GICC (if populated). + - [ ] `PLATFORM_OVERRIDE_PEi_TRBE_INTR` equals TRBE interrupt GSIV (if TRBE supported), else 0. + +### GIC + +- [ ] `PLATFORM_OVERRIDE_GIC_VERSION` matches the actual hardware GIC version and MADT GICD “GIC version” field. +- [ ] Core and cluster counts align with MPIDR encoding and the number of GICC entries. +- [ ] `PLATFORM_OVERRIDE_GICC_COUNT` equals number of MADT GICC entries. +- [ ] `PLATFORM_OVERRIDE_GICD_COUNT` is 1 for a single distributor system. +- [ ] `PLATFORM_OVERRIDE_GICC_GICRD_COUNT` and `PLATFORM_OVERRIDE_GICR_GICRD_COUNT` match how redistributors are described (GICC vs GICR). +- [ ] `PLATFORM_OVERRIDE_GICITS_COUNT` equals number of MADT ITS entries. +- [ ] `PLATFORM_OVERRIDE_GICMSIFRAME_COUNT` equals number of MADT MSI Frame entries, if any. +- [ ] `PLATFORM_OVERRIDE_NONGIC_COUNT` is set appropriately if you have additional interrupt controllers. +- [ ] All base addresses (`GICD`, `GICC`, `GICR`, `GICH`, `ITS`) match the physical addresses in the SoC memory map and MADT. +- [ ] Redistributor discovery length is correctly set to cover all redistributors. + +Once you follow this mapping, your PE and GIC platform overrides will faithfully reflect the MADT (and hardware), and SBSA/BSA ACS tests for PE and GIC should be able to run on a new platform without surprises. 
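The MPIDR items in the checklist are easier to review with the affinity fields decoded. A minimal sketch, assuming the Armv8 MPIDR_EL1 field positions listed in section 2.3 (Aff0 at bits [7:0], Aff1 at [15:8], Aff2 at [23:16], Aff3 at [39:32]); `mpidr_aff_t`, `mpidr_decode`, and `mpidr_encode` are hypothetical names, not ACS APIs:

```c
#include <assert.h>
#include <stdint.h>

/* Affinity fields of an MPIDR value. */
typedef struct {
    uint8_t aff0;
    uint8_t aff1;
    uint8_t aff2;
    uint8_t aff3;
} mpidr_aff_t;

/* Split an MPIDR value into its four affinity fields. */
static inline mpidr_aff_t mpidr_decode(uint64_t mpidr)
{
    mpidr_aff_t a;
    a.aff0 = (uint8_t)(mpidr & 0xFF);
    a.aff1 = (uint8_t)((mpidr >> 8) & 0xFF);
    a.aff2 = (uint8_t)((mpidr >> 16) & 0xFF);
    a.aff3 = (uint8_t)((mpidr >> 32) & 0xFF);
    return a;
}

/* Build the affinity portion of an MPIDR value from its fields. */
static inline uint64_t mpidr_encode(uint32_t aff3, uint32_t aff2,
                                    uint32_t aff1, uint32_t aff0)
{
    assert(aff3 < 256 && aff2 < 256 && aff1 < 256 && aff0 < 256);
    return ((uint64_t)aff3 << 32) | ((uint64_t)aff2 << 16) |
           ((uint64_t)aff1 << 8) | (uint64_t)aff0;
}
```

Decoding each `PLATFORM_OVERRIDE_PEx_MPIDR` value this way makes it straightforward to compare the per-level affinity numbers against the core and cluster counts.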
diff --git a/docs/baremetal/porting-pal/platform-override-guides/pmu.md b/docs/baremetal/porting-pal/platform-override-guides/pmu.md new file mode 100644 index 00000000..2e21c090 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/pmu.md @@ -0,0 +1,273 @@ +# Platform Override Configuration Guide — PMU Nodes (CoreSight-based PMUs) + +This document explains how to populate **PMU-related platform override macros** for a new platform, using the **RD-N2 reference values** as a baseline together with the Arm CoreSight-based PMU ACPI specification as the architectural basis. + +This guide targets **system or auxiliary PMUs** (memory controller, SMMU, PCIe RC, cache, or other device PMUs) that implement a CoreSight-style register interface and therefore require explicit entries in the platform override data so the ACS can discover them. + +> Scope note +> Architectural CPU PMUs (PMUv3 instances in the processor cores) are discovered directly through architectural registers and do **not** need entries in this override. Only CoreSight-style PMU blocks attached to SoC components are described here. + +--- + +## 1) What this configuration describes + +Each PMU node represents: +- one CoreSight-based PMU block, +- associated with a specific **system component** (memory controller, SMMU, PCIe root complex, cache, or ACPI device), +- with MMIO register base(s), +- optional overflow interrupt, +- and a defined **affinity** to a processor or processor container. + +The platform override macros are used to construct a table of PMU nodes that an ACS PMU driver can enumerate and bind to component-specific event sets. 
+ +--- + +## 2) High‑level structure of the override + +From reference: + +```c +#define MAX_NUM_OF_PMU_SUPPORTED 512 +#define PLATFORM_OVERRIDE_PMU_NODE_CNT 0x1 + +#define PLATFORM_PMU_NODE0_BASE0 0x1010028000 +#define PLATFORM_PMU_NODE0_BASE1 0x0 +#define PLATFORM_PMU_NODE0_TYPE 0x2 +#define PLATFORM_PMU_NODE0_PRI_INSTANCE 0x0 +#define PLATFORM_PMU_NODE0_SEC_INSTANCE 0x0 +#define PLATFORM_PMU_NODE0_DUAL_PAGE_EXT 0x0 +``` + +This breaks down into: + +1. **Global limits / counts** +2. **Per‑PMU node properties** + - base address(es) + - node type (what component the PMU belongs to) + - node instance identifiers + - feature flags (dual‑page support, etc.) + +--- + +## 3) Global PMU limits and counts + +### 3.1 `MAX_NUM_OF_PMU_SUPPORTED` + +```c +#define MAX_NUM_OF_PMU_SUPPORTED 512 +``` + +This is a **platform or framework upper bound**, not the actual number of PMUs present. + +Guidance: +- Set this to a value comfortably larger than the maximum PMUs your platform could ever expose. +- It is often used for static array sizing. + +--- + +### 3.2 `PLATFORM_OVERRIDE_PMU_NODE_CNT` + +```c +#define PLATFORM_OVERRIDE_PMU_NODE_CNT 0x1 +``` + +This is the **actual number of PMU nodes** you describe. + +For a new platform: +- Count each CoreSight-based PMU block you want the ACS to see. +- Typical systems may have: + - one PMU per memory controller, + - one PMU per PCIe root complex, + - optional PMUs for SMMUs or caches. + +--- + +## 4) Per‑PMU node fields + +Each PMU node is described by a group of macros indexed by `NODE`. + +--- + +### 4.1 Base address fields + +```c +#define PLATFORM_PMU_NODE0_BASE0 0x1010028000 +#define PLATFORM_PMU_NODE0_BASE1 0x0 +``` + +These correspond to the PMU register pages: + +- **BASE0** + Base address of **Page 0** of the PMU register space. + If the PMU is a single‑page implementation, this is the only base you need. + +- **BASE1** + Base address of **Page 1**, used only if the PMU supports the **dual‑page extension**. 
+ +How to fill for a new platform: +- Consult the SoC TRM / integration manual for the PMU block. +- If the PMU implements only one page: + - set `BASE1 = 0` + - clear the dual‑page flag. +- If dual‑page is implemented: + - set `BASE1` to the Page‑1 base address. + +--- + +### 4.2 Dual‑page extension flag + +```c +#define PLATFORM_PMU_NODE0_DUAL_PAGE_EXT 0x0 +``` + +This indicates whether the PMU supports the **dual‑page register layout**. + +- `0` → single‑page PMU +- `1` → dual‑page PMU (Page 0 + Page 1) + +This flag must match the hardware implementation; otherwise register access will be incorrect. + +--- + +### 4.3 Node type + +```c +#define PLATFORM_PMU_NODE0_TYPE 0x2 +``` + +The node type defines **which system component** the PMU is associated with. + +Common values include: + +| Value | Meaning | +|------:|--------| +| `0x00` | Memory controller | +| `0x01` | SMMU | +| `0x02` | PCIe root complex | +| `0x03` | ACPI device | +| `0x04` | CPU cache | + +In the RD‑N2 example: +- `0x2` means the PMU is associated with a **PCIe root complex**. + +For a new platform: +- Choose the node type based on **what block generates the PMU events**. +- This choice determines how the ACS interprets standard events and masks. + +--- + +### 4.4 Primary and secondary node instance fields + +```c +#define PLATFORM_PMU_NODE0_PRI_INSTANCE 0x0 +#define PLATFORM_PMU_NODE0_SEC_INSTANCE 0x0 +``` + +These fields **disambiguate which instance** of a component the PMU belongs to. 
+ +Their meaning depends on the node type: + +#### Examples + +- **Memory controller (0x00)** + - Primary instance = memory proximity domain (from SRAT) + - Secondary = 0 + +- **SMMU (0x01)** + - Primary instance = Identifier of the SMMU node in the IORT + - Secondary = 0 + +- **PCIe root complex (0x02)** + - Primary instance = Identifier of the RC node in the IORT + - Secondary = 0 + +- **ACPI device (0x03)** + - Primary instance = device `_HID` + - Secondary instance = device `_UID` + +- **CPU cache (0x04)** + - Primary = 0 + - Secondary = Cache ID from the cache topology configuration + +In RD‑N2: +- `PRI_INSTANCE = 0x0` means the PMU is tied to the IORT identifier `0` of the PCIe RC. + +--- + +## 5) Interrupt considerations (conceptual) + +Although not shown in the RD‑N2 override snippet, PMU nodes may support an **overflow interrupt**: + +- If present: + - the interrupt is described by a GSIV + - flags describe edge/level and wired/MSI +- If absent: + - GSIV is set to 0 + +For a new platform: +- Check whether the PMU supports overflow interrupts. +- If it does: + - use a GSIV that routes to the appropriate interrupt controller, + - ensure the interrupt line is not shared with an unrelated source. +- If it does not: + - set GSIV = 0 and rely on polling. + +--- + +## 6) Reading the RD‑N2 example as a template + +RD‑N2 defines: +- **one PMU node** +- associated with **PCIe root complex 0** +- single‑page PMU +- no secondary instance + +This matches a typical “RC‑level PMU” that counts PCIe traffic or events. 
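
The rules from sections 4.1 to 4.4 can be collected into a small consistency check. This is a sketch under stated assumptions rather than the ACS implementation: `struct pmu_node`, `enum pmu_node_type` and `pmu_node_is_consistent()` are hypothetical names, and the struct is only one possible in-memory view of a `PLATFORM_PMU_NODEn_*` macro group. The values in `rdn2_node0` are the RD‑N2 reference values shown in section 2.

```c
#include <stdint.h>

/* Node types as listed in the table in section 4.3. */
enum pmu_node_type {
    PMU_NODE_MEM_CTRL  = 0x00,
    PMU_NODE_SMMU      = 0x01,
    PMU_NODE_PCIE_RC   = 0x02,
    PMU_NODE_ACPI_DEV  = 0x03,
    PMU_NODE_CPU_CACHE = 0x04
};

/* Hypothetical view of one PLATFORM_PMU_NODEn_* macro group. */
struct pmu_node {
    uint64_t base0;          /* Page 0 base */
    uint64_t base1;          /* Page 1 base, 0 if single-page */
    uint32_t type;           /* enum pmu_node_type */
    uint64_t pri_instance;
    uint32_t sec_instance;
    uint32_t dual_page_ext;  /* 0 or 1 */
};

/* RD-N2 reference node from section 2. */
static const struct pmu_node rdn2_node0 = {
    .base0 = 0x1010028000ULL, .base1 = 0x0,
    .type = PMU_NODE_PCIE_RC,
    .pri_instance = 0x0, .sec_instance = 0x0,
    .dual_page_ext = 0x0
};

/* Returns 1 if the node follows the rules described in this guide, else 0. */
static int pmu_node_is_consistent(const struct pmu_node *n)
{
    if (n->type > (uint32_t)PMU_NODE_CPU_CACHE)
        return 0;                               /* unknown node type */
    if (n->dual_page_ext > 1)
        return 0;                               /* flag is 0 or 1 */
    if ((n->dual_page_ext == 1) != (n->base1 != 0))
        return 0;                               /* BASE1 and flag must agree */
    if (n->type <= (uint32_t)PMU_NODE_PCIE_RC && n->sec_instance != 0)
        return 0;                               /* MC, SMMU, RC: secondary = 0 */
    if (n->type == (uint32_t)PMU_NODE_CPU_CACHE && n->pri_instance != 0)
        return 0;                               /* cache: primary = 0 */
    return 1;
}
```

Running such a check over every node before building the table catches the most common porting mistake, a dual‑page flag that does not agree with `BASE1`.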
+ +--- + +## 7) How to extend for a new platform + +### 7.1 Multiple PMUs +If your platform has: +- 2 memory controllers +- 2 PCIe root complexes + +You might have: +```c +#define PLATFORM_OVERRIDE_PMU_NODE_CNT 0x4 +``` + +And then: +- Node0/1 → memory controllers (type 0x00, instances = proximity domains) +- Node2/3 → PCIe RCs (type 0x02, instances = IORT RC IDs) + +--- + +### 7.2 Cache PMUs +If your platform exposes cache‑level PMUs: +- use node type `0x04` +- secondary instance must match the **cache ID** used in your cache topology configuration. + +This ties PMU data directly to a specific cache level or instance. + +--- + +## 8) Minimal template for a new platform + +```c +#define MAX_NUM_OF_PMU_SUPPORTED 512 +#define PLATFORM_OVERRIDE_PMU_NODE_CNT + +/* PMU node 0 */ +#define PLATFORM_PMU_NODE0_BASE0 +#define PLATFORM_PMU_NODE0_BASE1 +#define PLATFORM_PMU_NODE0_TYPE +#define PLATFORM_PMU_NODE0_PRI_INSTANCE +#define PLATFORM_PMU_NODE0_SEC_INSTANCE +#define PLATFORM_PMU_NODE0_DUAL_PAGE_EXT <0_or_1> + +/* PMU node 1 ... */ +``` + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/ras.md b/docs/baremetal/porting-pal/platform-override-guides/ras.md new file mode 100644 index 00000000..efe7b169 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/ras.md @@ -0,0 +1,272 @@ +# RAS Platform Override Configuration Guide (ACPI RAS Error Nodes) + +This document explains how to populate **RAS-related platform override configuration** for a new Arm-based platform. 
+ +--- + +## 1) What the RAS configuration describes + +The RAS configuration describes **error sources** in the system and tells the OS: + +- Which components can report architectural or implementation-defined errors +- How error records are accessed: + - **System Register (SR) interface**, or + - **MMIO error record groups** +- Which **interrupts** (if any) are used to signal errors +- How error sources are associated with: + - CPUs + - memory controllers + - SMMUs + - interrupt controllers + - or vendor-defined devices + +Each error source is represented as a **RAS node** with: +- node-specific identification data, +- one interface description, +- zero or more interrupt entries. + +--- + +## 2) How this maps to `platform_override.h` macros + +In the platform override header, each RAS node is represented by a **block of macros**. + +Typical macro groupings are: + +1. **Global sizing and counts** +2. **Per-node header fields** +3. **Per-node resource identification** +4. **Per-node interface description** +5. **Per-node interrupt entries** + +Your RD‑N2 example illustrates this flattened representation well. + +```c +#define PLATFORM_OVERRIDE_NUM_RAS_NODES 0x1 +#define PLATFORM_OVERRIDE_NUM_PE_RAS_NODES 0x1 +#define PLATFORM_OVERRIDE_NUM_MC_RAS_NODES 0x0 +``` + +--- + +## 3) Step-by-step approach for a new platform + +### Step 0 — Decide which error sources to expose + +Start from hardware capability and firmware design: + +- Which blocks implement RAS error registers? + - CPUs + - memory controllers + - SMMUs + - interrupt controllers + - vendor-specific logic +- Are error records accessed via: + - system registers, or + - memory-mapped register windows? +- Do errors generate interrupts? + - Fault-handling interrupts + - Error-recovery interrupts + - Wired interrupts or MSIs + +From this, decide: +- how many **RAS nodes** you need, and +- what **type** each node represents (processor, memory, SMMU, etc.). 
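
Once the list of error sources is decided, the three count macros shown in section 2 follow directly from it. The sketch below illustrates that derivation; `enum ras_node_kind`, `struct ras_node_counts` and `ras_count_nodes()` are hypothetical names used only for this example and do not exist in the ACS sources.

```c
#include <stddef.h>

/* Hypothetical classification of a planned RAS node. */
enum ras_node_kind { RAS_NODE_PE, RAS_NODE_MC, RAS_NODE_OTHER };

/* Values that would go into the three count macros. */
struct ras_node_counts {
    unsigned total;  /* PLATFORM_OVERRIDE_NUM_RAS_NODES    */
    unsigned pe;     /* PLATFORM_OVERRIDE_NUM_PE_RAS_NODES */
    unsigned mc;     /* PLATFORM_OVERRIDE_NUM_MC_RAS_NODES */
};

/* Tally a planned node list by kind. */
static struct ras_node_counts ras_count_nodes(const enum ras_node_kind *kinds,
                                              size_t n)
{
    struct ras_node_counts c = { 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        c.total++;
        if (kinds[i] == RAS_NODE_PE)
            c.pe++;
        else if (kinds[i] == RAS_NODE_MC)
            c.mc++;
    }
    return c;
}
```

For a plan with a single processor node, this yields total = 1, pe = 1, mc = 0, which matches the RD‑N2 values `0x1`, `0x1` and `0x0` shown earlier.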
+ +--- + +## 4) Global RAS sizing macros + +These macros define overall limits and counts. + +```c +#define RAS_MAX_NUM_NODES 140 +#define RAS_MAX_INTR_TYPE 0x2 +#define PLATFORM_OVERRIDE_NUM_RAS_NODES +#define PLATFORM_OVERRIDE_NUM_PE_RAS_NODES +#define PLATFORM_OVERRIDE_NUM_MC_RAS_NODES +``` + +### How to fill these + +- `RAS_MAX_NUM_NODES` + - Upper bound supported by the override implementation +- `PLATFORM_OVERRIDE_NUM_RAS_NODES` + - Total number of RAS nodes actually populated +- `PLATFORM_OVERRIDE_NUM_PE_RAS_NODES` + - Number of processor-related RAS nodes +- `PLATFORM_OVERRIDE_NUM_MC_RAS_NODES` + - Number of memory controller RAS nodes + +If you do not expose a certain class (e.g. MC), set its count to zero. + +--- + +## 5) Per-node header fields + +Each node has header information that controls layout and interrupt count. + +```c +#define PLATFORM_RAS_NODE0_LENGTH 0x0 +#define PLATFORM_RAS_NODE0_NUM_INTR_ENTRY 0x0 +``` + +### How to interpret these + +- `*_LENGTH` + - Total size (in bytes) of the node, including: + - node identification data + - interface structure + - interrupt array +- `*_NUM_INTR_ENTRY` + - Number of interrupt entries associated with this node + +These values must be consistent with how the firmware actually lays out the table. + +--- + +## 6) Processor-related RAS nodes + +Processor nodes describe error sources associated with CPUs. 
+ +```c +#define PLATFORM_RAS_NODE0_PE_PROCESSOR_ID 0x0 +#define PLATFORM_RAS_NODE0_PE_RES_TYPE 0x0 +#define PLATFORM_RAS_NODE0_PE_FLAGS 0x0 +#define PLATFORM_RAS_NODE0_PE_AFF 0x0 +#define PLATFORM_RAS_NODE0_PE_RES_DATA 0x0 +``` + +### How to fill these fields + +- `PE_PROCESSOR_ID` + - ACPI `_UID` of the processor this node applies to + - If the node represents a **global/shared** processor resource, this field must be 0 + +- `PE_RES_TYPE` + - Identifies which processor resource is being described: + - generic processor error source + - cache + - TLB + - other processor-internal structures + +- `PE_FLAGS` + - Indicates whether the node is: + - per-CPU + - shared across CPUs + - global for the system + +- `PE_AFF` + - Processor affinity descriptor + - Only meaningful for shared resources accessed via system registers + - Must match the architectural affinity encoding used by the hardware + +- `PE_RES_DATA` + - Resource-specific payload + - Used to encode cache level/type, TLB level, or generic processor data + +**Practical guidance** +- Start with one **generic processor node per CPU** +- Add cache/TLB-specific nodes only if finer-grain reporting is required + +--- + +## 7) Interface description (how error records are accessed) + +Each node has exactly one interface description. 
+ +```c +#define PLATFORM_RAS_NODE0_INTF_TYPE 0x0 +#define PLATFORM_RAS_NODE0_INTF_FLAGS 0x0 +#define PLATFORM_RAS_NODE0_INTF_BASE 0x0 +#define PLATFORM_RAS_NODE0_INTF_START_REC 0x1 +#define PLATFORM_RAS_NODE0_INTF_NUM_REC 0x1 +#define PLATFORM_RAS_NODE0_INTF_ERR_REC_IMP 0x0 +#define PLATFORM_RAS_NODE0_INTF_ERR_STATUS 0x0 +#define PLATFORM_RAS_NODE0_INTF_ADDR_MODE 0x0 +``` + +### Field meanings + +- `INTF_TYPE` + - `0x0` – System Register interface + - `0x1` – MMIO interface + +- `INTF_FLAGS` + - Shared interface indication + - Clear-on-read behavior for status registers + +- `INTF_BASE` + - Base address of MMIO error register block + - Must be zero for system register interfaces + +- `INTF_START_REC` + - Index of first error record handled by this node + +- `INTF_NUM_REC` + - Total number of error records associated with this node + +- `INTF_ERR_REC_IMP` + - Bitmap indicating which records are **not implemented** + - Bit = 0 → implemented + - Bit = 1 → not implemented + +- `INTF_ERR_STATUS` + - Bitmap indicating which records support architectural status reporting + +- `INTF_ADDR_MODE` + - Platform-specific selection for record addressing + +**Important** +- The meaning of record indices must be consistent with how firmware exposes records +- Off-by-one errors here commonly break OS discovery + +--- + +## 8) Interrupt entries (error signaling) + +Each node may define zero or more interrupt entries. 
+ +```c +#define PLATFORM_RAS_NODE0_INTR0_TYPE 0x0 +#define PLATFORM_RAS_NODE0_INTR0_FLAG 0x1 +#define PLATFORM_RAS_NODE0_INTR0_GSIV 0x11 +#define PLATFORM_RAS_NODE0_INTR0_ITS_ID 0x0 +``` + +### How to fill interrupt fields + +- `INTR_TYPE` + - `0x0` – fault-handling interrupt + - `0x1` – error-recovery interrupt + +- `INTR_FLAG` + - `0` – edge-triggered + - `1` – level-triggered + +- `INTR_GSIV` + - Global System Interrupt value + - Must be non-zero for wired interrupts + - Must be zero if MSI is used + +- `INTR_ITS_ID` + - Identifier of the interrupt translation group used for MSI delivery + - Must be zero for wired interrupts + +If hardware uses the same interrupt for both fault handling and recovery, define **two interrupt entries with identical signaling parameters**. + +--- + +## 11) What the RD‑N2 example implies + +The RD‑N2 snippet uses placeholder values for many fields. For a real platform you must: + +- Compute real node lengths +- Populate meaningful record ranges and bitmaps +- Use real interrupt wiring or MSI identifiers +- Ensure consistency with: + - CPU topology (MADT) + - IOMMU topology (IORT) + - interrupt routing (GIC/ITS) + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/ras2.md b/docs/baremetal/porting-pal/platform-override-guides/ras2.md new file mode 100644 index 00000000..196b62bb --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/ras2.md @@ -0,0 +1,234 @@ +# RAS2 Platform Override Configuration Guide (for new platforms) + +This guide explains how to populate **RAS2** platform override macros for a new platform. + +RAS2 provides a scalable way for the OS (OSPM) to discover and control platform RAS features **per component instance** (for example, per NUMA proximity domain for memory). The platform supports **either RAS2 or RASF, not both**. + +--- + +## 1) What RAS2 represents at a high level + +RAS2 is an ACPI table that lists **RAS2 PCC descriptors**. 
Each descriptor ties: +- a **PCC subspace** (defined in **PCCT**) to +- a **RAS feature type** (e.g., Memory) and +- an **Instance identifier** (e.g., proximity domain for memory). + +### Key linkage +- RAS2 **PCC Identifier** → indexes into the **PCCT subspace array** +- RAS2 **Feature Type** → what class of RAS feature this subspace controls (Memory = `0x00`) +- RAS2 **Instance** → which component instance it applies to + - For **Memory RAS features**, Instance **must match the SRAT proximity domain** + +So: **SRAT defines the proximity domains**, and RAS2 uses those IDs to provide **per-domain** RAS controls (scrubbing, address translation, etc.). + +--- + +## 2) How RAS2 maps to the platform override macros + +The override uses “blocks” that conceptually represent RAS2 descriptors (typically one per instance). + +```c +#define RAS2_MAX_NUM_BLOCKS 0x4 +#define PLATFORM_OVERRIDE_NUM_RAS2_BLOCK 0x3 +#define PLATFORM_OVERRIDE_NUM_RAS2_MEM_BLOCK 0x3 +``` + +### Macro meaning +- `RAS2_MAX_NUM_BLOCKS` + - Upper bound supported by your override implementation +- `PLATFORM_OVERRIDE_NUM_RAS2_BLOCK` + - How many RAS2 descriptors you will actually publish +- `PLATFORM_OVERRIDE_NUM_RAS2_MEM_BLOCK` + - How many of those descriptors are for **Memory feature type (0x00)** + +> If you later add vendor-defined feature types (0x80–0xFF), you’d track those similarly (if your override supports per-feature counts). 
+ +--- + +## 3) The core per-block fields to populate + +Each RAS2 descriptor needs to encode the equivalent of: + +- PCC Identifier (PCCT subspace index) +- Feature Type +- Instance + +Override flattens this into macros like: + +```c +#define PLATFORM_OVERRIDE_RAS2_BLOCK0_PROXIMITY 0x0 +#define PLATFORM_OVERRIDE_RAS2_BLOCK0_PATROL_SCRUB_SUPPORT 0x1 +``` + +### Interpretation for Memory feature type +For **Memory RAS features** (Feature Type `0x00`): +- **Instance = Proximity Domain** +- Proximity Domain must match SRAT memory proximity domain definitions + +So the field `*_PROXIMITY` is the **RAS2 Instance** for feature type Memory. + +--- + +## 4) How to fill RAS2 for a new platform (step-by-step) + +### Step 1 — Decide whether you should use RAS2 +Use RAS2 when: +- you want the OS to control RAS features via PCC, and/or +- you need per-instance scaling (e.g., per NUMA domain), and/or +- you want OS-managed memory scrubbing / LA→PA translation services. + +Do **not** publish both RAS2 and RASF. + +--- + +### Step 2 — Identify the RAS2 feature types you will support +The spec defines: +- `0x00` = Memory RAS features +- `0x01–0x7F` reserved +- `0x80–0xFF` vendor-defined + +Most systems begin with **Memory (0x00)** since it’s explicitly defined and ties nicely to SRAT proximity domains. + +--- + +### Step 3 — Determine the “instances” for each feature type + +#### For Memory RAS features (Feature Type 0x00) +Instance **must be** the **SRAT proximity domain** (NUMA domain) for that memory. + +Practical mapping approaches: +- **UMA system**: only one proximity domain (usually 0) + - publish one RAS2 descriptor: Instance = 0 +- **NUMA system**: multiple proximity domains + - publish one RAS2 descriptor per domain you want manageable independently + +**Example** +If SRAT defines memory proximity domains `{0,1,2}`, and you want independent scrub controls per domain: +- publish 3 RAS2 descriptors: + - Instance 0 + - Instance 1 + - Instance 2 + +This matches RD-N2 sample. 
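
The requirement that every Memory instance names an SRAT proximity domain can be expressed as a short check. This is a minimal sketch: `srat_has_domain()` and `ras2_instances_match_srat()` are hypothetical helpers, and the arrays are assumed to be filled by hand from the `PLATFORM_OVERRIDE_RAS2_BLOCKn_PROXIMITY` values and from the SRAT memory affinity proximity domains of the same platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if `domain` appears in the SRAT memory proximity domain list. */
static int srat_has_domain(const uint32_t *srat, size_t n_srat, uint32_t domain)
{
    for (size_t i = 0; i < n_srat; i++) {
        if (srat[i] == domain)
            return 1;
    }
    return 0;
}

/* Returns 1 if every RAS2 memory instance refers to a proximity
 * domain that SRAT defines, else 0. */
static int ras2_instances_match_srat(const uint32_t *ras2, size_t n_ras2,
                                     const uint32_t *srat, size_t n_srat)
{
    for (size_t i = 0; i < n_ras2; i++) {
        if (!srat_has_domain(srat, n_srat, ras2[i]))
            return 0;
    }
    return 1;
}
```

With SRAT domains `{0,1,2}`, the instance list `{0,1,2}` passes, and so does a subset such as `{1}` when only one domain is exposed for OS control. An instance such as `3` fails because SRAT does not define it.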
+ +--- + +### Step 4 — Choose / allocate PCC subspaces in PCCT +RAS2 does not define the PCC subspace itself — it references PCCT. + +You must ensure: +- PCCT contains enough subspaces for your RAS2 descriptors +- Each descriptor’s PCC Identifier points to a valid PCCT subspace index +- Each subspace provides a shared memory region with the RAS2 communication layout + +#### Practical guidance +- Most platforms assign **one PCC subspace per instance** (scales cleanly) +- If firmware/SoC only supports one mailbox but multiplexes internally, you *can* share subspaces, but you lose independent concurrency and may violate the “channel dedicated to a given component instance” intent. + +**Rule of thumb:** prefer **dedicated PCC per instance** unless you have a strong reason not to. + +--- + +### Step 5 — Decide what memory features you expose per instance +For Memory RAS features, the spec defines at least: +- `PATROL_SCRUB` (bit 0) +- `LA2PA_TRANSLATION` (bit 1) + +Your override currently has: +```c +#define PLATFORM_OVERRIDE_RAS2_BLOCKn_PATROL_SCRUB_SUPPORT 0x1 +``` + +So for a new platform you should decide per proximity domain whether: +- scrub engine exists +- scrub engine is controllable by OS +- you want to expose it (some platforms keep it firmware-managed) + +#### Recommended bring-up sequence +1. Expose `PATROL_SCRUB` first (if supported) +2. Add `LA2PA_TRANSLATION` once you have correct component scoping and translation correctness + +--- + +## 5) Recommended macro schema for a robust implementation + +Sample shows only proximity + scrub support. 
For a production-quality override, a per-block set usually needs: + +### A) Descriptor identity +- `BLOCKn_PCC_ID` (PCCT subspace index) +- `BLOCKn_FEATURE_TYPE` (0x00 for memory) +- `BLOCKn_INSTANCE` (proximity domain) + +### B) Feature support bitmap (or per-feature macros) +- patrol scrub supported +- LA→PA supported +- other future bits + +### C) Optional: capability constraints per instance +- min scrub rate +- max scrub rate +- alignment constraints for ranges +- maximum number of parameter blocks + +--- + +## 6) Translating RD-N2 style into a new-platform filling rule + +### Rule 1 — Count fields +- `PLATFORM_OVERRIDE_NUM_RAS2_BLOCK` = number of descriptors you will publish +- `PLATFORM_OVERRIDE_NUM_RAS2_MEM_BLOCK` = how many of those have Feature Type 0x00 + +### Rule 2 — Per-block proximity +For memory blocks: +- `PLATFORM_OVERRIDE_RAS2_BLOCKn_PROXIMITY` = SRAT proximity domain ID + +### Rule 3 — Patrol scrub support +Set: +- `*_PATROL_SCRUB_SUPPORT = 1` if: + - hardware supports scrubbing for that domain, and + - platform is willing to let OS control it +Otherwise set it to 0. + +### Rule 4 — Ensure PCCT is consistent +For each published block: +- there must be a corresponding PCC subspace in PCCT +- the OS must be able to perform “Execute RAS2 Command” on that subspace + +If you publish 3 blocks but only 1 PCCT subspace, you must ensure the PCC identifier mapping is valid and multiplexing is correct (not recommended for independent control). + +--- + +## 7) How the OS will use this (useful for validation) + +OSPM workflow (simplified): +1. Parse RAS2 descriptors +2. For each descriptor, locate PCCT subspace via PCC Identifier +3. Read the RAS2 communication region to discover supported features (bitmap) +4. 
To invoke a feature: + - set “Set RAS Capabilities” bitmap + - fill the parameter block (e.g., PATROL_SCRUB structure) + - issue PCC Execute command (`0x01`) + +So validation for a new platform typically includes: +- OS can enumerate all instances +- feature bitmap matches what you claim +- parameter blocks work and return expected status +- scrub start/stop commands behave and are properly scoped per proximity domain +- LA→PA translation returns correct physical addresses (when supported) + +--- + +## 9) Quick template: filling RAS2 for memory-only platforms + +If your platform has N proximity domains with memory and supports patrol scrub on all: + +- `PLATFORM_OVERRIDE_NUM_RAS2_BLOCK = N` +- `PLATFORM_OVERRIDE_NUM_RAS2_MEM_BLOCK = N` +- For each domain `d[i]`: + - `BLOCKi_PROXIMITY = d[i]` + - `BLOCKi_PATROL_SCRUB_SUPPORT = 1` + +If only some domains support scrub: +- Set scrub support per-domain accordingly + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/srat.md b/docs/baremetal/porting-pal/platform-override-guides/srat.md new file mode 100644 index 00000000..700df305 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/srat.md @@ -0,0 +1,309 @@ +# SRAT Platform Configuration Guide (Platform Override Macros) + +This note explains how to populate SRAT-related platform configuration macros (like those used in SBSA/BSA ACS) for a new Arm-based platform. It uses the RD-N2 example as a reference, but the principles are generic. + +--- + +## 1. High-Level SRAT Design Steps + +SRAT describes *system locality* (NUMA / proximity domains) for: + +- Processors (GICC Affinity structures on Arm) +- Memory ranges (Memory Affinity structures) +- Optional: GIC ITS, generic initiators (accelerators, coherent devices) + +For a new platform, decide the following first: + +1. **Number of proximity domains (NUMA nodes)** + - UMA system: usually 1 domain (0). + - NUMA system: one domain per socket / cluster / memory region, as needed. + +2. 
**CPU → domain mapping** + - For each logical CPU (ACPI Processor UID), decide which proximity domain it belongs to. + +3. **Memory → domain mapping** + - For each contiguous memory range the OS will use, decide which domain it is local to. + +4. **Clock domains (optional)** + - If the OS need not distinguish clock domains, put all CPUs in clock domain 0. + - If there are distinct clock domains (different PLLs/sockets), use different IDs. + +The platform override macros simply encode these choices. + +--- + +## 2. Memory Affinity Structures (Type 1) + +Each **Memory Affinity** entry associates a memory range with a proximity domain and provides flags like Enabled and HotPluggable. + +In ACS-style macros (as in RD-N2): + +```c +#define PLATFORM_OVERRIDE_MEM_AFF_CNT 1 + +#define PLATFORM_SRAT_MEM0_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_MEM0_FLAGS 0x1 +#define PLATFORM_SRAT_MEM0_ADDR_BASE 0x8080000000 +#define PLATFORM_SRAT_MEM0_ADDR_LEN 0x3F7F7FFFFFF +``` + +For a new platform: + +1. **Enumerate memory regions** + - Use your platform’s memory map. + - Include contiguous DRAM regions visible to the OS. + - Exclude MMIO, reserved firmware, and device regions. + +2. **For each region `i` define:** + - `PLATFORM_SRAT_MEMi_PROX_DOMAIN` + Proximity domain ID (e.g., 0, 1, 2, …) this memory is closest to. + - `PLATFORM_SRAT_MEMi_ADDR_BASE` + Base physical address of this region (64-bit). + - `PLATFORM_SRAT_MEMi_ADDR_LEN` + Length in bytes (64-bit). + - `PLATFORM_SRAT_MEMi_FLAGS` + - Bit 0: Enabled + - Bit 1: HotPluggable + - Bit 2: NonVolatile + For normal non-hotplug DRAM → `0x1` (Enabled only). + +3. **Count of memory affinity entries:** + +```c +#define PLATFORM_OVERRIDE_MEM_AFF_CNT +``` + +**Sanity checks:** + +- No overlapping memory ranges. +- Combined SRAT memory ranges are consistent with the platform memory map. +- Each region’s domain is valid (0 ≤ domain < number of domains). + +--- + +## 3. 
CPU Affinity via GICC Affinity Structures (Type 3) + +On Arm, the **GICC Affinity** structure maps each processor’s **ACPI Processor UID** to a proximity domain and clock domain. + +Example from RD-N2: + +```c +#define PLATFORM_OVERRIDE_GICC_AFF_CNT 16 + +#define PLATFORM_SRAT_GICC0_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_GICC0_PROC_UID 0x0 +#define PLATFORM_SRAT_GICC0_FLAGS 0x1 +#define PLATFORM_SRAT_GICC0_CLK_DOMAIN 0x0 + +// ... up to GICC15 ... +``` + +For a new platform: + +1. **Determine ACPI Processor UIDs** + - These come from the MADT GICC structures. + - Typically one ACPI Processor UID per logical CPU: 0, 1, 2, … + +2. **For each logical CPU `n` define:** + - `PLATFORM_SRAT_GICCn_PROC_UID` + Must match the ACPI Processor UID in MADT for that CPU. + - `PLATFORM_SRAT_GICCn_PROX_DOMAIN` + Proximity domain this CPU belongs to. + - UMA: all CPUs → domain 0. + - NUMA: group CPUs by socket/cluster and assign domain 0, 1, … + - `PLATFORM_SRAT_GICCn_FLAGS` + - Bit 0: Enabled. + Use `0x1` for CPUs present at boot. + Use `0x0` for potential future CPUs you want to describe but keep disabled. + - `PLATFORM_SRAT_GICCn_CLK_DOMAIN` + Usually 0 for all CPUs unless you want to distinguish clock domains. + +3. **Number of GICC affinity entries:** + +```c +#define PLATFORM_OVERRIDE_GICC_AFF_CNT +``` + +OSPM requirements: + +- Every CPU started at boot **must** have a corresponding GICC Affinity entry in SRAT (or be associated via `_PXM` if not). +- For hot-added CPUs not described in SRAT, `_PXM` must be used on the processor device (or its ancestor). + +--- + +## 4. 
Total SRAT Entry Count + +In ACS-style overrides, `PLATFORM_OVERRIDE_NUM_SRAT_ENTRIES` is the total count of structures appended after the SRAT header: + +```c +NUM_SRAT_ENTRIES = + MEM_AFF_CNT // Type 1 memory affinity + + GICC_AFF_CNT // Type 3 GICC affinity + + ITS_AFF_CNT // Type 4 ITS affinity (optional) + + GI_AFF_CNT; // Type 5 generic initiator (optional) +``` + +Example (RD-N2 excerpt): + +- 1 Memory Affinity entry +- 16 GICC Affinity entries +- 0 ITS Affinity entries +- 0 Generic Initiator entries + +→ `PLATFORM_OVERRIDE_NUM_SRAT_ENTRIES = 17` + +For your platform: + +```c +#define PLATFORM_OVERRIDE_NUM_SRAT_ENTRIES (PLATFORM_OVERRIDE_MEM_AFF_CNT + PLATFORM_OVERRIDE_GICC_AFF_CNT + PLATFORM_OVERRIDE_ITS_AFF_CNT + PLATFORM_OVERRIDE_GI_AFF_CNT) +``` + +(If ITS / GI entries are not used, drop those terms.) + +--- + +## 5. Optional: GIC ITS Affinity Structures (Type 4) + +If your platform exposes **GIC ITS** entries in MADT, you can describe their proximity via **ITS Affinity** structures. This helps the OS choose nearby memory when allocating ITS tables and command queues. + +For each ITS: + +- `Proximity Domain` → NUMA node whose memory is closest to the ITS. +- `ITS ID` → must match the `ITS ID` in the MADT GIC ITS entry. + +In macro form you might have something like: + +```c +#define PLATFORM_OVERRIDE_ITS_AFF_CNT 1 + +#define PLATFORM_SRAT_ITS0_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_ITS0_ITS_ID +``` + +If you do not need to advertise ITS locality, you can omit these entries and set `PLATFORM_OVERRIDE_ITS_AFF_CNT` to 0 (or not define it, depending on the ACS codebase). + +--- + +## 6. 
Optional: Generic Initiator Affinity (Type 5) + +Use **Generic Initiator Affinity** structures for devices that initiate transactions and should have a defined locality, for example: + +- CXL.mem devices +- Coherent accelerators / GPUs with local HBM +- DMA engines acting as architectural initiators + +Key fields: + +- `Proximity Domain` → NUMA node whose memory is local to the device. +- `Device Handle Type` → ACPI or PCI. +- `Device Handle` → either: + - ACPI: `_HID`, `_UID`, or + - PCI: Segment and BDF (Bus/Device/Function). +- `Flags`: + - Bit 0: Enabled + - Bit 1: Architectural transactions (device adheres to the same memory model as host) + +Example logical mapping (PCI device): + +- PCI device at Segment 0, Bus 0x40, Device 0x1, Function 0. +- Closest to Proximity Domain 2. +- Fully coherent, architectural initiator. + +Conceptually: + +- Proximity Domain = 2 +- Device Handle Type = PCI +- PCI Segment = 0 +- PCI BDF = encoded (Bus=0x40, Dev=0x1, Func=0) +- Flags = `0x3` (Enabled + Architectural transactions) + +If you do not have such devices, simply omit all Type 5 entries. + +--- + +## 7. Worked Example: Two-NUMA-Node Platform + +Assume a new platform with: + +- **8 logical CPUs**, ACPI Processor UIDs 0–7. +- **2 NUMA nodes**: + - Node 0: CPUs 0–3, memory region `[0x0000_0000_8000_0000, 1 GiB)` + - Node 1: CPUs 4–7, memory region `[0x0000_0001_0000_0000, 1 GiB)` +- Single clock domain (0). +- No ITS or generic initiators described in SRAT. 
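
Before encoding this layout as macros, the sanity checks from section 2 (no overlapping ranges, valid domain IDs) can be sketched as a helper. The names `struct srat_mem` and `srat_mem_valid()` are hypothetical and exist only in this example. Rejecting zero-length ranges and ranges that wrap past the end of the 64-bit address space is an additional assumption, not something the guide requires.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one PLATFORM_SRAT_MEMi_* macro group. */
struct srat_mem {
    uint32_t prox_domain;
    uint64_t base;
    uint64_t len;
};

/* Returns 1 if all ranges have a valid domain, are non-empty, do not
 * wrap around, and do not overlap each other; otherwise returns 0. */
static int srat_mem_valid(const struct srat_mem *m, size_t n,
                          uint32_t num_domains)
{
    /* Pass 1: per-range checks. */
    for (size_t i = 0; i < n; i++) {
        if (m[i].prox_domain >= num_domains)
            return 0;
        if (m[i].len == 0 || m[i].base + m[i].len < m[i].base)
            return 0;
    }
    /* Pass 2: pairwise overlap of half-open ranges [base, base + len). */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (m[i].base < m[j].base + m[j].len &&
                m[j].base < m[i].base + m[i].len)
                return 0;
        }
    }
    return 1;
}
```

For the two regions of this example (1 GiB at `0x80000000` in domain 0 and 1 GiB at `0x100000000` in domain 1) the check passes, because the first region ends at `0xC0000000`, below the start of the second.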
+ +### Memory Affinity + +```c +#define PLATFORM_OVERRIDE_MEM_AFF_CNT 2 + +/* Memory node 0 */ +#define PLATFORM_SRAT_MEM0_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_MEM0_FLAGS 0x1 // Enabled +#define PLATFORM_SRAT_MEM0_ADDR_BASE 0x0000000080000000ULL +#define PLATFORM_SRAT_MEM0_ADDR_LEN 0x0000000040000000ULL + +/* Memory node 1 */ +#define PLATFORM_SRAT_MEM1_PROX_DOMAIN 0x1 +#define PLATFORM_SRAT_MEM1_FLAGS 0x1 // Enabled +#define PLATFORM_SRAT_MEM1_ADDR_BASE 0x0000000100000000ULL +#define PLATFORM_SRAT_MEM1_ADDR_LEN 0x0000000040000000ULL +``` + +### GICC Affinity + +```c +#define PLATFORM_OVERRIDE_GICC_AFF_CNT 8 + +/* CPUs 0-3 on domain 0 */ +#define PLATFORM_SRAT_GICC0_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_GICC0_PROC_UID 0x0 +#define PLATFORM_SRAT_GICC0_FLAGS 0x1 +#define PLATFORM_SRAT_GICC0_CLK_DOMAIN 0x0 + +#define PLATFORM_SRAT_GICC1_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_GICC1_PROC_UID 0x1 +#define PLATFORM_SRAT_GICC1_FLAGS 0x1 +#define PLATFORM_SRAT_GICC1_CLK_DOMAIN 0x0 + +#define PLATFORM_SRAT_GICC2_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_GICC2_PROC_UID 0x2 +#define PLATFORM_SRAT_GICC2_FLAGS 0x1 +#define PLATFORM_SRAT_GICC2_CLK_DOMAIN 0x0 + +#define PLATFORM_SRAT_GICC3_PROX_DOMAIN 0x0 +#define PLATFORM_SRAT_GICC3_PROC_UID 0x3 +#define PLATFORM_SRAT_GICC3_FLAGS 0x1 +#define PLATFORM_SRAT_GICC3_CLK_DOMAIN 0x0 + +/* CPUs 4-7 on domain 1 */ +#define PLATFORM_SRAT_GICC4_PROX_DOMAIN 0x1 +#define PLATFORM_SRAT_GICC4_PROC_UID 0x4 +#define PLATFORM_SRAT_GICC4_FLAGS 0x1 +#define PLATFORM_SRAT_GICC4_CLK_DOMAIN 0x0 + +#define PLATFORM_SRAT_GICC5_PROX_DOMAIN 0x1 +#define PLATFORM_SRAT_GICC5_PROC_UID 0x5 +#define PLATFORM_SRAT_GICC5_FLAGS 0x1 +#define PLATFORM_SRAT_GICC5_CLK_DOMAIN 0x0 + +#define PLATFORM_SRAT_GICC6_PROX_DOMAIN 0x1 +#define PLATFORM_SRAT_GICC6_PROC_UID 0x6 +#define PLATFORM_SRAT_GICC6_FLAGS 0x1 +#define PLATFORM_SRAT_GICC6_CLK_DOMAIN 0x0 + +#define PLATFORM_SRAT_GICC7_PROX_DOMAIN 0x1 +#define PLATFORM_SRAT_GICC7_PROC_UID 0x7 +#define 
PLATFORM_SRAT_GICC7_FLAGS 0x1 +#define PLATFORM_SRAT_GICC7_CLK_DOMAIN 0x0 +``` + +### Total SRAT Entries + +```c +#define PLATFORM_OVERRIDE_NUM_SRAT_ENTRIES (PLATFORM_OVERRIDE_MEM_AFF_CNT + PLATFORM_OVERRIDE_GICC_AFF_CNT) +``` + +This pattern can be adapted directly to your own CPU count, memory map, and NUMA design. + +--- diff --git a/docs/baremetal/porting-pal/platform-override-guides/timers-watchdog.md b/docs/baremetal/porting-pal/platform-override-guides/timers-watchdog.md new file mode 100644 index 00000000..3aa49145 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/timers-watchdog.md @@ -0,0 +1,310 @@ +# Platform Override Configuration Guide — Timers and Watchdogs + +This guide explains how to populate **timer** and **watchdog** platform override macros for a new platform, based on the **Generic Timer Description Table (GTDT)** structures and the **RD-N2 reference macros**. + +It focuses on what you need to decide from hardware/firmware, and how those decisions map into the override header fields typically used by SBSA/BSA ACS table generation. + +--- + +## 1) What this configuration describes + +The timer configuration published to the OS includes two groups: + +1. **Per-processor Generic Timers** (interrupts are PPIs): + - Secure EL1 timer + - Non-secure EL1 timer + - EL2 timer + - Virtual EL1 timer + - Virtual EL2 timer (only required if ARMv8.1 VHE is implemented) + +2. **Platform (memory-mapped) timers**: + - **GT Block**: one block can implement up to 8 timer frames (GT0..GT7) + - **Arm Generic Watchdog**: a standardized watchdog block (refresh frame + control frame) + +The OS uses this information during early boot to configure timer interrupts and discover optional platform timers/watchdogs. 
+ +--- + +## 2) Inputs you must collect for a new platform + +Before filling overrides, gather: + +### Per-processor timer interrupts (PPIs) +For each implemented per-CPU timer, collect: +- **GSIV** (maps 1:1 to the PPI interrupt ID) +- **trigger mode** (edge vs level) +- **polarity** (active-high vs active-low) +- **always-on capability** (can it wake the CPU from low-power states reliably?) + +Typically these are platform-wide constants (same GSIV for all CPUs). + +### System counter control/read base +From the system memory map: +- `CntControlBase` (CNTCTLBase) physical address *if* exposed via MMIO to non-secure world. +- `CntReadBase` physical address *if* exposed. +If not provided, firmware uses `0xFFFFFFFFFFFFFFFF` in the table, and OS relies on architectural registers. + +### Platform timers (GT Blocks) +For each GT Block: +- `CntCtlBase` (GT Block physical base) +- Number of frames implemented +For each frame (GT0..GT7 frame x): +- `CntBaseX` physical address +- `CntEL0BaseX` physical address (or all-ones if not present) +- Physical timer GSIV +- Optional virtual timer GSIV (0 if not implemented) +- Flags (mode/polarity + common flags such as secure/non-secure + always-on) + +### Watchdogs +For each watchdog instance: +- Refresh frame base address +- Control frame base address +- GSIV +- Flags (mode/polarity + secure bit) + +### Counter frequency +- `CNTFRQ` (system counter frequency), used by bare-metal components or validation harnesses. 
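
As a rough illustration of how a bare-metal harness might use the counter frequency collected above, the sketch below converts a delay in microseconds into system-counter ticks. The helper `example_us_to_ticks` and the macro `EXAMPLE_CNTFRQ_HZ` are hypothetical names, not part of the ACS sources, and the 100 MHz value only mirrors the RD-N2 `PLATFORM_BM_TIMER_CNTFRQ` setting (`0x5F5E100`). Substitute whatever your firmware programs into `CNTFRQ_EL0`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 100 MHz, mirroring the RD-N2 PLATFORM_BM_TIMER_CNTFRQ value. */
#define EXAMPLE_CNTFRQ_HZ 100000000ULL

/* Compile-time check that the decimal value equals the hex form used by RD-N2. */
static_assert(EXAMPLE_CNTFRQ_HZ == 0x5F5E100, "100 MHz expressed in hex");

/* Hypothetical helper: convert microseconds to system-counter ticks.
 * The delay is split into whole seconds and a remainder so that the
 * intermediate product stays small for realistic frequencies. */
static inline uint64_t example_us_to_ticks(uint64_t cntfrq_hz, uint64_t us)
{
    uint64_t sec = us / 1000000ULL;
    uint64_t rem = us % 1000000ULL;

    return sec * cntfrq_hz + (rem * cntfrq_hz) / 1000000ULL;
}
```

If the frequency in the override header disagrees with `CNTFRQ_EL0`, every interval computed this way is scaled by the same ratio, which is usually how a wrong value shows up in timer tests.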
+ +--- + +## 3) Per-processor timer fields (PPI timers) + +### What the overrides represent +The following overrides map directly to the per-processor timer GSIV/flags fields: + +```c +#define PLATFORM_OVERRIDE_S_EL1_TIMER_GSIV 0x1D +#define PLATFORM_OVERRIDE_NS_EL1_TIMER_GSIV 0x1E +#define PLATFORM_OVERRIDE_NS_EL2_TIMER_GSIV 0x1A +#define PLATFORM_OVERRIDE_VIRTUAL_TIMER_GSIV 0x1B +#define PLATFORM_OVERRIDE_EL2_VIR_TIMER_GSIV 28 +``` + +And flags per timer: + +```c +#define PLATFORM_OVERRIDE_S_EL1_TIMER_FLAGS ((TIMER_POLARITY << 1) | (TIMER_MODE << 0)) +#define PLATFORM_OVERRIDE_NS_EL1_TIMER_FLAGS ((TIMER_POLARITY << 1) | (TIMER_MODE << 0)) +#define PLATFORM_OVERRIDE_NS_EL2_TIMER_FLAGS ((TIMER_POLARITY << 1) | (TIMER_MODE << 0)) +#define PLATFORM_OVERRIDE_VIRTUAL_TIMER_FLAGS ((TIMER_POLARITY << 1) | (TIMER_MODE << 0)) +``` + +### How to fill GSIV values +- These are **GSIVs**, which for Arm generic timer interrupts correspond to **PPI INTIDs**. +- Use your GIC configuration to confirm the PPI interrupt numbers for: + - CNTPNSIRQ (Non-secure EL1 physical timer) + - CNTHPIRQ (Non-secure EL2 physical timer) + - CNTVIRQ (Virtual EL1) + - CNTPSIRQ (Secure EL1 timer, if exposed) + - CNTHVIRQ (Virtual EL2 timer, present only when VHE is implemented) + +In many systems, the timer PPIs align with architectural defaults, but **do not assume** — verify from SoC integration. + +### How to fill per-processor timer flags +Each timer flags field includes: +- Bit 0: mode (1=edge, 0=level) +- Bit 1: polarity (1=active low, 0=active high) +- Bit 2: always-on capability + +Your macro pack uses: +- `TIMER_MODE` → bit 0 +- `TIMER_POLARITY` → bit 1 + +The RD-N2 flags macros above set only bits 0 and 1. If your platform also encodes “always-on” in these flags, add bit 2 per your implementation (some override headers keep always-on separately).
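
The bit layout listed above can be sketched as a small helper that also folds in the always-on bit. This is only an illustration: `ex_ppi_timer_flags` and the `EX_TIMER_*` shift names are hypothetical and do not exist in the ACS override headers, which build the value directly from `TIMER_MODE` and `TIMER_POLARITY`.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions for per-processor timer flags, as listed in this guide.
 * The EX_ names are illustrative only. */
#define EX_TIMER_MODE_SHIFT      0 /* 1 = edge, 0 = level             */
#define EX_TIMER_POLARITY_SHIFT  1 /* 1 = active low, 0 = active high */
#define EX_TIMER_ALWAYS_ON_SHIFT 2 /* 1 = always-on capable           */

/* Hypothetical helper: pack the three one-bit properties into a flags word. */
static inline uint32_t ex_ppi_timer_flags(unsigned edge,
                                          unsigned active_low,
                                          unsigned always_on)
{
    return ((uint32_t)(edge & 1u)       << EX_TIMER_MODE_SHIFT) |
           ((uint32_t)(active_low & 1u) << EX_TIMER_POLARITY_SHIFT) |
           ((uint32_t)(always_on & 1u)  << EX_TIMER_ALWAYS_ON_SHIFT);
}

/* A level-triggered, active-high timer without always-on encodes as 0. */
static_assert(((0u << EX_TIMER_POLARITY_SHIFT) | (0u << EX_TIMER_MODE_SHIFT)) == 0u,
              "level/active-high packs to zero");
```

For example, `ex_ppi_timer_flags(0, 1, 1)` yields `0x6` (level-triggered, active-low, always-on).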
+ +**Guidance** +- Most ARM PPIs are **level-triggered** and **active-high** in typical GIC configurations +- Always-on should be set if: + - the timer interrupt can wake the CPU from low-power states and + - the timer context is retained / or re-programmable as required + +--- + +## 4) System counter control base (CNTCTLBase) + +The RD-N2 block uses: + +```c +#define PLATFORM_OVERRIDE_TIMER_CNTCTL_BASE 0x2a810000 +``` + +This corresponds to the GTDT field: +- `CntControlBase Physical Address` + +### How to choose the value +- If the system provides a memory mapped counter control block to non-secure world, set it to that physical address. +- If not provided (common in some designs), the GTDT field is set to `0xFFFFFFFFFFFFFFFF`. + +**Important** +- Don’t confuse CNTCTLBase with CNTBase frames of platform timers. CNTCTLBase is the *counter control block*, not a specific GT frame. + +--- + +## 5) Platform timers — GT Block mapping + +The RD-N2 example uses a schema that represents a GT Block with multiple frames: + +```c +#define PLATFORM_OVERRIDE_PLATFORM_TIMER_COUNT 0x2 +#define PLATFORM_OVERRIDE_TIMER_COUNT 0x2 +``` + +Interpretation depends on your table generator, but typically: +- `PLATFORM_OVERRIDE_PLATFORM_TIMER_COUNT` = number of platform timer structures (GT Blocks + Watchdogs) +- `PLATFORM_OVERRIDE_TIMER_COUNT` = number of GT frames/timers populated under a GT Block (or total frame entries) + +In RD-N2, they populate **2 GT frames** (frame 0 and frame 1). 
+ +### Per-frame fields + +Frame 0: +```c +#define PLATFORM_OVERRIDE_TIMER_FRAME_NUM_0 0 +#define PLATFORM_OVERRIDE_TIMER_CNTBASE_0 0x2a830000 +#define PLATFORM_OVERRIDE_TIMER_CNTEL0BASE_0 0xFFFFFFFFFFFFFFFF +#define PLATFORM_OVERRIDE_TIMER_GSIV_0 0x6d +#define PLATFORM_OVERRIDE_TIMER_VIRT_GSIV_0 0x0 +``` + +Frame 1: +```c +#define PLATFORM_OVERRIDE_TIMER_FRAME_NUM_1 1 +#define PLATFORM_OVERRIDE_TIMER_CNTBASE_1 0x2a820000 +#define PLATFORM_OVERRIDE_TIMER_CNTEL0BASE_1 0xFFFFFFFFFFFFFFFF +#define PLATFORM_OVERRIDE_TIMER_GSIV_1 0x6c +#define PLATFORM_OVERRIDE_TIMER_VIRT_GSIV_1 0x0 +``` + +These map to GT Block Timer Structure fields: +- Frame number +- CntBaseX +- CntEL0BaseX +- physical timer GSIV +- virtual timer GSIV (0 if not present) + +### Flags packing (RD-N2 style) +RD-N2 packs flags into a combined macro: + +```c +#define PLATFORM_OVERRIDE_TIMER_PHY_FLAGS_0 0x0 +#define PLATFORM_OVERRIDE_TIMER_VIRT_FLAGS_0 0x0 +#define PLATFORM_OVERRIDE_TIMER_CMN_FLAGS_0 ((TIMER_IS_ALWAYS_ON_CAPABLE << 1) | (!TIMER_IS_SECURE << 0)) +#define PLATFORM_OVERRIDE_TIMER_FLAGS_0 ((PLATFORM_OVERRIDE_TIMER_CMN_FLAGS_0 << 16) | (PLATFORM_OVERRIDE_TIMER_VIRT_FLAGS_0 << 8) | (PLATFORM_OVERRIDE_TIMER_PHY_FLAGS_0)) +``` + +Interpretation of this packing commonly is: +- Bits [7:0] = Physical timer flags (mode/polarity) +- Bits [15:8] = Virtual timer flags (mode/polarity) +- Bits [31:16] = Common flags (secure + always-on) + +#### How to fill the per-frame timer flags +1. **Physical flags** (mode/polarity): + - Set bit0 = edge/level + - Set bit1 = active-low/high +2. **Virtual flags** (if implemented): + - same encoding; if virtual timer not present, keep GSIV = 0 and flags = 0 +3. 
**Common flags**: + - Secure bit: 1 for secure timers, 0 otherwise + - Always-on bit: 1 if guaranteed wake/interrupt in low-power states + +**Example decision table** +- Non-secure always-on physical timer frame: + - common: secure=0, always-on=1 +- Secure always-on frame: + - common: secure=1, always-on=1 + +--- + +## 6) Counter frequency + +```c +#define PLATFORM_BM_TIMER_CNTFRQ 0x5F5E100 +``` + +This is the system counter frequency (CNTFRQ) in Hz. +- Ensure it matches what firmware programs into CNTFRQ_EL0. +- ACS and bare-metal environments use this to validate timer behavior and intervals. + +--- + +## 7) Watchdog configuration (Arm Generic Watchdog) + +The watchdog structure provides: +- Refresh frame base +- Control frame base +- GSIV and flags per watchdog timer interrupt + +RD-N2 provides: + +```c +#define PLATFORM_OVERRIDE_WD_TIMER_COUNT 0x2 +#define PLATFORM_OVERRIDE_WD_REFRESH_BASE 0x2A450000 +#define PLATFORM_OVERRIDE_WD_CTRL_BASE 0x2A440000 +#define PLATFORM_OVERRIDE_WD_GSIV_0 0x6E +#define PLATFORM_OVERRIDE_WD_FLAGS_0 ((!WD_IS_SECURE << 2) | (WD_POLARITY << 1) | (WD_MODE << 0)) +#define PLATFORM_OVERRIDE_WD_GSIV_1 0x6F +#define PLATFORM_OVERRIDE_WD_FLAGS_1 ((WD_IS_SECURE << 2) | (WD_POLARITY << 1) | (WD_MODE << 0)) +``` + +### How to fill watchdog base addresses +- `WD_REFRESH_BASE` → RefreshFrame Physical Address +- `WD_CTRL_BASE` → WatchdogControlFrame Physical Address + +These are MMIO base addresses of the standard watchdog blocks. + +### How to fill watchdog GSIVs +- `WD_GSIV_n` = GSIV (SPI/PPI) used by that watchdog instance +- Many implementations use SPIs for watchdogs; confirm from the GIC interrupt map. 
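
One way to sanity-check a watchdog GSIV against the GIC interrupt map is to classify it by INTID range. The sketch below assumes the usual GIC ranges (SGIs 0-15, PPIs 16-31, SPIs 32-1019); the helper names `ex_gsiv_is_ppi` and `ex_gsiv_is_spi` are hypothetical and not part of the ACS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: classify a GSIV by GIC INTID range. */
static inline bool ex_gsiv_is_ppi(uint32_t gsiv)
{
    return gsiv >= 16u && gsiv <= 31u;
}

static inline bool ex_gsiv_is_spi(uint32_t gsiv)
{
    return gsiv >= 32u && gsiv <= 1019u;
}

/* The RD-N2 watchdog GSIVs quoted above (0x6E, 0x6F) fall in the SPI range. */
static_assert(0x6E >= 32 && 0x6F <= 1019, "RD-N2 watchdog GSIVs are SPIs");
```

The same check applied to the per-processor timer GSIVs (for example `0x1E`) should report a PPI instead, which is a quick way to catch swapped values between the two groups.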
+ +### How to fill watchdog flags +From the watchdog flag definition: +- bit0 = mode (1=edge, 0=level) +- bit1 = polarity (1=active low, 0=active high) +- bit2 = secure timer (1=secure, 0=non-secure) + +RD-N2 uses: +- `WD_MODE` → bit0 +- `WD_POLARITY` → bit1 +- `WD_IS_SECURE` → bit2 (note that `PLATFORM_OVERRIDE_WD_FLAGS_0` uses `!WD_IS_SECURE` while `PLATFORM_OVERRIDE_WD_FLAGS_1` uses `WD_IS_SECURE`, so the two instances get opposite secure bits; make sure your own macros encode what you intend) + +**Guidance** +- Unless you have a secure-world-only watchdog, most watchdogs used by the OS are **non-secure**. +- Keep the secure bit at 0 unless the watchdog is meant to be managed from a secure context (rare in standard OS deployments). + +--- + +## 8) Minimal template (fill-in) +Use this to start a new platform override quickly: + +```c +/* Per-processor timers */ +#define PLATFORM_OVERRIDE_NS_EL1_TIMER_GSIV +#define PLATFORM_OVERRIDE_VIRTUAL_TIMER_GSIV +#define PLATFORM_OVERRIDE_NS_EL2_TIMER_GSIV +#define PLATFORM_OVERRIDE_S_EL1_TIMER_GSIV +#define PLATFORM_OVERRIDE_EL2_VIR_TIMER_GSIV + +#define PLATFORM_OVERRIDE_NS_EL1_TIMER_FLAGS (( << 1) | ( << 0) | ( << 2)) +#define PLATFORM_OVERRIDE_VIRTUAL_TIMER_FLAGS (( << 1) | ( << 0) | ( << 2)) +#define PLATFORM_OVERRIDE_NS_EL2_TIMER_FLAGS (( << 1) | ( << 0) | ( << 2)) + +/* Counter control base */ +#define PLATFORM_OVERRIDE_TIMER_CNTCTL_BASE + +/* Platform GT frames */ +#define PLATFORM_OVERRIDE_TIMER_COUNT + +#define PLATFORM_OVERRIDE_TIMER_FRAME_NUM_0 0 +#define PLATFORM_OVERRIDE_TIMER_CNTBASE_0 +#define PLATFORM_OVERRIDE_TIMER_CNTEL0BASE_0 +#define PLATFORM_OVERRIDE_TIMER_GSIV_0 +#define PLATFORM_OVERRIDE_TIMER_VIRT_GSIV_0 +#define PLATFORM_OVERRIDE_TIMER_FLAGS_0 + +/* Watchdog */ +#define PLATFORM_OVERRIDE_WD_TIMER_COUNT +#define PLATFORM_OVERRIDE_WD_REFRESH_BASE +#define PLATFORM_OVERRIDE_WD_CTRL_BASE +#define PLATFORM_OVERRIDE_WD_GSIV_0 +#define PLATFORM_OVERRIDE_WD_FLAGS_0 (( << 2) | ( << 1) | ( << 0)) +``` diff --git a/docs/baremetal/porting-pal/platform-override-guides/uart.md b/docs/baremetal/porting-pal/platform-override-guides/uart.md new file mode 100644 index
00000000..7202f8c1 --- /dev/null +++ b/docs/baremetal/porting-pal/platform-override-guides/uart.md @@ -0,0 +1,307 @@ +# UART Platform Configuration Guide + +This document explains **how to fill the platform override macros** used by SBSA ACS-style firmware code to describe a **console UART** override table. + +--- + +## 1) What override is used for + +The override table tells the OS **which serial port** firmware used for early console / redirection, and **how to program it** (base address, interrupt routing, baud rate hints, etc.). + +If override is correct, you typically get: +- early boot logs on the same UART firmware uses, +- a reliable serial console even before a full-featured driver loads. + +If this table is wrong, you typically see: +- no output on expected UART, +- output but with garbage (wrong baud/clock), +- interrupts misrouted (if interrupt-driven console is used). + +--- + +## 2) Typical platform override mapping + +RD N2 sample macros: + +```c +#define UART_ADDRESS 0xF98DFE18 +#define BASE_ADDRESS_ADDRESS_SPACE_ID 0x0 +#define BASE_ADDRESS_REGISTER_BIT_WIDTH 0x20 +#define BASE_ADDRESS_REGISTER_BIT_OFFSET 0x0 +#define BASE_ADDRESS_ADDRESS_SIZE 0x3 +#define BASE_ADDRESS_ADDRESS 0x2A400000 +#define INTERFACE_TYPE 8 +#define UART_IRQ 0 +#define UART_BAUD_RATE 0x7 +#define UART_BAUD_RATE_BPS 115200 +#define UART_CLK_IN_HZ 24000000 +#define UART_GLOBAL_SYSTEM_INTERRUPT 0x70 +#define UART_PCI_DEVICE_ID 0xFFFF +#define UART_PCI_VENDOR_ID 0xFFFF +#define UART_PCI_BUS_NUMBER 0x0 +#define UART_PCI_DEV_NUMBER 0x0 +#define UART_PCI_FUNC_NUMBER 0x0 +#define UART_PCI_FLAGS 0x0 +#define UART_PCI_SEGMENT 0x0 +``` + +Conceptually this set covers: + +- **Base Address (GAS)** + → `BASE_ADDRESS_*` macros +- **Interface Type** + → `INTERFACE_TYPE` +- **Interrupt routing** + → `UART_IRQ` and `UART_GLOBAL_SYSTEM_INTERRUPT` (and the Interrupt Type mask in implementation) +- **Baud/clock fields** + → `UART_BAUD_RATE`, `UART_BAUD_RATE_BPS`, `UART_CLK_IN_HZ` +- **PCI 
identity fields (optional)** + → `UART_PCI_*` (usually all-FFFF / 0 for MMIO UART) + +> **Note:** `UART_ADDRESS` is often *not* a field itself; it is commonly used by platform code as the UART base to program/debug and may be used to derive GAS base address. Treat it as “platform’s UART register base” unless your codebase defines it differently. + +--- + +## 3) How to fill each field for a new platform + +### 3.1 `BASE_ADDRESS_*` — “Base Address” Generic Address Structure (GAS) + +The offset 40 provides a **12-byte GAS** describing where UART registers live. + +Your macros correspond to the GAS members: + +| GAS Field | Meaning | Platform Macro | +|---|---|---| +| Address Space ID | MMIO vs I/O space | `BASE_ADDRESS_ADDRESS_SPACE_ID` | +| Register Bit Width | register access width | `BASE_ADDRESS_REGISTER_BIT_WIDTH` | +| Register Bit Offset | bit offset within access | `BASE_ADDRESS_REGISTER_BIT_OFFSET` | +| Access Size | 1/2/3/4/… byte access | `BASE_ADDRESS_ADDRESS_SIZE` | +| Address | base address | `BASE_ADDRESS_ADDRESS` | + +#### (A) `BASE_ADDRESS_ADDRESS_SPACE_ID` +Use: +- `0x0` = **System Memory** (MMIO) → *most Arm SoCs* +- `0x1` = **System I/O** (x86 legacy IO ports like COM1 0x3F8) + +For almost all modern Arm platforms: **set `0x0`.** + +#### (B) `BASE_ADDRESS_ADDRESS` +This is the **UART register block base** used by firmware for console. + +How to obtain: +- SoC TRM / platform memory map (UART controller base) +- device tree used in pre-ACPI environments (serial node base) +- bootloader debug config (often “earlycon” address) + +**Must match the UART instance actually used for console.** + +#### (C) `BASE_ADDRESS_REGISTER_BIT_WIDTH` +This expresses the typical register access width the OS should use. + +Common choices: +- `0x20` (32-bit) for MMIO UARTs with 32-bit registers +- `0x08` (8-bit) for byte-register UARTs / 16550-like layouts + +**Rule of thumb:** pick the native register width of the UART register interface that firmware uses. 
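
To make the five GAS members in the table above concrete, the sketch below models the 12-byte structure (four one-byte fields followed by a 64-bit address) and fills it with the RD-N2 sample values from section 2. The type `ex_gas_t` and the function `ex_rdn2_uart_gas` are hypothetical names used only for illustration; the ACS sources may represent the structure differently.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of the 12-byte Generic Address Structure.
 * Packing is forced so the 64-bit address follows the four bytes directly. */
#pragma pack(push, 1)
typedef struct {
    uint8_t  address_space_id;    /* BASE_ADDRESS_ADDRESS_SPACE_ID    */
    uint8_t  register_bit_width;  /* BASE_ADDRESS_REGISTER_BIT_WIDTH  */
    uint8_t  register_bit_offset; /* BASE_ADDRESS_REGISTER_BIT_OFFSET */
    uint8_t  access_size;         /* BASE_ADDRESS_ADDRESS_SIZE        */
    uint64_t address;             /* BASE_ADDRESS_ADDRESS             */
} ex_gas_t;
#pragma pack(pop)

static_assert(sizeof(ex_gas_t) == 12, "GAS must be 12 bytes");

/* Hypothetical helper: GAS populated with the RD-N2 sample macro values. */
static inline ex_gas_t ex_rdn2_uart_gas(void)
{
    ex_gas_t gas = {
        .address_space_id    = 0x0,  /* system memory (MMIO) */
        .register_bit_width  = 0x20, /* 32-bit registers     */
        .register_bit_offset = 0x0,
        .access_size         = 0x3,  /* dword access         */
        .address             = 0x2A400000ULL,
    };
    return gas;
}
```

Keeping the size check next to the structure definition catches accidental padding if the field order or packing is changed later.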
+ +#### (D) `BASE_ADDRESS_REGISTER_BIT_OFFSET` +Usually `0x0`. + +Only change if the UART registers are not aligned at bit 0 within the access. That’s rare. + +#### (E) `BASE_ADDRESS_ADDRESS_SIZE` +This describes the access size encoding used by GAS (ACPI-defined): +- `0` = undefined +- `1` = byte access +- `2` = word (16-bit) +- `3` = dword (32-bit) +- `4` = qword (64-bit) + +For a 32-bit MMIO UART interface: **use `0x3`.** +For an 8-bit register interface: **use `0x1`.** + +--- + +### 3.2 `INTERFACE_TYPE` — SPCR “Interface Type” +The field at offset 36 selects the UART programming model. + +For revision 2+, it refers to **DBG2 Serial Port Subtypes** (Table 3 of DBG2 spec). The RD-N2 sample above sets `8`; before reusing any value, check it against that table: +- `0x0003` is the **Arm PL011 UART** subtype and `0x000E` is the **Arm SBSA Generic UART** subtype. + +How to fill for a new platform: +1. Identify your UART IP: + - ARM PL011? SBSA generic UART? 16550-compatible? vendor-specific? +2. Map it to the appropriate DBG2 serial port subtype value expected by your firmware/OS ecosystem. +3. Use that numeric value in `INTERFACE_TYPE`. + +**Pitfall:** If the value does not match the actual UART IP (for example, a PL011 value on a 16550-compatible UART), the OS may program it incorrectly. + +--- + +### 3.3 Interrupt routing fields + +SPCR describes interrupt usage via: +- Interrupt Type bitmask (IRQ vs GSI, and which controller model), +- IRQ number (8259 legacy only), +- Global System Interrupt (GSIV) for APIC/SAPIC/GIC/PLIC etc. + +Your override set includes: +- `UART_IRQ` +- `UART_GLOBAL_SYSTEM_INTERRUPT` + +#### (A) `UART_IRQ` +This is only meaningful if the platform uses a PC-AT 8259-style IRQ routing (legacy x86). + +For Arm server platforms: +- set `UART_IRQ = 0` (placeholder / unused) unless your code explicitly requires otherwise. + +#### (B) `UART_GLOBAL_SYSTEM_INTERRUPT` +This is the UART’s **GSIV** if you intend to advertise interrupt-driven console.
+ +How to obtain: +- From platform interrupt map (GIC SPI number used by UART) +- From device tree interrupt spec (SPI ID) +- From GIC integration documentation + +**Important constraints for Arm GIC in SPCR:** +- GSIV must **not** be in `{0..31}` (SGI/PPI range) +- and must **not** be in `{1056..1119}` (reserved/forbidden range as noted) + +In practice, UART console interrupt is usually an **SPI** ≥ 32. + +#### (C) Interrupt Type bitmask (often inside platform code) +You didn’t show a macro for “Interrupt Type”, but your platform code likely sets it based on architecture. + +For an Arm GIC-based system: +- set the **ARM GIC** bit (typically Bit[3]) in Interrupt Type. +- If your implementation supports polled console only, set Interrupt Type = 0. + +**Recommended:** If your firmware/OS uses polling for early console, it is acceptable to set Interrupt Type = 0 and rely on base address + interface type. + +--- + +### 3.4 Baud rate + UART clock fields + +Provides: +- Configured Baud Rate (enumerated) +- Precise Baud Rate (exact) +- UART Clock Frequency (revision-dependent) + +Your macros include: +- `UART_BAUD_RATE` +- `UART_BAUD_RATE_BPS` +- `UART_CLK_IN_HZ` + +#### (A) `UART_BAUD_RATE` +This is the “Configured Baud Rate” enumerated field. + +Common values: +- `0x7` = 115200 +- `0x6` = 57600 +- `0x4` = 19200 +- `0x3` = 9600 +- `0x0` = “as-is” (OS assumes UART already configured by firmware) + +**Guidance for new platform:** +- If firmware programs UART to a standard baud and you want OS to keep it: set `UART_BAUD_RATE = 0` (“as-is”). +- If you want to explicitly declare 115200: set `UART_BAUD_RATE = 0x7`. + +#### (B) `UART_BAUD_RATE_BPS` +This is typically a *platform helper macro* (not a raw field) used by code to program the UART or to fill “Precise Baud Rate”. + +- If your implementation uses “Precise Baud Rate”, put the exact integer rate here (e.g., `115200`). +- If it does not, still keep it consistent with `UART_BAUD_RATE`. 
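
Keeping `UART_BAUD_RATE` and `UART_BAUD_RATE_BPS` consistent can be done with a small lookup such as the sketch below, which maps a rate in bits per second to the enumerated values listed in (A). The function name `ex_spcr_baud_code` is hypothetical; rates that are not in the list fall back to `0x0` ("as-is").

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a baud rate in bps to the enumerated
 * "Configured Baud Rate" values listed in this guide. */
static inline uint8_t ex_spcr_baud_code(uint32_t bps)
{
    switch (bps) {
    case 9600:   return 0x3;
    case 19200:  return 0x4;
    case 57600:  return 0x6;
    case 115200: return 0x7;
    default:     return 0x0; /* "as-is": OS keeps the firmware setting */
    }
}

/* The RD-N2 sample pairs UART_BAUD_RATE 0x7 with UART_BAUD_RATE_BPS 115200. */
static_assert(115200 == 0x1C200, "115200 bps in hex, for reference");
```

A build-time or boot-time comparison of `ex_spcr_baud_code(UART_BAUD_RATE_BPS)` against `UART_BAUD_RATE` would flag a mismatch, except when `UART_BAUD_RATE` is deliberately left at `0x0`.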
+ +#### (C) `UART_CLK_IN_HZ` +For revision 3+, UART clock frequency can be supplied (in Hz) if known. + +Set to: +- the UART reference clock used for divisor generation (e.g., `24000000` for 24 MHz), +- or `0` if indeterminate and you do not want to specify it. + +How to obtain: +- SoC clock tree documentation +- firmware clock configuration for that UART instance +- device tree `clock-frequency` property (if known correct) + +**Pitfall:** Wrong UART clock leads to wrong divisor calculations → garbled output. + +--- + +### 3.5 PCI identity fields (only if UART is a PCI function) + +Table allows describing a UART that lives behind PCI: +- `UART_PCI_DEVICE_ID`, `UART_PCI_VENDOR_ID` +- BDF: `UART_PCI_BUS_NUMBER`, `UART_PCI_DEV_NUMBER`, `UART_PCI_FUNC_NUMBER` +- `UART_PCI_FLAGS` +- `UART_PCI_SEGMENT` + +For MMIO UART (SoC-integrated): +- set `UART_PCI_DEVICE_ID = 0xFFFF` +- set `UART_PCI_VENDOR_ID = 0xFFFF` +- set bus/dev/func = 0 +- set flags = 0 +- segment = 0 + +For PCI UART: +- fill in Vendor/Device ID from PCI config space, +- fill BDF of the UART function, +- segment is your PCI segment (0 for most systems). + +**Rule:** If it is not a PCI device, **Vendor/Device must be `0xFFFF`.** + +--- + +## 4) Minimal checklist for a new platform + +Use this as a bring-up checklist: + +1. **Pick the console UART instance** + - Confirm which UART firmware uses for console (UEFI debug / early prints). +2. **Base Address GAS** + - AddressSpaceID = MMIO (0) + - Address = UART base + - AccessSize matches register width +3. **Interface Type** + - Match actual UART IP (PL011 vs 16550 vs other) +4. **Interrupt fields** + - If polling console: InterruptType=0 (in table build code), GSIV optional + - If interrupt-driven: set GSIV to UART SPI (>= 32) +5. **Baud + Clock** + - Either “as-is” OR declare 115200 explicitly + - If providing clock, ensure the real UART functional clock frequency +6. 
**PCI fields** + - MMIO UART → Vendor/Device = 0xFFFF + +--- + +## 5) Example template to adapt for a new platform + +```c +/* SPCR / UART console platform config */ + +#define BASE_ADDRESS_ADDRESS_SPACE_ID 0x0 /* 0=MMIO */ +#define BASE_ADDRESS_REGISTER_BIT_WIDTH 0x20 /* 0x20 for 32-bit regs, 0x08 for 8-bit regs */ +#define BASE_ADDRESS_REGISTER_BIT_OFFSET 0x0 +#define BASE_ADDRESS_ADDRESS_SIZE 0x3 /* 3=dword access */ +#define BASE_ADDRESS_ADDRESS 0x /* e.g., 0x2A400000 */ + +#define INTERFACE_TYPE /* e.g., 8 for PL011 */ + +#define UART_IRQ 0x0 /* usually unused on Arm */ +#define UART_GLOBAL_SYSTEM_INTERRUPT 0x /* must be >= 32 if used */ + +#define UART_BAUD_RATE 0x0 /* 0=as-is, or 0x7=115200 */ +#define UART_BAUD_RATE_BPS 115200 /* if used by your build */ +#define UART_CLK_IN_HZ 0 /* or if known */ + +#define UART_PCI_DEVICE_ID 0xFFFF /* non-PCI UART */ +#define UART_PCI_VENDOR_ID 0xFFFF +#define UART_PCI_BUS_NUMBER 0x0 +#define UART_PCI_DEV_NUMBER 0x0 +#define UART_PCI_FUNC_NUMBER 0x0 +#define UART_PCI_FLAGS 0x0 +#define UART_PCI_SEGMENT 0x0 +``` + +---
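
Two of the checklist items lend themselves to a mechanical check before the table is built: the GSIV range constraint from section 3.3 and the agreement between register bit width and access size from section 3.1. The sketch below is one possible form of such checks; `ex_spcr_gsiv_ok` and `ex_gas_width_matches` are hypothetical helpers, not ACS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: the GSIV avoids the ranges this guide lists as
 * not allowed for an Arm GIC (0..31 and 1056..1119). */
static inline bool ex_spcr_gsiv_ok(uint32_t gsiv)
{
    bool in_sgi_ppi  = gsiv <= 31u;
    bool in_excluded = gsiv >= 1056u && gsiv <= 1119u;

    return !in_sgi_ppi && !in_excluded;
}

/* Hypothetical check: the access-size encoding n (1..4) stands for an
 * access of 8 << (n - 1) bits, which should equal the register bit width. */
static inline bool ex_gas_width_matches(uint8_t bit_width, uint8_t access_size)
{
    if (access_size < 1u || access_size > 4u) {
        return false;
    }
    return bit_width == (8u << (access_size - 1u));
}

/* RD-N2 sample: GSIV 0x70 is outside both excluded ranges. */
static_assert(0x70 > 31 && 0x70 < 1056, "RD-N2 UART GSIV is in the allowed range");
```

With the RD-N2 sample values, `ex_spcr_gsiv_ok(0x70)` and `ex_gas_width_matches(0x20, 0x3)` both hold; a platform that sets bit width `0x20` with access size `0x1` would be flagged.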