Implement Diagnostic Fault Library with basic DFM and SOVD interface #5
bburda42dot wants to merge 9 commits into eclipse-opensovd:main
Conversation
Migrate from single-crate layout to multi-crate workspace with Bazel 8.3 + Cargo dual build system. Add xtask runner for common development commands.
IPC-safe types (IpcDuration, IpcTimestamp), fault descriptors, catalog configuration, debounce/enabling condition config, query protocol definitions, and iceoryx2 service types.
Fault reporter API, IPC worker with exponential backoff retry, fault catalog validation, enabling condition management, and FaultManagerSink for iceoryx2 transport.
SOVD-compliant fault manager with KVS persistent storage, aging manager, operation cycle tracking, fault record processor, and query server with iceoryx2 IPC transport.
E2E tests covering lifecycle transitions, debounce/aging/cycles, persistent storage, concurrent access, boundary values, error paths, multi-catalog, JSON catalog loading, IPC query/clear, and report-and-query flow.
Workflows: build/test, clippy lint, rustfmt, miri, coverage, copyright header check, cargo audit (pinned to SHA), Bazel format check. All workflows set permissions: contents: read.
…rence Architecture overview, fault catalog/reporter/DFM sequence diagrams, library architecture drawing, Sphinx docs scaffold, and HVAC component design reference example.
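The exponential backoff retry mentioned above for the IPC worker can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `BackoffPolicy`, its fields, and the doubling-with-cap rule are assumptions made for the example.

```rust
use std::time::Duration;

/// Hypothetical retry policy; names and semantics are illustrative only.
#[derive(Debug, Clone, Copy)]
pub struct BackoffPolicy {
    /// Delay before the first retry.
    pub base: Duration,
    /// Upper bound for any single delay.
    pub max: Duration,
}

impl BackoffPolicy {
    /// Delay before retry number `attempt` (0-based):
    /// `base * 2^attempt`, capped at `max`.
    pub fn delay(&self, attempt: u32) -> Duration {
        // checked_pow / checked_mul return None on overflow, in which
        // case the cap is used instead of panicking or wrapping.
        2u32.checked_pow(attempt)
            .and_then(|factor| self.base.checked_mul(factor))
            .map_or(self.max, |d| d.min(self.max))
    }
}
```

With `base` = 10 ms and `max` = 1 s, the delays grow 10 ms, 20 ms, 40 ms, ... until they reach the 1 s cap. Capping keeps a long outage of the receiving side from producing unbounded waits.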
@bburda42dot Just wanted to know: why was this PR not started on top of the initial commit in #4 from Qorix, but instead started from scratch with all the files moved here, when it says it is a continuation of #4?
@vinodreddy-g I did start on top of Qorix's initial commit from #4 - this PR is a direct continuation of that work. On top of the original ~4.9k lines, I added 63 commits (21k+ lines added, ~800 removed) with significant changes and improvements. The resulting 64-commit history was hard to review as-is, so before opening this PR I squashed them all into a cleaner, logically grouped commit history specifically to enable commit-by-commit review. That squash is why the git history may look like it was started from scratch, but the code lineage traces directly back to #4. If proper attribution is important to you, feel free to point out which parts of the current code originate from the original PR and I can add Co-Authored-By to the relevant commits.
@bburda42dot OK, so you split and changed the initial commit for easier review and, of course, added a lot of changes. Could you also update the design in the svg/puml files so that the changes since #4 are easy to follow?
Regarding 4: we should start with what we have now (iceoryx2); later we can evaluate the migration to mw::com. For the artifacts, a potential next step (not now) could be using sphinx-needs.
@vinodreddy-g Thanks for the detailed questions. These changes weren't discussed in OpenSOVD architecture meetings - they follow from the design doc requirements and the code review feedback on #4. Happy to discuss any of them in the next Architecture meeting if needed. I've updated all diagrams in the latest force-push, so you can follow the design changes visually. Here's the breakdown:
1. Fault catalog ( Core idea is the same - builder pattern, SHA-256 hash verification with DFM, decentral catalogs. Main change: the original diagram had an
2. Interfaces between lib and DFM ( That diagram showed I removed On DFM side: I added
3. Fault doesn't exist in catalog ( The #4 code had
4. iceoryx2 vs mw::com Agreed with @FScholPer - iceoryx2 for now, evaluate mw::com migration later. The transport is now isolated behind traits on both sides (
5. Diagram updates All diagrams are now up to date in
lh-sag left a comment
I have stopped reviewing for now.
@bburda42dot will you wait until you have received enough feedback, or do you want to fix the mentioned issues asap?
.github/ISSUE_TEMPLATE/bug_fix.md
Outdated
@@ -0,0 +1,11 @@
---
You override the organization wide templates defined here https://github.com/eclipse-opensovd/.github
Is this intended? I would rather have a consistent view.
Didn't know about it; I took inspiration from the S-CORE module template. Will fix it.
Fixed, removed the repo-level issue templates entirely.
.github/workflows/coverage.yml
Outdated
- name: Install cargo-llvm-cov
  uses: taiki-e/install-action@cargo-llvm-cov

- name: Cache cargo registry & target
Could you please use https://github.com/Swatinem/rust-cache instead of manually rolling out rust caching?
You might also consider https://github.com/actions-rust-lang/setup-rust-toolchain which picks up the rust version from the rust-toolchain avoiding to maintain multiple versions across CI and local builds. And comes with builtin caching support.
Fixed, replaced dtolnay/rust-toolchain + actions/cache with actions-rust-lang/setup-rust-toolchain@v1 in all workflow files.
@@ -0,0 +1,37 @@
# *******************************************************************************
# Copyright (c) 2025 Contributors to the Eclipse Foundation
There are references to 2025 in the copyright header. Please use 2026 since this is a new project.
Right, some files came from #4, so I did not change the year, but I will fix it.
Ah right, I couldn't change the copyright year in the Bazel BUILD files, because the S-CORE Bazel copyright check did not pass with 2026 when I ran it in CI.
Yup, setting 2026 in the Bazel files won't work because score_cr_checker does not allow it.
Fixed all non-Bazel files to 2026.
@@ -0,0 +1,221 @@
/*
 * Copyright (c) 2026 The Contributors to Eclipse OpenSOVD (see CONTRIBUTORS)
You have a mixture of various copyright banners.
I would prefer this one everywhere:
SPDX-FileCopyrightText: Copyright (c) 2026 Contributors to the Eclipse Foundation
SPDX-License-Identifier: Apache-2.0
But please double check.
I will fix the copyright headers in all new files to follow the correct format and use the year 2026.
Fixed, unified the header to the standard format used across all .rs files.
@lh-sag I think we will end up with some follow-up issues anyway, but it would be great to get enough feedback before merging. The only question is "how much feedback is enough?" :)
- Remove repo-level issue templates (.github/ISSUE_TEMPLATE/) to use org-wide templates from eclipse-opensovd/.github
- Replace dtolnay/rust-toolchain + actions/cache with actions-rust-lang/setup-rust-toolchain@v1 (reads rust-toolchain.toml, built-in caching) in all 5 workflow files
- Fix copyright year to 2026 in lint.yml, .gitignore, docs/conf.py, docs/index.rst, docs/design/design.md
- Unify copyright banner in hvac_component_design_reference.rs to match the xtask-enforced format used in all .rs files
We can decide in next week's architecture meeting how to proceed. I will try to provide feedback incrementally whenever I have some spare time.
.github/workflows/build_test.yml
Outdated
  uses: actions-rust-lang/setup-rust-toolchain@v1

- name: Build all crates
  run: cargo build --workspace
Could you please add --locked to all cargo commands that support it?
This will speed up the build pipeline and allow for better reproducibility.
.github/workflows/copyright.yml
Outdated
# SPDX-License-Identifier: Apache-2.0
# *******************************************************************************

name: Copyright Check
suggestion: lint, coverage and copyright are available in https://github.com/eclipse-opensovd/cicd-workflows which should be re-used and extended if possible, so we don't re-invent pipelines for all repos.
It would be great to add missing things to the cicd repo instead, so all opensovd projects can profit.
Just saw that this is using bazel for copyright check. While I do get why bazel is used in the fault lib, I still believe it makes sense to align the CI approaches in opensovd.
env:
  COVERAGE_THRESHOLD: 90

jobs:
suggestion: coverage isn't available in https://github.com/eclipse-opensovd/cicd-workflows yet. Would be great to move it there, so it can be shared among all repos.
@@ -0,0 +1,344 @@
// Copyright (c) 2026 Contributors to the Eclipse Foundation
question: the copyright header differs from the ones in cda and proxy. IMHO it would be a good idea to align them. (This would come for free when re-using the ci/cd repo.)
unsafe_code = "forbid"

[workspace.lints.clippy]
todo = "deny"
suggestion: Again a CI nag: a couple of weeks ago, the architecture group aligned on common linting rules: https://github.com/eclipse-opensovd/cicd-workflows/tree/main/shared-lints
For example, pedantic and index_slicing are missing here.
Adding todo to the shared rules makes sense IMHO.
The std_instead_of_* lints can be kept here, as they mostly make sense for the fault lib.
Cargo.toml
Outdated
env_logger = "0.11.8"
iceoryx2 = { git = "https://github.com/eclipse-iceoryx/iceoryx2.git", rev = "eba5da4b8d8cb03bccf1394d88a05e31f58838dc" }
iceoryx2-bb-container = { git = "https://github.com/eclipse-iceoryx/iceoryx2.git", rev = "eba5da4b8d8cb03bccf1394d88a05e31f58838dc" }
log = "0.4.22"
question: not aligned yet, but does it make sense to use 'tracing' everywhere instead of 'log'? For example, integration into DLT is currently only available via 'tracing', which is probably interesting for the fault lib as well.
rustfmt.toml
Outdated
# check configuration fields here: https://rust-lang.github.io/rustfmt/?version=v1.6.0&search=

tab_spaces = 4
suggestion: It would be cool to make this common as well, i.e. cda only uses max_width = 100 and no tab_spaces. Maybe worth discussing in the next architecture round.
Also, the newline at EOF is missing. (Last CI nag, I promise: the ci/cd workflows also check for that.)
create IpcWorker
FaultLib -> IpcWorker : spawn thread + create\niceoryx2 publisher ("dfm/event") +\nEC notification subscriber
FaultLib -> FaultLib : create iceoryx2 subscriber\n("dfm/event/hash/response")
@bburda42dot So currently, fault lib fully relies on iceoryx2, is that correct? But when integrating OpenSOVD into S-CORE, we would have to use S-CORE's LoLa implementation instead. Has that already been considered in the current design?
@stmuench No, fault-lib does not fully rely on iceoryx2. The design already anticipates alternative IPC backends.
There are three abstraction traits that decouple the core logic from the transport layer:
- FaultSinkApi (reporter/app side): Abstracts how fault events are sent to the DFM. The production implementation uses iceoryx2, but any backend can implement this trait. The docs even explicitly note: “Implementations can be S-CORE IPC.”
- DfmTransport (DFM run-loop side): Abstracts how the DFM receives events and publishes responses. The default is Iceoryx2Transport, but the trait is designed for pluggable backends (e.g., “implementations can use shared-memory IPC, in-process channels, or any other messaging backend”).
- DfmQueryApi (query/clear side): Abstracts SOVD fault query/clear, with both in-process and IPC implementations.
Integrating S-CORE’s LoLa would therefore mean providing new implementations of these three traits. The core fault-management logic (catalogs, debouncing, aging, SOVD state machine) would not need to change.
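To illustrate the idea of hiding the transport behind a trait, here is a minimal sketch. It does not reproduce the actual FaultSinkApi, DfmTransport, or DfmQueryApi signatures: `FaultSink`, `FaultEvent`, `SinkError`, `ChannelSink`, and `report_fault` are hypothetical names, and a bounded std channel stands in for a real IPC backend such as iceoryx2 or LoLa.

```rust
use std::sync::mpsc::{self, Receiver, SyncSender, TrySendError};

/// Illustrative fault event; the real types in the PR are richer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FaultEvent {
    pub fault_id: u32,
    pub active: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SinkError {
    QueueFull,
    Disconnected,
}

/// Hypothetical stand-in for a reporter-side sink trait.
pub trait FaultSink {
    fn publish(&self, event: FaultEvent) -> Result<(), SinkError>;
}

/// In-process backend built on a bounded std channel.
pub struct ChannelSink {
    tx: SyncSender<FaultEvent>,
}

impl ChannelSink {
    /// Returns the sink plus the receiving end a consumer would poll.
    pub fn new(capacity: usize) -> (Self, Receiver<FaultEvent>) {
        let (tx, rx) = mpsc::sync_channel(capacity);
        (Self { tx }, rx)
    }
}

impl FaultSink for ChannelSink {
    fn publish(&self, event: FaultEvent) -> Result<(), SinkError> {
        // try_send never blocks; a full or closed channel becomes an error.
        self.tx.try_send(event).map_err(|e| match e {
            TrySendError::Full(_) => SinkError::QueueFull,
            TrySendError::Disconnected(_) => SinkError::Disconnected,
        })
    }
}

/// Core logic depends only on the trait, not on a concrete transport.
pub fn report_fault(sink: &dyn FaultSink, fault_id: u32) -> Result<(), SinkError> {
    sink.publish(FaultEvent { fault_id, active: true })
}
```

Swapping the backend then means adding another `impl FaultSink for ...` block; `report_fault` and its callers stay unchanged, which is the property the reply above describes for the real traits.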
… feedback
- Reformat codebase with max_width=100 (aligned with CDA)
- Enable clippy pedantic lints from shared-lints baseline
- Add clippy.toml (too-many-lines-threshold=130, allow-unwrap-in-tests)
- Fix clippy warnings: wildcard imports, redundant closures, casts, literals
- Migrate log/env_logger to tracing/tracing-subscriber
- Replace lint.yml, format.yml, copyright.yml with pr-checks.yml using cicd-workflows rust-lint-and-format-action (dual nightly: pinned strict + latest advisory)
- Add pre-commit.yaml using cicd-workflows pre-commit-action
- Migrate build_test.yml, coverage.yml, miri.yml to cicd-workflows reusable workflows
- Add --locked to all CI cargo commands
- All workflows reference eclipse-opensovd/cicd-workflows@main
- Remove repo-level PR templates (use org-wide from eclipse-opensovd/.github)
jobs:
  build-and-test:
    uses: eclipse-opensovd/cicd-workflows/.github/workflows/build-and-test.yml@main
nitpick: I would recommend pinning this to a SHA instead, to prevent accidentally breaking builds here when something incompatible is merged to main in cicd (applies to all @main references).
Summary
Complete implementation of the Diagnostic Fault Library - a Rust library for managing diagnostic fault reporting, processing, and querying in Software-Defined Vehicles. Replaces the initial scaffold (src/lib.rs, api.rs, catalog.rs, etc.) with a production-grade multi-crate workspace aligned with the S-CORE module template.
What changed
Architecture - multi-crate workspace
- common - shared types: FaultId, FaultRecord, FaultCatalog, DebounceMode, IPC service types, compliance tags
- fault_lib - reporter-side API: Reporter with debounce filtering, enabling-condition guards, IpcWorker with retry queue (exponential backoff), LogHook observability, FaultManagerSink
- dfm_lib - Diagnostic Fault Manager: FaultRecordProcessor, AgingManager, SovdFaultManager with KVS-backed storage, EnablingConditionRegistry, OperationCycle provider abstraction
- xtask crate for developer automation
- src/lib.rs, src/api.rs, src/model.rs, src/catalog.rs, src/config.rs, src/ids.rs, src/sink.rs, src/utils.rs)
Features
- CountWithinWindow, HoldTime, EdgeWithCooldown, CountThreshold modes
- FaultId variant support (Numeric/Text/Uuid)
- Box::leak with Cow<str>, bounded channels
Safety & quality
- #[deny(clippy::unwrap_used)] enforced in runtime code - all todo!(), expect(), and unwrap() replaced with proper error handling
- TODO comments replaced with documented error paths
- tests/integration/) covering lifecycle transitions, multi-catalog scenarios, persistent storage, and report-query flows
Project structure alignment
- .bazelrc, MODULE.bazel, BUILD files for Bazel 8 support
- .vscode/settings.json and extensions.json for development environment
- .ruff.toml, .yamlfmt, rustfmt.toml for formatting consistency
- README.md with architecture overview, getting started, and examples
Checklist
Related
This work is a continuation of #4
Notes for Reviewers
Code is quite large, so it is better to review commit by commit. I split them into categories: "common", "fault-lib", "dfm" etc.
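For reviewers unfamiliar with the debounce modes listed under Features, the following is a rough sketch of how a CountWithinWindow-style filter could behave. It is not taken from the PR: the struct layout, the `on_event` method, and the rule "confirm once `threshold` occurrences fall within a sliding `window`" are assumptions made for illustration, and the actual semantics in the common crate may differ.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical count-within-window debouncer (illustrative only).
pub struct CountWithinWindow {
    threshold: usize,
    window: Duration,
    hits: VecDeque<Instant>,
}

impl CountWithinWindow {
    pub fn new(threshold: usize, window: Duration) -> Self {
        Self { threshold, window, hits: VecDeque::new() }
    }

    /// Records one occurrence at `now` and returns true once at least
    /// `threshold` occurrences lie inside the sliding window.
    /// Timestamps are expected in non-decreasing order.
    pub fn on_event(&mut self, now: Instant) -> bool {
        // Discard occurrences that have fallen out of the window.
        while let Some(&oldest) = self.hits.front() {
            if now.duration_since(oldest) > self.window {
                self.hits.pop_front();
            } else {
                break;
            }
        }
        self.hits.push_back(now);
        self.hits.len() >= self.threshold
    }
}
```

Passing the timestamp in explicitly, instead of calling `Instant::now()` inside the filter, keeps the logic deterministic and easy to unit-test.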