Update trustable tenet TT-EXPECTATIONS #22
ThomasClausnitzer wants to merge 2 commits into eclipse-score:main
This commit ports the content of score-json/nlohmann_json (#24) onto a fresh upstream/main baseline.

Why this is one commit:
- The downstream and upstream histories diverged significantly.
- Replaying the full original chain would cause many unrelated conflicts.
- This keeps the upstream PR diff clean and reviewable while preserving provenance below.

Combined original commits from downstream branch:
- 18e1f0d Halnasri resolve tt confidence feedback (#21)
- caefeae Resolve TT-CONSTRUCTION Feedback (#23)
- 28c4ccc Erikhu1 add missing links (#25)
- 748b55a Resolve TT-PROVENANCE Feedback (#14)
- df946d2 halnasri-Revisit TT-RESULTS (#17)
- 6ec3d20 added TA-Releases -> JLS-53 link (#27)
- ad3d3e9 Erikhu1 sync with prod (#31)
- 2ddacb3 Removed multiple validators from statements by splitting them up (#35)
- 849f855 Erikhu1 sync with prod (#38)
- 8f0d8cd Resolve TA-Analysis comments (#39)
- afccbfb started with TA-Misbehaviours
- 3ba9098 Filled out Checklist for TA-Behaviours
- 8c9508f filled out checklist for TA-Misbehaviours
- df823a3 added additional context information
- b6a6227 added checklist answers for TA-Behaviours and TA-Constraints
- c11a978 cosmetic changes to TA-BEHAVIOURS
- 693adb1 small updates to TA-Misbehaviours
- 544f6d0 created JLS-54 to JLS-60
- 20ff2b7 cosmetic change to JLS-54
- 379c542 update
- 65bc51c improved checklist and evidence for TA-Behaviour
- e181ee7 worked on TA-Misbehaviours
- bf67eec added JLS-70 and JLS-71
- 1baed11 added JLS-72
- b5895e7 Worked on Context file of TA-Constraints
- 603a563 worked on misbehaviours context file
- 225e46a halnasri-Revisit_TT_INDICATORS (#28)
- 03d7107 remove references to parent workflow for JLS-54 and JLS-55
- a5a6349 Added TODOs/Comments to TA-Behaviours and TA-Constraints Context files
- c633504 created JLS-56
- 4d0ded3 resolved TODOs in TA-Behaviours
- 3ce7f40 edited JLS 71
- 70cfb5f JLS-70 and JLS-72
- c19d526 documented misunderstandings
- 14dc8ad added https validator to JLS 70 and 71
- e153c07 added JLS-73
- 08ae742 small fix
- ff1df9e added references to JLS 24 and 31
- 1282cee edited zthe checklist of TA-Misbehaviour
- 025d035 added https validator to JLS-24
- 1695462 typos
- 00a8c81 fix pr count gate and coverage gate and add check_artifact_exists evidence to statements JLS-54 and JLS-55
- e9f9109 edited artifact output for the coverage gate
- ee4b7a7 version control for PR and coverage gate workflow
- b919c7d fix file naming in workflows
- af24784 fix answers to the checklist of TA-CONSTRAINTS
- 4f34209 edited JLS 72
- b0faaaa edited references for JLS 56
- fea0fd7 TA-constraints edited checklist
- f8e2657 set pr count gate to 15 open PRs
- 38dd6fa small fixes
- ef31e3d exception handling tests evidence
- c82acbc TA-Misbehaviours stress tests
- fb306b2 TA-Misbehaviours list of misbehaviours
- f78941e edited answer about risk analyses in TA-INDICATORS
- b8c9aa8 added JLS-76 and comments to TT-Misbehaviours
- 17cf293 removed JLS-69, its link and replace context answer with JLS-11.
- 5650e92 replaced JLS-69 by JLS-11 in TA-Indicators context file
- 40bb11a added link of TA-Misbehaviours to JLS-11
- d635790 added explanation of why there are no incentives to manipulate information
- 6b0a4f1 added answer for undiscovered expectations
- b0bedd9 answered new expectation identification question
- f3ba14f added test data to answer
- cbc4546 test repo renaming effects
- 2cdda56 filled out remaining answer in TA-Constraints
- 02a2ca6 answered to TA-Behaviours
- 995183d answered result evaluation question in TA-Misbehaviours
- 2ff3f10 added answer to fault induction test misbehaviour
- c2fd098 answered ta-misbehaviour question, added evidence to JLS-76
- 80f4108 add risk analysis
- daa774b small fixes
- 438687e rewrite steps overview
- 1e8025f replace - with comma in TA-Behaviours context file
- 68e8252 Change phrasing of JLS-24
- 4c1ff69 changed risk analysis AOU-07 formulation
- f4e3fec delete unnecessary section in risk analysis
- 22d926b deleted unnecessary section in risk analysis
- f202d6b changed naming of first risk analysis step
- 840152b changed referrals to risk_analysis in ta-indicators and ta-misbehaviours
- 67900e5 changed links from legacy gitlab TSF documentation to new ecplise TSF documentation.
- 440a1ea Added a control structure diagram both as drawio/png and embedded it in the risk_analysis.
- f5a0021 Change risk_analysis headers and structure to fit expected steps of RAFIA STPA procedure.
- 398a02c Remade part 4 Unsafe Control Actions to fit review.
- 8fa7394 Add step 5 Controller Constraints.
- 802ec9d Add step 6 Control loops to risk_analysis
- 8975d49 Add step 8 Causal Scenario Constrains to risk analysis
- 59f4e69 Add step 10 Review of STPA results to risk analysis.
- 780b797 Reworked step 9 Misbehaviours and Expectations in risk analysis to be in line with the STPA review guide.
- 0a820ce Update markdown tables of risk_analysis to follow column names and order of STPA results schema in TSF documentation.
- bb554ea Add additional Controller Functional Constraints under 5) Device Cotroller Constraints for risk_analysis to fulfill STPA review requirements.
- 38fc8fa Updated step 7) Causal Scenarios to follow review guidelines in risk_analysis
- ce03617 TA-Constraints: add AOU-31 resource/time budget assumption
- c7d94f3 Removed typos and irregularities for review changes.
- 5c0eb10 Replace legacy UCA IDs with current combined IDs as in review.
- 48d149d Changed minor remarks from review.
- 1bd7fa3 Changed failure description in 6) Control Loops and Sequences in risk_analysis
- dfc804d Changed minor errors in STPA diagram.
aschemmel-tech left a comment
First set of comments inline (these can already be used for improvements).
Open: check the STPA risk analysis
| @@ -1,4 +1,4 @@ | |||
| #### Checklist for TA-METHODOLOGIES from [Codethink](https://codethinklabs.gitlab.io/trustable/trustable/print_page.html) | |||
| #### Checklist for TA-METHODOLOGIES from [Codethink](https://pages.eclipse.dev/eclipse/tsf/tsfprint_page.html) | |||
typo: missing "/" between tsf and print_page
| - **Answer**: The list of Expectations has been collected amongst the stakeholders in S-CORE, so we are very confident that the list is comprehensive. | ||
| - Could some participants have incentives to manipulate information? | ||
| - **Answer**: | ||
| - **Answer**: We consider intentional manipulation of information about nlohmann/json to be very unlikely because the library is open source, has no direct revenue or certification attached to this documentation, and all stakeholders share a strong interest in correctness and robustness. Any misrepresentation of expectations or verification would quickly become counterproductive by increasing integration risk, maintenance cost, and reputational damage for the participants. In addition, the requirements, code and history are publicly visible and version-controlled, so inconsistencies can be detected and challenged by other S-CORE stakeholders or the wider community. While unintentional errors are always possible, we see no realistic positive incentive, and several strong negative incentives, for deliberately manipulating this information. |
It is not clear to me where in the generated report (e.g. https://eclipse-score.github.io/nlohmann_json/main/generated/TA.html) this information can be found. Some of the answers also differ from those in TA-BEHAVIOURS-CHECKLIST.md (duplicated information?).
The same applies to the other *CONTEXT.md files.
| - **Answer**: We are very confident that this list covers all critical requirements. | ||
| - How comprehensive is the list of tests? | ||
| - **Answer**: | ||
| - **Answer**: Currently, the branch coverage is 93.865% and the line coverage is 99.186%, cf. JLS-27. Therefore, we deem the list of tests to be very comprehensive. |
Do we have any analysis of the causes of the missing structural coverage? This would also support the argument for the question "Are there any Expectations where we believe more coverage would help?"
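One possible starting point for such an analysis is to list the branches that were never taken, grouped by source file, so that each gap can be reviewed and given a cause. The sketch below assumes the coverage data is available as an lcov-style tracefile (`SF:`, `BRDA:` and `end_of_record` records); whether the coverage gate in this repository produces that format is an assumption, and the file paths in `SAMPLE` are made up for illustration.

```python
from collections import defaultdict


def uncovered_branches(tracefile_text):
    """Return {source_file: [(line, block, branch), ...]} for branches never taken.

    Assumes lcov-style records: 'SF:<path>' opens a file section,
    'BRDA:<line>,<block>,<branch>,<taken>' describes one branch, and
    'end_of_record' closes the section. <taken> is '-' when the enclosing
    block was never executed, otherwise the number of times the branch was taken.
    """
    missing = defaultdict(list)
    current = None
    for raw in tracefile_text.splitlines():
        record = raw.strip()
        if record.startswith("SF:"):
            current = record[3:]
        elif record.startswith("BRDA:") and current is not None:
            lineno, block, branch, taken = record[5:].split(",", 3)
            if taken in ("-", "0"):
                # Block and branch ids are kept as strings because they are
                # not guaranteed to be plain integers in every tool version.
                missing[current].append((int(lineno), block, branch))
        elif record == "end_of_record":
            current = None
    return dict(missing)


# Hypothetical tracefile excerpt, for illustration only.
SAMPLE = """\
SF:include/example/parser.hpp
BRDA:10,0,0,5
BRDA:10,0,1,-
BRDA:42,0,0,0
end_of_record
SF:include/example/lexer.hpp
BRDA:7,0,0,3
end_of_record
"""

if __name__ == "__main__":
    for path, branches in uncovered_branches(SAMPLE).items():
        print(path, branches)
```

A per-file list like this could then be annotated with a reason for each gap (for example unreachable defensive code or platform-specific paths), which would feed directly into the answer about where more coverage would help.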
| - **Answer**: Currently, the branch coverage is 93.865% and the line coverage is 99.186%, cf. JLS-27. Therefore, we deem the list of tests to be very comprehensive. | ||
| - Is every Expectation covered by at least one implemented test? | ||
| - **Answer**: | ||
| - **Answer**: Yes, both of the Expectations are covered by at least one implemented test. |
We may need to refine these expectations to improve the understanding of completeness (there may be a mismatch between the number of tests and the number of expectations). As I understand it, JLEX-01/02 are broken down further (e.g. into WFJ-01 to WFJ-08). We could also refer to the design spec (https://json.nlohmann.me/api/basic_json/) and analyze its coverage.
| - Are indicators from code, component, tool, or data inspections taken into | ||
| consideration? | ||
| - **Answer**: | ||
| - **Answer**: Yes, all types of indicator are taken into consideration. |
I am not aware that we perform any of these inspections.
| #### Checklist for TA-METHODOLOGIES from [Codethink](https://codethinklabs.gitlab.io/trustable/trustable/print_page.html) | ||
| #### Checklist for TA-METHODOLOGIES from [Codethink](https://pages.eclipse.dev/eclipse/tsf/tsfprint_page.html) | ||
| This project follows purely the Methodologies of Eclipse S-CORE. |
Answers for TA-METHODOLOGIES are missing. I was under the impression this PR should complete the TSF work?
| - List of identified Misbehaviours | ||
| - **Answer**: | ||
| - **Answer**: See JLS-11. |
I do not think that the bugs found should be seen as evidence here. Also, below you refer to the JSON parsing errors (which cause an exception) as a misbehaviour as well (rightly, I think).
| - **Answer**: Mitigation expectations are expressed implicitly through (a) documented Quality assurance (https://json.nlohmann.me/community/quality_assurance) requirements and (b) concrete mitigation mechanisms captured by existing Statements: JLS-02 (fuzzing), JLS-31 (static analysis), JLS-25 (review/security policy), JLS-24 (defined failure mode via exceptions), and WFJ-06 (input validation via accept()). | ||
| - Risk analysis | ||
| - **Answer**: | ||
| - **Answer**: Risk analysis has been performed (see docs/risk_analysis.md). |
From this analysis I would have expected the "list of misbehaviours" as an output.
| - Are fault inductions used to demonstrate that tests which usually pass can | ||
| and do fail appropriately? | ||
| - **Answer**: | ||
| - **Answer**: The expected behaviour of nlohmann/json is described by JLS-24 and its substatements in the trustable graph. For a random subset of 10 of these substatements (TIJ-01.1, TIJ-01.2, TIJ-05.1, TIJ-05.3, NPF-01.2, NPF-01.3, NPF-07.2, TIJ-02.4, TIJ-02.5, TIJ-02.2) we checked whether possible misbehaviours (like wrong inputs) are tested using fault induction tests. For all of these 10 substatements, at least one fault induction test is performed. We therefore conclude that most misbehaviours are represented by at least one fault induction test. |
Actually, it should read "The expected misbehaviour ...". This is a tricky question, because it asks for a test that usually passes to be made to fail (demonstrated by error injection), which I do not think we do. I would try to state more precisely what we do and why we think it covers this point.
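To make the distinction concrete, here is a minimal sketch of fault induction in the sense of the checklist question: a test that passes against the correct code is re-run against a deliberately broken variant, and the check succeeds only if the test then fails. It is written in Python purely for illustration and is not the project's test code; `parse_bool`, `faulty_parse_bool` and the test case are hypothetical stand-ins.

```python
import unittest
from unittest import mock


def parse_bool(text):
    """Hypothetical unit under test: strict parser for JSON boolean literals."""
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"not a JSON boolean: {text!r}")


class ParseBoolTest(unittest.TestCase):
    def test_rejects_invalid_literal(self):
        # Negative test: invalid input must raise. It passes on the correct code.
        with self.assertRaises(ValueError):
            parse_bool("True")


def run_suite():
    """Run the test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseBoolTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


def faulty_parse_bool(text):
    """Induced fault: silently accepts any input instead of raising."""
    return text.lower() == "true"


def suite_detects_fault():
    """True if the suite passes on the original code and fails on the faulty one."""
    baseline_ok = run_suite()
    # Temporarily swap the implementation the test looks up at call time.
    with mock.patch.dict(globals(), {"parse_bool": faulty_parse_bool}):
        faulted_ok = run_suite()
    return baseline_ok and not faulted_ok


if __name__ == "__main__":
    print("fault detected by test suite:", suite_detects_fault())
```

Note that `test_rejects_invalid_literal` on its own is a negative test (invalid input is rejected); only the second run against the induced fault shows that the test can and does fail appropriately.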
Update TT-EXPECTATIONS
https://pages.eclipse.dev/eclipse/tsf/tsf/extensions/rafia/risk-analysis.html#risk-evaluation
Note: