Conversation
…model_evaluator to optionally include aggregate results and optionally use recall with fd
…rics to handle aggregation of good and bad matches. The logic is changed from reliance on _handle_hierarchical_field to compare_recursive in comparison_engine.
…turning the field_details. This will allow the aggregate metrics to correctly trickle up.
1) In comparison_dispatcher.py, inside the dispatch_field_comparison function, use handle_list_field_dispatch only for simple lists; when the field is a structured list, let compare_struct_list_with_scores handle it. 3) In structured_list_comparator.py, simplified _calculate_nested_field_metrics: no threshold is applied to matched pairs, and field details are calculated for all matched pairs and matched objects. Note that the following update no longer applies, as the method was removed in dev: 2) In structured_list_comparator.py, _handle_struct_list_empty_cases only handles the case where both lists are empty; otherwise it runs through the normal logic. This is necessary to get the field-level metrics.
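The dispatch rule described above could be sketched as follows. The function names (dispatch_field_comparison, handle_list_field_dispatch, compare_struct_list_with_scores) come from the PR text, but the FieldDef class and all signatures here are illustrative stand-ins, not the repo's actual API.

```python
from dataclasses import dataclass

@dataclass
class FieldDef:
    """Illustrative stand-in for a static field definition."""
    is_list: bool = False
    is_structured_list: bool = False

def compare_struct_list_with_scores(gt, pred, field_def):
    # Placeholder: the real comparator also handles null/empty values itself.
    return "structured-list"

def handle_list_field_dispatch(gt, pred, field_def):
    # Placeholder for the simple (primitive) list path.
    return "simple-list"

def compare_scalar(gt, pred, field_def):
    return "scalar"

def dispatch_field_comparison(gt, pred, field_def):
    # Structured lists take precedence and are fully delegated, even when the
    # value is None, so that field-level metrics can still be produced.
    if field_def.is_structured_list:
        return compare_struct_list_with_scores(gt, pred, field_def)
    if field_def.is_list:
        return handle_list_field_dispatch(gt, pred, field_def)
    return compare_scalar(gt, pred, field_def)
```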
… (e.g., when a field value is a list versus a string) 2) Removed the check for the _aggregate flag in test_json_schema_field_converter.py 3) One major change in many test files:
* Update incorrect test cases so that all test cases pass in test_structured_model_evaluator_nested.py. Some test cases still fail in test_ecommerce_orders_aggregate_comprehensive.py and require additional code changes. Clean up some print statements and unused tests. * Update comments per feedback --------- Co-authored-by: Hayley Park <parhyunj@amazon.com>
* Updating the dispatcher logic: 1) Move from runtime type checking to static field definition checking, by moving away from _should_use_hierarchical_structure and using _is_structured_list_field instead. This is done mainly for list fields, both primitive and structured; future changes could expand to other types as needed. 2) When the field is a structured list, delegate handling of all values, whether null or not, to StructuredListComparator. Updated structured_model: added a helper method to check whether a field is specifically List[StructuredModel], complementing _is_list_field(), which checks for any list type.
* Update the early-exit logic in Hungarian matching to make sure there are always matched pairs for aggregate metrics --------- Co-authored-by: Hayley Park <parhyunj@amazon.com>
* Previously, when a structured list was empty, it was counted as one TN for the aggregate at the object level, but aggregation could miss this (because overall metrics are skipped when there are child instances). This change creates a field dict that captures the TN at the child leaf node, so that the aggregate metric can trickle up correctly. 2) Updated tests to reflect the changes from the above logic
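The empty-list case described above might produce a result shaped roughly like this. The dictionary keys are illustrative; the real result structure in structured_list_comparator.py may differ.

```python
def empty_struct_list_result():
    """Sketch: both GT and prediction have an empty structured list, so a
    synthetic leaf-level entry records the single TN for aggregation."""
    return {
        # The TN lives at the leaf so that aggregate metrics, which walk child
        # entries and skip the parent's overall metrics, still pick it up.
        "overall": {"tp": 0, "fp": 0, "tn": 1, "fn": 0, "fa": 0, "fd": 0},
        "fields": {},
    }
```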
* Update Hungarian matching to better handle single-item lists: always match single-item lists, and count the pair as a TP if the item similarity score is higher than match_threshold. Update the FN count for a test_hungarian test case * Remove an unused, commented-out variable
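The single-item rule above could be sketched as follows: one-element lists are always paired, and the pair counts as a TP only when the similarity clears match_threshold. The similarity function and return shape are placeholders, not the library's API.

```python
def match_single_item_lists(gt_list, pred_list, similarity, match_threshold=0.8):
    """Sketch of the single-item special case in Hungarian matching."""
    if len(gt_list) != 1 or len(pred_list) != 1:
        raise ValueError("this sketch only covers the single-item case")
    score = similarity(gt_list[0], pred_list[0])
    matched_pairs = [(0, 0, score)]  # always emit the matched pair
    # Only a score above match_threshold counts as a true positive.
    tp = 1 if score > match_threshold else 0
    return matched_pairs, tp
```

Always emitting the pair (even below threshold) is what guarantees aggregate metrics have matched pairs to work from.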
…per.is_effectively_null_for_primitives in comparison_dispatcher
🔒 Security Scan Results

ASH Security Scan Report

Scan Metadata

Summary

Scanner Results: The table below shows findings by scanner, with status based on severity thresholds and dependencies.

Top 2 Hotspots: Files with the highest number of security findings.

Detailed Findings (2 actionable findings):
- Finding 1: CKV2_GHA_1
- Finding 2: javascript.browser.security.insecure-document-method.insecure-document-method

Report generated by Automated Security Helper (ASH) at 2026-01-29T23:45:46+00:00
```diff
- # items.name should have 1 TP (only item1's name, item2 was below threshold)
+ # items.name should have 2 TP (both item1 and item2)
+ # Overall should have some metrics from poor matches at the leaf node level.
  name_metrics = items_fields["name"]
  if "overall" in name_metrics:
-     assert name_metrics["overall"]["tp"] == 1
+     assert name_metrics["overall"]["tp"] == 2
  else:
```
I agree with this, looks correct as written here.
```diff
- # items.count should have 1 TP (only item1's count, item2 was below threshold)
+ # items.count should have 1 TP (only item1's count, item2's count did not pass comparison)
  count_metrics = items_fields["count"]
  if "overall" in count_metrics:
      assert count_metrics["overall"]["tp"] == 1  # item1 count matches
      assert (
-         count_metrics["overall"]["fp"] == 0
-     )  # No false positives since item2 not analyzed at field level
-     assert count_metrics["overall"]["fd"] == 0  # No false discoveries for count
+         count_metrics["overall"]["fp"] == 1
+     )  # 1 False positive since item2 matched but count should have been empty
+     assert count_metrics["overall"]["fa"] == 1  # 1 false alarm for count
  else:
```
Agree with this fix as well.
```diff
      "Items in EST but not in GT (null) should be False Alarm (FP)"
  )
- assert items_cm["tn"] == 1, "Both empty lists should be TN"
+ assert items_cm["tn"] == 1, f"Both empty lists should be TN: {items_cm}"
```
Just better assertion comment.
```diff
  name_metrics = items_fields["name"]
  if "overall" in name_metrics:
-     assert name_metrics["overall"]["tp"] == 1
+     assert name_metrics["overall"]["tp"] == 2
```
Agree, there are two 'name' instances.
```diff
- # items.description should have 1 TP (only item1's description, item2 was below threshold)
+ # items.description should have 1 TP (only item1's description, item2's description did not pass comparison)
  desc_metrics = items_fields["description"]
  if "overall" in desc_metrics:
      assert desc_metrics["overall"]["tp"] == 1  # item1 description matches
      assert (
-         desc_metrics["overall"]["fd"] == 0
-     )  # No false discoveries since item2 not analyzed at field level
-     assert desc_metrics["overall"]["fp"] == 0  # No false positives for description
+         desc_metrics["overall"]["fd"] == 1
+     )  # 1 false discovery since item2 matched but description not correct at field level
+     assert desc_metrics["overall"]["fp"] == 1  # 1 false positive for description
  else:
      assert desc_metrics["tp"] == 1  # item1 description matches
```
I agree with this assertion too.
```diff
  # Direct access to metrics for pet fields (not in "overall")
  # petId metrics
  assert (
-     get_metric(cm["fields"]["pets"]["fields"]["petId"], "tp") == 1
- ), "Expected 1 true positives"
+     get_metric(cm["fields"]["pets"]["fields"]["petId"], "tp") == 2
+ ), "Expected 2 true positives"
  assert (
```
```diff
-     get_metric(cm["fields"]["pets"], "tn") == 0
+     get_metric(cm["fields"]["pets"], "tn") == 0,
  ), "Expected 0 true negatives for pets field overall performance"
```
I think we need to remove that extra comma and that should fix it
```diff
-     get_metric(cm["fields"]["pets"]["fields"]["name"], "tp") == 1
- ), "Expected 1 true positives"
+     get_metric(cm["fields"]["pets"]["fields"]["name"], "tp") == 2
+ ), "Expected 2 true positives"
```
Again, same comment as before: https://github.com/awslabs/stickler/pull/72/changes#r2776109862
The cat "buttons" falls below the default match_threshold of the Pet object, therefore it's incorrect to compare the values of the name, even though they match.
```diff
-     get_metric(cm["fields"]["pets"]["fields"]["species"], "tp") == 0
- ), "Expected 0 true positives"
+     get_metric(cm["fields"]["pets"]["fields"]["species"], "tp") == 1
+ ), "Expected 1 true positives"
```
```diff
  expected_top_aggregate = {
      'tp': 12, 'fa': 1, 'fd': 4, 'fp': 5, 'tn': 0, 'fn': 1
  }
```
```diff
  # Exact match: PROD-001 vs PROD-001 should be TP
  assert field_metrics["overall"]["tp"] == 1, (
      f"Nested field {field} should have 1 TP"
+ # Even though poor match, at the field level this is an exact match: PROD-002
```
Actually, this is not right, per my comment below: https://github.com/awslabs/stickler/pull/72/changes#r2776109862
sromoam
left a comment
So we do need some changes to this PR.
Apologies @hayleypark that I didn't catch this earlier.
I recommend you strip the scope down to only the aggregate metrics and the tests here.
The changes to the non-aggregate tests are incorrect, after a review, because of the logic we have documented for how we evaluate list objects: https://github.com/awslabs/stickler/blob/d24894c86a2c52a62fa306748cbf2e7740faebe3/docs/docs/Core-Concepts/Threshold_Gated_Recursive_Evaluation.md
This is our current, documented method: we don't count any confusion-matrix entries for objects that aren't matched up to the object's match_threshold, the logic being that if you need this comparison, you can turn that value down. If you want to propose a change to this logic, that's fine, but we need to revise the documentation as well.
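The threshold-gated rule cited above could be sketched as follows: field-level counts are only accumulated for object pairs whose match score reaches the object's match_threshold, and pairs below it are skipped entirely. Names and the exact-equality field check are illustrative, not the library's implementation.

```python
def gated_field_tp_counts(matched_pairs, match_threshold):
    """matched_pairs: list of (gt_dict, pred_dict, object_score) tuples.
    Returns per-field TP counts, gated by the object-level threshold."""
    tp = {}
    for gt_obj, pred_obj, object_score in matched_pairs:
        if object_score < match_threshold:
            continue  # spurious match: do not compare its fields at all
        for field, gt_value in gt_obj.items():
            if pred_obj.get(field) == gt_value:
                tp[field] = tp.get(field, 0) + 1
    return tp
```

Under this rule, lowering match_threshold is the documented way to make below-threshold pairs contribute field-level counts.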
```diff
      field_metrics["overall"]["tp"] + field_metrics["overall"]["fd"]
-     == 1
- ), f"Nested field {field} should have 1 comparison"
+     == 3
+ ), f"Nested field {field} should have 3 comparisons"
  elif field == "price":
```
Actually, now that I'm understanding this better, this is not correct.
The reason is that the second and third products in the list do not match each other respectively, and are therefore considered false detections at the 'product' level. The way to make this assertion true in this test would be to set the match_threshold on the Product object lower, so that the comparisons would proceed. The current match threshold of 0.8 precludes the 2nd and 3rd list items, with the assertion that these are spuriously matched and therefore not valid to compare to each other.
Issue #, if available: #33 & Fix universal aggregate metrics calculation
Description of changes:
Updated Hungarian Matching for Structured Lists

Updated StructuredListComparator

Dispatcher Logic Refactoring:
- Moved from runtime type checking to static field definition checking (_is_structured_list_field)
- Delegated structured-list handling to StructuredListComparator

Improved Test Cases for Aggregate Functionality

Enhanced Bulk Evaluator with Aggregate Support:
- Added include_aggregates parameter to BulkStructuredModelEvaluator
- Added recall_with_fd parameter for flexible recall calculations

Assumption: For aggregate metrics, all matched items (regardless of the similarity score) are considered.
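A recall_with_fd toggle could be sketched as below. The thread does not show BulkStructuredModelEvaluator's actual formula; including false discoveries (fd) in the denominator is an assumption made purely for illustration.

```python
def recall(tp, fn, fd=0, recall_with_fd=False):
    """Sketch only: recall, optionally counting false discoveries (fd) in the
    denominator alongside false negatives. The real formula may differ."""
    denominator = tp + fn + (fd if recall_with_fd else 0)
    return tp / denominator if denominator else 0.0
```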
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.