Resolve rule execution errors by gerrycampion · Pull Request #1653 · cdisc-org/cdisc-rules-engine

gerrycampion · 2026-03-06T16:59:03Z

fix dy operation to return a SKIPPED DomainNotFoundError when DM missing
fix domain wildcard replacement to use domain instead of dataset unsplit name
simplify args for rule_processor.perform_rule_operations to use DatasetMetadata instead of individual args
reraise Operations DomainNotFoundError and KeyError to trickle up as SKIPS
Better handling of empty datasets and treat EMPTY_DATASET as a SKIP
Change ignored data files from warning to info as this is a pretty normal occurrence in a standard study package
Find dataset, class, and dataset variables from the model if dataset is in the model but not in the ig
Fix some instances of confusion between domain and dataset
Move sdtm-related utils from utils to sdtm_utils
Move function from utils to dataset_metadata property
replace extract_file_name_from_path_string with built-in os.path.basename
fix distinct's call to data service referenced datasets
fix relrec merge on columns with differing datatypes
fix relrec merge to trigger skip when it results in no records
clarified SDTMDatasetMetadata wildcard_replacement variable and fixed wildcard replacement instances
fix merge on define xml dataset metadata
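
As an illustration of the relrec merge items above (differing datatypes, and a skip when the merge yields no records), here is a minimal standard-library sketch. It is not the engine's code: the helper `merge_on_key`, the column names, and the sample rows are all hypothetical. It shows why a join key stored as text on one side and as a number on the other matches nothing until both sides are normalized, and how an empty result can be detected.

```python
def merge_on_key(left, right, key, normalize=str):
    """Inner-join two lists of dict rows on `key` (hypothetical helper).

    Keys pass through `normalize` first so that 1 and "1" compare equal.
    """
    index = {}
    for row in right:
        index.setdefault(normalize(row[key]), []).append(row)
    merged = []
    for row in left:
        for match in index.get(normalize(row[key]), []):
            # Values from the left row win when both rows share a column.
            merged.append({**match, **row})
    return merged


# Made-up rows: SEQ is text on one side and an integer on the other.
relrec_rows = [{"SEQ": "1", "RELID": "R1"}, {"SEQ": "2", "RELID": "R2"}]
record_rows = [{"SEQ": 1, "TERM": "HEADACHE"}]

# Without normalization the keys never compare equal, so the join is empty.
unnormalized = merge_on_key(relrec_rows, record_rows, "SEQ", normalize=lambda v: v)
# With both sides cast to str, the matching row is found.
normalized = merge_on_key(relrec_rows, record_rows, "SEQ")

# An empty merge is a signal the caller could treat as a skip.
should_skip = len(unnormalized) == 0
```

Casting both keys to text is only one possible normalization; the actual fix in this PR may choose a different common type.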

To compare the Execution Errors before and after, see these reports:
Report before
Report after

CORE Test Suite Updates


SFJohnson24 · 2026-03-16T20:48:58Z

cdisc_rules_engine/operations/distinct.py

-            dataset = self.data_service.get_dataset(dataset_meta.filename)
-            referenced_datasets[dataset_meta.name] = dataset
+        for dataset_metadata in self.data_service.get_datasets():
+            dataset = self.data_service.get_dataset(dataset_metadata.name)


I believe this needs to stay as
dataset = self.data_service.get_dataset(dataset_meta.filename)

Looking at the interface, I think this is another case of name and filename getting mixed up. The call needs the file extension; otherwise it returns an empty dataframe. I tested this on cg0370 and did not get the correct result: the LB dataset referenced in the CO dataset was not found, so the distinct operation returned nothing.

We should change all the get_datasets parameters in the interface and in the excel/dummy/local/usdm data services to dataset_path to avoid this.

Fixed this and renamed dataset_metadata.dataset_name to data_service_identifier to make it a bit clearer. I will do more refactoring and the get_datasets parameter renaming in a follow-up PR.
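
To illustrate the name/filename mix-up discussed in this thread, here is a minimal sketch under stated assumptions. `ToyDatasetMetadata` and `ToyDataService` are hypothetical stand-ins, not the engine's classes. The only behavior taken from the thread is that looking a dataset up by its name instead of its filename yields an empty result rather than an error.

```python
from dataclasses import dataclass


@dataclass
class ToyDatasetMetadata:
    """Hypothetical stand-in for dataset metadata."""
    name: str       # e.g. "LB"
    filename: str   # e.g. "lb.xpt" (includes the file extension)


class ToyDataService:
    """Hypothetical data service whose storage is keyed by filename."""

    def __init__(self, files):
        self._files = files

    def get_dataset(self, dataset_path):
        # An unknown key returns an empty dataset instead of raising,
        # which is what makes the name/filename mix-up easy to miss.
        return self._files.get(dataset_path, [])


service = ToyDataService({"lb.xpt": [{"LBSEQ": 1}], "co.xpt": [{"COSEQ": 1}]})
lb = ToyDatasetMetadata(name="LB", filename="lb.xpt")

by_name = service.get_dataset(lb.name)          # wrong identifier: empty result
by_filename = service.get_dataset(lb.filename)  # identifier the service expects
```

Naming the parameter `dataset_path`, as suggested above, makes the expected identifier visible at the call site.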

SFJohnson24

Nice changes to resolve several issues, organize code, and use native functionality/metadata properties over unneeded functions.
PR preserves relrec merge functionality--tested cg0602
Correctly resolves execution error vs. skip status from parent issue

Only found one issue that needs to be addressed. Please see comment
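
The distinction between a skip and an execution error that this review refers to can be sketched as follows. This is an illustrative, standalone example: the `DomainNotFoundError` class defined here, the `run_rule` function, the two sample operations, and the status strings are assumptions made for the sketch and do not reproduce the engine's implementation.

```python
class DomainNotFoundError(Exception):
    """Local stand-in for the engine's exception of the same name."""


def dy_operation(datasets):
    # Hypothetical operation that needs DM; raise a skippable error if absent.
    if "DM" not in datasets:
        raise DomainNotFoundError("DM dataset is required but was not found")
    return datasets["DM"]


def broken_operation(datasets):
    # Hypothetical operation with a genuine bug (division by zero).
    return 1 / 0


def run_rule(operation, datasets):
    """Run an operation and map its outcome to a (status, message) pair."""
    try:
        operation(datasets)
    except (DomainNotFoundError, KeyError) as error:
        # Missing domains or columns mean the rule does not apply: skip.
        return "SKIPPED", str(error)
    except Exception as error:
        # Anything else is reported as an execution error.
        return "EXECUTION ERROR", str(error)
    return "SUCCESS", ""


skipped = run_rule(dy_operation, {"AE": []})
succeeded = run_rule(dy_operation, {"DM": []})
failed = run_rule(broken_operation, {})
```

Catching the skippable exceptions before the generic handler is what keeps them from being reported as execution errors.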


RamilCDISC · 2026-03-17T21:38:20Z

cdisc_rules_engine/dataset_builders/base_dataset_builder.py

+            or self.dataset_metadata.unsplit_name
+        )
+        return define_xml_reader.extract_variables_metadata(
+            domain_name=domain, name=self.dataset_metadata.name


I might be misunderstanding this, but previously this looked like a lookup using only the domain, and now we also require the dataset name. Could that cause split dataset lookups to fail if the define xml is keyed by base domain?

This was to fix an issue with CORE-001081 (see before-report) where define metadata was not being extracted for RELREC. Here is what I would expect to see for Name/Domain when doing an itemgroupdef lookup.

Name    Domain
AE      AE
QSPH    QS
RELREC  (blank)
SUPPDM  DM

I added additional fixes now so that RELREC only needs a name and not domain. This corresponds with what is in the Define-xml specification:

Thanks for the clarification. Do you think we should add a test for these define-xml lookup cases, so the expected matching behavior is clearly covered?
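
A test for the Name/Domain lookup cases listed above might look like the following sketch. `find_item_group` and the `ITEM_GROUPS` records are hypothetical; the real `extract_variables_metadata` works on a Define-XML document and its matching rules may differ. The rule assumed here is that the name must always match, and the domain must match only when the item group declares one.

```python
# Hypothetical item group records mirroring the Name/Domain cases above.
ITEM_GROUPS = [
    {"name": "AE", "domain": "AE"},
    {"name": "QSPH", "domain": "QS"},
    {"name": "RELREC", "domain": None},  # no domain declared
    {"name": "SUPPDM", "domain": "DM"},
]


def find_item_group(item_groups, name, domain_name=None):
    """Return the first item group matching `name` (and domain, if declared)."""
    for group in item_groups:
        if group["name"] != name:
            continue
        # A group without a domain matches on name alone.
        if group["domain"] is None or group["domain"] == domain_name:
            return group
    return None


split_dataset = find_item_group(ITEM_GROUPS, "QSPH", "QS")
relrec = find_item_group(ITEM_GROUPS, "RELREC")
wrong_domain = find_item_group(ITEM_GROUPS, "SUPPDM", "AE")
```

Under this assumed rule, a split dataset such as QSPH is found by its own name plus its base domain, and RELREC is found by name alone.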

RamilCDISC

The PR cleans up and improves the code. It also resolves rules that were failing because of define xml processing. The validation was done by:

Reviewing the PR for any unwanted code or comments.
Reviewing the PR in accordance with the AC.
Validating all unit and regression testing pass.
Ensuring relevant tests are updated.
Ensuring new behavior for define xml is covered in testing.
Ensuring update of cache.
Ensuring successful execution.
Comparing the before and after reports.
Ensuring the report shows the intended updates and that nothing unintended has changed.
Ensuring report structure.


Fix dy operation and domain wildcard replacement

36aecd3

gerrycampion linked an issue Mar 6, 2026 that may be closed by this pull request

resolve existing rule execution and rule evaluation errors that appear in the logs #1578

Closed

gerrycampion temporarily deployed to DEV March 6, 2026 16:59 — with GitHub Actions Inactive

Convert some OperationErrors to skippable DomainNotFound and KeyErrors

fe8aa5d

gerrycampion temporarily deployed to DEV March 6, 2026 20:03 — with GitHub Actions Inactive

skip the keyerror

2f57346

gerrycampion temporarily deployed to DEV March 6, 2026 20:11 — with GitHub Actions Inactive

gerrycampion changed the title ~~Fix dy operation and domain wildcard replacement~~ Resolve rule execution errors Mar 6, 2026

fix for empty datasets. info instead of warning for bad file formats

2aa7655

gerrycampion temporarily deployed to DEV March 12, 2026 17:58 — with GitHub Actions Inactive

Merge branch 'main' into 1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs

f787d11

gerrycampion temporarily deployed to DEV March 12, 2026 18:06 — with GitHub Actions Inactive

fix dask empty check

4b76b1f

gerrycampion temporarily deployed to DEV March 12, 2026 18:33 — with GitHub Actions Inactive

update is_custom_domain to consider model domains. fixed some domain/dataset confusion

48b58df

gerrycampion temporarily deployed to DEV March 13, 2026 02:12 — with GitHub Actions Inactive

move sdtm-related utils to sdtm_utilities

33279fe

gerrycampion temporarily deployed to DEV March 13, 2026 14:48 — with GitHub Actions Inactive

Generalize _replace_variable_wildcards

7cc39ed

gerrycampion temporarily deployed to DEV March 13, 2026 15:48 — with GitHub Actions Inactive

Fixed distinct referenced datasets. Fix relrec merge on columns with differing datatypes

a87661d

gerrycampion temporarily deployed to DEV March 13, 2026 17:53 — with GitHub Actions Inactive

Fix more wildcard replacements

0c2cdea

gerrycampion temporarily deployed to DEV March 13, 2026 18:20 — with GitHub Actions Inactive

Fix case where relrec merge produces no records

a6f77d3

gerrycampion temporarily deployed to DEV March 13, 2026 19:46 — with GitHub Actions Inactive

another wildcard replacement fix

9ec1ec9

gerrycampion temporarily deployed to DEV March 13, 2026 20:48 — with GitHub Actions Inactive

fix define xml variable metadata merge

c2e0c09

gerrycampion temporarily deployed to DEV March 13, 2026 22:07 — with GitHub Actions Inactive

gerrycampion requested review from RamilCDISC and SFJohnson24 March 13, 2026 22:27

gerrycampion linked an issue Mar 13, 2026 that may be closed by this pull request

Run Status in Rules Report, engine v0.15 #1639

Closed

gerrycampion requested a review from alexfurmenkov March 13, 2026 22:37

minor tweak to domain handling in rule processor

ddb8238

gerrycampion temporarily deployed to DEV March 16, 2026 02:19 — with GitHub Actions Inactive

gerrycampion mentioned this pull request Mar 16, 2026

SUPP dataset merge by numeric variable #1660

Open

SFJohnson24 reviewed Mar 16, 2026

View reviewed changes

SFJohnson24 requested changes Mar 16, 2026

View reviewed changes

Merge branch 'main' into 1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs

14a6982

gerrycampion temporarily deployed to DEV March 16, 2026 21:53 — with GitHub Actions Inactive

fix distinct get_dataset and rename dataset_metadata.dataset_name

992d4f6

gerrycampion temporarily deployed to DEV March 17, 2026 17:49 — with GitHub Actions Inactive

gerrycampion requested a review from SFJohnson24 March 17, 2026 18:00

RamilCDISC reviewed Mar 17, 2026

View reviewed changes

better define-xml itemgroupdef match handling

553f987

gerrycampion temporarily deployed to DEV March 18, 2026 17:48 — with GitHub Actions Inactive

define metadata documentation

b004725

gerrycampion temporarily deployed to DEV March 18, 2026 17:56 — with GitHub Actions Inactive

gerrycampion requested a review from RamilCDISC March 18, 2026 17:56

Added test for get_define_xml_variables_metadata

c3bf504

gerrycampion temporarily deployed to DEV March 18, 2026 21:26 — with GitHub Actions Inactive

RamilCDISC approved these changes Mar 18, 2026

View reviewed changes

Merge branch 'main' into 1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs

2c591a8

SFJohnson24 temporarily deployed to DEV March 20, 2026 12:50 — with GitHub Actions Inactive

SFJohnson24 approved these changes Mar 20, 2026

View reviewed changes

Merge branch 'main' into 1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs

ac21bf8

SFJohnson24 temporarily deployed to DEV March 20, 2026 14:32 — with GitHub Actions Inactive

SFJohnson24 merged commit 04ae578 into main Mar 20, 2026
12 checks passed

SFJohnson24 deleted the 1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs branch March 20, 2026 14:40

Resolve rule execution errors #1653
SFJohnson24 merged 22 commits into main from 1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs
