Skip to content

Resolve rule execution errors#1653

Merged
SFJohnson24 merged 22 commits intomainfrom
1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs
Mar 20, 2026
Merged

Resolve rule execution errors#1653
SFJohnson24 merged 22 commits intomainfrom
1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs

Conversation

@gerrycampion
Copy link
Copy Markdown
Collaborator

@gerrycampion gerrycampion commented Mar 6, 2026

  • fix dy operation to return a SKIPPED DomainNotFoundError when DM missing
  • fix domain wildcard replacement to use domain instead of dataset unsplit name
  • simplify args for rule_processor.perform_rule_operations to use DatasetMetadata instead of individual args
  • reraise Operations DomainNotFoundError and KeyError to trickle up as SKIPS
  • Better handling of empty datasets and treat EMPTY_DATASET as a SKIP
  • Change ignored data files from warning to info as this is a pretty normal occurrence in a standard study package
  • Find dataset, class, and dataset variables from the model if dataset is in the model but not in the ig
  • Fix some instances of confusion between domain and dataset
  • Move sdtm-related utils from utils to sdtm_utils
  • Move function from utils to dataset_metadata property
  • replace extract_file_name_from_path_string with built-in os.path.basename
  • fix distinct's call to data service referenced datasets
  • fix relrec merge on columns with differing datatypes
  • fix relrec merge to trigger skip when it results in no records
  • clarified SDTMDatasetMetadata wildcard_replacement variable and fixed wildcard replacement instances
  • fix merge on define xml dataset metadata

To compare the Execution Errors before and after, see these reports:
Report before
Report after

CORE Test Suite Updates

@gerrycampion gerrycampion changed the title Fix dy operation and domain wildcard replacement Resolve rule execution errors Mar 6, 2026
@gerrycampion gerrycampion linked an issue Mar 13, 2026 that may be closed by this pull request
dataset = self.data_service.get_dataset(dataset_meta.filename)
referenced_datasets[dataset_meta.name] = dataset
for dataset_metadata in self.data_service.get_datasets():
dataset = self.data_service.get_dataset(dataset_metadata.name)
Copy link
Copy Markdown
Collaborator

@SFJohnson24 SFJohnson24 Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this needs to stay as
dataset = self.data_service.get_dataset(dataset_meta.filename)

Looking at the interface--i think it is another case of name and filename getting mixed. It needs the file extension, otherwise it returns an empty dataframe. I tested this on cg0370 and was not getting the correct result as it was not finding the LB dataset referenced in the CO dataset (was returning nothing for the distinct operation)

we should change all the get_datasets parameters in the interface and excel/dummy/local/usdm data services to dataset_path to avoid this

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this and renamed dataset_metadata.dataset_name to data_service_identifier to make it a bit clearer. I will do more refactoring and get_datasets parameter renaming in a follow-up pr

Copy link
Copy Markdown
Collaborator

@SFJohnson24 SFJohnson24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes to resolve several issues, organize code, and use native functionality/metadata properties over unneeded functions.
PR preserves relrec merge functionality--tested cg0602
Correctly resolves execution error vs. skip status from parent issue

Only found one issue that needs to be addressed. Please see comment

or self.dataset_metadata.unsplit_name
)
return define_xml_reader.extract_variables_metadata(
domain_name=domain, name=self.dataset_metadata.name
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding this but before this looked like a lookup only using domain and now we also require dataset name. Could that cause split dataset lookups to fail if in define xml is keyed by base domain?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to fix an issue with CORE-001081 (see before-report) where define metadata was not being extracted for RELREC. Here is what I would expect to see for Name/Domain when doing an itemgroupdef lookup.

Name Domain
AE AE
QSPH QS
RELREC
SUPPDM DM

I added additional fixes now so that RELREC only needs a name and not domain. This corresponds with what is in the Define-xml specification:

Image

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification. Do you think we should a test for these define-xml lookup cases. So the expected matching behavior is clearly covered?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Collaborator

@RamilCDISC RamilCDISC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR does cleanup and improvement of the code. It resolves failing rules because of define xml processing too. The validation was done by:

  1. Reviewing the PR for any unwanted code or comments.
  2. Reviewing the PR in accordance to AC.
  3. Validating all unit and regression testing pass.
  4. Ensuring relevant tests updates.
  5. Ensuring new behavior for define xml is covered in testing.
  6. Ensuring update of cache.
  7. Ensuring successful execution.
  8. Comparing the before and after reports.
  9. Ensuring the report shows intended updates and something that was not intended is not changed.
  10. Ensuring report structure.

@SFJohnson24 SFJohnson24 merged commit 04ae578 into main Mar 20, 2026
12 checks passed
@SFJohnson24 SFJohnson24 deleted the 1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs branch March 20, 2026 14:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run Status in Rules Report, engine v0.15 resolve existing rule execution and rule evaluation errors that appear in the logs

3 participants