Skip to content

Conversation

@sofiagiappichini
Copy link
Contributor

New flags are introduced for the cases where multiple subtypes of samples are used, but a flag for the overall type is also needed. For now, is_data, is_dyjets, and is_wjets are supported. Individual subtype flags are maintained alongside them.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces cumulative sample type flags to complement existing individual subtype flags. The change allows for broader categorization of samples (e.g., is_data, is_dyjets, is_wjets) while maintaining granular subtype flags.

Changes:

  • Added cumulative flags for data, dyjets, and wjets sample types based on substring matching
  • Updated subproject commit reference for jsonpog-integration

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
data/jsonpog-integration Updated subproject commit reference
code_generation/configuration.py Added cumulative sample type flags (is_data, is_dyjets, is_wjets) with substring-based detection
Comments suppressed due to low confidence (2)

code_generation/configuration.py:1

  • Line 173 sets is_dyjets instead of is_wjets in the else block for the wjets check. This should be sample_parameters['is_wjets'] = False.
from __future__ import annotations  # needed for type annotations in > python 3.7

code_generation/configuration.py:1

  • The if-else blocks can be simplified to directly assign the boolean result of the substring check, reducing code duplication. For example: sample_parameters['is_data'] = 'data' in self.sample.
from __future__ import annotations  # needed for type annotations in > python 3.7

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@moritzmolch moritzmolch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the changes, my only comment is just a stylistic proposal.

sample_parameters["is_{}".format(sampletype)] = True
else:
sample_parameters["is_{}".format(sampletype)] = False
if "data" in self.sample:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can save some lines of code by shortening the if-else statements to:

sample_parameters["is_data"] = "data" in self.sample

The same can be done for the "is_dyjets" and "is_wjets" flags.

@nshadskiy
Copy link
Contributor

nshadskiy commented Jan 16, 2026

I suggest to make it more general by looking at how the sample name starts with. This should also work if later other similar cases would be introduced to an analysis. The splitting is then set to always be an underscore.

for sampletype in self.available_sample_types:
    if self.sample == sampletype or self.sample.startswith(sampletype + "_"):
        sample_parameters["is_{}".format(sampletype)] = True
    else:
        sample_parameters["is_{}".format(sampletype)] = False

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants