Skip to content

Conversation

@cippy
Copy link
Collaborator

@cippy cippy commented Dec 24, 2025

This PR deals with estimating fakes when splitting the events in bins of uT, which uses the estimate in one bin to port to the other with additional corrections. Details and motivation about the procedure will be given in a dedicated presentation.

This PR also adds other features or fixes, often related to the fakes but not necessarily. Some of these might go into an independent PR, but in many cases they are entwined with other changes needed here.

This PR would require some changes in wremnants-data to add new files, this will come later.

cippy and others added 24 commits October 17, 2025 15:44
… analysis bins of specific axes (e.g. utAngleSign) which have identically zero yield in the A-Ax-B-Bx regions

(cherry picked from commit 773c1d9)
… analysis bins of specific axes (e.g. utAngleSign) which have identically zero yield in the A-Ax-B-Bx regions
…current version of Rabbit, it will be probably reverted when the definitions of poi/noi/nuisanceNotConstrained are fixed
…ion on uT>0 with an hardcoded factor (normalization)
@cippy cippy changed the title Do not merge: testing some updates Updates for fakes with u_T^\mu bins and other fixes Jan 3, 2026
@davidwalter2
Copy link
Collaborator

@cippy
Copy link
Collaborator Author

cippy commented Jan 4, 2026

There are differences in the CI: This PR: https://cmsmwbot.web.cern.ch/WMassAnalysis/PRValidation/PR643/2026_01_03/ Reference: https://cmsmwbot.web.cern.ch/WMassAnalysis/PRValidation/ReferenceRuns/2026_01_02_3d5404b/ Is this expected? If so, why?

I'll check, it was not supposed to change the nominal workflow

@cippy
Copy link
Collaborator Author

cippy commented Jan 7, 2026

I think I fixed it, it was a change in the default width variation for the fit, which I wanted to make only for the width as poi but got applied for any case.

This PR: https://cmsmwbot.web.cern.ch/WMassAnalysis/PRValidation/PR643/2026_01_07/
Ref: https://cmsmwbot.web.cern.ch/WMassAnalysis/PRValidation/ReferenceRuns/2026_01_07_7a59b61/

However, before merging I want to clean a few things up

@cippy
Copy link
Collaborator Author

cippy commented Jan 11, 2026

filePath=args.muRmuFPolVarFilePath,
noUL=True,
)
# # Helper for muR and muF as polynomial variations
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. We are not using it at the moment, but it was supposed to be used and possibly even become the default one year ago. @bendavid what do you think? If we remove it, I would try to keep track of the commit where it is implemented to make it easy to restore it if/when we need it back

storage=hist.storage.Double(),
)
)
# if isWorZ and not hasattr(dataset, "out_of_acceptance"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check, probably yes

select[axis] = hist.overflow
else:
select[axis] = int(value)
if args.selection and args.selection != "none":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the logic before it was possible to disable the default selection with '--selection none' which doesn't seem to be possible anymore, or is it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be right, I need to check it better, this part was originally added by Ruben because in some configurations the previous code was not plotting the fakes as desired, but maybe this workaround icebreaking something else.

lowerLegPos=args.lowerLegPos,
lower_leg_padding=args.lowerLegPadding,
subplotsizes=args.subplotSizes,
x_vertLines_edges=args.vertLineEdges,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change propagated into wums? Does this require an update of wums then?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is already in wums

"pdfMSHT20AlphaS": "<i>α</i><sub>S</sub> PDF",
"pdfCT18ZAlphaS": "<i>α</i><sub>S</sub> PDF",
"pdfCT18ZNoAlphaS": "PDF",
"pdfCT18ZNoAlphaS": "PDF no <i>α</i><sub>S</sub>",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to relabel the other entries with "PDF + alphaS" instead and keep the NoAlphaS as simply "PDF"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I agree and I'll do that

# we need to warn the class that these don't belong to fit axes nor fakerate axes,
# but are also not "integration axes" because they are removed before entering
# the fake estimate, this can happen for example
# with option --select in setupRabbit.py or --presel in makeDataMCStackPlot.py,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the '--select' and '--presel' set the "globalAction" which is always applied before the fake estimation. Instead of adding "histAxesRemovedBeforeFakes" it is also possible to initialize the histselector with these axes removed. This is how it should be done in setupCombine, if not, would this not be the more suitable solution?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem was that somehow the list of integration axes is evaluated before the integration actually happens, and then the code doesn't allow you to run the full smoothing when that happens. Or, even if the axis is removed by integration, the list of integration axes is not updated afterwards so the code thinks it still has to integrate those axes, which it doesn't like.
Are you saying the previous code was already able to do the same? To me it was not running when my histograms had N axes but I was using --fitvar and --fakerateAxes with less axes

"DYJetsToMuMuMass10to50PostVFP": {
"filepaths": [
"{BASE_PATH}/DYJetsToMuMu_M-10to50_H2ErratumFix_TuneCP5_13TeV-powhegMiNNLO-pythia8-photos/NanoV9MCPostVFP_{NANO_PROD_TAG}",
# "{BASE_PATH}/DYJetsToMuMu_M-10to50_H2ErratumFix_TuneCP5_13TeV-powhegMiNNLO-pythia8-photos_ext1/NanoV9MCPostVFP_{NANO_PROD_TAG}",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be removed at some point, the current issue is that in Pisa's machine I don't have permissions to copy the files in the other folder, so I had to temporarily create a new path and put them there.

@cippy
Copy link
Collaborator Author

cippy commented Jan 19, 2026

I tried to apply fixes according to some commits.
About the comments, for the moment I left them because it is not clear to me if that code is ever going to be needed now that we have the PDF/QCD scale weight applied through the helicities. If not, we can probably remove all of them completely, otherwise the best would be to make those histogram be produced only on demand when needed, since I think they slow the histmaker down.
For the part about the fakes, I agree that the current logic is a bit cumbersome, but since that part of the code is quite intricate I would leave it as it is for now and try to improve it in a different PR (possibly also fixing the pseudodata for the fakes).

@cippy
Copy link
Collaborator Author

cippy commented Jan 20, 2026

This PR: https://cmsmwbot.web.cern.ch/WMassAnalysis/PRValidation/PR643/2026_01_19/
Reference: https://cmsmwbot.web.cern.ch/WMassAnalysis/PRValidation/ReferenceRuns/2026_01_20_27e2e9b/

The only difference should be in the renaming of the PDF impact groups, in particular calling "PDF" as "PDF+\alpha_S"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants