To run skimming on NanoAODs:

Run ./shell coffeateam/coffea-dask-almalinux9:latest to enter a singularity container

a. These are all my current package versions:

- uproot 5.6.2
- awkward 2.7.4
- dask-awkward 2025.2.0
- dask 2024.8.0
- coffea 2025.1.1
- numpy 1.24.0
- hist 2.8.0
- dask-histogram 2025.2.0
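To compare a container against the versions listed above, here is a minimal sketch using only the standard library. The function name `installed_versions` is illustrative and not part of the repository; packages that are missing are reported as `not installed` rather than raising an error.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names as listed above (their distribution names).
PACKAGES = [
    "uproot", "awkward", "dask-awkward", "dask",
    "coffea", "numpy", "hist", "dask-histogram",
]


def installed_versions(packages=PACKAGES):
    """Return {package: version string}, or 'not installed' if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name:16s} {ver}")
```

Running this inside the container prints one line per package, which can be checked by eye against the list.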

Next, run ulimit -n 20000. This avoids a "Too many open files" error.
Create a dictionary of the datasets like in fileset.py
Add the dictionary to preprocess.py and copy the structure of one of the loops to dump the output of the preprocess step to a pkl file
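The two steps above can be sketched as follows. This is a minimal illustration, not the contents of fileset.py or preprocess.py: the dataset name, the file path, and the helpers `dump_fileset` and `load_fileset` are hypothetical, and the dictionary layout assumes coffea's `{dataset: {"files": {path: treename}}}` fileset convention. In the real workflow the output of the preprocess step is what gets pickled; here the dictionary is pickled directly so the sketch runs without coffea or remote files.

```python
import pickle

# Hypothetical dataset entry; replace the name and path with real ones.
fileset = {
    "Example_Dataset": {
        "files": {
            "root://example.host//store/user/someone/example.root": "Events",
        },
    },
}


def dump_fileset(obj, path):
    """Write a (preprocessed) fileset to a pkl file."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_fileset(path):
    """Read a fileset back from a pkl file."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    dump_fileset(fileset, "preprocessed_fileset.pkl")
    print(list(load_fileset("preprocessed_fileset.pkl")))
```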

Once that's done running, the first skimming step is done by mutau_bkgd_skim.py

a. The second argument in Line 243 changes where the output root files are written

DisplacedTauAnalysis/mutau_bkgd_skim.py

Line 243 in c008107

    
           out_file = uproot.dask_write(out_dict, "root://cmseos.fnal.gov//store/user/dally/first_skim/"+events.metadata['dataset'], compute=False, tree_name='Events')

b. Line 264 is where you import the pkl file from preprocess.py

DisplacedTauAnalysis/mutau_bkgd_skim.py

Lines 264 to 265 in c008107

    
           with open("preprocessed_fileset.pkl", "rb") as f:
               Stau_QCD_DY_dataset_runnable = pickle.load(f)

c. Line 296 skips any dataset whose output already exists, so a rerun after other datasets failed doesn't skim it again. Change the directory to wherever you're saving the root files

DisplacedTauAnalysis/mutau_bkgd_skim.py

Line 296 in c008107

if samp not in os.listdir("/eos/uscms/store/user/dally/first_skim/"):

d. Line 297 lets you skip the datasets you don't want to run over when you only need a subset. You can comment it out if it's not useful to you

DisplacedTauAnalysis/mutau_bkgd_skim.py

Line 297 in c008107

    
           if "TT" in samp or "DY" in samp or "Stau" in samp or "QCD" in samp or "MET" in samp: continue

e. If a dataset has been processed successfully, you should see the message printed by line 315

DisplacedTauAnalysis/mutau_bkgd_skim.py

Line 315 in c008107

print(f"Finished in {elapsed:.1f}s")

f. If it errors out, the jobs should close themselves
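The skip logic described in steps c and d can be sketched in isolation. This is an illustration under stated assumptions rather than the code in mutau_bkgd_skim.py: the helper `datasets_to_run` and its parameters are hypothetical, and outputs are assumed to appear in the output directory as entries named after the dataset.

```python
import os


def datasets_to_run(samples, out_dir, skip_substrings=()):
    """Return the samples that still need skimming.

    A sample is dropped if an entry with its name already exists in
    out_dir, or if its name contains any of skip_substrings.
    """
    done = set(os.listdir(out_dir))
    todo = []
    for samp in samples:
        if samp in done:
            continue  # already skimmed in an earlier run
        if any(tag in samp for tag in skip_substrings):
            continue  # deliberately excluded from this run
        todo.append(samp)
    return todo
```

For example, `datasets_to_run(names, "/path/to/first_skim/", skip_substrings=("TT", "DY"))` would return only the names that are neither already written nor matched by one of the substrings.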

Next, I usually run mergeFiles.py in the directory where the root files were written, but this is not necessary.
Create another dictionary with the output root files from the first skimming step and follow step 5 for this dictionary
The second skimming step is done by selection_cuts.py. Steps 5a-5f apply here at the respective lines.
Again, I usually run mergeFiles.py here since it makes plotting faster
Plotting is done in plotting_processor_mu.py

a. Lines 327-329 dump the QCD histograms into a pkl file for later use in ROCC-coffea.py, which is where we plot the isolation ROC curves: https://github.com/dally96/DisplacedTauAnalysis/blob/main/plotting_processor_mu.py#L327-L329
Load the pkl file into ROCC-coffea.py

DisplacedTauAnalysis/ROCC-coffea.py

Line 76 in 867fcf2

with open("muon_QCD_hists_Iso_Displaced.pkl", "rb") as file:
Make sure the bins match the ones from plotting_processor_mu.py
Now run the script, and it should output ROC curves
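For orientation, a ROC curve can be built from a pair of binned isolation distributions by scanning a cut over the bin edges. The sketch below uses plain NumPy arrays with made-up bin contents; it is not the code in ROCC-coffea.py, and the function name `roc_from_hists` and the "pass if below the cut" convention are assumptions. The shape check reflects the requirement that both histograms share the same binning.

```python
import numpy as np


def roc_from_hists(signal_counts, background_counts):
    """Efficiencies for cuts of the form 'value < upper edge of bin i'.

    Both inputs must use the same binning. Returns
    (signal_efficiency, background_efficiency), one entry per bin.
    """
    sig = np.asarray(signal_counts, dtype=float)
    bkg = np.asarray(background_counts, dtype=float)
    if sig.shape != bkg.shape:
        raise ValueError("signal and background binning must match")
    sig_eff = np.cumsum(sig) / sig.sum()
    bkg_eff = np.cumsum(bkg) / bkg.sum()
    return sig_eff, bkg_eff


if __name__ == "__main__":
    # Made-up bin contents, for illustration only.
    s, b = roc_from_hists([8, 1, 1], [2, 3, 5])
    print(s, b)
```

Plotting `sig_eff` against `bkg_eff` then gives one point per candidate cut.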

dally96/DisplacedTauAnalysis
