Dijet Scouting Run3 NTuple Maker

Setup Analysis Environment

Load cmssw libraries:

If your shell is csh/tcsh, source cmsset_default.csh instead. Check which shell you are using with echo $0 on your terminal (lxplus/cmslpc).

source /cvmfs/cms.cern.ch/cmsset_default.sh

Setup CMSSW & Pull nTuple maker framework:

cmsrel CMSSW_15_0_6
cd CMSSW_15_0_6/src
cmsenv
git cms-init
git clone git@github.com:asimsek/DijetScoutingRun3NTupleMaker.git DijetScoutingRun3
cd DijetScoutingRun3
scram b clean; scram b -j 8

Run nTuple maker locally

Important

An active proxy is required to access and process CMS data.

voms-proxy-init --voms cms --valid 192:00

Warning

Make sure the correct era, globalTag, and input ROOT file are set in the ScoutingTreeMakerRun3/python/ScoutingTreeMakerRun3.py script.
Also make sure the JEC paths are properly specified in the data/cfg/data_jec_list.txt (or mc_jec_list) file.

cmsRun ScoutingTreeMakerRun3/python/ScoutingTreeMakerRun3.py

Produce nTuples from dataset on CRAB3

ScoutingPFRun3 [2024]

python3 createAndSubmitCrab.py -d Output_ScoutingPFRun3 -v ScoutingPFRun3_Run2024C-v1_03October2025 -i Inputs_ScoutingPFRun3/InputList_Run2024C-v1_ScoutingPFRun3.txt -t crab3_template_data.py -c ../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py --submit

python3 createAndSubmitCrab.py -d Output_ScoutingPFRun3 -v ScoutingPFRun3_Run2024D-v1_03October2025 -i Inputs_ScoutingPFRun3/InputList_Run2024D-v1_ScoutingPFRun3.txt -t crab3_template_data.py -c ../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py --submit

python3 createAndSubmitCrab.py -d Output_ScoutingPFRun3 -v ScoutingPFRun3_Run2024E-v1_03October2025 -i Inputs_ScoutingPFRun3/InputList_Run2024E-v1_ScoutingPFRun3.txt -t crab3_template_data.py -c ../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py --submit

python3 createAndSubmitCrab.py -d Output_ScoutingPFRun3 -v ScoutingPFRun3_Run2024F-v1_03October2025 -i Inputs_ScoutingPFRun3/InputList_Run2024F-v1_ScoutingPFRun3.txt -t crab3_template_data.py -c ../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py --submit

python3 createAndSubmitCrab.py -d Output_ScoutingPFRun3 -v ScoutingPFRun3_Run2024G-v1_03October2025 -i Inputs_ScoutingPFRun3/InputList_Run2024G-v1_ScoutingPFRun3.txt -t crab3_template_data.py -c ../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py --submit

python3 createAndSubmitCrab.py -d Output_ScoutingPFRun3 -v ScoutingPFRun3_Run2024H-v1_03October2025 -i Inputs_ScoutingPFRun3/InputList_Run2024H-v1_ScoutingPFRun3.txt -t crab3_template_data.py -c ../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py --submit

python3 createAndSubmitCrab.py -d Output_ScoutingPFRun3 -v ScoutingPFRun3_Run2024I-v1_03October2025 -i Inputs_ScoutingPFRun3/InputList_Run2024I-v1_ScoutingPFRun3.txt -t crab3_template_data.py -c ../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py --submit
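The seven commands above differ only in the era letter. As a minimal sketch, they could be generated in a loop, assuming the same file naming pattern holds for every era. The helper name build_crab_command is hypothetical and not part of this repository; the script only prints the commands.

```python
import shlex

ERAS = ["C", "D", "E", "F", "G", "H", "I"]  # 2024 eras listed above


def build_crab_command(era, tag="03October2025"):
    """Return the createAndSubmitCrab.py argument list for one 2024 era.

    Hypothetical helper: it only reproduces the pattern of the commands above.
    """
    run = f"Run2024{era}-v1"
    return [
        "python3", "createAndSubmitCrab.py",
        "-d", "Output_ScoutingPFRun3",
        "-v", f"ScoutingPFRun3_{run}_{tag}",
        "-i", f"Inputs_ScoutingPFRun3/InputList_{run}_ScoutingPFRun3.txt",
        "-t", "crab3_template_data.py",
        "-c", "../ScoutingNtuplizer/python/ScoutingTreeMakerRun3.py",
        "--submit",
    ]


if __name__ == "__main__":
    for era in ERAS:
        # Print only; pass the list to subprocess.run(...) to actually submit.
        print(shlex.join(build_crab_command(era)))
```

Printing first makes it easy to inspect the commands before any CRAB task is created.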

Datasets

Run3 Scouting Datasets

| 2024 Datasets | 2025 Datasets |
|---|---|
| /ScoutingPFRun3/Run2024C-v1/HLTSCOUT | /ScoutingPFRun3/Run2025B-v1/HLTSCOUT |
| /ScoutingPFRun3/Run2024D-v1/HLTSCOUT | /ScoutingPFRun3/Run2025C-v1/HLTSCOUT |
| /ScoutingPFRun3/Run2024E-v1/HLTSCOUT | /ScoutingPFRun3/Run2025D-v1/HLTSCOUT |
| /ScoutingPFRun3/Run2024F-v1/HLTSCOUT | /ScoutingPF[0,1]/Run2025E-v1/HLTSCOUT |
| /ScoutingPFRun3/Run2024G-v1/HLTSCOUT | /ScoutingPF[0,1]/Run2025F-v1/HLTSCOUT |
| /ScoutingPFRun3/Run2024H-v1/HLTSCOUT | |
| /ScoutingPFRun3/Run2024I-v1/HLTSCOUT | |

Run3 Monitoring Datasets

| 2024 Datasets | 2025 Datasets |
|---|---|
| /ScoutingPFMonitor/Run2024C-v1/RAW | /ScoutingPFMonitor/Run2025B-v1/RAW |
| /ScoutingPFMonitor/Run2024D-v1/RAW | /ScoutingPFMonitor/Run2025C-v1/RAW |
| /ScoutingPFMonitor/Run2024E-v1/RAW | /ScoutingPFMonitor/Run2025D-v1/RAW |
| /ScoutingPFMonitor/Run2024F-v1/RAW | /ScoutingPFMonitor/Run2025E-v1/RAW |
| /ScoutingPFMonitor/Run2024G-v1/RAW | /ScoutingPFMonitor/Run2025F-v1/RAW |
| /ScoutingPFMonitor/Run2024H-v1/RAW | |
| /ScoutingPFMonitor/Run2024I-v1/RAW | |

Run3 QCD MC Samples

| QCD Samples (RunIII2024Summer24 AODSIM) | Cross section (xsec) [pb] |
|---|---|
| /QCD_Bin-PT-50to80_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 16730000 |
| /QCD_Bin-PT-80to120_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 2506000 |
| /QCD_Bin-PT-120to170_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 439800 |
| /QCD_Bin-PT-170to300_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 113300 |
| /QCD_Bin-PT-300to470_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 7581 |
| /QCD_Bin-PT-470to600_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 623.3 |
| /QCD_Bin-PT-600to800_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 178.7 |
| /QCD_Bin-PT-800to1000_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 30.62 |
| /QCD_Bin-PT-1000to1500_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 9.306 |
| /QCD_Bin-PT-1500to2000_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 0.5015 |
| /QCD_Bin-PT-2000to2500_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 0.04264 |
| /QCD_Bin-PT-2500to3000_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 0.004454 |
| /QCD_Bin-PT-3000_TuneCP5_13p6TeV_pythia8/RunIII2024Summer24DRPremix-140X_mcRun3_2024_realistic_v26-v2/AODSIM | 0.0005539 |
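A common use of these cross sections is to normalise each pT-binned sample to a target luminosity. The following is a minimal sketch, assuming the values are in pb and the usual per-event weight w = xsec * lumi / N_generated. The function name, the luminosity, and the event count in the usage line are illustrative placeholders, not values taken from this repository.

```python
# Cross sections copied from the table above, keyed by pT bin (assumed to be in pb).
QCD_XSEC_PB = {
    "50to80": 16730000.0,
    "80to120": 2506000.0,
    "120to170": 439800.0,
    "170to300": 113300.0,
    "300to470": 7581.0,
    "470to600": 623.3,
    "600to800": 178.7,
    "800to1000": 30.62,
    "1000to1500": 9.306,
    "1500to2000": 0.5015,
    "2000to2500": 0.04264,
    "2500to3000": 0.004454,
    "3000": 0.0005539,
}


def event_weight(pt_bin, lumi_pb_inv, n_generated):
    """Per-event weight scaling one pT bin to lumi_pb_inv (in pb^-1).

    Hypothetical helper; n_generated is the number of processed MC events.
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return QCD_XSEC_PB[pt_bin] * lumi_pb_inv / n_generated


if __name__ == "__main__":
    # Placeholder numbers chosen only to make the arithmetic easy to follow.
    print(event_weight("300to470", lumi_pb_inv=1000.0, n_generated=7581000))
```

With weighted generators the sum of generator weights would replace the plain event count; that case is not covered here.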

Note

Scouting content (e.g. Run3ScoutingPFJets_hltScoutingPFPacker) is not available in the MINIAODSIM format.

CMS DAS Queries

Important

An active proxy is required to access and process CMS data.

voms-proxy-init --voms cms --valid 192:00

Run3 Scouting Datasets

./utils/dasgoclient --query='dataset dataset=/ScoutingPFRun3/Run2024*/HLTSCOUT'
./utils/dasgoclient --query='dataset dataset=/ScoutingPFRun3/Run2025*/HLTSCOUT'

Run3 Monitoring Datasets

./utils/dasgoclient --query='dataset dataset=/ScoutingPFMonitor/Run2024*/RAW'
./utils/dasgoclient --query='dataset dataset=/ScoutingPFMonitor/Run2025*/RAW'

Run3 QCD MC Samples

./utils/dasgoclient --query='dataset dataset=/QCD_*PT-*0_TuneCP5_13p6TeV_pythia8/Run*Summer*/AODSIM'

Run3 Signal Samples

./utils/dasgoclient --query='dataset dataset=/RSGravitonToQuarkQuark*kMpl*/Run*Summer2*/AODSIM'
./utils/dasgoclient --query='dataset dataset=/RSGravitonTo2G_kMpl-001*/Run*Summer2*/AODSIM'
./utils/dasgoclient --query='dataset dataset=/RSGravitonToGluonGluon_kMpl*_TuneCP5_13p6TeV_pythia8/Run*Summer2*/AODSIM'
./utils/dasgoclient --query='dataset dataset=/Qstarto2J*/Run*Summer2*/AODSIM'

Extras

1) List all branches in the official root file

Note

You need an active VOMS proxy to access the ROOT file and list its branches. If you are working outside the CERN machines (lxplus), e.g. on the LPC machines, prefix the ROOT path with root://cms-xrd-global.cern.ch/.

cmsenv
./utils/edmDumpEventFields root://cms-xrd-global.cern.ch//store/data/Run2025C/ScoutingPFRun3/HLTSCOUT/v1/000/392/925/00000/b95d5cc9-62b2-4b3b-a0f9-d0d79b52a85d.root --tree Events --filter Run3ScoutingPFJets_hltScoutingPFPacker --what fields --show-types

Tip

If --tree is omitted, the largest tree is selected automatically.
--show-types displays object types (float, vector, etc.).
--summary prints a compact per-branch table.
--format {json|yaml|csv|raw} selects the output format.

2) Check the status of your lxplus/cmslpc tasks using a web-based GUI

Warning

Replace XXX with the LPC machine you are connected to, and asimsek with your own username.

ssh -L localhost:8787:localhost:8787 asimsek@cmslpcXXX.fnal.gov

Navigate to http://localhost:8787 on your browser.

3) How to compute MC cross-sections with GenXSecAnalyzer

This tool uses the existing CMSSW analyzer (GenXSecAnalyzer.cc) to compute MC cross sections more accurately. It is useful when XSDB provides no cross-section information for a sample.

Warning

Pass the full dataset as input where possible: running over a single file yields only a rough estimate of the cross section.

cmsRun ./utils/genXsec_cfg.py inputFiles="root://cms-xrd-global.cern.ch//store/mc/RunIII2024Summer24DRPremix/QCD_Bin-PT-50to80_TuneCP5_13p6TeV_pythia8/AODSIM/140X_mcRun3_2024_realistic_v26-v2/120000/005210b9-bf51-4f56-be43-814a093fc0af.root" maxEvents=-1
cmsRun ./utils/genXsec_cfg.py dataset="/QCD_Pt_2400to3200_TuneCP5_13p6TeV_pythia8/Run3Winter22MiniAOD-122X_mcRun3_2021_realistic_v9-v2/MINIAODSIM" maxEvents=-1

Tip

Wildcard dataset searches are allowed. Use the star (*) character in the dataset query when running the genXsec_cfg.py tool (e.g. dataset="/QCD_Pt_*0to*0_TuneCP5_13p6TeV_pythia8/Run3Winter22MiniAOD-122X_mcRun3_2021_realistic_v9-v2/MINIAODSIM").

Tip

You can pass a comma-separated dataset list to process multiple datasets at once without a wildcard (e.g. dataset="/A/B/C, /X/Y/Z").

Tip

To combine multiple datasets into a single cross section, add combineSamples=True to the command line.

For more: https://cms-generators.docs.cern.ch/useful-tools-and-links/HowToGenXSecAnalyzer/#during-the-production-of-mc-samples

4) How to print available plugins within CMSSW

edmPluginDump | grep -i scouting

5) List all available edm tools in CMSSW

ls $CMSSW_RELEASE_BASE/bin/$SCRAM_ARCH/edm*

6) Check dataset availability on sites

./utils/check_dataset_completeness /QCD_PT-*0_TuneCP5_13p6TeV_pythia8/Run3Summer22DRPremix-124X_mcRun3_2022_realistic_v12-v2/AODSIM --check-files
./utils/check_dataset_completeness /Qstarto2J_M-*_TuneCP5_13p6TeV_pythia8/Run3Summer22DRPremix-124X_mcRun3_2022_realistic_v12-v2/AODSIM

Tip

An input list is also supported via the --input <yourList> argument.
Use --full_presence to show only sites where dataset presence is 100%.
Use --include-tapes to show TAPE information alongside DISK details.
Wildcards (*) are supported in the dataset argument.

7) XRootD (XRD) Commands

XRDFS:: ROOT file size (if the file is not available on the given EOS path):

xrdfs root://cms-xrd-global.cern.ch stat /store/data/Run2025C/ScoutingPFRun3/HLTSCOUT/v1/000/392/925/00000/b95d5cc9-62b2-4b3b-a0f9-d0d79b52a85d.root | awk '/Size/{print $2}' | numfmt --to=iec
dasgoclient -query='file file=/store/data/Run2025C/ScoutingPFRun3/HLTSCOUT/v1/000/392/925/00000/b95d5cc9-62b2-4b3b-a0f9-d0d79b52a85d.root | grep file.size'

Tip

The xrdfs stat command reports the file size in bytes.
To convert it to a human-readable value, append | awk '/Size/{print $2}' | numfmt --to=iec to the xrdfs command. For the dasgoclient command, | numfmt --to=iec alone is enough.
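When many file sizes are post-processed in a script, the same conversion can be done in Python. This is a minimal sketch using binary (1024-based) units, as numfmt --to=iec does. The function name to_iec is hypothetical, and the rounding and number of decimals may differ slightly from numfmt.

```python
def to_iec(num_bytes):
    """Format a byte count with binary (1024-based) suffixes, e.g. 1536 -> '1.5K'."""
    value = float(num_bytes)
    for suffix in ["", "K", "M", "G", "T", "P"]:
        if value < 1024.0 or suffix == "P":
            # Plain bytes print as an integer; scaled values keep one decimal.
            return f"{int(value)}" if suffix == "" else f"{value:.1f}{suffix}"
        value /= 1024.0


if __name__ == "__main__":
    for size in (500, 1536, 3 * 1024**3):
        print(size, "->", to_iec(size))
```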

XRDCP:: Download Root with XRD Path (root://....)

xrdcp root://cms-xrd-global.cern.ch//store/data/Run2025C/ScoutingPFRun3/HLTSCOUT/v1/000/392/925/00000/b95d5cc9-62b2-4b3b-a0f9-d0d79b52a85d.root .

Warning

You may need to install the XRootD client: brew install xrootd
If Homebrew is not installed: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

8) Find Global Tag info with edmProvDump

edmProvDump root://cms-xrd-global.cern.ch//store/data/Run2025C/ScoutingPFRun3/HLTSCOUT/v1/000/392/925/00000/b95d5cc9-62b2-4b3b-a0f9-d0d79b52a85d.root | grep 'globaltag'

9) How to request dataset transfer with Rucio

Prep the environment (once per session):

source /cvmfs/cms.cern.ch/cmsset_default.sh
source /cvmfs/cms.cern.ch/rucio/setup-py3.sh
voms-proxy-init -voms cms -rfc -valid 192:00
export RUCIO_ACCOUNT=${USER}
rucio whoami

Check where the dataset lives now:

rucio list-dataset-replicas cms:/<Your/Dataset/Name>

Request a disk replica:

Warning

Replace the site RSE (T3_US_FNALLPC) with the site for which you intend to submit a request.

rucio add-rule cms:/ScoutingPFRun3/Run2024H-v1/HLTSCOUT 1 T3_US_FNALLPC --asynchronous --ask-approval

Tip

Provide a descriptive --comment for the rule that can be viewed at any time.
Provide a --lifetime (in seconds), after which the rule will expire and the data protected by the rule will be free for deletion.
For reference; 30 days = 2592000 seconds | 3 months = 7776000 sec | 6 months = 15552000 sec
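The lifetime values quoted above follow from 86400 seconds per day. As a small sketch, the code below computes --lifetime from a number of days and assembles the add-rule command from this section. The helper names and the example comment text are hypothetical, and the command is only printed, not executed.

```python
import shlex

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def lifetime_seconds(days):
    """Convert a rule lifetime in days to the seconds expected by --lifetime."""
    return int(days * SECONDS_PER_DAY)


def build_add_rule(dataset, rse, days, comment):
    """Assemble the rucio add-rule arguments (hypothetical helper)."""
    return [
        "rucio", "add-rule", f"cms:{dataset}", "1", rse,
        "--lifetime", str(lifetime_seconds(days)),
        "--comment", comment,
        "--asynchronous", "--ask-approval",
    ]


if __name__ == "__main__":
    cmd = build_add_rule(
        "/ScoutingPFRun3/Run2024H-v1/HLTSCOUT",
        "T3_US_FNALLPC",
        days=30,
        comment="Dijet scouting ntuple production",  # example text only
    )
    print(shlex.join(cmd))
```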

10) LPC EOS Commands

Check EOS user quota:

eosquota

Check EOS group quota:

eosgrpquota lpcjj

Find the members of the EOS group area:

getent group | grep ^lpcjj

11) Check the dataset size in TB

dasgoclient -query 'summary dataset=/ScoutingPFRun3/Run2024G-v1/HLTSCOUT' | jq -r '.[0].file_size' | awk '{printf "%.3f TB\n", $1/1e12}'
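The jq and awk steps in the pipeline above can also be written in Python when the result is needed inside a script. This is a minimal sketch, assuming the summary query returns a JSON list whose first record has a file_size field in bytes, as the jq filter implies. The function name is hypothetical and the sample string contains made-up numbers.

```python
import json


def dataset_size_tb(summary_json):
    """Mirror jq '.[0].file_size' followed by the awk division by 1e12."""
    records = json.loads(summary_json)
    return records[0]["file_size"] / 1e12


if __name__ == "__main__":
    # Made-up sample standing in for the dasgoclient summary output.
    sample = '[{"file_size": 2500000000000, "nfiles": 10}]'
    print(f"{dataset_size_tb(sample):.3f} TB")
```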
