# HH4b

Search for two boosted (high transverse momentum) Higgs bosons (H) decaying to four beauty quarks (b).
First, create a virtual environment (micromamba is recommended):

```bash
# Download the micromamba setup script (change if needed for your machine:
# https://mamba.readthedocs.io/en/latest/installation/micromamba-installation.html)
# Install: (the micromamba directory can end up taking O(1-10 GB),
# so make sure the directory you're using allows that quota)
"${SHELL}" <(curl -L micro.mamba.pm/install.sh)
# You may need to restart your shell
micromamba create -n hh4b python=3.10 -c conda-forge
micromamba activate hh4b
```

Remember to install this package in your mamba environment.
```bash
# Clone the repository
git clone https://github.com/LPC-HH/HH4b.git
cd HH4b
# Perform an editable installation
pip install -e .
# for committing to the repository
pip install pre-commit
pre-commit install
```

- If the default `python` in your environment is not Python 3, make sure to use the `pip3` and `python3` commands instead.
- You may also need to upgrade `pip` to perform the editable installation:

  ```bash
  python3 -m pip install -e .
  ```

For submitting to condor, all you need is Python >= 3.7.
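A quick way to confirm that the active interpreter meets this requirement is a one-line version check (a convenience suggestion, not part of the repository's scripts):

```bash
# Exits non-zero (with the offending version in the message) if python3 is older than 3.7
python3 -c 'import sys; assert sys.version_info >= (3, 7), sys.version'
```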
For running locally, follow the same virtual environment setup instructions above and install coffea:

```bash
micromamba activate hh4b
pip install coffea
```

Clone the repository:

```bash
git clone https://github.com/LPC-HH/HH4b/
cd HH4b
pip install -e .
```

To test locally first (recommended), you can run e.g.:
```bash
mkdir outfiles
python -W ignore src/run.py --starti 0 --endi 1 --year 2022 --processor skimmer --samples QCD --subsamples "QCD_PT-470to600"
python -W ignore src/run.py --processor skimmer --year 2022EE --nano-version v12_private --samples HH --subsamples GluGlutoHHto4B_kl-1p00_kt-1p00_c2-0p00_TuneCP5_13p6TeV --starti 0 --endi 1
python -W ignore src/run.py --year 2022 --processor trigger_boosted --samples Muon --subsamples Run2022C --nano_version v11_private --starti 0 --endi 1
```

Parquet and pickle files will be saved. Pickles are in the format `{'nevents': int, 'cutflow': Dict[str, int]}`.
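The pickle outputs can be inspected with a few lines of Python. This hypothetical snippet writes and reads back a record in the same `{'nevents': int, 'cutflow': Dict[str, int]}` format (the file name and counts are made up for illustration):

```python
import pickle

# A dummy record in the same format as the skimmer output pickles (values are invented)
record = {"nevents": 1000, "cutflow": {"all": 1000, "trigger": 640, "boosted": 210}}

with open("out.pkl", "wb") as f:
    pickle.dump(record, f)

with open("out.pkl", "rb") as f:
    loaded = pickle.load(f)

print(loaded["nevents"])             # total events processed
print(loaded["cutflow"]["trigger"])  # events passing the trigger selection
```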
Or on specific file(s):

```bash
FILE=/eos/uscms/store/user/rkansal/Hbb/nano/Run3Winter23NanoAOD/QCD_PT-15to7000_TuneCP5_13p6TeV_pythia8/02c29a77-3e0e-40e0-90a1-0562f54144e9.root
python -W ignore src/run.py --processor skimmer --year 2023 --files $FILE --files-name QCD
```

The script `src/condor/submit.py` manually splits up the files into condor jobs:
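Conceptually, this splitting is just chunking the input file list into groups of `--files-per-job` files. A simplified Python sketch of the idea (hypothetical, not the actual `submit.py` logic):

```python
def split_files(files, files_per_job):
    """Chunk a list of input files into per-job groups of at most files_per_job files."""
    return [files[i : i + files_per_job] for i in range(0, len(files), files_per_job)]

# 45 dummy input files split into jobs of 20 files each
files = [f"file_{i}.root" for i in range(45)]
jobs = split_files(files, 20)
print(len(jobs))      # 3 jobs: 20 + 20 + 5 files
print(len(jobs[-1]))  # 5
```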
On a full dataset, e.g.:

```bash
TAG=23Jul13
python src/condor/submit.py --processor skimmer --tag $TAG --files-per-job 20 --submit
```

On a specific sample:
```bash
python src/condor/submit.py --processor skimmer --tag $TAG --nano-version v11_private --samples HH --subsamples GluGlutoHHto4B_kl-1p00_kt-1p00_c2-0p00_TuneCP5_13p6TeV_TSG
```

Over many samples, using a yaml file:
```bash
nohup python src/condor/submit_from_yaml.py --tag $TAG --processor skimmer --save-systematics --submit --yaml src/condor/submit_configs/${YAML}.yaml &> tmp/submitout.txt &
```

To submit (if not using the `--submit` flag):
```bash
nohup bash -c 'for i in condor/'"${TAG}"'/*.jdl; do condor_submit $i; done' &> tmp/submitout.txt &
```

Log in with ssh tunneling:
```bash
ssh -L 8787:localhost:8787 cmslpc-sl7.fnal.gov
```

Run the `./shell` script as set up above via lpcjobqueue:
```bash
./shell coffeateam/coffea-dask:0.7.21-fastjet-3.4.0.1-g6238ea8
```

Renew your grid certificate:
```bash
voms-proxy-init --rfc --voms cms -valid 192:00
```

Run the job submission script:
```bash
python -u -W ignore src/run.py --year 2022EE --yaml src/condor/submit_configs/skimmer_23_10_02.yaml --processor skimmer --nano-version v11 --region signal --save-array --executor dask > dask.out 2>&1
```

Make sure to install the package ([see installing package](#installing-package)) and install all the requirements in your conda environment:
```bash
pip3 install -r requirements.txt
```

Multi-class BDT training:
```bash
python -W ignore TrainBDT.py --data-path /ceph/cms/store/user/rkansal/bbbb/skimmer/24Apr19LegacyFixes_v12_private_signal/ --model-name 24Apr21_legacy_vbf_vars --legacy --sig-keys hh4b vbfhh4b-k2v0 --no-pnet-plots
```

From inside the `src/HH4b/postprocessing` directory:
```bash
# flags in parentheses are optional
python PostProcess.py --templates-tag 24Apr17pT300Cut --tag 24Mar31_v12_signal --legacy --mass H2PNetMass --bdt-model 24Apr21_legacy_vbf_vars --bdt-config 24Apr21_legacy_vbf_vars --txbb-wps 0.99 0.94 --bdt-wps 0.94 0.68 0.03 (--no-fom-scan) (--no-fom-scan-bin1) (--no-fom-scan-bin2) (--no-fom-scan-vbf) (--no-templates) (--bdt-roc)
```

Check that all jobs completed by going through output files:
```bash
for year in 2022 2022EE 2023 2023BPix; do python src/condor/check_jobs.py --tag $TAG --processor trigger (--submit) --year $year; done
```

e.g.

```bash
python src/condor/check_jobs.py --year 2018 --tag Oct9 --processor matching --check-running --user cmantill --submit-missing
```

Combine all output pickles into one:
```bash
for year in 2016APV 2016 2017 2018; do python src/condor/combine_pickles.py --tag $TAG --processor trigger --r --year $year; done
```

Set up CMSSW and combine:

```bash
cmsrel CMSSW_11_3_4
cd CMSSW_11_3_4/src
cmsenv
# float regex PR was merged so we should be able to switch to the main branch now:
git clone -b v9.2.0 https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
git clone -b v2.0.0 https://github.com/cms-analysis/CombineHarvester.git CombineHarvester
# Important: this scram has to be run from src dir
scramv1 b clean; scramv1 b -j 4
```

I also add the combine folder to my PATH in my `.bashrc` for convenience:
```bash
export PATH="$PATH:/uscms_data/d1/rkansal/hh4b/HH4b/src/HH4b/combine"
```

After activating the CMSSW environment from above, you need to install rhalphalib and this repo:
```bash
# rhalphalib
git clone https://github.com/rkansal47/rhalphalib
cd rhalphalib
pip3 install -e . --user  # editable installation
cd ..
# this repo
git clone https://github.com/LPC-HH/HH4b.git
cd HH4b
pip3 install -e . --user  # TODO: check editable installation
```

Then, the command is:
```bash
python3 postprocessing/CreateDatacard.py --templates-dir templates/$TAG --model-name $TAG
```

e.g.

```bash
python3 postprocessing/CreateDatacard.py --templates-dir postprocessing/templates/Apr18 --year 2022-2023 --model-name run3-bdt-apr18
```

All via the below script, with a bunch of options (see the script):
```bash
run_blinded_hh4b.sh --workspace --bfit --limits --passbin 0
```

It will automatically include the VBF category if the directory has a `passvbf.txt` card.
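That kind of auto-detection amounts to a file-existence test on the cards directory. A hypothetical sketch of the check (the directory name is an example, and this is not the actual script logic):

```bash
# Hypothetical: include the VBF category only if its datacard exists
CARDS_DIR="cards/run3-bdt-apr18"   # assumed datacard directory
if [ -f "${CARDS_DIR}/passvbf.txt" ]; then
    echo "including VBF category"
else
    echo "no VBF category found"
fi
```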
e.g.
```bash
python3 postprocessing/PlotFits.py --fit-file cards/run3-bdt-apr18/FitShapes.root --plots-dir ../../plots/PostFit/run3-bdt-apr18 --signal-scale 10
```

This will take 5-10 minutes for 100 toys; it will take far longer for many more than 100.
```bash
# automatically make workspaces and do the background-only fit for orders 0 - 3
run_ftest_hh4b.sh --cardstag run3-bdt-apr2 --templatestag Apr2 --year 2022-2023  # -dl for saving shapes and limits
# run f-test for desired order
run_ftest_hh4b.sh --cardstag run3-bdt-apr2 --goftoys --ffits --numtoys 100 --seed 444 --order 0
```

Check dataset replicas and request a transfer rule with rucio:

```bash
rucio list-dataset-replicas cms:/DATASET
rucio add-rule cms:/DATASET 1 T1_US_FNAL_Disk --activity "User AutoApprove" --lifetime [# of seconds] --ask-approval --comment ''
```
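The `--lifetime` argument takes a duration in seconds. For example, a 30-day lifetime (an arbitrary illustrative choice) can be computed in the shell rather than hard-coded:

```bash
# 30 days expressed in seconds, for use as the rucio --lifetime value
LIFETIME=$((30 * 24 * 60 * 60))
echo "$LIFETIME"   # 2592000
```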