{ "cells": [ { "cell_type": "markdown", "id": "24a6a9df", "metadata": {}, "source": [ "# df103_NanoAODHiggsAnalysis\n", "An example of a complex analysis with RDataFrame: reconstructing the Higgs boson.\n", "\n", "This tutorial is a simplified yet still complex example of an analysis reconstructing\n", "the Higgs boson decaying to two Z bosons from events with four leptons. The data\n", "and simulated events are taken from CERN Open Data and represent a subset of the data\n", "recorded in 2012 with the CMS detector at the LHC. The tutorial follows the Higgs\n", "to four leptons analysis published on the CERN Open Data portal\n", "([10.7483/OPENDATA.CMS.JKB8.RR42](http://opendata.cern.ch/record/5500)).\n", "The resulting plots show the invariant mass of the selected four-lepton systems\n", "in different decay modes (four muons, four electrons and two of each kind)\n", "and in a combined plot indicating the decay of the Higgs boson with a mass\n", "of about 125 GeV.\n", "\n", "The following steps are performed for each sample with data and simulated events\n", "in order to reconstruct the Higgs boson from the selected muons and electrons:\n", "1. Select interesting events with multiple cuts on event properties, e.g.,\n", " number of leptons, kinematics of the leptons and quality of the tracks.\n", "2. Reconstruct two Z bosons, of which only one is on the mass shell, from the selected events and apply additional cuts\n", " on the reconstructed objects.\n", "3. Reconstruct the Higgs boson from the remaining Z boson candidates and calculate\n", " its invariant mass.\n", "\n", "The tutorial has the fast mode enabled by default, which reads the data from already skimmed\n", "datasets with a total size of only 51 MB.
If the fast mode is disabled, the tutorial runs over\n", "the full dataset with a size of 12 GB.\n", "\n", "\n", "\n", "\n", "**Author:** Stefan Wunsch (KIT, CERN) \n", "This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:10 PM." ] }, { "cell_type": "code", "execution_count": 1, "id": "f15e70e7", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.208634Z", "iopub.status.busy": "2026-05-19T20:10:33.208478Z", "iopub.status.idle": "2026-05-19T20:10:33.218737Z", "shell.execute_reply": "2026-05-19T20:10:33.214466Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "#include \"ROOT/RDataFrame.hxx\"\n", "#include \"ROOT/RDFHelpers.hxx\"\n", "#include \"ROOT/RVec.hxx\"\n", "#include \"ROOT/RDF/RInterface.hxx\"\n", "#include \"TCanvas.h\"\n", "#include \"TH1D.h\"\n", "#include \"TLatex.h\"\n", "#include \"TLegend.h\"\n", "#include <Math/Vector4Dfwd.h>\n", "#include <Math/GenVector/LorentzVector.h>\n", "#include <Math/GenVector/PtEtaPhiM4D.h>\n", "#include \"TStyle.h\"\n", "#include <string>\n", "\n", "using namespace ROOT::VecOps;\n", "using RNode = ROOT::RDF::RNode;\n", "using cRVecF = const ROOT::RVecF &;\n", "\n", "const auto z_mass = 91.2;" ] }, { "cell_type": "markdown", "id": "ccc42f1c", "metadata": {}, "source": [ " Select interesting events with four muons\n", " " ] }, { "cell_type": "code", "execution_count": 2, "id": "dd6b1507", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.231089Z", "iopub.status.busy": "2026-05-19T20:10:33.230954Z", "iopub.status.idle": "2026-05-19T20:10:33.301296Z", "shell.execute_reply": "2026-05-19T20:10:33.300437Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RNode selection_4mu(RNode df)\n", "{\n", " auto df_ge4m = df.Filter(\"nMuon>=4\", \"At least four muons\");\n", " auto df_iso = df_ge4m.Filter(\"All(abs(Muon_pfRelIso04_all)<0.40)\", \"Require good isolation\");\n", " auto df_kin = df_iso.Filter(\"All(Muon_pt>5) && All(abs(Muon_eta)<2.4)\", \"Good
muon kinematics\");\n", " auto df_ip3d = df_kin.Define(\"Muon_ip3d\", \"sqrt(Muon_dxy*Muon_dxy + Muon_dz*Muon_dz)\");\n", " auto df_sip3d = df_ip3d.Define(\"Muon_sip3d\", \"Muon_ip3d/sqrt(Muon_dxyErr*Muon_dxyErr + Muon_dzErr*Muon_dzErr)\");\n", " auto df_pv = df_sip3d.Filter(\"All(Muon_sip3d<4) && All(abs(Muon_dxy)<0.5) && All(abs(Muon_dz)<1.0)\",\n", " \"Track close to primary vertex with small uncertainty\");\n", " auto df_2p2n = df_pv.Filter(\"nMuon==4 && Sum(Muon_charge==1)==2 && Sum(Muon_charge==-1)==2\",\n", " \"Two positive and two negative muons\");\n", " return df_2p2n;\n", "}" ] }, { "cell_type": "markdown", "id": "af944fb0", "metadata": {}, "source": [ " Select interesting events with four electrons\n", " " ] }, { "cell_type": "code", "execution_count": 3, "id": "782e975e", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.302953Z", "iopub.status.busy": "2026-05-19T20:10:33.302805Z", "iopub.status.idle": "2026-05-19T20:10:33.323533Z", "shell.execute_reply": "2026-05-19T20:10:33.322999Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RNode selection_4el(RNode df)\n", "{\n", " auto df_ge4el = df.Filter(\"nElectron>=4\", \"At least four electrons\");\n", " auto df_iso = df_ge4el.Filter(\"All(abs(Electron_pfRelIso03_all)<0.40)\", \"Require good isolation\");\n", " auto df_kin = df_iso.Filter(\"All(Electron_pt>7) && All(abs(Electron_eta)<2.5)\", \"Good electron kinematics\");\n", " auto df_ip3d = df_kin.Define(\"Electron_ip3d\", \"sqrt(Electron_dxy*Electron_dxy + Electron_dz*Electron_dz)\");\n", " auto df_sip3d = df_ip3d.Define(\"Electron_sip3d\",\n", " \"Electron_ip3d/sqrt(Electron_dxyErr*Electron_dxyErr + Electron_dzErr*Electron_dzErr)\");\n", " auto df_pv = df_sip3d.Filter(\"All(Electron_sip3d<4) && All(abs(Electron_dxy)<0.5) && \"\n", " \"All(abs(Electron_dz)<1.0)\",\n", " \"Track close to primary vertex with small uncertainty\");\n", " auto df_2p2n = df_pv.Filter(\"nElectron==4 && Sum(Electron_charge==1)==2 &&
Sum(Electron_charge==-1)==2\",\n", " \"Two positive and two negative electrons\");\n", " return df_2p2n;\n", "}" ] }, { "cell_type": "markdown", "id": "91b6ae16", "metadata": {}, "source": [ " Select interesting events with two electrons and two muons\n", " " ] }, { "cell_type": "code", "execution_count": 4, "id": "1e0cc60b", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.325172Z", "iopub.status.busy": "2026-05-19T20:10:33.325036Z", "iopub.status.idle": "2026-05-19T20:10:33.658821Z", "shell.execute_reply": "2026-05-19T20:10:33.658158Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RNode selection_2el2mu(RNode df)\n", "{\n", " auto df_ge2el2mu = df.Filter(\"nElectron>=2 && nMuon>=2\", \"At least two electrons and two muons\");\n", " auto df_eta = df_ge2el2mu.Filter(\"All(abs(Electron_eta)<2.5) && All(abs(Muon_eta)<2.4)\", \"Eta cuts\");\n", " auto pt_cuts = [](cRVecF mu_pt, cRVecF el_pt) {\n", " auto mu_pt_sorted = Reverse(Sort(mu_pt));\n", " if (mu_pt_sorted[0] > 20 && mu_pt_sorted[1] > 10) {\n", " return true;\n", " }\n", " auto el_pt_sorted = Reverse(Sort(el_pt));\n", " if (el_pt_sorted[0] > 20 && el_pt_sorted[1] > 10) {\n", " return true;\n", " }\n", " return false;\n", " };\n", " auto df_pt = df_eta.Filter(pt_cuts, {\"Muon_pt\", \"Electron_pt\"}, \"Pt cuts\");\n", " auto dr_cuts = [](cRVecF mu_eta, cRVecF mu_phi, cRVecF el_eta, cRVecF el_phi) {\n", " auto mu_dr = DeltaR(mu_eta[0], mu_eta[1], mu_phi[0], mu_phi[1]);\n", " auto el_dr = DeltaR(el_eta[0], el_eta[1], el_phi[0], el_phi[1]);\n", " if (mu_dr < 0.02 || el_dr < 0.02) {\n", " return false;\n", " }\n", " return true;\n", " };\n", " auto df_dr = df_pt.Filter(dr_cuts, {\"Muon_eta\", \"Muon_phi\", \"Electron_eta\", \"Electron_phi\"}, \"Dr cuts\");\n", " auto df_iso = df_dr.Filter(\"All(abs(Electron_pfRelIso03_all)<0.40) && All(abs(Muon_pfRelIso04_all)<0.40)\",\n", " \"Require good isolation\");\n", " auto df_el_ip3d = df_iso.Define(\"Electron_ip3d_el\", 
\"sqrt(Electron_dxy*Electron_dxy + Electron_dz*Electron_dz)\");\n", " auto df_el_sip3d = df_el_ip3d.Define(\"Electron_sip3d_el\",\n", " \"Electron_ip3d_el/sqrt(Electron_dxyErr*Electron_dxyErr + \"\n", " \"Electron_dzErr*Electron_dzErr)\");\n", " auto df_el_track = df_el_sip3d.Filter(\"All(Electron_sip3d_el<4) && All(abs(Electron_dxy)<0.5) && All(abs(Electron_dz)<1.0)\",\n", " \"Electron track close to primary vertex with small uncertainty\");\n", " auto df_mu_ip3d = df_el_track.Define(\"Muon_ip3d_mu\", \"sqrt(Muon_dxy*Muon_dxy + Muon_dz*Muon_dz)\");\n", " auto df_mu_sip3d = df_mu_ip3d.Define(\"Muon_sip3d_mu\",\n", " \"Muon_ip3d_mu/sqrt(Muon_dxyErr*Muon_dxyErr + Muon_dzErr*Muon_dzErr)\");\n", " auto df_mu_track = df_mu_sip3d.Filter(\"All(Muon_sip3d_mu<4) && All(abs(Muon_dxy)<0.5) && All(abs(Muon_dz)<1.0)\",\n", " \"Muon track close to primary vertex with small uncertainty\");\n", " auto df_2p2n = df_mu_track.Filter(\"Sum(Electron_charge)==0 && Sum(Muon_charge)==0\",\n", " \"Two opposite charged electron and muon pairs\");\n", " return df_2p2n;\n", "}" ] }, { "cell_type": "markdown", "id": "ce02a7a4", "metadata": {}, "source": [ " Reconstruct two Z candidates from four leptons of the same kind\n", " " ] }, { "cell_type": "code", "execution_count": 5, "id": "d64c3120", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.663439Z", "iopub.status.busy": "2026-05-19T20:10:33.663311Z", "iopub.status.idle": "2026-05-19T20:10:33.722626Z", "shell.execute_reply": "2026-05-19T20:10:33.721839Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RVec<RVec<size_t>> reco_zz_to_4l(cRVecF pt, cRVecF eta, cRVecF phi, cRVecF mass, const ROOT::RVecI & charge)\n", "{\n", " RVec<RVec<size_t>> idx(2);\n", " idx[0].reserve(2); idx[1].reserve(2);\n", "\n", " // Find first lepton pair with invariant mass closest to Z mass\n", " auto idx_cmb = Combinations(pt, 2);\n", " double best_mass = -1; // double, so the best candidate mass is not truncated to an integer\n", " size_t best_i1 = 0; size_t best_i2 = 0;\n", " for (size_t i = 0; i < idx_cmb[0].size();
i++) {\n", " const auto i1 = idx_cmb[0][i];\n", " const auto i2 = idx_cmb[1][i];\n", " if (charge[i1] != charge[i2]) {\n", " ROOT::Math::PtEtaPhiMVector p1(pt[i1], eta[i1], phi[i1], mass[i1]);\n", " ROOT::Math::PtEtaPhiMVector p2(pt[i2], eta[i2], phi[i2], mass[i2]);\n", " const auto this_mass = (p1 + p2).M();\n", " if (std::abs(z_mass - this_mass) < std::abs(z_mass - best_mass)) {\n", " best_mass = this_mass;\n", " best_i1 = i1;\n", " best_i2 = i2;\n", " }\n", " }\n", " }\n", " idx[0].emplace_back(best_i1);\n", " idx[0].emplace_back(best_i2);\n", "\n", " // Reconstruct second Z from remaining lepton pair\n", " for (size_t i = 0; i < 4; i++) {\n", " if (i != best_i1 && i != best_i2) {\n", " idx[1].emplace_back(i);\n", " }\n", " }\n", "\n", " // Return indices of the pairs building two Z bosons\n", " return idx;\n", "}" ] }, { "cell_type": "markdown", "id": "99a397a1", "metadata": {}, "source": [ " Compute Z masses from four leptons of the same kind and sort ascending in distance to Z mass\n", " " ] }, { "cell_type": "code", "execution_count": 6, "id": "3e226a14", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.724165Z", "iopub.status.busy": "2026-05-19T20:10:33.724033Z", "iopub.status.idle": "2026-05-19T20:10:33.746865Z", "shell.execute_reply": "2026-05-19T20:10:33.746083Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "ROOT::RVecF compute_z_masses_4l(const RVec<RVec<size_t>> &idx, cRVecF pt, cRVecF eta, cRVecF phi, cRVecF mass)\n", "{\n", " ROOT::RVecF z_masses(2);\n", " for (size_t i = 0; i < 2; i++) {\n", " const auto i1 = idx[i][0]; const auto i2 = idx[i][1];\n", " ROOT::Math::PtEtaPhiMVector p1(pt[i1], eta[i1], phi[i1], mass[i1]);\n", " ROOT::Math::PtEtaPhiMVector p2(pt[i2], eta[i2], phi[i2], mass[i2]);\n", " z_masses[i] = (p1 + p2).M();\n", " }\n", " if (std::abs(z_masses[0] - z_mass) < std::abs(z_masses[1] - z_mass)) {\n", " return z_masses;\n", " } else {\n", " return Reverse(z_masses);\n", " }\n", "}" ] }, { "cell_type":
"markdown", "id": "b63573c9", "metadata": {}, "source": [ " Compute mass of Higgs from four leptons of the same kind\n", " " ] }, { "cell_type": "code", "execution_count": 7, "id": "eb4e0e77", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.748340Z", "iopub.status.busy": "2026-05-19T20:10:33.748207Z", "iopub.status.idle": "2026-05-19T20:10:33.761916Z", "shell.execute_reply": "2026-05-19T20:10:33.761034Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "float compute_higgs_mass_4l(const RVec<RVec<size_t>> &idx, cRVecF pt, cRVecF eta, cRVecF phi, cRVecF mass)\n", "{\n", " const auto i1 = idx[0][0]; const auto i2 = idx[0][1];\n", " const auto i3 = idx[1][0]; const auto i4 = idx[1][1];\n", " ROOT::Math::PtEtaPhiMVector p1(pt[i1], eta[i1], phi[i1], mass[i1]);\n", " ROOT::Math::PtEtaPhiMVector p2(pt[i2], eta[i2], phi[i2], mass[i2]);\n", " ROOT::Math::PtEtaPhiMVector p3(pt[i3], eta[i3], phi[i3], mass[i3]);\n", " ROOT::Math::PtEtaPhiMVector p4(pt[i4], eta[i4], phi[i4], mass[i4]);\n", " return (p1 + p2 + p3 + p4).M();\n", "}" ] }, { "cell_type": "markdown", "id": "38e3bbc1", "metadata": {}, "source": [ " Apply selection on reconstructed Z candidates\n", " " ] }, { "cell_type": "code", "execution_count": 8, "id": "1220d206", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.763475Z", "iopub.status.busy": "2026-05-19T20:10:33.763334Z", "iopub.status.idle": "2026-05-19T20:10:33.774559Z", "shell.execute_reply": "2026-05-19T20:10:33.773840Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RNode filter_z_candidates(RNode df)\n", "{\n", " auto df_z1_cut = df.Filter(\"Z_mass[0] > 40 && Z_mass[0] < 120\", \"Mass of first Z candidate in [40, 120]\");\n", " auto df_z2_cut = df_z1_cut.Filter(\"Z_mass[1] > 12 && Z_mass[1] < 120\", \"Mass of second Z candidate in [12, 120]\");\n", " return df_z2_cut;\n", "}" ] }, { "cell_type": "markdown", "id": "e94d19f9", "metadata": {}, "source": [ " Reconstruct Higgs from four
muons\n", " " ] }, { "cell_type": "code", "execution_count": 9, "id": "ac957b15", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:33.776082Z", "iopub.status.busy": "2026-05-19T20:10:33.775949Z", "iopub.status.idle": "2026-05-19T20:10:34.171926Z", "shell.execute_reply": "2026-05-19T20:10:34.170646Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RNode reco_higgs_to_4mu(RNode df)\n", "{\n", " // Filter interesting events\n", " auto df_base = selection_4mu(df);\n", "\n", " // Reconstruct Z systems\n", " auto df_z_idx =\n", " df_base.Define(\"Z_idx\", reco_zz_to_4l, {\"Muon_pt\", \"Muon_eta\", \"Muon_phi\", \"Muon_mass\", \"Muon_charge\"});\n", "\n", " // Cut on distance between muons building Z systems\n", " auto filter_z_dr = [](const RVec<RVec<size_t>> &idx, cRVecF eta, cRVecF phi) {\n", " for (size_t i = 0; i < 2; i++) {\n", " const auto i1 = idx[i][0];\n", " const auto i2 = idx[i][1];\n", " const auto dr = DeltaR(eta[i1], eta[i2], phi[i1], phi[i2]);\n", " if (dr < 0.02) {\n", " return false;\n", " }\n", " }\n", " return true;\n", " };\n", " auto df_z_dr =\n", " df_z_idx.Filter(filter_z_dr, {\"Z_idx\", \"Muon_eta\", \"Muon_phi\"}, \"Delta R separation of muons building Z system\");\n", "\n", " // Compute masses of Z systems\n", " auto df_z_mass =\n", " df_z_dr.Define(\"Z_mass\", compute_z_masses_4l, {\"Z_idx\", \"Muon_pt\", \"Muon_eta\", \"Muon_phi\", \"Muon_mass\"});\n", "\n", " // Cut on mass of Z candidates\n", " auto df_z_cut = filter_z_candidates(df_z_mass);\n", "\n", " // Reconstruct H mass\n", " auto df_h_mass =\n", " df_z_cut.Define(\"H_mass\", compute_higgs_mass_4l, {\"Z_idx\", \"Muon_pt\", \"Muon_eta\", \"Muon_phi\", \"Muon_mass\"});\n", "\n", " return df_h_mass;\n", "}" ] }, { "cell_type": "markdown", "id": "8d279918", "metadata": {}, "source": [ " Reconstruct Higgs from four electrons\n", " " ] }, { "cell_type": "code", "execution_count": 10, "id": "58d79c2a", "metadata": { "collapsed": false, "execution": {
"iopub.execute_input": "2026-05-19T20:10:34.180480Z", "iopub.status.busy": "2026-05-19T20:10:34.180324Z", "iopub.status.idle": "2026-05-19T20:10:34.418557Z", "shell.execute_reply": "2026-05-19T20:10:34.418123Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RNode reco_higgs_to_4el(RNode df)\n", "{\n", " // Filter interesting events\n", " auto df_base = selection_4el(df);\n", "\n", " // Reconstruct Z systems\n", " auto df_z_idx = df_base.Define(\"Z_idx\", reco_zz_to_4l,\n", " {\"Electron_pt\", \"Electron_eta\", \"Electron_phi\", \"Electron_mass\", \"Electron_charge\"});\n", "\n", " // Cut on distance between electrons building Z systems\n", " auto filter_z_dr = [](const RVec<RVec<size_t>> &idx, cRVecF eta, cRVecF phi) {\n", " for (size_t i = 0; i < 2; i++) {\n", " const auto i1 = idx[i][0];\n", " const auto i2 = idx[i][1];\n", " const auto dr = DeltaR(eta[i1], eta[i2], phi[i1], phi[i2]);\n", " if (dr < 0.02) {\n", " return false;\n", " }\n", " }\n", " return true;\n", " };\n", " auto df_z_dr = df_z_idx.Filter(filter_z_dr, {\"Z_idx\", \"Electron_eta\", \"Electron_phi\"},\n", " \"Delta R separation of electrons building Z system\");\n", "\n", " // Compute masses of Z systems\n", " auto df_z_mass = df_z_dr.Define(\"Z_mass\", compute_z_masses_4l,\n", " {\"Z_idx\", \"Electron_pt\", \"Electron_eta\", \"Electron_phi\", \"Electron_mass\"});\n", "\n", " // Cut on mass of Z candidates\n", " auto df_z_cut = filter_z_candidates(df_z_mass);\n", "\n", " // Reconstruct H mass\n", " auto df_h_mass = df_z_cut.Define(\"H_mass\", compute_higgs_mass_4l,\n", " {\"Z_idx\", \"Electron_pt\", \"Electron_eta\", \"Electron_phi\", \"Electron_mass\"});\n", "\n", " return df_h_mass;\n", "}" ] }, { "cell_type": "markdown", "id": "164c8a75", "metadata": {}, "source": [ " Compute mass of two Z candidates from two electrons and two muons and sort ascending in distance to Z mass\n", " " ] }, { "cell_type": "code", "execution_count": 11, "id": "a7420e5b", "metadata": { "collapsed": false, "execution": {
"iopub.execute_input": "2026-05-19T20:10:34.420995Z", "iopub.status.busy": "2026-05-19T20:10:34.420855Z", "iopub.status.idle": "2026-05-19T20:10:34.437699Z", "shell.execute_reply": "2026-05-19T20:10:34.436985Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "ROOT::RVecF compute_z_masses_2el2mu(cRVecF el_pt, cRVecF el_eta, cRVecF el_phi, cRVecF el_mass, cRVecF mu_pt,\n", " cRVecF mu_eta, cRVecF mu_phi, cRVecF mu_mass)\n", "{\n", " ROOT::Math::PtEtaPhiMVector p1(mu_pt[0], mu_eta[0], mu_phi[0], mu_mass[0]);\n", " ROOT::Math::PtEtaPhiMVector p2(mu_pt[1], mu_eta[1], mu_phi[1], mu_mass[1]);\n", " ROOT::Math::PtEtaPhiMVector p3(el_pt[0], el_eta[0], el_phi[0], el_mass[0]);\n", " ROOT::Math::PtEtaPhiMVector p4(el_pt[1], el_eta[1], el_phi[1], el_mass[1]);\n", " auto mu_z = (p1 + p2).M();\n", " auto el_z = (p3 + p4).M();\n", " ROOT::RVecF z_masses(2);\n", " if (std::abs(mu_z - z_mass) < std::abs(el_z - z_mass)) {\n", " z_masses[0] = mu_z;\n", " z_masses[1] = el_z;\n", " } else {\n", " z_masses[0] = el_z;\n", " z_masses[1] = mu_z;\n", " }\n", " return z_masses;\n", "}" ] }, { "cell_type": "markdown", "id": "ad09a9ad", "metadata": {}, "source": [ " Compute Higgs mass from two electrons and two muons\n", " " ] }, { "cell_type": "code", "execution_count": 12, "id": "331b2fb6", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:34.439123Z", "iopub.status.busy": "2026-05-19T20:10:34.438998Z", "iopub.status.idle": "2026-05-19T20:10:34.451423Z", "shell.execute_reply": "2026-05-19T20:10:34.450756Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "float compute_higgs_mass_2el2mu(cRVecF el_pt, cRVecF el_eta, cRVecF el_phi, cRVecF el_mass, cRVecF mu_pt, cRVecF mu_eta,\n", " cRVecF mu_phi, cRVecF mu_mass)\n", "{\n", " ROOT::Math::PtEtaPhiMVector p1(mu_pt[0], mu_eta[0], mu_phi[0], mu_mass[0]);\n", " ROOT::Math::PtEtaPhiMVector p2(mu_pt[1], mu_eta[1], mu_phi[1], mu_mass[1]);\n", " ROOT::Math::PtEtaPhiMVector p3(el_pt[0], el_eta[0], el_phi[0], 
el_mass[0]);\n", " ROOT::Math::PtEtaPhiMVector p4(el_pt[1], el_eta[1], el_phi[1], el_mass[1]);\n", " return (p1 + p2 + p3 + p4).M();\n", "}" ] }, { "cell_type": "markdown", "id": "a2d925db", "metadata": {}, "source": [ " Reconstruct Higgs from two electrons and two muons\n", " " ] }, { "cell_type": "code", "execution_count": 13, "id": "b84738bd", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:34.452734Z", "iopub.status.busy": "2026-05-19T20:10:34.452594Z", "iopub.status.idle": "2026-05-19T20:10:34.608922Z", "shell.execute_reply": "2026-05-19T20:10:34.607918Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "RNode reco_higgs_to_2el2mu(RNode df)\n", "{\n", " // Filter interesting events\n", " auto df_base = selection_2el2mu(df);\n", "\n", " // Compute masses of Z systems\n", " auto df_z_mass =\n", " df_base.Define(\"Z_mass\", compute_z_masses_2el2mu, {\"Electron_pt\", \"Electron_eta\", \"Electron_phi\", \"Electron_mass\",\n", " \"Muon_pt\", \"Muon_eta\", \"Muon_phi\", \"Muon_mass\"});\n", "\n", " // Cut on mass of Z candidates\n", " auto df_z_cut = filter_z_candidates(df_z_mass);\n", "\n", " // Reconstruct H mass\n", " auto df_h_mass = df_z_cut.Define(\n", " \"H_mass\", compute_higgs_mass_2el2mu,\n", " {\"Electron_pt\", \"Electron_eta\", \"Electron_phi\", \"Electron_mass\", \"Muon_pt\", \"Muon_eta\", \"Muon_phi\", \"Muon_mass\"});\n", "\n", " return df_h_mass;\n", "}" ] }, { "cell_type": "markdown", "id": "c0f203a4", "metadata": {}, "source": [ " Plot the invariant mass for signal and background processes from simulated events\n", "and overlay the measured data.\n", " " ] }, { "cell_type": "code", "execution_count": 14, "id": "b3928476", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:34.617949Z", "iopub.status.busy": "2026-05-19T20:10:34.617811Z", "iopub.status.idle": "2026-05-19T20:10:34.625323Z", "shell.execute_reply": "2026-05-19T20:10:34.624298Z" } }, "outputs": [], "source": [ "%%cpp
-d\n", "template <typename T>\n", "void plot(T sig, T bkg, T data, const std::string &x_label, const std::string &filename)\n", "{\n", " // Canvas and general style options\n", " gStyle->SetTextFont(42);\n", " auto c = new TCanvas(\"\", \"\", 800, 700);\n", " c->SetLeftMargin(0.15);\n", "\n", " // Get signal and background histograms and stack them to show Higgs signal\n", " // on top of the background process\n", " auto h_bkg = static_cast<TH1D *>(bkg->Clone());\n", " auto h_cmb = static_cast<TH1D *>(sig->Clone());\n", "\n", " h_cmb->Add(h_bkg);\n", " h_cmb->SetTitle(\"\");\n", " h_cmb->GetXaxis()->SetTitle(x_label.c_str());\n", " h_cmb->GetXaxis()->SetTitleSize(0.04);\n", " h_cmb->GetYaxis()->SetTitle(\"N_{Events}\");\n", " h_cmb->GetYaxis()->SetTitleSize(0.04);\n", " h_cmb->SetLineColor(kRed);\n", " h_cmb->SetLineWidth(2);\n", " h_cmb->SetMaximum(18);\n", " h_cmb->SetStats(kFALSE);\n", "\n", " h_bkg->SetLineWidth(2);\n", " h_bkg->SetFillStyle(1001);\n", " h_bkg->SetLineColor(kBlack);\n", " h_bkg->SetFillColor(kAzure - 9);\n", "\n", " // Get histogram of data points\n", " auto h_data = static_cast<TH1D *>(data->Clone());\n", " h_data->SetLineWidth(1);\n", " h_data->SetMarkerStyle(20);\n", " h_data->SetMarkerSize(1.0);\n", " h_data->SetMarkerColor(kBlack);\n", " h_data->SetLineColor(kBlack);\n", "\n", " // Draw histograms\n", " h_cmb->Draw(\"HIST\");\n", " h_bkg->Draw(\"HIST SAME\");\n", " h_data->Draw(\"PE1 SAME\");\n", "\n", " // Add legend\n", " auto legend = new TLegend(0.62, 0.70, 0.82, 0.88);\n", " legend->SetFillColor(0);\n", " legend->SetBorderSize(0);\n", " legend->SetTextSize(0.03);\n", " legend->AddEntry(h_data, \"Data\", \"pe\");\n", " legend->AddEntry(h_bkg, \"ZZ\", \"f\");\n", " legend->AddEntry(h_cmb, \"m_{H} = 125 GeV\", \"f\");\n", " legend->Draw();\n", "\n", " // Add header\n", " TLatex cms_label;\n", " cms_label.SetTextSize(0.04);\n", " cms_label.DrawLatexNDC(0.16, 0.92, \"#bf{CMS Open Data}\");\n", " TLatex header;\n", " header.SetTextSize(0.03);\n", " header.DrawLatexNDC(0.63,
0.92, \"#sqrt{s} = 8 TeV, L_{int} = 11.6 fb^{-1}\");\n", "\n", " // Save plot\n", " c->SaveAs(filename.c_str());\n", "}" ] }, { "cell_type": "markdown", "id": "299c2ea3", "metadata": {}, "source": [ " Arguments are defined. " ] }, { "cell_type": "code", "execution_count": 15, "id": "554cd3f8", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:34.626727Z", "iopub.status.busy": "2026-05-19T20:10:34.626573Z", "iopub.status.idle": "2026-05-19T20:10:35.000000Z", "shell.execute_reply": "2026-05-19T20:10:34.999519Z" } }, "outputs": [], "source": [ "const bool run_fast = true;" ] }, { "cell_type": "markdown", "id": "e3debda8", "metadata": {}, "source": [ "Enable multi-threading" ] }, { "cell_type": "code", "execution_count": 16, "id": "35340f9b", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:35.011821Z", "iopub.status.busy": "2026-05-19T20:10:35.011687Z", "iopub.status.idle": "2026-05-19T20:10:35.239705Z", "shell.execute_reply": "2026-05-19T20:10:35.230160Z" } }, "outputs": [], "source": [ "ROOT::EnableImplicitMT();" ] }, { "cell_type": "markdown", "id": "a7304a0d", "metadata": {}, "source": [ "In fast mode, take samples from */cms_opendata_2012_nanoaod_skimmed/*, which has\n", "the preselections from the selection_* functions already applied." 
] }, { "cell_type": "code", "execution_count": 17, "id": "488036a9", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:35.246403Z", "iopub.status.busy": "2026-05-19T20:10:35.246247Z", "iopub.status.idle": "2026-05-19T20:10:35.483876Z", "shell.execute_reply": "2026-05-19T20:10:35.468491Z" } }, "outputs": [], "source": [ "std::string path = \"root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/\";\n", "if (run_fast) path = \"root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod_skimmed/\";" ] }, { "cell_type": "markdown", "id": "8a83eb18", "metadata": {}, "source": [ "Create dataframes for signal, background and data samples" ] }, { "cell_type": "markdown", "id": "340b89c8", "metadata": {}, "source": [ "Signal: Higgs -> 4 leptons" ] }, { "cell_type": "code", "execution_count": 18, "id": "021c710b", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:35.508187Z", "iopub.status.busy": "2026-05-19T20:10:35.508039Z", "iopub.status.idle": "2026-05-19T20:10:36.588549Z", "shell.execute_reply": "2026-05-19T20:10:36.570646Z" } }, "outputs": [], "source": [ "ROOT::RDataFrame df_sig_4l(\"Events\", path + \"SMHiggsToZZTo4L.root\");" ] }, { "cell_type": "markdown", "id": "435fca84", "metadata": {}, "source": [ "Background: ZZ -> 4 leptons\n", "Note that additional background processes from the original paper with minor contribution were left out for this\n", "tutorial." 
] }, { "cell_type": "code", "execution_count": 19, "id": "a733574f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:36.623091Z", "iopub.status.busy": "2026-05-19T20:10:36.622938Z", "iopub.status.idle": "2026-05-19T20:10:37.516863Z", "shell.execute_reply": "2026-05-19T20:10:37.515583Z" } }, "outputs": [], "source": [ "ROOT::RDataFrame df_bkg_4mu(\"Events\", path + \"ZZTo4mu.root\");\n", "ROOT::RDataFrame df_bkg_4el(\"Events\", path + \"ZZTo4e.root\");\n", "ROOT::RDataFrame df_bkg_2el2mu(\"Events\", path + \"ZZTo2e2mu.root\");" ] }, { "cell_type": "markdown", "id": "123cf4b0", "metadata": {}, "source": [ "CMS data taken in 2012 (11.6 fb^-1 integrated luminosity)" ] }, { "cell_type": "code", "execution_count": 20, "id": "e7344976", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:37.529364Z", "iopub.status.busy": "2026-05-19T20:10:37.529214Z", "iopub.status.idle": "2026-05-19T20:10:38.113161Z", "shell.execute_reply": "2026-05-19T20:10:38.094354Z" } }, "outputs": [], "source": [ "ROOT::RDataFrame df_data_doublemu(\n", " \"Events\", {path + \"Run2012B_DoubleMuParked.root\", path + \"Run2012C_DoubleMuParked.root\"});\n", "ROOT::RDataFrame df_data_doubleel(\n", " \"Events\", {path + \"Run2012B_DoubleElectron.root\", path + \"Run2012C_DoubleElectron.root\"});" ] }, { "cell_type": "markdown", "id": "dbfd01f2", "metadata": {}, "source": [ "Reconstruct Higgs to 4 muons" ] }, { "cell_type": "code", "execution_count": 21, "id": "19a15592", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:38.131498Z", "iopub.status.busy": "2026-05-19T20:10:38.131325Z", "iopub.status.idle": "2026-05-19T20:10:39.763198Z", "shell.execute_reply": "2026-05-19T20:10:39.762450Z" } }, "outputs": [], "source": [ "auto df_sig_4mu_reco = reco_higgs_to_4mu(df_sig_4l);\n", "const auto luminosity = 11580.0; // Integrated luminosity of the data samples\n", "const auto 
xsec_SMHiggsToZZTo4L = 0.0065; // H->4l: Standard Model cross-section\n", "const auto nevt_SMHiggsToZZTo4L = 299973.0; // H->4l: Number of simulated events\n", "const auto nbins = 36; // Number of bins for the invariant mass spectrum\n", "auto df_h_sig_4mu = df_sig_4mu_reco\n", " .Define(\"weight\", [&] { return luminosity * xsec_SMHiggsToZZTo4L / nevt_SMHiggsToZZTo4L; })\n", " .Histo1D({\"h_sig_4mu\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");\n", "\n", "const auto scale_ZZTo4l = 1.386; // ZZ->4mu: Scale factor for ZZ to four leptons\n", "const auto xsec_ZZTo4mu = 0.077; // ZZ->4mu: Standard Model cross-section\n", "const auto nevt_ZZTo4mu = 1499064.0; // ZZ->4mu: Number of simulated events\n", "auto df_bkg_4mu_reco = reco_higgs_to_4mu(df_bkg_4mu);\n", "auto df_h_bkg_4mu = df_bkg_4mu_reco\n", " .Define(\"weight\", [&] { return luminosity * xsec_ZZTo4mu * scale_ZZTo4l / nevt_ZZTo4mu; })\n", " .Histo1D({\"h_bkg_4mu\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");\n", "\n", "auto df_data_4mu_reco = reco_higgs_to_4mu(df_data_doublemu);\n", "auto df_h_data_4mu = df_data_4mu_reco\n", " .Define(\"weight\", [] { return 1.0; })\n", " .Histo1D({\"h_data_4mu\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");" ] }, { "cell_type": "markdown", "id": "1f229f85", "metadata": {}, "source": [ "Reconstruct Higgs to 4 electrons" ] }, { "cell_type": "code", "execution_count": 22, "id": "18903421", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:39.771163Z", "iopub.status.busy": "2026-05-19T20:10:39.771039Z", "iopub.status.idle": "2026-05-19T20:10:40.340124Z", "shell.execute_reply": "2026-05-19T20:10:40.339573Z" } }, "outputs": [], "source": [ "auto df_sig_4el_reco = reco_higgs_to_4el(df_sig_4l);\n", "auto df_h_sig_4el = df_sig_4el_reco\n", " .Define(\"weight\", [&] { return luminosity * xsec_SMHiggsToZZTo4L / nevt_SMHiggsToZZTo4L; })\n", " .Histo1D({\"h_sig_4el\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");\n", "\n", "const auto 
xsec_ZZTo4el = xsec_ZZTo4mu; // ZZ->4el: Standard Model cross-section\n", "const auto nevt_ZZTo4el = 1499093.0; // ZZ->4el: Number of simulated events\n", "auto df_bkg_4el_reco = reco_higgs_to_4el(df_bkg_4el);\n", "auto df_h_bkg_4el = df_bkg_4el_reco\n", " .Define(\"weight\", [&] { return luminosity * xsec_ZZTo4el * scale_ZZTo4l / nevt_ZZTo4el; })\n", " .Histo1D({\"h_bkg_4el\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");\n", "\n", "auto df_data_4el_reco = reco_higgs_to_4el(df_data_doubleel);\n", "auto df_h_data_4el = df_data_4el_reco.Define(\"weight\", [] { return 1.0; })\n", " .Histo1D({\"h_data_4el\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");" ] }, { "cell_type": "markdown", "id": "75827c5a", "metadata": {}, "source": [ "Reconstruct Higgs to 2 electrons and 2 muons" ] }, { "cell_type": "code", "execution_count": 23, "id": "c8a7cc12", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:40.362542Z", "iopub.status.busy": "2026-05-19T20:10:40.362366Z", "iopub.status.idle": "2026-05-19T20:10:40.988046Z", "shell.execute_reply": "2026-05-19T20:10:40.987385Z" } }, "outputs": [], "source": [ "auto df_sig_2el2mu_reco = reco_higgs_to_2el2mu(df_sig_4l);\n", "auto df_h_sig_2el2mu = df_sig_2el2mu_reco\n", " .Define(\"weight\", [&] { return luminosity * xsec_SMHiggsToZZTo4L / nevt_SMHiggsToZZTo4L; })\n", " .Histo1D({\"h_sig_2el2mu\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");\n", "\n", "const auto xsec_ZZTo2el2mu = 0.18; // ZZ->2el2mu: Standard Model cross-section\n", "const auto nevt_ZZTo2el2mu = 1497445.0; // ZZ->2el2mu: Number of simulated events\n", "auto df_bkg_2el2mu_reco = reco_higgs_to_2el2mu(df_bkg_2el2mu);\n", "auto df_h_bkg_2el2mu = df_bkg_2el2mu_reco\n", " .Define(\"weight\", [&] { return luminosity * xsec_ZZTo2el2mu * scale_ZZTo4l / nevt_ZZTo2el2mu; })\n", " .Histo1D({\"h_bkg_2el2mu\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");\n", "\n", "auto df_data_2el2mu_reco = reco_higgs_to_2el2mu(df_data_doublemu);\n", 
"auto df_h_data_2el2mu = df_data_2el2mu_reco.Define(\"weight\", [] { return 1.0; })\n", " .Histo1D({\"h_data_2el2mu_doublemu\", \"\", nbins, 70, 180}, \"H_mass\", \"weight\");" ] }, { "cell_type": "markdown", "id": "1e33ed58", "metadata": {}, "source": [ "RunGraphs runs the event loops of the separate RDataFrame graphs\n", "concurrently. This improves the usage of the available resources\n", "when a single RDataFrame cannot utilize all of them on its own, e.g.,\n", "because it does not have enough data to process." ] }, { "cell_type": "code", "execution_count": 24, "id": "dfe6fc4d", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:41.001254Z", "iopub.status.busy": "2026-05-19T20:10:41.001108Z", "iopub.status.idle": "2026-05-19T20:10:45.446828Z", "shell.execute_reply": "2026-05-19T20:10:45.442695Z" } }, "outputs": [], "source": [ "ROOT::RDF::RunGraphs({df_h_sig_4mu, df_h_bkg_4mu, df_h_data_4mu,\n", " df_h_sig_4el, df_h_bkg_4el, df_h_data_4el,\n", " df_h_sig_2el2mu, df_h_bkg_2el2mu, df_h_data_2el2mu});" ] }, { "cell_type": "markdown", "id": "e5189631", "metadata": {}, "source": [ "Make plots" ] }, { "cell_type": "code", "execution_count": 25, "id": "c2ec800c", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:45.455638Z", "iopub.status.busy": "2026-05-19T20:10:45.455487Z", "iopub.status.idle": "2026-05-19T20:10:46.245080Z", "shell.execute_reply": "2026-05-19T20:10:46.235572Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "Info in <TCanvas::Print>: pdf file higgs_4mu.pdf has been created\n", "Info in <TCanvas::Print>: pdf file higgs_4el.pdf has been created\n", "Info in <TCanvas::Print>: pdf file higgs_2el2mu.pdf has been created\n" ] } ], "source": [ "plot(df_h_sig_4mu, df_h_bkg_4mu, df_h_data_4mu, \"m_{4#mu} (GeV)\", \"higgs_4mu.pdf\");\n", "plot(df_h_sig_4el, df_h_bkg_4el, df_h_data_4el, \"m_{4e} (GeV)\", \"higgs_4el.pdf\");\n", "plot(df_h_sig_2el2mu, df_h_bkg_2el2mu, df_h_data_2el2mu, \"m_{2e2#mu} (GeV)\", \"higgs_2el2mu.pdf\");" ] }, { "cell_type": "markdown", "id": "364f6acb", "metadata": {}, "source": [ "Combine channels for final plot" ] }, { "cell_type": "code", "execution_count": 26, "id": "fd491e41", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:46.274422Z", "iopub.status.busy": "2026-05-19T20:10:46.274240Z", "iopub.status.idle": "2026-05-19T20:10:46.653114Z", "shell.execute_reply": "2026-05-19T20:10:46.635441Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "Info in <TCanvas::Print>: pdf file higgs_4l.pdf has been created\n" ] } ], "source": [ "// Note: Add() accumulates in place, so the 4mu histograms hold the combined 4l result afterwards.\n", "auto h_data_4l = df_h_data_4mu.GetPtr();\n", "h_data_4l->Add(df_h_data_4el.GetPtr());\n", "h_data_4l->Add(df_h_data_2el2mu.GetPtr());\n", "auto h_sig_4l = df_h_sig_4mu.GetPtr();\n", "h_sig_4l->Add(df_h_sig_4el.GetPtr());\n", "h_sig_4l->Add(df_h_sig_2el2mu.GetPtr());\n", "auto h_bkg_4l = df_h_bkg_4mu.GetPtr();\n", "h_bkg_4l->Add(df_h_bkg_4el.GetPtr());\n", "h_bkg_4l->Add(df_h_bkg_2el2mu.GetPtr());\n", "plot(h_sig_4l, h_bkg_4l, h_data_4l, \"m_{4l} (GeV)\", \"higgs_4l.pdf\");" ] }, { "cell_type": "markdown", "id": "ff671d4c", "metadata": {}, "source": [ "Draw all canvases" ] }, { "cell_type": "code", "execution_count": 27, "id": "b5a943ad", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:10:46.675144Z", "iopub.status.busy": "2026-05-19T20:10:46.674997Z", "iopub.status.idle": "2026-05-19T20:10:46.921539Z", "shell.execute_reply": "2026-05-19T20:10:46.911654Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "gROOT->GetListOfCanvases()->Draw()" ] } ], "metadata": { "kernelspec": { "display_name": "ROOT C++", "language": "c++", "name": "root" }, "language_info": { "codemirror_mode": "text/x-c++src", "file_extension": ".C", "mimetype": " text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 5 }