{ "cells": [ { "cell_type": "markdown", "id": "efa859b5", "metadata": {}, "source": [ "# df004_cutFlowReport\n", "Display cut/Filter efficiencies with RDataFrame.\n", "\n", "This tutorial shows how to get information about the efficiency of the filters\n", "applied to an RDataFrame.\n", "\n", "**Author:** Danilo Piparo (CERN)  \n", "This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:09 PM." ] }, { "cell_type": "code", "execution_count": 1, "id": "e498dffd", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:25.716009Z", "iopub.status.busy": "2026-05-19T20:09:25.715861Z", "iopub.status.idle": "2026-05-19T20:09:26.038927Z", "shell.execute_reply": "2026-05-19T20:09:26.038533Z" } }, "outputs": [], "source": [ "using FourVector = ROOT::Math::XYZTVector;\n", "using FourVectors = std::vector<FourVector>;\n", "using CylFourVector = ROOT::Math::RhoEtaPhiVector;" ] }, { "cell_type": "markdown", "id": "32f07e9e", "metadata": {}, "source": [ "A simple helper function to fill a test tree: this makes the example\n", "stand-alone." ] }, { "cell_type": "code", "execution_count": 2, "id": "f4b8694c", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:26.051342Z", "iopub.status.busy": "2026-05-19T20:09:26.051207Z", "iopub.status.idle": "2026-05-19T20:09:26.716970Z", "shell.execute_reply": "2026-05-19T20:09:26.716599Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "void fill_tree(const char *treeName, const char *fileName)\n", "{\n", "   // Build a 50-entry dataset with columns b1 (double) and b2 (int) and write it to fileName\n", "   ROOT::RDataFrame d(50);\n", "   int i(0);\n", "   d.Define(\"b1\", [&i]() { return (double)i; })\n", "      .Define(\"b2\",\n", "              [&i]() {\n", "                 auto j = i * i;\n", "                 ++i;\n", "                 return j;\n", "              })\n", "      .Snapshot(treeName, fileName);\n", "}" ] }, { "cell_type": "markdown", "id": "3db8168a", "metadata": {}, "source": [ "We prepare an input tree to run on." ] }, { "cell_type": "code", 
"execution_count": 3, "id": "1693304e", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:26.742188Z", "iopub.status.busy": "2026-05-19T20:09:26.742038Z", "iopub.status.idle": "2026-05-19T20:09:27.599050Z", "shell.execute_reply": "2026-05-19T20:09:27.591669Z" } }, "outputs": [], "source": [ "auto fileName = \"df004_cutFlowReport.root\";\n", "auto treeName = \"myTree\";\n", "fill_tree(treeName, fileName);" ] }, { "cell_type": "markdown", "id": "2a4cd8b6", "metadata": {}, "source": [ "We read the tree from the file and create a RDataFrame" ] }, { "cell_type": "code", "execution_count": 4, "id": "d9d6c70f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:27.601105Z", "iopub.status.busy": "2026-05-19T20:09:27.600965Z", "iopub.status.idle": "2026-05-19T20:09:27.803247Z", "shell.execute_reply": "2026-05-19T20:09:27.802741Z" } }, "outputs": [], "source": [ "ROOT::RDataFrame d(treeName, fileName, {\"b1\", \"b2\"});" ] }, { "cell_type": "markdown", "id": "de2e7176", "metadata": {}, "source": [ "## Define cuts and create the report\n", "Here we define two simple cuts" ] }, { "cell_type": "code", "execution_count": 5, "id": "98e762c0", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:27.805086Z", "iopub.status.busy": "2026-05-19T20:09:27.804971Z", "iopub.status.idle": "2026-05-19T20:09:28.008104Z", "shell.execute_reply": "2026-05-19T20:09:28.007657Z" } }, "outputs": [], "source": [ "auto cut1 = [](double b1) { return b1 > 25.; };\n", "auto cut2 = [](int b2) { return 0 == b2 % 2; };" ] }, { "cell_type": "markdown", "id": "a20ba8c5", "metadata": {}, "source": [ "An optional string parameter name can be passed to the Filter method to create a named filter.\n", "Named filters work as usual, but also keep track of how many entries they accept and reject." 
] }, { "cell_type": "code", "execution_count": 6, "id": "5fdfe547", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:28.018464Z", "iopub.status.busy": "2026-05-19T20:09:28.018335Z", "iopub.status.idle": "2026-05-19T20:09:28.806092Z", "shell.execute_reply": "2026-05-19T20:09:28.805738Z" } }, "outputs": [], "source": [ "// Cut1 and Cut2 are independent named filters, both applied directly to d\n", "auto filtered1 = d.Filter(cut1, {\"b1\"}, \"Cut1\");\n", "auto filtered2 = d.Filter(cut2, {\"b2\"}, \"Cut2\");\n", "\n", "// Cut3 is chained after Cut2 and acts on the derived column b3\n", "auto augmented1 = filtered2.Define(\"b3\", [](double b1, int b2) { return b1 / b2; });\n", "auto cut3 = [](double x) { return x < .5; };\n", "auto filtered3 = augmented1.Filter(cut3, {\"b3\"}, \"Cut3\");" ] }, { "cell_type": "markdown", "id": "756deb1f", "metadata": {}, "source": [ "Statistics are retrieved through a call to the Report method.\n", "When Report is called on the main RDataFrame object, it retrieves stats\n", "for all named filters declared up to that point.\n", "When called on a stored chain state (i.e. a chain/graph node), it\n", "retrieves stats for all named filters in the section of the chain between\n", "the main RDataFrame and that node (inclusive).\n", "Stats are printed in the order in which the named filters were added to\n", "the graph, and refer to the most recent event loop run on the\n", "relevant RDataFrame." 
] }, { "cell_type": "code", "execution_count": 7, "id": "149745a4", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:28.823351Z", "iopub.status.busy": "2026-05-19T20:09:28.823213Z", "iopub.status.idle": "2026-05-19T20:09:29.674784Z", "shell.execute_reply": "2026-05-19T20:09:29.673634Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cut3 stats:\n", "Cut2 : pass=25 all=50 -- eff=50.00 % cumulative eff=50.00 %\n", "Cut3 : pass=23 all=25 -- eff=92.00 % cumulative eff=46.00 %\n" ] } ], "source": [ "std::cout << \"Cut3 stats:\" << std::endl;\n", "filtered3.Report()->Print();" ] }, { "cell_type": "markdown", "id": "f8dade2b", "metadata": {}, "source": [ "It is not only possible to print the information about cuts, but also to\n", "retrieve it to then use it programmatically." ] }, { "cell_type": "code", "execution_count": 8, "id": "6db29545", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:29.687197Z", "iopub.status.busy": "2026-05-19T20:09:29.687066Z", "iopub.status.idle": "2026-05-19T20:09:30.244938Z", "shell.execute_reply": "2026-05-19T20:09:30.244229Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "All stats:\n", "Cut1 : pass=24 all=50 -- eff=48.00 % cumulative eff=48.00 %\n", "Cut2 : pass=25 all=50 -- eff=50.00 % cumulative eff=50.00 %\n", "Cut3 : pass=23 all=25 -- eff=92.00 % cumulative eff=46.00 %\n", "Cut3 : pass=23 all=25 -- eff=92.00 % cumulative eff=46.00 %\n", "Cut2 : pass=25 all=50 -- eff=50.00 % cumulative eff=50.00 %\n" ] } ], "source": [ "std::cout << \"All stats:\" << std::endl;\n", "auto allCutsReport = d.Report();\n", "allCutsReport->Print();" ] }, { "cell_type": "markdown", "id": "94c06060", "metadata": {}, "source": [ "We can now loop on the cuts" ] }, { "cell_type": "code", "execution_count": 9, "id": "9194b57f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:30.246925Z", 
"iopub.status.busy": "2026-05-19T20:09:30.246800Z", "iopub.status.idle": "2026-05-19T20:09:30.450345Z", "shell.execute_reply": "2026-05-19T20:09:30.449765Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Name\tAll\tPass\tEfficiency\n", "Cut1\t50\t24\t48 %\n", "Cut2\t50\t25\t50 %\n", "Cut3\t25\t23\t92 %\n", "Cut3\t25\t23\t92 %\n", "Cut2\t50\t25\t50 %\n" ] } ], "source": [ "std::cout << \"Name\\tAll\\tPass\\tEfficiency\" << std::endl;\n", "for (auto &&cutInfo : allCutsReport) {\n", " std::cout << cutInfo.GetName() << \"\\t\" << cutInfo.GetAll() << \"\\t\" << cutInfo.GetPass() << \"\\t\"\n", " << cutInfo.GetEff() << \" %\" << std::endl;\n", "}" ] }, { "cell_type": "markdown", "id": "6fbf0047", "metadata": {}, "source": [ "Or get information about them individually" ] }, { "cell_type": "code", "execution_count": 10, "id": "26e18ec3", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:30.452073Z", "iopub.status.busy": "2026-05-19T20:09:30.451951Z", "iopub.status.idle": "2026-05-19T20:09:30.654727Z", "shell.execute_reply": "2026-05-19T20:09:30.654414Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cut1 efficiency is 48 %\n" ] } ], "source": [ "auto cutName = \"Cut1\";\n", "auto cut = allCutsReport->At(\"Cut1\");\n", "std::cout << cutName << \" efficiency is \" << cut.GetEff() << \" %\" << std::endl;" ] }, { "cell_type": "markdown", "id": "edd3e0d2", "metadata": {}, "source": [ "Draw all canvases " ] }, { "cell_type": "code", "execution_count": 11, "id": "c8a9983c", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:30.675370Z", "iopub.status.busy": "2026-05-19T20:09:30.675217Z", "iopub.status.idle": "2026-05-19T20:09:30.877737Z", "shell.execute_reply": "2026-05-19T20:09:30.877343Z" } }, "outputs": [], "source": [ "gROOT->GetListOfCanvases()->Draw()" ] } ], "metadata": { "kernelspec": { "display_name": "ROOT C++", "language": "c++", "name": 
"root" }, "language_info": { "codemirror_mode": "text/x-c++src", "file_extension": ".C", "mimetype": " text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 5 }