{
"cells": [
{
"cell_type": "markdown",
"id": "0ef7ded5",
"metadata": {},
"source": [
"# df013_InspectAnalysis\n",
"Use callbacks to update a plot and a progress bar during the event loop.\n",
"\n",
"Showcase registration of callback functions that act on partial results while\n",
"the event-loop is running using `OnPartialResult` and `OnPartialResultSlot`.\n",
"This tutorial is not meant to run in batch mode.\n",
"\n",
"\n",
"\n",
"\n",
"**Author:** Enrico Guiraud (CERN) \n",
"This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:09 PM."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4b91cf21",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:41.705348Z",
"iopub.status.busy": "2026-05-19T20:09:41.705199Z",
"iopub.status.idle": "2026-05-19T20:09:42.050354Z",
"shell.execute_reply": "2026-05-19T20:09:42.039361Z"
}
},
"outputs": [],
"source": [
"using namespace ROOT; // RDataFrame lives in here"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "32573aa3",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:42.077295Z",
"iopub.status.busy": "2026-05-19T20:09:42.077136Z",
"iopub.status.idle": "2026-05-19T20:09:42.322281Z",
"shell.execute_reply": "2026-05-19T20:09:42.312680Z"
}
},
"outputs": [],
"source": [
"ROOT::EnableImplicitMT();\n",
"const auto poolSize = ROOT::GetThreadPoolSize();\n",
"const auto nSlots = 0 == poolSize ? 1 : poolSize;"
]
},
{
"cell_type": "markdown",
"id": "cee9f073",
"metadata": {},
"source": [
"## Setup a simple RDataFrame\n",
"We start by creating a RDataFrame with a good number of empty events"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "28d60d72",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:42.339319Z",
"iopub.status.busy": "2026-05-19T20:09:42.339187Z",
"iopub.status.idle": "2026-05-19T20:09:42.542082Z",
"shell.execute_reply": "2026-05-19T20:09:42.541387Z"
}
},
"outputs": [],
"source": [
"const auto nEvents = nSlots * 10000ull;\n",
"RDataFrame d(nEvents);"
]
},
{
"cell_type": "markdown",
"id": "2325ee03",
"metadata": {},
"source": [
"`heavyWork` is a lambda that fakes some interesting computation and just returns a normally distributed double"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "3788743f",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:42.543749Z",
"iopub.status.busy": "2026-05-19T20:09:42.543616Z",
"iopub.status.idle": "2026-05-19T20:09:42.746195Z",
"shell.execute_reply": "2026-05-19T20:09:42.745625Z"
}
},
"outputs": [],
"source": [
"TRandom r;\n",
"auto heavyWork = [&r]() {\n",
" for (volatile int i = 0; i < 1000000; ++i)\n",
" ;\n",
" return r.Gaus();\n",
"};"
]
},
{
"cell_type": "markdown",
"id": "2a8d0d02",
"metadata": {},
"source": [
"Let's define a column \"x\" produced by invoking `heavyWork` for each event\n",
"`df` stores a modified data-frame that contains \"x\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "68feba7c",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:42.748254Z",
"iopub.status.busy": "2026-05-19T20:09:42.748127Z",
"iopub.status.idle": "2026-05-19T20:09:43.235723Z",
"shell.execute_reply": "2026-05-19T20:09:43.234531Z"
}
},
"outputs": [],
"source": [
"auto df = d.Define(\"x\", heavyWork);"
]
},
{
"cell_type": "markdown",
"id": "1f1f0c92",
"metadata": {},
"source": [
"Now we register a histogram-filling action with the RDataFrame.\n",
"`h` can be used just like a pointer to TH1D but it is actually a TResultProxy, a smart object that triggers\n",
"an event-loop to fill the pointee histogram if needed."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "544d4db2",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:43.237375Z",
"iopub.status.busy": "2026-05-19T20:09:43.237252Z",
"iopub.status.idle": "2026-05-19T20:09:44.443530Z",
"shell.execute_reply": "2026-05-19T20:09:44.442253Z"
}
},
"outputs": [],
"source": [
"auto h = df.Histo1D({\"browserHisto\", \"\", 100, -2., 2.}, \"x\");"
]
},
{
"cell_type": "markdown",
"id": "1d2cb63a",
"metadata": {},
"source": [
"## Use the callback mechanism to draw the histogram on a TBrowser while it is being filled\n",
"So far we have registered a column \"x\" to a data-frame with `nEvents` events and we registered the filling of a\n",
"histogram with the values of column \"x\".\n",
"In the following we will register three functions for execution during the event-loop:\n",
"- one is to be executed once just before the loop and adds a partially-filled histogram to a TBrowser\n",
"- the next is executed every 50 events and draws the partial histogram on the TBrowser's TPad\n",
"- another callback is responsible of updating a simple progress bar from multiple threads"
]
},
{
"cell_type": "markdown",
"id": "d452e67a",
"metadata": {},
"source": [
"First off we create a TBrowser that contains a \"RDFResults\" directory"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "8850c0a3",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:44.453954Z",
"iopub.status.busy": "2026-05-19T20:09:44.453818Z",
"iopub.status.idle": "2026-05-19T20:09:44.656895Z",
"shell.execute_reply": "2026-05-19T20:09:44.656359Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Warning in : The ROOT browser cannot run in batch mode\n"
]
}
],
"source": [
"auto dfDirectory = new TMemFile(\"RDFResults\", \"RECREATE\");\n",
"auto browser = new TBrowser(\"b\", dfDirectory);"
]
},
{
"cell_type": "markdown",
"id": "2bdefa90",
"metadata": {},
"source": [
"The global pad should now be set to the TBrowser's canvas, let's store its value in a local variable"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "bfb039b1",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:44.658968Z",
"iopub.status.busy": "2026-05-19T20:09:44.658806Z",
"iopub.status.idle": "2026-05-19T20:09:44.869850Z",
"shell.execute_reply": "2026-05-19T20:09:44.869417Z"
}
},
"outputs": [],
"source": [
"auto browserPad = gPad;"
]
},
{
"cell_type": "markdown",
"id": "34d4ee93",
"metadata": {},
"source": [
"A useful feature of `TResultProxy` is its `OnPartialResult` method: it allows us to register a callback that is\n",
"executed once per specified number of events during the event-loop, on \"partial\" versions of the result objects\n",
"contained in the `TResultProxy`. In this case, the partial result is going to be a histogram filled with an\n",
"increasing number of events.\n",
"Instead of requesting the callback to be executed every N entries, this time we use the special value `kOnce` to\n",
"request that it is executed once right before starting the event-loop.\n",
"The callback is a C++11 lambda that registers the partial result object in `dfDirectory`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "11baddcf",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:44.882539Z",
"iopub.status.busy": "2026-05-19T20:09:44.882401Z",
"iopub.status.idle": "2026-05-19T20:09:45.085094Z",
"shell.execute_reply": "2026-05-19T20:09:45.084747Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_62:2:30: error: 'dfDirectory' cannot be captured because it does not have automatic storage duration\n",
" h.OnPartialResult(h.kOnce, [dfDirectory](TH1D &h_) { dfDirectory->Add(&h_); });\n",
" ^\n",
"input_line_58:2:7: note: 'dfDirectory' declared here\n",
" auto dfDirectory = new TMemFile(\"RDFResults\", \"RECREATE\");\n",
" ^\n"
]
}
],
"source": [
"h.OnPartialResult(h.kOnce, [dfDirectory](TH1D &h_) { dfDirectory->Add(&h_); });"
]
},
{
"cell_type": "markdown",
"id": "778495aa",
"metadata": {},
"source": [
"Note that we called `OnPartialResult` with a dot, `.`, since this is a method of `TResultProxy` itself.\n",
"We do not want to call `OnPartialResult` on the pointee histogram!)"
]
},
{
"cell_type": "markdown",
"id": "81a0ac96",
"metadata": {},
"source": [
"Multiple callbacks can be registered on the same `TResultProxy` (they are executed one after the other in the\n",
"same order as they were registered). We now request that the partial result is drawn and the TBrowser's TPad is\n",
"updated every 50 events."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "399ac32c",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:45.088255Z",
"iopub.status.busy": "2026-05-19T20:09:45.087873Z",
"iopub.status.idle": "2026-05-19T20:09:45.291133Z",
"shell.execute_reply": "2026-05-19T20:09:45.290537Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_63:2:26: error: 'browserPad' cannot be captured because it does not have automatic storage duration\n",
" h.OnPartialResult(50, [&browserPad](TH1D &hist) {\n",
" ^\n",
"input_line_61:2:7: note: 'browserPad' declared here\n",
" auto browserPad = gPad;\n",
" ^\n",
"In module 'std' imported from input_line_1:1:\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:435:2: error: constructor for 'std::function' must explicitly initialize the base class '_Maybe_unary_or_binary_function' which does not have a default constructor\n",
" function(_Functor&& __f)\n",
" ^\n",
"input_line_63:2:24: note: in instantiation of function template specialization 'std::function::function<(lambda at input_line_63:2:24), void>' requested here\n",
" h.OnPartialResult(50, [&browserPad](TH1D &hist) {\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/refwrap.h:65:12: note: 'std::_Maybe_unary_or_binary_function' declared here\n",
" struct _Maybe_unary_or_binary_function<_Res, _T1>\n",
" ^\n"
]
}
],
"source": [
"h.OnPartialResult(50, [&browserPad](TH1D &hist) {\n",
" if (!browserPad)\n",
" return; // in case root -b was invoked\n",
" browserPad->cd();\n",
" hist.Draw();\n",
" browserPad->Update();\n",
" // This call tells ROOT to process all pending GUI events\n",
" // It allows users to use the TBrowser as usual while the event-loop is running\n",
" gSystem->ProcessEvents();\n",
"});"
]
},
{
"cell_type": "markdown",
"id": "f90d9d8e",
"metadata": {},
"source": [
"Finally, we would like to print a progress bar on the terminal to show how the event-loop is progressing.\n",
"To take into account _all_ events we use `OnPartialResultSlot`: when Implicit Multi-Threading is enabled, in fact,\n",
"`OnPartialResult` invokes the callback only in one of the worker threads, and always returns that worker threads'\n",
"partial result. This is useful because it means we don't have to worry about concurrent execution and\n",
"thread-safety of the callbacks if we are happy with just one threads' partial result.\n",
"`OnPartialResultSlot`, on the other hand, invokes the callback in each one of the worker threads, every time a\n",
"thread finishes processing a batch of `everyN` events. This is what we want for the progress bar, but we need to\n",
"take care that two threads will not print to terminal at the same time: we need a std::mutex for synchronization."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "c12a949f",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:45.292885Z",
"iopub.status.busy": "2026-05-19T20:09:45.292749Z",
"iopub.status.idle": "2026-05-19T20:09:45.496729Z",
"shell.execute_reply": "2026-05-19T20:09:45.496375Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"In module 'std' imported from input_line_1:1:\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:134:2: error: function 'std::_Function_base::_Base_manager<(lambda)>::_M_get_pointer' is used but not defined in this translation unit, and cannot be defined in any other translation unit because its type does not have linkage\n",
" _M_get_pointer(const _Any_data& __source) noexcept\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:290:39: note: used here\n",
" return std::__invoke_r<_Res>(*_Base::_M_get_pointer(__functor),\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:134:2: error: function 'std::_Function_base::_Base_manager<(lambda)>::_M_get_pointer' is used but not defined in this translation unit, and cannot be defined in any other translation unit because its type does not have linkage\n",
" _M_get_pointer(const _Any_data& __source) noexcept\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:290:39: note: used here\n",
" return std::__invoke_r<_Res>(*_Base::_M_get_pointer(__functor),\n",
" ^\n"
]
}
],
"source": [
"std::string progressBar;\n",
"std::mutex barMutex; // Only one thread at a time can lock a mutex. Let's use this to avoid concurrent printing."
]
},
{
"cell_type": "markdown",
"id": "4cb34bad",
"metadata": {},
"source": [
"Magic numbers that yield good progress bars for nSlots = 1,2,4,8"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "d88d2035",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:45.499020Z",
"iopub.status.busy": "2026-05-19T20:09:45.498893Z",
"iopub.status.idle": "2026-05-19T20:09:45.702034Z",
"shell.execute_reply": "2026-05-19T20:09:45.701518Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_65:4:44: error: use of undeclared identifier 'progressBar'\n",
"h.OnPartialResultSlot(everyN, [&barWidth, &progressBar, &barMutex](unsigned int /*slot*/, TH1D & /*partialHist*/) {\n",
" ^\n",
"input_line_65:4:58: error: use of undeclared identifier 'barMutex'\n",
"h.OnPartialResultSlot(everyN, [&barWidth, &progressBar, &barMutex](unsigned int /*slot*/, TH1D & /*partialHist*/) {\n",
" ^\n",
"input_line_65:5:34: error: unknown type name 'barMutex'\n",
" std::lock_guard l(barMutex); // lock_guard locks the mutex at construction, releases it at destruction\n",
" ^\n",
"input_line_65:5:33: warning: parentheses were disambiguated as a function declaration [-Wvexing-parse]\n",
" std::lock_guard l(barMutex); // lock_guard locks the mutex at construction, releases it at destruction\n",
" ^~~~~~~~~~\n",
"input_line_65:5:34: note: add a pair of parentheses to declare a variable\n",
" std::lock_guard l(barMutex); // lock_guard locks the mutex at construction, releases it at destruction\n",
" ^\n",
" (\n",
"input_line_65:6:4: error: use of undeclared identifier 'progressBar'\n",
" progressBar.push_back('#');\n",
" ^\n",
"input_line_65:8:62: error: use of undeclared identifier 'progressBar'\n",
" std::cout << \"\\r[\" << std::left << std::setw(barWidth) << progressBar << ']' << std::flush;\n",
" ^\n"
]
}
],
"source": [
"const auto everyN = nSlots == 8 ? 1000 : 100ull * nSlots;\n",
"const auto barWidth = nEvents / everyN;\n",
"h.OnPartialResultSlot(everyN, [&barWidth, &progressBar, &barMutex](unsigned int /*slot*/, TH1D & /*partialHist*/) {\n",
" std::lock_guard l(barMutex); // lock_guard locks the mutex at construction, releases it at destruction\n",
" progressBar.push_back('#');\n",
" // re-print the line with the progress bar\n",
" std::cout << \"\\r[\" << std::left << std::setw(barWidth) << progressBar << ']' << std::flush;\n",
"});"
]
},
{
"cell_type": "markdown",
"id": "00e04cea",
"metadata": {},
"source": [
"## Running the analysis\n",
"So far we told RDataFrame what we want to happen during the event-loop, but we have not actually run any of those\n",
"actions: the TBrowser is still empty, the progress bar has not been printed even once, and we haven't produced\n",
"a single data-point!\n",
"As usual with RDataFrame, the event-loop is triggered by accessing the contents of a TResultProxy for the first\n",
"time. Let's run!"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "7f89d5ab",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:45.703908Z",
"iopub.status.busy": "2026-05-19T20:09:45.703782Z",
"iopub.status.idle": "2026-05-19T20:09:45.907090Z",
"shell.execute_reply": "2026-05-19T20:09:45.906546Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"In module 'std' imported from input_line_1:1:\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:134:2: error: function 'std::_Function_base::_Base_manager<(lambda)>::_M_get_pointer' is used but not defined in this translation unit, and cannot be defined in any other translation unit because its type does not have linkage\n",
" _M_get_pointer(const _Any_data& __source) noexcept\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:290:39: note: used here\n",
" return std::__invoke_r<_Res>(*_Base::_M_get_pointer(__functor),\n",
" ^\n"
]
}
],
"source": [
"std::cout << \"Analysis running...\" << std::endl;\n",
"h->Draw(); // the final, complete result will be drawn after the event-loop has completed.\n",
"std::cout << \"\\nDone!\" << std::endl;"
]
},
{
"cell_type": "markdown",
"id": "3721b6dd",
"metadata": {},
"source": [
"Finally, some book-keeping: in the TMemFile that we are using as TBrowser directory, we substitute the partial\n",
"result with a clone of the final result (the \"original\" final result will be deleted at the end of the macro)."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "8f7a500b",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:45.908798Z",
"iopub.status.busy": "2026-05-19T20:09:45.908683Z",
"iopub.status.idle": "2026-05-19T20:09:46.139945Z",
"shell.execute_reply": "2026-05-19T20:09:46.138714Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[runStaticInitializersOnce]: Failed to materialize symbols: { (main, { __orc_init_func.cling-module-301, _ZNKSt12__shared_ptrIN4ROOT8Internal3RDF11RActionBaseELN9__gnu_cxx12_Lock_policyE2EE3getEv, _ZN12__cling_N53216__cling_Un1Qu332EPv, _ZN4ROOT3RDF10RResultPtrI4TH1DE11ThrowIfNullEv, cling_module_301_, _ZN4ROOT3RDF10RResultPtrI4TH1DEptEv, $.cling-module-301.__inits.0, _ZNKSt19__shared_ptr_accessIN4ROOT8Internal3RDF11RActionBaseELN9__gnu_cxx12_Lock_policyE2ELb0ELb0EE6_M_getEv, _ZStneIN4ROOT8Internal3RDF11RActionBaseEEbRKSt10shared_ptrIT_EDn, __vd_init_order__cling_Un1Qu33, _ZNKSt19__shared_ptr_accessIN4ROOT8Internal3RDF11RActionBaseELN9__gnu_cxx12_Lock_policyE2ELb0ELb0EEptEv, cling_module_301_.3, _ZN12__cling_N5325cloneE, _ZN4ROOT3RDF10RResultPtrI4TH1DE12GetSharedPtrEv, _ZNKSt12__shared_ptrIN4ROOT8Internal3RDF11RActionBaseELN9__gnu_cxx12_Lock_policyE2EEcvbEv, _ZN4ROOT3RDF10RResultPtrI4TH1DE10TriggerRunEv, _Z30__fd_init_order__cling_Un1Qu32v, _GLOBAL__sub_I_cling_module_301 }) }\n",
"IncrementalExecutor::executeFunction: symbol '_ZSteqI4TH1DEbRKSt10shared_ptrIT_EDn' unresolved while linking [cling interface function]!\n",
"You are probably missing the definition of bool std::operator==(std::shared_ptr const&, decltype(nullptr))\n",
"Maybe you need to load the corresponding shared library?\n"
]
}
],
"source": [
"dfDirectory->Clear();\n",
"auto clone = static_cast(h->Clone());\n",
"clone->SetDirectory(nullptr);\n",
"dfDirectory->Add(clone);\n",
"if (!browserPad)\n",
" return; // in case root -b was invoked\n",
"browserPad->cd();\n",
"clone->Draw();\n",
"browserPad->Update();"
]
},
{
"cell_type": "markdown",
"id": "eba5a4db",
"metadata": {},
"source": [
"Draw all canvases "
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "c43868b6",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:46.141685Z",
"iopub.status.busy": "2026-05-19T20:09:46.141508Z",
"iopub.status.idle": "2026-05-19T20:09:46.344939Z",
"shell.execute_reply": "2026-05-19T20:09:46.344282Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[runStaticInitializersOnce]: Failed to materialize symbols: { (main, { __orc_init_func.cling-module-301 }) }\n"
]
}
],
"source": [
"gROOT->GetListOfCanvases()->Draw()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ROOT C++",
"language": "c++",
"name": "root"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"file_extension": ".C",
"mimetype": " text/x-c++src",
"name": "c++"
}
},
"nbformat": 4,
"nbformat_minor": 5
}