{ "cells": [ { "cell_type": "markdown", "id": "763b1215", "metadata": {}, "source": [ "# df006_ranges\n", "Use Range to limit the amount of data processed.\n", "\n", "This tutorial shows how to express the concept of ranges when working with the RDataFrame.\n", "\n", "\n", "\n", "\n", "**Author:** Danilo Piparo (CERN) \n", "This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:09 PM." ] }, { "cell_type": "markdown", "id": "d393e400", "metadata": {}, "source": [ " A simple helper function to fill a test tree: this makes the example\n", "stand-alone.\n", " " ] }, { "cell_type": "code", "execution_count": 1, "id": "091cb219", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:28.694591Z", "iopub.status.busy": "2026-05-19T20:09:28.694476Z", "iopub.status.idle": "2026-05-19T20:09:29.364634Z", "shell.execute_reply": "2026-05-19T20:09:29.363740Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "void fill_tree(const char *treeName, const char *fileName)\n", "{\n", " ROOT::RDataFrame d(100);\n", " int i(0);\n", " d.Define(\"b1\", [&i]() { return i; })\n", " .Define(\"b2\",\n", " [&i]() {\n", " float j = i * i;\n", " ++i;\n", " return j;\n", " })\n", " .Snapshot(treeName, fileName);\n", "}" ] }, { "cell_type": "markdown", "id": "a66d6675", "metadata": {}, "source": [ "We prepare an input tree to run on" ] }, { "cell_type": "code", "execution_count": 2, "id": "b579a6fa", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:29.366187Z", "iopub.status.busy": "2026-05-19T20:09:29.366057Z", "iopub.status.idle": "2026-05-19T20:09:30.577818Z", "shell.execute_reply": "2026-05-19T20:09:30.556463Z" } }, "outputs": [], "source": [ "auto fileName = \"df006_ranges.root\";\n", "auto treeName = \"myTree\";\n", "fill_tree(treeName, fileName);" ] }, { "cell_type": "markdown", "id": "9cab0f59", "metadata": {}, "source": [ "We read 
the tree from the file and create an RDataFrame." ] }, { "cell_type": "code", "execution_count": 3, "id": "e0ba682f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:30.595133Z", "iopub.status.busy": "2026-05-19T20:09:30.594985Z", "iopub.status.idle": "2026-05-19T20:09:30.818239Z", "shell.execute_reply": "2026-05-19T20:09:30.817265Z" } }, "outputs": [], "source": [ "ROOT::RDataFrame d(treeName, fileName);" ] }, { "cell_type": "markdown", "id": "070299be", "metadata": {}, "source": [ "## Usage of ranges\n", "Now we'll count some entries using ranges. First, all entries, with no range applied:" ] }, { "cell_type": "code", "execution_count": 4, "id": "c736edeb", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:30.820780Z", "iopub.status.busy": "2026-05-19T20:09:30.820637Z", "iopub.status.idle": "2026-05-19T20:09:31.487031Z", "shell.execute_reply": "2026-05-19T20:09:31.486662Z" } }, "outputs": [], "source": [ "auto c_all = d.Count();" ] }, { "cell_type": "markdown", "id": "9fd6c128", "metadata": {}, "source": [ "This is how you express a range covering the first 30 entries:" ] }, { "cell_type": "code", "execution_count": 5, "id": "eaa9f197", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:31.499353Z", "iopub.status.busy": "2026-05-19T20:09:31.499221Z", "iopub.status.idle": "2026-05-19T20:09:32.200779Z", "shell.execute_reply": "2026-05-19T20:09:32.200381Z" } }, "outputs": [], "source": [ "auto d_0_30 = d.Range(30);\n", "auto c_0_30 = d_0_30.Count();" ] }, { "cell_type": "markdown", "id": "223ad3fd", "metadata": {}, "source": [ "This is how you pick all entries from entry 15 onwards; an end value of 0 means that the range runs until the last entry:" ] }, { "cell_type": "code", "execution_count": 6, "id": "eff037b5", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:32.211010Z", "iopub.status.busy": "2026-05-19T20:09:32.210885Z", "iopub.status.idle": "2026-05-19T20:09:32.529548Z", "shell.execute_reply": 
"2026-05-19T20:09:32.528391Z" } }, "outputs": [], "source": [ "auto d_15_end = d.Range(15, 0);\n", "auto c_15_end = d_15_end.Count();" ] }, { "cell_type": "markdown", "id": "10b97976", "metadata": {}, "source": [ "We can use a stride too: here we pick one entry out of every 3, starting from entry 15." ] }, { "cell_type": "code", "execution_count": 7, "id": "633b460a", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:32.531424Z", "iopub.status.busy": "2026-05-19T20:09:32.531266Z", "iopub.status.idle": "2026-05-19T20:09:32.849006Z", "shell.execute_reply": "2026-05-19T20:09:32.848362Z" } }, "outputs": [], "source": [ "auto d_15_end_3 = d.Range(15, 0, 3);\n", "auto c_15_end_3 = d_15_end_3.Count();" ] }, { "cell_type": "markdown", "id": "0bb51ab4", "metadata": {}, "source": [ "Here the Range is applied first, directly on the whole RDataFrame:\n", "not only actions (like Count) but also filters and new columns can then be attached to it." ] }, { "cell_type": "code", "execution_count": 8, "id": "f018b561", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:32.861305Z", "iopub.status.busy": "2026-05-19T20:09:32.861161Z", "iopub.status.idle": "2026-05-19T20:09:33.454469Z", "shell.execute_reply": "2026-05-19T20:09:33.454040Z" } }, "outputs": [], "source": [ "auto d_0_50 = d.Range(50);\n", "auto c_0_50_odd_b1 = d_0_50.Filter(\"1 == b1 % 2\").Count();" ] }, { "cell_type": "markdown", "id": "4533e128", "metadata": {}, "source": [ "An important thing to notice is that the counts of a filter are relative to the\n", "number of entries a filter \"sees\". Therefore, if a Range is applied after a filter,\n", "the Range will act only on the entries that pass the filter." 
] }, { "cell_type": "code", "execution_count": 9, "id": "51753d63", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:33.460245Z", "iopub.status.busy": "2026-05-19T20:09:33.460115Z", "iopub.status.idle": "2026-05-19T20:09:34.105652Z", "shell.execute_reply": "2026-05-19T20:09:34.105123Z" } }, "outputs": [], "source": [ "auto c_0_3_after_even_b1 = d.Filter(\"0 == b1 % 2\").Range(0, 3).Count();" ] }, { "cell_type": "markdown", "id": "9f18e537", "metadata": {}, "source": [ "Ok, time to wrap up: let's print all counts!" ] }, { "cell_type": "code", "execution_count": 10, "id": "e574a889", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:34.112967Z", "iopub.status.busy": "2026-05-19T20:09:34.112812Z", "iopub.status.idle": "2026-05-19T20:09:34.555841Z", "shell.execute_reply": "2026-05-19T20:09:34.554689Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Usage of ranges:\n", " - All entries: 100\n", " - Entries from 0 to 30: 30\n", " - Entries from 15 onwards: 85\n", " - Entries from 15 onwards in steps of 3: 29\n", " - Entries from 0 to 50, odd only: 25\n", " - First three entries of all even entries: 3\n" ] } ], "source": [ "cout << \"Usage of ranges:\\n\"\n", " << \" - All entries: \" << *c_all << endl\n", " << \" - Entries from 0 to 30: \" << *c_0_30 << endl\n", " << \" - Entries from 15 onwards: \" << *c_15_end << endl\n", " << \" - Entries from 15 onwards in steps of 3: \" << *c_15_end_3 << endl\n", " << \" - Entries from 0 to 50, odd only: \" << *c_0_50_odd_b1 << endl\n", " << \" - First three entries of all even entries: \" << *c_0_3_after_even_b1 << endl;\n", "\n", "return 0;" ] }, { "cell_type": "markdown", "id": "24630523", "metadata": {}, "source": [ "Draw all canvases " ] }, { "cell_type": "code", "execution_count": 11, "id": "a24983c5", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:09:34.564312Z", 
"iopub.status.busy": "2026-05-19T20:09:34.564163Z", "iopub.status.idle": "2026-05-19T20:09:34.774782Z", "shell.execute_reply": "2026-05-19T20:09:34.773710Z" } }, "outputs": [], "source": [ "gROOT->GetListOfCanvases()->Draw()" ] } ], "metadata": { "kernelspec": { "display_name": "ROOT C++", "language": "c++", "name": "root" }, "language_info": { "codemirror_mode": "text/x-c++src", "file_extension": ".C", "mimetype": " text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 5 }