{
"cells": [
{
"cell_type": "markdown",
"id": "00497b7e",
"metadata": {},
"source": [
"# df007_snapshot\n",
"Write ROOT data with RDataFrame.\n",
"\n",
"This tutorial shows how to write out datasets in ROOT format using RDataFrame.\n",
"\n",
"**Author:** Danilo Piparo (CERN)  \n",
"This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:09 PM."
]
},
{
"cell_type": "markdown",
"id": "12f6fcfb",
"metadata": {},
"source": [
"A simple helper function that fills a test tree, so that the example is\n",
"stand-alone."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6043218e",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:29.131274Z",
"iopub.status.busy": "2026-05-19T20:09:29.131148Z",
"iopub.status.idle": "2026-05-19T20:09:29.800338Z",
"shell.execute_reply": "2026-05-19T20:09:29.799906Z"
}
},
"outputs": [],
"source": [
"%%cpp -d\n",
"void fill_tree(const char *treeName, const char *fileName)\n",
"{\n",
" ROOT::RDataFrame d(10000);\n",
" int i(0);\n",
" d.Define(\"b1\", [&i]() { return i; })\n",
" .Define(\"b2\",\n",
" [&i]() {\n",
" float j = i * i;\n",
" ++i;\n",
" return j;\n",
" })\n",
" .Snapshot(treeName, fileName);\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "4def4386",
"metadata": {},
"source": [
"We prepare an input tree to run on."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "62460881",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:29.816538Z",
"iopub.status.busy": "2026-05-19T20:09:29.816387Z",
"iopub.status.idle": "2026-05-19T20:09:30.990542Z",
"shell.execute_reply": "2026-05-19T20:09:30.989839Z"
}
},
"outputs": [],
"source": [
"auto fileName = \"df007_snapshot.root\";\n",
"auto outFileName = \"df007_snapshot_output.root\";\n",
"auto outFileNameAllColumns = \"df007_snapshot_output_allColumns.root\";\n",
"auto treeName = \"myTree\";\n",
"fill_tree(treeName, fileName);"
]
},
{
"cell_type": "markdown",
"id": "88fc87bb",
"metadata": {},
"source": [
"We read the tree from the file and create an RDataFrame."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f04139f8",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:30.995795Z",
"iopub.status.busy": "2026-05-19T20:09:30.995667Z",
"iopub.status.idle": "2026-05-19T20:09:31.215361Z",
"shell.execute_reply": "2026-05-19T20:09:31.206688Z"
}
},
"outputs": [],
"source": [
"ROOT::RDataFrame d(treeName, fileName);"
]
},
{
"cell_type": "markdown",
"id": "8664f9a8",
"metadata": {},
"source": [
"## Select entries\n",
"We now select some entries in the dataset: only those with an even value of `b1` pass the filter."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "824696df",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:31.232457Z",
"iopub.status.busy": "2026-05-19T20:09:31.232304Z",
"iopub.status.idle": "2026-05-19T20:09:31.455461Z",
"shell.execute_reply": "2026-05-19T20:09:31.442710Z"
}
},
"outputs": [],
"source": [
"auto d_cut = d.Filter(\"b1 % 2 == 0\");"
]
},
{
"cell_type": "markdown",
"id": "d57a39f2",
"metadata": {},
"source": [
"## Enrich the dataset\n",
"Define two new columns, `b1_square` and `b2_vector`, which we will write out later."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6ae5808b",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:31.468436Z",
"iopub.status.busy": "2026-05-19T20:09:31.468291Z",
"iopub.status.idle": "2026-05-19T20:09:31.865981Z",
"shell.execute_reply": "2026-05-19T20:09:31.862743Z"
}
},
"outputs": [],
"source": [
"auto d2 = d_cut.Define(\"b1_square\", \"b1 * b1\")\n",
" .Define(\"b2_vector\",\n",
" [](float b2) {\n",
" std::vector<float> v;\n",
" for (int i = 0; i < 3; i++)\n",
" v.push_back(b2 * i);\n",
" return v;\n",
" },\n",
" {\"b2\"});"
]
},
{
"cell_type": "markdown",
"id": "b0591679",
"metadata": {},
"source": [
"## Write it to disk in ROOT format\n",
"We now write to disk a new dataset containing one of the variables originally\n",
"present in the tree together with the new variables.\n",
"The user can explicitly specify the types of the columns as template\n",
"arguments of the Snapshot method; otherwise they are inferred\n",
"automatically."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "7a6504c8",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:31.888431Z",
"iopub.status.busy": "2026-05-19T20:09:31.888285Z",
"iopub.status.idle": "2026-05-19T20:09:32.885044Z",
"shell.execute_reply": "2026-05-19T20:09:32.874958Z"
}
},
"outputs": [],
"source": [
"d2.Snapshot(treeName, outFileName, {\"b1\", \"b1_square\", \"b2_vector\"});"
]
},
{
"cell_type": "markdown",
"id": "302ff153",
"metadata": {},
"source": [
"Open the new file and list the columns of the tree."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "288b08f3",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:32.895333Z",
"iopub.status.busy": "2026-05-19T20:09:32.895162Z",
"iopub.status.idle": "2026-05-19T20:09:33.119471Z",
"shell.execute_reply": "2026-05-19T20:09:33.115344Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"These are the columns b1, b1_square and b2_vector:\n",
"Branch: b1\n",
"Branch: b1_square\n",
"Branch: b2_vector\n"
]
}
],
"source": [
"TFile f1(outFileName);\n",
"auto t = f1.Get<TTree>(treeName);\n",
"std::cout << \"These are the columns b1, b1_square and b2_vector:\" << std::endl;\n",
"for (auto branch : *t->GetListOfBranches()) {\n",
" std::cout << \"Branch: \" << branch->GetName() << std::endl;\n",
"}\n",
"f1.Close();"
]
},
{
"cell_type": "markdown",
"id": "3f6fce9f",
"metadata": {},
"source": [
"We are not forced to spell out the full list of column names: Snapshot also\n",
"accepts a regular expression that selects the columns to write. If nothing is\n",
"specified, all columns are written out."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "b57a6b61",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:33.120940Z",
"iopub.status.busy": "2026-05-19T20:09:33.120816Z",
"iopub.status.idle": "2026-05-19T20:09:33.508778Z",
"shell.execute_reply": "2026-05-19T20:09:33.508271Z"
}
},
"outputs": [],
"source": [
"d2.Snapshot(treeName, outFileNameAllColumns);"
]
},
{
"cell_type": "markdown",
"id": "b100a11b",
"metadata": {},
"source": [
"Open the second file and list the columns of the tree."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "5c557b44",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:33.518875Z",
"iopub.status.busy": "2026-05-19T20:09:33.518745Z",
"iopub.status.idle": "2026-05-19T20:09:33.737828Z",
"shell.execute_reply": "2026-05-19T20:09:33.737290Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"These are all the columns available to this dataframe:\n",
"Branch: b1_square\n",
"Branch: b2_vector\n",
"Branch: b1\n",
"Branch: b2\n"
]
}
],
"source": [
"TFile f2(outFileNameAllColumns);\n",
"t = f2.Get<TTree>(treeName);\n",
"std::cout << \"These are all the columns available to this dataframe:\" << std::endl;\n",
"for (auto branch : *t->GetListOfBranches()) {\n",
" std::cout << \"Branch: \" << branch->GetName() << std::endl;\n",
"}\n",
"f2.Close();"
]
},
{
"cell_type": "markdown",
"id": "0d37b393",
"metadata": {},
"source": [
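"Snapshot also returns a fresh RDataFrame that reads the dataset just written,\n",
"so the analysis chain can restart from it. Its default columns are the ones\n",
"selected for the snapshot, which is why `Histo1D()` needs no column name here.\n",
"The column types could also be stated explicitly, as described above; in this\n",
"call they are inferred."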
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "4ca15d10",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:33.741349Z",
"iopub.status.busy": "2026-05-19T20:09:33.741225Z",
"iopub.status.idle": "2026-05-19T20:09:34.933298Z",
"shell.execute_reply": "2026-05-19T20:09:34.932883Z"
}
},
"outputs": [],
"source": [
"auto snapshot_df = d2.Snapshot(treeName, outFileName, {\"b1_square\"});\n",
"auto h = snapshot_df->Histo1D();\n",
"auto c = new TCanvas();\n",
"h->DrawClone();\n",
"\n",
"return 0;"
]
},
{
"cell_type": "markdown",
"id": "bc7b5fad",
"metadata": {},
"source": [
"Draw all canvases."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "00cb0a52",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:09:34.935537Z",
"iopub.status.busy": "2026-05-19T20:09:34.935410Z",
"iopub.status.idle": "2026-05-19T20:09:35.138143Z",
"shell.execute_reply": "2026-05-19T20:09:35.137726Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"gROOT->GetListOfCanvases()->Draw()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ROOT C++",
"language": "c++",
"name": "root"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"file_extension": ".C",
"mimetype": " text/x-c++src",
"name": "c++"
}
},
"nbformat": 4,
"nbformat_minor": 5
}