{ "cells": [ { "cell_type": "markdown", "id": "e2f9810c", "metadata": {}, "source": [ "# ntpl014_framework\n", "\n", "Example of framework usage for writing RNTuples:\n", "1. Creation of (bare) RNTupleModels and RFieldTokens.\n", "2. Creation of RNTupleWriter and RNTupleParallelWriter when appending to a single TFile.\n", "3. Creation of RNTupleFillContext and RRawPtrWriteEntry per thread, and usage of BindRawPtr.\n", "4. Usage of FillNoFlush(), RNTupleFillStatus::ShouldFlushCluster(), FlushColumns(), and FlushCluster().\n", "\n", "Please note that this tutorial has very simplified versions of classes that could be found in a framework, such as\n", "DataProduct, FileService, ParallelOutputter, and SerializingOutputter. They try to mimic the usage in a framework\n", "(for example, Outputters are agnostic of the data written, which is encapsulated in std::vector), but\n", "are not meant for production usage!\n", "\n", "Also note that this tutorial uses std::thread and std::mutex directly instead of a task scheduling library such as\n", "Threading Building Blocks (TBB). For that reason, turning on ROOT's implicit multithreading (IMT) would not be very\n", "efficient with the simplified code in this tutorial because a thread blocking to acquire a std::mutex cannot \"help\"\n", "the other thread that is currently in the critical section by executing its tasks. If that is wanted, the framework\n", "should use synchronization methods provided by TBB directly (which goes beyond the scope of this tutorial).\n", "\n", "\n", "\n", "\n", "**Author:** The ROOT Team \n", "This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:15 PM."
] }, { "cell_type": "code", "execution_count": 1, "id": "1bea828e", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:26.824179Z", "iopub.status.busy": "2026-05-19T20:15:26.824066Z", "iopub.status.idle": "2026-05-19T20:15:26.834212Z", "shell.execute_reply": "2026-05-19T20:15:26.833630Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "\n", "#include \n", "#include \n", "#include \n", "#include \n", "#include \n", "#include \n", "#include \n", "#include \n", "\n", "#include \n", "#include // for std::size_t\n", "#include // for std::uint32_t\n", "#include // for std::ref\n", "#include \n", "#include \n", "#include \n", "#include \n", "#include \n", "#include \n", "#include // for std::pair\n", "#include \n", "\n", "using ModelTokensPair = std::pair, std::vector>;" ] }, { "cell_type": "markdown", "id": "0b777659", "metadata": {}, "source": [ "A DataProduct associates an arbitrary address to an index in the model." ] }, { "cell_type": "code", "execution_count": 2, "id": "54c258fa", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:26.835447Z", "iopub.status.busy": "2026-05-19T20:15:26.835331Z", "iopub.status.idle": "2026-05-19T20:15:27.155485Z", "shell.execute_reply": "2026-05-19T20:15:27.154904Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_45:8:2: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_45:8:3: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_45:8:4: error: use of undeclared identifier 'cpp'\n", " %%cpp -d\n", " ^\n", "input_line_45:8:9: error: use of undeclared identifier 'd'\n", " %%cpp -d\n", " ^\n" ] } ], "source": [ "struct DataProduct {\n", " std::size_t index;\n", " const void *address;\n", "\n", " DataProduct(std::size_t i, const void *a) : index(i), address(a) {}\n", "};\n", "%%cpp -d" ] }, { "cell_type": "markdown", "id": "95608558", "metadata": {}, "source": [ "The FileService opens a TFile and 
provides synchronization." ] }, { "cell_type": "code", "execution_count": 3, "id": "56762a67", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:27.157264Z", "iopub.status.busy": "2026-05-19T20:15:27.157144Z", "iopub.status.idle": "2026-05-19T20:15:27.361362Z", "shell.execute_reply": "2026-05-19T20:15:27.360760Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_54:16:2: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_54:16:3: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_54:16:4: error: use of undeclared identifier 'cpp'\n", " %%cpp -d\n", " ^\n", "input_line_54:16:9: error: use of undeclared identifier 'd'\n", " %%cpp -d\n", " ^\n" ] } ], "source": [ "class FileService {\n", " std::unique_ptr fFile;\n", " std::mutex fMutex;\n", "\n", "public:\n", " FileService(std::string_view url, std::string_view options = \"\")\n", " {\n", " fFile.reset(TFile::Open(std::string(url).c_str(), std::string(options).c_str()));\n", " // The file is automatically closed when destructing the std::unique_ptr.\n", " }\n", "\n", " TFile &GetFile() { return *fFile; }\n", " std::mutex &GetMutex() { return fMutex; }\n", "};\n", "%%cpp -d" ] }, { "cell_type": "markdown", "id": "96dff42e", "metadata": {}, "source": [ "An Outputter provides the interface to fill DataProducts into an RNTuple." 
] }, { "cell_type": "code", "execution_count": 4, "id": "402265d3", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:27.363105Z", "iopub.status.busy": "2026-05-19T20:15:27.362985Z", "iopub.status.idle": "2026-05-19T20:15:27.567382Z", "shell.execute_reply": "2026-05-19T20:15:27.566694Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_55:6:55: error: use of undeclared identifier 'DataProduct'\n", " virtual void Fill(unsigned slot, const std::vector &products) = 0;\n", " ^\n", "input_line_55:9:2: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_55:9:3: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_55:9:4: error: use of undeclared identifier 'cpp'\n", " %%cpp -d\n", " ^\n", "input_line_55:9:9: error: use of undeclared identifier 'd'\n", " %%cpp -d\n", " ^\n" ] } ], "source": [ "class Outputter {\n", "public:\n", " virtual ~Outputter() = default;\n", "\n", " virtual void InitSlot(unsigned slot) = 0;\n", " virtual void Fill(unsigned slot, const std::vector &products) = 0;\n", "};\n", "%%cpp -d" ] }, { "cell_type": "markdown", "id": "81f52a26", "metadata": {}, "source": [ "A ParallelOutputter uses an RNTupleParallelWriter to append an RNTuple to a TFile." 
] }, { "cell_type": "code", "execution_count": 5, "id": "a2faa3b9", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:27.568921Z", "iopub.status.busy": "2026-05-19T20:15:27.568802Z", "iopub.status.idle": "2026-05-19T20:15:27.773768Z", "shell.execute_reply": "2026-05-19T20:15:27.773018Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_56:2:41: error: expected class name\n", " class ParallelOutputter final : public Outputter {\n", " ^\n", "input_line_56:3:4: error: unknown type name 'FileService'\n", " FileService &fFileService;\n", " ^\n", "input_line_56:14:51: error: unknown type name 'FileService'\n", " ParallelOutputter(ModelTokensPair modelTokens, FileService &fileService, std::string_view ntupleName,\n", " ^\n", "input_line_56:25:33: error: only virtual member functions can be marked 'final'\n", " void InitSlot(unsigned slot) final\n", " ^~~~~\n", "input_line_56:35:47: error: use of undeclared identifier 'DataProduct'\n", " void Fill(unsigned slot, const std::vector &products) final\n", " ^\n", "input_line_56:63:1: error: expected expression\n", "%%cpp -d\n", "^\n", "input_line_56:63:2: error: expected expression\n", "%%cpp -d\n", " ^\n", "input_line_56:63:3: error: use of undeclared identifier 'cpp'\n", "%%cpp -d\n", " ^\n", "input_line_56:63:8: error: use of undeclared identifier 'd'\n", "%%cpp -d\n", " ^\n" ] } ], "source": [ "class ParallelOutputter final : public Outputter {\n", " FileService &fFileService;\n", " std::unique_ptr fParallelWriter;\n", " std::vector fTokens;\n", "\n", " struct SlotData {\n", " std::shared_ptr fillContext;\n", " std::unique_ptr entry;\n", " };\n", " std::vector fSlots;\n", "\n", "public:\n", " ParallelOutputter(ModelTokensPair modelTokens, FileService &fileService, std::string_view ntupleName,\n", " const ROOT::RNTupleWriteOptions &options)\n", " : fFileService(fileService), fTokens(std::move(modelTokens.second))\n", " {\n", " auto &model = 
modelTokens.first;\n", "\n", " std::lock_guard g(fileService.GetMutex());\n", " fParallelWriter =\n", " ROOT::RNTupleParallelWriter::Append(std::move(model), ntupleName, fFileService.GetFile(), options);\n", " }\n", "\n", " void InitSlot(unsigned slot) final\n", " {\n", " if (slot >= fSlots.size()) {\n", " fSlots.resize(slot + 1);\n", " }\n", " // Create an RNTupleFillContext and RRawPtrWriteEntry that are used for all fills from this slot.\n", " fSlots[slot].fillContext = fParallelWriter->CreateFillContext();\n", " fSlots[slot].entry = fSlots[slot].fillContext->GetModel().CreateRawPtrWriteEntry();\n", " }\n", "\n", " void Fill(unsigned slot, const std::vector &products) final\n", " {\n", " assert(slot < fSlots.size());\n", " auto &fillContext = *fSlots[slot].fillContext;\n", " auto &entry = *fSlots[slot].entry;\n", "\n", " // Use the field tokens to bind the products' raw pointers.\n", " for (auto &&product : products) {\n", " entry.BindRawPtr(fTokens[product.index], product.address);\n", " }\n", "\n", " // Fill the entry without triggering an implicit flush.\n", " ROOT::RNTupleFillStatus status;\n", " fillContext.FillNoFlush(entry, status);\n", " if (status.ShouldFlushCluster()) {\n", " // If we are asked to flush, first try to do as much work as possible outside of the critical section:\n", " // FlushColumns() will flush column data and trigger compression, but not actually write to storage.\n", " // (A framework may of course also decide to flush more often.)\n", " fillContext.FlushColumns();\n", "\n", " {\n", " // FlushCluster() will flush data to the underlying TFile, so it requires synchronization.\n", " std::lock_guard g(fFileService.GetMutex());\n", " fillContext.FlushCluster();\n", " }\n", " }\n", " }\n", "};\n", "%%cpp -d" ] }, { "cell_type": "markdown", "id": "6d91cb10", "metadata": {}, "source": [ "A SerializingOutputter uses a sequential RNTupleWriter to append an RNTuple to a TFile and a std::mutex to\n", "synchronize multiple threads. 
Note that ROOT's implicit multithreading would not be very efficient with this\n", "implementation because a thread blocking to acquire a std::mutex cannot \"help\" the other thread that is currently\n", "in the critical section by executing its tasks. See also the note at the top of the file." ] }, { "cell_type": "code", "execution_count": 6, "id": "90151b29", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:27.775261Z", "iopub.status.busy": "2026-05-19T20:15:27.775142Z", "iopub.status.idle": "2026-05-19T20:15:27.980512Z", "shell.execute_reply": "2026-05-19T20:15:27.979800Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_57:2:44: error: expected class name\n", " class SerializingOutputter final : public Outputter {\n", " ^\n", "input_line_57:3:4: error: unknown type name 'FileService'\n", " FileService &fFileService;\n", " ^\n", "input_line_57:14:54: error: unknown type name 'FileService'\n", " SerializingOutputter(ModelTokensPair modelTokens, FileService &fileService, std::string_view ntupleName,\n", " ^\n", "input_line_57:24:33: error: only virtual member functions can be marked 'final'\n", " void InitSlot(unsigned slot) final\n", " ^~~~~\n", "input_line_57:33:47: error: use of undeclared identifier 'DataProduct'\n", " void Fill(unsigned slot, const std::vector &products) final\n", " ^\n", "input_line_57:16:66: error: no member named 'second' in 'std::pair >, std::vector > >'\n", " : fFileService(fileService), fTokens(std::move(modelTokens.second))\n", " ~~~~~~~~~~~ ^\n", "input_line_57:18:33: error: no member named 'first' in 'std::pair >, std::vector > >'\n", " auto &model = modelTokens.first;\n", " ~~~~~~~~~~~ ^\n", "input_line_57:64:1: error: expected expression\n", "%%cpp -d\n", "^\n", "input_line_57:64:2: error: expected expression\n", "%%cpp -d\n", " ^\n", "input_line_57:64:3: error: use of undeclared identifier 'cpp'\n", "%%cpp -d\n", " ^\n", "input_line_57:64:8: error: use of 
undeclared identifier 'd'\n", "%%cpp -d\n", " ^\n" ] } ], "source": [ "class SerializingOutputter final : public Outputter {\n", " FileService &fFileService;\n", " std::unique_ptr fWriter;\n", " std::mutex fWriterMutex;\n", " std::vector fTokens;\n", "\n", " struct SlotData {\n", " std::unique_ptr entry;\n", " };\n", " std::vector fSlots;\n", "\n", "public:\n", " SerializingOutputter(ModelTokensPair modelTokens, FileService &fileService, std::string_view ntupleName,\n", " const ROOT::RNTupleWriteOptions &options)\n", " : fFileService(fileService), fTokens(std::move(modelTokens.second))\n", " {\n", " auto &model = modelTokens.first;\n", "\n", " std::lock_guard g(fileService.GetMutex());\n", " fWriter = ROOT::RNTupleWriter::Append(std::move(model), ntupleName, fileService.GetFile(), options);\n", " }\n", "\n", " void InitSlot(unsigned slot) final\n", " {\n", " if (slot >= fSlots.size()) {\n", " fSlots.resize(slot + 1);\n", " }\n", " // Create an RRawPtrWriteEntry that is used for all fills from this slot.\n", " fSlots[slot].entry = fWriter->GetModel().CreateRawPtrWriteEntry();\n", " }\n", "\n", " void Fill(unsigned slot, const std::vector &products) final\n", " {\n", " assert(slot < fSlots.size());\n", " auto &entry = *fSlots[slot].entry;\n", "\n", " // Use the field tokens to bind the products' raw pointers.\n", " for (auto &&product : products) {\n", " entry.BindRawPtr(fTokens[product.index], product.address);\n", " }\n", "\n", " {\n", " // Fill the entry without triggering an implicit flush. 
This requires synchronization with other threads using\n", " // the same writer, but not (yet) with the underlying TFile.\n", " std::lock_guard g(fWriterMutex);\n", " ROOT::RNTupleFillStatus status;\n", " fWriter->FillNoFlush(entry, status);\n", " if (status.ShouldFlushCluster()) {\n", " // If we are asked to flush, first try to do as much work as possible outside of the critical section:\n", " // FlushColumns() will flush column data and trigger compression, but not actually write to storage.\n", " // (A framework may of course also decide to flush more often.)\n", " fWriter->FlushColumns();\n", "\n", " {\n", " // FlushCluster() will flush data to the underlying TFile, so it requires synchronization.\n", " std::lock_guard g(fFileService.GetMutex());\n", " fWriter->FlushCluster();\n", " }\n", " }\n", " }\n", " }\n", "};\n", "%%cpp -d" ] }, { "cell_type": "markdown", "id": "96635d9b", "metadata": {}, "source": [ "=== END OF TUTORIAL FRAMEWORK CODE ===" ] }, { "cell_type": "markdown", "id": "28ccef3a", "metadata": {}, "source": [ "Simple structs to store events" ] }, { "cell_type": "code", "execution_count": 7, "id": "c833733d", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:27.982048Z", "iopub.status.busy": "2026-05-19T20:15:27.981929Z", "iopub.status.idle": "2026-05-19T20:15:28.186515Z", "shell.execute_reply": "2026-05-19T20:15:28.185866Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_58:8:2: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_58:8:3: error: expected expression\n", " %%cpp -d\n", " ^\n", "input_line_58:8:4: error: use of undeclared identifier 'cpp'\n", " %%cpp -d\n", " ^\n", "input_line_58:8:9: error: use of undeclared identifier 'd'\n", " %%cpp -d\n", " ^\n", "input_line_58:13:1: error: expected expression\n", "%%cpp -d\n", "^\n", "input_line_58:13:2: error: expected expression\n", "%%cpp -d\n", " ^\n", "input_line_58:13:3: error: use of undeclared 
identifier 'cpp'\n", "%%cpp -d\n", " ^\n", "input_line_58:13:8: error: use of undeclared identifier 'd'\n", "%%cpp -d\n", " ^\n", "input_line_58:23:1: error: expected expression\n", "%%cpp -d\n", "^\n", "input_line_58:23:2: error: expected expression\n", "%%cpp -d\n", " ^\n", "input_line_58:23:3: error: use of undeclared identifier 'cpp'\n", "%%cpp -d\n", " ^\n", "input_line_58:23:8: error: use of undeclared identifier 'd'\n", "%%cpp -d\n", " ^\n" ] } ], "source": [ "struct Track {\n", " float eta;\n", " float mass;\n", " float pt;\n", " float phi;\n", "};\n", "%%cpp -d\n", "\n", "struct ChargedTrack : public Track {\n", " std::int8_t charge;\n", "};\n", "%%cpp -d\n", "\n", "struct Event {\n", " std::uint32_t eventId;\n", " std::uint32_t runId;\n", " std::vector electrons;\n", " std::vector photons;\n", " std::vector muons;\n", "};\n", "\n", "%%cpp -d" ] }, { "cell_type": "markdown", "id": "7cbd7e65", "metadata": {}, "source": [ "Simple struct to store runs" ] }, { "cell_type": "code", "execution_count": 8, "id": "addf6b40", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.188098Z", "iopub.status.busy": "2026-05-19T20:15:28.187979Z", "iopub.status.idle": "2026-05-19T20:15:28.402216Z", "shell.execute_reply": "2026-05-19T20:15:28.401476Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "In module 'std' imported from input_line_1:1:\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_vector.h:1141:7: error: function 'std::vector >::operator[]' is used but not defined in this translation unit, and cannot be defined in any other translation unit because its type does not have linkage\n", " operator[](size_type __n) _GLIBCXX_NOEXCEPT\n", " ^\n", "input_line_56: note: used here\n", "In module 'std' imported from input_line_1:1:\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_vector.h:1141:7: error: function 'std::vector >::operator[]' is used but not 
defined in this translation unit, and cannot be defined in any other translation unit because its type does not have linkage\n", " operator[](size_type __n) _GLIBCXX_NOEXCEPT\n", " ^\n", "input_line_57: note: used here\n" ] } ], "source": [ "struct Run {\n", " std::uint32_t runId;\n", " std::uint32_t nEvents;\n", "};\n", "\n", "constexpr unsigned kNRunsPerThread = 100;\n", "constexpr unsigned kMeanNEventsPerRun = 400;\n", "constexpr unsigned kStddevNEventsPerRun = 100;\n", "constexpr unsigned kMeanNTracks = 5;\n", "\n", "constexpr unsigned kNThreads = 4;" ] }, { "cell_type": "markdown", "id": "79832eae", "metadata": {}, "source": [ " RNTupleModel for Events; in a real framework, this would likely be dynamic.\n", " " ] }, { "cell_type": "code", "execution_count": 9, "id": "ae8598f0", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.403794Z", "iopub.status.busy": "2026-05-19T20:15:28.403640Z", "iopub.status.idle": "2026-05-19T20:15:28.421319Z", "shell.execute_reply": "2026-05-19T20:15:28.420785Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_60:8:30: error: use of undeclared identifier 'Event'\n", " model->MakeField(\"eventId\");\n", " ^\n", "input_line_60:11:30: error: use of undeclared identifier 'Event'\n", " model->MakeField(\"runId\");\n", " ^\n", "input_line_60:14:30: error: use of undeclared identifier 'Event'\n", " model->MakeField(\"electrons\");\n", " ^\n", "input_line_60:17:30: error: use of undeclared identifier 'Event'\n", " model->MakeField(\"photons\");\n", " ^\n", "input_line_60:20:30: error: use of undeclared identifier 'Event'\n", " model->MakeField(\"muons\");\n", " ^\n", "input_line_60:23:11: error: no matching constructor for initialization of 'ModelTokensPair' (aka 'pair, std::vector >')\n", " return {std::move(model), std::move(tokens)};\n", " ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n", 
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:294:17: note: candidate constructor not viable: requires 1 argument, but 2 were provided\n", " constexpr pair(const pair&) = default; ///< Copy constructor\n", " ^ ~~~~~~~~~~~\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:295:17: note: candidate constructor not viable: requires 1 argument, but 2 were provided\n", " constexpr pair(pair&&) = default; ///< Move constructor\n", " ^ ~~~~~~\n" ] } ], "source": [ "%%cpp -d\n", "ModelTokensPair CreateEventModel()\n", "{\n", " // We recommend creating a bare model if the default entry is not used.\n", " auto model = ROOT::RNTupleModel::CreateBare();\n", " // For more efficient access, also create field tokens.\n", " std::vector tokens;\n", "\n", " model->MakeField(\"eventId\");\n", " tokens.push_back(model->GetToken(\"eventId\"));\n", "\n", " model->MakeField(\"runId\");\n", " tokens.push_back(model->GetToken(\"runId\"));\n", "\n", " model->MakeField(\"electrons\");\n", " tokens.push_back(model->GetToken(\"electrons\"));\n", "\n", " model->MakeField(\"photons\");\n", " tokens.push_back(model->GetToken(\"photons\"));\n", "\n", " model->MakeField(\"muons\");\n", " tokens.push_back(model->GetToken(\"muons\"));\n", "\n", " return {std::move(model), std::move(tokens)};\n", "}" ] }, { "cell_type": "markdown", "id": "45492f21", "metadata": {}, "source": [ " DataProducts with addresses that point into the Event object.\n", " " ] }, { "cell_type": "code", "execution_count": 10, "id": "4eb4f95c", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.422846Z", "iopub.status.busy": "2026-05-19T20:15:28.422721Z", "iopub.status.idle": "2026-05-19T20:15:28.425996Z", "shell.execute_reply": "2026-05-19T20:15:28.425492Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_61:1:13: error: use of undeclared identifier 'DataProduct'\n", "std::vector 
CreateEventDataProducts(Event &event)\n", " ^\n", "input_line_61:1:50: error: unknown type name 'Event'\n", "std::vector CreateEventDataProducts(Event &event)\n", " ^\n", "input_line_61:3:16: error: use of undeclared identifier 'DataProduct'\n", " std::vector products;\n", " ^\n" ] } ], "source": [ "%%cpp -d\n", "std::vector CreateEventDataProducts(Event &event)\n", "{\n", " std::vector products;\n", " // The indices have to match the order of std::vector above.\n", " products.emplace_back(0, &event.eventId);\n", " products.emplace_back(1, &event.runId);\n", " products.emplace_back(2, &event.electrons);\n", " products.emplace_back(3, &event.photons);\n", " products.emplace_back(4, &event.muons);\n", " return products;\n", "}" ] }, { "cell_type": "markdown", "id": "532980b6", "metadata": {}, "source": [ " RNTupleModel for Runs; in a real framework, this would likely be dynamic.\n", " " ] }, { "cell_type": "code", "execution_count": 11, "id": "22104eb0", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.427415Z", "iopub.status.busy": "2026-05-19T20:15:28.427284Z", "iopub.status.idle": "2026-05-19T20:15:28.431137Z", "shell.execute_reply": "2026-05-19T20:15:28.430598Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_62:8:30: error: use of undeclared identifier 'Run'\n", " model->MakeField(\"runId\");\n", " ^\n", "input_line_62:11:30: error: use of undeclared identifier 'Run'\n", " model->MakeField(\"nEvents\");\n", " ^\n", "input_line_62:14:11: error: no matching constructor for initialization of 'ModelTokensPair' (aka 'pair, std::vector >')\n", " return {std::move(model), std::move(tokens)};\n", " ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:294:17: note: candidate constructor not viable: requires 1 argument, but 2 were provided\n", " constexpr pair(const pair&) = default; ///< Copy constructor\n", " ^ 
~~~~~~~~~~~\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:295:17: note: candidate constructor not viable: requires 1 argument, but 2 were provided\n", " constexpr pair(pair&&) = default; ///< Move constructor\n", " ^ ~~~~~~\n" ] } ], "source": [ "%%cpp -d\n", "ModelTokensPair CreateRunModel()\n", "{\n", " // We recommend creating a bare model if the default entry is not used.\n", " auto model = ROOT::RNTupleModel::CreateBare();\n", " // For more efficient access, also create field tokens.\n", " std::vector tokens;\n", "\n", " model->MakeField(\"runId\");\n", " tokens.push_back(model->GetToken(\"runId\"));\n", "\n", " model->MakeField(\"nEvents\");\n", " tokens.push_back(model->GetToken(\"nEvents\"));\n", "\n", " return {std::move(model), std::move(tokens)};\n", "}" ] }, { "cell_type": "markdown", "id": "6f544981", "metadata": {}, "source": [ " DataProducts with addresses that point into the Run object.\n", " " ] }, { "cell_type": "code", "execution_count": 12, "id": "fa54064e", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.432526Z", "iopub.status.busy": "2026-05-19T20:15:28.432410Z", "iopub.status.idle": "2026-05-19T20:15:28.435389Z", "shell.execute_reply": "2026-05-19T20:15:28.434772Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_63:1:13: error: use of undeclared identifier 'DataProduct'\n", "std::vector CreateRunDataProducts(Run &run)\n", " ^\n", "input_line_63:1:48: error: unknown type name 'Run'\n", "std::vector CreateRunDataProducts(Run &run)\n", " ^\n", "input_line_63:3:16: error: use of undeclared identifier 'DataProduct'\n", " std::vector products;\n", " ^\n" ] } ], "source": [ "%%cpp -d\n", "std::vector CreateRunDataProducts(Run &run)\n", "{\n", " std::vector products;\n", " // The indices have to match the order of std::vector above.\n", " products.emplace_back(0, &run.runId);\n", " products.emplace_back(1, &run.nEvents);\n", " 
return products;\n", "}" ] }, { "cell_type": "markdown", "id": "d6f61f9b", "metadata": {}, "source": [ " Definition of a helper function: " ] }, { "cell_type": "code", "execution_count": 13, "id": "36047e0f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.436571Z", "iopub.status.busy": "2026-05-19T20:15:28.436460Z", "iopub.status.idle": "2026-05-19T20:15:28.450164Z", "shell.execute_reply": "2026-05-19T20:15:28.449508Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_64:1:46: error: unknown type name 'Outputter'\n", "void ProcessRunsAndEvents(unsigned threadId, Outputter &eventOutputter, Outputter &runOutputter)\n", " ^\n", "input_line_64:1:73: error: unknown type name 'Outputter'\n", "void ProcessRunsAndEvents(unsigned threadId, Outputter &eventOutputter, Outputter &runOutputter)\n", " ^\n", "input_line_64:4:49: error: unknown type name 'kMeanNEventsPerRun'\n", " std::normal_distribution nEventsDist(kMeanNEventsPerRun, kStddevNEventsPerRun);\n", " ^\n", "input_line_64:4:69: error: unknown type name 'kStddevNEventsPerRun'\n", " std::normal_distribution nEventsDist(kMeanNEventsPerRun, kStddevNEventsPerRun);\n", " ^\n", "input_line_64:4:48: warning: parentheses were disambiguated as a function declaration [-Wvexing-parse]\n", " std::normal_distribution nEventsDist(kMeanNEventsPerRun, kStddevNEventsPerRun);\n", " ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n", "input_line_64:4:49: note: add a pair of parentheses to declare a variable\n", " std::normal_distribution nEventsDist(kMeanNEventsPerRun, kStddevNEventsPerRun);\n", " ^\n", " (\n", "input_line_64:5:44: error: unknown type name 'kMeanNTracks'\n", " std::poisson_distribution<> nTracksDist(kMeanNTracks);\n", " ^\n", "input_line_64:5:43: warning: parentheses were disambiguated as a function declaration [-Wvexing-parse]\n", " std::poisson_distribution<> nTracksDist(kMeanNTracks);\n", " ^~~~~~~~~~~~~~\n", "input_line_64:5:44: note: add a 
pair of parentheses to declare a variable\n", " std::poisson_distribution<> nTracksDist(kMeanNTracks);\n", " ^\n", " (\n", "input_line_64:8:42: error: use of undeclared identifier 'kNRunsPerThread'\n", " for (std::uint32_t runId = threadId * kNRunsPerThread; runId < (threadId + 1) * kNRunsPerThread; runId++) {\n", " ^\n", "input_line_64:8:84: error: use of undeclared identifier 'kNRunsPerThread'\n", " for (std::uint32_t runId = threadId * kNRunsPerThread; runId < (threadId + 1) * kNRunsPerThread; runId++) {\n", " ^\n", "input_line_64:16:7: error: unknown type name 'Event'\n", " Event event;\n", " ^\n", "input_line_64:51:7: error: unknown type name 'Run'\n", " Run run;\n", " ^\n" ] } ], "source": [ "%%cpp -d\n", "void ProcessRunsAndEvents(unsigned threadId, Outputter &eventOutputter, Outputter &runOutputter)\n", "{\n", " std::mt19937 gen(threadId);\n", " std::normal_distribution nEventsDist(kMeanNEventsPerRun, kStddevNEventsPerRun);\n", " std::poisson_distribution<> nTracksDist(kMeanNTracks);\n", " std::uniform_real_distribution floatDist;\n", "\n", " for (std::uint32_t runId = threadId * kNRunsPerThread; runId < (threadId + 1) * kNRunsPerThread; runId++) {\n", " double nEventsD = nEventsDist(gen);\n", " std::uint32_t nEvents = 0;\n", " if (nEventsD > 0) {\n", " nEvents = static_cast(nEventsD);\n", " }\n", "\n", " // Process events, reusing a single Event object.\n", " Event event;\n", " event.runId = runId;\n", " auto eventProducts = CreateEventDataProducts(event);\n", " for (std::uint32_t eventId = 0; eventId < nEvents; eventId++) {\n", " event.eventId = eventId;\n", "\n", " // Produce some data; eta, phi, and pt are just filled with uniformly distributed data.\n", " event.electrons.resize(nTracksDist(gen));\n", " for (auto &electron : event.electrons) {\n", " electron.eta = floatDist(gen);\n", " electron.mass = 0.511 /* MeV */;\n", " electron.phi = floatDist(gen);\n", " electron.pt = floatDist(gen);\n", " electron.charge = (gen() % 2 ? 
1 : -1);\n", " }\n", " event.photons.resize(nTracksDist(gen));\n", " for (auto &photon : event.photons) {\n", " photon.eta = floatDist(gen);\n", " photon.mass = 0;\n", " photon.phi = floatDist(gen);\n", " photon.pt = floatDist(gen);\n", " }\n", " event.muons.resize(nTracksDist(gen));\n", " for (auto &muon : event.muons) {\n", " muon.eta = floatDist(gen);\n", " muon.mass = 105.658 /* MeV */;\n", " muon.phi = floatDist(gen);\n", " muon.pt = floatDist(gen);\n", " muon.charge = (gen() % 2 ? 1 : -1);\n", " }\n", "\n", " eventOutputter.Fill(threadId, eventProducts);\n", " }\n", "\n", " // Fill the Run data.\n", " Run run;\n", " run.runId = runId;\n", " run.nEvents = nEvents;\n", "\n", " auto runProducts = CreateRunDataProducts(run);\n", " runOutputter.Fill(threadId, runProducts);\n", " }\n", "}" ] }, { "cell_type": "code", "execution_count": 14, "id": "efbd9c2e", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.451384Z", "iopub.status.busy": "2026-05-19T20:15:28.451262Z", "iopub.status.idle": "2026-05-19T20:15:28.656356Z", "shell.execute_reply": "2026-05-19T20:15:28.655767Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_65:2:2: error: unknown type name 'FileService'\n", " FileService fileService(\"ntpl014_framework.root\", \"RECREATE\");\n", " ^\n" ] } ], "source": [ "FileService fileService(\"ntpl014_framework.root\", \"RECREATE\");\n", "\n", "ROOT::RNTupleWriteOptions options;" ] }, { "cell_type": "markdown", "id": "95a9acce", "metadata": {}, "source": [ "Parallel writing requires buffered writing; force it on (even if it is the default)." 
] }, { "cell_type": "code", "execution_count": 15, "id": "735404d1", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.658022Z", "iopub.status.busy": "2026-05-19T20:15:28.657899Z", "iopub.status.idle": "2026-05-19T20:15:28.863246Z", "shell.execute_reply": "2026-05-19T20:15:28.862588Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_67:2:3: error: use of undeclared identifier 'options'\n", " (options.SetUseBufferedWrite(true))\n", " ^\n", "Error in : Error evaluating expression (options.SetUseBufferedWrite(true))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "options.SetUseBufferedWrite(true);" ] }, { "cell_type": "markdown", "id": "2596037f", "metadata": {}, "source": [ "For demonstration purposes, reduce the cluster size to 2 MiB." ] }, { "cell_type": "code", "execution_count": 16, "id": "50ccd069", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:28.864727Z", "iopub.status.busy": "2026-05-19T20:15:28.864591Z", "iopub.status.idle": "2026-05-19T20:15:29.070086Z", "shell.execute_reply": "2026-05-19T20:15:29.069601Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_68:3:1: error: unknown type name 'ParallelOutputter'\n", "ParallelOutputter eventOutputter(CreateEventModel(), fileService, \"Events\", options);\n", "^\n", "input_line_68:3:34: error: use of undeclared identifier 'CreateEventModel'\n", "ParallelOutputter eventOutputter(CreateEventModel(), fileService, \"Events\", options);\n", " ^\n", "input_line_68:3:54: error: use of undeclared identifier 'fileService'\n", "ParallelOutputter eventOutputter(CreateEventModel(), fileService, \"Events\", options);\n", " ^\n", "input_line_68:3:77: error: use of undeclared identifier 'options'\n", "ParallelOutputter eventOutputter(CreateEventModel(), fileService, \"Events\", options);\n", " ^\n" ] } ], "source": [ "options.SetApproxZippedClusterSize(2 * 
1024 * 1024);\n", "ParallelOutputter eventOutputter(CreateEventModel(), fileService, \"Events\", options);" ] }, { "cell_type": "markdown", "id": "e4f25c8c", "metadata": {}, "source": [ "SerializingOutputter also relies on buffered writing; force it on (even if it is the default)." ] }, { "cell_type": "code", "execution_count": 17, "id": "56ef99ac", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:29.071470Z", "iopub.status.busy": "2026-05-19T20:15:29.071360Z", "iopub.status.idle": "2026-05-19T20:15:29.276746Z", "shell.execute_reply": "2026-05-19T20:15:29.276171Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_70:2:3: error: use of undeclared identifier 'options'\n", " (options.SetUseBufferedWrite(true))\n", " ^\n", "Error in : Error evaluating expression (options.SetUseBufferedWrite(true))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "options.SetUseBufferedWrite(true);" ] }, { "cell_type": "markdown", "id": "f4416594", "metadata": {}, "source": [ "For demonstration purposes, reduce the cluster size for the very simple Run data to 1 KiB." 
] }, { "cell_type": "code", "execution_count": 18, "id": "e469e90d", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:29.278274Z", "iopub.status.busy": "2026-05-19T20:15:29.278151Z", "iopub.status.idle": "2026-05-19T20:15:29.483550Z", "shell.execute_reply": "2026-05-19T20:15:29.483090Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_71:3:1: error: unknown type name 'SerializingOutputter'\n", "SerializingOutputter runOutputter(CreateRunModel(), fileService, \"Runs\", options);\n", "^\n", "input_line_71:3:35: error: use of undeclared identifier 'CreateRunModel'\n", "SerializingOutputter runOutputter(CreateRunModel(), fileService, \"Runs\", options);\n", " ^\n", "input_line_71:3:53: error: use of undeclared identifier 'fileService'\n", "SerializingOutputter runOutputter(CreateRunModel(), fileService, \"Runs\", options);\n", " ^\n", "input_line_71:3:74: error: use of undeclared identifier 'options'\n", "SerializingOutputter runOutputter(CreateRunModel(), fileService, \"Runs\", options);\n", " ^\n" ] } ], "source": [ "options.SetApproxZippedClusterSize(1024);\n", "SerializingOutputter runOutputter(CreateRunModel(), fileService, \"Runs\", options);" ] }, { "cell_type": "markdown", "id": "9f64ac50", "metadata": {}, "source": [ "Initialize slots in the two Outputters." 
] }, { "cell_type": "code", "execution_count": 19, "id": "4325ef3b", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:29.484883Z", "iopub.status.busy": "2026-05-19T20:15:29.484766Z", "iopub.status.idle": "2026-05-19T20:15:29.687656Z", "shell.execute_reply": "2026-05-19T20:15:29.687159Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_72:8:26: error: use of undeclared identifier 'kNThreads'\n", "for (unsigned i = 0; i < kNThreads; i++) {\n", " ^\n", "input_line_72:9:25: error: use of undeclared identifier 'ProcessRunsAndEvents'\n", " threads.emplace_back(ProcessRunsAndEvents, i, std::ref(eventOutputter), std::ref(runOutputter));\n", " ^\n", "input_line_72:9:59: error: use of undeclared identifier 'eventOutputter'\n", " threads.emplace_back(ProcessRunsAndEvents, i, std::ref(eventOutputter), std::ref(runOutputter));\n", " ^\n", "input_line_72:9:85: error: use of undeclared identifier 'runOutputter'\n", " threads.emplace_back(ProcessRunsAndEvents, i, std::ref(eventOutputter), std::ref(runOutputter));\n", " ^\n", "input_line_72:11:26: error: use of undeclared identifier 'kNThreads'\n", "for (unsigned i = 0; i < kNThreads; i++) {\n", " ^\n" ] } ], "source": [ "for (unsigned i = 0; i < kNThreads; i++) {\n", " eventOutputter.InitSlot(i);\n", " runOutputter.InitSlot(i);\n", "}\n", "\n", "std::vector<std::thread> threads;\n", "for (unsigned i = 0; i < kNThreads; i++) {\n", " threads.emplace_back(ProcessRunsAndEvents, i, std::ref(eventOutputter), std::ref(runOutputter));\n", "}\n", "for (unsigned i = 0; i < kNThreads; i++) {\n", " threads[i].join();\n", "}" ] }, { "cell_type": "markdown", "id": "3dd7b2cb", "metadata": {}, "source": [ "Draw all canvases " ] }, { "cell_type": "code", "execution_count": 20, "id": "109b2d5f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:15:29.689147Z", "iopub.status.busy": "2026-05-19T20:15:29.689023Z", "iopub.status.idle": 
"2026-05-19T20:15:29.893689Z", "shell.execute_reply": "2026-05-19T20:15:29.893124Z" } }, "outputs": [], "source": [ "gROOT->GetListOfCanvases()->Draw()" ] } ], "metadata": { "kernelspec": { "display_name": "ROOT C++", "language": "c++", "name": "root" }, "language_info": { "codemirror_mode": "text/x-c++src", "file_extension": ".C", "mimetype": " text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 5 }