/// \file
/// \ingroup tutorial_dataframe
/// \notebook -draw
/// Cache a processed RDataFrame in memory for further usage.
///
/// This tutorial shows how the content of a data frame can be cached in memory
/// in the form of a data frame. The content of the columns is stored in
/// contiguous slabs of memory and is "ready to use", i.e. no ROOT IO operation
/// is performed.
///
/// Creating a cached data frame storing all of its content deserialised and uncompressed
/// in memory is particularly useful when dealing with datasets of a moderate size
/// (small enough to fit in RAM) over which several exploratory loops need to be
/// performed as fast as possible. In addition, caching can be useful when no file
/// on disk needs to be created as a side effect of checkpointing part of the analysis.
///
/// All steps in the caching are lazy, i.e. the cached data frame is actually filled
/// only when the event loop is triggered on it.
///
/// \macro_code
/// \macro_image
///
/// \date June 2018
/// \author Danilo Piparo (CERN)

void df019_Cache()
{
   // We create a data frame on top of the hsimple example
   auto hsimplePath = gROOT->GetTutorialDir();
   hsimplePath += "/hsimple.root";
   ROOT::RDataFrame df("ntuple", hsimplePath.Data());

   // We apply a simple cut and define a new column
   auto df_cut = df.Filter([](float py) { return py > 0.f; }, {"py"})
                    .Define("px_plus_py", [](float px, float py) { return px + py; }, {"px", "py"});

   // We cache the content of the dataset. Nothing has happened yet: the work to accomplish
   // has only been described. As for `Snapshot`, the types and columns can be written out explicitly
   // or left for the jitting to handle (`df_cached` is intentionally unused: it shows how
   // to create a *cached* data frame specifying column types explicitly):
   auto df_cached = df_cut.Cache<float, float>({"px_plus_py", "py"});
   auto df_cached_implicit = df_cut.Cache();
   auto h = df_cached_implicit.Histo1D<float>("px_plus_py");

   // Now the event loop on the cached dataset is triggered. This in turn triggers the loop
   // on the original `df` data frame, which fills the cache.
   h->DrawCopy();
}
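
The lazy filling described in the tutorial can be illustrated without ROOT. The sketch below is not how RDataFrame implements `Cache`; it is a minimal stand-alone analogy using only the C++ standard library. The class `LazyCachedColumn` and its members are hypothetical names introduced for illustration. It applies the same cut (`py > 0`) and the same derived column (`px + py`) as the macro, stores the result in one contiguous `std::vector`, and runs its loop over the input only on first access.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of ROOT: a derived column whose values are
// computed on first access and then kept in one contiguous std::vector.
class LazyCachedColumn {
public:
   LazyCachedColumn(std::vector<float> px, std::vector<float> py)
      : fPx(std::move(px)), fPy(std::move(py))
   {
   }

   // Returns the cached values. The loop over the input runs only the first
   // time this is called; later calls reuse the stored vector.
   const std::vector<float> &Values()
   {
      if (!fFilled) {
         ++fLoopsRun;
         for (std::size_t i = 0; i < fPx.size(); ++i) {
            if (fPy[i] > 0.f)                     // same cut as in the tutorial
               fCache.push_back(fPx[i] + fPy[i]); // same derived column
         }
         fFilled = true;
      }
      return fCache;
   }

   // Number of times the input has been looped over so far.
   int LoopsRun() const { return fLoopsRun; }

private:
   std::vector<float> fPx;
   std::vector<float> fPy;
   std::vector<float> fCache;
   bool fFilled = false;
   int fLoopsRun = 0;
};
```

Constructing a `LazyCachedColumn` does no work, in the same spirit as calling `Cache()` in the macro. The first call to `Values()` plays the role of the triggered event loop, and every later call reads directly from memory.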