ROOT
Reference Guide
 
ROOT::RDF Namespace Reference

Namespaces

namespace  Experimental
 
namespace  Internal
 

Classes

class  RArrowDS
 RDataFrame data source class to interface with Apache Arrow. More...
 
class  RCsvDS
 RDataFrame data source class for reading CSV files. More...
 
class  RCutFlowReport
 
class  RDataSource
 RDataSource defines an API that RDataFrame can use to read arbitrary data formats. More...
 
class  RDFDescription
 An RDFDescription contains useful information about a given RDataFrame computation graph. More...
 
class  RDFTypeNameGetter
 Helper to get the contents of a given column. More...
 
class  RDisplay
 This class is the textual representation of the content of a columnar dataset. More...
 
class  RInterface
 The public interface to the RDataFrame federation of classes. More...
 
class  RInterfaceBase
 The base public interface to the RDataFrame federation of classes. More...
 
class  RLazyDS
 An RDataSource implementation built on top of result proxies. More...
 
class  RNTupleDS
 The RDataSource implementation for RNTuple. More...
 
class  RResultHandle
 A type-erased version of RResultPtr and RResultMap. More...
 
class  RResultPtr
 Smart pointer for the return type of actions. More...
 
class  RSampleInfo
 This type represents a sample identifier, to be used in conjunction with RDataFrame features such as DefinePerSample() and per-sample callbacks. More...
 
struct  RSnapshotOptions
 A collection of options to steer the creation of the dataset on disk through Snapshot(). More...
 
class  RSqliteDS
 RSqliteDS is an RDF data source implementation for SQL result sets from sqlite3 files. More...
 
class  RTrivialDS
 A simple data-source implementation, for demo purposes. More...
 
class  RVariationsDescription
 A descriptor for the systematic variations known to a given RDataFrame node. More...
 
class  TCutInfo
 
class  TH1DModel
 A struct which stores some basic parameters of a TH1D. More...
 
class  TH2DModel
 A struct which stores some basic parameters of a TH2D. More...
 
class  TH3DModel
 A struct which stores some basic parameters of a TH3D. More...
 
class  THnDModel
 A struct which stores some basic parameters of a THnD. More...
 
class  THnSparseDModel
 A struct which stores some basic parameters of a THnSparseD. More...
 
class  TProfile1DModel
 A struct which stores some basic parameters of a TProfile. More...
 
class  TProfile2DModel
 A struct which stores some basic parameters of a TProfile2D. More...
 
class  VerifyValidColumnType
 Helper to determine if a given Column is a supported type. More...
 

Typedefs

using ColumnNames_t = std::vector<std::string>
 
using RNode = RInterface<::ROOT::Detail::RDF::RNodeBase>
 
using SampleCallback_t = std::function<void(unsigned int, const ROOT::RDF::RSampleInfo &)>
 The type of a data-block callback, registered with an RDataFrame computation graph via e.g. DefinePerSample() or by certain actions (e.g. Snapshot()).
 

Enumerations

enum class  ESnapshotOutputFormat { kDefault , kTTree , kRNTuple }
 

Functions

template<typename AccFun, typename MergeFun, typename R = typename TTraits::CallableTraits<AccFun>::ret_type, typename ArgTypes = typename TTraits::CallableTraits<AccFun>::arg_types, typename ArgTypesNoDecay = typename TTraits::CallableTraits<AccFun>::arg_types_nodecay, typename U = TTraits::TakeFirstParameter_t<ArgTypes>, typename T = TTraits::TakeFirstParameter_t<TTraits::RemoveFirstParameter_t<ArgTypes>>>
RResultPtr< U > Aggregate (AccFun aggregator, MergeFun merger, std::string_view columnName, const U &aggIdentity)
 Execute a user-defined accumulation operation on the processed column values in each processing slot.
 
template<typename AccFun, typename MergeFun, typename R = typename TTraits::CallableTraits<AccFun>::ret_type, typename ArgTypes = typename TTraits::CallableTraits<AccFun>::arg_types, typename U = TTraits::TakeFirstParameter_t<ArgTypes>, typename T = TTraits::TakeFirstParameter_t<TTraits::RemoveFirstParameter_t<ArgTypes>>>
RResultPtr< U > Aggregate (AccFun aggregator, MergeFun merger, std::string_view columnName="")
 Execute a user-defined accumulation operation on the processed column values in each processing slot.
 
RInterface< Proxied > Alias (std::string_view alias, std::string_view columnName)
 Allows referring to a column by a different name.
 
template<typename NodeType>
RNode AsRNode (NodeType node)
 Cast an RDataFrame node to the common type ROOT::RDF::RNode.
 
template<typename FirstColumn = RDFDetail::RInferredType, typename... OtherColumns, typename Helper>
RResultPtr< typename std::decay_t< Helper >::Result_t > Book (Helper &&helper, const ColumnNames_t &columns={})
 Book execution of a custom action using a user-defined helper object.
 
template<typename... ColumnTypes>
RInterface< RLoopManager > Cache (const ColumnNames_t &columnList)
 Save selected columns in memory.
 
RInterface< RLoopManager > Cache (const ColumnNames_t &columnList)
 Save selected columns in memory.
 
RInterface< RLoopManager > Cache (std::initializer_list< std::string > columnList)
 Save selected columns in memory.
 
RInterface< RLoopManager > Cache (std::string_view columnNameRegexp="")
 Save selected columns in memory.
 
template<typename... ColTypes, std::size_t... S>
RInterface< RLoopManager > CacheImpl (const ColumnNames_t &columnList, std::index_sequence< S... >)
 Implementation of cache.
 
template<typename Helper, typename ActionResultType, typename... Others>
RResultPtr< ActionResultType > CallCreateActionWithoutColsIfPossible (const std::shared_ptr< ActionResultType > &, const std::shared_ptr< Helper > &, Others...)
 
template<typename Helper, typename ActionResultType>
auto CallCreateActionWithoutColsIfPossible (const std::shared_ptr< ActionResultType > &resPtr, const std::shared_ptr< Helper > &hPtr, TTraits::TypeList< RDFDetail::RInferredType >) -> decltype(hPtr->Exec(0u), RResultPtr< ActionResultType >{})
 
RResultPtr< ULong64_t > Count ()
 Return the number of entries processed (lazy action).
 
template<typename F, typename DefineType, typename RetType = typename TTraits::CallableTraits<F>::ret_type>
std::enable_if_t< std::is_default_constructible< RetType >::value, RInterface< Proxied > > DefineImpl (std::string_view name, F &&expression, const ColumnNames_t &columns, const std::string &where)
 
template<typename F, typename DefineType, typename RetType = typename TTraits::CallableTraits<F>::ret_type, bool IsFStringConv = std::is_convertible<F, std::string>::value, bool IsRetTypeDefConstr = std::is_default_constructible<RetType>::value>
std::enable_if_t<!IsFStringConv &&!IsRetTypeDefConstr, RInterface< Proxied > > DefineImpl (std::string_view, F, const ColumnNames_t &, const std::string &)
 
template<typename... ColumnTypes>
RResultPtr< RDisplay > Display (const ColumnNames_t &columnList, size_t nRows=5, size_t nMaxCollectionElements=10)
 Provides a representation of the columns in the dataset.
 
RResultPtr< RDisplay > Display (const ColumnNames_t &columnList, size_t nRows=5, size_t nMaxCollectionElements=10)
 Provides a representation of the columns in the dataset.
 
RResultPtr< RDisplay > Display (std::initializer_list< std::string > columnList, size_t nRows=5, size_t nMaxCollectionElements=10)
 Provides a representation of the columns in the dataset.
 
RResultPtr< RDisplay > Display (std::string_view columnNameRegexp="", size_t nRows=5, size_t nMaxCollectionElements=10)
 Provides a representation of the columns in the dataset.
 
template<typename FirstColumn = RDFDetail::RInferredType, typename... OtherColumns, typename T>
RResultPtr< std::decay_t< T > > Fill (T &&model, const ColumnNames_t &columnList)
 Return an object of type T on which T::Fill will be called once per event (lazy action).
 
template<typename F>
void Foreach (F f, const ColumnNames_t &columns={})
 Execute a user-defined function on each entry (instant action).
 
template<typename F>
void ForeachSlot (F f, const ColumnNames_t &columns={})
 Execute a user-defined function requiring a processing slot index on each entry (instant action).
 
RDataFrame FromArrow (std::shared_ptr< arrow::Table > table, std::vector< std::string > const &columnNames)
 Factory method to create an Apache Arrow RDataFrame.
 
RDataFrame FromCSV (std::string_view fileName, bool readHeaders=true, char delimiter=',', Long64_t linesChunkSize=-1LL, std::unordered_map< std::string, char > &&colTypes={})
 Factory method to create a CSV RDataFrame.
 
RDataFrame FromCSV (std::string_view fileName, const RCsvDS::ROptions &options)
 Factory method to create a CSV RDataFrame.
 
RDataFrame FromRNTuple (std::string_view ntupleName, const std::vector< std::string > &fileNames)
 
RDataFrame FromRNTuple (std::string_view ntupleName, std::string_view fileName)
 
RDataFrame FromSqlite (std::string_view fileName, std::string_view query)
 Factory method to create an SQLite RDataFrame.
 
template<typename T>
std::shared_ptr< arrow::ChunkedArray > getData (T p)
 
std::vector< std::string > GetFilterNames ()
 Returns the names of the filters created.
 
int getNRecords (std::shared_ptr< arrow::Table > &table, std::vector< std::string > &columnNames)
 
const std::shared_ptr< Proxied > & GetProxiedPtr () const
 
template<typename X = RDFDetail::RInferredType, typename Y = RDFDetail::RInferredType>
RResultPtr<::TGraph > Graph (std::string_view x="", std::string_view y="")
 Fill and return a TGraph object (lazy action).
 
template<typename X = RDFDetail::RInferredType, typename Y = RDFDetail::RInferredType, typename EXL = RDFDetail::RInferredType, typename EXH = RDFDetail::RInferredType, typename EYL = RDFDetail::RInferredType, typename EYH = RDFDetail::RInferredType>
RResultPtr<::TGraphAsymmErrors > GraphAsymmErrors (std::string_view x="", std::string_view y="", std::string_view exl="", std::string_view exh="", std::string_view eyl="", std::string_view eyh="")
 Fill and return a TGraphAsymmErrors object (lazy action).
 
template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > Hist (std::shared_ptr< ROOT::Experimental::RHist< BinContentType > > h, const ColumnNames_t &columnList)
 Fill the provided RHist (lazy action).
 
template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > Hist (std::shared_ptr< ROOT::Experimental::RHist< BinContentType > > h, const ColumnNames_t &columnList, std::string_view wName)
 Fill the provided RHist with weights (lazy action).
 
template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHistEngine< BinContentType > > Hist (std::shared_ptr< ROOT::Experimental::RHistEngine< BinContentType > > h, const ColumnNames_t &columnList)
 Fill the provided RHistEngine (lazy action).
 
template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHistEngine< BinContentType > > Hist (std::shared_ptr< ROOT::Experimental::RHistEngine< BinContentType > > h, const ColumnNames_t &columnList, std::string_view wName)
 Fill the provided RHistEngine with weights (lazy action).
 
template<typename BinContentType = double, typename V = RDFDetail::RInferredType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > Hist (std::uint64_t nNormalBins, std::pair< double, double > interval, std::string_view vName)
 Fill and return a one-dimensional RHist (lazy action).
 
template<typename BinContentType = ROOT::Experimental::RBinWithError, typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > Hist (std::uint64_t nNormalBins, std::pair< double, double > interval, std::string_view vName, std::string_view wName)
 Fill and return a one-dimensional RHist with weights (lazy action).
 
template<typename BinContentType = double, typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > Hist (std::vector< ROOT::Experimental::RAxisVariant > axes, const ColumnNames_t &columnList)
 Fill and return an RHist (lazy action).
 
template<typename BinContentType = ROOT::Experimental::RBinWithError, typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > Hist (std::vector< ROOT::Experimental::RAxisVariant > axes, const ColumnNames_t &columnList, std::string_view wName)
 Fill and return an RHist with weights (lazy action).
 
template<typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH1D > Histo1D (const TH1DModel &model, std::string_view vName, std::string_view wName)
 Fill and return a one-dimensional histogram with the weighted values of a column (lazy action).
 
template<typename V, typename W>
RResultPtr<::TH1D > Histo1D (const TH1DModel &model={"", "", 128u, 0., 0.})
 Fill and return a one-dimensional histogram with the weighted values of a column (lazy action).
 
template<typename V = RDFDetail::RInferredType>
RResultPtr<::TH1D > Histo1D (const TH1DModel &model={"", "", 128u, 0., 0.}, std::string_view vName="")
 Fill and return a one-dimensional histogram with the values of a column (lazy action).
 
template<typename V = RDFDetail::RInferredType>
RResultPtr<::TH1D > Histo1D (std::string_view vName)
 Fill and return a one-dimensional histogram with the values of a column (lazy action).
 
template<typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH1D > Histo1D (std::string_view vName, std::string_view wName)
 Fill and return a one-dimensional histogram with the weighted values of a column (lazy action).
 
template<typename V1, typename V2, typename W>
RResultPtr<::TH2D > Histo2D (const TH2DModel &model)
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH2D > Histo2D (const TH2DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view wName)
 Fill and return a weighted two-dimensional histogram (lazy action).
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType>
RResultPtr<::TH2D > Histo2D (const TH2DModel &model, std::string_view v1Name="", std::string_view v2Name="")
 Fill and return a two-dimensional histogram (lazy action).
 
template<typename V1, typename V2, typename V3, typename W>
RResultPtr<::TH3D > Histo3D (const TH3DModel &model)
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH3D > Histo3D (const TH3DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view v3Name, std::string_view wName)
 Fill and return a three-dimensional histogram (lazy action).
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType>
RResultPtr<::TH3D > Histo3D (const TH3DModel &model, std::string_view v1Name="", std::string_view v2Name="", std::string_view v3Name="")
 Fill and return a three-dimensional histogram (lazy action).
 
template<typename FirstColumn, typename... OtherColumns>
RResultPtr<::THnD > HistoND (const THnDModel &model, const ColumnNames_t &columnList, std::string_view wName="")
 Fill and return an N-dimensional histogram (lazy action).
 
RResultPtr<::THnD > HistoND (const THnDModel &model, const ColumnNames_t &columnList, std::string_view wName="")
 Fill and return an N-dimensional histogram (lazy action).
 
template<typename FirstColumn, typename... OtherColumns>
RResultPtr<::THnSparseD > HistoNSparseD (const THnSparseDModel &model, const ColumnNames_t &columnList, std::string_view wName="")
 Fill and return a sparse N-dimensional histogram (lazy action).
 
RResultPtr<::THnSparseD > HistoNSparseD (const THnSparseDModel &model, const ColumnNames_t &columnList, std::string_view wName="")
 Fill and return a sparse N-dimensional histogram (lazy action).
 
RInterface< Proxied > JittedVaryImpl (const std::vector< std::string > &colNames, std::string_view expression, const std::vector< std::string > &variationTags, std::string_view variationName, bool isSingleColumn)
 
template<typename... ColumnTypes>
RDataFrame MakeLazyDataFrame (std::pair< std::string, RResultPtr< std::vector< ColumnTypes > > > &&... colNameProxyPairs)
 Factory method to create a Lazy RDataFrame.
 
RInterface< RDFDetail::RLoopManager > MakeTrivialDataFrame ()
 Make an RDF wrapping an RTrivialDS with infinite entries, for demo purposes.
 
RInterface< RDFDetail::RLoopManager > MakeTrivialDataFrame (ULong64_t size, bool skipEvenEntries=false)
 Make an RDF wrapping an RTrivialDS with the specified number of entries.
 
template<typename T = RDFDetail::RInferredType>
RResultPtr< RDFDetail::MaxReturnType_t< T > > Max (std::string_view columnName="")
 Return the maximum of processed column values (lazy action).
 
template<typename T = RDFDetail::RInferredType>
RResultPtr< double > Mean (std::string_view columnName="")
 Return the mean of processed column values (lazy action).
 
template<typename T = RDFDetail::RInferredType>
RResultPtr< RDFDetail::MinReturnType_t< T > > Min (std::string_view columnName="")
 Return the minimum of processed column values (lazy action).
 
template<typename F, typename Args = typename ROOT::TypeTraits::CallableTraits<std::decay_t<F>>::arg_types_nodecay, typename Ret = typename ROOT::TypeTraits::CallableTraits<std::decay_t<F>>::ret_type>
auto Not (F &&f) -> decltype(RDFInternal::NotHelper(Args(), std::forward< F >(f)))
 Given a callable with signature bool(T1, T2, ...) return a callable with same signature that returns the negated result.
 
template<class T1, class T2>
bool operator!= (const RResultPtr< T1 > &lhs, const RResultPtr< T2 > &rhs)
 
template<class T1>
bool operator!= (const RResultPtr< T1 > &lhs, std::nullptr_t rhs)
 
template<class T1>
bool operator!= (std::nullptr_t lhs, const RResultPtr< T1 > &rhs)
 
std::ostream & operator<< (std::ostream &os, const RDFDescription &description)
 
template<class T1, class T2>
bool operator== (const RResultPtr< T1 > &lhs, const RResultPtr< T2 > &rhs)
 
template<class T1>
bool operator== (const RResultPtr< T1 > &lhs, std::nullptr_t rhs)
 
template<class T1>
bool operator== (std::nullptr_t lhs, const RResultPtr< T1 > &rhs)
 
template<std::size_t N, typename T, typename F>
auto PassAsVec (F &&f) -> RDFInternal::PassAsVecHelper< std::make_index_sequence< N >, T, F >
 PassAsVec is a callable generator that allows passing N variables of type T to a function as a single collection.
 
template<typename V1, typename V2, typename W>
RResultPtr<::TProfile > Profile1D (const TProfile1DModel &model)
 Fill and return a one-dimensional profile (lazy action). See the first Profile1D() overload for more details.
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TProfile > Profile1D (const TProfile1DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view wName)
 Fill and return a one-dimensional profile (lazy action).
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType>
RResultPtr<::TProfile > Profile1D (const TProfile1DModel &model, std::string_view v1Name="", std::string_view v2Name="")
 Fill and return a one-dimensional profile (lazy action).
 
template<typename V1, typename V2, typename V3, typename W>
RResultPtr<::TProfile2D > Profile2D (const TProfile2DModel &model)
 Fill and return a two-dimensional profile (lazy action). See the first Profile2D() overload for more details.
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TProfile2D > Profile2D (const TProfile2DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view v3Name, std::string_view wName)
 Fill and return a two-dimensional profile (lazy action).
 
template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType>
RResultPtr<::TProfile2D > Profile2D (const TProfile2DModel &model, std::string_view v1Name="", std::string_view v2Name="", std::string_view v3Name="")
 Fill and return a two-dimensional profile (lazy action).
 
RInterface< RDFDetail::RRange< Proxied > > Range (unsigned int begin, unsigned int end, unsigned int stride=1)
 Creates a node that filters entries based on range: [begin, end).
 
RInterface< RDFDetail::RRange< Proxied > > Range (unsigned int end)
 Creates a node that filters entries based on range.
 
template<typename F, typename T = typename TTraits::CallableTraits<F>::ret_type>
RResultPtr< T > Reduce (F f, std::string_view columnName, const T &redIdentity)
 Execute a user-defined reduce operation on the values of a column.
 
template<typename F, typename T = typename TTraits::CallableTraits<F>::ret_type>
RResultPtr< T > Reduce (F f, std::string_view columnName="")
 Execute a user-defined reduce operation on the values of a column.
 
RResultPtr< RCutFlowReport > Report ()
 Gather filtering statistics.
 
 RInterface (const std::shared_ptr< Proxied > &proxied, RLoopManager &lm, const RDFInternal::RColumnRegister &colRegister)
 
unsigned int RunGraphs (std::vector< RResultHandle > handles)
 Run the event loops of multiple RDataFrames concurrently.
 
template<typename NodeType>
std::string SaveGraph (NodeType node)
 Create a graphviz representation of the dataframe computation graph, return it as a string.
 
template<typename NodeType>
void SaveGraph (NodeType node, const std::string &outputFile)
 Create a graphviz representation of the dataframe computation graph, write it to the specified file.
 
template<typename... ColumnTypes>
RResultPtr< RInterface< RLoopManager > > Snapshot (std::string_view treename, std::string_view filename, const ColumnNames_t &columnList, const RSnapshotOptions &options=RSnapshotOptions())
 
RResultPtr< RInterface< RLoopManager > > Snapshot (std::string_view treename, std::string_view filename, const ColumnNames_t &columnList, const RSnapshotOptions &options=RSnapshotOptions())
 Save selected columns to disk, in a new TTree or RNTuple treename in file filename.
 
RResultPtr< RInterface< RLoopManager > > Snapshot (std::string_view treename, std::string_view filename, std::initializer_list< std::string > columnList, const RSnapshotOptions &options=RSnapshotOptions())
 Save selected columns to disk, in a new TTree or RNTuple treename in file filename.
 
RResultPtr< RInterface< RLoopManager > > Snapshot (std::string_view treename, std::string_view filename, std::string_view columnNameRegexp="", const RSnapshotOptions &options=RSnapshotOptions())
 Save selected columns to disk, in a new TTree or RNTuple treename in file filename.
 
void splitInEqualRanges (std::vector< std::pair< ULong64_t, ULong64_t > > &ranges, int nRecords, unsigned int nSlots)
 
template<typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr< TStatistic > Stats (std::string_view value, std::string_view weight)
 Return a TStatistic object, filled once per event (lazy action).
 
template<typename V = RDFDetail::RInferredType>
RResultPtr< TStatistic > Stats (std::string_view value="")
 Return a TStatistic object, filled once per event (lazy action).
 
template<typename T = RDFDetail::RInferredType>
RResultPtr< double > StdDev (std::string_view columnName="")
 Return the unbiased standard deviation of processed column values (lazy action).
 
template<typename T = RDFDetail::RInferredType>
RResultPtr< RDFDetail::SumReturnType_t< T > > Sum (std::string_view columnName="", const RDFDetail::SumReturnType_t< T > &initValue=RDFDetail::SumReturnType_t< T >{})
 Return the sum of processed column values (lazy action).
 
template<typename T, typename COLL = std::vector<T>>
RResultPtr< COLL > Take (std::string_view column="")
 Return a collection of values of a column (lazy action, returns a std::vector by default).
 
RInterface< Proxied > Vary (const std::vector< std::string > &colNames, std::string_view expression, const std::vector< std::string > &variationTags, std::string_view variationName)
 Register systematic variations for multiple existing columns using custom variation tags.
 
RInterface< Proxied > Vary (const std::vector< std::string > &colNames, std::string_view expression, std::size_t nVariations, std::string_view variationName)
 Register systematic variations for multiple existing columns using auto-generated variation tags.
 
RInterface< Proxied > Vary (std::initializer_list< std::string > colNames, std::string_view expression, std::size_t nVariations, std::string_view variationName)
 Register systematic variations for multiple existing columns using auto-generated variation tags.
 
template<typename Proxied>
RInterface< Proxied > Vary (std::string_view colName, std::string_view expression, std::size_t nVariations, std::string_view variationName="")
 Register systematic variations for a single existing column using auto-generated variation tags.
 
template<bool IsSingleColumn, typename F>
RInterface< Proxied > VaryImpl (const std::vector< std::string > &colNames, F &&expression, const ColumnNames_t &inputColumns, const std::vector< std::string > &variationTags, std::string_view variationName)
 

Typedef Documentation

◆ ColumnNames_t

using ROOT::RDF::ColumnNames_t = std::vector<std::string>

Definition at line 35 of file RInterfaceBase.hxx.

◆ RNode

using ROOT::RDF::RNode = RInterface<::ROOT::Detail::RDF::RNodeBase>

◆ SampleCallback_t

using ROOT::RDF::SampleCallback_t = std::function<void(unsigned int, const ROOT::RDF::RSampleInfo &)>

The type of a data-block callback, registered with an RDataFrame computation graph via e.g. DefinePerSample() or by certain actions (e.g. Snapshot()).

Definition at line 140 of file RSampleInfo.hxx.

Enumeration Type Documentation

◆ ESnapshotOutputFormat

Enumerator
kDefault 
kTTree 
kRNTuple 

Definition at line 21 of file RSnapshotOptions.hxx.

Function Documentation

◆ Aggregate() [1/2]

template<typename AccFun, typename MergeFun, typename R = typename TTraits::CallableTraits<AccFun>::ret_type, typename ArgTypes = typename TTraits::CallableTraits<AccFun>::arg_types, typename ArgTypesNoDecay = typename TTraits::CallableTraits<AccFun>::arg_types_nodecay, typename U = TTraits::TakeFirstParameter_t<ArgTypes>, typename T = TTraits::TakeFirstParameter_t<TTraits::RemoveFirstParameter_t<ArgTypes>>>
RResultPtr< U > ROOT::RDF::Aggregate ( AccFun aggregator,
MergeFun merger,
std::string_view columnName,
const U & aggIdentity )

Execute a user-defined accumulation operation on the processed column values in each processing slot.

Template Parameters
    AccFun  The type of the aggregator callable. Automatically deduced.
    U       The type of the aggregator variable. Must be default-constructible, copy-constructible and copy-assignable. Automatically deduced.
    T       The type of the column to apply the reduction to. Automatically deduced.
Parameters
    [in] aggregator   A callable with signature U(U,T) or void(U&,T), where T is the type of the column and U is the type of the aggregator variable.
    [in] merger       A callable with signature U(U,U) or void(std::vector<U>&), used to merge the results of the accumulations of each thread.
    [in] columnName   The column to be aggregated. If omitted, the first default column is used instead.
    [in] aggIdentity  The aggregator variable of each thread is initialized to this value (or is default-constructed if the parameter is omitted).
Returns
    the result of the aggregation wrapped in an RResultPtr.

An aggregator callable takes two values, an aggregator variable and a column value. The aggregator variable is initialized to aggIdentity or default-constructed if aggIdentity is omitted. This action calls the aggregator callable for each processed entry, passing in the aggregator variable and the value of the column columnName. If the signature is U(U,T) the aggregator variable is then copy-assigned the result of the execution of the callable. Otherwise the signature of aggregator must be void(U&,T).

The merger callable is used to merge the partial accumulation results of each processing thread. It is only called in multi-thread executions. If its signature is U(U,U) the aggregator variables of each thread are merged two by two. If its signature is void(std::vector<U>& a) it is assumed that it merges all aggregators in a[0].

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.
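The interplay between per-slot accumulation and merging can be sketched without ROOT. The following is a stand-alone model, not the ROOT implementation: `ModelAggregate` and `ModelProduct` are hypothetical names, each inner vector stands in for the column values seen by one processing slot, and the aggregator and merger use the U(U,T) and U(U,U) signatures respectively.

```cpp
#include <cstddef>
#include <vector>

// Stand-alone model of the Aggregate() semantics described above.
// It does not call ROOT; all names here are hypothetical.
template <typename U, typename T, typename AccFun, typename MergeFun>
U ModelAggregate(const std::vector<std::vector<T>> &valuesPerSlot, AccFun aggregator, MergeFun merger,
                 const U &aggIdentity)
{
   if (valuesPerSlot.empty())
      return aggIdentity;

   // One accumulator per slot, each initialized to aggIdentity.
   std::vector<U> partials(valuesPerSlot.size(), aggIdentity);
   for (std::size_t slot = 0; slot < valuesPerSlot.size(); ++slot) {
      for (const T &value : valuesPerSlot[slot]) {
         // U(U,T) form: the accumulator is copy-assigned the returned value.
         partials[slot] = aggregator(partials[slot], value);
      }
   }

   // U(U,U) form: the partial results are merged two by two.
   U result = partials[0];
   for (std::size_t slot = 1; slot < partials.size(); ++slot)
      result = merger(result, partials[slot]);
   return result;
}

// Product of the values {1, 2} (slot 0) and {3, 4} (slot 1).
inline double ModelProduct()
{
   auto aggregator = [](double acc, double x) { return acc * x; };
   auto merger = [](double a, double b) { return a * b; };
   return ModelAggregate<double, double>({{1., 2.}, {3., 4.}}, aggregator, merger, 1.);
}
```

In this model the two slots produce the partial products 2 and 12, which the merger combines into 24. With a single slot the merger is never invoked, which mirrors the statement that it is only called in multi-thread executions.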

Example usage:

   auto aggregator = [](double acc, double x) { return acc * x; };
   // If multithread is enabled, the aggregator function will be called by more threads
   // and will produce a vector of partial accumulators.
   // The merger function performs the final aggregation of these partial results.
   auto merger = [](std::vector<double> &accumulators) {
      for (auto i : ROOT::TSeqU(1u, accumulators.size())) {
         accumulators[0] *= accumulators[i];
      }
   };
   // The accumulator is initialized at this value by every thread.
   double initValue = 1.;
   // Multiplies all elements of the column "x"
   auto result = d.Aggregate(aggregator, merger, "x", initValue);

Definition at line 3515 of file RInterface.hxx.

◆ Aggregate() [2/2]

template<typename AccFun, typename MergeFun, typename R = typename TTraits::CallableTraits<AccFun>::ret_type, typename ArgTypes = typename TTraits::CallableTraits<AccFun>::arg_types, typename U = TTraits::TakeFirstParameter_t<ArgTypes>, typename T = TTraits::TakeFirstParameter_t<TTraits::RemoveFirstParameter_t<ArgTypes>>>
RResultPtr< U > ROOT::RDF::Aggregate ( AccFun aggregator,
MergeFun merger,
std::string_view columnName = "" )

Execute a user-defined accumulation operation on the processed column values in each processing slot.

Template Parameters
    AccFun  The type of the aggregator callable. Automatically deduced.
    U       The type of the aggregator variable. Must be default-constructible, copy-constructible and copy-assignable. Automatically deduced.
    T       The type of the column to apply the reduction to. Automatically deduced.
Parameters
    [in] aggregator  A callable with signature U(U,T) or void(U&,T), where T is the type of the column and U is the type of the aggregator variable.
    [in] merger      A callable with signature U(U,U) or void(std::vector<U>&), used to merge the results of the accumulations of each thread.
    [in] columnName  The column to be aggregated. If omitted, the first default column is used instead.
Returns
    the result of the aggregation wrapped in an RResultPtr.

See previous Aggregate overload for more information.
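Because this overload has no aggIdentity, the choice of operation matters. The sketch below is a stand-alone model, not the ROOT implementation: all names are hypothetical, and it assumes that a default-constructed accumulator of type double starts at 0.0. It uses the void(U&,T) aggregator form and the void(std::vector<U>&) merger form, and shows that a sum works as expected while a product collapses to zero.

```cpp
#include <cstddef>
#include <vector>

// Stand-alone model of Aggregate() without aggIdentity.
// It does not call ROOT; all names here are hypothetical.
template <typename U, typename T, typename AccFun, typename MergeFun>
U ModelAggregateNoIdentity(const std::vector<std::vector<T>> &valuesPerSlot, AccFun aggregator, MergeFun merger)
{
   if (valuesPerSlot.empty())
      return U();

   // Assumption: each per-slot accumulator is value-initialized (0.0 for double).
   std::vector<U> partials(valuesPerSlot.size(), U());
   for (std::size_t slot = 0; slot < valuesPerSlot.size(); ++slot) {
      for (const T &value : valuesPerSlot[slot]) {
         // void(U&,T) form: the accumulator is updated in place.
         aggregator(partials[slot], value);
      }
   }

   // void(std::vector<U>&) form: the merger leaves the final result in partials[0].
   merger(partials);
   return partials[0];
}

// Sum over two slots holding {1, 2} and {3, 4}: zero is the right starting value.
inline double ModelSumNoIdentity()
{
   auto aggregator = [](double &acc, double x) { acc += x; };
   auto merger = [](std::vector<double> &accs) {
      for (std::size_t i = 1; i < accs.size(); ++i)
         accs[0] += accs[i];
   };
   return ModelAggregateNoIdentity<double, double>({{1., 2.}, {3., 4.}}, aggregator, merger);
}

// Product over the same values: starting from zero, the result stays zero.
inline double ModelProductNoIdentity()
{
   auto aggregator = [](double &acc, double x) { acc *= x; };
   auto merger = [](std::vector<double> &accs) {
      for (std::size_t i = 1; i < accs.size(); ++i)
         accs[0] *= accs[i];
   };
   return ModelAggregateNoIdentity<double, double>({{1., 2.}, {3., 4.}}, aggregator, merger);
}
```

Under the stated assumption, operations whose neutral element is not the default-constructed value (such as a product) are better served by the overload that takes aggIdentity explicitly.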

Definition at line 3549 of file RInterface.hxx.

◆ Alias()

RInterface< Proxied > ROOT::RDF::Alias ( std::string_view alias,
std::string_view columnName )

Allows referring to a column by a different name.

Parameters
    [in] alias       name of the column alias
    [in] columnName  name of the column to be aliased
Returns
the first node of the computation graph for which the alias is available.

Aliasing an alias is supported.

Example usage:

auto df_with_alias = df.Alias("simple_name", "very_long&complex_name!!!");

Definition at line 1290 of file RInterface.hxx.

◆ AsRNode()

template<typename NodeType>
RNode ROOT::RDF::AsRNode ( NodeType node)

Cast a RDataFrame node to the common type ROOT::RDF::RNode.

Parameters
[in] node Any node of a RDataFrame graph

Definition at line 158 of file RDFHelpers.hxx.

◆ Book()

template<typename FirstColumn = RDFDetail::RInferredType, typename... OtherColumns, typename Helper>
RResultPtr< typename std::decay_t< Helper >::Result_t > ROOT::RDF::Book ( Helper && helper,
const ColumnNames_t & columns = {} )

Book execution of a custom action using a user-defined helper object.

Template Parameters
FirstColumn The type of the first column used by this action. Inferred together with OtherColumns if not present.
OtherColumns A list of the types of the other columns used by this action
Helper The type of the user-defined helper. See below for the required interface it should expose.
Parameters
[in] helper The Action Helper to be scheduled.
[in] columns The names of the columns on which the helper acts.
Returns
the result of the helper wrapped in a RResultPtr.

This method books a custom action for execution. The behavior of the action is completely dependent on the Helper object provided by the caller. The required interface for the helper is described below (more methods than the ones required can be present, e.g. a constructor that takes the number of worker threads is usually useful):

Mandatory interface

  • Helper must publicly inherit from ROOT::Detail::RDF::RActionImpl<Helper>
  • Helper::Result_t: public alias for the type of the result of this action helper. Result_t must be default-constructible.
  • Helper(Helper &&): a move-constructor is required. Copy-constructors are discouraged.
  • std::shared_ptr<Result_t> GetResultPtr() const: return a shared_ptr to the result of this action (of type Result_t). The RResultPtr returned by Book will point to this object. Note that this method can be called before Initialize(), because the RResultPtr is constructed before the event loop is started.
  • void Initialize(): this method is called once before starting the event-loop. Useful for setup operations. It must reset the state of the helper to the expected state at the beginning of the event loop: the same helper, or copies of it, might be used for multiple event loops (e.g. in the presence of systematic variations).
  • void InitTask(TTreeReader *, unsigned int slot): each working thread shall call this method during the event loop, before processing a batch of entries. The pointer passed as argument, if not null, will point to the TTreeReader that RDataFrame has set up to read the task's batch of entries. It is passed to the helper to allow certain advanced optimizations; it should not usually serve any purpose for the Helper. This method is often a no-op for simple helpers.
  • void Exec(unsigned int slot, ColumnTypes...columnValues): each working thread shall call this method during the event-loop, possibly concurrently. No two threads will ever call Exec with the same 'slot' value: this parameter is there to facilitate writing thread-safe helpers. The other arguments will be the values of the requested columns for the particular entry being processed.
  • void Finalize(): this method is called at the end of the event loop. Commonly used to finalize the contents of the result.
  • std::string GetActionName(): it returns a string identifier for this type of action that RDataFrame will use in diagnostics, SaveGraph(), etc.

Optional methods

If these methods are implemented they enable extra functionality as per the description below.

  • Result_t &PartialUpdate(unsigned int slot): if present, it must return the value of the partial result of this action for the given 'slot'. Different threads might call this method concurrently, but will do so with different 'slot' numbers. RDataFrame leverages this method to implement RResultPtr::OnPartialResult().
  • ROOT::RDF::SampleCallback_t GetSampleCallback(): if present, it must return a callable with the appropriate signature (see ROOT::RDF::SampleCallback_t) that will be invoked at the beginning of the processing of every sample, as in DefinePerSample().
  • Helper MakeNew(void *newResult, std::string_view variation = "nominal"): if implemented, it enables varying the action's result with VariationsFor(). It takes a type-erased new result that can be safely cast to a std::shared_ptr<Result_t> * (a pointer to shared pointer) and should be used as the action's output result. The function optionally takes the name of the current variation which could be useful in customizing its behaviour.

In case Book is called without specifying column types as template arguments, corresponding typed code will be just-in-time compiled by RDataFrame. In that case the Helper class needs to be known to the ROOT interpreter.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.
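
The call sequence of the mandatory interface can be illustrated with a minimal counting helper. This is a sketch under explicit assumptions: to compile without ROOT, CountHelperSketch does not inherit from ROOT::Detail::RDF::RActionImpl<Helper>, it replaces the TTreeReader * argument of InitTask with void *, and RunCountSketch is a hypothetical sequential driver standing in for the event loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <string>
#include <vector>

// Sketch of an action helper that counts entries (see the assumptions above).
class CountHelperSketch {
public:
   using Result_t = unsigned long long;

   explicit CountHelperSketch(unsigned int nSlots)
      : fResult(std::make_shared<Result_t>(0)), fPerSlot(nSlots, 0)
   {
      assert(nSlots > 0);
   }
   CountHelperSketch(CountHelperSketch &&) = default;

   std::shared_ptr<Result_t> GetResultPtr() const { return fResult; }
   // Reset the state so the helper can be reused for another event loop.
   void Initialize()
   {
      std::fill(fPerSlot.begin(), fPerSlot.end(), 0);
      *fResult = 0;
   }
   void InitTask(void * /*reader*/, unsigned int /*slot*/) {} // no-op for this helper
   // Each slot only touches its own counter, so no locking is needed.
   void Exec(unsigned int slot, double /*columnValue*/) { ++fPerSlot[slot]; }
   // Combine the per-slot counters into the final result.
   void Finalize() { *fResult = std::accumulate(fPerSlot.begin(), fPerSlot.end(), Result_t{0}); }
   std::string GetActionName() { return "CountSketch"; }

private:
   std::shared_ptr<Result_t> fResult;
   std::vector<Result_t> fPerSlot;
};

// Hypothetical driver standing in for the event loop, run sequentially here.
inline unsigned long long RunCountSketch(const std::vector<double> &column, unsigned int nSlots)
{
   CountHelperSketch helper(nSlots);
   auto result = helper.GetResultPtr(); // may be requested before Initialize()
   helper.Initialize();
   for (unsigned int slot = 0; slot < nSlots; ++slot)
      helper.InitTask(nullptr, slot);
   for (std::size_t i = 0; i < column.size(); ++i)
      helper.Exec(static_cast<unsigned int>(i % nSlots), column[i]);
   helper.Finalize();
   return *result;
}
```

Keeping one counter per slot is the design choice that makes Exec safe to call concurrently: no two threads ever share a slot value.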

Examples

See this tutorial for an example implementation of an action helper.

It is also possible to inspect the code used by built-in RDataFrame actions at ActionHelpers.hxx.

Definition at line 3621 of file RInterface.hxx.

◆ Cache() [1/4]

template<typename... ColumnTypes>
RInterface< RLoopManager > ROOT::RDF::Cache ( const ColumnNames_t & columnList)

Save selected columns in memory.

Template Parameters
ColumnTypes variadic list of branch/column types.
Parameters
[in] columnList columns to be cached in memory.
Returns
a RDataFrame that wraps the cached dataset.

This action returns a new RDataFrame object, completely detached from the originating RDataFrame. The new dataframe only contains the cached columns and stores their content in memory for fast, zero-copy subsequent access.

Use Cache if you know you will only need a subset of the (Filtered) data that fits in memory and that will be accessed many times.

Note
Cache will refuse to process columns with names of the form #columnname. These are special columns made available by some data sources (e.g. RNTupleDS) that represent the size of column columnname, and are not meant to be written out with that name (which is not a valid C++ variable name). Instead, go through an Alias(): df.Alias("nbar", "#bar").Cache<std::size_t>(..., {"nbar"}).

Example usage:

Types and columns specified:

auto cache_some_cols_df = df.Cache<double, MyClass, int>({"col0", "col1", "col2"});

Types inferred and columns specified (this invocation relies on jitting):

auto cache_some_cols_df = df.Cache({"col0", "col1", "col2"});

Types inferred and columns selected with a regexp (this invocation relies on jitting):

auto cache_all_cols_df = df.Cache(myRegexp);

Definition at line 1634 of file RInterface.hxx.

◆ Cache() [2/4]

RInterface< RLoopManager > ROOT::RDF::Cache ( const ColumnNames_t & columnList)

Save selected columns in memory.

Parameters
[in] columnList columns to be cached in memory.
Returns
a RDataFrame that wraps the cached dataset.

See the previous overloads for more information.

Definition at line 1646 of file RInterface.hxx.

◆ Cache() [3/4]

RInterface< RLoopManager > ROOT::RDF::Cache ( std::initializer_list< std::string > columnList)

Save selected columns in memory.

Parameters
[in] columnList columns to be cached in memory.
Returns
a RDataFrame that wraps the cached dataset.

See the previous overloads for more information.

Definition at line 1717 of file RInterface.hxx.

◆ Cache() [4/4]

RInterface< RLoopManager > ROOT::RDF::Cache ( std::string_view columnNameRegexp = "")

Save selected columns in memory.

Parameters
[in] columnNameRegexp The regular expression to match the column names to be selected. A '^' at the beginning and a '$' at the end of the string are implicitly assumed if they are not specified. The dialect supported is PCRE via the TPRegexp class. An empty string signals the selection of all columns.
Returns
a RDataFrame that wraps the cached dataset.

The existing columns are matched against the regular expression. If the string provided is empty, all columns are selected. See the previous overloads for more information.
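
The selection rule (whole-name match, empty string selects everything) can be sketched as follows. SelectColumnsSketch is a hypothetical helper, and it uses std::regex (ECMAScript dialect) as a stand-in for TPRegexp/PCRE, so some patterns may behave differently from RDataFrame.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not ROOT code): pick the columns whose full name
// matches the regular expression.
inline std::vector<std::string>
SelectColumnsSketch(const std::vector<std::string> &columns, const std::string &regexp)
{
   if (regexp.empty())
      return columns; // an empty string selects all columns
   // std::regex_match only succeeds on a full match, which models the
   // implicit '^' at the beginning and '$' at the end.
   const std::regex re(regexp);
   std::vector<std::string> selected;
   for (const auto &c : columns)
      if (std::regex_match(c, re))
         selected.push_back(c);
   return selected;
}
```

For instance, with columns "pt", "pt_err" and "eta", the pattern "pt" selects only "pt", while "pt.*" selects both "pt" and "pt_err".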

Definition at line 1695 of file RInterface.hxx.

◆ CacheImpl()

template<typename... ColTypes, std::size_t... S>
RInterface< RLoopManager > ROOT::RDF::CacheImpl ( const ColumnNames_t & columnList,
std::index_sequence< S... >  )
private

Implementation of cache.

Definition at line 3798 of file RInterface.hxx.

◆ CallCreateActionWithoutColsIfPossible() [1/2]

template<typename Helper, typename ActionResultType, typename... Others>
RResultPtr< ActionResultType > ROOT::RDF::CallCreateActionWithoutColsIfPossible ( const std::shared_ptr< ActionResultType > & ,
const std::shared_ptr< Helper > & ,
Others...  )
private

Definition at line 3900 of file RInterface.hxx.

◆ CallCreateActionWithoutColsIfPossible() [2/2]

template<typename Helper, typename ActionResultType>
auto ROOT::RDF::CallCreateActionWithoutColsIfPossible ( const std::shared_ptr< ActionResultType > & resPtr,
const std::shared_ptr< Helper > & hPtr,
TTraits::TypeList< RDFDetail::RInferredType >  ) -> decltype(hPtr->Exec(0u), RResultPtr<ActionResultType>{})
private

Definition at line 3890 of file RInterface.hxx.

◆ Count()

RResultPtr< ULong64_t > ROOT::RDF::Count ( )

Return the number of entries processed (lazy action).

Returns
the number of entries wrapped in a RResultPtr.

Useful e.g. for counting the number of entries passing a certain filter (see also Report). This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

auto nEntriesAfterCuts = myFilteredDf.Count();
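
The meaning of "booked but not executed" can be sketched with a small wrapper that defers a computation until its value is first requested. LazyResultSketch is a hypothetical class, not RResultPtr itself; it only borrows the names IsReady() and GetValue() to mirror the idea.

```cpp
#include <functional>
#include <optional>
#include <utility>

// Hypothetical stand-in for a lazily evaluated result: the computation is
// stored at booking time and run on first access.
template <typename T>
class LazyResultSketch {
public:
   explicit LazyResultSketch(std::function<T()> compute) : fCompute(std::move(compute)) {}

   // True once the computation has been run.
   bool IsReady() const { return fValue.has_value(); }

   // Runs the stored computation on the first call only, then returns the cached value.
   const T &GetValue()
   {
      if (!fValue)
         fValue = fCompute();
      return *fValue;
   }

private:
   std::function<T()> fCompute;
   std::optional<T> fValue;
};
```

In the example above, nothing is computed when Count() is called; the event loop only runs when the value of nEntriesAfterCuts is first accessed, e.g. by dereferencing it.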

Definition at line 1900 of file RInterface.hxx.

◆ DefineImpl() [1/2]

template<typename F, typename DefineType, typename RetType = typename TTraits::CallableTraits<F>::ret_type>
std::enable_if_t< std::is_default_constructible< RetType >::value, RInterface< Proxied > > ROOT::RDF::DefineImpl ( std::string_view name,
F && expression,
const ColumnNames_t & columns,
const std::string & where )
private

Definition at line 3737 of file RInterface.hxx.

◆ DefineImpl() [2/2]

template<typename F, typename DefineType, typename RetType = typename TTraits::CallableTraits<F>::ret_type, bool IsFStringConv = std::is_convertible<F, std::string>::value, bool IsRetTypeDefConstr = std::is_default_constructible<RetType>::value>
std::enable_if_t<!IsFStringConv &&!IsRetTypeDefConstr, RInterface< Proxied > > ROOT::RDF::DefineImpl ( std::string_view ,
F ,
const ColumnNames_t & ,
const std::string &  )
private

Definition at line 3788 of file RInterface.hxx.

◆ Display() [1/4]

template<typename... ColumnTypes>
RResultPtr< RDisplay > ROOT::RDF::Display ( const ColumnNames_t & columnList,
size_t nRows = 5,
size_t nMaxCollectionElements = 10 )

Provides a representation of the columns in the dataset.

Template Parameters
ColumnTypes variadic list of branch/column types.
Parameters
[in] columnList Names of the columns to be displayed.
[in] nRows Number of events for each column to be displayed.
[in] nMaxCollectionElements Maximum number of collection elements to display per row.
Returns
the RDisplay instance wrapped in a RResultPtr.

This function returns a RResultPtr<RDisplay> containing all the entries to be displayed, organized in a tabular form. RDisplay will either print on the standard output a summarized version through RDisplay::Print() or will return a complete version through RDisplay::AsString().

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Preparing the RResultPtr<RDisplay> object with all columns and default number of entries
auto d1 = rdf.Display("");
// Preparing the RResultPtr<RDisplay> object with two columns and 128 entries
auto d2 = rdf.Display({"x", "y"}, 128);
// Printing the short representations, the event loop will run
d1->Print();
d2->Print();

Definition at line 3666 of file RInterface.hxx.

◆ Display() [2/4]

RResultPtr< RDisplay > ROOT::RDF::Display ( const ColumnNames_t & columnList,
size_t nRows = 5,
size_t nMaxCollectionElements = 10 )

Provides a representation of the columns in the dataset.

Parameters
[in] columnList Names of the columns to be displayed.
[in] nRows Number of events for each column to be displayed.
[in] nMaxCollectionElements Maximum number of collection elements to display per row.
Returns
the RDisplay instance wrapped in a RResultPtr.

This overload automatically infers the column types. See the previous overloads for further details.

Invoked when no types are specified to Display

Definition at line 3689 of file RInterface.hxx.

◆ Display() [3/4]

RResultPtr< RDisplay > ROOT::RDF::Display ( std::initializer_list< std::string > columnList,
size_t nRows = 5,
size_t nMaxCollectionElements = 10 )

Provides a representation of the columns in the dataset.

Parameters
[in] columnList Names of the columns to be displayed.
[in] nRows Number of events for each column to be displayed.
[in] nMaxCollectionElements Maximum number of collection elements to display per row.
Returns
the RDisplay instance wrapped in a RResultPtr.

See the previous overloads for further details.

Definition at line 3728 of file RInterface.hxx.

◆ Display() [4/4]

RResultPtr< RDisplay > ROOT::RDF::Display ( std::string_view columnNameRegexp = "",
size_t nRows = 5,
size_t nMaxCollectionElements = 10 )

Provides a representation of the columns in the dataset.

Parameters
[in] columnNameRegexp A regular expression to select the columns.
[in] nRows Number of events for each column to be displayed.
[in] nMaxCollectionElements Maximum number of collection elements to display per row.
Returns
the RDisplay instance wrapped in a RResultPtr.

The existing columns are matched against the regular expression. If the string provided is empty, all columns are selected. See the previous overloads for further details.

Definition at line 3712 of file RInterface.hxx.

◆ Fill()

template<typename FirstColumn = RDFDetail::RInferredType, typename... OtherColumns, typename T>
RResultPtr< std::decay_t< T > > ROOT::RDF::Fill ( T && model,
const ColumnNames_t & columnList )

Return an object of type T on which T::Fill will be called once per event (lazy action).

Type T must provide at least:

  • a copy-constructor
  • a Fill method that accepts as many arguments as there are column names in columnList, with matching types (these types can also be passed as template parameters to this method)
  • a Merge method with signature Merge(TCollection *) or Merge(const std::vector<T *>&) that merges the objects passed as argument into the object on which Merge was called (analogous to TH1::Merge). Note that if the signature that takes a TCollection* is used, then T must inherit from TObject (to allow insertion in the TCollection*).
Template Parameters
FirstColumn The type of the first column whose values are used to fill the object. Inferred together with OtherColumns if not present.
OtherColumns A list of the types of the other columns whose values are used to fill the object.
T The type of the object to fill. Automatically deduced.
Parameters
[in] model The model to be considered to build the new return value.
[in] columnList A list containing the names of the columns that will be passed when calling Fill
Returns
the filled object wrapped in a RResultPtr.

The user gives up ownership of the model object. The list of column names to be used for filling must always be specified. This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

MyClass obj;
// Deduce column types (this invocation needs jitting internally, and in this case
// MyClass needs to be known to the interpreter)
auto myFilledObj = myDf.Fill(obj, {"col0", "col1"});
// explicit column types
auto myFilledObj2 = myDf.Fill<float, float>(obj, {"col0", "col1"});
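
A type meeting the three requirements listed above could look like the following sketch. WeightedMeanSketch is a hypothetical user class, not part of ROOT; it uses the Merge(const std::vector<T *>&) signature, so it does not need to inherit from TObject.

```cpp
#include <vector>

// Hypothetical user type: copy-constructible (implicitly), with a Fill
// method taking two column values and a Merge method taking a vector of
// pointers to other instances.
class WeightedMeanSketch {
public:
   // Called once per entry with the values of the two requested columns.
   void Fill(double value, double weight)
   {
      fSumWX += weight * value;
      fSumW += weight;
   }
   // Fold the objects filled by other slots into this one.
   void Merge(const std::vector<WeightedMeanSketch *> &others)
   {
      for (const auto *o : others) {
         fSumWX += o->fSumWX;
         fSumW += o->fSumW;
      }
   }
   // Weighted mean of all filled values, or 0 if nothing was filled.
   double Mean() const { return fSumW != 0. ? fSumWX / fSumW : 0.; }

private:
   double fSumWX = 0.;
   double fSumW = 0.;
};
```

Such an object would be passed as the model together with two column names, one for the values and one for the weights.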

Definition at line 3170 of file RInterface.hxx.

◆ Foreach()

template<typename F>
void ROOT::RDF::Foreach ( F f,
const ColumnNames_t & columns = {} )

Execute a user-defined function on each entry (instant action).

Parameters
[in] f Function, lambda expression, functor class or any other callable object performing user defined calculations.
[in] columns Names of the columns/branches in input to the user function.

The callable f is invoked once per entry. This is an instant action: upon invocation, an event loop as well as execution of all scheduled actions is triggered. Users are responsible for the thread-safety of this callable when executing with implicit multi-threading enabled (i.e. ROOT::EnableImplicitMT).

Example usage:

myDf.Foreach([](int i){ std::cout << i << std::endl;}, {"myIntColumn"});

Definition at line 1782 of file RInterface.hxx.

◆ ForeachSlot()

template<typename F>
void ROOT::RDF::ForeachSlot ( F f,
const ColumnNames_t & columns = {} )

Execute a user-defined function requiring a processing slot index on each entry (instant action).

Parameters
[in] f Function, lambda expression, functor class or any other callable object performing user defined calculations.
[in] columns Names of the columns/branches in input to the user function.

Same as Foreach, but the user-defined function takes an extra unsigned int as its first parameter, the processing slot index. This slot index will be assigned a different value, 0 to poolSize - 1, for each thread of execution. This is meant as a helper in writing thread-safe Foreach actions when using RDataFrame after ROOT::EnableImplicitMT(). The user-defined processing callable is able to follow different streams of processing indexed by the first parameter. ForeachSlot works just as well with single-thread execution: in that case slot will always be 0.

Example usage:

myDf.ForeachSlot([](unsigned int s, int i){ std::cout << "Slot " << s << ": "<< i << std::endl;}, {"myIntColumn"});
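
The typical use of the slot index is to give each slot its own accumulator, so that the callable never writes to shared state. The sketch below shows that pattern in plain C++; ForeachSlotSketch is a hypothetical sequential stand-in for the event loop (RDataFrame would instead run one slot per worker thread), and SumWithSlotsSketch is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the event loop: calls f(slot, value) for each
// entry, handing out slots round-robin.
template <typename F>
void ForeachSlotSketch(const std::vector<int> &column, unsigned int nSlots, F f)
{
   assert(nSlots > 0);
   for (std::size_t i = 0; i < column.size(); ++i)
      f(static_cast<unsigned int>(i % nSlots), column[i]);
}

// Sum a column using one partial sum per slot, reduced after the loop.
inline long long SumWithSlotsSketch(const std::vector<int> &column, unsigned int nSlots)
{
   std::vector<long long> partial(nSlots, 0); // one accumulator per slot, never shared
   ForeachSlotSketch(column, nSlots, [&partial](unsigned int slot, int v) { partial[slot] += v; });
   return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

With single-thread execution nSlots is 1 and every call receives slot 0, so the same callable works unchanged.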

Definition at line 1812 of file RInterface.hxx.

◆ FromArrow()

RDataFrame ROOT::RDF::FromArrow ( std::shared_ptr< arrow::Table > table,
std::vector< std::string > const & columnNames )

Factory method to create an Apache Arrow RDataFrame.

Creates a RDataFrame using an arrow::Table as input.

Parameters
[in] table an Apache Arrow table (arrow::Table) to use as a source.
[in] columnNames the names of the columns to use. If columnNames is empty, all the columns found in the table are used.

Definition at line 606 of file RArrowDS.cxx.

◆ FromCSV() [1/2]

RDataFrame ROOT::RDF::FromCSV ( std::string_view fileName,
bool readHeaders = true,
char delimiter = ',',
Long64_t linesChunkSize = -1LL,
std::unordered_map< std::string, char > && colTypes = {} )

Factory method to create a CSV RDataFrame.

Parameters
[in] fileName Path of the CSV file.
[in] readHeaders true if the CSV file contains headers as first row, false otherwise (default true).
[in] delimiter Delimiter character (default ',').
[in] linesChunkSize Number of lines to read per chunk; use -1 to read all lines.
[in] colTypes Custom column types, given as an unordered map whose keys are column names and whose values are type aliases ('O' for boolean, 'D' for double, 'L' for Long64_t, 'T' for std::string).
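
The type aliases accepted by colTypes can be summarised in code. This is a sketch only: ColTypeNameSketch and ExampleColTypesSketch are hypothetical helpers, the column names are made up, and the example map assumes that the map keys are column names.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical helper (not ROOT code): translate a colTypes alias into the
// name of the C++ type it stands for.
inline std::string ColTypeNameSketch(char alias)
{
   switch (alias) {
   case 'O': return "bool";
   case 'D': return "double";
   case 'L': return "Long64_t";
   case 'T': return "std::string";
   default: throw std::invalid_argument("unknown column type alias");
   }
}

// Example map in the shape taken by the colTypes parameter
// (made-up column names).
inline std::unordered_map<std::string, char> ExampleColTypesSketch()
{
   return {{"run", 'L'}, {"weight", 'D'}, {"isGood", 'O'}, {"label", 'T'}};
}
```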

Definition at line 650 of file RCsvDS.cxx.

◆ FromCSV() [2/2]

RDataFrame ROOT::RDF::FromCSV ( std::string_view fileName,
const RCsvDS::ROptions & options )

Factory method to create a CSV RDataFrame.

Parameters
[in] fileName Path of the CSV file.
[in] options File parsing settings.

Definition at line 644 of file RCsvDS.cxx.

◆ FromRNTuple() [1/2]

ROOT::RDataFrame ROOT::RDF::FromRNTuple ( std::string_view ntupleName,
const std::vector< std::string > & fileNames )

Definition at line 986 of file RNTupleDS.cxx.

◆ FromRNTuple() [2/2]

ROOT::RDataFrame ROOT::RDF::FromRNTuple ( std::string_view ntupleName,
std::string_view fileName )

Definition at line 981 of file RNTupleDS.cxx.

◆ FromSqlite()

RDataFrame ROOT::RDF::FromSqlite ( std::string_view fileName,
std::string_view query )

Factory method to create a SQLite RDataFrame.

Parameters
[in] fileName Path of the SQLite file.
[in] query SQL query that defines the data set.

Definition at line 524 of file RSqliteDS.cxx.

◆ getData()

template<typename T>
std::shared_ptr< arrow::ChunkedArray > ROOT::RDF::getData ( T p)

Definition at line 542 of file RArrowDS.cxx.

◆ GetFilterNames()

std::vector< std::string > ROOT::RDF::GetFilterNames ( )

Returns the names of the filters created.

Returns
the container of filters names.

If called on a root node, the names of all the filters in the computation graph are returned. For any other node, only the filters upstream of that node are returned. Filters without a name are reported as "Unnamed Filter". This is neither an action nor a transformation, just a query to the RDataFrame object.

Example usage:

auto filtNames = d.GetFilterNames();
for (auto &&filtName : filtNames) std::cout << filtName << std::endl;

Definition at line 3463 of file RInterface.hxx.

◆ getNRecords()

int ROOT::RDF::getNRecords ( std::shared_ptr< arrow::Table > & table,
std::vector< std::string > & columnNames )

Definition at line 535 of file RArrowDS.cxx.

◆ GetProxiedPtr()

const std::shared_ptr< Proxied > & ROOT::RDF::GetProxiedPtr ( ) const
protected

Definition at line 3917 of file RInterface.hxx.

◆ Graph()

template<typename X = RDFDetail::RInferredType, typename Y = RDFDetail::RInferredType>
RResultPtr<::TGraph > ROOT::RDF::Graph ( std::string_view x = "",
std::string_view y = "" )

Fill and return a TGraph object (lazy action).

Template Parameters
X The type of the column used to fill the x axis.
Y The type of the column used to fill the y axis.
Parameters
[in] x The name of the column that will fill the x axis.
[in] y The name of the column that will fill the y axis.
Returns
the TGraph wrapped in a RResultPtr.

Columns can be of a container type (e.g. std::vector<double>), in which case the TGraph is filled with each one of the elements of the container. If multithreading is enabled, the order in which points are inserted is undefined. If the graph has to be drawn, it is advisable to sort it on the x axis first. The TGraph is given a name and a title based on the input column names.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myGraph1 = myDf.Graph("xValues", "yValues");
// Explicit column types
auto myGraph2 = myDf.Graph<int, float>("xValues", "yValues");
Note
Differently from other ROOT interfaces, the returned TGraph is not associated to gDirectory and the caller is responsible for its lifetime (in particular, a typical source of confusion is that if result graphs go out of scope before the end of the program, ROOT might display a blank canvas).
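
Because the insertion order is undefined with multithreading, points usually need to be ordered by x before a line is drawn through them. The sketch below shows the operation on a plain vector of (x, y) pairs; SortPointsByXSketch is a hypothetical helper. On a TGraph itself, TGraph::Sort() serves the same purpose.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: order (x, y) points by x, keeping the relative order
// of points that share the same x.
inline void SortPointsByXSketch(std::vector<std::pair<double, double>> &points)
{
   std::stable_sort(points.begin(), points.end(),
                    [](const auto &a, const auto &b) { return a.first < b.first; });
}
```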

Definition at line 2847 of file RInterface.hxx.

◆ GraphAsymmErrors()

template<typename X = RDFDetail::RInferredType, typename Y = RDFDetail::RInferredType, typename EXL = RDFDetail::RInferredType, typename EXH = RDFDetail::RInferredType, typename EYL = RDFDetail::RInferredType, typename EYH = RDFDetail::RInferredType>
RResultPtr<::TGraphAsymmErrors > ROOT::RDF::GraphAsymmErrors ( std::string_view x = "",
std::string_view y = "",
std::string_view exl = "",
std::string_view exh = "",
std::string_view eyl = "",
std::string_view eyh = "" )

Fill and return a TGraphAsymmErrors object (lazy action).

Parameters
[in] x The name of the column that will fill the x axis.
[in] y The name of the column that will fill the y axis.
[in] exl The name of the column of X low errors
[in] exh The name of the column of X high errors
[in] eyl The name of the column of Y low errors
[in] eyh The name of the column of Y high errors
Returns
the TGraphAsymmErrors wrapped in a RResultPtr.

Columns can be of a container type (e.g. std::vector<double>), in which case the graph is filled with each one of the elements of the container. If multithreading is enabled, the order in which points are inserted is undefined.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myGAE1 = myDf.GraphAsymmErrors("xValues", "yValues", "exl", "exh", "eyl", "eyh");
// Explicit column types
using f = float;
auto myGAE2 = myDf.GraphAsymmErrors<f, f, f, f, f, f>("xValues", "yValues", "exl", "exh", "eyl", "eyh");

GraphAsymmErrors should also be used when only the values along one of the axes have associated errors, for example when only the y errors exist and the x errors are all zero. In such cases, do the following:

// Create a column of zeros in RDataFrame
auto rdf_withzeros = rdf.Define("zero", "0");
// or alternatively:
auto rdf_withzeros = rdf.Define("zero", []() -> double { return 0.;});
// Create the graph with y errors only
auto rdf_errorsOnYOnly = rdf_withzeros.GraphAsymmErrors("xValues", "yValues", "zero", "zero", "eyl", "eyh");
Note
Differently from other ROOT interfaces, the returned TGraphAsymmErrors is not associated to gDirectory and the caller is responsible for its lifetime (in particular, a typical source of confusion is that if result graphs go out of scope before the end of the program, ROOT might display a blank canvas).

Definition at line 2912 of file RInterface.hxx.

◆ Hist() [1/8]

template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > ROOT::RDF::Hist ( std::shared_ptr< ROOT::Experimental::RHist< BinContentType > > h,
const ColumnNames_t & columnList )

Fill the provided RHist (lazy action).

Parameters
[in] h The histogram that should be filled.
[in] columnList A list containing the names of the columns that will be passed when calling Fill
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

During execution of the computation graph, the passed histogram must only be accessed with methods that are allowed during concurrent filling.

Example usage:

auto h = std::make_shared<ROOT::Experimental::RHist<double>>(10, {5.0, 15.0});
auto myHist = myDf.Hist(h, {"col0"});

Definition at line 2617 of file RInterface.hxx.

◆ Hist() [2/8]

template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > ROOT::RDF::Hist ( std::shared_ptr< ROOT::Experimental::RHist< BinContentType > > h,
const ColumnNames_t & columnList,
std::string_view wName )

Fill the provided RHist with weights (lazy action).

Parameters
[in] h The histogram that should be filled.
[in] columnList A list containing the names of the columns that will be passed when calling Fill
[in] wName The name of the column that will provide the weights.
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

This overload is not available for integral bin content types (see RHistEngine::SupportsWeightedFilling).

During execution of the computation graph, the passed histogram must only be accessed with methods that are allowed during concurrent filling.

Example usage:

auto h = std::make_shared<ROOT::Experimental::RHist<double>>(10, {5.0, 15.0});
auto myHist = myDf.Hist(h, {"col0"}, "colW");
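
One way to make an overload unavailable for integral bin content types is to constrain it with a compile-time trait. The following is a sketch of that technique only: kSupportsWeightedFillingSketch and FillWeightedSketch are hypothetical names, and the assumption that the real trait is exactly "not an integral type" is made for illustration.

```cpp
#include <type_traits>

// Assumption for this sketch: weighted filling is allowed exactly when the
// bin content type is not an integral type.
template <typename BinContentType>
constexpr bool kSupportsWeightedFillingSketch = !std::is_integral_v<BinContentType>;

// Weighted fill of a single bin. The overload is removed from overload
// resolution for integral bin content types.
template <typename BinContentType>
std::enable_if_t<kSupportsWeightedFillingSketch<BinContentType>>
FillWeightedSketch(BinContentType &bin, double weight)
{
   bin += weight;
}
```

With this constraint, calling FillWeightedSketch on an int bin fails to compile instead of silently truncating the weight.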

Definition at line 2718 of file RInterface.hxx.

◆ Hist() [3/8]

template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHistEngine< BinContentType > > ROOT::RDF::Hist ( std::shared_ptr< ROOT::Experimental::RHistEngine< BinContentType > > h,
const ColumnNames_t & columnList )

Fill the provided RHistEngine (lazy action).

Parameters
[in] h The histogram that should be filled.
[in] columnList A list containing the names of the columns that will be passed when calling Fill
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

During execution of the computation graph, the passed histogram must only be accessed with methods that are allowed during concurrent filling.

Example usage:

auto h = std::make_shared<ROOT::Experimental::RHistEngine<double>>(10, {5.0, 15.0});
auto myHist = myDf.Hist(h, {"col0"});

Definition at line 2759 of file RInterface.hxx.

◆ Hist() [4/8]

template<typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes, typename BinContentType>
RResultPtr< ROOT::Experimental::RHistEngine< BinContentType > > ROOT::RDF::Hist ( std::shared_ptr< ROOT::Experimental::RHistEngine< BinContentType > > h,
const ColumnNames_t & columnList,
std::string_view wName )

Fill the provided RHistEngine with weights (lazy action).

Parameters
[in] h The histogram that should be filled.
[in] columnList A list containing the names of the columns that will be passed when calling Fill
[in] wName The name of the column that will provide the weights.
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

This overload is not available for integral bin content types (see RHistEngine::SupportsWeightedFilling).

During execution of the computation graph, the passed histogram must only be accessed with methods that are allowed during concurrent filling.

Example usage:

auto h = std::make_shared<ROOT::Experimental::RHistEngine<double>>(10, {5.0, 15.0});
auto myHist = myDf.Hist(h, {"col0"}, "colW");

Definition at line 2795 of file RInterface.hxx.

◆ Hist() [5/8]

template<typename BinContentType = double, typename V = RDFDetail::RInferredType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > ROOT::RDF::Hist ( std::uint64_t nNormalBins,
std::pair< double, double > interval,
std::string_view vName )

Fill and return a one-dimensional RHist (lazy action).

Template Parameters
BinContentType The bin content type of the returned RHist.
Parameters
[in] nNormalBins The returned histogram will be constructed using this number of normal bins.
[in] interval The axis interval of the constructed histogram (lower end inclusive, upper end exclusive).
[in] vName The name of the column that will fill the histogram.
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

auto myHist = myDf.Hist(10, {5, 15}, "col0");
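
The meaning of nNormalBins and of the half-open interval can be made concrete with a small bin-index computation. FindNormalBinSketch is a hypothetical helper, not ROOT's implementation; values outside the interval are simply reported as having no normal bin, and how RHist treats them is not covered here.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper: index of the normal bin containing x, for nNormalBins
// equidistant bins on [interval.first, interval.second).
inline std::optional<std::uint64_t>
FindNormalBinSketch(std::uint64_t nNormalBins, std::pair<double, double> interval, double x)
{
   assert(nNormalBins > 0 && interval.first < interval.second);
   if (!(x >= interval.first && x < interval.second))
      return std::nullopt; // outside the interval (this also catches NaN)
   const double width = (interval.second - interval.first) / nNormalBins;
   auto bin = static_cast<std::uint64_t>((x - interval.first) / width);
   if (bin >= nNormalBins)
      bin = nNormalBins - 1; // guard against rounding at the upper edge
   return bin;
}
```

For the example above (10 bins on the interval from 5 to 15), the value 5 falls in bin 0, while the value 15 is outside the normal bins because the upper end is exclusive.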

Definition at line 2559 of file RInterface.hxx.

◆ Hist() [6/8]

template<typename BinContentType = ROOT::Experimental::RBinWithError, typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > ROOT::RDF::Hist ( std::uint64_t nNormalBins,
std::pair< double, double > interval,
std::string_view vName,
std::string_view wName )

Fill and return a one-dimensional RHist with weights (lazy action).

Template Parameters
BinContentType The bin content type of the returned RHist.
Parameters
[in] nNormalBins The returned histogram will be constructed using this number of normal bins.
[in] interval The axis interval of the constructed histogram (lower end inclusive, upper end exclusive).
[in] vName The name of the column that will fill the histogram.
[in] wName The name of the column that will provide the weights.
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

auto myHist = myDf.Hist(10, {5, 15}, "col0", "colW");

Definition at line 2650 of file RInterface.hxx.

◆ Hist() [7/8]

template<typename BinContentType = double, typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > ROOT::RDF::Hist ( std::vector< ROOT::Experimental::RAxisVariant > axes,
const ColumnNames_t & columnList )

Fill and return an RHist (lazy action).

Template Parameters
BinContentType The bin content type of the returned RHist.
Parameters
[in] axes The returned histogram will be constructed using these axes.
[in] columnList A list containing the names of the columns that will be passed when calling Fill
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

ROOT::Experimental::RRegularAxis axis(10, {5.0, 15.0});
auto myHist = myDf.Hist({axis}, {"col0"});

Definition at line 2585 of file RInterface.hxx.

◆ Hist() [8/8]

template<typename BinContentType = ROOT::Experimental::RBinWithError, typename ColumnType = RDFDetail::RInferredType, typename... ColumnTypes>
RResultPtr< ROOT::Experimental::RHist< BinContentType > > ROOT::RDF::Hist ( std::vector< ROOT::Experimental::RAxisVariant > axes,
const ColumnNames_t & columnList,
std::string_view wName )

Fill and return an RHist with weights (lazy action).

Template Parameters
BinContentTypeThe bin content type of the returned RHist.
Parameters
[in]axesThe returned histogram will be constructed using these axes.
[in]columnListA list containing the names of the columns that will be passed when calling Fill
[in]wNameThe name of the column that will provide the weights.
Returns
the histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

This overload is not available for integral bin content types (see RHistEngine::SupportsWeightedFilling).

Example usage:

ROOT::Experimental::RRegularAxis axis(10, {5.0, 15.0});
auto myHist = myDf.Hist({axis}, {"col0"}, "colW");

Definition at line 2680 of file RInterface.hxx.

◆ Histo1D() [1/5]

template<typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH1D > ROOT::RDF::Histo1D ( const TH1DModel & model,
std::string_view vName,
std::string_view wName )

Fill and return a one-dimensional histogram with the weighted values of a column (lazy action).

Template Parameters
VThe type of the column used to fill the histogram.
WThe type of the column used as weights.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]vNameThe name of the column that will fill the histogram.
[in]wNameThe name of the column that will provide the weights.
Returns
the monodimensional histogram wrapped in a RResultPtr.

See the description of the first Histo1D() overload for more details.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto myHist1 = myDf.Histo1D({"histName", "histTitle", 64u, 0., 128.}, "myValue", "myweight");
// Explicit column type
auto myHist2 = myDf.Histo1D<float, int>({"histName", "histTitle", 64u, 0., 128.}, "myValue", "myweight");

Definition at line 2036 of file RInterface.hxx.

◆ Histo1D() [2/5]

template<typename V, typename W>
RResultPtr<::TH1D > ROOT::RDF::Histo1D ( const TH1DModel & model = {"", "", 128u, 0., 0.})

Fill and return a one-dimensional histogram with the weighted values of a column (lazy action).

Template Parameters
VThe type of the column used to fill the histogram.
WThe type of the column used as weights.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
Returns
the monodimensional histogram wrapped in a RResultPtr.

This overload will use the first two default columns as column names. See the description of the first Histo1D() overload for more details.

Definition at line 2093 of file RInterface.hxx.

◆ Histo1D() [3/5]

template<typename V = RDFDetail::RInferredType>
RResultPtr<::TH1D > ROOT::RDF::Histo1D ( const TH1DModel & model = {"", "", 128u, 0., 0.},
std::string_view vName = "" )

Fill and return a one-dimensional histogram with the values of a column (lazy action).

Template Parameters
VThe type of the column used to fill the histogram.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]vNameThe name of the column that will fill the histogram.
Returns
the monodimensional histogram wrapped in a RResultPtr.

Columns can be of a container type (e.g. std::vector<double>), in which case the histogram is filled with each one of the elements of the container. In case multiple columns of container type are provided (e.g. values and weights) they must have the same length for each one of the events (but possibly different lengths between events). This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto myHist1 = myDf.Histo1D({"histName", "histTitle", 64u, 0., 128.}, "myColumn");
// Explicit column type
auto myHist2 = myDf.Histo1D<float>({"histName", "histTitle", 64u, 0., 128.}, "myColumn");
Note
Unlike other ROOT interfaces, the returned histogram is not associated with gDirectory, and the caller is responsible for its lifetime. A typical source of confusion is that if result histograms go out of scope before the end of the program, ROOT might display a blank canvas.
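
The container-column behaviour described above can be sketched without ROOT as follows. `FillFromContainers` is a hypothetical helper: for each event it fills one entry per container element and requires values and weights to have the same length within that event, while different events may have different lengths. Throwing an exception on a length mismatch is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of ROOT): fill equidistant bins over [lo, hi)
// with every element of a per-event container, using per-element weights.
void FillFromContainers(std::vector<double> &bins, double lo, double hi,
                        const std::vector<double> &values, const std::vector<double> &weights)
{
   assert(!bins.empty() && hi > lo);
   if (values.size() != weights.size())
      throw std::runtime_error("values and weights must have the same length within an event");
   for (std::size_t i = 0; i < values.size(); ++i) {
      const double x = values[i];
      if (!(x >= lo && x < hi))
         continue; // out-of-range elements are skipped here for brevity
      auto idx = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
      if (idx >= bins.size())
         idx = bins.size() - 1;
      bins[idx] += weights[i];
   }
}
```

Calling the helper once per event with containers of three elements and then of one element is valid; calling it with two values and one weight is rejected.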

Definition at line 1975 of file RInterface.hxx.

◆ Histo1D() [4/5]

template<typename V = RDFDetail::RInferredType>
RResultPtr<::TH1D > ROOT::RDF::Histo1D ( std::string_view vName)

Fill and return a one-dimensional histogram with the values of a column (lazy action).

Template Parameters
VThe type of the column used to fill the histogram.
Parameters
[in]vNameThe name of the column that will fill the histogram.
Returns
the monodimensional histogram wrapped in a RResultPtr.

This overload uses a default model histogram TH1D(name, title, 128u, 0., 0.). The "name" and "title" strings are built starting from the input column name. See the description of the first Histo1D() overload for more details.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto myHist1 = myDf.Histo1D("myColumn");
// Explicit column type
auto myHist2 = myDf.Histo1D<float>("myColumn");

Definition at line 2010 of file RInterface.hxx.

◆ Histo1D() [5/5]

template<typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH1D > ROOT::RDF::Histo1D ( std::string_view vName,
std::string_view wName )

Fill and return a one-dimensional histogram with the weighted values of a column (lazy action).

Template Parameters
VThe type of the column used to fill the histogram.
WThe type of the column used as weights.
Parameters
[in]vNameThe name of the column that will fill the histogram.
[in]wNameThe name of the column that will provide the weights.
Returns
the monodimensional histogram wrapped in a RResultPtr.

This overload uses a default model histogram TH1D(name, title, 128u, 0., 0.). The "name" and "title" strings are built starting from the input column names. See the description of the first Histo1D() overload for more details.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myHist1 = myDf.Histo1D("myValue", "myweight");
// Explicit column types
auto myHist2 = myDf.Histo1D<float, int>("myValue", "myweight");

Definition at line 2073 of file RInterface.hxx.

◆ Histo2D() [1/3]

template<typename V1, typename V2, typename W>
RResultPtr<::TH2D > ROOT::RDF::Histo2D ( const TH2DModel & model)

Definition at line 2188 of file RInterface.hxx.

◆ Histo2D() [2/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH2D > ROOT::RDF::Histo2D ( const TH2DModel & model,
std::string_view v1Name,
std::string_view v2Name,
std::string_view wName )

Fill and return a weighted two-dimensional histogram (lazy action).

Template Parameters
V1The type of the column used to fill the x axis of the histogram.
V2The type of the column used to fill the y axis of the histogram.
WThe type of the column used for the weights of the histogram.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]v1NameThe name of the column that will fill the x axis.
[in]v2NameThe name of the column that will fill the y axis.
[in]wNameThe name of the column that will provide the weights.
Returns
the bidimensional histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myHist1 = myDf.Histo2D({"histName", "histTitle", 64u, 0., 128., 32u, -4., 4.}, "myValueX", "myValueY", "myWeight");
// Explicit column types
auto myHist2 = myDf.Histo2D<float, float, double>({"histName", "histTitle", 64u, 0., 128., 32u, -4., 4.}, "myValueX", "myValueY", "myWeight");

See the documentation of the first Histo2D() overload for more details.
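
The meaning of "booked but not executed" can be illustrated with a small standalone sketch. `LazyResult` is a hypothetical class, not RResultPtr: it stores a computation when it is created and runs it only on the first access to the value. In RDataFrame the first access to a result triggers the event loop instead, so this sketch mirrors only the observable laziness, not the implementation.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical, simplified stand-in for a lazily evaluated result handle.
template <typename T>
class LazyResult {
   std::function<T()> fCompute; // the booked computation
   std::shared_ptr<T> fValue;   // stays empty until the first access

public:
   explicit LazyResult(std::function<T()> compute) : fCompute(std::move(compute))
   {
      assert(static_cast<bool>(fCompute));
   }

   bool IsReady() const { return fValue != nullptr; }

   // The first call runs the computation; later calls reuse the cached value.
   const T &GetValue()
   {
      if (!fValue)
         fValue = std::make_shared<T>(fCompute());
      return *fValue;
   }
};
```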

Definition at line 2170 of file RInterface.hxx.

◆ Histo2D() [3/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType>
RResultPtr<::TH2D > ROOT::RDF::Histo2D ( const TH2DModel & model,
std::string_view v1Name = "",
std::string_view v2Name = "" )

Fill and return a two-dimensional histogram (lazy action).

Template Parameters
V1The type of the column used to fill the x axis of the histogram.
V2The type of the column used to fill the y axis of the histogram.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]v1NameThe name of the column that will fill the x axis.
[in]v2NameThe name of the column that will fill the y axis.
Returns
the bidimensional histogram wrapped in a RResultPtr.

Columns can be of a container type (e.g. std::vector<double>), in which case the histogram is filled with each one of the elements of the container. In case multiple columns of container type are provided (e.g. values and weights) they must have the same length for each one of the events (but possibly different lengths between events). This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myHist1 = myDf.Histo2D({"histName", "histTitle", 64u, 0., 128., 32u, -4., 4.}, "myValueX", "myValueY");
// Explicit column types
auto myHist2 = myDf.Histo2D<float, float>({"histName", "histTitle", 64u, 0., 128., 32u, -4., 4.}, "myValueX", "myValueY");
Note
Unlike other ROOT interfaces, the returned histogram is not associated with gDirectory, and the caller is responsible for its lifetime. A typical source of confusion is that if result histograms go out of scope before the end of the program, ROOT might display a blank canvas.

Definition at line 2127 of file RInterface.hxx.

◆ Histo3D() [1/3]

template<typename V1, typename V2, typename V3, typename W>
RResultPtr<::TH3D > ROOT::RDF::Histo3D ( const TH3DModel & model)

Definition at line 2292 of file RInterface.hxx.

◆ Histo3D() [2/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TH3D > ROOT::RDF::Histo3D ( const TH3DModel & model,
std::string_view v1Name,
std::string_view v2Name,
std::string_view v3Name,
std::string_view wName )

Fill and return a three-dimensional histogram (lazy action).

Template Parameters
V1The type of the column used to fill the x axis of the histogram. Inferred if not present.
V2The type of the column used to fill the y axis of the histogram. Inferred if not present.
V3The type of the column used to fill the z axis of the histogram. Inferred if not present.
WThe type of the column used for the weights of the histogram. Inferred if not present.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]v1NameThe name of the column that will fill the x axis.
[in]v2NameThe name of the column that will fill the y axis.
[in]v3NameThe name of the column that will fill the z axis.
[in]wNameThe name of the column that will provide the weights.
Returns
the tridimensional histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myHist1 = myDf.Histo3D({"name", "title", 64u, 0., 128., 32u, -4., 4., 8u, -2., 2.},
"myValueX", "myValueY", "myValueZ", "myWeight");
// Explicit column types
using d_t = double;
auto myHist2 = myDf.Histo3D<d_t, d_t, float, d_t>({"name", "title", 64u, 0., 128., 32u, -4., 4., 8u, -2., 2.},
"myValueX", "myValueY", "myValueZ", "myWeight");

See the documentation of the first Histo3D() overload for more details.

Definition at line 2273 of file RInterface.hxx.

◆ Histo3D() [3/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType>
RResultPtr<::TH3D > ROOT::RDF::Histo3D ( const TH3DModel & model,
std::string_view v1Name = "",
std::string_view v2Name = "",
std::string_view v3Name = "" )

Fill and return a three-dimensional histogram (lazy action).

Template Parameters
V1The type of the column used to fill the x axis of the histogram. Inferred if not present.
V2The type of the column used to fill the y axis of the histogram. Inferred if not present.
V3The type of the column used to fill the z axis of the histogram. Inferred if not present.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]v1NameThe name of the column that will fill the x axis.
[in]v2NameThe name of the column that will fill the y axis.
[in]v3NameThe name of the column that will fill the z axis.
Returns
the tridimensional histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myHist1 = myDf.Histo3D({"name", "title", 64u, 0., 128., 32u, -4., 4., 8u, -2., 2.},
"myValueX", "myValueY", "myValueZ");
// Explicit column types
auto myHist2 = myDf.Histo3D<double, double, float>({"name", "title", 64u, 0., 128., 32u, -4., 4., 8u, -2., 2.},
"myValueX", "myValueY", "myValueZ");
Note
If three-dimensional histograms consume too much memory in multithreaded runs, the cloning of TH3D per thread can be reduced using ROOT::RDF::Experimental::ThreadsPerTH3(). See the section "Memory Usage" in the RDataFrame description.
Unlike other ROOT interfaces, the returned histogram is not associated with gDirectory, and the caller is responsible for its lifetime. A typical source of confusion is that if result histograms go out of scope before the end of the program, ROOT might display a blank canvas.
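
To get a feeling for the memory note above, the following standalone sketch estimates the size of the bin contents of a TH3D-like histogram multiplied by the number of clones. `EstimateTH3DBytes` is a hypothetical helper. It counts one double per cell, including one underflow and one overflow cell per axis, and ignores everything else (sums of squared weights, axis objects, object overhead), so it is a lower bound under these assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of ROOT): approximate size in bytes of the bin
// contents of nClones three-dimensional histograms with double-precision cells.
std::size_t EstimateTH3DBytes(std::size_t nx, std::size_t ny, std::size_t nz, std::size_t nClones)
{
   assert(nx > 0 && ny > 0 && nz > 0);
   // Each axis carries one underflow and one overflow cell in addition to its bins.
   const std::size_t cells = (nx + 2) * (ny + 2) * (nz + 2);
   return cells * sizeof(double) * nClones;
}
```

For the 64 x 32 x 8 model used in the example above, this gives 22440 cells per histogram. The total then scales linearly with the number of clones, which is why reducing the number of clones lowers the memory footprint.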

Definition at line 2224 of file RInterface.hxx.

◆ HistoND() [1/2]

template<typename FirstColumn, typename... OtherColumns>
RResultPtr<::THnD > ROOT::RDF::HistoND ( const THnDModel & model,
const ColumnNames_t & columnList,
std::string_view wName = "" )

Fill and return an N-dimensional histogram (lazy action).

Template Parameters
FirstColumnThe type of the first column used to fill the object. Inferred if not present.
OtherColumnsThe types of the remaining columns used to fill the object.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]columnListA list containing the names of the columns that will be passed when calling Fill.
[in]wNameThe name of the column that will provide the weights.
Returns
the N-dimensional histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.

Example usage:

auto myFilledObj = myDf.HistoND<float, float, float, float>({"name","title", 4,
{40,40,40,40}, {20.,20.,20.,20.}, {60.,60.,60.,60.}},
{"col0", "col1", "col2", "col3"});
Note
A column with event weights should not be passed as part of columnList, but instead be passed in the new argument wName: HistoND(model, cols, weightCol).

Definition at line 2323 of file RInterface.hxx.

◆ HistoND() [2/2]

RResultPtr<::THnD > ROOT::RDF::HistoND ( const THnDModel & model,
const ColumnNames_t & columnList,
std::string_view wName = "" )

Fill and return an N-dimensional histogram (lazy action).

Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]columnListA list containing the names of the columns that will be passed when calling Fill
[in]wNameThe name of the column that will provide the weights.
Returns
the N-dimensional histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

auto myFilledObj = myDf.HistoND({"name","title", 4,
{40,40,40,40}, {20.,20.,20.,20.}, {60.,60.,60.,60.}},
{"col0", "col1", "col2", "col3"});
Note
A column with event weights should not be passed as part of columnList, but instead be passed in the new argument wName: HistoND(model, cols, weightCol).

Definition at line 2380 of file RInterface.hxx.

◆ HistoNSparseD() [1/2]

template<typename FirstColumn, typename... OtherColumns>
RResultPtr<::THnSparseD > ROOT::RDF::HistoNSparseD ( const THnSparseDModel & model,
const ColumnNames_t & columnList,
std::string_view wName = "" )

Fill and return a sparse N-dimensional histogram (lazy action).

Template Parameters
FirstColumnThe type of the first column used to fill the object. Inferred if not present.
OtherColumnsThe types of the remaining columns used to fill the object.
Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]columnListA list containing the names of the columns that will be passed when calling Fill.
[in]wNameThe name of the column that will provide the weights.
Returns
the N-dimensional histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.

Example usage:

auto myFilledObj = myDf.HistoNSparseD<float, float, float, float>({"name","title", 4,
{40,40,40,40}, {20.,20.,20.,20.}, {60.,60.,60.,60.}},
{"col0", "col1", "col2", "col3"});
Note
A column with event weights should not be passed as part of columnList, but instead be passed in the dedicated argument wName: HistoNSparseD(model, cols, weightCol).

Definition at line 2444 of file RInterface.hxx.

◆ HistoNSparseD() [2/2]

RResultPtr<::THnSparseD > ROOT::RDF::HistoNSparseD ( const THnSparseDModel & model,
const ColumnNames_t & columnList,
std::string_view wName = "" )

Fill and return a sparse N-dimensional histogram (lazy action).

Parameters
[in]modelThe returned histogram will be constructed using this as a model.
[in]columnListA list containing the names of the columns that will be passed when calling Fill
[in]wNameThe name of the column that will provide the weights.
Returns
the N-dimensional histogram wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

auto myFilledObj = myDf.HistoNSparseD({"name","title", 4,
{40,40,40,40}, {20.,20.,20.,20.}, {60.,60.,60.,60.}},
{"col0", "col1", "col2", "col3"});
Note
A column with event weights should not be passed as part of columnList, but instead be passed in the dedicated argument wName: HistoNSparseD(model, cols, weightCol).
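
The idea behind a sparse N-dimensional histogram, and the separation of coordinates from the weight, can be sketched without ROOT. `SparseHistND` is a hypothetical class that stores only the bins that were actually filled, keyed by their per-axis bin indices. The weight is a separate argument of `Fill` rather than an extra coordinate, mirroring the wName argument above. This is an illustration of the concept, not of THnSparseD's internal layout.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical sparse N-dimensional histogram: unfilled bins occupy no memory.
class SparseHistND {
   std::vector<int> fNBins;
   std::vector<double> fLo, fHi;
   std::map<std::vector<int>, double> fContent; // key: per-axis bin indices

public:
   SparseHistND(std::vector<int> nBins, std::vector<double> lo, std::vector<double> hi)
      : fNBins(std::move(nBins)), fLo(std::move(lo)), fHi(std::move(hi))
   {
      assert(fNBins.size() == fLo.size() && fLo.size() == fHi.size());
   }

   // The weight is passed separately from the coordinates.
   void Fill(const std::vector<double> &coords, double weight = 1.)
   {
      assert(coords.size() == fNBins.size());
      std::vector<int> key(coords.size());
      for (std::size_t d = 0; d < coords.size(); ++d) {
         const double x = coords[d];
         if (!(x >= fLo[d] && x < fHi[d]))
            return; // out-of-range entries are ignored here for brevity
         int idx = static_cast<int>((x - fLo[d]) / (fHi[d] - fLo[d]) * fNBins[d]);
         if (idx >= fNBins[d])
            idx = fNBins[d] - 1;
         key[d] = idx;
      }
      fContent[key] += weight;
   }

   std::size_t NFilledBins() const { return fContent.size(); }

   double Content(const std::vector<int> &key) const
   {
      const auto it = fContent.find(key);
      return it == fContent.end() ? 0. : it->second;
   }
};
```

With the four-axis model from the example above (40 bins per axis), a dense histogram would need 40^4 cells, while the sketch allocates one map entry per distinct filled bin.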

Definition at line 2503 of file RInterface.hxx.

◆ JittedVaryImpl()

RInterface< Proxied > ROOT::RDF::JittedVaryImpl ( const std::vector< std::string > & colNames,
std::string_view expression,
const std::vector< std::string > & variationTags,
std::string_view variationName,
bool isSingleColumn )
private

Definition at line 3853 of file RInterface.hxx.

◆ MakeLazyDataFrame()

template<typename... ColumnTypes>
RDataFrame ROOT::RDF::MakeLazyDataFrame ( std::pair< std::string, RResultPtr< std::vector< ColumnTypes > > > &&... colNameProxyPairs)

Factory method to create a Lazy RDataFrame.

Parameters
[in]colNameProxyPairsthe series of pairs to describe the columns of the data source, first element of the pair is the name of the column and the second is the RResultPtr to the column in the parent data frame.

Definition at line 29 of file RLazyDS.hxx.

◆ MakeTrivialDataFrame() [1/2]

RInterface< RDFDetail::RLoopManager > ROOT::RDF::MakeTrivialDataFrame ( )

Make a RDF wrapping a RTrivialDS with infinite entries, for demo purposes.

Definition at line 117 of file RTrivialDS.cxx.

◆ MakeTrivialDataFrame() [2/2]

RInterface< RDFDetail::RLoopManager > ROOT::RDF::MakeTrivialDataFrame ( ULong64_t size,
bool skipEvenEntries = false )

Make a RDF wrapping a RTrivialDS with the specified amount of entries.

Constructing an RDataFrame as RDataFrame(nEntries) is a superior alternative. If size is std::numeric_limits<ULong64_t>::max(), this acts as an infinite data-source: it returns entries from GetEntryRanges forever or until a Range stops the event loop (for test purposes).

Definition at line 110 of file RTrivialDS.cxx.

◆ Max()

template<typename T = RDFDetail::RInferredType>
RResultPtr< RDFDetail::MaxReturnType_t< T > > ROOT::RDF::Max ( std::string_view columnName = "")

Return the maximum of processed column values (lazy action).

Template Parameters
TThe type of the branch/column.
Parameters
[in]columnNameThe name of the branch/column to be treated.
Returns
the maximum value of the selected column wrapped in a RResultPtr.

If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method. If the type of the column is inferred, the return type is double; otherwise, it is the type of the column.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto maxVal0 = myDf.Max("values");
// Explicit column type
auto maxVal1 = myDf.Max<double>("values");

Definition at line 3304 of file RInterface.hxx.

◆ Mean()

template<typename T = RDFDetail::RInferredType>
RResultPtr< double > ROOT::RDF::Mean ( std::string_view columnName = "")

Return the mean of processed column values (lazy action).

Template Parameters
TThe type of the branch/column.
Parameters
[in]columnNameThe name of the branch/column to be treated.
Returns
the mean value of the selected column wrapped in a RResultPtr.

If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method. Note that internally, the summations are executed with Kahan sums in double precision, irrespective of the type of column that is read.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto meanVal0 = myDf.Mean("values");
// Explicit column type
auto meanVal1 = myDf.Mean<double>("values");
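
The Kahan summation mentioned above can be sketched in a few lines of standalone C++. `KahanMean` is a hypothetical helper, not the code used by RDataFrame: it converts every value to double, keeps a compensation term for the low-order bits lost in each addition, and divides the compensated sum by the number of values.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: mean of a sequence computed with compensated (Kahan)
// summation in double precision, irrespective of the element type T.
template <typename T>
double KahanMean(const std::vector<T> &values)
{
   assert(!values.empty());
   double sum = 0.;
   double compensation = 0.; // running estimate of the rounding error
   for (const T &v : values) {
      const double y = static_cast<double>(v) - compensation;
      const double t = sum + y;
      compensation = (t - sum) - y; // recovers what was lost when adding y to sum
      sum = t;
   }
   return sum / static_cast<double>(values.size());
}
```

Note that compiler options which relax floating-point semantics (for example fast-math modes) may optimize the compensation term away.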

Definition at line 3335 of file RInterface.hxx.

◆ Min()

template<typename T = RDFDetail::RInferredType>
RResultPtr< RDFDetail::MinReturnType_t< T > > ROOT::RDF::Min ( std::string_view columnName = "")

Return the minimum of processed column values (lazy action).

Template Parameters
TThe type of the branch/column.
Parameters
[in]columnNameThe name of the branch/column to be treated.
Returns
the minimum value of the selected column wrapped in a RResultPtr.

If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method. If the type of the column is inferred, the return type is double; otherwise, it is the type of the column.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto minVal0 = myDf.Min("values");
// Explicit column type
auto minVal1 = myDf.Min<double>("values");

Definition at line 3274 of file RInterface.hxx.

◆ Not()

template<typename F, typename Args = typename ROOT::TypeTraits::CallableTraits<std::decay_t<F>>::arg_types_nodecay, typename Ret = typename ROOT::TypeTraits::CallableTraits<std::decay_t<F>>::ret_type>
auto ROOT::RDF::Not ( F && f) -> decltype(RDFInternal::NotHelper(Args(), std::forward<F>(f)))

Given a callable with signature bool(T1, T2, ...), return a callable with the same signature that returns the negated result.

The callable must have one single non-template definition of operator(). This is a limitation with respect to std::not_fn, required for interoperability with RDataFrame.
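
The difference from std::not_fn can be illustrated with a standalone sketch. `Negate` is a hypothetical helper restricted to plain function pointers; it is not ROOT's implementation. Because the argument types are known, the returned lambda has a single, non-template operator() with the same parameter list, whereas the wrapper returned by std::not_fn exposes a templated call operator whose argument types cannot be introspected.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: wrap a predicate with the fixed signature bool(Args...)
// into a callable with the same parameter list that returns the negated result.
template <typename... Args>
auto Negate(bool (*pred)(Args...))
{
   assert(pred != nullptr);
   return [pred](Args... args) -> bool { return !pred(std::forward<Args>(args)...); };
}

// Example predicates used only for illustration.
bool IsPositive(double x) { return x > 0.; }
bool BothEven(int a, int b) { return a % 2 == 0 && b % 2 == 0; }
```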

Definition at line 83 of file RDFHelpers.hxx.

◆ operator!=() [1/3]

template<class T1, class T2>
bool ROOT::RDF::operator!= ( const RResultPtr< T1 > & lhs,
const RResultPtr< T2 > & rhs )

Definition at line 428 of file RResultPtr.hxx.

◆ operator!=() [2/3]

template<class T1>
bool ROOT::RDF::operator!= ( const RResultPtr< T1 > & lhs,
std::nullptr_t rhs )

Definition at line 446 of file RResultPtr.hxx.

◆ operator!=() [3/3]

template<class T1>
bool ROOT::RDF::operator!= ( std::nullptr_t lhs,
const RResultPtr< T1 > & rhs )

Definition at line 452 of file RResultPtr.hxx.

◆ operator<<()

std::ostream & ROOT::RDF::operator<< ( std::ostream & os,
const RDFDescription & description )

Definition at line 34 of file RDFDescription.cxx.

◆ operator==() [1/3]

template<class T1, class T2>
bool ROOT::RDF::operator== ( const RResultPtr< T1 > & lhs,
const RResultPtr< T2 > & rhs )

Definition at line 422 of file RResultPtr.hxx.

◆ operator==() [2/3]

template<class T1>
bool ROOT::RDF::operator== ( const RResultPtr< T1 > & lhs,
std::nullptr_t rhs )

Definition at line 434 of file RResultPtr.hxx.

◆ operator==() [3/3]

template<class T1>
bool ROOT::RDF::operator== ( std::nullptr_t lhs,
const RResultPtr< T1 > & rhs )

Definition at line 440 of file RResultPtr.hxx.

◆ PassAsVec()

template<std::size_t N, typename T, typename F>
auto ROOT::RDF::PassAsVec ( F && f) -> RDFInternal::PassAsVecHelper<std::make_index_sequence<N>, T, F>

PassAsVec is a callable generator that allows passing N variables of type T to a function as a single collection.

PassAsVec<N, T>(func) returns a callable that takes N arguments of type T, passes them to the function func as an initializer list {t1, t2, t3, ..., tN}, and returns whatever func({t1, t2, t3, ..., tN}) returns.

Note that for this to work with RDataFrame the type of all columns that the callable is applied to must be exactly T. Example usage together with RDataFrame ("varX" columns must all be float variables):

bool myVecFunc(std::vector<float> args);
df.Filter(PassAsVec<3, float>(myVecFunc), {"var1", "var2", "var3"});
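
The mechanism can be sketched in standalone C++ with an index sequence. `PackAsVec` and `PackAsVecHelper` are hypothetical names chosen to avoid confusion with the ROOT helper, and the sketch packs the arguments into a std::vector<T> as a simplifying assumption. The key point is that the call operator is not a template: it takes exactly N parameters of type T.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

template <typename Seq, typename T, typename F>
class PackAsVecHelper;

// Hypothetical helper: the index pack Is... has N elements, one per argument.
template <std::size_t... Is, typename T, typename F>
class PackAsVecHelper<std::index_sequence<Is...>, T, F> {
   // Maps every index to T, so the parameter list becomes N copies of T.
   template <std::size_t>
   using AlwaysT = T;
   F fFunc;

public:
   explicit PackAsVecHelper(F f) : fFunc(std::move(f)) {}
   // Non-template call operator with exactly N arguments of type T.
   auto operator()(AlwaysT<Is>... args) { return fFunc(std::vector<T>{args...}); }
};

template <std::size_t N, typename T, typename F>
auto PackAsVec(F f)
{
   return PackAsVecHelper<std::make_index_sequence<N>, T, F>(std::move(f));
}

// Example function taking a single collection, used only for illustration.
bool AllPositive(const std::vector<float> &values)
{
   assert(!values.empty());
   for (float v : values)
      if (v <= 0.f)
         return false;
   return true;
}
```

Here `PackAsVec<3, float>(AllPositive)` yields a callable with the parameter list (float, float, float).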

Definition at line 103 of file RDFHelpers.hxx.

◆ Profile1D() [1/3]

template<typename V1, typename V2, typename W>
RResultPtr<::TProfile > ROOT::RDF::Profile1D ( const TProfile1DModel & model)

Fill and return a one-dimensional profile (lazy action). See the first Profile1D() overload for more details.

Definition at line 3026 of file RInterface.hxx.

◆ Profile1D() [2/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TProfile > ROOT::RDF::Profile1D ( const TProfile1DModel & model,
std::string_view v1Name,
std::string_view v2Name,
std::string_view wName )

Fill and return a one-dimensional profile (lazy action).

Template Parameters
V1The type of the column used to fill the x axis of the profile. Inferred if not present.
V2The type of the column used to fill the y axis of the profile. Inferred if not present.
WThe type of the column the weights of which are used to fill the profile. Inferred if not present.
Parameters
[in]modelThe model to be considered to build the new return value.
[in]v1NameThe name of the column that will fill the x axis.
[in]v2NameThe name of the column that will fill the y axis.
[in]wNameThe name of the column that will provide the weights.
Returns
the monodimensional profile wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myProf1 = myDf.Profile1D({"profName", "profTitle", 64u, -4., 4.}, "xValues", "yValues", "weight");
// Explicit column types
auto myProf2 = myDf.Profile1D<int, float, double>({"profName", "profTitle", 64u, -4., 4.},
"xValues", "yValues", "weight");

See the first Profile1D() overload for more details.

Definition at line 3004 of file RInterface.hxx.

◆ Profile1D() [3/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType>
RResultPtr<::TProfile > ROOT::RDF::Profile1D ( const TProfile1DModel & model,
std::string_view v1Name = "",
std::string_view v2Name = "" )

Fill and return a one-dimensional profile (lazy action).

Template Parameters
V1The type of the column used to fill the x axis of the profile. Inferred if not present.
V2The type of the column used to fill the y axis of the profile. Inferred if not present.
Parameters
[in]modelThe model to be considered to build the new return value.
[in]v1NameThe name of the column that will fill the x axis.
[in]v2NameThe name of the column that will fill the y axis.
Returns
the monodimensional profile wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myProf1 = myDf.Profile1D({"profName", "profTitle", 64u, -4., 4.}, "xValues", "yValues");
// Explicit column types
auto myProf2 = myDf.Profile1D<int, float>({"profName", "profTitle", 64u, -4., 4.}, "xValues", "yValues");
Note
Unlike other ROOT interfaces, the returned profile is not associated with gDirectory, and the caller is responsible for its lifetime. A typical source of confusion is that if result profiles go out of scope before the end of the program, ROOT might display a blank canvas.
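
What a one-dimensional profile accumulates can be sketched without ROOT. `MiniProfile1D` is a hypothetical class: for each x bin it keeps the sum of weights and the weighted sum of y, and reports the weighted mean of y in that bin. The spread and error options of a full profile class are deliberately left out of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal profile: mean of y per equidistant x bin over [lo, hi).
class MiniProfile1D {
   struct Bin {
      double sumW = 0.;  // sum of weights
      double sumWY = 0.; // weighted sum of y
   };
   std::vector<Bin> fBins;
   double fLo, fHi;

public:
   MiniProfile1D(std::size_t nBins, double lo, double hi) : fBins(nBins), fLo(lo), fHi(hi)
   {
      assert(nBins > 0 && hi > lo);
   }

   void Fill(double x, double y, double w = 1.)
   {
      if (!(x >= fLo && x < fHi))
         return; // out-of-range entries are ignored here for brevity
      auto idx = static_cast<std::size_t>((x - fLo) / (fHi - fLo) * fBins.size());
      if (idx >= fBins.size())
         idx = fBins.size() - 1;
      fBins[idx].sumW += w;
      fBins[idx].sumWY += w * y;
   }

   // Weighted mean of y in bin i, or 0 for an empty bin.
   double Mean(std::size_t i) const
   {
      const Bin &b = fBins.at(i);
      return b.sumW != 0. ? b.sumWY / b.sumW : 0.;
   }
};
```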

Definition at line 2959 of file RInterface.hxx.

◆ Profile2D() [1/3]

template<typename V1, typename V2, typename V3, typename W>
RResultPtr<::TProfile2D > ROOT::RDF::Profile2D ( const TProfile2DModel & model)

Fill and return a two-dimensional profile (lazy action). See the first Profile2D() overload for more details.

Definition at line 3130 of file RInterface.hxx.

◆ Profile2D() [2/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr<::TProfile2D > ROOT::RDF::Profile2D ( const TProfile2DModel & model,
std::string_view v1Name,
std::string_view v2Name,
std::string_view v3Name,
std::string_view wName )

Fill and return a two-dimensional profile (lazy action).

Template Parameters
V1The type of the column used to fill the x axis of the histogram. Inferred if not present.
V2The type of the column used to fill the y axis of the histogram. Inferred if not present.
V3The type of the column used to fill the z axis of the histogram. Inferred if not present.
WThe type of the column used for the weights of the histogram. Inferred if not present.
Parameters
[in]modelThe returned profile will be constructed using this as a model.
[in]v1NameThe name of the column that will fill the x axis.
[in]v2NameThe name of the column that will fill the y axis.
[in]v3NameThe name of the column that will fill the z axis.
[in]wNameThe name of the column that will provide the weights.
Returns
the bidimensional profile wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myProf1 = myDf.Profile2D({"profName", "profTitle", 40, -4, 4, 40, -4, 4, 0, 20},
"xValues", "yValues", "zValues", "weight");
// Explicit column types
auto myProf2 = myDf.Profile2D<int, float, double, int>({"profName", "profTitle", 40, -4, 4, 40, -4, 4, 0, 20},
"xValues", "yValues", "zValues", "weight");

See the first Profile2D() overload for more details.

Definition at line 3108 of file RInterface.hxx.

◆ Profile2D() [3/3]

template<typename V1 = RDFDetail::RInferredType, typename V2 = RDFDetail::RInferredType, typename V3 = RDFDetail::RInferredType>
RResultPtr<::TProfile2D > ROOT::RDF::Profile2D ( const TProfile2DModel & model,
std::string_view v1Name = "",
std::string_view v2Name = "",
std::string_view v3Name = "" )

Fill and return a two-dimensional profile (lazy action).

Template Parameters
V1: The type of the column used to fill the x axis of the histogram. Inferred if not present.
V2: The type of the column used to fill the y axis of the histogram. Inferred if not present.
V3: The type of the column used to fill the z axis of the histogram. Inferred if not present.
Parameters
[in] model: The returned profile will be constructed using this as a model.
[in] v1Name: The name of the column that will fill the x axis.
[in] v2Name: The name of the column that will fill the y axis.
[in] v3Name: The name of the column that will fill the z axis.
Returns
the bidimensional profile wrapped in a RResultPtr.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto myProf1 = myDf.Profile2D({"profName", "profTitle", 40, -4, 4, 40, -4, 4, 0, 20},
"xValues", "yValues", "zValues");
// Explicit column types
auto myProf2 = myDf.Profile2D<int, float, double>({"profName", "profTitle", 40, -4, 4, 40, -4, 4, 0, 20},
"xValues", "yValues", "zValues");
Note
Unlike other ROOT interfaces, the returned profile is not associated with gDirectory and the caller is responsible for its lifetime. A typical source of confusion: if the result goes out of scope before the end of the program, ROOT might display a blank canvas.

Definition at line 3060 of file RInterface.hxx.

◆ Range() [1/2]

RInterface< RDFDetail::RRange< Proxied > > ROOT::RDF::Range ( unsigned int begin,
unsigned int end,
unsigned int stride = 1 )

Creates a node that filters entries based on range: [begin, end).

Parameters
[in] begin: Initial entry number considered for this range.
[in] end: Final entry number (excluded) considered for this range. 0 means that the range goes until the end of the dataset.
[in] stride: Process one entry of the [begin, end) range every stride entries. Must be strictly greater than 0.
Returns
the first node of the computation graph for which the event loop is limited to a certain range of entries.

Note that in case of previous Ranges and Filters the selected range refers to the transformed dataset. Ranges are only available if EnableImplicitMT has not been called. Multi-thread ranges are not supported.

Example usage:

auto d_0_30 = d.Range(0, 30); // Pick the first 30 entries
auto d_15_end = d.Range(15, 0); // Pick all entries from 15 onwards
auto d_15_end_3 = d.Range(15, 0, 3); // Stride: from event 15, pick an event every 3
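
The begin/end/stride rules can be modelled in a few lines of plain C++. This is only a sketch of the documented semantics, not ROOT's implementation: selectRange is a hypothetical helper, it assumes the first kept entry is begin itself (the text above does not spell out how the stride is aligned), and throwing on a zero stride is a choice made for this model.

```cpp
#include <stdexcept>
#include <vector>

// Entry numbers kept by a range [begin, end) with the given stride.
// end == 0 means "until the end of the dataset"; stride must be > 0.
std::vector<unsigned int> selectRange(unsigned int nEntries, unsigned int begin, unsigned int end,
                                      unsigned int stride = 1)
{
   if (stride == 0)
      throw std::invalid_argument("stride must be strictly greater than 0");
   const unsigned int stop = (end == 0 || end > nEntries) ? nEntries : end;
   std::vector<unsigned int> kept;
   // A wider loop index avoids wrap-around when begin + stride exceeds the unsigned int range.
   for (unsigned long long i = begin; i < stop; i += stride)
      kept.push_back(static_cast<unsigned int>(i));
   return kept;
}
```

Under these assumptions, selectRange(20, 15, 0, 3) keeps entries 15 and 18 of a 20-entry dataset.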

Definition at line 1741 of file RInterface.hxx.

◆ Range() [2/2]

RInterface< RDFDetail::RRange< Proxied > > ROOT::RDF::Range ( unsigned int end)

Creates a node that filters entries based on range.

Parameters
[in] end: Final entry number (excluded) considered for this range. 0 means that the range goes until the end of the dataset.
Returns
a node of the computation graph for which the range is defined.

See the other Range overload for a detailed description.

Definition at line 1762 of file RInterface.hxx.

◆ Reduce() [1/2]

template<typename F, typename T = typename TTraits::CallableTraits<F>::ret_type>
RResultPtr< T > ROOT::RDF::Reduce ( F f,
std::string_view columnName,
const T & redIdentity )

Execute a user-defined reduce operation on the values of a column.

Template Parameters
F: The type of the reduce callable. Automatically deduced.
T: The type of the column to apply the reduction to. Automatically deduced.
Parameters
[in] f: A callable with signature T(T,T)
[in] columnName: The column to be reduced. If omitted, the first default column is used instead.
[in] redIdentity: The reduced object of each thread is initialized to this value.
Returns
the reduced quantity wrapped in a RResultPtr.

Example usage:

auto sumOfIntColWithOffset = d.Reduce([](int x, int y) { return x + y; }, "intCol", 42);

See the description of the first Reduce overload for more information.

Definition at line 1882 of file RInterface.hxx.

◆ Reduce() [2/2]

template<typename F, typename T = typename TTraits::CallableTraits<F>::ret_type>
RResultPtr< T > ROOT::RDF::Reduce ( F f,
std::string_view columnName = "" )

Execute a user-defined reduce operation on the values of a column.

Template Parameters
F: The type of the reduce callable. Automatically deduced.
T: The type of the column to apply the reduction to. Automatically deduced.
Parameters
[in] f: A callable with signature T(T,T)
[in] columnName: The column to be reduced. If omitted, the first default column is used instead.
Returns
the reduced quantity wrapped in a ROOT::RDF::RResultPtr.

A reduction takes two values of a column and merges them into one (e.g. by summing them, taking the maximum, etc). This action performs the specified reduction operation on all processed column values, returning a single value of the same type. The callable f must satisfy the general requirements of a processing function besides having signature T(T,T) where T is the type of column columnName.

The returned reduced value of each thread (e.g. the initial value of a sum) is initialized to a default-constructed T object. This is commonly expected to be the neutral/identity element for the specific reduction operation f (e.g. 0 for a sum, 1 for a product). If a default-constructed T does not satisfy this requirement, users should explicitly specify an initialization value for T by calling the appropriate Reduce overload.

Example usage:

auto sumOfIntCol = d.Reduce([](int x, int y) { return x + y; }, "intCol");

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.
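
To see why the initial value should be the identity element of the reduction, consider this plain C++ model of a multi-slot reduction. It is a sketch, not ROOT's implementation: reduceInSlots is a hypothetical name, entries are dealt to slots round-robin for simplicity, and the way ROOT actually distributes entries and merges partial results is not described here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Each of nSlots partial results starts from `identity`; entries are dealt to
// the slots round-robin, then the partial results are merged with f.
template <typename T, typename F>
T reduceInSlots(const std::vector<T> &values, std::size_t nSlots, F f, T identity)
{
   nSlots = std::max<std::size_t>(nSlots, 1);
   std::vector<T> partial(nSlots, identity);
   for (std::size_t i = 0; i < values.size(); ++i) {
      T &acc = partial[i % nSlots];
      acc = f(acc, values[i]);
   }
   // Merge the per-slot results into one value.
   T result = partial[0];
   for (std::size_t s = 1; s < nSlots; ++s)
      result = f(result, partial[s]);
   return result;
}
```

Summing {1, 2, 3, 4} with initial value 0 gives 10 for any number of slots. With initial value 42 the model gives 52 on one slot but 94 on two, because the non-neutral value is folded in once per slot.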

Definition at line 1859 of file RInterface.hxx.

◆ Report()

RResultPtr< RCutFlowReport > ROOT::RDF::Report ( )

Gather filtering statistics.

Returns
the resulting RCutFlowReport instance wrapped in a RResultPtr.

Calling Report on the main RDataFrame object gathers stats for all named filters in the call graph. Calling this method on a stored chain state (i.e. a graph node different from the first) gathers the stats for all named filters in the chain section between the original RDataFrame and that node (included). Stats are gathered in the same order as the named filters have been added to the graph. A RResultPtr<RCutFlowReport> is returned to allow inspection of the effects cuts had.

This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.

Example usage:

auto filtered = d.Filter(cut1, {"b1"}, "Cut1").Filter(cut2, {"b2"}, "Cut2");
auto cutReport = filtered.Report();
cutReport->Print();
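
The bookkeeping behind such a report can be pictured with a small stand-alone model: each named cut counts the entries that reach it and the entries that pass it, and an entry reaches a cut only if it passed all previous ones. CutStats, NamedCut and cutFlow are hypothetical names for this sketch and do not mirror the RCutFlowReport API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct CutStats {
   std::string name;
   std::size_t all;  // entries that reached this cut
   std::size_t pass; // entries that passed it
};

using NamedCut = std::pair<std::string, std::function<bool(int)>>;

// Apply the cuts in the order given; stop at the first cut an entry fails.
std::vector<CutStats> cutFlow(const std::vector<int> &entries, const std::vector<NamedCut> &cuts)
{
   std::vector<CutStats> stats;
   for (const auto &c : cuts)
      stats.push_back(CutStats{c.first, 0, 0});
   for (int e : entries) {
      for (std::size_t i = 0; i < cuts.size(); ++i) {
         ++stats[i].all;
         if (!cuts[i].second(e))
            break;
         ++stats[i].pass;
      }
   }
   return stats;
}
```

For entries 1 to 10 with cuts "x > 2" then "x is even", the first cut sees 10 entries and passes 8, and the second sees those 8 and passes 4.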

Definition at line 3428 of file RInterface.hxx.

◆ RInterface()

ROOT::RDF::RInterface ( const std::shared_ptr< Proxied > & proxied,
RLoopManager & lm,
const RDFInternal::RColumnRegister & colRegister )
protected

Definition at line 3911 of file RInterface.hxx.

◆ RunGraphs()

unsigned int ROOT::RDF::RunGraphs ( std::vector< RResultHandle > handles)

Run the event loops of multiple RDataFrames concurrently.

Parameters
[in] handles: A vector of RResultHandles whose event loops should be run.
Returns
The number of distinct computation graphs that have been processed.

This function triggers the event loop of all computation graphs which relate to the given RResultHandles. The advantage compared to running the event loop implicitly by accessing the RResultPtr is that the event loops will run concurrently. Therefore, the overall computation of all results can be scheduled more efficiently. It should be noted that user-defined operations (e.g., Filters and Defines) of the different RDataFrame graphs are assumed to be safe to call concurrently. RDataFrame will pass slot numbers in the range [0, NThread-1] to all helpers used in nodes such as DefineSlot. NThread is the number of threads ROOT was configured with in EnableImplicitMT(). Slot numbers are unique across all graphs, so no two tasks with the same slot number will run concurrently. Note that it is not guaranteed that each slot number will be reached in every graph.

ROOT::RDataFrame df1("tree1", "file1.root");
auto r1 = df1.Histo1D("var1");
ROOT::RDataFrame df2("tree2", "file2.root");
auto r2 = df2.Sum("var2");
// RResultPtr -> RResultHandle conversion is automatic
ROOT::RDF::RunGraphs({r1, r2});

Definition at line 101 of file RDFHelpers.cxx.

◆ SaveGraph() [1/2]

template<typename NodeType>
std::string ROOT::RDF::SaveGraph ( NodeType node)

Create a graphviz representation of the dataframe computation graph, return it as a string.

Parameters
[in] node: any node of the graph. Called on the head (first) node, it prints the entire graph. Otherwise, only the branch the node belongs to.

The output can be displayed with a command akin to dot -Tpng output.dot > output.png && open output.png.

Note that "hanging" Defines, i.e. Defines without downstream nodes, will not be displayed by SaveGraph as they are effectively optimized away from the computation graph.

Note that SaveGraph is not thread-safe and must not be called concurrently from different threads.

Definition at line 120 of file RDFHelpers.hxx.

◆ SaveGraph() [2/2]

template<typename NodeType>
void ROOT::RDF::SaveGraph ( NodeType node,
const std::string & outputFile )

Create a graphviz representation of the dataframe computation graph, write it to the specified file.

Parameters
[in] node: any node of the graph. Called on the head (first) node, it prints the entire graph. Otherwise, only the branch the node belongs to.
[in] outputFile: file where to save the representation.

The output can be displayed with a command akin to dot -Tpng output.dot > output.png && open output.png.

Note that "hanging" Defines, i.e. Defines without downstream nodes, will not be displayed by SaveGraph as they are effectively optimized away from the computation graph.

Note that SaveGraph is not thread-safe and must not be called concurrently from different threads.

Definition at line 139 of file RDFHelpers.hxx.

◆ Snapshot() [1/4]

template<typename... ColumnTypes>
RResultPtr< RInterface< RLoopManager > > ROOT::RDF::Snapshot ( std::string_view treename,
std::string_view filename,
const ColumnNames_t & columnList,
const RSnapshotOptions & options = RSnapshotOptions() )

Definition at line 1317 of file RInterface.hxx.

◆ Snapshot() [2/4]

RResultPtr< RInterface< RLoopManager > > ROOT::RDF::Snapshot ( std::string_view treename,
std::string_view filename,
const ColumnNames_t & columnList,
const RSnapshotOptions & options = RSnapshotOptions() )

Save selected columns to disk, in a new TTree or RNTuple treename in file filename.

Parameters
[in] treename: The name of the output TTree or RNTuple.
[in] filename: The name of the output TFile.
[in] columnList: The list of names of the columns/branches/fields to be written.
[in] options: RSnapshotOptions struct with extra options to pass to TFile and TTree/RNTuple.
Returns
a RDataFrame that wraps the snapshotted dataset.

This function returns a RDataFrame built with the output TTree or RNTuple as a source. The types of the columns are automatically inferred and do not need to be specified.

Support for writing of nested branches/fields is limited (although RDataFrame is able to read them) and dot ('.') characters in input column names will be replaced by underscores ('_') in the branches produced by Snapshot. When writing a variable size array through Snapshot, it is required that the column indicating its size is also written out and it appears before the array in the columnList.

By default, in case of TTree, TChain or RNTuple inputs, Snapshot will try to write out all top-level branches. For other types of inputs, all columns returned by GetColumnNames() will be written out. Systematic variations of columns will be included if the corresponding flag is set in RSnapshotOptions. See Snapshot with Variations for more details. If friend trees or chains are present, by default all friend top-level branches that have names that do not collide with names of branches in the main TTree/TChain will be written out. Since v6.24, Snapshot will also write out friend branches with the same names of branches in the main TTree/TChain with names of the form <friendname>_<branchname> in order to differentiate them from the branches in the main tree/chain.

Writing to a sub-directory

Snapshot supports writing the TTree or RNTuple in a sub-directory inside the TFile. It is sufficient to specify the directory path as part of the TTree or RNTuple name, e.g. df.Snapshot("subdir/t", "f.root") writes TTree t in the sub-directory subdir of file f.root (creating file and sub-directory as needed).
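
The naming convention can be pictured as splitting the given name at its last slash. The helper below is a hypothetical stand-alone sketch of that convention, not the code Snapshot uses internally.

```cpp
#include <string>
#include <utility>

// Split "a/b/t" into the directory path "a/b" and the object name "t".
// A name without any '/' has an empty directory part.
std::pair<std::string, std::string> splitDirAndName(const std::string &fullName)
{
   const auto pos = fullName.rfind('/');
   if (pos == std::string::npos)
      return {std::string(), fullName};
   return {fullName.substr(0, pos), fullName.substr(pos + 1)};
}
```

For "subdir/t" this yields the directory "subdir" and the object name "t", matching the example above.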

Attention
In multi-thread runs (i.e. when EnableImplicitMT() has been called) threads will loop over clusters of entries in an undefined order, so Snapshot will produce outputs in which (clusters of) entries will be shuffled with respect to the input TTree. Using such "shuffled" TTrees as friends of the original trees would result in wrong associations between entries in the main TTree and entries in the "shuffled" friend. Since v6.22, ROOT will error out if such a "shuffled" TTree is used in a friendship.
Note
In case no events are written out (e.g. because no event passes all filters), Snapshot will still write the requested output TTree or RNTuple to the file, with all the branches requested to preserve the dataset schema.
Snapshot will refuse to process columns with names of the form #columnname. These are special columns made available by some data sources (e.g. RNTupleDS) that represent the size of column columnname, and are not meant to be written out with that name (which is not a valid C++ variable name). Instead, go through an Alias(): df.Alias("nbar", "#bar").Snapshot(..., {"nbar"}).

Example invocations:

// No need to specify column types, they are automatically deduced thanks
// to information coming from the data source
df.Snapshot("outputTree", "outputFile.root", {"x", "y"});

To book a Snapshot without triggering the event loop, one needs to set the appropriate flag in RSnapshotOptions:

ROOT::RDF::RSnapshotOptions opts;
opts.fLazy = true;
df.Snapshot("outputTree", "outputFile.root", {"x"}, opts);

To snapshot to the RNTuple data format, the fOutputFormat option in RSnapshotOptions needs to be set accordingly:

ROOT::RDF::RSnapshotOptions opts;
opts.fOutputFormat = ROOT::RDF::ESnapshotOutputFormat::kRNTuple;
df.Snapshot("outputNTuple", "outputFile.root", {"x"}, opts);

To snapshot the systematic variations resulting from a Vary() call, set fIncludeVariations in RSnapshotOptions:

ROOT::RDF::RSnapshotOptions opts;
opts.fIncludeVariations = true;
df.Snapshot("outputTree", "outputFile.root", {"x"}, opts);

Definition at line 1398 of file RInterface.hxx.

◆ Snapshot() [3/4]

RResultPtr< RInterface< RLoopManager > > ROOT::RDF::Snapshot ( std::string_view treename,
std::string_view filename,
std::initializer_list< std::string > columnList,
const RSnapshotOptions & options = RSnapshotOptions() )

Save selected columns to disk, in a new TTree or RNTuple treename in file filename.

Parameters
[in] treename: The name of the output TTree or RNTuple.
[in] filename: The name of the output TFile.
[in] columnList: The list of names of the columns/branches to be written.
[in] options: RSnapshotOptions struct with extra options to pass to TFile and TTree/RNTuple.
Returns
a RDataFrame that wraps the snapshotted dataset.

This function returns a RDataFrame built with the output TTree or RNTuple as a source. The types of the columns are automatically inferred and do not need to be specified.

See Snapshot(std::string_view, std::string_view, const ColumnNames_t&, const RSnapshotOptions &) for a more complete description and example usages.

Definition at line 1590 of file RInterface.hxx.

◆ Snapshot() [4/4]

RResultPtr< RInterface< RLoopManager > > ROOT::RDF::Snapshot ( std::string_view treename,
std::string_view filename,
std::string_view columnNameRegexp = "",
const RSnapshotOptions & options = RSnapshotOptions() )

Save selected columns to disk, in a new TTree or RNTuple treename in file filename.

Parameters
[in] treename: The name of the output TTree or RNTuple.
[in] filename: The name of the output TFile.
[in] columnNameRegexp: The regular expression to match the column names to be selected. A leading '^' and a trailing '$' are implicitly assumed if they are not specified. The dialect supported is PCRE via the TPRegexp class. An empty string signals the selection of all columns.
[in] options: RSnapshotOptions struct with extra options to pass to TFile and TTree/RNTuple.
Returns
a RDataFrame that wraps the snapshotted dataset.

This function returns a RDataFrame built with the output TTree or RNTuple as a source. The types of the columns are automatically inferred and do not need to be specified.

See Snapshot(std::string_view, std::string_view, const ColumnNames_t&, const RSnapshotOptions &) for a more complete description and example usages.
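
The effect of the implicit anchoring on column selection can be tried out with the standard library. The sketch below is an approximation under stated assumptions: selectColumns is a hypothetical helper, and it uses std::regex (ECMAScript dialect) instead of PCRE via TPRegexp, so only simple patterns behave identically.

```cpp
#include <regex>
#include <string>
#include <vector>

// Keep the names that match `pattern` as a whole: a leading '^' and a trailing '$'
// are added when missing. An empty pattern selects every column.
std::vector<std::string> selectColumns(const std::vector<std::string> &names, std::string pattern)
{
   if (pattern.empty())
      return names;
   if (pattern.front() != '^')
      pattern.insert(pattern.begin(), '^');
   if (pattern.back() != '$')
      pattern.push_back('$');
   const std::regex re(pattern);
   std::vector<std::string> selected;
   for (const auto &name : names)
      if (std::regex_search(name, re))
         selected.push_back(name);
   return selected;
}
```

With columns x, xy and y, the pattern "x" selects only x, whereas "x.*" selects both x and xy.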

Definition at line 1537 of file RInterface.hxx.

◆ splitInEqualRanges()

void ROOT::RDF::splitInEqualRanges ( std::vector< std::pair< ULong64_t, ULong64_t > > & ranges,
int nRecords,
unsigned int nSlots )

Definition at line 519 of file RArrowDS.cxx.

◆ Stats() [1/2]

template<typename V = RDFDetail::RInferredType, typename W = RDFDetail::RInferredType>
RResultPtr< TStatistic > ROOT::RDF::Stats ( std::string_view value,
std::string_view weight )

Return a TStatistic object, filled once per event (lazy action).

Template Parameters
V: The type of the value column
W: The type of the weight column
Parameters
[in] value: The name of the column with the values to fill the statistics with.
[in] weight: The name of the column with the weights to fill the statistics with.
Returns
the filled TStatistic object wrapped in a RResultPtr.

Example usage:

// Deduce column types (this invocation needs jitting internally)
auto stats0 = myDf.Stats("values", "weights");
// Explicit column types
auto stats1 = myDf.Stats<int, float>("values", "weights");
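
As an illustration of what filling a statistics object with value/weight pairs amounts to, the following stand-alone sketch accumulates the sum of weights and the weighted sum of values and derives the weighted mean from them. weightedMean is a hypothetical helper and does not reproduce the TStatistic interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Weighted mean: sum(w_i * x_i) / sum(w_i). Returns 0 when the weights sum to zero.
double weightedMean(const std::vector<double> &values, const std::vector<double> &weights)
{
   double sumW = 0.;
   double sumWX = 0.;
   const std::size_t n = std::min(values.size(), weights.size());
   for (std::size_t i = 0; i < n; ++i) {
      sumW += weights[i];
      sumWX += weights[i] * values[i];
   }
   return sumW != 0. ? sumWX / sumW : 0.;
}
```

For values {1, 3} with weights {1, 3}, the weighted mean is (1 + 9) / 4 = 2.5.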

Definition at line 3228 of file RInterface.hxx.

◆ Stats() [2/2]

template<typename V = RDFDetail::RInferredType>
RResultPtr< TStatistic > ROOT::RDF::Stats ( std::string_view value = "")

Return a TStatistic object, filled once per event (lazy action).

Template Parameters
V: The type of the value column
Parameters
[in] value: The name of the column with the values to fill the statistics with.
Returns
the filled TStatistic object wrapped in a RResultPtr.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto stats0 = myDf.Stats("values");
// Explicit column type
auto stats1 = myDf.Stats<float>("values");

Definition at line 3196 of file RInterface.hxx.

◆ StdDev()

template<typename T = RDFDetail::RInferredType>
RResultPtr< double > ROOT::RDF::StdDev ( std::string_view columnName = "")

Return the unbiased standard deviation of processed column values (lazy action).

Template Parameters
T: The type of the branch/column.
Parameters
[in] columnName: The name of the branch/column to be treated.
Returns
the standard deviation value of the selected column wrapped in a RResultPtr.

If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto stdDev0 = myDf.StdDev("values");
// Explicit column type
auto stdDev1 = myDf.StdDev<double>("values");
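
"Unbiased" here refers to dividing the sum of squared deviations by n - 1 rather than n. A minimal stand-alone sketch of that formula follows; unbiasedStdDev is a hypothetical helper, and returning 0 for fewer than two values is a choice made for this example only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standard deviation with the n - 1 denominator (Bessel's correction).
double unbiasedStdDev(const std::vector<double> &values)
{
   const std::size_t n = values.size();
   if (n < 2)
      return 0.;
   double mean = 0.;
   for (double v : values)
      mean += v;
   mean /= n;
   double sumSq = 0.;
   for (double v : values)
      sumSq += (v - mean) * (v - mean);
   return std::sqrt(sumSq / (n - 1));
}
```

For the values {2, 4, 6} the mean is 4, the squared deviations sum to 8, and the result is sqrt(8 / 2) = 2.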

Definition at line 3363 of file RInterface.hxx.

◆ Sum()

template<typename T = RDFDetail::RInferredType>
RResultPtr< RDFDetail::SumReturnType_t< T > > ROOT::RDF::Sum ( std::string_view columnName = "",
const RDFDetail::SumReturnType_t< T > & initValue = RDFDetail::SumReturnType_t<T>{} )

Return the sum of processed column values (lazy action).

Template Parameters
T: The type of the branch/column.
Parameters
[in] columnName: The name of the branch/column.
[in] initValue: Optional initial value for the sum. If not present, the column values must be default-constructible.
Returns
the sum of the selected column wrapped in a RResultPtr.

If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method. If the type of the column is inferred, the return type is double, the type of the column otherwise.

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Example usage:

// Deduce column type (this invocation needs jitting internally)
auto sum0 = myDf.Sum("values");
// Explicit column type
auto sum1 = myDf.Sum<double>("values");

Definition at line 3395 of file RInterface.hxx.

◆ Take()

template<typename T, typename COLL = std::vector<T>>
RResultPtr< COLL > ROOT::RDF::Take ( std::string_view column = "")

Return a collection of values of a column (lazy action, returns a std::vector by default).

Template Parameters
T: The type of the column.
COLL: The type of collection used to store the values.
Parameters
[in] column: The name of the column to collect the values of.
Returns
the content of the selected column wrapped in a RResultPtr.

The collection type to be specified for C-style array columns is RVec<T>: in this case the returned collection is a std::vector<RVec<T>>.

Example usage:

// In this case intCol is a std::vector<int>
auto intCol = rdf.Take<int>("integerColumn");
// Same content as above but in this case taken as a RVec<int>
auto intColAsRVec = rdf.Take<int, RVec<int>>("integerColumn");
// In this case intCol is a std::vector<RVec<int>>, a collection of collections
auto cArrayIntCol = rdf.Take<RVec<int>>("cArrayInt");

This action is lazy: upon invocation of this method the calculation is booked but not executed. Also see RResultPtr.

Definition at line 1932 of file RInterface.hxx.

◆ Vary() [1/4]

RInterface< Proxied > ROOT::RDF::Vary ( const std::vector< std::string > & colNames,
std::string_view expression,
const std::vector< std::string > & variationTags,
std::string_view variationName )

Register systematic variations for multiple existing columns using custom variation tags.

Parameters
[in] colNames: set of names of the columns for which varied values are provided.
[in] expression: a string containing valid C++ code that evaluates to an RVec of RVecs containing the varied values for the specified columns.
[in] variationTags: names for each of the varied values, e.g. "up" and "down".
[in] variationName: a generic name for this set of varied values, e.g. "ptvariation".

This overload adds the possibility for the expression used to evaluate the varied values to be just-in-time compiled. The example below shows how Vary() is used while dealing with multiple columns. The tags are defined as {"down", "up"}.

auto nominal_hx =
df.Vary({"x", "y"}, "ROOT::RVec<ROOT::RVecD>{{x*0.9, x*1.1}, {y*0.9, y*1.1}}", {"down", "up"}, "xy")
.Histo1D("x", "y");
auto hx = ROOT::RDF::Experimental::VariationsFor(nominal_hx);
hx["nominal"].Draw();
hx["xy:down"].Draw("SAME");
hx["xy:up"].Draw("SAME");

Short-hand expression syntax

For convenience, when a C++ expression is passed to Vary, the return type can be omitted if the string begins with '{' and ends with '}' (whitespace, tab and newline characters are excluded from the search). This means that the following is equivalent to the example above:

auto nominal_hx =
df.Vary({"x", "y"}, "{{x*0.9, x*1.1}, {y*0.9, y*1.1}}", {"down", "up"}, "xy")
// Same as above

or also:

auto nominal_hx =
df.Vary({"x", "y"}, R"(
{
{x*0.9, x*1.1}, // x variations
{y*0.9, y*1.1} // y variations
}
)", {"down", "up"}, "xy")
// Same as above
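
The condition for the short-hand form (first and last non-whitespace characters are '{' and '}') can be written down as a small check. This is a stand-alone sketch with a hypothetical name, isBracedShorthand; it only models the rule as stated above and says nothing about how ROOT rewrites the expression afterwards.

```cpp
#include <string_view>

// True if, ignoring leading/trailing spaces, tabs and newlines,
// the expression starts with '{' and ends with '}'.
bool isBracedShorthand(std::string_view expr)
{
   constexpr std::string_view blanks = " \t\n";
   const auto first = expr.find_first_not_of(blanks);
   if (first == std::string_view::npos)
      return false; // empty or whitespace-only expression
   const auto last = expr.find_last_not_of(blanks);
   return expr[first] == '{' && expr[last] == '}';
}
```

The raw-string example above qualifies because the surrounding newlines are skipped, while an expression that spells out its type, such as "ROOT::RVecD{pt*0.9, pt*1.1}", does not.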
Note
See also This Vary() overload for more information.

Definition at line 1272 of file RInterface.hxx.

◆ Vary() [2/4]

RInterface< Proxied > ROOT::RDF::Vary ( const std::vector< std::string > & colNames,
std::string_view expression,
std::size_t nVariations,
std::string_view variationName )

Register systematic variations for multiple existing columns using auto-generated variation tags.

Parameters
[in] colNames: set of names of the columns for which varied values are provided.
[in] expression: a string containing valid C++ code that evaluates to an RVec of RVecs containing the varied values for the specified columns.
[in] nVariations: number of variations returned by the expression. The corresponding tags will be "0", "1", etc.
[in] variationName: a generic name for this set of varied values, e.g. "ptvariation".

This overload adds the possibility for the expression used to evaluate the varied values to be just-in-time compiled. It takes an nVariations parameter instead of a list of tag names. The varied results will be accessible via the keys of the dictionary with the form variationName:N where N is the corresponding sequential tag starting at 0 and going up to nVariations - 1. The example below shows how Vary() is used while dealing with multiple columns.

auto nominal_hx =
df.Vary({"x", "y"}, "ROOT::RVec<ROOT::RVecD>{{x*0.9, x*1.1}, {y*0.9, y*1.1}}", 2, "xy")
.Histo1D("x", "y");
auto hx = ROOT::RDF::Experimental::VariationsFor(nominal_hx);
hx["nominal"].Draw();
hx["xy:0"].Draw("SAME");
hx["xy:1"].Draw("SAME");
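
The naming scheme for auto-generated tags is simple enough to spell out. The helper below is a hypothetical stand-alone sketch that builds the variationName:N keys for N from 0 to nVariations - 1; the "nominal" key used in the example above is not part of its output.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Keys of the varied results: "<variationName>:0", "<variationName>:1", ...
std::vector<std::string> variationKeys(const std::string &variationName, std::size_t nVariations)
{
   std::vector<std::string> keys;
   keys.reserve(nVariations);
   for (std::size_t i = 0; i < nVariations; ++i)
      keys.push_back(variationName + ":" + std::to_string(i));
   return keys;
}
```

For variationName "xy" and two variations this produces "xy:0" and "xy:1", the keys used to index the results above.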

Short-hand expression syntax

For convenience, when a C++ expression is passed to Vary, the return type can be omitted if the string begins with '{' and ends with '}' (whitespace, tab and newline characters are excluded from the search). This means that the following is equivalent to the example above:

auto nominal_hx =
df.Vary({"x", "y"}, "{{x*0.9, x*1.1}, {y*0.9, y*1.1}}", 2, "xy")
// Same as above

or also:

auto nominal_hx =
df.Vary({"x", "y"}, R"(
{
{x*0.9, x*1.1}, // x variations
{y*0.9, y*1.1} // y variations
}
)", 2, "xy")
// Same as above
Note
See also This Vary() overload for more information.

Definition at line 1195 of file RInterface.hxx.

◆ Vary() [3/4]

RInterface< Proxied > ROOT::RDF::Vary ( std::initializer_list< std::string > colNames,
std::string_view expression,
std::size_t nVariations,
std::string_view variationName )

Register systematic variations for multiple existing columns using auto-generated variation tags.

Parameters
[in] colNames: set of names of the columns for which varied values are provided.
[in] expression: a string containing valid C++ code that evaluates to an RVec of RVecs containing the varied values for the specified columns.
[in] nVariations: number of variations returned by the expression. The corresponding tags will be "0", "1", etc.
[in] variationName: a generic name for this set of varied values, e.g. "ptvariation".
Note
This overload avoids the ambiguity, under C++20, between constructing a std::string and a std::vector<std::string> from an initializer list.
See also This Vary() overload for more information.

Definition at line 1219 of file RInterface.hxx.

◆ Vary() [4/4]

RInterface< Proxied > ROOT::RDF::Vary ( std::string_view colName,
std::string_view expression,
std::size_t nVariations,
std::string_view variationName = "" )

Register systematic variations for a single existing column using auto-generated variation tags.

Parameters
[in] colName: name of the column for which varied values are provided.
[in] expression: a string containing valid C++ code that evaluates to an RVec containing the varied values for the specified column.
[in] nVariations: number of variations returned by the expression. The corresponding tags will be "0", "1", etc.
[in] variationName: a generic name for this set of varied values, e.g. "ptvariation". colName is used if none is provided.

This overload adds the possibility for the expression used to evaluate the varied values to be just-in-time compiled. The example below shows how Vary() is used while dealing with a single column. The variation tags are auto-generated.

auto nominal_hx =
df.Vary("pt", "ROOT::RVecD{pt*0.9, pt*1.1}", 2)
.Histo1D("pt");
auto hx = ROOT::RDF::Experimental::VariationsFor(nominal_hx);
hx["nominal"].Draw();
hx["pt:0"].Draw("SAME");
hx["pt:1"].Draw("SAME");

Short-hand expression syntax

For convenience, when a C++ expression is passed to Vary, the return type can be omitted if the string begins with '{' and ends with '}' (whitespace, tab and newline characters are excluded from the search). This means that the following is equivalent to the example above:

auto nominal_hx =
df.Vary("pt", "{pt*0.9, pt*1.1}", 2)
// Same as above
Note
See also This Vary() overload for more information.

Definition at line 106 of file RInterface.hxx.

◆ VaryImpl()

template<bool IsSingleColumn, typename F>
RInterface< Proxied > ROOT::RDF::VaryImpl ( const std::vector< std::string > & colNames,
F && expression,
const ColumnNames_t & inputColumns,
const std::vector< std::string > & variationTags,
std::string_view variationName )
private

Definition at line 3820 of file RInterface.hxx.