The head node of an RDF computation graph.
This class is responsible for running the event loop.
Definition at line 118 of file RLoopManager.hxx.
Private Types | |
using | ColumnNames_t = std::vector<std::string> |
enum class | ELoopType { kInvalid , kNoFiles , kNoFilesMT , kDataSource , kDataSourceMT } |
Private Member Functions | |
void | CleanUpNodes () |
Perform clean-up operations. To be called at the end of each event loop. | |
void | CleanUpTask (TTreeReader *r, unsigned int slot) |
Perform clean-up operations. To be called at the end of each task execution. | |
void | EvalChildrenCounts () |
Trigger counting of number of children nodes for each node of the functional graph. | |
void | InitNodes () |
Initialize all nodes of the functional graph before running the event loop. | |
void | InitNodeSlots (TTreeReader *r, unsigned int slot) |
Build TTreeReaderValues for all nodes. This method loops over all filters, actions and other booked objects and calls their InitSlot method, to get them ready for running a task. | |
void | RunAndCheckFilters (unsigned int slot, Long64_t entry) |
Execute actions and make sure named filters are called for each event. | |
void | RunDataSource () |
Run event loop over data accessed through a DataSource, in sequence. | |
void | RunDataSourceMT () |
Run event loop over data accessed through a DataSource, in parallel. | |
void | RunEmptySource () |
Run event loop with no source files, in sequence. | |
void | RunEmptySourceMT () |
Run event loop with no source files, in parallel. | |
void | SetupSampleCallbacks (TTreeReader *r, unsigned int slot) |
std::shared_ptr< ROOT::Internal::RSlotStack > | SlotStack () const |
Create a slot stack with the desired number of slots or reuse a shared instance. | |
void | UpdateSampleInfo (unsigned int slot, const std::pair< ULong64_t, ULong64_t > &range) |
void | UpdateSampleInfo (unsigned int slot, TTreeReader &r) |
Private Attributes | |
Long64_t | fBeginEntry {0} |
std::vector< RDFInternal::RActionBase * > | fBookedActions |
Non-owning pointers to actions to be run. | |
std::vector< RDefineBase * > | fBookedDefines |
std::vector< RFilterBase * > | fBookedFilters |
std::vector< RFilterBase * > | fBookedNamedFilters |
Contains a subset of fBookedFilters, i.e. only the named filters. | |
std::vector< RRangeBase * > | fBookedRanges |
std::vector< RDFInternal::RVariationBase * > | fBookedVariations |
ROOT::Internal::RDF::RStringCache | fCachedColNames |
std::vector< RDFInternal::RCallback > | fCallbacksEveryNEvents |
Registered callbacks to be executed every N events. | |
std::vector< RDFInternal::ROneTimeCallback > | fCallbacksOnce |
Registered callbacks to invoke just once before running the loop. | |
std::vector< std::unordered_map< std::string, std::unique_ptr< RColumnReaderBase > > > | fDatasetColumnReaders |
Readers for TTree/RDataSource columns (one per slot), shared by all nodes in the computation graph. | |
std::unique_ptr< RDataSource > | fDataSource {} |
Owning pointer to a data-source object. | |
ColumnNames_t | fDefaultColumns |
std::pair< ULong64_t, ULong64_t > | fEmptyEntryRange {} |
Range of entries created when no data source is specified. | |
Long64_t | fEndEntry {std::numeric_limits<Long64_t>::max()} |
ELoopType | fLoopType {ELoopType::kInvalid} |
The kind of event loop that is going to be run (e.g. on ROOT files, on no files) | |
bool | fMustRunNamedFilters {true} |
RDFInternal::RNewSampleNotifier | fNewSampleNotifier |
unsigned int | fNRuns {0} |
Number of event loops run. | |
unsigned int | fNSlots {1} |
std::vector< RDFInternal::RActionBase * > | fRunActions |
Non-owning pointers to actions already run. | |
std::unordered_map< void *, ROOT::RDF::SampleCallback_t > | fSampleCallbacks |
Registered callbacks to call at the beginning of each "data block". | |
std::vector< ROOT::RDF::RSampleInfo > | fSampleInfos |
std::unordered_map< std::string, ROOT::RDF::Experimental::RSample * > | fSampleMap |
Keys are fname + "/" + treename as RSampleInfo::fID; Values are pointers to the corresponding sample. | |
std::vector< ROOT::RDF::Experimental::RSample > | fSamples |
Samples need to survive throughout the whole event loop, hence stored as an attribute. | |
std::weak_ptr< ROOT::Internal::RSlotStack > | fSlotStack |
Pointer to a shared slot stack in case this instance runs concurrently with others. | |
std::set< std::string > | fSuppressErrorsForMissingBranches {} |
std::any | fTTreeLifeline {} |
std::set< std::pair< std::string_view, std::unique_ptr< ROOT::Internal::RDF::RDefinesWithReaders > > > | fUniqueDefinesWithReaders |
std::set< std::pair< std::string_view, std::unique_ptr< ROOT::Internal::RDF::RVariationsWithReaders > > > | fUniqueVariationsWithReaders |
ColumnNames_t | fValidBranchNames |
Cache of the tree/chain branch names. Never access directly, always use GetBranchNames(). | |
Friends | |
struct | RCallCleanUpTask |
struct | ROOT::Internal::RDF::RDSRangeRAII |
Additional Inherited Members | |
RLoopManager * | fLoopManager |
unsigned int | fNChildren {0} |
Number of nodes of the functional graph hanging from this object. | |
unsigned int | fNStopsReceived {0} |
Number of times that a child node signaled to stop processing entries. | |
std::vector< std::string > | fVariations |
List of systematic variations that affect this node. | |
#include <ROOT/RDF/RLoopManager.hxx>
|
private |
Definition at line 119 of file RLoopManager.hxx.
|
strong private |
Enumerator | |
---|---|
kInvalid | |
kNoFiles | |
kNoFilesMT | |
kDataSource | |
kDataSourceMT |
Definition at line 120 of file RLoopManager.hxx.
ROOT::Detail::RDF::RLoopManager::RLoopManager | ( | const ColumnNames_t & | defaultColumns = {} | ) |
RLoopManager::RLoopManager | ( | TTree * | tree, |
const ColumnNames_t & | defaultBranches ) |
Definition at line 362 of file RLoopManager.cxx.
RLoopManager::RLoopManager | ( | ULong64_t | nEmptyEntries | ) |
Definition at line 374 of file RLoopManager.cxx.
RLoopManager::RLoopManager | ( | std::unique_ptr< RDataSource > | ds, |
const ColumnNames_t & | defaultBranches ) |
Definition at line 384 of file RLoopManager.cxx.
RLoopManager::RLoopManager | ( | ROOT::RDF::Experimental::RDatasetSpec && | spec | ) |
Definition at line 396 of file RLoopManager.cxx.
|
delete |
|
delete |
|
override default |
RColumnReaderBase * RLoopManager::AddDataSourceColumnReader | ( | unsigned int | slot, |
std::string_view | col, | ||
const std::type_info & | ti, | ||
TTreeReader * | treeReader ) |
Definition at line 1142 of file RLoopManager.cxx.
void RLoopManager::AddDataSourceColumnReaders | ( | std::string_view | col, |
std::vector< std::unique_ptr< RColumnReaderBase > > && | readers, | ||
const std::type_info & | ti ) |
Definition at line 1113 of file RLoopManager.cxx.
|
inline final virtual |
End of recursive chain of calls, does nothing.
Implements ROOT::Detail::RDF::RNodeBase.
Definition at line 261 of file RLoopManager.hxx.
void RLoopManager::AddSampleCallback | ( | void * | nodePtr, |
ROOT::RDF::SampleCallback_t && | callback ) |
Definition at line 1166 of file RLoopManager.cxx.
RColumnReaderBase * RLoopManager::AddTreeColumnReader | ( | unsigned int | slot, |
std::string_view | col, | ||
std::unique_ptr< RColumnReaderBase > && | reader, | ||
const std::type_info & | ti ) |
Register a new RTreeColumnReader with this RLoopManager.
Definition at line 1129 of file RLoopManager.cxx.
Definition at line 1177 of file RLoopManager.cxx.
void RLoopManager::ChangeSpec | ( | ROOT::RDF::Experimental::RDatasetSpec && | spec | ) |
Changes the internal TTree held by the RLoopManager.
spec | The specification of the dataset to be adopted. |
Definition at line 451 of file RLoopManager.cxx.
Implements ROOT::Detail::RDF::RNodeBase.
Definition at line 1019 of file RLoopManager.cxx.
|
private |
Perform clean-up operations. To be called at the end of each event loop.
Definition at line 809 of file RLoopManager.cxx.
|
private |
Perform clean-up operations. To be called at the end of each task execution.
Definition at line 833 of file RLoopManager.cxx.
void ROOT::Detail::RDF::RLoopManager::DataSourceThreadTask | ( | const std::pair< ULong64_t, ULong64_t > & | entryRange, |
ROOT::Internal::RSlotStack & | slotStack, | ||
std::atomic< ULong64_t > & | entryCount ) |
The task run by every thread on the input entry range, for the generic RDataSource.
Definition at line 1332 of file RLoopManager.cxx.
void RLoopManager::Deregister | ( | RDefineBase * | definePtr | ) |
Definition at line 1002 of file RLoopManager.cxx.
void RLoopManager::Deregister | ( | RDFInternal::RActionBase * | actionPtr | ) |
Definition at line 965 of file RLoopManager.cxx.
void RLoopManager::Deregister | ( | RDFInternal::RVariationBase * | varPtr | ) |
Definition at line 1013 of file RLoopManager.cxx.
void RLoopManager::Deregister | ( | RFilterBase * | filterPtr | ) |
Definition at line 981 of file RLoopManager.cxx.
void RLoopManager::Deregister | ( | RRangeBase * | rangePtr | ) |
Definition at line 992 of file RLoopManager.cxx.
|
inline |
Definition at line 306 of file RLoopManager.hxx.
|
private |
Trigger counting of number of children nodes for each node of the functional graph.
This is done once before starting the event loop. Each action sends an increase-children-count signal upstream, which is propagated until RLoopManager. Each time a node receives the signal, it increments its children counter. Each node only propagates the signal once, even if it receives it multiple times. Named filters also send an increase-children-count signal, just like actions, as they always execute during the event loop so the graph branch they belong to must count as active even if it does not end in an action.
Definition at line 884 of file RLoopManager.cxx.
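The propagation rule described above can be illustrated with a minimal sketch. It is a simplified model, not ROOT's implementation: `ToyNode`, `IncrChildrenCount` and `DemoChildrenCount` are hypothetical names introduced only for this example.

```cpp
#include <cassert>

// Toy stand-in for a graph node; this is NOT ROOT's RNodeBase.
struct ToyNode {
   ToyNode *fParent = nullptr;  // upstream node, nullptr for the head node
   unsigned int fNChildren = 0; // number of children that signaled so far
   bool fPropagated = false;    // true once the signal was forwarded upstream

   explicit ToyNode(ToyNode *parent = nullptr) : fParent(parent) {}

   // Called by a direct child: count it, then forward the signal at most once.
   void IncrChildrenCount()
   {
      ++fNChildren;
      if (fParent != nullptr && !fPropagated) {
         fPropagated = true;
         fParent->IncrChildrenCount();
      }
   }
};

// Build head -> filter -> {two actions} and return the head's children count.
inline unsigned int DemoChildrenCount()
{
   ToyNode head;
   ToyNode filter(&head);
   filter.IncrChildrenCount(); // signal sent by the first action
   filter.IncrChildrenCount(); // signal sent by the second action
   assert(filter.fNChildren == 2);
   return head.fNChildren; // the filter forwarded the signal only once
}
```

In this toy graph the filter counts two children while the head node counts one, because the filter forwards the signal only the first time it receives it.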
std::vector< RDFInternal::RActionBase * > RLoopManager::GetAllActions | ( | ) | const |
Return all actions, either booked or already run.
Definition at line 1063 of file RLoopManager.cxx.
const ColumnNames_t & RLoopManager::GetBranchNames | ( | ) |
Return all valid TTree::Branch names (caching results for subsequent calls).
Never use fValidBranchNames directly, always request it through this method.
Definition at line 1094 of file RLoopManager.cxx.
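The caching behaviour can be sketched as a lazily filled member that is only reachable through its getter. This is a simplified illustration under stated assumptions: `ToyBranchNameCache`, its scan function and the `fFilled` flag are hypothetical and do not reproduce how RLoopManager actually collects branch names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class ToyBranchNameCache {
   std::vector<std::string> fValidBranchNames;      // the cache, never exposed directly
   bool fFilled = false;                            // assumption: explicit "already scanned" flag
   std::function<std::vector<std::string>()> fScan; // hypothetical expensive branch scan
   unsigned int fNScans = 0;                        // bookkeeping for the example only

public:
   explicit ToyBranchNameCache(std::function<std::vector<std::string>()> scan) : fScan(std::move(scan))
   {
      assert(fScan && "a scan function is required");
   }

   // Fill the cache on first use, then return the cached names on later calls.
   const std::vector<std::string> &GetBranchNames()
   {
      if (!fFilled) {
         fValidBranchNames = fScan();
         fFilled = true;
         ++fNScans;
      }
      return fValidBranchNames;
   }

   unsigned int NScans() const { return fNScans; }
};
```

Routing every access through the getter is what guarantees the cache is filled before it is read, which is why direct access to the member is discouraged.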
|
inline |
Definition at line 283 of file RLoopManager.hxx.
RColumnReaderBase * RLoopManager::GetDatasetColumnReader | ( | unsigned int | slot, |
std::string_view | col, | ||
const std::type_info & | ti ) const |
Definition at line 1157 of file RLoopManager.cxx.
|
inline |
Definition at line 230 of file RLoopManager.hxx.
const ColumnNames_t & RLoopManager::GetDefaultColumnNames | ( | ) | const |
Return the list of default columns – empty if none was provided when constructing the RDataFrame.
Definition at line 943 of file RLoopManager.cxx.
std::vector< std::string > RLoopManager::GetFiltersNames | ( | ) |
For each booked filter, returns either the name or "Unnamed Filter".
Definition at line 1045 of file RLoopManager.cxx.
|
final virtual |
Implements ROOT::Detail::RDF::RNodeBase.
Definition at line 1071 of file RLoopManager.cxx.
std::vector< RNodeBase * > RLoopManager::GetGraphEdges | ( | ) | const |
Return all graph edges known to RLoopManager. This includes Filters and Ranges but not Defines.
Definition at line 1055 of file RLoopManager.cxx.
|
inline final virtual |
Reimplemented from ROOT::Detail::RDF::RNodeBase.
Definition at line 225 of file RLoopManager.hxx.
|
inline |
Definition at line 229 of file RLoopManager.hxx.
|
inline |
Definition at line 250 of file RLoopManager.hxx.
|
inline |
Definition at line 242 of file RLoopManager.hxx.
|
inline |
Definition at line 310 of file RLoopManager.hxx.
TTree * RLoopManager::GetTree | ( | ) | const |
Definition at line 948 of file RLoopManager.cxx.
|
inline |
Definition at line 285 of file RLoopManager.hxx.
|
inline |
Definition at line 290 of file RLoopManager.hxx.
bool RLoopManager::HasDataSourceColumnReaders | ( | std::string_view | col, |
const std::type_info & | ti ) const |
Return true if AddDataSourceColumnReaders was called for column name col.
Definition at line 1103 of file RLoopManager.cxx.
|
inline final virtual |
Implements ROOT::Detail::RDF::RNodeBase.
Definition at line 246 of file RLoopManager.hxx.
|
private |
Initialize all nodes of the functional graph before running the event loop.
This method is called once per event-loop and performs generic initialization operations that do not depend on the specific processing slot (i.e. operations that are common for all threads).
Definition at line 797 of file RLoopManager.cxx.
|
private |
Build TTreeReaderValues for all nodes. This method loops over all filters, actions and other booked objects and calls their InitSlot method, to get them ready for running a task.
Definition at line 716 of file RLoopManager.cxx.
|
inline |
Definition at line 302 of file RLoopManager.hxx.
void RLoopManager::Jit | ( | ) |
Add RDF nodes that require just-in-time compilation to the computation graph.
This method also clears the contents of GetCodeToJit().
Definition at line 854 of file RLoopManager.cxx.
|
delete |
|
delete |
|
inline final virtual |
End of recursive chain of calls, does nothing.
Implements ROOT::Detail::RDF::RNodeBase.
Definition at line 245 of file RLoopManager.hxx.
void RLoopManager::Register | ( | RDefineBase * | definePtr | ) |
Definition at line 997 of file RLoopManager.cxx.
void RLoopManager::Register | ( | RDFInternal::RActionBase * | actionPtr | ) |
Definition at line 959 of file RLoopManager.cxx.
void RLoopManager::Register | ( | RDFInternal::RVariationBase * | varPtr | ) |
Definition at line 1008 of file RLoopManager.cxx.
void RLoopManager::Register | ( | RFilterBase * | filterPtr | ) |
Definition at line 972 of file RLoopManager.cxx.
void RLoopManager::Register | ( | RRangeBase * | rangePtr | ) |
Definition at line 987 of file RLoopManager.cxx.
void RLoopManager::RegisterCallback | ( | ULong64_t | everyNEvents, |
std::function< void(unsigned int)> && | f ) |
Definition at line 1037 of file RLoopManager.cxx.
|
final virtual |
Call FillReport on all booked filters.
Implements ROOT::Detail::RDF::RNodeBase.
Definition at line 1025 of file RLoopManager.cxx.
Start the event loop with a different mechanism depending on IMT/no IMT, data source/no data source.
Also perform a few setup and clean-up operations (jit actions if necessary, clear booked actions after the loop...). The jitting phase is skipped if the jit parameter is false (unsafe, use with care).
Definition at line 895 of file RLoopManager.cxx.
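The choice between the four loop flavours can be pictured as a two-way decision on "data source or not" and "multi-threaded or not". The sketch below reuses the ELoopType enumerators listed on this page; the helpers `ChooseLoopType` and `RunMethodFor` are hypothetical, and the mapping from loop type to Run* method is an assumption based on the method descriptions, not a copy of the actual dispatch code.

```cpp
#include <cassert>
#include <string>

// Mirrors the enumerators listed for ELoopType on this page.
enum class ELoopType { kInvalid, kNoFiles, kNoFilesMT, kDataSource, kDataSourceMT };

// Hypothetical helper: pick the loop type from the two properties named in the description.
inline ELoopType ChooseLoopType(bool hasDataSource, bool multiThreaded)
{
   if (hasDataSource)
      return multiThreaded ? ELoopType::kDataSourceMT : ELoopType::kDataSource;
   return multiThreaded ? ELoopType::kNoFilesMT : ELoopType::kNoFiles;
}

// Hypothetical helper: name of the private Run* method assumed to handle each loop type.
inline std::string RunMethodFor(ELoopType type)
{
   assert(type != ELoopType::kInvalid && "loop type must be set before running");
   switch (type) {
   case ELoopType::kNoFiles: return "RunEmptySource";
   case ELoopType::kNoFilesMT: return "RunEmptySourceMT";
   case ELoopType::kDataSource: return "RunDataSource";
   case ELoopType::kDataSourceMT: return "RunDataSourceMT";
   case ELoopType::kInvalid: break;
   }
   return "";
}
```

For instance, `RunMethodFor(ChooseLoopType(true, false))` yields `"RunDataSource"` in this sketch.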
Execute actions and make sure named filters are called for each event.
Named filters must be called even if the analysis logic would not require it, lest they report confusing results.
Definition at line 696 of file RLoopManager.cxx.
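A minimal sketch of why named filters are evaluated on every entry: their accept/reject counters feed the report, so a filter that no action depends on would otherwise report zero entries. `ToyNamedFilter` and `DemoReport` are hypothetical names, and the per-entry result caching shown here is an assumption made to avoid double counting in the toy model.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy named filter with report counters; this is NOT ROOT's RFilterBase.
struct ToyNamedFilter {
   std::string fName;
   std::function<bool(long long)> fPredicate;
   long long fLastCheckedEntry = -1; // assumption: cache the result per entry
   bool fLastResult = false;
   unsigned long long fAccepted = 0;
   unsigned long long fRejected = 0;

   ToyNamedFilter(std::string name, std::function<bool(long long)> pred)
      : fName(std::move(name)), fPredicate(std::move(pred))
   {
      assert(fPredicate && "a predicate is required");
   }

   // Evaluate the predicate at most once per entry and update the counters.
   bool CheckFilters(long long entry)
   {
      if (entry != fLastCheckedEntry) {
         fLastResult = fPredicate(entry);
         if (fLastResult)
            ++fAccepted;
         else
            ++fRejected;
         fLastCheckedEntry = entry;
      }
      return fLastResult;
   }
};

// Run ten entries through a named filter that no action depends on.
inline std::pair<unsigned long long, unsigned long long> DemoReport()
{
   ToyNamedFilter even("even", [](long long e) { return e % 2 == 0; });
   std::vector<ToyNamedFilter *> namedFilters{&even};
   for (long long entry = 0; entry < 10; ++entry) {
      // (actions would run here; none of them touches "even")
      for (auto *f : namedFilters)
         f->CheckFilters(entry); // explicit call keeps the report meaningful
   }
   return {even.fAccepted, even.fRejected};
}
```

Without the explicit loop over named filters, both counters of "even" would stay at zero even though ten entries were processed.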
|
private |
Run event loop over data accessed through a DataSource, in sequence.
Definition at line 610 of file RLoopManager.cxx.
|
private |
Run event loop over data accessed through a DataSource, in parallel.
Definition at line 676 of file RLoopManager.cxx.
|
private |
Run event loop with no source files, in sequence.
Definition at line 539 of file RLoopManager.cxx.
|
private |
Run event loop with no source files, in parallel.
Definition at line 492 of file RLoopManager.cxx.
void ROOT::Detail::RDF::RLoopManager::SetDataSource | ( | std::unique_ptr< ROOT::RDF::RDataSource > | dataSource | ) |
Definition at line 1323 of file RLoopManager.cxx.
Definition at line 1172 of file RLoopManager.cxx.
|
inline |
Register a slot stack to be used by this RLoopManager.
This allows for sharing RDataFrame helpers safely in the context of RunGraphs(). Note that the loop manager only stores a weak_ptr, in between runs.
Definition at line 298 of file RLoopManager.hxx.
void ROOT::Detail::RDF::RLoopManager::SetTTreeLifeline | ( | std::any | lifeline | ) |
Definition at line 1183 of file RLoopManager.cxx.
|
private |
Definition at line 732 of file RLoopManager.cxx.
|
private |
Create a slot stack with the desired number of slots or reuse a shared instance.
When a LoopManager runs in isolation, it will create its own slot stack from the number of slots. When it runs as part of RunGraphs(), each loop manager will be assigned a shared slot stack, so dataframe helpers can be shared in a thread-safe manner.
Definition at line 780 of file RLoopManager.cxx.
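The "create or reuse" behaviour maps naturally onto `std::weak_ptr::lock()`. The sketch below is a simplified model under stated assumptions: `ToySlotStack` and `ToyLoopManager` are hypothetical stand-ins, and only the weak_ptr member and the fallback on the number of slots are taken from the description above.

```cpp
#include <cassert>
#include <memory>

// Toy slot stack; this is NOT ROOT::Internal::RSlotStack.
struct ToySlotStack {
   explicit ToySlotStack(unsigned int size) : fSize(size) {}
   unsigned int fSize;
};

class ToyLoopManager {
   unsigned int fNSlots;
   std::weak_ptr<ToySlotStack> fSlotStack; // non-owning: the shared stack may go away between runs

public:
   explicit ToyLoopManager(unsigned int nSlots) : fNSlots(nSlots) { assert(nSlots > 0); }

   // Register a slot stack shared with other loop managers.
   void SetSlotStack(const std::shared_ptr<ToySlotStack> &stack) { fSlotStack = stack; }

   // Reuse the shared stack if it is still alive, otherwise create a private one.
   std::shared_ptr<ToySlotStack> SlotStack() const
   {
      if (auto shared = fSlotStack.lock())
         return shared;
      return std::make_shared<ToySlotStack>(fNSlots);
   }
};
```

Holding only a weak_ptr means the loop manager never extends the lifetime of a shared stack: once the owner releases it, the next call falls back to a private stack sized from the number of slots.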
|
inline final virtual |
Implements ROOT::Detail::RDF::RNodeBase.
Definition at line 247 of file RLoopManager.hxx.
void RLoopManager::ToJitExec | ( | const std::string & | code | ) | const |
Definition at line 1031 of file RLoopManager.cxx.
void ROOT::Detail::RDF::RLoopManager::TTreeThreadTask | ( | TTreeReader & | treeReader, |
ROOT::Internal::RSlotStack & | slotStack, | ||
std::atomic< ULong64_t > & | entryCount ) |
The task run by every thread on an entry range (known by the input TTreeReader), for the TTree data source.
Definition at line 1369 of file RLoopManager.cxx.
|
private |
Definition at line 746 of file RLoopManager.cxx.
|
private |
Definition at line 751 of file RLoopManager.cxx.
|
friend |
Definition at line 128 of file RLoopManager.hxx.
|
friend |
Definition at line 129 of file RLoopManager.hxx.
|
private |
Definition at line 146 of file RLoopManager.hxx.
|
private |
Non-owning pointers to actions to be run.
Definition at line 138 of file RLoopManager.hxx.
|
private |
Definition at line 143 of file RLoopManager.hxx.
|
private |
Definition at line 140 of file RLoopManager.hxx.
|
private |
Contains a subset of fBookedFilters, i.e. only the named filters.
Definition at line 141 of file RLoopManager.hxx.
|
private |
Definition at line 142 of file RLoopManager.hxx.
|
private |
Definition at line 144 of file RLoopManager.hxx.
|
private |
Definition at line 203 of file RLoopManager.hxx.
|
private |
Registered callbacks to be executed every N events.
The registration happens via the RegisterCallback method.
Definition at line 164 of file RLoopManager.hxx.
|
private |
Registered callbacks to invoke just once before running the loop.
The registration happens via the RegisterCallback method.
Definition at line 167 of file RLoopManager.hxx.
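The two callback lists can be pictured with a single-slot sketch. It is illustrative only: `ToyCallbackRegistry` and `RunLoop` are hypothetical, the per-callback counter is a simplification, and treating `everyNEvents == 0` as "run once before the loop" is an assumption of this example rather than something stated on this page.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class ToyCallbackRegistry {
   struct EveryN {
      unsigned long long fEveryN;             // period in events
      unsigned long long fCounter;            // events seen since the last invocation
      std::function<void(unsigned int)> fFun; // receives the processing slot
   };
   std::vector<EveryN> fCallbacksEveryNEvents;
   std::vector<std::function<void(unsigned int)>> fCallbacksOnce;

public:
   // ASSUMPTION: everyNEvents == 0 registers a one-time callback.
   void RegisterCallback(unsigned long long everyNEvents, std::function<void(unsigned int)> f)
   {
      assert(f && "callback must be callable");
      if (everyNEvents == 0)
         fCallbacksOnce.push_back(std::move(f));
      else
         fCallbacksEveryNEvents.push_back({everyNEvents, 0ULL, std::move(f)});
   }

   // Toy single-threaded event loop over nEntries entries.
   void RunLoop(unsigned long long nEntries, unsigned int slot = 0)
   {
      for (auto &f : fCallbacksOnce)
         f(slot); // once, before the first entry
      for (auto &c : fCallbacksEveryNEvents)
         c.fCounter = 0;
      for (unsigned long long entry = 0; entry < nEntries; ++entry) {
         for (auto &c : fCallbacksEveryNEvents) {
            if (++c.fCounter == c.fEveryN) {
               c.fCounter = 0;
               c.fFun(slot);
            }
         }
      }
   }
};
```

With ten entries, a callback registered with a period of 3 fires after the third, sixth and ninth entry in this model, while a one-time callback fires exactly once.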
|
private |
Readers for TTree/RDataSource columns (one per slot), shared by all nodes in the computation graph.
Definition at line 176 of file RLoopManager.hxx.
|
private |
Owning pointer to a data-source object.
Null if there is no data source.
Definition at line 161 of file RLoopManager.hxx.
|
private |
Definition at line 154 of file RLoopManager.hxx.
Range of entries created when no data source is specified.
Definition at line 156 of file RLoopManager.hxx.
Definition at line 147 of file RLoopManager.hxx.
|
private |
The kind of event loop that is going to be run (e.g. on ROOT files, on no files)
Definition at line 160 of file RLoopManager.hxx.
Definition at line 158 of file RLoopManager.hxx.
|
private |
Definition at line 171 of file RLoopManager.hxx.
|
private |
Number of event loops run.
Definition at line 173 of file RLoopManager.hxx.
|
private |
Definition at line 157 of file RLoopManager.hxx.
|
private |
Non-owning pointers to actions already run.
Definition at line 139 of file RLoopManager.hxx.
|
private |
Registered callbacks to call at the beginning of each "data block".
The key is the pointer of the corresponding node in the computation graph (a RDefinePerSample or a RAction).
Definition at line 170 of file RLoopManager.hxx.
|
private |
Definition at line 172 of file RLoopManager.hxx.
|
private |
Keys are fname + "/" + treename as RSampleInfo::fID; Values are pointers to the corresponding sample.
Definition at line 150 of file RLoopManager.hxx.
|
private |
Samples need to survive throughout the whole event loop, hence stored as an attribute.
Definition at line 152 of file RLoopManager.hxx.
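The relation between the sample storage and the sample map can be sketched as follows: the vector owns the samples, and the map holds non-owning pointers into it, keyed by fname + "/" + treename. `ToySample` and `ToySampleRegistry` are hypothetical types with made-up fields; they only illustrate why the samples must outlive every lookup.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy sample description; this is NOT ROOT::RDF::Experimental::RSample.
struct ToySample {
   std::string fSampleName;
   std::string fFileName;
   std::string fTreeName;
};

class ToySampleRegistry {
   std::vector<ToySample> fSamples;                          // owns the samples
   std::unordered_map<std::string, ToySample *> fSampleMap;  // non-owning pointers into fSamples

public:
   explicit ToySampleRegistry(std::vector<ToySample> samples) : fSamples(std::move(samples))
   {
      // fSamples is never resized after this point, so the stored pointers stay valid.
      for (auto &s : fSamples) {
         const bool inserted = fSampleMap.emplace(s.fFileName + "/" + s.fTreeName, &s).second;
         assert(inserted && "duplicate file/tree combination");
         (void)inserted;
      }
   }

   // Copying would leave the map pointing into another object's vector.
   ToySampleRegistry(const ToySampleRegistry &) = delete;
   ToySampleRegistry &operator=(const ToySampleRegistry &) = delete;

   // Look a sample up by the same key format used to build the map.
   const ToySample *Find(const std::string &fname, const std::string &treename) const
   {
      const auto it = fSampleMap.find(fname + "/" + treename);
      return it == fSampleMap.end() ? nullptr : it->second;
   }
};
```

Because the map stores raw pointers, the owning vector has to stay alive and unchanged for as long as lookups can happen, which matches the note that samples are kept as an attribute for the whole event loop.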
|
private |
Pointer to a shared slot stack in case this instance runs concurrently with others.
Definition at line 182 of file RLoopManager.hxx.
|
private |
Definition at line 202 of file RLoopManager.hxx.
|
private |
Definition at line 136 of file RLoopManager.hxx.
|
private |
Definition at line 205 of file RLoopManager.hxx.
|
private |
Definition at line 207 of file RLoopManager.hxx.
|
private |
Cache of the tree/chain branch names. Never access directly, always use GetBranchNames().
Definition at line 179 of file RLoopManager.hxx.