ROOT  6.06/09
Reference Guide
Public Member Functions | Private Types | Private Member Functions | Private Attributes | List of all members
TProcPool Class Reference

This class provides a simple interface to execute the same task multiple times in parallel, possibly with different arguments every time.

This mimics the behaviour of python's pool.Map method.

TProcPool::Map

The two possible usages of the Map method are:

For either signature, func is executed as many times as needed by a pool of fNWorkers workers; the number of workers can be passed to the constructor or set via SetNWorkers. It defaults to the number of cores.
A collection containing the result of each execution is returned.
Note: the user is responsible for the deletion of any object that might be created upon execution of func, returned objects included: TProcPool never deletes what it returns, it simply forgets it.
Note: that the usage of TProcPool::Map is indicated only when the task to be executed takes more than a few seconds, otherwise the overhead introduced by Map will outrun the benefits of parallel execution on most machines.

Parameters
funca lambda expression, an std::function, a loaded macro, a functor class or a function that takes zero arguments (for the first signature) or one (for the second signature).
argsa standard container (vector, list, deque), an initializer list or a pointer to a TCollection (TList*, TObjArray*, ...).

Note: the version of TProcPool::Map that takes a TCollection* as argument incurs in the overhead of copying data from the TCollection to an STL container. Only use it when absolutely necessary.
Note: in cases where the function to be executed takes more than zero/one argument but all are fixed except zero/one, the function can be wrapped in a lambda or via std::bind to give it the right signature.
Note: the user should take care of initializing random seeds differently in each process (e.g. using the process id in the seed). Otherwise several parallel executions might generate the same sequence of pseudo-random numbers.

Return value:

If T derives from TCollection Map returns a TObjArray, otherwise it returns an std::vector. In both cases, the elements in the container will be the objects returned by func.

Examples:

root[] TProcPool pool; auto hists = pool.Map(CreateHisto, 10);
root[] TProcPool pool(2); auto squares = pool.Map([](int a) { return a*a; }, {1,2,3});

TProcPool::MapReduce

This set of methods behaves exactly like Map, but takes an additional function as a third argument. This function is applied to the set of objects returned by the corresponding Map execution to "squash" them to a single object.

Examples:

root[] TProcPool pool; auto ten = pool.MapReduce([]() { return 1; }, 10, [](std::vector<int> v) { return std::accumulate(v.begin(), v.end(), 0); })
root[] TProcPool pool; auto hist = pool.MapReduce(CreateAndFillHists, 10, PoolUtils::ReduceObjects);

Definition at line 38 of file TProcPool.h.

Public Member Functions

 TProcPool (unsigned nWorkers=0)
 Class constructor. More...
 
 ~TProcPool ()
 
 TProcPool (const TProcPool &)=delete
 
TProcPooloperator= (const TProcPool &)=delete
 
template<class F >
auto Map (F func, unsigned nTimes) -> std::vector< decltype(func())>
 Execute func (with no arguments) nTimes in parallel. More...
 
template<class F , class T >
auto Map (F func, T &args) -> std::vector< decltype(++(args.begin()), args.end(), func(args.front())) >
 Execute func in parallel distributing the elements of the args collection between the workers. More...
 
template<class F , class R >
auto MapReduce (F func, unsigned nTimes, R redfunc) -> decltype(func())
 This method behaves just like Map, but an additional redfunc function must be provided. More...
 
template<class F , class T , class R >
auto MapReduce (F func, T &args, R redfunc) -> decltype(++(args.begin()), args.end(), func(args.front()))
 This method behaves just like Map, but an additional redfunc function must be provided. More...
 
template<class F >
auto ProcTree (const std::vector< std::string > &fileNames, F procFunc, const std::string &treeName="", ULong64_t nToProcess=0) -> typename std::result_of< F(std::reference_wrapper< TTreeReader >)>::type
 
template<class F >
auto ProcTree (const std::string &fileName, F procFunc, const std::string &treeName="", ULong64_t nToProcess=0) -> typename std::result_of< F(std::reference_wrapper< TTreeReader >)>::type
 
template<class F >
auto ProcTree (TFileCollection &files, F procFunc, const std::string &treeName="", ULong64_t nToProcess=0) -> typename std::result_of< F(std::reference_wrapper< TTreeReader >)>::type
 
template<class F >
auto ProcTree (TChain &files, F procFunc, const std::string &treeName="", ULong64_t nToProcess=0) -> typename std::result_of< F(std::reference_wrapper< TTreeReader >)>::type
 
template<class F >
auto ProcTree (TTree &tree, F procFunc, ULong64_t nToProcess=0) -> typename std::result_of< F(std::reference_wrapper< TTreeReader >)>::type
 
void SetNWorkers (unsigned n)
 
unsigned GetNWorkers () const
 

Private Types

enum  ETask : unsigned {
  ETask::kNoTask = 0, ETask::kMap, ETask::kMapWithArg, ETask::kMapRed,
  ETask::kMapRedWithArg, ETask::kProcByRange, ETask::kProcByFile
}
 A collection of the types of tasks that TProcPool can execute. More...
 

Private Member Functions

template<class T >
void Collect (std::vector< T > &reslist)
 Listen for messages sent by the workers and call the appropriate handler function. More...
 
template<class T >
void HandlePoolCode (MPCodeBufPair &msg, TSocket *sender, std::vector< T > &reslist)
 Handle message and reply to the worker. More...
 
void Reset ()
 Reset TProcPool's state. More...
 
template<class T , class R >
Reduce (const std::vector< T > &objs, R redfunc)
 Check that redfunc has the right signature and call it on objs. More...
 
void ReplyToFuncResult (TSocket *s)
 Reply to a worker who just sent a result. More...
 
void ReplyToIdle (TSocket *s)
 Reply to a worker who is idle. More...
 
- Private Member Functions inherited from TMPClient
 TMPClient (unsigned nWorkers=0)
 Class constructor. More...
 
 ~TMPClient ()
 Class destructor. More...
 
 TMPClient (const TMPClient &)=delete
 
TMPClientoperator= (const TMPClient &)=delete
 
bool Fork (TMPWorker &server)
 This method forks the ROOT session into fNWorkers children processes. More...
 
unsigned Broadcast (unsigned code, unsigned nMessages=0)
 Send a message with the specified code to at most nMessages workers. More...
 
template<class T >
unsigned Broadcast (unsigned code, const std::vector< T > &objs)
 Send a message with a different object to each server. More...
 
template<class T >
unsigned Broadcast (unsigned code, std::initializer_list< T > &objs)
 Send a message with a different object to each server. More...
 
template<class T >
unsigned Broadcast (unsigned code, T obj, unsigned nMessages=0)
 Send a message containing code and obj to each worker, up to a maximum number of nMessages workers. More...
 
TMonitorGetMonitor ()
 
bool GetIsParent () const
 
void SetNWorkers (unsigned n)
 Set the number of workers that will be spawned by the next call to Fork() More...
 
unsigned GetNWorkers () const
 
void DeActivate (TSocket *s)
 DeActivate a certain socket. More...
 
void Remove (TSocket *s)
 Remove a certain socket from the monitor. More...
 
void ReapWorkers ()
 Wait on worker processes and remove their pids from fWorkerPids. More...
 
void HandleMPCode (MPCodeBufPair &msg, TSocket *sender)
 Handle messages containing an EMPCode. More...
 

Private Attributes

unsigned fNProcessed
 number of arguments already passed to the workers More...
 
unsigned fNToProcess
 total number of arguments to pass to the workers More...
 
enum TProcPool::ETask fTask
 the kind of task that is being executed, if any More...
 

#include <TProcPool.h>

+ Inheritance diagram for TProcPool:
+ Collaboration diagram for TProcPool:

Member Enumeration Documentation

enum TProcPool::ETask : unsigned
strongprivate

A collection of the types of tasks that TProcPool can execute.

It is used to interpret in the right way and properly reply to the messages received (see, for example, TProcPool::HandleInput)

Enumerator
kNoTask 

no task is being executed

kMap 

a Map method with no arguments is being executed

kMapWithArg 

a Map method with arguments is being executed

kMapRed 

a MapReduce method with no arguments is being executed

kMapRedWithArg 

a MapReduce method with arguments is being executed

kProcByRange 

a ProcTree method is being executed and each worker will process a certain range of each file

kProcByFile 

a ProcTree method is being executed and each worker will process a different file

Definition at line 95 of file TProcPool.h.

Constructor & Destructor Documentation

TProcPool::TProcPool ( unsigned  nWorkers = 0)
explicit

Class constructor.

nWorkers is the number of times this ROOT session will be forked, i.e. the number of workers that will be spawned.

Definition at line 79 of file TProcPool.cxx.

TProcPool::~TProcPool ( )
inline

Definition at line 41 of file TProcPool.h.

TProcPool::TProcPool ( const TProcPool )
delete

Member Function Documentation

template<class T >
void TProcPool::Collect ( std::vector< T > &  reslist)
private

Listen for messages sent by the workers and call the appropriate handler function.

TProcPool::HandlePoolCode is called on messages with a code < 1000 and TMPClient::HandleMPCode is called on messages with a code >= 1000.

Definition at line 509 of file TProcPool.h.

unsigned TProcPool::GetNWorkers ( ) const
inline

Definition at line 78 of file TProcPool.h.

template<class T >
void TProcPool::HandlePoolCode ( MPCodeBufPair msg,
TSocket sender,
std::vector< T > &  reslist 
)
private

Handle message and reply to the worker.

Definition at line 530 of file TProcPool.h.

Referenced by Collect().

template<class F >
auto TProcPool::Map ( F  func,
unsigned  nTimes 
) -> std::vector<decltype(func())>

Execute func (with no arguments) nTimes in parallel.

A vector containg executions' results is returned. Functions that take more than zero arguments can be executed (with fixed arguments) by wrapping them in a lambda or with std::bind.

Definition at line 115 of file TProcPool.h.

template<class F , class T >
auto TProcPool::Map ( F  func,
T &  args 
) -> std::vector < decltype(++(args.begin()), args.end(), func(args.front())) >

Execute func in parallel distributing the elements of the args collection between the workers.

See class description for the valid types of collections and containers that can be used. A vector containing each execution's result is returned. The user is responsible of deleting objects that might be created upon the execution of func, returned objects included. Note: the collection of arguments is modified by Map and should be considered empty or otherwise invalidated after Map's execution (std::move might be applied to it).

Definition at line 159 of file TProcPool.h.

template<class F , class R >
auto TProcPool::MapReduce ( F  func,
unsigned  nTimes,
R  redfunc 
) -> decltype(func())

This method behaves just like Map, but an additional redfunc function must be provided.

redfunc is applied to the vector Map would return and must return the same type as func. In practice, redfunc can be used to "squash" the vector returned by Map into a single object by merging, adding, mixing the elements of the vector.

Definition at line 259 of file TProcPool.h.

template<class F , class T , class R >
auto TProcPool::MapReduce ( F  func,
T &  args,
R  redfunc 
) -> decltype(++(args.begin()), args.end(), func(args.front()))

This method behaves just like Map, but an additional redfunc function must be provided.

redfunc is applied to the vector Map would return and must return the same type as func. In practice, redfunc can be used to "squash" the vector returned by Map into a single object by merging, adding, mixing the elements of the vector.

Definition at line 300 of file TProcPool.h.

TProcPool& TProcPool::operator= ( const TProcPool )
delete
template<class F >
auto TProcPool::ProcTree ( const std::vector< std::string > &  fileNames,
F  procFunc,
const std::string &  treeName = "",
ULong64_t  nToProcess = 0 
) -> typename std::result_of<F(std::reference_wrapper<TTreeReader>)>::type

Definition at line 377 of file TProcPool.h.

template<class F >
auto TProcPool::ProcTree ( const std::string &  fileName,
F  procFunc,
const std::string &  treeName = "",
ULong64_t  nToProcess = 0 
) -> typename std::result_of<F(std::reference_wrapper<TTreeReader>)>::type

Definition at line 430 of file TProcPool.h.

template<class F >
auto TProcPool::ProcTree ( TFileCollection files,
F  procFunc,
const std::string &  treeName = "",
ULong64_t  nToProcess = 0 
) -> typename std::result_of<F(std::reference_wrapper<TTreeReader>)>::type

Definition at line 438 of file TProcPool.h.

template<class F >
auto TProcPool::ProcTree ( TChain files,
F  procFunc,
const std::string &  treeName = "",
ULong64_t  nToProcess = 0 
) -> typename std::result_of<F(std::reference_wrapper<TTreeReader>)>::type

Definition at line 450 of file TProcPool.h.

template<class F >
auto TProcPool::ProcTree ( TTree tree,
F  procFunc,
ULong64_t  nToProcess = 0 
) -> typename std::result_of<F(std::reference_wrapper<TTreeReader>)>::type

Definition at line 463 of file TProcPool.h.

template<class T , class R >
T TProcPool::Reduce ( const std::vector< T > &  objs,
R  redfunc 
)
private

Check that redfunc has the right signature and call it on objs.

Definition at line 556 of file TProcPool.h.

void TProcPool::ReplyToFuncResult ( TSocket s)
private

Reply to a worker who just sent a result.

If another argument to process exists, tell the worker. Otherwise send a shutdown order.

Definition at line 99 of file TProcPool.cxx.

Referenced by HandlePoolCode().

void TProcPool::ReplyToIdle ( TSocket s)
private

Reply to a worker who is idle.

If another argument to process exists, tell the worker. Otherwise ask for a result

Definition at line 117 of file TProcPool.cxx.

Referenced by HandlePoolCode().

void TProcPool::Reset ( void  )
private

Reset TProcPool's state.

Definition at line 87 of file TProcPool.cxx.

Referenced by TProcPool().

void TProcPool::SetNWorkers ( unsigned  n)
inline

Definition at line 77 of file TProcPool.h.

Member Data Documentation

unsigned TProcPool::fNProcessed
private

number of arguments already passed to the workers

Definition at line 89 of file TProcPool.h.

Referenced by ReplyToFuncResult(), ReplyToIdle(), and Reset().

unsigned TProcPool::fNToProcess
private

total number of arguments to pass to the workers

Definition at line 90 of file TProcPool.h.

Referenced by ReplyToFuncResult(), ReplyToIdle(), and Reset().

enum TProcPool::ETask TProcPool::fTask
private

the kind of task that is being executed, if any

Referenced by ReplyToFuncResult(), ReplyToIdle(), and Reset().


The documentation for this class was generated from the following files: