Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
Job.cxx
Go to the documentation of this file.
1/*
2 * Project: RooFit
3 * Authors:
4 * PB, Patrick Bos, Netherlands eScience Center, p.bos@esciencecenter.nl
5 * IP, Inti Pelupessy, Netherlands eScience Center, i.pelupessy@esciencecenter.nl
6 *
7 * Copyright (c) 2021, CERN
8 *
9 * Redistribution and use in source and binary forms,
10 * with or without modification, are permitted according to the terms
11 * listed in LICENSE (http://roofit.sourceforge.net/license.txt)
12 */
13
17
18namespace RooFit {
19namespace MultiProcess {
20
21/** @class Job
22 *
23 * @brief interface class for defining the actual work that must be done
24 *
25 * Think of "job" as in "employment", e.g. the job of a baker, which
26 * involves *tasks* like baking and selling bread. The Job must define the
27 * tasks through its execution (evaluate_task), based on a task index argument.
28 *
29 * Classes inheriting from Job must implement the pure virtual methods:
30 * - void evaluate_task(std::size_t task)
31 * - void send_back_task_result_from_worker(std::size_t task)
32 * - void receive_task_result_on_master(const zmq::message_t & message)
33 *
34 * An example/reference implementation can be found in test_Job.cxx.
35 *
36 * Most Jobs will also want to override the virtual update_state() function.
37 * This function can be used to send and receive state from master to worker.
38 * In the worker loop, when something is received over the ZeroMQ "SUB" socket,
39 * update_state() is called to put the received data into the right places,
40 * thus updating for instance parameter values on the worker that were updated
41 * since the last call on the master side.
42 *
43 * ## Message protocol
44 *
45 * One simple rule must be upheld for the messages that the implementer will
46 * send with 'send_back_task_result_from_worker' and 'update_state': the first
47 * part of the message must always be the 'Job''s ID, stored in 'Job::id'.
48 * The rest of the message, i.e. the actual data to be sent, is completely up
49 * to the implementation. Note that on the receiving end, i.e. in the
50 * implementation of 'receive_task_result_on_master', one will get the whole
51 * message, but the 'Job' ID part will already have been identified in the
52 * 'JobManager', so one needn't worry about it further inside
53 * 'Job::receive_task_result_on_master' (it is already routed to the correct
54 * 'Job'). The same goes for the receiving end of 'update_state', except that
55 * update_state is routed from the 'worker_loop', not the 'JobManager'.
56 *
57 * A second rule applies to 'update_state' messages: the second part must be
58 * a state identifier. This identifier will also be sent along with tasks to
59 * the queue. When a worker then takes a task from the queue, it can check
60 * whether it has already updated its state to what is expected to be there
61 * for the task at hand. If not, it should wait for the new state to arrive
62 * over the state subscription socket. Note: it is the implementer's task to
63 * actually update 'Job::state_id_' inside 'Job::update_state()'!
64 *
65 * ## Implementers notes
66 *
67 * The type of result from each task is strongly dependent on the Job at hand
68 * and so Job does not provide a default results member. It is up to the
69 * inheriting class to implement this in the above functions. We would have
70 * liked a template parameter task_result_t, so that we could also provide a
71 * default "boilerplate" calculate function to show a typical Job use-case of
72 * all the above infrastructure. This is not trivial, because the JobManager
73 * has to keep a list of Job pointers, so if there would be different template
74 * instantiations of Jobs, this would complicate this list.
75 *
76 * A typical Job implementation will have an evaluation function that is
77 * called from the master process, like RooAbsArg::getVal calls evaluate().
78 * This function will have three purposes: 1. send updated parameter values
79 * to the workers (possibly through update_state() or in a dedicated
80 * function), 2. queue tasks and 3. wait for the results to be retrieved.
81 * 'Job::gather_worker_results()' is provided for convenience to wait for
82 * all tasks to be retrieved for the current Job. Implementers can also
83 * choose to have the master process perform other tasks in between any of
84 * these three steps, or even skip steps completely.
85 *
86 * Child classes should refrain from direct access to the JobManager instance
87 * (through JobManager::instance), but rather use the here provided
88 * Job::get_manager(). This function starts the worker_loop on the worker when
89 * first called, meaning that the workers will not be running before they
90 * are needed.
91 */
92
93Job::Job() : id_(JobManager::add_job_object(this)) {}
94
95Job::Job(const Job &other) : id_(JobManager::add_job_object(this)), state_id_(other.state_id_), _manager(other._manager)
96{
97}
98
100{
102}
103
104/** \brief Get JobManager instance; create and activate if necessary
105 *
106 * Child classes should refrain from direct access to the JobManager instance
107 * (through JobManager::instance), but rather use the here provided
108 * Job::get_manager(). This function starts the worker_loop on the worker when
109 * first called, meaning that the workers will not be running before they
110 * are needed.
111 */
113{
114 if (!_manager) {
116 }
117
118 if (!_manager->is_activated()) {
120 }
121
122 return _manager;
123}
124
125/// Wait for all tasks to be retrieved for the current Job.
127{
129}
130
131/// \brief Virtual function to update any necessary state on workers
132///
133/// This function is called from the worker loop when something is received
134/// over the ZeroMQ "SUB" socket. The master process sends messages to workers
135/// on its "PUB" socket. Thus, we can update, for instance, parameter values
136/// on the worker that were updated since the last call on the master side.
137/// \note Implementers: make sure to also update the state_id_ member.
139
140/// Get the current state identifier
141std::size_t Job::get_state_id()
142{
143 return state_id_;
144}
145
146} // namespace MultiProcess
147} // namespace RooFit
Main point of access for all MultiProcess infrastructure.
Definition JobManager.h:30
static JobManager * instance()
static bool remove_job_object(std::size_t job_object_id)
void retrieve(std::size_t requesting_job_id)
Retrieve results for a Job.
void activate()
Start queue and worker loops on child processes.
interface class for defining the actual work that must be done
Definition Job.h:25
std::size_t get_state_id()
Get the current state identifier.
Definition Job.cxx:141
std::size_t id_
Definition Job.h:45
std::size_t state_id_
Definition Job.h:46
virtual void update_state()
Virtual function to update any necessary state on workers.
Definition Job.cxx:138
JobManager * _manager
Definition Job.h:50
JobManager * get_manager()
Get JobManager instance; create and activate if necessary.
Definition Job.cxx:112
void gather_worker_results()
Wait for all tasks to be retrieved for the current Job.
Definition Job.cxx:126
The namespace RooFit contains mostly switches that change the behaviour of functions of PDFs (or othe...
Definition CodegenImpl.h:64