Analysing the performance tree
The performance tree is a TTree containing detailed information about the workflow of a PROOF query. The tree is not created by default: detailed instructions to create and save it can be found in the dedicated section.
Content of this page:
- TProofPerfAnalysis: set of tools to analyse the performance tree
- Description of the performance tree
1. TProofPerfAnalysis: set of tools to analyse the performance tree
The container class TProofPerfAnalysis is an interface to a set of tools to analyse the performance tree of a given query. The class is located under $ROOTSYS/proof/proofbench and therefore appears in the $ROOTSYS/lib/libProofBench.so plug-in. The class has been introduced during the development cycle 5.33/01 and back-ported to the patch branches of production versions 5.30 and 5.32; it therefore ships with the ROOT development release 5.33/02 and with the patch releases 5.30/07 and 5.32/01.
1.1 Constructor
The class constructor has the following signature:
TProofPerfAnalysis(const char *perffile, const char *title = "", const char *treename = "PROOF_PerfStats");
with one mandatory and two optional arguments:
- perffile, mandatory, is the file to be analysed; can also be located remotely, for example on the web.
- title, optional, is the title to be used on the canvas to characterize the results of the analyzed queries; by default is set to ""
- treename, optional, the name of the tree to be analyzed; default is "PROOF_PerfStats". It is also possible to speficy a subdirectory or to add the string passed her to 'PROOF_PerfStats', so that the final serached name is 'PROOF_PerfStats<treename>'. If the name specified by treename, or derived by treename, is not found, the constructor will look into the file directory structure for the first TTree whose name contains treename. This is typically enough tolocate the performance tree in the file.
If the initialization is successful, the constructor will run some analysis on the worker performance and print a summary of the result.
Here is an example:
root [0] TProofPerfAnalysis ppa1("H4muEOS-001.root") +++ TTree 'PROOF_PerfStats' has 612 entries +++ 16 workers were active during this query +++ Total query time: 713.634155 secs (init: 0.133229 secs, merge: 432.601013 secs) +++ Avg processing rates: 153.8833 evts/s, 29.3369 MB/s +++ Max processing rates: 594.7538 evts/s, 80.6259 MB/s +++ 11 files were processed during this query
The summary information can be also re-printed with the method
void Summary(Option_t *opt = "", const char *out = "") .
root [1] ppa1.Summary("S") 16 713.634155 0.133229 432.601013 153.883303 594.753847 29.336907 80.625946
The second argument specifies a text file where to save the information; the file is (re)created at each call, unless a '+' is pre-pended to the file name, in which case the information is appended to en existing file.
1.2 Worker query information
The performance tree contains detailed information about packets. This can be used to the performance of a given workers during the query. A summary of the worker activity can be displayed using the method PrintWrkInfo, available with two signatures:
void PrintWrkInfo(Int_t showlast = 10); void PrintWrkInfo(const char *wrk);
root [1] pf.PrintWrkInfo(2) +++ TWrkInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Worker: 0.100, lxfssl3310.cern.ch +++ Activity interval: 3.842385 -> 433.544617 +++ Amounts processed: 32 packets (32 remote), 263633 evts, 2516378014 bytes +++ Processing time: 429.675842 s (CPU: 48.580000 s) +++ Averages: 613.562538 evts/s, 5.585152 MB/s +++ Total latency: 0.025222 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ TWrkInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Worker: 0.112, lxfssl3404.cern.ch +++ Activity interval: 3.500722 -> 433.617889 +++ Amounts processed: 18 packets (18 remote), 234250 evts, 2084701916 bytes +++ Processing time: 430.099195 s (CPU: 49.230000 s) +++ Averages: 544.641801 evts/s, 4.622484 MB/s +++ Total latency: 0.017882 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
root [1] pf.PrintWrkInfo("0.100,lxfssl3404.cern.ch") +++ TWrkInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Worker: 0.100, lxfssl3310.cern.ch +++ Activity interval: 3.842385 -> 433.544617 +++ Amounts processed: 32 packets (32 remote), 263633 evts, 2516378014 bytes +++ Processing time: 429.675842 s (CPU: 48.580000 s) +++ Averages: 613.562538 evts/s, 5.585152 MB/s +++ Total latency: 0.025222 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ TWrkInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Worker: 0.112, lxfssl3404.cern.ch +++ Activity interval: 3.500722 -> 433.617889 +++ Amounts processed: 18 packets (18 remote), 234250 evts, 2084701916 bytes +++ Processing time: 430.099195 s (CPU: 49.230000 s) +++ Averages: 544.641801 evts/s, 4.622484 MB/s +++ Total latency: 0.017882 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Graphically is interesting to see how the processing activity developed during the query time. This can be displayed using the WorkerActivity method
void WorkerActivity();

The upper plots show the query start-up and processing end. For a good query - like the one shown in here - the starting and finishing time is very close for all the workers. Bad performing workers will show up as long tails in here. The bottom plot quantifies the query finishing time and measures the spread across the workers.
1.3 Event and packet distributions
The packet and event distribution across workers can be displayed using the method EventDist():
void EventDist();
1.4 Processing rate vs query time
The method RatePlot() displays the event and MB processing rates as a function of the query time per each worker. The signature is the following:
void RatePlot(const char *wrks = 0);
An example of the plot is shown in the figure. 
1.5 Packet retrieval latency vs query time
The latency in getting a packet from the master measured by the worker can displayed as a function of the query time using the method LatencyPlot():
void LatencyPlot(const char *wrks = 0);
An example of the plot is shown in the figure. 
1.6 File distribution
For data-driven queries reading files from several servers it may be interesting to look at the distribution of packets across the servers and workers. This is the purpose of the FileDist() method:
void FileDist(Bool_t writedetails = kFALSE);
Examples of the plots are shown in the following figures. 

1.7 Detailed processing information per file
Two methods are provided to display detailed information about the processing of each file:
void PrintFileInfo(Int_t showslowest = 10, const char *opt = "", const char *out = 0); void PrintFileInfo(const char *fn, const char *opt = "P", const char *out = 0);
The first method displays information about the files that took longer to process (by default the last ten):
root [6] ppa1.PrintFileInfo(3) +++ TFileInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Server: root://eosatlas/ +++ File: /eos/atlas/user/g/gganis/data1/mc12_8TeV.160155.PowhegPythia8_AU2CT10_ggH125_ZZ4lep.merge.NTUP_HSG2.e1191_s1469_s1470_r3542_r3549_p1344_tid01133894_00/NTUP_HSG2.01133894._000009.root.1 +++ Processing interval: 0.090499 -> 229.785583 +++ Packets: 45 (45 remote) +++ Processing wrks: 7 (7 remote) +++ MB rates: 5.388210 MB/s (avg), 2.640950 MB/s (min), 10.326082 MB/s (max) +++ Sizes: 222 (avg), 19 (min), 1000 (max) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ TFileInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Server: root://eosatlas/ +++ File: /eos/atlas/user/g/gganis/data1/mc12_8TeV.160155.PowhegPythia8_AU2CT10_ggH125_ZZ4lep.merge.NTUP_HSG2.e1191_s1469_s1470_r3542_r3549_p1344_tid01133894_00/NTUP_HSG2.01133894._000001.root.1 +++ Processing interval: 0.062248 -> 612.267883 +++ Packets: 11 (11 remote) +++ Processing wrks: 2 (2 remote) +++ MB rates: 7.362400 MB/s (avg), 0.239816 MB/s (min), 11.984926 MB/s (max) +++ Sizes: 909 (avg), 686 (min), 1180 (max) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ TFileInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Server: root://eosatlas/ +++ File: /eos/atlas/user/g/gganis/data1/mc12_8TeV.160155.PowhegPythia8_AU2CT10_ggH125_ZZ4lep.merge.NTUP_HSG2.e1191_s1469_s1470_r3542_r3549_p1344_tid01133894_00/NTUP_HSG2.01133894._000004.root.1 +++ Processing interval: 151.918732 -> 713.634155 +++ Packets: 159 (159 remote) +++ Processing wrks: 15 (15 remote) +++ MB rates: 7.150163 MB/s (avg), 0.129910 MB/s (min), 19.217761 MB/s (max) +++ Sizes: 62 (avg), 8 (min), 1000 (max) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
The second prototype allows to address a given file. Using the option filed, detailed information about the packets can be obtained:
root [10] ppa1.PrintFileInfo("/eos/atlas/user/g/gganis/data1/mc12_8TeV.160155.PowhegPythia8_AU2CT10_ggH125_ZZ4lep.merge.NTUP_HSG2.e1191_s1469_s1470_r3542_r3549_p1344_tid01133894_00/NTUP_HSG2.01133894._000001.root.1", "WP") +++ TFileInfo ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++ Server: root://eosatlas/ +++ File: /eos/atlas/user/g/gganis/data1/mc12_8TeV.160155.PowhegPythia8_AU2CT10_ggH125_ZZ4lep.merge.NTUP_HSG2.e1191_s1469_s1470_r3542_r3549_p1344_tid01133894_00/NTUP_HSG2.01133894._000001.root.1 +++ Processing interval: 0.062248 -> 612.267883 +++ Packets: 11 (11 remote) +++ Processing wrks: 2 (2 remote) Collection name='TList', class='TList', size=2 Worker: 0.3, packet(s): 10 1000 evts, 4.92 MB/s, 0.062 -> 22.280 s 900 evts, 5.63 MB/s, 22.283 -> 53.295 s 713 evts, 11.98 MB/s, 53.296 -> 63.765 s 820 evts, 10.39 MB/s, 63.766 -> 71.557 s 960 evts, 9.76 MB/s, 71.557 -> 81.750 s 1075 evts, 9.92 MB/s, 81.751 -> 92.715 s 1180 evts, 5.75 MB/s, 92.715 -> 117.761 s 892 evts, 8.44 MB/s, 117.761 -> 127.196 s 774 evts, 5.77 MB/s, 127.196 -> 144.127 s 686 evts, 8.17 MB/s, 144.128 -> 151.290 s Worker: 0.8, packet(s): 1 1000 evts, 0.24 MB/s, 0.111 -> 612.268 s +++ MB rates: 7.362400 MB/s (avg), 0.239816 MB/s (min), 11.984926 MB/s (max) +++ Sizes: 909 (avg), 686 (min), 1180 (max) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2. Description of the performance tree
The name of the tree is PROOF_PerfStats and its entries are instances of the class TPerfEvent. The tree is filled via the TPerfStats interface class, an implementation of TVirtualPerfStats. The class TPerfEvent contains the following:
class TPerfEvent : public TObject { public: TString fEvtNode; // node on which the event was generated TTimeStamp fTimeStamp; // time offset from start of run TVirtualPerfStats::EEventType fType; TString fSlaveName; TString fNodeName; TString fFileName; TString fFileClass; TString fSlave; Long64_t fEventsProcessed; Long64_t fBytesRead; Long64_t fLen; Double_t fLatency; Double_t fProcTime; Double_t fCpuTime; Bool_t fIsStart; Bool_t fIsOk; TPerfEvent(TTimeStamp *offset = 0); virtual ~TPerfEvent() {} Bool_t IsSortable() const { return kTRUE; } Int_t Compare(const TObject *obj) const; void Print(Option_t *option="") const; ClassDef(TPerfEvent,3) // Class holding TProof Event Info };
The meaning of the different members depend of the event type. The types are described in TVirtualPerfStats.h :
enum EEventType { kUnDefined, kPacket, //info of single packet processing kStart, //begin of run kStop, //end of run kFile, //file started/finished in packetizer kFileOpen, //opening data file statistics kFileRead, //reading data file event kRate, //processing {evt, MB} rates kNumEventType //number of entries, must be last };
Not all event types are filled during a PROOF query, but only the following:
- kStart, marking the begin of query processing;
- kPacket, after each packet has been processed;
- kFile, when a file is assigned to a worker and when there is no more packets from the file to be given to the worker;
- kRate, when a message with processing progress information is sent to the client;
- kStop, marking the end of query processing.
Each entry in the tree has a timestamp, so that time correlation between the different events are possible.