Re: Few questions about PROOF

From: Fons Rademakers <Fons.Rademakers_at_cern.ch>
Date: Thu, 04 Jan 2007 16:57:26 +0100


Hi Antonio,

Antonio Bulgheroni wrote:
> Dear ROOTers,
>
> first of all let me wish you all the best for the new year!
>
> I'm writing you because I'm trying to better understand the way
> PROOF works since it seems to fit all my requirements for parallel event
> analysis. I'm collaborating in a pretty big standalone ROOT-based code
> with some tens of classes for data analysis of pixel detectors (
> sucimaPix <http://groups-beta.google.com/group/sucimaPix-dev>). Even if
> it is in principle possible to load sucimaPix shared libraries into a
> ROOT interactive session and run the analysis job from the command line,
> we prefer to build some executables running in standalone mode. I'm
> using ROOT 5-15/01 on a linuxx8664 box with gcc 4.1.1. Here come my
> questions:
>
> * If I got it right, the best way to exploit PROOF is via a
> TSelector derived class. That's not a problem since my input data
> are already saved into a TTree, so I just need to produce the
> skeleton of a TSelector and fill in the empty methods. Then I can
> successfully run the TSelector from my standalone program adding
> a line like:
>
> MyTree->Process(MySelector);
>
> In the case I want to run it on PROOF, I added at the beginning a line
> like this:
>
> TProof::Open("localhost");
>
> (the PROOF server is properly set-up and the two processors of my PC are
> found)
>
> But, first of all, I wasn't able to compile/link my code using the
> standard `root-config --cflags --libs` command because it was
> complaining that the library containing Proof was missing.
> Using the trial and error approach, I managed to have it working adding
> the following libs
>
> -lProof -lThread -lTreePlayer
>
> Is that correct? is there a smarter way? Is it possible to use PROOF
> from outside an interactive ROOT session? Is it enough to add the
> TProof::Open() statement to have the Process() worked out by the
> cluster? or should I do something more?
>

If you want to build a standalone application that uses PROOF, you have to link against these libraries (the ones you found are correct).

> * I tried to understand how PROOF works and I guessed that each
> slave has to have its own copy of the input file. Is it right? In
> the case of a single host cluster with a multiple-processor is it
> true as well? Is a step-by-step tutorial for PROOF available
> somewhere?

Slaves can work on the same input files; in that case each slave processes its own unique range of events in the shared file.
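
To picture how several slaves can share one file without processing any event twice, here is a rough sketch in plain C++ (no ROOT needed). It splits the entries of a tree into contiguous, non-overlapping ranges, one per slave. The names EntryRange and SplitEntries are made up for this illustration and are not part of ROOT or PROOF; the real PROOF packetizer decides the actual assignment and may well do it differently from this fixed, one-pass split.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a half-open range [first, last) of tree entries.
struct EntryRange {
    std::int64_t first;
    std::int64_t last;
};

// Hypothetical helper: split nEntries into nSlaves contiguous ranges that
// do not overlap and together cover every entry exactly once.
std::vector<EntryRange> SplitEntries(std::int64_t nEntries, int nSlaves) {
    assert(nEntries >= 0 && nSlaves > 0);
    std::vector<EntryRange> ranges;
    const std::int64_t base = nEntries / nSlaves;   // minimum entries per slave
    const std::int64_t extra = nEntries % nSlaves;  // the first `extra` slaves get one more
    std::int64_t start = 0;
    for (int i = 0; i < nSlaves; ++i) {
        const std::int64_t size = base + (i < extra ? 1 : 0);
        ranges.push_back({start, start + size});
        start += size;
    }
    return ranges;
}
```

With 10 entries and 3 slaves this gives the ranges [0, 4), [4, 7) and [7, 10), so every entry is handled by exactly one slave.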
>
> * In my analysis procedure I have several histograms to be booked
> and filled. I believe that all these objects must be added to the
> fOutput list in order to have the different slave contributions
> merged together at the end. Is it correct? Is it true for all
> objects having a Merge() method? Should the booking be done in the
> Begin() or in the SlaveBegin() method?
>

All objects added to the output list come back to the user. Objects that have a Merge() method are merged first (this is implicit for histograms and other basic ROOT statistical objects).
The booking can be done in Begin() (on the client), but then the objects must be added to the input list so that they get transferred to the slaves and are available there. Since SlaveBegin() runs only on the slaves, you can simply create the objects there and add them directly to the output list.
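
The merging step can be pictured with a small sketch in plain C++ (again without ROOT). Each slave fills its own copy of a histogram, and at the end the copies are summed bin by bin. SimpleHist and MergeAll are hypothetical names invented for this example; they only mimic the idea and are not how ROOT histograms or the output list are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D histogram; this is not a ROOT class.
struct SimpleHist {
    double lo, hi;
    std::vector<double> bins;

    SimpleHist(std::size_t nBins, double lo_, double hi_)
        : lo(lo_), hi(hi_), bins(nBins, 0.0) {
        assert(nBins > 0 && lo_ < hi_);
    }

    // Count one value; values outside [lo, hi) are ignored in this sketch.
    void Fill(double x) {
        if (x < lo || x >= hi) return;
        std::size_t i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
        if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding at the upper edge
        bins[i] += 1.0;
    }

    // Add the contents of another histogram with identical binning.
    void Merge(const SimpleHist& other) {
        assert(bins.size() == other.bins.size());
        for (std::size_t i = 0; i < bins.size(); ++i) bins[i] += other.bins[i];
    }
};

// Combine the per-slave histograms into one result, as a master would do
// with the objects it collects from the slaves.
SimpleHist MergeAll(const std::vector<SimpleHist>& perSlave) {
    assert(!perSlave.empty());
    SimpleHist total = perSlave.front();
    for (std::size_t i = 1; i < perSlave.size(); ++i) total.Merge(perSlave[i]);
    return total;
}
```

Because each slave sees a different range of events, the bin-by-bin sum is the same histogram a single process would have filled over the whole file.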

> * In my original analysis procedure I had a TList containing a set
> of reference histograms related to the full pixel detector and a
> certain variable number of other TList containing the same
> histograms as the main one but concerning only a region of the
> detector. To solve this problem I created a TList containing TList
> of histograms. In this way, looping on all the entries of the
> outer list, I can fill the histos of the sub-region. Is there a
> way to make it compatible with the PROOF mechanism of the fOutput
> list? If I add a TList MyList to fOutput, the content of MyList
> will be merged at the end?
>

You typically create these reference histograms in the Begin() method on the client and then add them to the input list. You can also add an entire TList with your histograms to the input list.

Cheers, Fons.

> Thank you very much for your precious help!
>
> Best regards,
>
>
> --
> Antonio Bulgheroni, PhD
> INFN - Sez. Roma III

-- 
Org:    CERN, European Laboratory for Particle Physics.
Mail:   1211 Geneve 23, Switzerland
E-Mail: Fons.Rademakers_at_cern.ch              Phone: +41 22 7679248
WWW:    http://fons.rademakers.org           Fax:   +41 22 7669640
Received on Thu Jan 04 2007 - 16:57:55 CET
