Re: [ROOT] Need advice on TTree building

From: Rene Brun (Rene.Brun@cern.ch)
Date: Sun Mar 30 2003 - 20:50:47 MEST


Hi Topher,

The ROOT Trees mechanism allows you to store any C++ objects
or collections of objects. It is up to you to provide a C++ object 
model that is optimized for your problem.
We provide several examples in the tutorials or test directories
illustrating some of the possibilities.
With Trees, you can decide to store all your information in one
single branch. In this case, calling TTree::GetEntry reads the
full branch into memory. Most of the time this is not optimal
for data analysis, where you usually want to histogram
only a small subset of the information.
Using the "split mode", you can create sub-branches of your top-level
object, so that at read time only the branch(es) referenced by your
query need to be read.
The sub-branches may be created by hand (by calling TTree::Branch)
or automatically by setting the split mode >= 1.
When the split mode is activated, the top-level branch is split
to mirror the internal structure of your top-level object.
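
As a rough illustration of why splitting helps at read time, here is a
ROOT-free toy model in plain C++. It is only a sketch: the Event fields,
the two store types and the byte counting are hypothetical, and real ROOT
branches add buffering and compression that are ignored here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical event record; the field names are illustrative only.
struct Event {
    int   run;
    float energy;
    float px, py, pz;
};

// "Unsplit" layout: each entry is stored whole, so reading any one
// member means touching the entire record.
struct UnsplitStore {
    std::vector<Event> entries;
    void Fill(const Event& e) { entries.push_back(e); }
};

// "Split" layout: one column per data member, filled in lockstep.
struct SplitStore {
    std::vector<int>   run;
    std::vector<float> energy, px, py, pz;
    void Fill(const Event& e) {
        run.push_back(e.run);
        energy.push_back(e.energy);
        px.push_back(e.px);
        py.push_back(e.py);
        pz.push_back(e.pz);
    }
};

// Result of a scan: the summed energy and the bytes that had to be read.
struct ScanResult {
    double      sum;
    std::size_t bytesRead;
};

// Summing one member of an unsplit store reads every full record.
ScanResult sumEnergy(const UnsplitStore& s) {
    ScanResult r{0.0, 0};
    for (const Event& e : s.entries) {
        r.sum += e.energy;
        r.bytesRead += sizeof(Event);
    }
    return r;
}

// Summing the same member of a split store reads only that column.
ScanResult sumEnergy(const SplitStore& s) {
    ScanResult r{0.0, 0};
    for (float e : s.energy) {
        r.sum += e;
        r.bytesRead += sizeof(float);
    }
    return r;
}
```

Filling both stores with the same events and calling sumEnergy on each
gives the same sum, but the split store reports far fewer bytes read.
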
If one of the members is a TClonesArray, each data member of the class
stored in the TClonesArray is in turn written to its own branch.
In this case, you can make a query (via TTree::Draw) that loops over
the events and, within each event, over each entry in the
array. The next version of ROOT will be able to support the same
split mode for STL collections like vector and list.
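
The double loop described above can be pictured with a small ROOT-free
sketch in plain C++. This is not how TTree::Draw is implemented; the Hit,
Event and Hist1D types and the histogramHitEnergy function are hypothetical
stand-ins that only show the shape of the iteration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical hit type: one sparse detector hit.
struct Hit {
    int   cell;    // which cell was hit
    float energy;  // deposited energy
};

// Hypothetical event type: a per-event quantity plus a variable-length hit list.
struct Event {
    int              run;
    std::vector<Hit> hits;
};

// Minimal fixed-range histogram with equal-width bins.
// Values outside [lo, hi) are dropped.
struct Hist1D {
    float            lo, hi;
    std::vector<int> bins;

    Hist1D(std::size_t n, float lo_, float hi_) : lo(lo_), hi(hi_), bins(n, 0) {}

    void Fill(float x) {
        if (bins.empty() || x < lo || x >= hi) return;
        std::size_t i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
        if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding at the upper edge
        ++bins[i];
    }
};

// Outer loop over events, inner loop over each entry of the per-event array.
Hist1D histogramHitEnergy(const std::vector<Event>& events,
                          std::size_t nbins, float lo, float hi) {
    Hist1D h(nbins, lo, hi);
    for (const Event& ev : events)
        for (const Hit& hit : ev.hits)
            h.Fill(hit.energy);
    return h;
}
```

An event with no hits simply contributes nothing, so the histogram gets one
entry per hit rather than one per event.
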
The current query algorithm in TTree::Draw supports variable length
arrays where the length of the array is a member of the class.
See for example the member "fClosestDistance" in the test example 
Event.h.
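
A minimal sketch of such a class layout, written as plain C++ so that it
compiles without ROOT. The HitList class and its member names are
hypothetical. The "//[fNhits]" comment follows the convention that, to the
best of my knowledge, Event.h uses for fClosestDistance: it names the
integer member holding the array length, and that member is declared
before the pointer. A real ROOT class would additionally need a dictionary
(ClassDef and rootcint), which is left out here.

```cpp
#include <algorithm>

// Hypothetical container for the hit energies of one event.
class HitList {
public:
    HitList() : fNhits(0), fEnergy(nullptr) {}
    ~HitList() { delete[] fEnergy; }

    // The class owns a raw array, so copying is disabled in this sketch.
    HitList(const HitList&) = delete;
    HitList& operator=(const HitList&) = delete;

    // Replace the contents with n values copied from src.
    void Set(const float* src, int n) {
        if (n < 0) n = 0;
        float* fresh = n > 0 ? new float[n] : nullptr;
        std::copy(src, src + n, fresh);
        delete[] fEnergy;
        fEnergy = fresh;
        fNhits  = n;
    }

    // Sum over the valid elements only.
    float TotalEnergy() const {
        float sum = 0.0f;
        for (int i = 0; i < fNhits; ++i) sum += fEnergy[i];
        return sum;
    }

    int    fNhits;   // number of valid elements in fEnergy (declared first)
    float* fEnergy;  //[fNhits] length of this array is given by fNhits
};
```

The point of the sketch is only the pairing of the counter and the
annotated pointer; the Set and TotalEnergy helpers are there to make it
self-contained.
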
You may also have cross-references, as in the case you describe.
See the example in the tutorial "jets.C".

To give you more information and hints, I would need to see
your object model (header file(s)).

Rene Brun


On Thu, 27 Mar 2003, Topher Cawlfield wrote:

> I'm a beginning root user, and want to make TTrees for my next analysis 
> project.  The project is a fairly simple one, so it's as good a time as 
> any to step away from ntuples.  But after succeeding in making a TTree 
> with branches holding only simple variables, I am now stumped at how to 
> do the next more complex thing.  I need to store events which have 
> *lists* of various (sparse) detector hits.
> 
> It is not obvious to me what the best way to do this is.  For each event 
> I need to make at least two lists (collections) of items that have a few 
> variables such as what cell was hit, and how much energy.  I have looked 
> at the example at http://root.cern.ch/root/html/examples/Event.C.html 
> but this approach seems unnecessarily complex for my needs.  The first 
> thing I tried to do was make a class with a simple variable array in it, 
> and add the class to my tree.  The member data was a pointer to the 
> array, and the number of elements.  rootcint complained that it needed 
> help with the class since it used pointers, and I could see how there's 
> no way it would know (for example) what variable held the array length.
> 
> Next I tried making a class that defined hit properties, and added a 
> TObjArray (to hold instances of the class) to a branch.  This compiled 
> after some effort, but gave me a segmentation fault upon execution.  The 
> stack trace shows the fault occurring when the branch is defined.
> 
> ...
>    TObjArray *ccaa = new TObjArray(50); // initial capacity of 50
> ...
>    tree->Branch("CCABitArray", "TObjArray", ccaa); // crashes here
> ...
> 
> I'm not sure what to do here.  The objects I'm storing in the array are 
> proper subclasses of TObject, but that doesn't matter because I get the 
> same crash whether or not the TObjArray is empty when defining the branch.
> 
> I now see that I can add variable-length arrays to a tree, and maybe 
> that's the way to go.  But I'm still curious what I did wrong with the 
> method above.  The other alternative is to put the TObjArray into 
> another class, as in the tutorial example.  But I don't see how this 
> will help the problem I ran into, and it doesn't really simplify the 
> project.
> 
> Any advice?
> 
>  - Topher Cawlfield
> 
> p.s.
>   I was honestly very disappointed with TTrees and how they handle data 
> like this.  Common tasks should be easy, and events containing 
> variable-length lists of things are the norm for any analysis in 
> particle physics.  The analysis package that I'm most familiar with is 
> an ancient, home-brew Fortran beast that at least had the advantage of 
> making certain common tasks easy.  Its "ntuple"-equivalent had a 
> two-level structure with "global" quantities and "combination" 
> quantities.  Typically global quantities were reserved for events, and 
> combination quantities for decay candidates.  Or, globals could be a 
> fit, and combinations might loop over fit parameters.  Another 
> successful use was to have globals refer to decay candidates and 
> combination quantities run over each track in the candidate.
> 
> I'm describing all this to make two observations.  First, this built-in 
> two-level structure was fantastically useful.  Second, it was also very 
> limiting and forced one to make clever, arbitrary decisions for any 
> analysis.  So naturally, when thinking about how I would reinvent an 
> analysis package if given the opportunity, I decided that a tree-like 
> structure would be best.  The trunk of the tree would contain quantities 
> corresponding to the "global" quantities of the old program, perhaps 
> describing general event properties such as the number of tracks, total 
> energy, etc.  Then any number of branches could be added and implicitly 
> looped over.  One branch could hold decay candidates, and a sub-branch 
> might hold track-by-track information.
> 
> So, when I began to see the term TTree in the root guide, I got *very* 
> excited and rejoiced that someone else was thinking the same way, and 
> finally got it "right!"  I gradually learned that my initial assumptions 
> were all wrong, and that TTrees are just a structure used to minimize 
> computing time and don't really make an analysis any easier.  And now 
> I'm learning that even a simple two-level event structure is a major 
> headache to implement, and will require special code just to histogram 
> simple variables (Draw won't work on elements of a variable-length list, 
> will it?).  What a let-down!
> 
> So I'm now trying to think constructively.  Maybe what's needed is a way 
> to have multiple trees, and be able to link "instances" (one "instance" 
> per call to Fill()) of one tree to instances of another in a one-to-many 
> fashion.  The nested loops that would normally be required would then be 
> handled automatically through recursion, or not be needed at all.  Well, 
> this is still a vague idea, but I thought I'd throw it out and see if 
> anyone else has been thinking along these lines or likes the idea. 
>  Imagine building TTrees where you have a general event information 
> tree, a track list tree, track combination tree, and decay candidate 
> tree, all interlinked.
> 


