Re: benchmarking TClonesArray vs STL vector

From: Rene Brun (Rene.Brun@cern.ch)
Date: Wed Dec 16 1998 - 14:59:19 MET


Hi Pasha,
Thanks for this very interesting report comparing STL with the Root
container TClonesArray. I have run your simple benchmark
on my Linux/egcs machine and also under Alpha/Unix with cxx6
and found exactly the same ratios.

When we started the Root project, we quickly realized the vital
importance of a container like TClonesArray. We believe that this
container class matches the frequent case of "physics objects"
quite well, in particular during the analysis phase.
Most of the time, we have to deal with a huge number of tiny
physics objects (hits, digits, tracks, etc.).

When designing an object model, one must be extremely careful
about the overheads (memory and management) induced by the large
number of these objects.
In a context where you generate/analyze many events, each with
a long list of tiny objects, you may spend a substantial fraction
of your time creating/deleting these objects.
After all, the object structure is so simple that it makes no sense
not to reuse the memory slots occupied by similar objects
in the previous event.

We implemented (well, Fons did most of the work) the TClonesArray
container exploiting as much as possible the standard and nice
C++ facility "new with placement". This saves an incredible amount
of overhead in allocating/deallocating memory.
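
To make this concrete, here is a minimal sketch of the usual idiom
(TMyHit is a hypothetical tiny class, not part of Root; in a real
application it would need a Root dictionary generated in the usual way):

   #include "TObject.h"
   #include "TClonesArray.h"

   // hypothetical tiny physics object
   class TMyHit : public TObject {
   public:
      Float_t fX, fY, fZ;
      TMyHit(Float_t x = 0, Float_t y = 0, Float_t z = 0) : fX(x), fY(y), fZ(z) {}
      ClassDef(TMyHit, 1)   // Root dictionary assumed for full I/O
   };

   void EventLoop(Int_t nevents, Int_t nhits)
   {
      TClonesArray hits("TMyHit", 1000);       // slots allocated once, then reused
      for (Int_t ev = 0; ev < nevents; ev++) {
         hits.Clear();                         // reset the array, keep the memory
         for (Int_t i = 0; i < nhits; i++) {
            // "new with placement": construct the hit directly in slot i,
            // so after the first event no memory is allocated or freed
            new (hits[i]) TMyHit(1., 2., 3.);
         }
         // ... analyze the event ...
      }
   }
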
In addition, we realized that TClonesArrays are also a nice and
simple replacement for ntuples. Splitting with TTrees
becomes possible and automatic.
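
For example (again only a sketch; the branch name, buffer size and
split level are illustrative, and TMyHit is the hypothetical class
from above):

   #include "TFile.h"
   #include "TTree.h"
   #include "TClonesArray.h"

   void WriteHits(Int_t nevents)
   {
      TFile file("hits.root", "RECREATE");
      TTree tree("T", "Events with a TClonesArray of hits");

      TClonesArray *hits = new TClonesArray("TMyHit", 1000);
      // split level > 0: each data member of TMyHit gets its own branch,
      // so the file can be browsed and read back like an ntuple
      tree.Branch("Hits", "TClonesArray", &hits, 32000, 2);

      for (Int_t ev = 0; ev < nevents; ev++) {
         hits->Clear();
         // fill the array with new((*hits)[i]) TMyHit(...) as above
         tree.Fill();
      }
      tree.Write();
      file.Close();
   }
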
I will add the following remark: your simple test is biased in favour
of STL, because all the objects created belong to the same container.
In a more realistic example, where several containers are filled
or accessed in parallel, I am convinced that TClonesArrays will
show even better performance.

We could maybe add a few additional, specialized containers
"a la TClonesArray". I am thinking of one in particular.
Take the frequent case where you have hits/tracks referencing each
other. Having pointers as data members of these tiny objects may
induce a big penalty in I/O (4 or 8 bytes per pointer). If
you always reference the same list, it may be smarter to store
only the index into that list, typically saving a factor of 4 or 8
in storage.
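
A sketch of what I have in mind (TMyTrack and TMyHit are again
hypothetical classes):

   #include "TObject.h"
   #include "TClonesArray.h"

   class TMyTrack : public TObject {
   public:
      Float_t fPt;
      TMyTrack(Float_t pt = 0) : fPt(pt) {}
      ClassDef(TMyTrack, 1)
   };

   class TMyHit : public TObject {
   public:
      // instead of "TMyTrack *fTrack;" (4 or 8 bytes, and a pointer has
      // to be resolved when reading back), store only the position of
      // the track in the event's track list
      Short_t fTrackIndex;
      TMyHit(Short_t itrk = -1) : fTrackIndex(itrk) {}
      ClassDef(TMyHit, 1)
   };

   // resolving the reference at analysis time:
   TMyTrack *GetTrackOfHit(const TMyHit *hit, const TClonesArray *tracks)
   {
      return (TMyTrack*) tracks->At(hit->fTrackIndex);
   }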

I now have a few comments concerning Chris Green's mail.
We do not play any special tricks in TClonesArray. We simply make
the best possible use of a standard (and not well known) C++ feature;
a minimal, Root-free sketch of it is shown below.
Chris mentions reliability problems with TClonesArray.
Could you forward to us any evidence/description of these problems?
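
For completeness, here is the facility itself, stripped of anything
Root-specific (nothing below is part of Root, it is plain C++):

   #include <new>       // declares the placement form of operator new

   struct Hit { float x, y, z; Hit(float a, float b, float c) : x(a), y(b), z(c) {} };

   int main()
   {
      void *slot = ::operator new(sizeof(Hit));  // raw memory, allocated once
      Hit *h1 = new (slot) Hit(1.f, 2.f, 3.f);   // construct in place, no allocation
      h1->~Hit();                                // destroy, but keep the memory
      Hit *h2 = new (slot) Hit(4.f, 5.f, 6.f);   // reuse the very same slot
      h2->~Hit();
      ::operator delete(slot);                   // release the memory once, at the end
      return 0;
   }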

Rene Brun

Pasha Murat wrote:
> 
> Hi Rooters,
> 
>         I finally found some time to benchmark ROOT TClonesArray against STL vectors.
> Briefly, the results: for HEP-like applications (looping over events) TClonesArray seems
> to be more efficient than STL vectors even if no sorting of the objects is involved.
> On SGI IRIX6.2 (R10000) the test code using TClonesArray runs about 50% faster than
> similar code using an STL vector of objects and about 3 times faster than code using an STL
> vector of pointers.
> 
>         The results on Linux are similar, those are preliminary though.
> 
> While satisfied with the result, I do not quite understand why TClonesArray performed
> faster than vector in this test. It is also clear that when sorting is involved the
> difference should become even more striking. Maybe there is something in my code
> I don't see?
> 
> See the description of the test conditions and the complete source code of the program
> executed (including the Makefile) at URL
> 
> http://www-cdf.fnal.gov/upgrades/computing/projects/run2mc/run2mc.html
> 
> (see the news from Dec 14; this includes a little bit of advertisement for the new CDF MC web page),
> 
> a direct pointer is:
> 
> http://www-cdf.fnal.gov/upgrades/computing/projects/run2mc/minutes/1998_12_10/TClonesArray_vs_vector.html
> 
>                         I appreciate any comments, regards, pasha.


