Re: benchmarking TClonesArray vs STL vector

From: Pasha Murat (murat@cdfsga.fnal.gov)
Date: Wed Dec 16 1998 - 04:58:31 MET


Chris Green writes:
 > On Tue, 15 Dec 1998, Pasha Murat wrote:
 > 
 > > 
 > > Hi Rooters,
 > > 
 > > 	I finally found some time to benchmark ROOT TClonesArray against
 > > STL vectors. Briefly the results: for HEP-like applications (loop over
 > > the events) TClonesArray seems to be more efficient than STL vectors
 > > even if no sorting of the objects is involved.  On SGI IRIX6.2
 > > (R10000) the test code using TClonesArray runs about 50% faster than
 > > similar code using STL vector of objects and about 3 times faster than
 > > the code using STL vector of pointers.
 > > 
 > > 	The results on Linux are similar, those are preliminary though.
 > > 
 > > While satisfied, I do not quite understand why TClonesArray performed
 > > faster than vector in this test. It is also clear that once sorting
 > > is involved the difference should become even more striking. Maybe
 > > there is something in my code I don't see?
 > 
 > Hi, Pasha.
 > 
 > I'm sure Rene will fill in the blanks and correct me when I'm wrong, but I
 > had a chat with an expert (Jim Kowalkowski) about this, and he pointed out
 > the following: the main reason TClonesArray is faster is because of the
 > fact that Rene has redone the memory allocation, so that it is done in
 > chunks rather than per object, and the memory is recycled. The facility
 > exists in STL, too -- all (I think) the containers allow you to specify an
 > allocator which is used to allocate the memory. The default is the one we
 > all know and love, but when extreme speed is needed for particular
 > applications, it is common to custom-write one's own allocator. It would
 > be interesting to see how an STL vector would perform in your test were
 > one to write an allocator using the same algorithm as the TClonesArray. To
 > first order, I'm guessing the two would be identical.
 > 
 > As you and I have discussed privately though, more worrying is the fact
 > that problems occur when using ROOT code which has been compiled
 > optimised: that puts a serious limitation on execution speed right from
 > the beginning, and militates against using ROOT as it is right now in
 > applications where speed is critical.
 > 
 > Cheers,
 > Chris.
 > 

	Hi Chris, all you're saying is true in the case of a vector of pointers,
which in a sense is "equivalent" to a TClonesArray (itself a vector of pointers
to TObjects) but, unlike TClonesArray, doesn't reuse memory.
If you take a look at the implementation of the STL vector of objects, you'll see
that it does reuse memory even if no custom allocator is supplied: once the
vector has grown, its storage is kept and refilled. Use of the specialized
constructor (new with placement) could be a more realistic explanation of the
difference in timing between the TClonesArray and the vector of objects.
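
To make the two points above concrete, here is a minimal sketch in C++.
It is not the benchmark code and not ROOT's implementation: Hit, HitPool
and vector_reuses_buffer are hypothetical names invented for this
illustration. The first function checks that a vector of objects keeps its
buffer across clear() and refill; the small pool shows the general idea of
allocating each slot once and rebuilding objects in it with placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical event object, standing in for whatever the benchmark stored.
struct Hit {
    double x, y, z;
    Hit(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
};

// Fill, clear and refill a vector of objects. Returns true if the second
// fill reused the same buffer (same address, same capacity).
bool vector_reuses_buffer(std::size_t n) {
    std::vector<Hit> hits;
    for (std::size_t i = 0; i < n; ++i) {
        double v = static_cast<double>(i);
        hits.push_back(Hit(v, v, v));
    }
    const Hit* first_buffer = hits.data();
    const std::size_t first_capacity = hits.capacity();

    hits.clear();  // destroys the elements but keeps the storage

    for (std::size_t i = 0; i < n; ++i) {
        double v = static_cast<double>(i);
        hits.push_back(Hit(v, v, v));
    }
    return hits.data() == first_buffer && hits.capacity() == first_capacity;
}

// Assumption: a toy pool in the spirit of "allocate once, construct in
// place". Raw memory for each slot is obtained once; later events only
// run the constructor on memory that already exists.
class HitPool {
public:
    HitPool() : used_(0), allocations_(0) {}
    ~HitPool() {
        for (std::size_t i = 0; i < slots_.size(); ++i) ::operator delete(slots_[i]);
    }
    HitPool(const HitPool&) = delete;
    HitPool& operator=(const HitPool&) = delete;

    Hit* construct(double x, double y, double z) {
        assert(used_ <= slots_.size());
        if (used_ == slots_.size()) {          // first time this slot is needed
            slots_.push_back(::operator new(sizeof(Hit)));
            ++allocations_;
        }
        return new (slots_[used_++]) Hit(x, y, z);  // placement new
    }

    // Hit is trivially destructible, so resetting the counter is enough here.
    void clear() { used_ = 0; }

    std::size_t allocations() const { return allocations_; }

private:
    std::vector<void*> slots_;
    std::size_t used_;
    std::size_t allocations_;
};
```

Looping over several "events" with pool.clear() at the start of each, the
allocation counter stops growing after the largest event has been seen,
which is the behaviour a vector of plain pointers with new/delete per
object does not give you.
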
	I think that from the practical point of view it is very important that
ROOT containers provide a lot of utilities, first of all a persistency
mechanism, which STL containers do not have (so the user has to put time,
brains and work into implementing one). In addition to that, ROOT container
classes also seem to be implemented quite efficiently (unless there is a flaw
in the benchmarking code).
	And of course, you're right in saying that there was probably something
we were missing that didn't allow us to compile ROOT 2.20/03 under Linux/KAI
with optimizations ON. I bet, however, that there is something basically wrong
with this version, because I failed to build it on any platform/compiler
configuration I tried, with whatever flags I used.
This certainly should be understood and fixed.
							-pasha.


