Re: Bugs in tree I/O code

From: Rene Brun (Rene.Brun@cern.ch)
Date: Mon Oct 04 1999 - 09:23:25 MEST


Hi Fred,
Thanks for reporting these problems with the TTree reading functions.
I have introduced your suggested changes in the development version.
I would be very interested to know in which situation you are getting
the problem with mmzip. I left a warning message in the code for when
this situation happens. Did you get this warning?

Concerning your second point, Root has a default internal buffer of 1000
words to store the pointers to the entries inside the branch buffers.
This is not the buffer size used to read from the file.
I agree that the piece of code managing the TTree reading is complex.
To understand the logic, one needs the picture of the TTree data
structures; see http://root.cern.ch/root/html/TTree.html
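
As an example, the buffer size actually used for the file I/O is the one
you give as the last argument of TTree::Branch. The small macro below
(names and numbers are only an illustration) shows where that number
enters; it is independent of the internal table of entry pointers:

    #include "TFile.h"
    #include "TTree.h"

    void makeTree()
    {
       TFile f("example.root", "RECREATE");
       TTree tree("T", "demo tree");

       Float_t x;
       // 32000 bytes is the basket buffer for this branch, i.e. the
       // unit written to and read from the file.
       tree.Branch("x", &x, "x/F", 32000);

       for (Int_t i = 0; i < 10000; i++) { x = i; tree.Fill(); }
       tree.Write();
    }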

Rene Brun

Fred Gray wrote:
> 
> I would like to bring to everyone's attention a few bugs that we have found
> in ROOT that relate to tree I/O.  For appropriate combinations of data
> structures and buffer sizes, they can cause ROOT to crash with a segmentation
> violation or even to return incorrect data.  I have filed bug reports in
> the bug tracking system for them (PR#508 and 509), but I think that they
> are of sufficient urgency that they should be brought to the attention of
> the entire ROOT community (so that no one else has to spend days upon
> nights upon days hunting for them).
> 
> First: Before a buffer is written to a tree in compressed mode, a check is
> made to ensure that the compression actually caused the size of the buffer
> to decrease.  The case where the compressed size exactly equals the
> uncompressed size is handled incorrectly, though: the compressed version
> is written, but it is not decompressed when it is read back in.
> 
> The patch for this one is to change the ">" to ">=" on line 374 of
> TREE_Basket.cxx:
> 
>     372       mmzip(cxlevel, &fObjlen, objbuf, &bufmax, &fBuffer[fKeylen], &nout);
>     373 //printf("mmzip:%s, bufmax=%d, fObjlen=%d, fKeylen=%d, nout=%d\n", GetName(),bufmax,fObjlen,fKeylen,nout);
>     374       if (nout > fObjlen) {
>     375          delete [] fBuffer;
>     376          fBuffer = fBufferRef->Buffer();
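> 
> Just to illustrate why the equality case goes wrong, here is a tiny
> standalone sketch (not the actual ROOT code; the names only mimic the
> TBasket members, and the read-side test reflects my understanding of it):
> 
>     #include <cstdio>
> 
>     int main()
>     {
>        int fObjlen = 1000;  // uncompressed object length
>        int nout    = 1000;  // compressed length: exactly equal
> 
>        // Write side: with the original "nout > fObjlen" test, an
>        // equal-sized compressed buffer is NOT discarded, so the
>        // compressed bytes go to the file.
>        bool wroteCompressed = !(nout > fObjlen);   // true
> 
>        // Read side: decompression is triggered only when the stored
>        // payload is smaller than the object length, which fails when
>        // the two sizes are equal.
>        int  storedBytes    = wroteCompressed ? nout : fObjlen;
>        bool willDecompress = storedBytes < fObjlen;  // false
> 
>        if (wroteCompressed && !willDecompress)
>           printf("mismatch: compressed data written, never inflated\n");
> 
>        // Changing ">" to ">=" makes the write side fall back to the
>        // uncompressed buffer in the equal case, so both sides agree.
>        return 0;
>     }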
> 
> Second: Buffers seem to be read back from a tree in units of 1000
> bytes.  (Why 1000? Wouldn't a multiple of 512 be more efficient?)
> After reading that amount, a check is made to determine whether the
> entire buffer was contained in those 1000 bytes.  If not, the rest is
> read in at some point.  Unfortunately, this check only takes into
> account the length of the object itself, not the length of the key
> that is written along with the object.
> 
> The patch for this issue is to change line 440 in the following code:
> 
>     440    if (basket->GetObjlen() <= nb) {
>     441       buffer->SetBufferOffset(0);
>     442       basket->ReadBasketBuffers(buf);
>     443    }
> 
> to be:
> 
>     440    if (basket->GetObjlen() + basket->GetKeylen() <= nb) {
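> 
> To see why the key length matters, here is another tiny standalone
> sketch (not ROOT code; the numbers are invented): the record sitting
> in the buffer is key + object, so comparing only the object length
> against the bytes already read can wrongly conclude that nothing more
> needs to be read:
> 
>     #include <cstdio>
> 
>     int main()
>     {
>        int nb     = 1000;  // bytes fetched by the first read
>        int keylen = 60;    // made-up key length written before the object
>        int objlen = 980;   // object payload length
> 
>        // Original test: the object alone fits in nb, so the remainder
>        // is never read, even though key + object = 1040 > 1000.
>        bool originalComplete = (objlen <= nb);           // true  (wrong)
>        bool patchedComplete  = (objlen + keylen <= nb);  // false (correct)
> 
>        printf("original: %d  patched: %d\n",
>               originalComplete, patchedComplete);
>        return 0;
>     }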
> 
> With these bugs finally out of the way, we can move ahead with our data
> production, reasonably confident that we will be able to read back the
> data we produce.
> 
> I will also point out that some additional documentation of how
> this level of the system works would be very useful to anyone trying
> to make sense of it.  The code seems to have evolved through ad hoc
> fixes to the point that it is very obscure.
> 
> Thanks to the ROOT team for putting so much effort into supporting this
> tool.  Even at its worst (like this), it's still more fun than
> ZEBRA/HBOOK/ADAMO/etc.
> 
> -- Fred


