Hi Fred, Thanks for reporting these problems with the TTree reading functions. I have introduced your suggested changes in the development version. I would be very interested to know in which situation you are getting the problem with mmzip. I left a warning message in the code when this situation happens. Did you get this warning ? Concerning your second point, Root has a default internal buffer of 1000 words to store the pointers to entries into the branch buffers. This is not the buffer size used to read from the file. I agree that the piece of code managing the TTree reading is complex. To understand the logic, one must have the picture describing the TTree data structures. See http://root.cern.ch/root/html/TTree.html Rene Brun Fred Gray wrote: > > I would like to bring to everyone's attention a few bugs that we have found > in ROOT that relate to tree I/O. For appropriate combinations of data > structures and buffer sizes, they can cause ROOT to crash with a segmentation > violation or even to return incorrect data. I have filed bug reports in > the bug tracking system for them (PR#508 and 509), but I think that they > are of sufficient urgency that they should be brought to the attention of > the entire ROOT community (so that no one else has to spend days upon > nights upon days hunting for them). > > First: Before a buffer is written to a tree in compressed mode, a check is > made to ensure that the compression actually caused the size of the buffer > to decrease. The case where the compressed size exactly equals the > uncompressed size is handled incorrectly, though: the compressed version > is written, but it is not decompressed when it is read back in. > > The patch for this one is to change the ">" to ">=" on line 374 of > TREE_Basket.cxx: > > 372 mmzip(cxlevel, &fObjlen, objbuf, &bufmax, &fBuffer[fKeylen], > &nout); > 373 //printf("mmzip:%s, bufmax=%d, fObjlen=%d, fKeylen=%d, nout=%d\n", > GetName(),bufmax,fObjlen,fKeylen,nout); > 374 if (nout > fObjlen) { > 375 delete [] fBuffer; > 376 fBuffer = fBufferRef->Buffer(); > > Second: Buffers seem to be read back from a tree in units of 1000 > bytes. (Why 1000? Wouldn't a multiple of 512 be more efficient?) > After reading that amount, a check is made to determine whether the > entire buffer was contained in those 1000 bytes. If not, the rest is > read in at some point. Unfortunately, this check only takes into > account the length of the object itself, not the length of the key > that is written along with the object. > > The patch for this issue is to change line 440 in the following code: > > 440 if (basket->GetObjlen() <= nb) { > 441 buffer->SetBufferOffset(0); > 442 basket->ReadBasketBuffers(buf); > 443 } > > to be: > > 440 if (basket->GetObjlen() + basket->GetKeylen() <= nb) { > > With these bugs now finally out of the way, we can move along with > our data production with some confidence that we will be able to read > the data that we produce. > > I will also point out that some additional documentation of how > this level of the system works would be very useful to anyone trying > to make sense of it. The code seems to have evolved through ad hoc > fixes to the point that it is very obscure. > > Thanks to the ROOT team for putting so much effort into supporting this > tool. Even at its worst (like this), it's still more fun than > ZEBRA/HBOOK/ADAMO/etc. > > -- Fred
This archive was generated by hypermail 2b29 : Tue Jan 04 2000 - 00:43:40 MET