Hi,

I've recently come across a strange thing that happens when reading back a TTree from a file. The TTree contains one top-level branch of a class Foo, which contains a TObjArray. The TObjArray may contain any number of objects of classes Foo, Bar, Baz, Qux, ...

When things go wrong, it's usually flagged with

  index -20432 out of bounds (size: 13, ....

and then a segmentation violation causing an abort and core dump.

So I did some serious debugging. It turns out the out-of-bounds error comes from "Int_t TClass::ReadBuffer(TBuffer &b, void *pointer)":

  UInt_t R__s, R__c;
  Version_t version = b.ReadVersion(&R__s, &R__c);
  if (gFile && gFile->GetVersion() < 30000) version = -1;
  TStreamerInfo *sinfo = (TStreamerInfo*)fStreamerInfo->At(version);
                                                           ^^^^^^^

where version is some absurd number. Further debugging showed that the problem actually popped up in "Version_t TBuffer::ReadVersion(UInt_t *startpos, UInt_t *bcnt)":

  union {
     UInt_t     cnt;
     Version_t  vers[2];
  } v;
  *this >> v.vers[1];
  *this >> v.vers[0];

Here, v.cnt is supposed to be a masked byte count, but instead it is 0, or at least some value without the mask bit set, so that in the next instruction

  if (!(v.cnt & kByteCountMask)) {
     fBufCur -= sizeof(UInt_t);
     v.cnt = 0;
  }

the conditional is true, and so the buffer backs up "sizeof(UInt_t)" = 4 bytes, and then goes on. The next thing is:

  *bcnt = (v.cnt & ~kByteCountMask);
  *this >> version;

So the TClass gets the wrong byte count and a weird version number, since the buffer was reading the wrong stuff! Later on, this is also what causes the segmentation violation. So the whole thing is messed up because of this short (or is it long?) read.

So I played around a bit and tried different things. It turns out that certain combinations of the buffer size and split level make the thing happen.
I did an investigation, and here's what I found:

  split |                 buffer size
  level |   100   2000   4000   6400  16000  32000
  ------+-----------------------------------------
      0 |   n/a    n/a     ok    bad    n/a    n/a
      1 |    ok     ok    bad    bad    bad    bad
      2 |    ok    n/a    n/a     ok     ok     ok
     99 |    ok    n/a    n/a     ok     ok     ok

"n/a" means I was lazy and didn't do the test. "ok" means I could read the tree back fine. "bad" means it failed as outlined above.

OK, so the numbers above only really make sense if you have the full class specs and are running the thing on a machine like mine.

This seems odd to me. Does anyone have a good explanation? Is this really the intended behaviour (presuming that I'm not doing something wrong, which I don't think I am)?

My machine is a Pentium III, 733 MHz, 256 MB RAM + 1 GB swap, Redhat 6.2, ROOT 3.01/06 (CVS head a week ago).

Yours,

Christian Holm Christensen -------------------------------------------
Address: Sankt Hansgade 23, 1. th.    Phone:  (+45) 35 35 96 91
         DK-2200 Copenhagen N         Cell:   (+45) 28 82 16 23
         Denmark                      Office: (+45) 353 25 305
Email:   cholm@nbi.dk                 Web:    www.nbi.dk/~cholm