Hi Rene,

Thanks for your offer to look at the file. I've put a few examples of
corrupt data files, along with a description of the problem observed, at:

http://www.hep.umn.edu/~schubert/badfiles/

I will also try to pull together a subset of the relevant classes and send
it to you tomorrow.

-Sue

Rene Brun wrote:

> Hi Susan,
>
> As you have guessed correctly, the message "Error in <TObjArray::At.."
> is generated when Root reads what should be a class version number
> from the buffer. This happens typically when the file has been overwritten.
> The system, in principle, should be able to recover. In fact, I see that
> you get the correct error message
> " Warning in <TBuffer::CheckByteCount>: PlexPlaneId::Streamer() not in
> sync with data on file, fix Streamer"
> It is likely that some data member of one of your classes (e.g. a pointer)
> will not contain valid data after this error. Trying to access it and follow it
> will generate a segmentation fault.
>
> I can have a look at your file, if you want. Could you also send a subset
> of your classes?
>
> Rene Brun
>
>>Hi roottalk,
>>We are experiencing a problem with partially corrupt root data files. These
>>files have been produced on a reconstruction production farm, and the source of the
>>corruption is not clear yet and is being investigated.
>> We believe that only a small subset of the entries on
>>any one tree are affected by the data corruption, and I would like to be able
>>to recognize when a corrupt record has been read into memory, warn the
>>user, and then move to the next record in the tree so that all subsequent unaffected records
>>can be processed. I'm wondering how this can be done.
>> An example of what happens in a case where TTree::GetEntry() attempts
>>to read a corrupt data record:
>>
>>Error in <TObjArray::At>: index 66 out of bounds (size: 16, this: 0x0921aad0)
>>Error in <TObjArray::At>: index 3072 out of bounds (size: 15, this: 0x09239978)
>>Error in <TObjArray::At>: index 852 out of bounds (size: 15, this: 0x0923a740)
>>Error in <TObjArray::At>: index -11069 out of bounds (size: 68, this: 0x0921aad0)
>>Error in <TObjArray::AddAt>: out of bounds at -11069 in 921aad0
>>Error in <TBuffer::CheckByteCount>: object of class PlexPlaneId read too many bytes: 6 instead of -1879030366
>>Warning in <TBuffer::CheckByteCount>: PlexPlaneId::Streamer() not in sync with data on file, fix Streamer()
>>Segmentation fault (core dumped)
>>
>>The first sign of a problem, the TObjArray::At error messages, are produced when the
>>TClass::ReadBuffer method reads a version number from the corrupt data buffer that is
>>ridiculous for the class and uses that corrupt version number to access the TObjArray
>>containing the StreamerInfo's. The subsequent segv occurs deep within root and the
>>GetEntry method never returns to the user.
>>Other corrupt data records produce different symptoms, but the segv is usually
>>preceded by some error messages from root.
>> I thought I could perhaps use an error handler to catch the errors and abort the
>>read of the current entry without aborting the job, which would allow the user to continue
>>processing entries. Unfortunately, I'm really a novice at using error handlers, and although I see that
>>I can override root's ErrorHandler default function using TError's SetErrorHandler
>>method, I don't see how to write the function so that it resurfaces at the place in my
>>code just after the TTree::GetEntry() method is invoked. Perhaps this is a bad idea anyway,
>>since it may leave some unfinished business in TTree::GetEntry?
>> Can anyone suggest a solution to skip past these corrupt records?
>>Of course, our
>>first priority is to fix the cause of the data corruption and all data will eventually be
>>reprocessed, but this will be a stopgap measure to allow the user to look at the data
>>in the meantime.
>>Thanks for your help,
>>-Sue
>>p.s. I'm using root cvs as of this past Sunday and gcc 3.2 on rh linux to read the data files.
>>The data files were produced with an older version of root.
>>
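
The error-handler idea raised in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions, not ROOT code and not a tested fix for these files. FakeTree, InstallHandler, HandlerFunc, CountingHandler and ReadGoodEntries are hypothetical names invented for the example. The handler signature and the level values are modeled on ROOT's ErrorHandlerFunc_t and kWarning/kError so that the pattern could later be tried with SetErrorHandler. The handler does not try to jump out of the read. It only counts errors, and the loop checks the count once the read has returned.

```cpp
#include <cstdio>
#include <vector>

// --- Stand-ins (assumptions for illustration, NOT ROOT code) ---------
// Severity levels, chosen to match the values of ROOT's kWarning/kError.
const int kWarningLevel = 2000;
const int kErrorLevel   = 3000;

// Same shape as ROOT's ErrorHandlerFunc_t, with bool in place of Bool_t.
typedef void (*HandlerFunc)(int level, bool abort,
                            const char* location, const char* msg);

static HandlerFunc gHandler = nullptr;  // currently installed handler

// Mimics SetErrorHandler(): installs a handler, returns the previous one.
HandlerFunc InstallHandler(HandlerFunc h) {
    HandlerFunc old = gHandler;
    gHandler = h;
    return old;
}

// Hypothetical reader: entries listed in `corrupt` are reported as
// errors through the installed handler, then the read returns normally.
struct FakeTree {
    int nEntries;
    std::vector<int> corrupt;

    int GetEntry(int i) const {
        for (int c : corrupt) {
            if (c == i && gHandler) {
                gHandler(kErrorLevel, false, "FakeTree::GetEntry",
                         "version number out of range");
            }
        }
        return 1;
    }
};

// --- The pattern itself ----------------------------------------------
static int gErrorCount = 0;  // errors seen since the last reset

// Counts messages at error level or above and echoes them to stderr.
void CountingHandler(int level, bool /*abort*/,
                     const char* location, const char* msg) {
    if (level >= kErrorLevel) ++gErrorCount;
    std::fprintf(stderr, "[%d] %s: %s\n", level, location, msg);
}

// Reads every entry and returns the indices that produced no error.
std::vector<int> ReadGoodEntries(const FakeTree& tree) {
    HandlerFunc previous = InstallHandler(CountingHandler);
    std::vector<int> good;
    for (int i = 0; i < tree.nEntries; ++i) {
        gErrorCount = 0;           // reset before each read
        tree.GetEntry(i);
        if (gErrorCount > 0) {     // the read complained: skip this entry
            std::fprintf(stderr, "skipping entry %d\n", i);
            continue;
        }
        good.push_back(i);
    }
    InstallHandler(previous);      // restore the earlier handler
    return good;
}
```

For example, ReadGoodEntries(FakeTree{6, {2, 4}}) skips entries 2 and 4 and keeps the other four. Note the limit of this approach: it only helps for records where the read reports errors and still returns. A segmentation fault inside the read, as in the log above, never reaches the check after GetEntry, so it would have to be dealt with some other way.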
This archive was generated by hypermail 2b29 : Thu Jan 01 2004 - 17:50:09 MET