Re: Follow up on the question about TBasket::ReadBasketBuffers

From: Constantin Loizides <cloizides_at_lbl.gov>
Date: Fri, 4 Feb 2011 13:51:13 +0100


Hi Philippe,
I implemented the changes to R__unzip we discussed yesterday. I tested it on a bit of ALICE data, but please check it also carefully. Feel free to improve/modify.
Once you apply it to trunk, could you also put it into the 5.28.00 patches branch please.
Thanks,
Constantin

On 02/03/2011 06:11 PM, Constantin Loizides wrote:
> Hi Philippe,
> it is quite funny you ask me, but indeed I have spent about a year ago
> a few days incorporating a few zipping algorithms, so I do know this
> part quite well. It is probably best to discuss on skype.
>
> In short, I do not think that the R__unzip return code right
> now is the best way to report errors upstream, but it has been
> like that since I know it.
> Constantin
>
> On 02/03/2011 06:09 PM, Philippe Canal wrote:
>> Hi Constantin,
>>
>> This seems to work as you expect. However this also seems very specific
>> to the zipping algorithm
>> being used (i.e. this test might fail as soon as we try to switch to an
>> alternative zipping algorithm!).
>>
>> I.e. Is there a better way of abstracting this out? Is there any
>> documentation explaining why those
>> are 'valid' values?
>>
>> Thanks,
>> Philippe.
>>
>> On 2/3/11 4:14 AM, Constantin Loizides wrote:
>>> Hi Philippe,
>>> we ran over some of our broken files with your patch (simply ported to
>>> 5.27.06.patches for now),
>>> and can confirm that it now allows to skip offending events (or even
>>> to skip offending files)
>>> in the upstream code
>>>
>>> However, we also have cases, where "R__unzip" reports and error in the
>>> ROOT
>>> header before the actual zlib envelope. In the cases that are
>>> disturbing to
>>> us an inconsistency at an earlier stage, not easily detectable by the
>>> code,
>>> lead to allocation of large amount of memory, when fObjlen or fKeylen
>>> are corrupted. (ie lines "if ((fObjlen+fKeylen) > fCompressedSize)").
>>> A few calls later R__unzip will figure out that it can not find the
>>> header,
>>> but then one may already have allocated large junks of vmem.
>>> In my test cases something like
>>>
>>>
>>> + // early consisteny check:
>>> + if (1) {
>>> + UChar_t *tbuf = (UChar_t *)&buffer[fKeylen];
>>> + if ((tbuf[0] != 'C' && tbuf[0] != 'Z') ||
>>> + (tbuf[1] != 'S' && tbuf[1] != 'L') ||
>>> + tbuf[2] != 8) {
>>> + fprintf(stderr,"early abort: there will be an error reported by
>>> R__unzip\n");
>>> + badread = 1;
>>> + return badread;
>>> + }
>>> + }
>>>
>>> before the allocation of the buffer helps to not even allocate the
>>> memory. Again I do not know
>>> enough about the subsequent assumptions to know whether such an early
>>> abort is
>>> possible, but it would be great if you could try to see what you can do?
>>> Please let me know,
>>> Thanks,
>>> Constantin
>>>
>>>
>>> On 02/02/2011 09:05 PM, Philippe Canal wrote:
>>>> Yes, it is also there.
>>>> Cheers,
>>>> Philippe.
>>>>
>>>> On 2/2/11 1:32 PM, Constantin Loizides wrote:
>>>>> Hi Philippe,
>>>>> thanks for the fix. I assume this will go into
>>>>> the v5-28-00 patches as well?
>>>>> Constantin
>>>>>
>>>>> On 02/02/2011 08:13 PM, Philippe Canal wrote:
>>>>>> Hi Constantin,
>>>>>>
>>>>>> Yes this is indeed a problem (but the return shown below is too
>>>>>> early).
>>>>>> The problem is fixed by revision 37947 of the trunk.
>>>>>>
>>>>>> Thanks,
>>>>>> Philippe.
>>>>>>
>>>>>> On 1/26/11 3:39 AM, Constantin Loizides wrote:
>>>>>>> Dear experts,
>>>>>>> I have a question regarding the error handling
>>>>>>> in TBasket line 436. In the original version
>>>>>>> an unzipping error is recognized, badread is set
>>>>>>> to one, but then the code continues. I am not
>>>>>>> sure if this is intended, but I could imagine it
>>>>>>> is trying to repair the bad read. However, very
>>>>>>> often (or even always?) the code the crashes
>>>>>>> later in the "AfterBuffer" section, ie in
>>>>>>> fBufferRef->ReadArray(fEntryOffset);
>>>>>>> (at least in the cases I looked at).
>>>>>>>
>>>>>>> Now in my case, if I return immediately the bad return,
>>>>>>> I can nicely detect the bad read and decide to
>>>>>>> abort the current event, or even reading the current
>>>>>>> event.
>>>>>>>
>>>>>>> My question is: What is the intended behavior and
>>>>>>> would there be a way to make this method a bit more
>>>>>>> robust so that jobs do not crash when the unzip is
>>>>>>> detected?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Constantin
>>>>>>>
>>>>>>>
>>>>>>> Index: tree/tree/src/TBasket.cxx
>>>>>>> ===================================================================
>>>>>>> --- tree/tree/src/TBasket.cxx (revision 36525)
>>>>>>> +++ tree/tree/src/TBasket.cxx (working copy)
>>>>>>> @@ -421,6 +421,7 @@
>>>>>>> if (noutot != fObjlen) {
>>>>>>> Error("ReadBasketBuffers", "fNbytes = %d, fKeylen = %d, fObjlen =
>>>>>>> %d,
>>>>>>> noutot = %d, nout=%d, nin=%d, nbuf=%d", fNbytes,fKeylen,fObjlen,
>>>>>>> noutot,nout,nin,nbuf);
>>>>>>> badread = 1;
>>>>>>> + return badread;
>>>>>>> }
>>>>>>> // Switch the 2 buffers
>>>>>>> char *temp = fBufferRef->Buffer();
>>>>>>>
>>>>>
>>>>

Received on Fri Feb 04 2011 - 13:51:25 CET

This archive was generated by hypermail 2.2.0 : Mon Feb 07 2011 - 11:50:01 CET