Re: Follow up on the question about TBasket::ReadBasketBuffers

From: Philippe Canal <pcanal_at_fnal.gov>
Date: Fri, 4 Feb 2011 15:49:34 -0600


Hi Constantin,

Thanks for the patch. It is part of the trunk as revision 37985 and part of the patch branch for v5.28/00

Cheers,
Philippe.

On 2/4/11 6:51 AM, Constantin Loizides wrote:
> Hi Philippe,
> I implemented the changes to R__unzip we discussed yesterday.
> I tested it on a bit of ALICE data, but please check it also
> carefully. Feel free to improve/modify.
> Once you apply it to trunk, could you also put it into the
> 5.28.00 patches branch please.
> Thanks,
> Constantin
>
> On 02/03/2011 06:11 PM, Constantin Loizides wrote:
>> Hi Philippe,
>> it is quite funny you ask me, but indeed I have spent about a year ago
>> a few days incorporating a few zipping algorithms, so I do know this
>> part quite well. It is probably best to discuss on skype.
>>
>> In short, I do not think that the R__unzip return code right
>> now is the best way to report errors upstream, but it has been
>> like that since I know it.
>> Constantin
>>
>> On 02/03/2011 06:09 PM, Philippe Canal wrote:
>>> Hi Constantin,
>>>
>>> This seems to work as you expect. However this also seems very specific
>>> to the zipping algorithm
>>> being used (i.e. this test might fail as soon as we try to switch to an
>>> alternative zipping algorithm!).
>>>
>>> I.e. Is there a better way of abstracting this out? Is there any
>>> documentation explaining why those
>>> are 'valid' values?
>>>
>>> Thanks,
>>> Philippe.
>>>
>>> On 2/3/11 4:14 AM, Constantin Loizides wrote:
>>>> Hi Philippe,
>>>> we ran over some of our broken files with your patch (simply ported to
>>>> 5.27.06.patches for now),
>>>> and can confirm that it now allows to skip offending events (or even
>>>> to skip offending files)
>>>> in the upstream code
>>>>
>>>> However, we also have cases, where "R__unzip" reports and error in the
>>>> ROOT
>>>> header before the actual zlib envelope. In the cases that are
>>>> disturbing to
>>>> us an inconsistency at an earlier stage, not easily detectable by the
>>>> code,
>>>> lead to allocation of large amount of memory, when fObjlen or fKeylen
>>>> are corrupted. (ie lines "if ((fObjlen+fKeylen) > fCompressedSize)").
>>>> A few calls later R__unzip will figure out that it can not find the
>>>> header,
>>>> but then one may already have allocated large junks of vmem.
>>>> In my test cases something like
>>>>
>>>>
>>>> + // early consisteny check:
>>>> + if (1) {
>>>> + UChar_t *tbuf = (UChar_t *)&buffer[fKeylen];
>>>> + if ((tbuf[0] != 'C' && tbuf[0] != 'Z') ||
>>>> + (tbuf[1] != 'S' && tbuf[1] != 'L') ||
>>>> + tbuf[2] != 8) {
>>>> + fprintf(stderr,"early abort: there will be an error reported by
>>>> R__unzip\n");
>>>> + badread = 1;
>>>> + return badread;
>>>> + }
>>>> + }
>>>>
>>>> before the allocation of the buffer helps to not even allocate the
>>>> memory. Again I do not know
>>>> enough about the subsequent assumptions to know whether such an early
>>>> abort is
>>>> possible, but it would be great if you could try to see what you can do?
>>>> Please let me know,
>>>> Thanks,
>>>> Constantin
>>>>
>>>>
>>>> On 02/02/2011 09:05 PM, Philippe Canal wrote:
>>>>> Yes, it is also there.
>>>>> Cheers,
>>>>> Philippe.
>>>>>
>>>>> On 2/2/11 1:32 PM, Constantin Loizides wrote:
>>>>>> Hi Philippe,
>>>>>> thanks for the fix. I assume this will go into
>>>>>> the v5-28-00 patches as well?
>>>>>> Constantin
>>>>>>
>>>>>> On 02/02/2011 08:13 PM, Philippe Canal wrote:
>>>>>>> Hi Constantin,
>>>>>>>
>>>>>>> Yes this is indeed a problem (but the return shown below is too
>>>>>>> early).
>>>>>>> The problem is fixed by revision 37947 of the trunk.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Philippe.
>>>>>>>
>>>>>>> On 1/26/11 3:39 AM, Constantin Loizides wrote:
>>>>>>>> Dear experts,
>>>>>>>> I have a question regarding the error handling
>>>>>>>> in TBasket line 436. In the original version
>>>>>>>> an unzipping error is recognized, badread is set
>>>>>>>> to one, but then the code continues. I am not
>>>>>>>> sure if this is intended, but I could imagine it
>>>>>>>> is trying to repair the bad read. However, very
>>>>>>>> often (or even always?) the code the crashes
>>>>>>>> later in the "AfterBuffer" section, ie in
>>>>>>>> fBufferRef->ReadArray(fEntryOffset);
>>>>>>>> (at least in the cases I looked at).
>>>>>>>>
>>>>>>>> Now in my case, if I return immediately the bad return,
>>>>>>>> I can nicely detect the bad read and decide to
>>>>>>>> abort the current event, or even reading the current
>>>>>>>> event.
>>>>>>>>
>>>>>>>> My question is: What is the intended behavior and
>>>>>>>> would there be a way to make this method a bit more
>>>>>>>> robust so that jobs do not crash when the unzip is
>>>>>>>> detected?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Constantin
>>>>>>>>
>>>>>>>>
>>>>>>>> Index: tree/tree/src/TBasket.cxx
>>>>>>>> ===================================================================
>>>>>>>> --- tree/tree/src/TBasket.cxx (revision 36525)
>>>>>>>> +++ tree/tree/src/TBasket.cxx (working copy)
>>>>>>>> @@ -421,6 +421,7 @@
>>>>>>>> if (noutot != fObjlen) {
>>>>>>>> Error("ReadBasketBuffers", "fNbytes = %d, fKeylen = %d, fObjlen =
>>>>>>>> %d,
>>>>>>>> noutot = %d, nout=%d, nin=%d, nbuf=%d", fNbytes,fKeylen,fObjlen,
>>>>>>>> noutot,nout,nin,nbuf);
>>>>>>>> badread = 1;
>>>>>>>> + return badread;
>>>>>>>> }
>>>>>>>> // Switch the 2 buffers
>>>>>>>> char *temp = fBufferRef->Buffer();
>>>>>>>>
>>>>>>
>>>>>
Received on Fri Feb 04 2011 - 22:49:51 CET

This archive was generated by hypermail 2.2.0 : Mon Feb 07 2011 - 11:50:01 CET