Re: Follow up on the question about TBasket::ReadBasketBuffers

From: Constantin Loizides <cloizides_at_lbl.gov>
Date: Thu, 3 Feb 2011 18:11:45 +0100


Hi Philippe,
it is quite funny you ask me, but indeed I have spent about a year ago a few days incorporating a few zipping algorithms, so I do know this part quite well. It is probably best to discuss on skype.

In short, I do not think that the R__unzip return code right now is the best way to report errors upstream, but it has been like that since I know it.
Constantin

On 02/03/2011 06:09 PM, Philippe Canal wrote:
> Hi Constantin,
>
> This seems to work as you expect. However this also seems very specific
> to the zipping algorithm
> being used (i.e. this test might fail as soon as we try to switch to an
> alternative zipping algorithm!).
>
> I.e. Is there a better way of abstracting this out? Is there any
> documentation explaining why those
> are 'valid' values?
>
> Thanks,
> Philippe.
>
> On 2/3/11 4:14 AM, Constantin Loizides wrote:
>> Hi Philippe,
>> we ran over some of our broken files with your patch (simply ported to
>> 5.27.06.patches for now),
>> and can confirm that it now allows to skip offending events (or even
>> to skip offending files)
>> in the upstream code
>>
>> However, we also have cases, where "R__unzip" reports and error in the
>> ROOT
>> header before the actual zlib envelope. In the cases that are
>> disturbing to
>> us an inconsistency at an earlier stage, not easily detectable by the
>> code,
>> lead to allocation of large amount of memory, when fObjlen or fKeylen
>> are corrupted. (ie lines "if ((fObjlen+fKeylen) > fCompressedSize)").
>> A few calls later R__unzip will figure out that it can not find the
>> header,
>> but then one may already have allocated large junks of vmem.
>> In my test cases something like
>>
>>
>> + // early consisteny check:
>> + if (1) {
>> + UChar_t *tbuf = (UChar_t *)&buffer[fKeylen];
>> + if ((tbuf[0] != 'C' && tbuf[0] != 'Z') ||
>> + (tbuf[1] != 'S' && tbuf[1] != 'L') ||
>> + tbuf[2] != 8) {
>> + fprintf(stderr,"early abort: there will be an error reported by
>> R__unzip\n");
>> + badread = 1;
>> + return badread;
>> + }
>> + }
>>
>> before the allocation of the buffer helps to not even allocate the
>> memory. Again I do not know
>> enough about the subsequent assumptions to know whether such an early
>> abort is
>> possible, but it would be great if you could try to see what you can do?
>> Please let me know,
>> Thanks,
>> Constantin
>>
>>
>> On 02/02/2011 09:05 PM, Philippe Canal wrote:
>>> Yes, it is also there.
>>> Cheers,
>>> Philippe.
>>>
>>> On 2/2/11 1:32 PM, Constantin Loizides wrote:
>>>> Hi Philippe,
>>>> thanks for the fix. I assume this will go into
>>>> the v5-28-00 patches as well?
>>>> Constantin
>>>>
>>>> On 02/02/2011 08:13 PM, Philippe Canal wrote:
>>>>> Hi Constantin,
>>>>>
>>>>> Yes this is indeed a problem (but the return shown below is too
>>>>> early).
>>>>> The problem is fixed by revision 37947 of the trunk.
>>>>>
>>>>> Thanks,
>>>>> Philippe.
>>>>>
>>>>> On 1/26/11 3:39 AM, Constantin Loizides wrote:
>>>>>> Dear experts,
>>>>>> I have a question regarding the error handling
>>>>>> in TBasket line 436. In the original version
>>>>>> an unzipping error is recognized, badread is set
>>>>>> to one, but then the code continues. I am not
>>>>>> sure if this is intended, but I could imagine it
>>>>>> is trying to repair the bad read. However, very
>>>>>> often (or even always?) the code the crashes
>>>>>> later in the "AfterBuffer" section, ie in
>>>>>> fBufferRef->ReadArray(fEntryOffset);
>>>>>> (at least in the cases I looked at).
>>>>>>
>>>>>> Now in my case, if I return immediately the bad return,
>>>>>> I can nicely detect the bad read and decide to
>>>>>> abort the current event, or even reading the current
>>>>>> event.
>>>>>>
>>>>>> My question is: What is the intended behavior and
>>>>>> would there be a way to make this method a bit more
>>>>>> robust so that jobs do not crash when the unzip is
>>>>>> detected?
>>>>>>
>>>>>> Thanks,
>>>>>> Constantin
>>>>>>
>>>>>>
>>>>>> Index: tree/tree/src/TBasket.cxx
>>>>>> ===================================================================
>>>>>> --- tree/tree/src/TBasket.cxx (revision 36525)
>>>>>> +++ tree/tree/src/TBasket.cxx (working copy)
>>>>>> @@ -421,6 +421,7 @@
>>>>>> if (noutot != fObjlen) {
>>>>>> Error("ReadBasketBuffers", "fNbytes = %d, fKeylen = %d, fObjlen = %d,
>>>>>> noutot = %d, nout=%d, nin=%d, nbuf=%d", fNbytes,fKeylen,fObjlen,
>>>>>> noutot,nout,nin,nbuf);
>>>>>> badread = 1;
>>>>>> + return badread;
>>>>>> }
>>>>>> // Switch the 2 buffers
>>>>>> char *temp = fBufferRef->Buffer();
>>>>>>
>>>>
>>>
Received on Thu Feb 03 2011 - 18:11:56 CET

This archive was generated by hypermail 2.2.0 : Fri Feb 04 2011 - 17:50:02 CET