Re: histogram subtraction and mean

From: Stefan Piperov <piperov_at_fnal.gov>
Date: Tue, 1 Mar 2011 19:38:02 +0200

Hi Holger!

I also agree with using the most precise estimate of the mean when possible. The problem I'm trying to highlight here is that, from a statistical point of view, you cannot call something "the mean of a histogram" if it was actually calculated from the original, un-binned data and not from the bin contents.
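
To make the distinction concrete, here is a minimal sketch in plain Python. It is not ROOT code: the function names and the four sample values are made up for illustration, and under/overflow entries are simply ignored. It computes the mean once from the raw values and once from the bin contents, treating every entry as if it sat at its bin center.

```python
def unbinned_mean(values):
    """Mean computed directly from the raw (un-binned) values."""
    return sum(values) / len(values) if values else 0.0


def binned_mean(values, nbins, xmin, xmax):
    """Mean computed from bin contents only, with each entry placed at its bin center."""
    width = (xmax - xmin) / nbins
    contents = [0.0] * nbins
    for v in values:
        if xmin <= v < xmax:  # illustration only: drop under/overflow
            contents[int((v - xmin) / width)] += 1.0
    sumw = sum(contents)
    # Weighted sum of bin centers, one term per bin.
    sumwx = sum(c * (xmin + (i + 0.5) * width) for i, c in enumerate(contents))
    return sumwx / sumw if sumw > 0 else 0.0


# Hypothetical sample: three entries in the low bin, one in the high bin.
values = [0.125, 0.25, 0.375, 1.75]
print(unbinned_mean(values))             # 0.625
print(binned_mean(values, 2, 0.0, 2.0))  # 0.75
```

With two bins on [0, 2) the two numbers differ noticeably. Both are legitimate quantities, but only the second is a property of the histogram itself.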

The problem is further exacerbated in a scenario like Margar Simonyan's: the instant you have to treat your histogrammed data as a histogram, you lose all the benefits of working with un-binned data, and on top of that you get confused as to what your "mean" really means...
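
The switch in meaning can be sketched with a toy histogram class in plain Python. This is only a model of the behaviour described in this thread, not ROOT's actual implementation; the class `ToyHist`, its methods, and the sample values are all hypothetical. While the histogram has only been filled, `mean()` uses running sums of the raw values; after a subtraction it can only fall back on bin contents and bin centers.

```python
class ToyHist:
    """Toy 1-D histogram; a model of the behaviour discussed, NOT ROOT's TH1."""

    def __init__(self, nbins, xmin, xmax):
        self.nbins, self.xmin, self.xmax = nbins, xmin, xmax
        self.width = (xmax - xmin) / nbins
        self.contents = [0.0] * nbins
        self.sumw = 0.0           # running sum of weights, updated in fill()
        self.sumwx = 0.0          # running sum of weight * x, updated in fill()
        self.has_raw_sums = True  # False once the running sums no longer apply

    def fill(self, x, w=1.0):
        if not (self.xmin <= x < self.xmax):
            return  # toy model: ignore under/overflow
        self.contents[int((x - self.xmin) / self.width)] += w
        self.sumw += w
        self.sumwx += w * x

    def subtract(self, other):
        # Assumes identical binning; afterwards only bin contents are trusted.
        self.contents = [a - b for a, b in zip(self.contents, other.contents)]
        self.has_raw_sums = False

    def mean_from_bins(self):
        sumw = sum(self.contents)
        sumwx = sum(c * (self.xmin + (i + 0.5) * self.width)
                    for i, c in enumerate(self.contents))
        return sumwx / sumw if sumw != 0 else 0.0

    def mean(self):
        if self.has_raw_sums and self.sumw != 0:
            return self.sumwx / self.sumw  # "precise" mean from the raw values
        return self.mean_from_bins()       # bin-center approximation


h1 = ToyHist(2, 0.0, 2.0)
for x in (0.125, 0.25, 0.375, 1.75):
    h1.fill(x)
h2 = ToyHist(2, 0.0, 2.0)
h2.fill(1.75)

before = h1.mean()  # from raw values: 0.625
h1.subtract(h2)
after = h1.mean()   # from bin centers: 0.5
print(before, after)
```

The three entries left after the subtraction have a raw mean of 0.25, but the histogram now reports 0.5. The same call, `mean()`, has silently changed its definition.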

I'm just saying: the defaults should be (statistically) meaningful. The extra precision of calculating from raw data should be optional, and it should be clear to the user that it's an extra.

Cheers,
Stefan.

On Tue, 1 Mar 2011, Holger Meyer wrote:

> Stefan,
>
> I agree with Rene. Use the best information available. If you want the
> moments from binned data you can calculate them from the histogram data.
> If you only have the binned moments, you can't go the other way. The
> only possible change might be to add a function to the histogram classes
> that does the work for you. Something like
> TH1::SetMomentsFromBinCenters(). That would be backwards compatible and
> you could get your desired behavior with just one additional line of code.
>
> Cheers,
> Holger
>
>
> On 03/01/2011 10:43 AM, Stefan Piperov wrote:
>> Hi Rene,
>>
>> I don't want to argue here - after all you are the author of ROOT, not
>> me - but it would be interesting to hear the opinions of both the user
>> community and professional statisticians on this subject.
>>
>> I clearly see the implications that such a change will have on all
>> existing ROOT analysis codes, so I'm not proposing this change lightly.
>>
>> With Best Regards,
>> Stefan.
>>
>>
>> On Tue, 1 Mar 2011, Rene Brun wrote:
>>
>>
>>> No, I must disagree with you. ROOT computes the best possible value if the
>>> necessary information is available.
>>> I am sure that we would get zillions of complaints if we followed your
>>> suggestion :)
>>>
>>> Rene
>>>
>>>
>>> On 01/03/2011 15:59, Stefan Piperov wrote:
>>>
>>>> I understand that mechanism, and the fact that computing moments from
>>>> un-binned data is more precise, but it's also misleading. If I'm filling a
>>>> histogram, then I want to know the characteristics (e.g. moments) of the
>>>> binned data, not the originals. I can always calculate the moments of the
>>>> unbinned data if I wish.
>>>>
>>>> To me, at least, the default should be for moments to be calculated from
>>>> binned data. The other behaviour should be optional.
>>>> But that might be quite a change to ROOT...
>>>>
>>>> Stefan.
>>>>
>>>>
>>>>
>>>> On Tue, 1 Mar 2011, Rene Brun wrote:
>>>>
>>>>
>>>>> When you use TH1::Fill, ROOT can compute the moments precisely because
>>>>> it has the original input values.
>>>>> As soon as you perform an operation (add, subtract, zoom, etc.), the only
>>>>> thing that we can do is to start from the bin contents, with the
>>>>> approximation that all values in a bin are at the center of the bin.
>>>>>
>>>>> Rene
>>>>>
>>>>>
>>>>> On 01/03/2011 15:50, Stefan Piperov wrote:
>>>>>
>>>>>> Well, probably the more interesting question is why, before operations,
>>>>>> the moments are not calculated using the bin contents.
>>>>>> That's a default which has always puzzled me.
>>>>>> Stefan.
>>>>>>
>
>

*---------------------------------------------------------------------*
  Stefan Piperov      Mail: FNAL P.O.Box 500, MS 205, Batavia, IL-60510
  Phone: (630) 840-5176                        E-Mail: piperov_at_fnal.gov
*---------------------------------------------------------------------*
"Give a skeptic an inch... and he'll measure it."

Received on Tue Mar 01 2011 - 18:38:17 CET
