Re: Histogram mean

From: Rene Brun <Rene.Brun_at_cern.ch>
Date: Wed, 14 Sep 2011 06:50:41 +0200


Let me offer some clarification, as I have the impression that the posters in this thread are confusing each other.

When filling a histogram with Fill(x,w), ROOT keeps track of the sums of w, w*x and w*x*x, so that the mean and rms can later be computed precisely over the FULL bin range.
When you set the bin range to a subset of the original range, ROOT does the only thing it can do, i.e. it recomputes the statistical quantities using the center of each bin in the range, BECAUSE THE ORIGINAL ENTRIES IN THIS RANGE ARE NOT AVAILABLE ANYMORE.
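
To make the two computations concrete, here is a minimal sketch in plain C++. It is not ROOT code: ToyHist and its members are hypothetical names invented for this illustration, and the sketch ignores underflow/overflow, the w*x*x sum and bin errors.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy 1D histogram (NOT ROOT's TH1): mimics only the bookkeeping described above.
struct ToyHist {
    int nbins;
    double lo, hi;
    std::vector<double> content;     // sum of weights in each bin
    double sumw = 0.0, sumwx = 0.0;  // running sums over all accepted fills

    ToyHist(int n, double xlo, double xhi)
        : nbins(n), lo(xlo), hi(xhi), content(n, 0.0) {
        assert(n > 0 && xhi > xlo);
    }

    double width() const { return (hi - lo) / nbins; }
    double center(int i) const { return lo + (i + 0.5) * width(); }

    void fill(double x, double w = 1.0) {
        if (x < lo || x >= hi) return;  // this sketch drops out-of-range values
        int i = static_cast<int>(std::floor((x - lo) / width()));
        if (i >= nbins) i = nbins - 1;  // guard against floating-point rounding
        content[i] += w;
        sumw += w;
        sumwx += w * x;                 // uses the exact x, before binning
    }

    // Full range: mean from the running sums, independent of the binning.
    double meanFromSums() const { return sumw != 0.0 ? sumwx / sumw : 0.0; }

    // Sub-range of bins first..last (0-based, inclusive): only bin contents
    // are left, so every entry is treated as sitting at its bin center.
    double meanFromBins(int first, int last) const {
        double sw = 0.0, swx = 0.0;
        for (int i = first; i <= last; ++i) {
            sw += content[i];
            swx += content[i] * center(i);
        }
        return sw != 0.0 ? swx / sw : 0.0;
    }
};
```

With ToyHist h(10, 0, 5) and a single h.fill(2.2), meanFromSums() gives 2.2, while meanFromBins(0, 7) gives 2.25, the center of the bin [2.0, 2.5) that holds the entry. These are the two values seen in Philip's session quoted below.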
In Philip's case, ROOT has no way to know that the range was restricted without excluding any entries. Once you restrict the range, there is no way of offering an "UNBINNED" option for binned data, because the original data is lost. Ntuples were invented many decades ago to solve exactly this problem: you always start from the unbinned data.
ROOT has a (not well known) option, TH1::SetBuffer, to keep all entries passed to TH1::Fill in memory so that the statistics are always accurate, but, of course, this scheme can only be used for histograms with a small number of entries.
So, use a TNtupleD or a TTree and project the data in the desired range if you want/need very precise statistics.

Rene

On 13/09/2011 22:49, Philip Rodrigues wrote:
> Here is my favourite example of the confusing behaviour of TH1::GetMean():
>
> root [0] TH1F* h2=new TH1F("foo2", "bar", 10, 0, 5)
>
> root [1] h2->Fill(2.2)
>
> root [2] h2->GetMean()
> (const Double_t)2.20000000000000018e+00
>
> // This excludes a zero bin, so shouldn't affect the mean:
> root [3] h2->GetXaxis()->SetRangeUser(-1, 3.8)
>
> root [4] h2->GetMean()
> (const Double_t)2.25000000000000000e+00
>
> On Tuesday, September 13, 2011 04:16:05 PM Arthur E. Snyder wrote:
>> Stefan and Aamer:
>>
>>
>> What would be nice would be an option, e.g., GetMean("binned") vs.
>> GetMean("unbinned") with default being what folks are used to.
>>
>> I suspect that the problem occurs not just for subtracted histograms but
>> also for weighted ones.
>>
>> On the other hand, sometimes you want the unbinned mean. If you've made big
>> bins to make the plot look nice, the mean may be seriously overestimated
>> by using the middle of the bin ...
>>
>> Maybe there should be a warning when h->Sumw2() has been invoked ...
>>
>> -Art S.
>>
>> A.E. Snyder, The Former Group C (TFC)
>> SLAC Mail Stop #95
>> Box 4349
>> Stanford, Ca, USA, 94309
>> e-mail:snyder_at_slac.stanford.edu
>> phone:650-926-2701
>> FAX:707-313-0250
>> http://www.slac.stanford.edu/~snyder
>> BaBar Collaboration & Fermi/GLAST
>>
>> On Tue, 13 Sep 2011, Stefan Piperov wrote:
>>> This issue - that ROOT reports a histogram's moments based on unbinned
>>> data - has been discussed several times now, but without much
>>> consequence...
>>>
>>> It is plain wrong, of course, to report quantities related to the
>>> initial dataset as belonging to the histogram, but this is how ROOT was
>>> designed from the very beginning, so it's too late now to change, I
>>> guess. What we can do, though, is to spread the word, so that at least
>>> the users know of this problem, and do not rely on TH1::GetMean() to get
>>> the mean of the binned data.
>>>
>>> Stefan.
>>>
>>> On Tue, 13 Sep 2011, Arthur E. Snyder wrote:
>>>> Aamer,
>>>>
>>>> It doesn't work on subtracted histograms. I had to 'roll-my-own' to do
>>>> that (though there might be something existing that does this that I
>>>> just didn't find).
>>>>
>>>> As I recall, ROOT computes the unbinned mean, so you get the same result
>>>> regardless of binning. It's not clear what it does in the case of
>>>> subtracted histograms, but if the histogram is asymmetric the result
>>>> can be wildly wrong.
>>>>
>>>> -Art S.
>>>>
>>>> A.E. Snyder, The Former Group C (TFC)
>>>> SLAC Mail Stop #95
>>>> Box 4349
>>>> Stanford, Ca, USA, 94309
>>>> e-mail:snyder_at_slac.stanford.edu
>>>> phone:650-926-2701
>>>> FAX:707-313-0250
>>>> http://www.slac.stanford.edu/~snyder
>>>> BaBar Collaboration & Fermi/GLAST
>>>>
>>>> On Tue, 13 Sep 2011, Aamer Wali Rauf wrote:
>>>>> Hi,
>>>>> I have always assumed (and thus used) that the TH1::GetMean(1) method
>>>>> gives out the weighted mean value
>>>>> of the x-axis of the histogram. Visibly it looks to me that way but is
>>>>> it really so? Can someone
>>>>> comment on that please?
>>>>>
>>>>> Thanks in advance,
>>>>> Aamer
>>> *---------------------------------------------------------------------*
>>>
>>> Stefan Piperov Mail: FNAL P.O.Box 500, MS 205, Batavia, IL-60510
>>> Phone: (630) 840-5176 E-Mail: piperov_at_fnal.gov
>>>
>>> *---------------------------------------------------------------------*
>>> "Give a skeptic an inch... and he'll measure it."
Received on Wed Sep 14 2011 - 06:50:50 CEST
