Re: histogram subtraction and mean

From: Lorenzo Moneta <Lorenzo.Moneta_at_cern.ch>
Date: Tue, 1 Mar 2011 22:50:22 +0100


Hi Margar,

On Mar 1, 2011, at 10:35 PM, Margar Simonyan wrote:

> Lorenzo
> 
> sorry, another one: when ResetStats and StatOverflows are called, what
> is taken as a bin center for underflow and overflow?

In that case the underflow/overflow "bin center" values are used, i.e. XMIN - bin_width/2 and XMAX + bin_width/2, which is arbitrary and NOT correct.
In my opinion, when using bin centers, underflow and overflow should not be used in the statistics calculations.
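
As a rough illustration of why these pseudo-centers are arbitrary, here is a minimal pure-Python sketch (not ROOT code; the function name binned_mean and its parameters are made up for this example). It places each bin's content at the bin center and, optionally, places the underflow and overflow content at XMIN - bin_width/2 and XMAX + bin_width/2:

```python
def binned_mean(contents, xmin, xmax, underflow=0.0, overflow=0.0,
                include_flows=False):
    """Mean estimated from bin contents, with each bin's weight at its center.

    Hypothetical helper for illustration only; it is not part of ROOT.
    """
    nbins = len(contents)
    width = (xmax - xmin) / nbins
    centers = [xmin + (i + 0.5) * width for i in range(nbins)]
    weights = list(contents)
    if include_flows:
        # Pseudo-centers from the message above:
        # XMIN - bin_width/2 for underflow, XMAX + bin_width/2 for overflow.
        centers = [xmin - width / 2] + centers + [xmax + width / 2]
        weights = [underflow] + weights + [overflow]
    total = sum(weights)
    if total == 0:
        return 0.0  # no content: the mean is undefined, 0 is a placeholder
    return sum(w * x for w, x in zip(weights, centers)) / total


# Three bins on [0, 3] with contents 1, 2, 1, plus 4 overflow entries.
mean_in_range = binned_mean([1.0, 2.0, 1.0], 0.0, 3.0)
mean_with_flows = binned_mean([1.0, 2.0, 1.0], 0.0, 3.0,
                              overflow=4.0, include_flows=True)
```

In this sketch the overflow entries are counted at x = 3.5 no matter where they really were (3.1 or 100), so the second mean says nothing reliable about the original data.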

  Lorenzo

> 
> Thanks,
>      Margar
> 
> 
> On Tue, Mar 1, 2011 at 10:24 PM, Margar Simonyan
> <margar.simonyan_at_gmail.com> wrote:
>> Hi Lorenzo
>> 
>> thanks for checking. It is such a common operation that I could not
>> believe a bug could survive. I have one last unanswered question: can
>> one use TProfile and do "background subtraction" instead of dealing
>> with individual histograms?
>> 
>> Margar
>> 
>> 
>> 
>> On Tue, Mar 1, 2011 at 9:36 PM, Lorenzo Moneta <Lorenzo.Moneta_at_cern.ch> wrote:
>>> Hi Margar,
>>> 
>>>  I have checked the code and it is true: the statistics are not calculated correctly. The code uses abs(w), and this is wrong if w is negative.
>>> As I said before, I will correct this for negative w, BUT if there are bins with negative content the statistics cannot be computed anymore!
>>> I will set artificial 0 values in this case.
>>> 
>>> A similar problem happens when you fill the histogram with negative weights.
>>> 
>>>  Cheers
>>> 
>>>  Lorenzo
>>> On Mar 1, 2011, at 6:02 PM, Margar Simonyan wrote:
>>> 

>>>> Hello Lorenzo
>>>>
>>>> thank you for the answer. Please have a look at the attached figure. I
>>>> take the difference between the two histograms:
>>>>
>>>> blue.Add(red, w)
>>>>
>>>> where w is between -1 and 0. Naively, I expect the mean of the blue
>>>> histogram to increase after Add, but it depends on w. I don't understand
>>>> these results. I verified that there are no bins with negative content
>>>> before and after Add.
>>>>
>>>> Thanks,
>>>> Margar
>>>>
>>>>
>>>>
>>>> On Tue, Mar 1, 2011 at 4:14 PM, Lorenzo Moneta <Lorenzo.Moneta_at_cern.ch> wrote:
>>>>> Hello Margar,
>>>>> 
>>>>>  when you get a histogram with negative bin contents (for example from the subtraction of two histograms),
>>>>> the statistics (mean, s.d., etc.) are currently computed in ROOT using the absolute value of the bin content.
>>>>> In my opinion, if a bin has negative content, it does not make any sense to compute any statistics using the bin centers.
>>>>> You would need to compute them using the original entries of the histogram.
>>>>> My plan is to artificially set the mean/s.d. to zero (or to some undefined value) in these particular cases, to avoid computing a
>>>>> totally wrong result and to avoid confusion.
>>>>> 
>>>>> Best Regards
>>>>> 
>>>>>  Lorenzo
>>>>> 
>>>>> 
>>>>> On Mar 1, 2011, at 3:54 PM, Margar Simonyan wrote:
>>>>> 
>>>>>> Dear Rene
>>>>>> 
>>>>>> thanks, now I understand the observed differences. In a real example I
>>>>>> have another issue: the signal+background distribution has empty bins, but
>>>>>> the background distribution can have non-zero content in the same bins,
>>>>>> so the difference has bins with negative content. I tried to re-bin,
>>>>>> but the results depended significantly on the grouping. Is there a
>>>>>> better way of solving this issue?
>>>>>> 
>>>>>> Can background subtraction from signal+background be done with TProfile
>>>>>> (Add)? I attach an updated version of my script. Certainly TProfile::Add
>>>>>> does something different.
>>>>>> 
>>>>>> Best regards,
>>>>>>        Margar
>>>>>> 
>>>>>> On Tue, Mar 1, 2011 at 12:56 PM, Rene Brun <Rene.Brun_at_cern.ch> wrote:
>>>>>>> What you get is perfectly normal.
>>>>>>> Following an operation on your histogram (Add, Subtract, Rebin, etc.), the
>>>>>>> statistics for the moments (mean, sigma, etc.)
>>>>>>> are recomputed from the bin contents, assuming the center of the bin.
>>>>>>> 
>>>>>>> Rene Brun
>>>>>>> 
>>>>>>> 
>>>>>>> On 01/03/2011 12:37, Margar Simonyan wrote:
>>>>>>>> 
>>>>>>>> Hello ROOTTalk
>>>>>>>> 
>>>>>>>> I get strange results after histogram subtraction; the attached script,
>>>>>>>> written in Python, demonstrates the issue. My goal is to subtract
>>>>>>>> the background from the signal+background distribution and get a
>>>>>>>> meaningful result for the mean.
>>>>>>>> There are several results that are unexpected (for me):
>>>>>>>> First, the mean changes after subtracting an empty histogram; this is not
>>>>>>>> a big issue. Second, after subtracting the background I don't get exactly
>>>>>>>> the signal value. Third, rebinning before subtracting changes the
>>>>>>>> results once more.
>>>>>>>> Can somebody explain this? I am using ROOT 5.26/00e compiled on SLC5
>>>>>>>> with gcc43.
>>>>>>>> 
>>>>>>>>  Thanks,
>>>>>>>>        Margar
>>>>>>>>  -------------------------------------------------------------------------
>>>>>>>>   Dr Margar Simonyan,  post-doctoral researcher
>>>>>>>>   Niels Bohr Institute, Copenhagen University
>>>>>>>>  -------------------------------------------------------------------------
>>>>>>> 
>>>>>>> 
>>>>>> <histo.py>
>>>>> 
>>>>> 

>>>> <test.png>
>>> 
>>> 
>> 
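
The abs(w) effect discussed in the quoted messages can be reproduced with a small pure-Python sketch (not ROOT code). It assumes, as one reading of the thread, that the running sums of blue and red are combined as blue + abs(w)*red instead of blue + w*red; the helper names sums and mean_after_add and the bin contents are invented for the illustration:

```python
def sums(centers, contents):
    """Return (sum of weights, sum of weight * x) for a binned distribution."""
    sw = sum(contents)
    swx = sum(c * x for c, x in zip(contents, centers))
    return sw, swx


def mean_after_add(centers, blue, red, w, use_abs):
    """Mean of blue + w*red when the sums are combined with w or with abs(w).

    Hypothetical helper; use_abs=True mimics the assumed faulty behaviour.
    """
    factor = abs(w) if use_abs else w
    sw_b, swx_b = sums(centers, blue)
    sw_r, swx_r = sums(centers, red)
    return (swx_b + factor * swx_r) / (sw_b + factor * sw_r)


centers = [0.5, 1.5, 2.5]
blue = [4.0, 4.0, 4.0]   # mean of blue alone is 1.5
red = [2.0, 1.0, 0.0]    # red is concentrated at low x
w = -1.0                 # blue.Add(red, w) subtracts red

signed_mean = mean_after_add(centers, blue, red, w, use_abs=False)
abs_mean = mean_after_add(centers, blue, red, w, use_abs=True)
```

With these made-up numbers the signed combination moves the mean above 1.5, as expected when low-x content is removed, while the abs(w) combination moves it below 1.5, even though no bin of blue + w*red is negative.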
Received on Tue Mar 01 2011 - 22:50:59 CET
