Re: [ROOT] TProfile error in low statistics bins

From: Rene Brun (Rene.Brun@cern.ch)
Date: Fri Jan 10 2003 - 13:02:36 MET


Hi,

The treatment of low-statistics bins in TProfile has changed several times
between versions. I suggest upgrading to ROOT version 3.04/02.

Rene Brun

On Wed, 8 Jan 2003, Kazutaka Nakahara wrote:

> 
> Hi,
> 
> I'm somewhat new to root, so please bear with me.
> 
> I'm using ROOT 3.03/06 and trying to fit a profile plot with pol1.  The 
> fit comes out completely bogus because it puts a lot of weight on 
> bins with low statistics.  This seems to be caused by errors being 
> assigned incorrectly to these bins.  
> 
> I've used SetErrorOption(" ") and SetErrorOption("s") to compare the 
> errors I get.  Here is what I see:
> 
> ---- Part of my code that produces the output --- 
>   profx5->SetErrorOption(" ");
>   cout << "error option = " <<  profx5->GetErrorOption() << endl;
>   for(j=1;j<200;j++){
>     content[0] = profx5->GetBinContent(j);
>     entries[0] = profx5->GetBinEntries(j);
>     errors[0] = profx5->GetBinError(j);
>     cout << "bin# " << j << "  bin content = " << content[0]
>          << "  BinEntries = " << entries[0] << "  error = " << errors[0] << endl;
>   }
> --------------------------------
> Below is part of my output (note the error in bin#170):
> numentries = 98782
> error option = 
> .
> .
> .
> bin# 167  bin content = -0.000507246  BinEntries = 138  error = 0.00068033
> bin# 168  bin content = -0.0015  BinEntries = 40  error = 0.0010338
> bin# 169  bin content = 0.00166667  BinEntries = 3  error = 0.00272166
> bin# 170  bin content = -0.005  BinEntries = 1  error = 5.04355e-05 <--!!!
> bin# 171  bin content = 0  BinEntries = 0  error = 0
> bin# 172  bin content = 0  BinEntries = 0  error = 0
> 
> 
> Now let me repeat the above with error option = "s"
> ------- Part of my code ---
>   profx5->SetErrorOption("s");
>   cout << "error option = " <<  profx5->GetErrorOption() << endl;
>   for(j=1;j<200;j++){
>     content[0] = profx5->GetBinContent(j);
>     entries[0] = profx5->GetBinEntries(j);
>     errors[0] = profx5->GetBinError(j);
>     cout << "bin# " << j << "  bin content = " << content[0]
>          << "  BinEntries = " << entries[0] << "  error = " << errors[0] << endl;
>   }
> -----------------------
> Output (again, note the error in bin#170):
> numentries = 98782
> error option = s
> .
> .
> .
> bin# 167  bin content = -0.000507246  BinEntries = 138  error = 0.00799207
> bin# 168  bin content = -0.0015  BinEntries = 40  error = 0.00653835
> bin# 169  bin content = 0.00166667  BinEntries = 3  error = 0.00471405
> bin# 170  bin content = -0.005  BinEntries = 1  error = 0.0158473 <--!!!
> bin# 171  bin content = 0  BinEntries = 0  error = 0
> bin# 172  bin content = 0  BinEntries = 0  error = 0
> 
> ---------------------
> 
> 
> I find bin#170 to be rather troubling.  The bin has 1 entry, but for the 
> default option = " ", the error comes out to be VERY small.
> As I understand it, below is how the error is calculated for the two 
> options I specified (copied directly out of the root website):
>     option:
>      ' '  (Default) Errors are Spread/SQRT(N) for Spread.ne.0. ,
>                       "     "  SQRT(Y)/SQRT(N) for Spread.eq.0,N.gt.0 ,
>                       "     "  0.  for N.eq.0
>      's'            Errors are Spread  for Spread.ne.0. ,
>                       "     "  SQRT(Y)  for Spread.eq.0,N.gt.0 ,
>                       "     "  0.  for N.eq.0
> 
> 
> Where N = N_bin
> For N_bin=1, it seems to me that I should get the same error for both 
> options, since N_bin = 1 (which is equivalent to Spread = 0).  
> Now the output above tells me that the 
> ROOT file I analysed has N_TOTAL = 98782.  
> So with option = 's',
> bin#170 ---   error = .0158473     <--- This is sqrt(Y)
> 
> With option = ' ',
> bin#170 ---  error = 5.04355e-05   <--- This is sqrt(Y)/sqrt(N_TOTAL) !!!!
> 
> Shouldn't the latter be sqrt(Y)/sqrt(N_bin) instead of 
> sqrt(Y)/sqrt(N_TOTAL) ??  
> 
> My suspicion is that this is why fits don't work for profile plots 
> with low-statistics bins.  The error seems to be calculated 
> incorrectly for N_bin = 1, which gives that bin a very small error and 
> so a very heavy weight in the fit.
> 
> Did I miss something obvious??
> 
> 
> Regards,
> Kaz
> 
> 
> 
> 
> 
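
The arithmetic in the quoted message can be checked directly from the posted numbers. The sketch below assumes only the documented rule that the default error is the "s" error divided by SQRT(N), so that (error with "s" / error with " ")^2 should give back N for each bin. It does not call ROOT, and the names BinRow, kRows, impliedN and printImpliedN are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Values copied from the output quoted above. Nothing here calls ROOT;
// BinRow, kRows, impliedN and printImpliedN are illustrative names only.
struct BinRow {
    int bin;
    double entries;     // BinEntries
    double errDefault;  // error with option " "
    double errSpread;   // error with option "s"
};

static const BinRow kRows[] = {
    {167, 138.0, 0.00068033, 0.00799207},
    {168, 40.0, 0.0010338, 0.00653835},
    {169, 3.0, 0.00272166, 0.00471405},
    {170, 1.0, 5.04355e-05, 0.0158473},
};
static const double kTotalEntries = 98782.0;  // "numentries" in the output

// If error(" ") = error("s") / sqrt(N), then N = (error("s") / error(" "))^2.
double impliedN(const BinRow& row) {
    assert(row.errDefault > 0.0);
    const double ratio = row.errSpread / row.errDefault;
    return ratio * ratio;
}

// Print the N implied by the two errors next to the reported BinEntries.
void printImpliedN() {
    for (const BinRow& row : kRows) {
        std::printf("bin %d: BinEntries = %g, implied N = %.1f\n",
                    row.bin, row.entries, impliedN(row));
    }
    std::printf("numentries = %g\n", kTotalEntries);
}
```

Calling printImpliedN() from a main() should show an implied N close to BinEntries (138, 40, 3) for bins 167 to 169. For bin 170 the implied N is nowhere near 1 and lies within about 0.1% of numentries, which is consistent with the suspicion that the total number of entries was used for that bin.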


