[ROOT] TProfile error in low statistics bins

From: Kazutaka Nakahara (nakahara@jlab.org)
Date: Thu Jan 09 2003 - 01:34:08 MET


Hi,

I'm somewhat new to ROOT, so please bear with me.

I'm using ROOT 3.03/06 and trying to fit a profile plot with pol1.  The 
fit comes out completely bogus because it puts a lot of weight on 
bins with low statistics.  This seems to happen because the 
errors assigned to these bins are incorrect.

I've used SetErrorOption(" ") and SetErrorOption("s") to compare the 
errors I get.  Here is what I see:

---- Part of my code that produces the output --- 
  profx5->SetErrorOption(" ");
  cout << "error option = " <<  profx5->GetErrorOption() << endl;
  for(j=1;j<200;j++){
    content[0] = profx5->GetBinContent(j);
    entries[0] = profx5->GetBinEntries(j);
    errors[0] = profx5->GetBinError(j);
    cout << "bin# " << j << "  bin content = " << content[0]
         << "  BinEntries = " << entries[0] << "  error = " << errors[0] << endl;
  }
--------------------------------
Below is part of my output (note the error in bin#170):
numentries = 98782
error option = 
.
.
.
bin# 167  bin content = -0.000507246  BinEntries = 138  error = 0.00068033
bin# 168  bin content = -0.0015  BinEntries = 40  error = 0.0010338
bin# 169  bin content = 0.00166667  BinEntries = 3  error = 0.00272166
bin# 170  bin content = -0.005  BinEntries = 1  error = 5.04355e-05 <--!!!
bin# 171  bin content = 0  BinEntries = 0  error = 0
bin# 172  bin content = 0  BinEntries = 0  error = 0


Now let me repeat the above with error option = "s"
------- Part of my code ---
  profx5->SetErrorOption("s");
  cout << "error option = " <<  profx5->GetErrorOption() << endl;
  for(j=1;j<200;j++){
    content[0] = profx5->GetBinContent(j);
    entries[0] = profx5->GetBinEntries(j);
    errors[0] = profx5->GetBinError(j);
    cout << "bin# " << j << "  bin content = " << content[0]
         << "  BinEntries = " << entries[0] << "  error = " << errors[0] << endl;
  }
-----------------------
Output (again, note the error in bin#170):
numentries = 98782
error option = s
.
.
.
bin# 167  bin content = -0.000507246  BinEntries = 138  error = 0.00799207
bin# 168  bin content = -0.0015  BinEntries = 40  error = 0.00653835
bin# 169  bin content = 0.00166667  BinEntries = 3  error = 0.00471405
bin# 170  bin content = -0.005  BinEntries = 1  error = 0.0158473 <--!!!
bin# 171  bin content = 0  BinEntries = 0  error = 0
bin# 172  bin content = 0  BinEntries = 0  error = 0

---------------------


I find bin#170 to be rather troubling.  The bin has 1 entry, but for the 
default option = " ", the error comes out to be VERY small.
As I understand it, this is how the error is calculated for the two 
options I specified (copied directly from the ROOT website):
    option:
     ' '  (Default) Errors are Spread/SQRT(N) for Spread.ne.0. ,
                      "     "  SQRT(Y)/SQRT(N) for Spread.eq.0,N.gt.0 ,
                      "     "  0.  for N.eq.0
     's'            Errors are Spread  for Spread.ne.0. ,
                      "     "  SQRT(Y)  for Spread.eq.0,N.gt.0 ,
                      "     "  0.  for N.eq.0


Where N = N_bin
For N_bin=1, it seems to me that I should get the same error for both 
options, since N_bin = 1 (which is equivalent to Spread = 0).  
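
To make that reading of the documentation concrete, here is a minimal standalone C++ sketch of the rules quoted above. It is not ROOT's actual implementation: the struct `BinSums` and the function `documentedBinError` are hypothetical names, `Y` is assumed to be the bin mean, and taking the absolute value under the square root is my assumption so that a negative mean does not produce NaN.

```cpp
#include <cassert>
#include <cmath>

// Per-bin running sums a profile histogram keeps (hypothetical struct, not ROOT's).
struct BinSums {
    double n;      // number of entries in the bin (N)
    double sumY;   // sum of the y values filled into the bin
    double sumY2;  // sum of y*y for the same entries
};

// Bin error following the quoted documentation text literally.
// spreadOption == true  corresponds to option 's'
// spreadOption == false corresponds to option ' ' (default)
double documentedBinError(const BinSums& b, bool spreadOption) {
    assert(b.n >= 0.0);
    if (b.n == 0.0) return 0.0;                        // N.eq.0 -> 0

    const double mean = b.sumY / b.n;                  // Y (assumed: bin mean)
    const double variance = std::fabs(b.sumY2 / b.n - mean * mean);
    double err = std::sqrt(variance);                  // Spread

    // Spread.eq.0, N.gt.0 -> SQRT(Y); fabs() is an assumption for negative Y
    if (err == 0.0) err = std::sqrt(std::fabs(mean));

    // Default option divides by SQRT(N) of this bin; 's' returns the value as is.
    return spreadOption ? err : err / std::sqrt(b.n);
}
```

Under this reading the two options differ only by a factor of sqrt(N_bin), so for a bin with a single entry they should return the same number.
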
Now the output above tells me that the 
ROOT file I analysed has N_TOTAL = 98782.  
So with option = 's',
bin#170 ---   error = .0158473     <--- This is sqrt(Y)

With option = ' ',
bin#170 ---  error = 5.04355e-05   <--- This is sqrt(Y)/sqrt(N_TOTAL) !!!!

Shouldn't the latter be sqrt(Y)/sqrt(N_bin) instead of 
sqrt(Y)/sqrt(N_TOTAL) ??  
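
The arithmetic behind this can be checked with a small standalone C++ sketch that divides the option 's' error by the default error for each bin, using only the numbers printed above. The names `BinRow`, `kRows`, `errorRatio` and `printRatios` are made up for this illustration. If the default error were Spread/sqrt(N_bin), the ratio should equal sqrt(N_bin) in every bin.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// One line of the printouts above (hypothetical helper type).
struct BinRow {
    int bin;            // bin number
    double entries;     // BinEntries (N_bin)
    double errDefault;  // error with option ' '
    double errSpread;   // error with option 's'
};

// Values copied from the two outputs in this message.
const BinRow kRows[] = {
    {167, 138.0, 0.00068033,  0.00799207},
    {168,  40.0, 0.0010338,   0.00653835},
    {169,   3.0, 0.00272166,  0.00471405},
    {170,   1.0, 5.04355e-05, 0.0158473},
};
const double kTotalEntries = 98782.0;  // "numentries" from the output

// Ratio of the 's' error to the default error for one bin.
double errorRatio(const BinRow& r) { return r.errSpread / r.errDefault; }

// Print the ratio next to sqrt(N_bin) and sqrt(N_TOTAL) for comparison.
void printRatios() {
    for (const BinRow& r : kRows) {
        std::printf("bin# %d  ratio = %g  sqrt(N_bin) = %g  sqrt(N_TOTAL) = %g\n",
                    r.bin, errorRatio(r), std::sqrt(r.entries),
                    std::sqrt(kTotalEntries));
    }
}
```

For bins 167 to 169 the ratio agrees with sqrt(N_bin) to the printed precision, while for bin 170 it is about 314.2, which is within a fraction of a percent of sqrt(98782) and nowhere near sqrt(1).
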

My suspicion is that this is why fits don't work for profile plots 
with low-statistics bins.  The error seems to be calculated 
incorrectly for N_bin = 1, which gives that bin a very small error and 
therefore a heavy weight in the fit.

Did I miss something obvious??


Regards,
Kaz


