RE: Chi2Test: PS

From: Rene Brun <brun_at_pcroot.cern.ch>
Date: Mon, 15 Aug 2005 11:25:13 +0200 (MEST)


Hi Jan,

Thanks for your remarks. See:
http://root.cern.ch/root/htmldoc//TH1.html#TH1:KolmogorovTest
http://root.cern.ch/root/htmldoc//TMath.html#TMath:KolmogorovTest

Rene Brun

On Sat, 13 Aug 2005, Jan CONRAD wrote:

>
> Dear Rene,
>
> I would agree to keeping the KS test if you add the following
> discussion to the KS test documentation (with appropriate changes to the
> routine names).
>
> This comes directly from the HDIFF documentation:
>
>
> "The value of PROB returned by HDIFF is calculated such that it will be
> uniformly distributed between zero and one for compatible histograms,
> provided the data are not binned (or the number of bins is very large
> compared with the number of events). Users who have access to unbinned
> data and wish exact confidence levels should therefore not put their data
> into histograms, but should save them in ordinary Fortran arrays and call
> the routine TKOLMO which is being introduced into the Program Library. On
> the other hand, since HBOOK is a convenient way of collecting data and
> saving space, the routine HDIFF has been provided, and we believe it is
> the best test for comparison even on binned data. However, the values of
> PROB for binned data will be shifted slightly higher than expected,
> depending on the effects of the binning. For example, when comparing two
> uniform distributions of 500 events in 100 bins, the values of PROB,
> instead of being exactly uniformly distributed between zero and one, have
> a mean value of about 0.56. Since we are physicists, we can apply a useful
> rule: As long as the bin width is small compared with any significant
> physical effect (for example the experimental resolution) then the binning
> cannot have an important effect. Therefore, we believe that for all
> practical purposes, the probability value PROB is calculated correctly
> provided the user is aware that:
>
> 1. The value of PROB should not be expected to have exactly the correct
> distribution for binned data.
> 2. The user is responsible for seeing to it that the bin widths are
> small compared with any physical phenomena of interest.
> 3. The effect of binning (if any) is always to make the value of PROB
> slightly too big. That is, setting an acceptance criterion of (PROB>0.05)
> will assure that at most 5% of truly compatible histograms are rejected,
> and usually somewhat less."
>
>
> It is perhaps a little pedantic, but I think users should be aware
> that the quoted confidence levels might be wrong.
>
> Also, a KS test for unbinned data might be a nice addition to ROOT. In
> fact, it seems much more suitable for TGraph!
>
> Best,
>
> Jan
>
>>
>> Hi,
>> I am just back from holidays. The Kolmogorov-Smirnov test does not work
>> for histograms: you can test it yourself, the returned probability will
>> not be uniformly distributed. That is why I suggested removing it for
>> histograms.
>>
>> Best,
>> Jan
>>
>>
>>
>>> Hi Gero,
>>>
>>> Thanks for your comments.
>>> It would be nice if an expert in statistics (if possible with
>>> experience with KolmogorovTest and Chi2Test) could write some lines
>>> on the virtues/problems of the two methods. I will add these comments
>>> in the documentation of the two functions.
>>>
>>> Rene Brun
>>>
>>> On Thu, 11 Aug 2005, Gero Flucke wrote:
>>>
>>>> On Thu, 11 Aug 2005, Rene Brun wrote:
>>>>
>>>>> On Wed, 3 Aug 2005, Jan Conrad wrote:
>>>>>
>>>>>> Dear Root-developers,
>>>>>>
>>>>>> two suggestions:
>>>>
>>>> <snip>
>>>>
>>>>>> 2) having Chi2Test in place, maybe it is time to get rid of the
>>>>>> Kolmogorov-Smirnov Test for histograms?
>>>>>
>>>>> I don't think that users will appreciate this proposal.
>>>>> Could you comment on the relative merits/drawbacks of the two approaches?
>>>>> I will add your comments to the documentation of the KolmogorovTest
>>>>> functions.
>>>>
>>>> Hi Jan and Rene,
>>>> So far I have used neither Chi2Test nor KolmogorovTest, but as far as I
>>>> know from statistics, Kolmogorov is superior to chi2 in the sense that it
>>>> is sensitive to differences in shape, while chi2 is better at taking the
>>>> histogram errors into account. So both tests have their own validity.
>>>>
>>>> Cheers
>>>>
>>>> Gero
>>>>
>>>
>>
>>
>
>
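
The binning effect described in the HDIFF text quoted above can be checked
with a small simulation. What follows is only a minimal sketch in plain
Python, not ROOT or HBOOK code: the function names (kolmogorov_prob,
fill_histogram, binned_ks_prob, mean_prob) are made up for illustration, the
probability uses the standard asymptotic Kolmogorov series, and the trial
count and seed are arbitrary. It compares pairs of uniform samples of 500
events in 100 bins and averages the resulting probability.

```python
import math
import random


def kolmogorov_prob(z):
    """Asymptotic Kolmogorov probability Q(z) = 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 z^2)."""
    if z < 0.2:
        # The series converges slowly here and Q(z) is 1 to better than 1e-10.
        return 1.0
    total = 0.0
    for j in range(1, 101):
        sign = 1.0 if j % 2 == 1 else -1.0
        total += sign * math.exp(-2.0 * j * j * z * z)
    # Clamp to [0, 1] to guard against rounding in the alternating sum.
    return min(1.0, max(0.0, 2.0 * total))


def fill_histogram(values, nbins, lo=0.0, hi=1.0):
    """Count values into nbins equal-width bins on [lo, hi]."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        index = min(int((v - lo) / width), nbins - 1)
        counts[index] += 1
    return counts


def binned_ks_prob(h1, h2):
    """KS-style probability from two histograms with identical binning."""
    n1, n2 = sum(h1), sum(h2)
    cum1 = cum2 = 0.0
    dmax = 0.0
    for c1, c2 in zip(h1, h2):
        # Compare the normalized cumulative sums at each bin edge.
        cum1 += c1 / n1
        cum2 += c2 / n2
        dmax = max(dmax, abs(cum1 - cum2))
    return kolmogorov_prob(dmax * math.sqrt(n1 * n2 / (n1 + n2)))


def mean_prob(trials=300, events=500, nbins=100, seed=12345):
    """Average probability over many pairs of compatible (uniform) samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        h1 = fill_histogram([rng.random() for _ in range(events)], nbins)
        h2 = fill_histogram([rng.random() for _ in range(events)], nbins)
        total += binned_ks_prob(h1, h2)
    return total / trials


if __name__ == "__main__":
    print("mean PROB for compatible binned samples:", mean_prob())
```

For an exactly uniform probability the mean would be 0.5. If the HDIFF
statement holds, the printed mean should come out somewhat above that, but
with only a few hundred trials the statistical scatter is comparable to the
size of the effect, so increase trials before drawing conclusions.
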
Received on Mon Aug 15 2005 - 11:25:20 MEST
