// MethodFisher.cxx
// @(#)root/tmva $Id$
// Author: Andreas Hoecker, Xavier Prudent, Joerg Stelzer, Helge Voss, Kai Voss

/**********************************************************************************
 * Project: TMVA - a Root-integrated toolkit for multivariate Data analysis      *
 * Package: TMVA                                                                  *
 * Class  : MethodFisher                                                          *
 *                                                                                *
 * Description:                                                                   *
 *      Implementation (see header for description)                               *
 *                                                                                *
 * Original author of this Fisher-Discriminant implementation:                    *
 *      Andre Gaidot, CEA-France;                                                  *
 *      (Translation from FORTRAN)                                                 *
 *                                                                                *
 * Authors (alphabetical):                                                        *
 *      Andreas Hoecker <Andreas.Hocker@cern.ch> - CERN, Switzerland              *
 *      Xavier Prudent  <prudent@lapp.in2p3.fr>  - LAPP, France                   *
 *      Helge Voss      <Helge.Voss@cern.ch>     - MPI-K Heidelberg, Germany      *
 *      Kai Voss        <Kai.Voss@cern.ch>       - U. of Victoria, Canada         *
 *                                                                                *
 * Copyright (c) 2005:                                                            *
 *      CERN, Switzerland                                                         *
 *      U. of Victoria, Canada                                                    *
 *      MPI-K Heidelberg, Germany                                                 *
 *      LAPP, Annecy, France                                                      *
 *                                                                                *
 * Redistribution and use in source and binary forms, with or without             *
 * modification, are permitted according to the terms listed in LICENSE           *
 * (see tmva/doc/LICENSE)                                                         *
 **********************************************************************************/

/*! \class TMVA::MethodFisher
\ingroup TMVA

Fisher and Mahalanobis Discriminants (Linear Discriminant Analysis)

In the method of Fisher discriminants, event selection is performed
in a transformed variable space with zero linear correlations, by
distinguishing the mean values of the signal and background
distributions.

The linear discriminant analysis determines an axis in the (correlated)
hyperspace of the input variables
such that, when projecting the output classes (signal and background)
upon this axis, they are pushed as far as possible away from each other,
while events of the same class are confined to a close vicinity.
The linearity property of this method is reflected in the metric with
which "far apart" and "close vicinity" are determined: the covariance
matrix of the discriminant variable space.

The classification of the events in signal and background classes
relies on the following characteristics (only): overall sample means, \f$ x_i \f$,
for each input variable, \f$ i \f$,
class-specific sample means, \f$ x_{S(B),i}\f$,
and total covariance matrix \f$ T_{ij} \f$. The covariance matrix
can be decomposed into the sum of a _within-class_ (\f$ W_{ij} \f$)
and a _between-class_ (\f$ B_{ij} \f$) matrix. They describe
the dispersion of events relative to the means of their own class (within-class
matrix), and relative to the overall sample means (between-class matrix).
The Fisher coefficients, \f$ F_i \f$, are then given by

\f[
F_i = \frac{\sqrt{N_S N_B}}{N_S + N_B} \sum_{j=1}^{N_{SB}} W_{ij}^{-1} (\bar{X}_{Sj} - \bar{X}_{Bj})
\f]

where TMVA sets \f$ N_S = N_B \f$, so that the factor
in front of the sum simplifies to \f$ \frac{1}{2}\f$.
The Fisher discriminant then reads

\f[
x_{F} = F_0 + \sum_{i=1}^{N_{SB}} F_i X_i
\f]

The offset \f$ F_0 \f$ centers the sample mean of \f$ x_{F} \f$
at zero. Instead of using the within-class matrix, the Mahalanobis variant
determines the Fisher coefficients as follows:

\f[
F_i = \frac{\sqrt{N_S N_B}}{N_S + N_B} \sum_{j=1}^{N_{SB}} (W + B)_{ij}^{-1} (\bar{X}_{Sj} - \bar{X}_{Bj})
\f]

with resulting \f$ x_{Ma} \f$ that are very similar to the \f$ x_{F} \f$.

TMVA provides two outputs for the ranking of the input variables:

  - __Fisher test:__ the Fisher analysis aims at simultaneously maximising
the between-class separation while minimising the within-class dispersion.
A useful measure of the discrimination power of a variable is hence given
by the diagonal quantity \f$ \frac{B_{ii}}{W_{ii}} \f$.

  - __Discrimination power:__ the value of the Fisher coefficient is a
measure of the discriminating power of a variable. The discrimination power
of a set of input variables can therefore be measured by the scalar

\f[
\lambda = \frac{\sqrt{N_S N_B}}{N_S + N_B} \sum_{i=1}^{N_{SB}} F_i (\bar{X}_{Si} - \bar{X}_{Bi})
\f]

The corresponding numbers are printed on standard output.
*/

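As a concrete illustration of the formulas in the class description, the following self-contained sketch computes Fisher coefficients for two input variables with equal, unit event weights (so the prefactor is 1/2). It uses plain arrays instead of `TMatrixD`; the names `FisherToy`, `FitFisherToy` and `EvalFisherToy` are illustrative and not part of the TMVA API.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Toy sketch of the Fisher-coefficient formula, for two input variables and
// equal, unit event weights (prefactor sqrt(Ns*Nb)/(Ns+Nb) = 1/2).
// Illustrative names only, not the TMVA API.
struct FisherToy {
   double f0;               // offset F_0
   std::array<double, 2> f; // coefficients F_i
};

inline FisherToy FitFisherToy(const double sig[][2], std::size_t nSig,
                              const double bgd[][2], std::size_t nBgd)
{
   // class means
   double mS[2] = {0, 0}, mB[2] = {0, 0};
   for (std::size_t i = 0; i < nSig; ++i)
      for (int v = 0; v < 2; ++v) mS[v] += sig[i][v] / nSig;
   for (std::size_t i = 0; i < nBgd; ++i)
      for (int v = 0; v < 2; ++v) mB[v] += bgd[i][v] / nBgd;

   // 'within class' matrix W: per-class covariances, each normalised to its
   // own number of events, then summed
   double W[2][2] = {{0, 0}, {0, 0}};
   for (std::size_t i = 0; i < nSig; ++i)
      for (int x = 0; x < 2; ++x)
         for (int y = 0; y < 2; ++y)
            W[x][y] += (sig[i][x] - mS[x]) * (sig[i][y] - mS[y]) / nSig;
   for (std::size_t i = 0; i < nBgd; ++i)
      for (int x = 0; x < 2; ++x)
         for (int y = 0; y < 2; ++y)
            W[x][y] += (bgd[i][x] - mB[x]) * (bgd[i][y] - mB[y]) / nBgd;

   // invert the 2x2 within-class matrix
   const double det = W[0][0]*W[1][1] - W[0][1]*W[1][0];
   const double Winv[2][2] = {{ W[1][1]/det, -W[0][1]/det},
                              {-W[1][0]/det,  W[0][0]/det}};

   // F_i = sqrt(Ns*Nb)/(Ns+Nb) * sum_j Winv_ij * (mS_j - mB_j)
   const double xfact = std::sqrt(double(nSig)*nBgd) / (nSig + nBgd);
   FisherToy r{0.0, {0.0, 0.0}};
   for (int i = 0; i < 2; ++i) {
      for (int j = 0; j < 2; ++j) r.f[i] += Winv[i][j] * (mS[j] - mB[j]);
      r.f[i] *= xfact;
   }

   // F_0 centers the discriminant midway between the class means
   for (int i = 0; i < 2; ++i) r.f0 += r.f[i] * (mS[i] + mB[i]);
   r.f0 /= -2.0;
   return r;
}

inline double EvalFisherToy(const FisherToy& ft, double x0, double x1)
{
   return ft.f0 + ft.f[0]*x0 + ft.f[1]*x1;
}
```

With equal class sizes the prefactor reduces to 1/2, matching the simplification noted in the text; signal-like events then come out above zero and background-like events below.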
#include "TMVA/MethodFisher.h"

#include "TMVA/ClassifierFactory.h"
#include "TMVA/Configurable.h"
#include "TMVA/DataSet.h"
#include "TMVA/DataSetInfo.h"
#include "TMVA/Event.h"
#include "TMVA/IMethod.h"
#include "TMVA/MethodBase.h"
#include "TMVA/MsgLogger.h"
#include "TMVA/Ranking.h"
#include "TMVA/Tools.h"
#include "TMVA/TransformationHandler.h"
#include "TMVA/Types.h"
#include "TMVA/VariableTransformBase.h"

#include "TMath.h"
#include "TMatrix.h"
#include "TList.h"

#include <iostream>
#include <iomanip>
#include <cassert>
#include <cstring>   // memset

REGISTER_METHOD(Fisher)

ClassImp(TMVA::MethodFisher);

////////////////////////////////////////////////////////////////////////////////
/// standard constructor for the "Fisher" method

TMVA::MethodFisher::MethodFisher( const TString& jobName,
                                  const TString& methodTitle,
                                  DataSetInfo& dsi,
                                  const TString& theOption ) :
   MethodBase( jobName, Types::kFisher, methodTitle, dsi, theOption),
   fMeanMatx     ( 0 ),
   fTheMethod    ( "Fisher" ),
   fFisherMethod ( kFisher ),
   fBetw         ( 0 ),
   fWith         ( 0 ),
   fCov          ( 0 ),
   fSumOfWeightsS( 0 ),
   fSumOfWeightsB( 0 ),
   fDiscrimPow   ( 0 ),
   fFisherCoeff  ( 0 ),
   fF0           ( 0 )
{
}

////////////////////////////////////////////////////////////////////////////////
/// constructor from weight file

TMVA::MethodFisher::MethodFisher( DataSetInfo& dsi,
                                  const TString& theWeightFile) :
   MethodBase( Types::kFisher, dsi, theWeightFile),
   fMeanMatx     ( 0 ),
   fTheMethod    ( "Fisher" ),
   fFisherMethod ( kFisher ),
   fBetw         ( 0 ),
   fWith         ( 0 ),
   fCov          ( 0 ),
   fSumOfWeightsS( 0 ),
   fSumOfWeightsB( 0 ),
   fDiscrimPow   ( 0 ),
   fFisherCoeff  ( 0 ),
   fF0           ( 0 )
{
}

////////////////////////////////////////////////////////////////////////////////
/// default initialization called by all constructors

void TMVA::MethodFisher::Init( void )
{
   // allocate Fisher coefficients
   fFisherCoeff = new std::vector<Double_t>( GetNvar() );

   // the minimum requirement to declare an event signal-like
   SetSignalReferenceCut( 0.0 );

   // this is the preparation for training
   InitMatrices();
}

////////////////////////////////////////////////////////////////////////////////
/// MethodFisher options:
/// format and syntax of option string: "type"
/// where type is "Fisher" or "Mahalanobis"

void TMVA::MethodFisher::DeclareOptions()
{
   DeclareOptionRef( fTheMethod = "Fisher", "Method", "Discrimination method" );
   AddPreDefVal(TString("Fisher"));
   AddPreDefVal(TString("Mahalanobis"));
}

////////////////////////////////////////////////////////////////////////////////
/// process user options

void TMVA::MethodFisher::ProcessOptions()
{
   if (fTheMethod == "Fisher" ) fFisherMethod = kFisher;
   else                         fFisherMethod = kMahalanobis;

   // this is the preparation for training
   InitMatrices();
}

////////////////////////////////////////////////////////////////////////////////
/// destructor

TMVA::MethodFisher::~MethodFisher( void )
{
   if (fMeanMatx   ) { delete fMeanMatx;    fMeanMatx    = 0; } // allocated in InitMatrices()
   if (fBetw       ) { delete fBetw;        fBetw        = 0; }
   if (fWith       ) { delete fWith;        fWith        = 0; }
   if (fCov        ) { delete fCov;         fCov         = 0; }
   if (fDiscrimPow ) { delete fDiscrimPow;  fDiscrimPow  = 0; }
   if (fFisherCoeff) { delete fFisherCoeff; fFisherCoeff = 0; }
}

////////////////////////////////////////////////////////////////////////////////
/// Fisher can only handle classification with 2 classes

Bool_t TMVA::MethodFisher::HasAnalysisType( Types::EAnalysisType type, UInt_t numberClasses, UInt_t /*numberTargets*/ )
{
   if (type == Types::kClassification && numberClasses == 2) return kTRUE;
   return kFALSE;
}

////////////////////////////////////////////////////////////////////////////////
/// computation of Fisher coefficients by series of matrix operations

void TMVA::MethodFisher::Train( void )
{
   // get mean value of each variable for signal, backgd and signal+backgd
   GetMean();

   // get the matrix of covariance 'within class'
   GetCov_WithinClass();

   // get the matrix of covariance 'between class'
   GetCov_BetweenClass();

   // get the full covariance matrix (within + between)
   GetCov_Full();

   //--------------------------------------------------------------

   // get the Fisher coefficients
   GetFisherCoeff();

   // get the discriminating power of each variable
   GetDiscrimPower();

   // nice output
   PrintCoefficients();

   ExitFromTraining();
}

////////////////////////////////////////////////////////////////////////////////
/// returns the Fisher value (no fixed range)

Double_t TMVA::MethodFisher::GetMvaValue( Double_t* err, Double_t* errUpper )
{
   const Event * ev = GetEvent();
   Double_t result = fF0;
   for (UInt_t ivar=0; ivar<GetNvar(); ivar++)
      result += (*fFisherCoeff)[ivar]*ev->GetValue(ivar);

   // cannot determine error
   NoErrorCalc(err, errUpper);

   return result;
}

////////////////////////////////////////////////////////////////////////////////
/// initialization method; creates global matrices and vectors

void TMVA::MethodFisher::InitMatrices( void )
{
   // average value of each variable for S, B, S+B
   fMeanMatx = new TMatrixD( GetNvar(), 3 );

   // the covariance 'within class' and 'between class' matrices
   fBetw = new TMatrixD( GetNvar(), GetNvar() );
   fWith = new TMatrixD( GetNvar(), GetNvar() );
   fCov  = new TMatrixD( GetNvar(), GetNvar() );

   // discriminating power
   fDiscrimPow = new std::vector<Double_t>( GetNvar() );
}

////////////////////////////////////////////////////////////////////////////////
/// compute mean values of variables in each sample, and the overall means

void TMVA::MethodFisher::GetMean( void )
{
   // initialize internal sum-of-weights variables
   fSumOfWeightsS = 0;
   fSumOfWeightsB = 0;

   const UInt_t nvar = DataInfo().GetNVariables();

   // init vectors
   Double_t* sumS = new Double_t[nvar];
   Double_t* sumB = new Double_t[nvar];
   for (UInt_t ivar=0; ivar<nvar; ivar++) { sumS[ivar] = sumB[ivar] = 0; }

   // compute sample means
   for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {

      // read the Training Event into "event"
      const Event * ev = GetEvent(ievt);

      // sum of weights
      Double_t weight = ev->GetWeight();
      if (DataInfo().IsSignal(ev)) fSumOfWeightsS += weight;
      else                         fSumOfWeightsB += weight;

      Double_t* sum = DataInfo().IsSignal(ev) ? sumS : sumB;

      for (UInt_t ivar=0; ivar<nvar; ivar++) sum[ivar] += ev->GetValue( ivar )*weight;
   }

   for (UInt_t ivar=0; ivar<nvar; ivar++) {
      (*fMeanMatx)( ivar, 2 ) = sumS[ivar];
      (*fMeanMatx)( ivar, 0 ) = sumS[ivar]/fSumOfWeightsS;

      (*fMeanMatx)( ivar, 2 ) += sumB[ivar];
      (*fMeanMatx)( ivar, 1 ) = sumB[ivar]/fSumOfWeightsB;

      // signal + background
      (*fMeanMatx)( ivar, 2 ) /= (fSumOfWeightsS + fSumOfWeightsB);
   }

   // fMeanMatx->Print();
   delete [] sumS;
   delete [] sumB;
}

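The `fMeanMatx` column convention used in GetMean() (column 0: signal mean, 1: background mean, 2: overall mean) can be sketched for a single variable with plain weighted sums. A stand-alone illustration; `Means` and `WeightedMeans` are not TMVA names:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted per-class and overall means for one variable, mirroring the three
// columns of fMeanMatx (0: signal, 1: background, 2: overall). Sketch only.
struct Means { double sig, bgd, all; };

inline Means WeightedMeans(const std::vector<double>& x,
                           const std::vector<double>& w,
                           const std::vector<bool>&   isSignal)
{
   double sumS = 0, sumB = 0, wS = 0, wB = 0;
   for (std::size_t i = 0; i < x.size(); ++i) {
      if (isSignal[i]) { sumS += w[i]*x[i]; wS += w[i]; }
      else             { sumB += w[i]*x[i]; wB += w[i]; }
   }
   // each class mean uses its own sum of weights; the overall mean uses both
   return { sumS/wS, sumB/wB, (sumS + sumB)/(wS + wB) };
}
```

Note that the overall mean is the weight-averaged mean of the full sample, not the midpoint of the two class means.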
////////////////////////////////////////////////////////////////////////////////
/// the matrix of covariance 'within class' reflects the dispersion of the
/// events relative to the center of gravity of their own class

void TMVA::MethodFisher::GetCov_WithinClass( void )
{
   // assert required
   assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0 );

   // product matrices (x-<x>)(y-<y>) where x;y are variables

   // init
   const Int_t nvar  = GetNvar();
   const Int_t nvar2 = nvar*nvar;
   Double_t *sumSig  = new Double_t[nvar2];
   Double_t *sumBgd  = new Double_t[nvar2];
   Double_t *xval    = new Double_t[nvar];
   memset(sumSig,0,nvar2*sizeof(Double_t));
   memset(sumBgd,0,nvar2*sizeof(Double_t));

   // 'within class' covariance
   for (Int_t ievt=0; ievt<Data()->GetNEvents(); ievt++) {

      // read the Training Event into "event"
      const Event* ev = GetEvent(ievt);

      Double_t weight = ev->GetWeight(); // may ignore events with negative weights

      for (Int_t x=0; x<nvar; x++) xval[x] = ev->GetValue( x );
      Int_t k=0;
      for (Int_t x=0; x<nvar; x++) {
         for (Int_t y=0; y<nvar; y++) {
            if (DataInfo().IsSignal(ev)) {
               Double_t v = ( (xval[x] - (*fMeanMatx)(x, 0))*(xval[y] - (*fMeanMatx)(y, 0)) )*weight;
               sumSig[k] += v;
            }
            else {
               Double_t v = ( (xval[x] - (*fMeanMatx)(x, 1))*(xval[y] - (*fMeanMatx)(y, 1)) )*weight;
               sumBgd[k] += v;
            }
            k++;
         }
      }
   }
   Int_t k=0;
   for (Int_t x=0; x<nvar; x++) {
      for (Int_t y=0; y<nvar; y++) {
         //(*fWith)(x, y) = (sumSig[k] + sumBgd[k])/(fSumOfWeightsS + fSumOfWeightsB);
         // HHV: I am still convinced that THIS (below) is how it should be. However, while
         // the old version corresponded so nicely with LD, the FIXED version does not, unless
         // we agree to change LD. For LD it is not "defined", to my knowledge, how the weights
         // are treated, while it is clear how the "within" matrix for Fisher should be calculated
         // (i.e. as below). In order to agree with the Fisher classifier, one would have to
         // weight signal and background such that they correspond to the same number of
         // effective (weighted) events.
         // That is NOT done currently; plain "event weights" are used.
         (*fWith)(x, y) = sumSig[k]/fSumOfWeightsS + sumBgd[k]/fSumOfWeightsB;
         k++;
      }
   }

   delete [] sumSig;
   delete [] sumBgd;
   delete [] xval;
}

////////////////////////////////////////////////////////////////////////////////
/// the matrix of covariance 'between class' reflects the dispersion of the
/// events of a class relative to the global center of gravity of all the
/// classes, and hence the separation between the classes

void TMVA::MethodFisher::GetCov_BetweenClass( void )
{
   // assert required
   assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0 );

   Double_t prodSig, prodBgd;

   for (UInt_t x=0; x<GetNvar(); x++) {
      for (UInt_t y=0; y<GetNvar(); y++) {

         prodSig = ( ((*fMeanMatx)(x, 0) - (*fMeanMatx)(x, 2))*
                     ((*fMeanMatx)(y, 0) - (*fMeanMatx)(y, 2)) );
         prodBgd = ( ((*fMeanMatx)(x, 1) - (*fMeanMatx)(x, 2))*
                     ((*fMeanMatx)(y, 1) - (*fMeanMatx)(y, 2)) );

         (*fBetw)(x, y) = (fSumOfWeightsS*prodSig + fSumOfWeightsB*prodBgd) / (fSumOfWeightsS + fSumOfWeightsB);
      }
   }
}

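For a single variable the between-class formula in GetCov_BetweenClass() reduces to the weighted spread of the two class means around the overall mean. A small stand-alone sketch (the function name is illustrative, not a TMVA symbol):

```cpp
#include <cassert>
#include <cmath>

// One-variable 'between class' term: weighted spread of the class means
// around the weighted overall mean, as in GetCov_BetweenClass(). Sketch only.
inline double BetweenClassTerm(double meanSig, double meanBgd,
                               double wSig, double wBgd)
{
   const double overall = (wSig*meanSig + wBgd*meanBgd) / (wSig + wBgd);
   const double dS = meanSig - overall;
   const double dB = meanBgd - overall;
   return (wSig*dS*dS + wBgd*dB*dB) / (wSig + wBgd);
}
```

The term vanishes when the class means coincide, which is exactly the "no discrimination for equal means" limitation discussed in GetHelpMessage().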
////////////////////////////////////////////////////////////////////////////////
/// compute full covariance matrix from sum of within and between matrices

void TMVA::MethodFisher::GetCov_Full( void )
{
   for (UInt_t x=0; x<GetNvar(); x++)
      for (UInt_t y=0; y<GetNvar(); y++)
         (*fCov)(x, y) = (*fWith)(x, y) + (*fBetw)(x, y);
}

////////////////////////////////////////////////////////////////////////////////
/// Fisher = Sum { [coeff]*[variables] }
///
/// let Xs be the array of the mean values of variables for signal evts
/// let Xb be the array of the mean values of variables for backgd evts
/// let InvWith be the inverse matrix of the 'within class' correlation matrix
///
/// then the array of Fisher coefficients is
/// [coeff] = sqrt(fNsig*fNbgd)/fNevt * transpose{Xs-Xb} * InvWith

void TMVA::MethodFisher::GetFisherCoeff( void )
{
   // assert required
   assert( fSumOfWeightsS > 0 && fSumOfWeightsB > 0 );

   // invert covariance matrix
   TMatrixD* theMat = 0;
   switch (GetFisherMethod()) {
   case kFisher:
      theMat = fWith;
      break;
   case kMahalanobis:
      theMat = fCov;
      break;
   default:
      Log() << kFATAL << "<GetFisherCoeff> undefined method" << GetFisherMethod() << Endl;
   }

   TMatrixD invCov( *theMat );

   if ( TMath::Abs(invCov.Determinant()) < 10E-24 ) {
      Log() << kWARNING << "<GetFisherCoeff> matrix is almost singular with determinant="
            << TMath::Abs(invCov.Determinant())
            << " did you use variables that are linear combinations or highly correlated?"
            << Endl;
   }
   if ( TMath::Abs(invCov.Determinant()) < 10E-120 ) {
      theMat->Print();
      Log() << kFATAL << "<GetFisherCoeff> matrix is singular with determinant="
            << TMath::Abs(invCov.Determinant())
            << " did you use variables that are linear combinations?\n"
            << " do you have any clue as to what went wrong in the above printout of the covariance matrix?"
            << Endl;
   }

   invCov.Invert();

   // apply rescaling factor
   Double_t xfact = TMath::Sqrt( fSumOfWeightsS*fSumOfWeightsB ) / (fSumOfWeightsS + fSumOfWeightsB);

   // compute difference of mean values
   std::vector<Double_t> diffMeans( GetNvar() );
   UInt_t ivar, jvar;
   for (ivar=0; ivar<GetNvar(); ivar++) {
      (*fFisherCoeff)[ivar] = 0;

      for (jvar=0; jvar<GetNvar(); jvar++) {
         Double_t d = (*fMeanMatx)(jvar, 0) - (*fMeanMatx)(jvar, 1);
         (*fFisherCoeff)[ivar] += invCov(ivar, jvar)*d;
      }
      // rescale
      (*fFisherCoeff)[ivar] *= xfact;
   }

   // offset correction
   fF0 = 0.0;
   for (ivar=0; ivar<GetNvar(); ivar++){
      fF0 += (*fFisherCoeff)[ivar]*((*fMeanMatx)(ivar, 0) + (*fMeanMatx)(ivar, 1));
   }
   fF0 /= -2.0;
}

////////////////////////////////////////////////////////////////////////////////
/// computation of discrimination power indicator for each variable:
/// small values of "fWith" indicate that sig & backgd are each compact
/// around their class mean; big values of "fBetw" indicate large separation
/// between sig & backgd.
///
/// we want signal & backgd classes as compact and separated as possible;
/// the discriminating power is then defined as the ratio of the diagonal
/// elements, "fBetw/fCov"

void TMVA::MethodFisher::GetDiscrimPower( void )
{
   for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
      if ((*fCov)(ivar, ivar) != 0)
         (*fDiscrimPow)[ivar] = (*fBetw)(ivar, ivar)/(*fCov)(ivar, ivar);
      else
         (*fDiscrimPow)[ivar] = 0;
   }
}

////////////////////////////////////////////////////////////////////////////////
/// computes ranking of input variables

const TMVA::Ranking* TMVA::MethodFisher::CreateRanking()
{
   // create the ranking object
   fRanking = new Ranking( GetName(), "Discr. power" );

   for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
      fRanking->AddRank( Rank( GetInputLabel(ivar), (*fDiscrimPow)[ivar] ) );
   }

   return fRanking;
}

////////////////////////////////////////////////////////////////////////////////
/// display Fisher coefficients and discriminating power for each variable;
/// check maximum length of variable name

void TMVA::MethodFisher::PrintCoefficients( void )
{
   Log() << kHEADER << "Results for Fisher coefficients:" << Endl;

   if (GetTransformationHandler().GetTransformationList().GetSize() != 0) {
      Log() << kINFO << "NOTE: The coefficients must be applied to TRANSFORMED variables" << Endl;
      Log() << kINFO << "      List of the transformations: " << Endl;
      TListIter trIt(&GetTransformationHandler().GetTransformationList());
      while (VariableTransformBase *trf = (VariableTransformBase*) trIt()) {
         Log() << kINFO << "  -- " << trf->GetName() << Endl;
      }
   }
   std::vector<TString>  vars;
   std::vector<Double_t> coeffs;
   for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
      vars  .push_back( GetInputLabel(ivar) );
      coeffs.push_back( (*fFisherCoeff)[ivar] );
   }
   vars  .push_back( "(offset)" );
   coeffs.push_back( fF0 );
   TMVA::gTools().FormattedOutput( coeffs, vars, "Variable" , "Coefficient", Log() );

   // for (int i=0; i<coeffs.size(); i++)
   //    std::cout << "fisher coeff["<<i<<"]="<<coeffs[i]<<std::endl;

   if (IsNormalised()) {
      Log() << kINFO << "NOTE: You have chosen to use the \"Normalise\" booking option. Hence, the" << Endl;
      Log() << kINFO << "      coefficients must be applied to NORMALISED (') variables as follows:" << Endl;
      Int_t maxL = 0;
      for (UInt_t ivar=0; ivar<GetNvar(); ivar++) if (GetInputLabel(ivar).Length() > maxL) maxL = GetInputLabel(ivar).Length();

      // Print normalisation expression (see Tools.cxx): "2*(x - xmin)/(xmax - xmin) - 1.0"
      for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
         Log() << kINFO
               << std::setw(maxL+9) << TString("[") + GetInputLabel(ivar) + "]' = 2*("
               << std::setw(maxL+2) << TString("[") + GetInputLabel(ivar) + "]"
               << std::setw(3) << (GetXmin(ivar) > 0 ? " - " : " + ")
               << std::setw(6) << TMath::Abs(GetXmin(ivar)) << std::setw(3) << ")/"
               << std::setw(6) << (GetXmax(ivar) - GetXmin(ivar) )
               << std::setw(3) << " - 1"
               << Endl;
      }
      Log() << kINFO << "The TMVA Reader will properly account for this normalisation, but if the" << Endl;
      Log() << kINFO << "Fisher classifier is applied outside the Reader, the transformation must be" << Endl;
      Log() << kINFO << "implemented -- or the \"Normalise\" option is removed and Fisher retrained." << Endl;
      Log() << kINFO << Endl;
   }
}

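The normalisation expression printed by PrintCoefficients() maps each variable from [xmin, xmax] onto [-1, +1]. As a one-line sketch (illustrative helper, not the TMVA transformation code itself):

```cpp
#include <cassert>

// The "Normalise" mapping printed by PrintCoefficients():
// x in [xmin, xmax] -> x' = 2*(x - xmin)/(xmax - xmin) - 1, so x' is in [-1, +1].
inline double NormaliseVar(double x, double xmin, double xmax)
{
   return 2.0*(x - xmin)/(xmax - xmin) - 1.0;
}
```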
////////////////////////////////////////////////////////////////////////////////
/// read Fisher coefficients from weight file

void TMVA::MethodFisher::ReadWeightsFromStream( std::istream& istr )
{
   istr >> fF0;
   for (UInt_t ivar=0; ivar<GetNvar(); ivar++) istr >> (*fFisherCoeff)[ivar];
}

////////////////////////////////////////////////////////////////////////////////
/// create XML description of Fisher classifier

void TMVA::MethodFisher::AddWeightsXMLTo( void* parent ) const
{
   void* wght = gTools().AddChild(parent, "Weights");
   gTools().AddAttr( wght, "NCoeff", GetNvar()+1 );
   void* coeffxml = gTools().AddChild(wght, "Coefficient");
   gTools().AddAttr( coeffxml, "Index", 0   );
   gTools().AddAttr( coeffxml, "Value", fF0 );
   for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
      coeffxml = gTools().AddChild( wght, "Coefficient" );
      gTools().AddAttr( coeffxml, "Index", ivar+1 );
      gTools().AddAttr( coeffxml, "Value", (*fFisherCoeff)[ivar] );
   }
}

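For reference, AddWeightsXMLTo() produces a weight node of the following shape for two input variables; the exact attribute formatting (precision, ordering) is handled by gTools(), and the numeric values here are purely illustrative:

```xml
<Weights NCoeff="3">
  <Coefficient Index="0" Value="-6.0"/>  <!-- offset F0 -->
  <Coefficient Index="1" Value="2.0"/>   <!-- coefficient for variable 0 -->
  <Coefficient Index="2" Value="2.0"/>   <!-- coefficient for variable 1 -->
</Weights>
```

ReadWeightsFromXML() below inverts this convention: `Index` 0 is restored into fF0, and `Index` i+1 into `(*fFisherCoeff)[i]`.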
////////////////////////////////////////////////////////////////////////////////
/// read Fisher coefficients from xml weight file

void TMVA::MethodFisher::ReadWeightsFromXML( void* wghtnode )
{
   UInt_t ncoeff, coeffidx;
   gTools().ReadAttr( wghtnode, "NCoeff", ncoeff );
   fFisherCoeff->resize(ncoeff-1);

   void* ch = gTools().GetChild(wghtnode);
   Double_t coeff;
   while (ch) {
      gTools().ReadAttr( ch, "Index", coeffidx );
      gTools().ReadAttr( ch, "Value", coeff    );
      if (coeffidx==0) fF0 = coeff;
      else             (*fFisherCoeff)[coeffidx-1] = coeff;
      ch = gTools().GetNextChild(ch);
   }
}

////////////////////////////////////////////////////////////////////////////////
/// write Fisher-specific classifier response

void TMVA::MethodFisher::MakeClassSpecific( std::ostream& fout, const TString& className ) const
{
   Int_t dp = fout.precision();
   fout << "   double              fFisher0;" << std::endl;
   fout << "   std::vector<double> fFisherCoefficients;" << std::endl;
   fout << "};" << std::endl;
   fout << "" << std::endl;
   fout << "inline void " << className << "::Initialize() " << std::endl;
   fout << "{" << std::endl;
   fout << "   fFisher0 = " << std::setprecision(12) << fF0 << ";" << std::endl;
   for (UInt_t ivar=0; ivar<GetNvar(); ivar++) {
      fout << "   fFisherCoefficients.push_back( " << std::setprecision(12) << (*fFisherCoeff)[ivar] << " );" << std::endl;
   }
   fout << std::endl;
   fout << "   // sanity check" << std::endl;
   fout << "   if (fFisherCoefficients.size() != fNvars) {" << std::endl;
   fout << "      std::cout << \"Problem in class \\\"\" << fClassName << \"\\\"::Initialize: mismatch in number of input values\"" << std::endl;
   fout << "                << fFisherCoefficients.size() << \" != \" << fNvars << std::endl;" << std::endl;
   fout << "      fStatusIsClean = false;" << std::endl;
   fout << "   }" << std::endl;
   fout << "}" << std::endl;
   fout << std::endl;
   fout << "inline double " << className << "::GetMvaValue__( const std::vector<double>& inputValues ) const" << std::endl;
   fout << "{" << std::endl;
   fout << "   double retval = fFisher0;" << std::endl;
   fout << "   for (size_t ivar = 0; ivar < fNvars; ivar++) {" << std::endl;
   fout << "      retval += fFisherCoefficients[ivar]*inputValues[ivar];" << std::endl;
   fout << "   }" << std::endl;
   fout << std::endl;
   fout << "   return retval;" << std::endl;
   fout << "}" << std::endl;
   fout << std::endl;
   fout << "// Clean up" << std::endl;
   fout << "inline void " << className << "::Clear() " << std::endl;
   fout << "{" << std::endl;
   fout << "   // clear coefficients" << std::endl;
   fout << "   fFisherCoefficients.clear();" << std::endl;
   fout << "}" << std::endl;
   fout << std::setprecision(dp);
}

////////////////////////////////////////////////////////////////////////////////
/// get help message text
///
/// typical length of text line:
///         "|--------------------------------------------------------------|"

void TMVA::MethodFisher::GetHelpMessage() const
{
   Log() << Endl;
   Log() << gTools().Color("bold") << "--- Short description:" << gTools().Color("reset") << Endl;
   Log() << Endl;
   Log() << "Fisher discriminants select events by distinguishing the mean " << Endl;
   Log() << "values of the signal and background distributions in a trans- " << Endl;
   Log() << "formed variable space where linear correlations are removed." << Endl;
   Log() << Endl;
   Log() << "   (More precisely: the \"linear discriminator\" determines" << Endl;
   Log() << "   an axis in the (correlated) hyperspace of the input " << Endl;
   Log() << "   variables such that, when projecting the output classes " << Endl;
   Log() << "   (signal and background) upon this axis, they are pushed " << Endl;
   Log() << "   as far as possible away from each other, while events" << Endl;
   Log() << "   of the same class are confined to a close vicinity. The " << Endl;
   Log() << "   linearity property of this classifier is reflected in the " << Endl;
   Log() << "   metric with which \"far apart\" and \"close vicinity\" are " << Endl;
   Log() << "   determined: the covariance matrix of the discriminating" << Endl;
   Log() << "   variable space.)" << Endl;
   Log() << Endl;
   Log() << gTools().Color("bold") << "--- Performance optimisation:" << gTools().Color("reset") << Endl;
   Log() << Endl;
   Log() << "Optimal performance for Fisher discriminants is obtained for " << Endl;
   Log() << "linearly correlated Gaussian-distributed variables. Any deviation" << Endl;
   Log() << "from this ideal reduces the achievable separation power. In " << Endl;
   Log() << "particular, no discrimination at all is achieved for a variable" << Endl;
   Log() << "that has the same sample mean for signal and background, even if " << Endl;
   Log() << "the shapes of the distributions are very different. Thus, Fisher " << Endl;
   Log() << "discriminants often benefit from suitable transformations of the " << Endl;
   Log() << "input variables. For example, if a variable x in [-1,1] has a " << Endl;
   Log() << "parabolic signal distribution and a uniform background" << Endl;
   Log() << "distribution, the mean value is zero in both cases, leading " << Endl;
   Log() << "to no separation. The simple transformation x -> |x| renders this " << Endl;
   Log() << "variable powerful for use in a Fisher discriminant." << Endl;
   Log() << Endl;
   Log() << gTools().Color("bold") << "--- Performance tuning via configuration options:" << gTools().Color("reset") << Endl;
   Log() << Endl;
   Log() << "<None>" << Endl;
}