Likelihood analysis ("non-parametric approach")
Also implemented is a "diagonalized likelihood approach", which improves over the uncorrelated likelihood ansatz by linearly transforming the input variables into a diagonal space, using the square-root of the covariance matrix.
The method of maximum likelihood is the most straightforward, and
certainly among the most elegant multivariate analyser approaches.
We define the likelihood ratio, R_L, for event i by:
R_L(i) = L_S(i) / (L_S(i) + L_B(i)),
where L_S(i) (L_B(i)) is the product over all input variables k of the signal (background) PDF values p_{S,k}(x_k(i)) (p_{B,k}(x_k(i))).
Note that in TMVA the output of the likelihood ratio is transformed by an inverse-sigmoid (logit) mapping,
R_L --> -ln(1/R_L - 1) / tau, with tau = 15,
to flatten the otherwise strongly peaked distribution.
The biggest drawback of the likelihood approach is that it assumes that the discriminating variables are uncorrelated. If that were the case, it can be proven that the discrimination obtained by the above likelihood ratio is optimal, i.e., no other method can beat it. However, in most practical applications of MVAs, correlations are present.
Linear correlations, measured from the training sample, can be taken into account in a straightforward manner through the square-root of the covariance matrix. The square-root of a matrix C is the matrix C′ that, multiplied with itself, yields C: C = C′C′. We compute the square-root matrix (SQM) by diagonalising the covariance matrix:
D = S^T C S  ==>  C′ = S sqrt(D) S^T,
where D is a diagonal matrix and S is the orthogonal matrix of eigenvectors of C.
The above diagonalisation is complete only for linearly correlated, Gaussian-distributed variables. In real-world examples this is often not the case, so that little additional information may be recovered by the diagonalisation procedure. In these cases, non-linear methods must be applied.
| virtual void | DeclareOptions() |
| void | InitLik() |
| virtual void | ProcessOptions() |
| Double_t | TransformLikelihoodOutput(Double_t ps, Double_t pb) const |
| enum TMVA::MethodBase::EWeightFileType { | kROOT | |
| kTEXT | ||
| }; | ||
| enum TMVA::MethodBase::ECutOrientation { | kNegative | |
| kPositive | ||
| }; | ||
| enum TObject::EStatusBits { | kCanDelete | |
| kMustCleanup | ||
| kObjInCanvas | ||
| kIsReferenced | ||
| kHasUUID | ||
| kCannotPick | ||
| kNoContextMenu | ||
| kInvalidObject | ||
| }; | ||
| enum TObject::[unnamed] { | kIsOnHeap | |
| kNotDeleted | ||
| kZombie | ||
| kBitMask | ||
| kSingleKey | ||
| kOverwrite | ||
| kWriteDelete | ||
| }; | ||
| TMVA::MsgLogger | TMVA::Configurable::fLogger | message logger |
| vector<TString>* | TMVA::MethodBase::fInputVars | vector of input variables used in MVA |
| TMVA::MsgLogger | TMVA::MethodBase::fLogger | message logger |
| Int_t | TMVA::MethodBase::fNbins | number of bins in representative histograms |
| Int_t | TMVA::MethodBase::fNbinsH | number of bins in evaluation histograms |
| TMVA::Ranking* | TMVA::MethodBase::fRanking | pointer to ranking object (created by derived classifiers) |
| Int_t | fAverageEvtPerBin | average events per bin; used to calculate fNbins |
| Int_t* | fAverageEvtPerBinVarB | average events per bin; used to calculate fNbins |
| Int_t* | fAverageEvtPerBinVarS | average events per bin; used to calculate fNbins |
| TMVA::KDEKernel::EKernelBorder | fBorderMethod | method for handling "border" effects |
| TString | fBorderMethodString | method for handling "border" effects (string) |
| Int_t | fDropVariable | for ranking test |
| Double_t | fEpsilon | minimum likelihood value (to avoid zero) |
| vector<TH1*>* | fHistBgd | background PDFs (histograms) |
| vector<TH1*>* | fHistBgd_smooth | background PDFs (smoothed histograms) |
| vector<TH1*>* | fHistSig | signal PDFs (histograms) |
| vector<TH1*>* | fHistSig_smooth | signal PDFs (smoothed histograms) |
| TMVA::PDF::EInterpolateMethod* | fInterpolateMethod | enumerators encoding the interpolation method |
| TString* | fInterpolateString | which interpolation method used for reference histograms (individual for each variable) |
| Float_t | fKDEfineFactor | fine tuning factor for Adaptive KDE: factor to multiply the "width" of the Kernel function |
| TMVA::KDEKernel::EKernelIter | fKDEiter | Number of iterations |
| TString | fKDEiterString | Number of iterations (string) |
| TMVA::KDEKernel::EKernelType | fKDEtype | Kernel type to use for KDE |
| TString | fKDEtypeString | Kernel type to use for KDE (string) (if KDE is selected for interpolation) |
| Int_t | fNsmooth | number of smooth passes |
| Int_t* | fNsmoothVarB | number of smooth passes |
| Int_t* | fNsmoothVarS | number of smooth passes |
| vector<PDF*>* | fPDFBgd | list of PDFs (background) |
| vector<PDF*>* | fPDFSig | list of PDFs (signal) |
| Int_t | fSpline | Spline order to smooth histograms (if spline is selected for interpolation) |
| Bool_t | fTransformLikelihoodOutput | likelihood output is sigmoid-transformed |

standard constructor
MethodLikelihood options: format and syntax of the option string: "Spline2:0:25:D", where:
SplineI [I=0,1,2,3,5] - which spline is used for smoothing the PDFs
0 - how often the input histograms are smoothed
25 - average number of events per PDF bin
D - use square-root matrix to decorrelate the variable space
construct likelihood references from file
define the options (their key words) that can be set in the option string
known options:
PDFInterpol[ivar] <string> Spline0, Spline1, Spline2 <default>, Spline3, Spline5, KDE - used to interpolate reference histograms;
if no variable index is given, it is valid for ALL the variables
NSmooth <int> how often the input histograms are smoothed
NAvEvtPerBin <int> minimum average number of events per PDF bin
TransformOutput <bool> transform the (often strongly peaked) likelihood output through sigmoid inversion
fKDEtype <KernelType> type of kernel to use (1 is Gaussian)
fKDEiter <KernelIter> number of iterations (1 --> "static KDE", 2 --> "adaptive KDE")
fBorderMethod <KernelBorder> method for handling "border" effects (1 = no treatment, 2 = kernel renormalisation, 3 = sample mirroring)
create reference distributions (PDFs) from signal and background events: fill histograms and smooth them; if decorrelation is required, compute corresponding square-root matrices
returns transformed or non-transformed output
read weight info from file; nothing to do for this method
write specific header of the classifier (mostly include files)
get help message text
typical length of text line:
"|--------------------------------------------------------------|"