library: libTMVA
#include "MethodCuts.h"
private:
    void CreateVariablePDFs()
    void GetEffsfromPDFs(Double_t* cutMin, Double_t* cutMax, Double_t& effS, Double_t& effB)
    void GetEffsfromSelection(Double_t* cutMin, Double_t* cutMax, Double_t& effS, Double_t& effB)
    void InitCuts()
    void MatchCutsToPars(Double_t*, Double_t*, Double_t*)
    void MatchParsToCuts(const vector<Double_t>&, Double_t*, Double_t*)
    void MatchParsToCuts(Double_t*, Double_t*, Double_t*)
    Bool_t SanityChecks()
public:
    MethodCuts(TString jobName, vector<TString>* theVariables, TTree* theTree = 0, TString theOption = "MC:150:10000:", TDirectory* theTargetFile = 0)
    MethodCuts(vector<TString>* theVariables, TString theWeightFile, TDirectory* theTargetDir = NULL)
    MethodCuts(const TMVA::MethodCuts&)
    virtual ~MethodCuts()
    static TClass* Class()
    Double_t ComputeEstimator(const vector<Double_t>&)
    virtual Double_t GetEfficiency(TString, TTree*)
    virtual Double_t GetmuTransform(TTree*)
    virtual Double_t GetMvaValue(TMVA::Event* e)
    virtual Double_t GetSeparation()
    virtual Double_t GetSignificance()
    virtual TClass* IsA() const
    TMVA::MethodCuts& operator=(const TMVA::MethodCuts&)
    virtual void ReadWeightsFromFile()
    void SetTestSignalEfficiency(Double_t eff)
    virtual void ShowMembers(TMemberInspector& insp, char* parent)
    virtual void Streamer(TBuffer& b)
    void StreamerNVirtual(TBuffer& b)
    virtual void TestInitLocal(TTree* testTree)
    static TMVA::MethodCuts* ThisCuts()
    virtual void Train()
    virtual void WriteHistosToFile()
    virtual void WriteWeightsToFile()
private:
    TMVA::MethodCuts::ConstrainType fConstrainType
    TMVA::MethodCuts::FitMethodType fFitMethod    chosen fit method
    TMVA::MethodCuts::EffMethod fEffMethod    chosen efficiency calculation method
    vector<FitParameters>* fFitParams    vector for series of fit methods
    Double_t fTestSignalEff    used to test optimized signal efficiency
    Double_t fEffSMin    used to test optimized signal efficiency
    Double_t fEffSMax    used to test optimized signal efficiency
    TMVA::BinarySearchTree* fBinaryTreeS
    TMVA::BinarySearchTree* fBinaryTreeB
    Int_t fGa_nsteps    GA settings: number of steps
    Int_t fGa_preCalc    GA settings: number of pre-calc steps
    Int_t fGa_SC_steps    GA settings: SC_steps
    Int_t fGa_SC_offsteps    GA settings: SC_offsteps
    Double_t fGa_SC_factor    GA settings: SC_factor
    Double_t fEffRef    reference efficiency
    vector<Int_t>* fRangeSign    used to match cuts to fit parameters (and vice versa)
    TRandom* fTrandom    random generator for MC optimisation method
    Int_t fNpar    number of parameters in fit (default: 2*Nvar)
    vector<Double_t>* fMeanS    means of variables (signal)
    vector<Double_t>* fMeanB    means of variables (background)
    vector<Double_t>* fRmsS    RMSs of variables (signal)
    vector<Double_t>* fRmsB    RMSs of variables (background)
    vector<Double_t>* fXmin    minimum values of variables
    vector<Double_t>* fXmax    maximum values of variables
    TH1* fEffBvsSLocal    intermediate eff. background versus eff signal histo
    vector<TH1*>* fVarHistS    reference histograms (signal)
    vector<TH1*>* fVarHistB    reference histograms (background)
    vector<TH1*>* fVarHistS_smooth    smoothed reference histograms (signal)
    vector<TH1*>* fVarHistB_smooth    smoothed reference histograms (background)
    vector<PDF*>* fVarPdfS    reference PDFs (signal)
    vector<PDF*>* fVarPdfB    reference PDFs (background)
    Int_t fNRandCuts    number of random cut samplings
    Double_t** fCutMin    minimum requirement
    Double_t** fCutMax    maximum requirement
    Double_t* fTmpCutMin    temporary minimum requirement
    Double_t* fTmpCutMax    temporary maximum requirement
    static TMVA::MethodCuts* fgThisCuts    needed for function reference (GA)
public:
    static const TMVA::MethodCuts::ConstrainType kConstrainEffS
    static const TMVA::MethodCuts::ConstrainType kConstrainEffB
    static const TMVA::MethodCuts::FitMethodType kUseMonteCarlo
    static const TMVA::MethodCuts::FitMethodType kUseGeneticAlgorithm
    static const TMVA::MethodCuts::EffMethod kUseEventSelection
    static const TMVA::MethodCuts::EffMethod kUsePDFs
    static const TMVA::MethodCuts::FitParameters kNotEnforced
    static const TMVA::MethodCuts::FitParameters kForceMin
    static const TMVA::MethodCuts::FitParameters kForceMax
    static const TMVA::MethodCuts::FitParameters kForceSmart
    static const TMVA::MethodCuts::FitParameters kForceVerySmart
Multivariate optimisation of signal efficiency for given background efficiency, applying rectangular minimum and maximum requirements.
Other optimisation criteria, such as maximising the squared signal significance, S^2/(S+B), with S and B being the signal and background yields, correspond to a particular point on the optimised background rejection versus signal efficiency curve. Choosing this working point requires knowledge of the expected yields, which are not known in general. Note also that for rare signals Poissonian statistics should be used, which modifies the significance criterion.
The rectangular cut of a volume in the variable space is performed using a binary tree to sort the training events. This reduces the computing time significantly (by up to several orders of magnitude, depending on the complexity of the problem at hand).
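To illustrate why sorting events into a binary tree speeds up a rectangular cut, the following is a minimal sketch of a k-d-style tree with a box-counting query. It is not the TMVA::BinarySearchTree interface: the names `Point`, `Node`, `Build` and `CountInBox` are hypothetical, and the median split that cycles through the variables is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical illustration types; not the TMVA::BinarySearchTree API.
using Point = std::vector<double>;

struct Node {
    Point point;                  // event stored at this node
    std::size_t axis = 0;         // variable used for the split at this depth
    std::unique_ptr<Node> left;   // events with value <= point[axis]
    std::unique_ptr<Node> right;  // events with value >= point[axis]
};

// Build the tree over pts[lo, hi) by splitting at the median,
// cycling through the variables with increasing depth.
std::unique_ptr<Node> Build(std::vector<Point>& pts, std::size_t lo,
                            std::size_t hi, std::size_t depth) {
    if (lo >= hi) return nullptr;
    const std::size_t axis = depth % pts[lo].size();
    const std::size_t mid = lo + (hi - lo) / 2;
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [axis](const Point& a, const Point& b) {
                         return a[axis] < b[axis];
                     });
    auto node = std::make_unique<Node>();
    node->point = pts[mid];
    node->axis = axis;
    node->left = Build(pts, lo, mid, depth + 1);
    node->right = Build(pts, mid + 1, hi, depth + 1);
    return node;
}

// Count events with cutMin[i] <= x[i] <= cutMax[i] for every variable.
// Subtrees that cannot overlap the box are skipped entirely.
std::size_t CountInBox(const Node* node, const Point& cutMin,
                       const Point& cutMax) {
    if (!node) return 0;
    assert(cutMin.size() == cutMax.size());
    bool inside = true;
    for (std::size_t i = 0; i < cutMin.size(); ++i) {
        if (node->point[i] < cutMin[i] || node->point[i] > cutMax[i]) {
            inside = false;
            break;
        }
    }
    std::size_t n = inside ? 1 : 0;
    const double v = node->point[node->axis];
    if (cutMin[node->axis] <= v) n += CountInBox(node->left.get(), cutMin, cutMax);
    if (cutMax[node->axis] >= v) n += CountInBox(node->right.get(), cutMin, cutMax);
    return n;
}
```

Dividing the count returned for the signal tree (or the background tree) by the total number of events of that type gives the corresponding efficiency for the cut.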
Technically, optimisation is achieved in TMVA by two methods: Monte Carlo (MC) sampling and a Genetic Algorithm (GA).
Attempts to use Minuit fits (Simplex or Migrad) instead have not shown superior results, and often failed because they converged to local minima.
The tests we have performed so far showed that in generic applications, the GA is superior to MC sampling, and hence GA is the default method. It is worthwhile to try both anyway.
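As a rough illustration of the MC sampling approach, the sketch below throws random rectangular cuts inside the variable ranges and keeps the one with the lowest background efficiency among those that reach a required signal efficiency. Everything here is an assumption made for the example: the names `Event`, `CutResult`, `Efficiency` and `SampleCuts` are hypothetical, the two cut edges are drawn uniformly, and the selection rule is a simplification rather than the procedure implemented in this class.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Event = std::vector<double>;  // hypothetical: one value per variable

struct CutResult {                  // hypothetical result holder
    std::vector<double> cutMin, cutMax;
    double effS = 0.0;
    double effB = 1.0;
    bool found = false;
};

// Fraction of events passing cutMin[i] <= x[i] <= cutMax[i] for all i.
double Efficiency(const std::vector<Event>& events,
                  const std::vector<double>& cutMin,
                  const std::vector<double>& cutMax) {
    if (events.empty()) return 0.0;
    std::size_t pass = 0;
    for (const Event& e : events) {
        bool ok = true;
        for (std::size_t i = 0; i < cutMin.size(); ++i) {
            if (e[i] < cutMin[i] || e[i] > cutMax[i]) { ok = false; break; }
        }
        if (ok) ++pass;
    }
    return static_cast<double>(pass) / static_cast<double>(events.size());
}

// Assumed sampling scheme: both edges uniform in [xmin, xmax], then ordered.
CutResult SampleCuts(const std::vector<Event>& sig, const std::vector<Event>& bkg,
                     const std::vector<double>& xmin, const std::vector<double>& xmax,
                     double requiredEffS, int nSamples, unsigned seed) {
    assert(xmin.size() == xmax.size());
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    CutResult best;
    std::vector<double> lo(xmin.size()), hi(xmin.size());
    for (int s = 0; s < nSamples; ++s) {
        for (std::size_t i = 0; i < xmin.size(); ++i) {
            double a = xmin[i] + u(rng) * (xmax[i] - xmin[i]);
            double b = xmin[i] + u(rng) * (xmax[i] - xmin[i]);
            if (a > b) std::swap(a, b);
            lo[i] = a;
            hi[i] = b;
        }
        const double effS = Efficiency(sig, lo, hi);
        if (effS < requiredEffS) continue;  // signal requirement comes first
        const double effB = Efficiency(bkg, lo, hi);
        if (!best.found || effB < best.effB) {
            best.cutMin = lo;
            best.cutMax = hi;
            best.effS = effS;
            best.effB = effB;
            best.found = true;
        }
    }
    return best;
}
```

Repeating the search for a series of required signal efficiencies would trace out a background efficiency versus signal efficiency curve.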
standard constructor

format of option string: "OptMethod:EffMethod:Option_var1:...:Option_varn"

"OptMethod" can be:
  - "GA" : Genetic Algorithm (recommended)
  - "MC" : Monte-Carlo optimization

"EffMethod" can be:
  - "EffSel": compute efficiency by event counting
  - "EffPDF": compute efficiency from PDFs

=== For "GA" method ======
"Option_var1++" are (see GA for explanation of parameters):
  - fGa_nsteps
  - fGa_preCalc
  - fGa_SC_steps
  - fGa_SC_offsteps
  - fGa_SC_factor

=== For "MC" method ======
"Option_var1" is number of random samples
"Option_var2++" can be
  - "FMax" : ForceMax (the max cut is fixed to maximum of variable i)
  - "FMin" : ForceMin (the min cut is fixed to minimum of variable i)
  - "FSmart": ForceSmart (the min or max cut is fixed to min/max, based on mean value)
  - Adding "All" to "option_vari", e.g. "AllFSmart", will use this option for all variables
  - if "option_vari" is empty (== ""), no assumptions on cut min/max are made
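The colon-separated layout of the option string can be illustrated with a small parser. This is only a sketch of how such a string splits into its fields, not the parsing code of this class; `CutOptions`, `SplitOnColon` and `ParseOptions` are hypothetical names, and empty fields are kept because an empty per-variable option carries meaning.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical holder for the fields of "OptMethod:EffMethod:Option_var1:...".
struct CutOptions {
    std::string optMethod;                   // "GA" or "MC"
    std::string effMethod;                   // "EffSel" or "EffPDF"
    std::vector<std::string> methodOptions;  // Option_var1 ... Option_varn
};

// Split on ':' and keep empty tokens.
std::vector<std::string> SplitOnColon(const std::string& s) {
    std::vector<std::string> out;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = s.find(':', start);
        if (pos == std::string::npos) {
            out.push_back(s.substr(start));
            break;
        }
        out.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
    return out;
}

CutOptions ParseOptions(const std::string& option) {
    const std::vector<std::string> tok = SplitOnColon(option);
    assert(tok.size() >= 2);  // need at least OptMethod and EffMethod
    CutOptions o;
    o.optMethod = tok[0];
    o.effMethod = tok[1];
    o.methodOptions.assign(tok.begin() + 2, tok.end());
    return o;
}
```

For example, `ParseOptions("MC:EffSel:10000:FSmart::FMax")` yields the method "MC", the efficiency method "EffSel", the sample count "10000", and three per-variable options of which the second is empty.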
construction from weight file
returns estimator for "cut fitness" used by GA
there are two requirements:
1) the signal efficiency must be equal to the required one in the
efficiency scan
2) the background efficiency must be as small as possible
the requirement 1) has priority over 2)
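One way to encode "requirement 1) has priority over 2)" in a single number is a penalty term, sketched below. The functional form, the name `CutFitness`, the weight of 1000 and the convention that a smaller value is better are all assumptions for illustration; the actual estimator returned by ComputeEstimator may differ.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical fitness: smaller is better (assumed convention).
// A large weight on the signal-efficiency mismatch makes requirement 1)
// dominate; the background efficiency only decides between candidates
// whose signal efficiency is (nearly) the required one.
double CutFitness(double effS, double effB, double requiredEffS,
                  double penaltyWeight = 1000.0) {
    assert(effS >= 0.0 && effS <= 1.0);
    assert(effB >= 0.0 && effB <= 1.0);
    const double mismatch = std::fabs(effS - requiredEffS);
    return penaltyWeight * mismatch + effB;
}
```

With this choice, a cut that matches the required signal efficiency always scores better than one that misses it by more than 1/penaltyWeight, whatever their background efficiencies.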
translates parameters into cuts
translates cuts into parameters
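The two translations can be sketched as follows, assuming two fit parameters per variable (consistent with the default fNpar = 2*Nvar): an anchor edge and a width, with a sign per variable choosing whether the anchor is the lower or the upper cut. This parametrisation and the names `ParsToCuts` and `CutsToPars` are assumptions for illustration, not a statement of what MatchParsToCuts and MatchCutsToPars do internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: par[2*i] is the anchor edge, par[2*i+1] the cut width.
// rangeSign[i] > 0: anchor is the lower cut; otherwise it is the upper cut.
void ParsToCuts(const std::vector<double>& par, const std::vector<int>& rangeSign,
                std::vector<double>& cutMin, std::vector<double>& cutMax) {
    const std::size_t nvar = rangeSign.size();
    assert(par.size() == 2 * nvar);
    cutMin.resize(nvar);
    cutMax.resize(nvar);
    for (std::size_t i = 0; i < nvar; ++i) {
        const double anchor = par[2 * i];
        const double width = par[2 * i + 1];
        if (rangeSign[i] > 0) {
            cutMin[i] = anchor;
            cutMax[i] = anchor + width;
        } else {
            cutMin[i] = anchor - width;
            cutMax[i] = anchor;
        }
    }
}

// Inverse mapping under the same assumed layout.
void CutsToPars(const std::vector<double>& cutMin, const std::vector<double>& cutMax,
                const std::vector<int>& rangeSign, std::vector<double>& par) {
    const std::size_t nvar = rangeSign.size();
    assert(cutMin.size() == nvar && cutMax.size() == nvar);
    par.resize(2 * nvar);
    for (std::size_t i = 0; i < nvar; ++i) {
        par[2 * i] = (rangeSign[i] > 0) ? cutMin[i] : cutMax[i];
        par[2 * i + 1] = cutMax[i] - cutMin[i];
    }
}
```

Fitting a width instead of a second edge keeps cutMin <= cutMax automatically as long as the width is non-negative.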
compute signal and background efficiencies from PDFs for given cut sample
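A sketch of the PDF-based calculation, under the assumption that the variables are treated as independent so that the efficiency of a rectangular cut is the product of one-dimensional PDF integrals over each cut interval. The `BinnedPdf` type and the function names are hypothetical, and a normalised histogram stands in for the reference PDFs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical binned PDF: equal-width bins on [xmin, xmax],
// contents normalised so that they sum to 1.
struct BinnedPdf {
    double xmin;
    double xmax;
    std::vector<double> content;
};

// Integral of the PDF over [lo, hi], assuming a flat density inside each bin.
double Integral(const BinnedPdf& pdf, double lo, double hi) {
    assert(!pdf.content.empty() && pdf.xmax > pdf.xmin);
    const double width = (pdf.xmax - pdf.xmin) / static_cast<double>(pdf.content.size());
    double sum = 0.0;
    for (std::size_t b = 0; b < pdf.content.size(); ++b) {
        const double binLo = pdf.xmin + static_cast<double>(b) * width;
        const double binHi = binLo + width;
        const double overlap = std::min(hi, binHi) - std::max(lo, binLo);
        if (overlap > 0.0) sum += pdf.content[b] * overlap / width;
    }
    return sum;
}

// Efficiency of a rectangular cut as the product of per-variable integrals
// (assumes uncorrelated variables).
double EfficiencyFromPdfs(const std::vector<BinnedPdf>& pdfs,
                          const std::vector<double>& cutMin,
                          const std::vector<double>& cutMax) {
    assert(pdfs.size() == cutMin.size() && pdfs.size() == cutMax.size());
    double eff = 1.0;
    for (std::size_t i = 0; i < pdfs.size(); ++i) {
        eff *= Integral(pdfs[i], cutMin[i], cutMax[i]);
    }
    return eff;
}
```

Calling `EfficiencyFromPdfs` once with the signal PDFs and once with the background PDFs gives the two efficiencies for a cut sample. Because correlations are ignored, the result can differ from what event counting gives.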
compute signal and background efficiencies from event counting for given cut sample
create binary trees (global member variables) for signal and background