ROOT Reference Guide
TMVAClassificationCategory.C File Reference

Detailed Description

This macro provides examples for the training and testing of the TMVA classifiers in categorisation mode.

  • Project : TMVA - a ROOT-integrated toolkit for multivariate data analysis
  • Package : TMVA
  • ROOT Macro: TMVAClassificationCategory

The input data is a toy Monte Carlo sample consisting of four Gaussian-distributed, linearly correlated input variables whose properties depend on the category (eta).
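To make "category-dependent properties" concrete, the following is a minimal sketch of one way such a sample could arise. It is not the generator used to produce the tutorial's data file. The names ToyEvent, GenerateToy and Correlation, the uniform eta range of [-2.5, 2.5], and the use of only two variables are all assumptions made for illustration. The variables are drawn independently within each eta category, but their common mean is shifted between categories, which induces a linear correlation in the combined sample.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// One toy event: the category variable and two of the input variables.
struct ToyEvent {
   double eta;
   double var1;
   double var2;
};

// Hypothetical generator (NOT the one behind the tutorial's data file):
// var1 and var2 are independent unit Gaussians around a common mean that is
// shifted by -offset for abs(eta) <= 1.3 and by +offset otherwise.
std::vector<ToyEvent> GenerateToy(std::size_t n, double offset, unsigned seed)
{
   std::mt19937 rng(seed);
   std::uniform_real_distribution<double> etaDist(-2.5, 2.5); // assumed eta range
   std::normal_distribution<double> gauss(0.0, 1.0);

   std::vector<ToyEvent> events;
   events.reserve(n);
   for (std::size_t i = 0; i < n; ++i) {
      const double eta = etaDist(rng);
      const double shift = (std::abs(eta) <= 1.3) ? -offset : +offset;
      const double v1 = shift + gauss(rng);
      const double v2 = shift + gauss(rng);
      events.push_back({eta, v1, v2});
   }
   return events;
}

// Pearson correlation of var1 and var2 over the events selected by keep().
template <typename Pred>
double Correlation(const std::vector<ToyEvent> &events, Pred keep)
{
   double n = 0, s1 = 0, s2 = 0, s11 = 0, s22 = 0, s12 = 0;
   for (const ToyEvent &e : events) {
      if (!keep(e)) continue;
      n += 1;
      s1 += e.var1;
      s2 += e.var2;
      s11 += e.var1 * e.var1;
      s22 += e.var2 * e.var2;
      s12 += e.var1 * e.var2;
   }
   assert(n > 1 && "selection must keep at least two events");
   const double m1 = s1 / n, m2 = s2 / n;
   const double cov = s12 / n - m1 * m2;
   const double sig1 = std::sqrt(s11 / n - m1 * m1);
   const double sig2 = std::sqrt(s22 / n - m2 * m2);
   return cov / (sig1 * sig2);
}
```

Under these assumptions, calling GenerateToy(20000, 0.8, 42) and evaluating Correlation once on all events and once on the events with abs(eta) <= 1.3 should give a clearly positive correlation for the full sample and a value compatible with zero inside the category. This is the same qualitative pattern as in the correlation matrices printed in the log below for the full dataset and for the per-category datasets.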

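For this example, only the Fisher and Likelihood methods are used, each booked once on the full sample and once in categorised form (FisherCat and LikelihoodCat). Run via: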

root -l TMVAClassificationCategory.C

The output file "TMVA.root" can be analysed with dedicated macros (run: root -l <macro.C>). These can be conveniently invoked through a GUI that appears at the end of the run of this macro, provided ROOT is not running in batch mode.

==> Start TMVAClassificationCategory
--- TMVAClassificationCategory: Accessing /home/sftnight/build/workspace/root-makedoc-master/rootspi/rdoc/src/master/tutorials/tmva/data/toy_sigbkg_categ_offset.root
<HEADER> DataSetInfo : [dataset] : Added class "Signal"
: Add Tree TreeS of type Signal with 10000 events
<HEADER> DataSetInfo : [dataset] : Added class "Background"
: Add Tree TreeB of type Background with 10000 events
<HEADER> Factory : Booking method: Fisher
:
<HEADER> Factory : Booking method: Likelihood
:
<HEADER> Factory : Booking method: FisherCat
:
: Adding sub-classifier: Fisher::Category_Fisher_1
<HEADER> DataSetInfo : [Category_Fisher_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Fisher_1_dsi] : Added class "Background"
: Adding sub-classifier: Fisher::Category_Fisher_2
<HEADER> DataSetInfo : [Category_Fisher_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Fisher_2_dsi] : Added class "Background"
<HEADER> Factory : Booking method: LikelihoodCat
:
: Adding sub-classifier: Likelihood::Category_Likelihood_1
<HEADER> DataSetInfo : [Category_Likelihood_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Likelihood_1_dsi] : Added class "Background"
: Adding sub-classifier: Likelihood::Category_Likelihood_2
<HEADER> DataSetInfo : [Category_Likelihood_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Likelihood_2_dsi] : Added class "Background"
<HEADER> Factory : Train all methods
: Building event vectors for type 2 Signal
: Dataset[dataset] : create input formulas for tree TreeS
: Building event vectors for type 2 Background
: Dataset[dataset] : create input formulas for tree TreeB
<HEADER> DataSetFactory : [dataset] : Number of events in input trees
:
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 5000
: Signal -- testing events : 5000
: Signal -- training and testing events: 10000
: Background -- training events : 5000
: Background -- testing events : 5000
: Background -- training and testing events: 10000
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.368 +0.378 +0.391
: var2: +0.368 +1.000 +0.388 +0.386
: var3: +0.378 +0.388 +1.000 +0.389
: var4: +0.391 +0.386 +0.389 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.365 +0.376 +0.381
: var2: +0.365 +1.000 +0.382 +0.387
: var3: +0.376 +0.382 +1.000 +0.376
: var4: +0.381 +0.387 +0.376 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [dataset] :
:
<HEADER> Factory : Train method: Fisher for Classification
:
<HEADER> Fisher : Results for Fisher coefficients:
: -----------------------
: Variable: Coefficient:
: -----------------------
: var1: -0.053
: var2: -0.014
: var3: +0.096
: var4: +0.216
: (offset): -0.023
: -----------------------
: Elapsed time for training with 10000 events: 0.00324 sec
<HEADER> Fisher : [dataset] : Evaluation of Fisher on training sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.000994 sec
: Creating xml weight file: dataset/weights/TMVAClassificationCategory_Fisher.weights.xml
: Creating standalone class: dataset/weights/TMVAClassificationCategory_Fisher.class.C
<HEADER> Factory : Training finished
:
<HEADER> Factory : Train method: Likelihood for Classification
:
: Filling reference histograms
: Building PDF out of reference histograms
: Elapsed time for training with 10000 events: 0.0489 sec
<HEADER> Likelihood : [dataset] : Evaluation of Likelihood on training sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00878 sec
: Creating xml weight file: dataset/weights/TMVAClassificationCategory_Likelihood.weights.xml
: Creating standalone class: dataset/weights/TMVAClassificationCategory_Likelihood.class.C
: TMVA.root:/dataset/Method_Likelihood/Likelihood
<HEADER> Factory : Training finished
:
<HEADER> Factory : Train method: FisherCat for Classification
:
: Train all sub-classifiers for Classification ...
: Building event vectors for type 2 Signal
: Dataset[Category_Fisher_1_dsi] : create input formulas for tree TreeS
: Building event vectors for type 2 Background
: Dataset[Category_Fisher_1_dsi] : create input formulas for tree TreeB
<HEADER> DataSetFactory : [Category_Fisher_1_dsi] : Number of events in input trees
: Dataset[Category_Fisher_1_dsi] : Signal requirement: "abs(eta)<=1.3"
: Dataset[Category_Fisher_1_dsi] : Signal -- number of events passed: 5123 / sum of weights: 5123
: Dataset[Category_Fisher_1_dsi] : Signal -- efficiency : 0.5123
: Dataset[Category_Fisher_1_dsi] : Background requirement: "abs(eta)<=1.3"
: Dataset[Category_Fisher_1_dsi] : Background -- number of events passed: 5134 / sum of weights: 5134
: Dataset[Category_Fisher_1_dsi] : Background -- efficiency : 0.5134
: Dataset[Category_Fisher_1_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Dataset[Category_Fisher_1_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 2561
: Signal -- testing events : 2561
: Signal -- training and testing events: 5122
: Dataset[Category_Fisher_1_dsi] : Signal -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5123
: Background -- training events : 2567
: Background -- testing events : 2567
: Background -- training and testing events: 5134
: Dataset[Category_Fisher_1_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5134
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.017 +0.004 +0.001
: var2: -0.017 +1.000 -0.019 -0.003
: var3: +0.004 -0.019 +1.000 -0.012
: var4: +0.001 -0.003 -0.012 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.019 -0.022 +0.003
: var2: -0.019 +1.000 -0.018 +0.004
: var3: -0.022 -0.018 +1.000 +0.004
: var4: +0.003 +0.004 +0.004 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [Category_Fisher_1_dsi] :
:
: Train method: Category_Fisher_1 for Classification
<HEADER> Category_Fisher_1 : Results for Fisher coefficients:
: -----------------------
: Variable: Coefficient:
: -----------------------
: var1: +0.105
: var2: +0.152
: var3: +0.247
: var4: +0.375
: (offset): +0.648
: -----------------------
: Elapsed time for training with 5128 events: 0.00159 sec
<HEADER> Category_Fisher_1 : [Category_Fisher_1_dsi] : Evaluation of Category_Fisher_1 on training sample (5128 events)
: Elapsed time for evaluation of 5128 events: 0.000578 sec
: Training finished
: Building event vectors for type 2 Signal
: Dataset[Category_Fisher_2_dsi] : create input formulas for tree TreeS
: Building event vectors for type 2 Background
: Dataset[Category_Fisher_2_dsi] : create input formulas for tree TreeB
<HEADER> DataSetFactory : [Category_Fisher_2_dsi] : Number of events in input trees
: Dataset[Category_Fisher_2_dsi] : Signal requirement: "abs(eta)>1.3"
: Dataset[Category_Fisher_2_dsi] : Signal -- number of events passed: 4877 / sum of weights: 4877
: Dataset[Category_Fisher_2_dsi] : Signal -- efficiency : 0.4877
: Dataset[Category_Fisher_2_dsi] : Background requirement: "abs(eta)>1.3"
: Dataset[Category_Fisher_2_dsi] : Background -- number of events passed: 4866 / sum of weights: 4866
: Dataset[Category_Fisher_2_dsi] : Background -- efficiency : 0.4866
: Dataset[Category_Fisher_2_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Dataset[Category_Fisher_2_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 2438
: Signal -- testing events : 2438
: Signal -- training and testing events: 4876
: Dataset[Category_Fisher_2_dsi] : Signal -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4877
: Background -- training events : 2433
: Background -- testing events : 2433
: Background -- training and testing events: 4866
: Dataset[Category_Fisher_2_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4866
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.005 +0.002 -0.039
: var2: -0.005 +1.000 +0.011 -0.004
: var3: +0.002 +0.011 +1.000 -0.021
: var4: -0.039 -0.004 -0.021 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.007 +0.009 +0.008
: var2: -0.007 +1.000 -0.020 +0.013
: var3: +0.009 -0.020 +1.000 +0.007
: var4: +0.008 +0.013 +0.007 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [Category_Fisher_2_dsi] :
:
: Train method: Category_Fisher_2 for Classification
<HEADER> Category_Fisher_2 : Results for Fisher coefficients:
: -----------------------
: Variable: Coefficient:
: -----------------------
: var1: +0.107
: var2: +0.148
: var3: +0.251
: var4: +0.372
: (offset): -0.751
: -----------------------
: Elapsed time for training with 4871 events: 0.00154 sec
<HEADER> Category_Fisher_2 : [Category_Fisher_2_dsi] : Evaluation of Category_Fisher_2 on training sample (4871 events)
: Elapsed time for evaluation of 4871 events: 0.000557 sec
: Training finished
: Begin ranking of input variables...
<HEADER> Category_Fisher_1 : Ranking result (top variable is best ranked)
: -------------------------------
: Rank : Variable : Discr. power
: -------------------------------
: 1 : var4 : 2.205e-01
: 2 : var3 : 1.054e-01
: 3 : var2 : 4.114e-02
: 4 : var1 : 1.987e-02
: -------------------------------
<HEADER> Category_Fisher_2 : Ranking result (top variable is best ranked)
: -------------------------------
: Rank : Variable : Discr. power
: -------------------------------
: 1 : var4 : 2.153e-01
: 2 : var3 : 1.105e-01
: 3 : var2 : 4.289e-02
: 4 : var1 : 1.986e-02
: -------------------------------
: Elapsed time for training with 10000 events: 0.0364 sec
<HEADER> FisherCat : [dataset] : Evaluation of FisherCat on training sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00472 sec
: Creating xml weight file: dataset/weights/TMVAClassificationCategory_FisherCat.weights.xml
<HEADER> Factory : Training finished
:
<HEADER> Factory : Train method: LikelihoodCat for Classification
:
: Train all sub-classifiers for Classification ...
: Building event vectors for type 2 Signal
: Dataset[Category_Likelihood_1_dsi] : create input formulas for tree TreeS
: Building event vectors for type 2 Background
: Dataset[Category_Likelihood_1_dsi] : create input formulas for tree TreeB
<HEADER> DataSetFactory : [Category_Likelihood_1_dsi] : Number of events in input trees
: Dataset[Category_Likelihood_1_dsi] : Signal requirement: "abs(eta)<=1.3"
: Dataset[Category_Likelihood_1_dsi] : Signal -- number of events passed: 5123 / sum of weights: 5123
: Dataset[Category_Likelihood_1_dsi] : Signal -- efficiency : 0.5123
: Dataset[Category_Likelihood_1_dsi] : Background requirement: "abs(eta)<=1.3"
: Dataset[Category_Likelihood_1_dsi] : Background -- number of events passed: 5134 / sum of weights: 5134
: Dataset[Category_Likelihood_1_dsi] : Background -- efficiency : 0.5134
: Dataset[Category_Likelihood_1_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Dataset[Category_Likelihood_1_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 2561
: Signal -- testing events : 2561
: Signal -- training and testing events: 5122
: Dataset[Category_Likelihood_1_dsi] : Signal -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5123
: Background -- training events : 2567
: Background -- testing events : 2567
: Background -- training and testing events: 5134
: Dataset[Category_Likelihood_1_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5134
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.017 +0.004 +0.001
: var2: -0.017 +1.000 -0.019 -0.003
: var3: +0.004 -0.019 +1.000 -0.012
: var4: +0.001 -0.003 -0.012 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.019 -0.022 +0.003
: var2: -0.019 +1.000 -0.018 +0.004
: var3: -0.022 -0.018 +1.000 +0.004
: var4: +0.003 +0.004 +0.004 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [Category_Likelihood_1_dsi] :
:
: Train method: Category_Likelihood_1 for Classification
: Filling reference histograms
: Building PDF out of reference histograms
: Elapsed time for training with 5128 events: 0.0277 sec
<HEADER> Category_Likelihood_1 : [Category_Likelihood_1_dsi] : Evaluation of Category_Likelihood_1 on training sample (5128 events)
: Elapsed time for evaluation of 5128 events: 0.00452 sec
: TMVA.root:/dataset/Method_Category/LikelihoodCat/Method_Likelihood/Category_Likelihood_1
: Training finished
: Building event vectors for type 2 Signal
: Dataset[Category_Likelihood_2_dsi] : create input formulas for tree TreeS
: Building event vectors for type 2 Background
: Dataset[Category_Likelihood_2_dsi] : create input formulas for tree TreeB
<HEADER> DataSetFactory : [Category_Likelihood_2_dsi] : Number of events in input trees
: Dataset[Category_Likelihood_2_dsi] : Signal requirement: "abs(eta)>1.3"
: Dataset[Category_Likelihood_2_dsi] : Signal -- number of events passed: 4877 / sum of weights: 4877
: Dataset[Category_Likelihood_2_dsi] : Signal -- efficiency : 0.4877
: Dataset[Category_Likelihood_2_dsi] : Background requirement: "abs(eta)>1.3"
: Dataset[Category_Likelihood_2_dsi] : Background -- number of events passed: 4866 / sum of weights: 4866
: Dataset[Category_Likelihood_2_dsi] : Background -- efficiency : 0.4866
: Dataset[Category_Likelihood_2_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Dataset[Category_Likelihood_2_dsi] : you have opted for interpreting the requested number of training/testing events
: to be the number of events AFTER your preselection cuts
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 2438
: Signal -- testing events : 2438
: Signal -- training and testing events: 4876
: Dataset[Category_Likelihood_2_dsi] : Signal -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4877
: Background -- training events : 2433
: Background -- testing events : 2433
: Background -- training and testing events: 4866
: Dataset[Category_Likelihood_2_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4866
:
<HEADER> DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.005 +0.002 -0.039
: var2: -0.005 +1.000 +0.011 -0.004
: var3: +0.002 +0.011 +1.000 -0.021
: var4: -0.039 -0.004 -0.021 +1.000
: ----------------------------------------
<HEADER> DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 -0.007 +0.009 +0.008
: var2: -0.007 +1.000 -0.020 +0.013
: var3: +0.009 -0.020 +1.000 +0.007
: var4: +0.008 +0.013 +0.007 +1.000
: ----------------------------------------
<HEADER> DataSetFactory : [Category_Likelihood_2_dsi] :
:
: Train method: Category_Likelihood_2 for Classification
: Filling reference histograms
: Building PDF out of reference histograms
: Elapsed time for training with 4871 events: 0.0263 sec
<HEADER> Category_Likelihood_2 : [Category_Likelihood_2_dsi] : Evaluation of Category_Likelihood_2 on training sample (4871 events)
: Elapsed time for evaluation of 4871 events: 0.00388 sec
: TMVA.root:/dataset/Method_Category/LikelihoodCat/Method_Likelihood/Category_Likelihood_2
: Training finished
: Begin ranking of input variables...
<HEADER> Category_Likelihood_1 : Ranking result (top variable is best ranked)
: -----------------------------------
: Rank : Variable : Delta Separation
: -----------------------------------
: 1 : var4 : 1.031e-01
: 2 : var3 : 1.716e-02
: 3 : var1 : 1.036e-02
: 4 : var2 : 4.428e-03
: -----------------------------------
<HEADER> Category_Likelihood_2 : Ranking result (top variable is best ranked)
: -----------------------------------
: Rank : Variable : Delta Separation
: -----------------------------------
: 1 : var4 : 1.424e-01
: 2 : var3 : 6.035e-02
: 3 : var2 : 1.824e-02
: 4 : var1 : 8.110e-03
: -----------------------------------
: Elapsed time for training with 10000 events: 0.24 sec
<HEADER> LikelihoodCat : [dataset] : Evaluation of LikelihoodCat on training sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.0112 sec
: Creating xml weight file: dataset/weights/TMVAClassificationCategory_LikelihoodCat.weights.xml
<HEADER> Factory : Training finished
:
: Ranking input variables (method specific)...
<HEADER> Fisher : Ranking result (top variable is best ranked)
: -------------------------------
: Rank : Variable : Discr. power
: -------------------------------
: 1 : var4 : 1.446e-01
: 2 : var3 : 7.153e-02
: 3 : var2 : 2.447e-02
: 4 : var1 : 1.243e-02
: -------------------------------
<HEADER> Likelihood : Ranking result (top variable is best ranked)
: -----------------------------------
: Rank : Variable : Delta Separation
: -----------------------------------
: 1 : var4 : 1.162e-01
: 2 : var3 : 5.179e-02
: 3 : var2 : 2.915e-02
: 4 : var1 : 2.168e-02
: -----------------------------------
: No variable ranking supplied by classifier: FisherCat
: No variable ranking supplied by classifier: LikelihoodCat
<HEADER> Factory : === Destroy and recreate all methods via weight files for testing ===
:
: Reading weight file: dataset/weights/TMVAClassificationCategory_Fisher.weights.xml
: Reading weight file: dataset/weights/TMVAClassificationCategory_Likelihood.weights.xml
: Reading weight file: dataset/weights/TMVAClassificationCategory_FisherCat.weights.xml
: Recreating sub-classifiers from XML-file
<HEADER> DataSetInfo : [Category_Fisher_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Fisher_1_dsi] : Added class "Background"
<HEADER> DataSetInfo : [Category_Fisher_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Fisher_2_dsi] : Added class "Background"
: Reading weight file: dataset/weights/TMVAClassificationCategory_LikelihoodCat.weights.xml
: Recreating sub-classifiers from XML-file
<HEADER> DataSetInfo : [Category_Likelihood_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Likelihood_1_dsi] : Added class "Background"
<HEADER> DataSetInfo : [Category_Likelihood_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo : [Category_Likelihood_2_dsi] : Added class "Background"
<HEADER> Factory : Test all methods
<HEADER> Factory : Test method: Fisher for Classification performance
:
<HEADER> Fisher : [dataset] : Evaluation of Fisher on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.0019 sec
<HEADER> Factory : Test method: Likelihood for Classification performance
:
<HEADER> Likelihood : [dataset] : Evaluation of Likelihood on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00779 sec
<HEADER> Factory : Test method: FisherCat for Classification performance
:
<HEADER> FisherCat : [dataset] : Evaluation of FisherCat on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00299 sec
<HEADER> Factory : Test method: LikelihoodCat for Classification performance
:
<HEADER> LikelihoodCat : [dataset] : Evaluation of LikelihoodCat on testing sample (10000 events)
: Elapsed time for evaluation of 10000 events: 0.00946 sec
<HEADER> Factory : Evaluate all methods
<HEADER> Factory : Evaluate classifier: Fisher
:
<HEADER> Fisher : [dataset] : Loop over test events and fill histograms with classifier response...
:
<HEADER> TFHandler_Fisher : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.014081 1.2910 [ -5.3119 4.5609 ]
: var2: -0.014399 1.3299 [ -4.7537 4.6723 ]
: var3: -0.027971 1.3779 [ -5.2892 4.7007 ]
: var4: 0.12966 1.4883 [ -5.1002 4.9767 ]
: -----------------------------------------------------------
<HEADER> Factory : Evaluate classifier: Likelihood
:
<HEADER> Likelihood : [dataset] : Loop over test events and fill histograms with classifier response...
:
<HEADER> TFHandler_Likelihood : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.014081 1.2910 [ -5.3119 4.5609 ]
: var2: -0.014399 1.3299 [ -4.7537 4.6723 ]
: var3: -0.027971 1.3779 [ -5.2892 4.7007 ]
: var4: 0.12966 1.4883 [ -5.1002 4.9767 ]
: -----------------------------------------------------------
<HEADER> Factory : Evaluate classifier: FisherCat
:
<HEADER> FisherCat : [dataset] : Loop over test events and fill histograms with classifier response...
:
<HEADER> TFHandler_FisherCat : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.014081 1.2910 [ -5.3119 4.5609 ]
: var2: -0.014399 1.3299 [ -4.7537 4.6723 ]
: var3: -0.027971 1.3779 [ -5.2892 4.7007 ]
: var4: 0.12966 1.4883 [ -5.1002 4.9767 ]
: -----------------------------------------------------------
<HEADER> Factory : Evaluate classifier: LikelihoodCat
:
<HEADER> LikelihoodCat : [dataset] : Loop over test events and fill histograms with classifier response...
:
<HEADER> TFHandler_LikelihoodCat : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.014081 1.2910 [ -5.3119 4.5609 ]
: var2: -0.014399 1.3299 [ -4.7537 4.6723 ]
: var3: -0.027971 1.3779 [ -5.2892 4.7007 ]
: var4: 0.12966 1.4883 [ -5.1002 4.9767 ]
: -----------------------------------------------------------
:
: Evaluation results ranked by best signal efficiency and purity (area)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA
: Name: Method: ROC-integ
: dataset FisherCat : 0.914
: dataset LikelihoodCat : 0.913
: dataset Fisher : 0.808
: dataset Likelihood : 0.768
: -------------------------------------------------------------------------------------------------------------------
:
: Testing efficiency compared to training efficiency (overtraining check)
: -------------------------------------------------------------------------------------------------------------------
: DataSet MVA Signal efficiency: from test sample (from training sample)
: Name: Method: @B=0.01 @B=0.10 @B=0.30
: -------------------------------------------------------------------------------------------------------------------
: dataset FisherCat : 0.352 (0.360) 0.743 (0.739) 0.919 (0.916)
: dataset LikelihoodCat : 0.350 (0.351) 0.738 (0.736) 0.919 (0.916)
: dataset Fisher : 0.184 (0.185) 0.471 (0.486) 0.746 (0.742)
: dataset Likelihood : 0.211 (0.242) 0.446 (0.453) 0.609 (0.608)
: -------------------------------------------------------------------------------------------------------------------
:
<HEADER> Dataset:dataset : Created tree 'TestTree' with 10000 events
:
<HEADER> Dataset:dataset : Created tree 'TrainTree' with 10000 events
:
<HEADER> Factory : Thank you for using TMVA!
: For citation information, please visit: http://tmva.sf.net/citeTMVA.html
==> Wrote root file: TMVA.root
==> TMVAClassificationCategory is done!
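The log shows that the two Fisher sub-classifiers end up with nearly identical coefficients but very different offsets, which is what lets FisherCat outperform the single Fisher discriminant. The sketch below illustrates how a categorised response can be evaluated by hand. It assumes the Fisher response is the linear form offset + sum of coefficient times variable, and it uses the coefficients as rounded in the log, so it only approximates what the weight files encode. The names FisherWeights, FisherResponse and FisherCatResponse are illustrative and not part of TMVA.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Linear discriminant: offset + sum_i coeff[i] * var[i].
struct FisherWeights {
   double offset;
   std::array<double, 4> coeff; // order: var1, var2, var3, var4
};

// Values copied (rounded to three decimals) from the training log above.
const FisherWeights kCategoryFisher1{+0.648, {+0.105, +0.152, +0.247, +0.375}}; // abs(eta) <= 1.3
const FisherWeights kCategoryFisher2{-0.751, {+0.107, +0.148, +0.251, +0.372}}; // abs(eta) >  1.3

// Evaluate one linear discriminant on the four input variables.
double FisherResponse(const FisherWeights &w, const std::array<double, 4> &vars)
{
   double response = w.offset;
   for (std::size_t i = 0; i < vars.size(); ++i) {
      response += w.coeff[i] * vars[i];
   }
   return response;
}

// Pick the sub-classifier whose category cut the event satisfies, then evaluate it.
double FisherCatResponse(double eta, const std::array<double, 4> &vars)
{
   const FisherWeights &w = (std::abs(eta) <= 1.3) ? kCategoryFisher1 : kCategoryFisher2;
   return FisherResponse(w, vars);
}
```

For an event with all four variables at zero, the response reduces to the offset of whichever category the event falls into, so the same inputs give different responses on either side of abs(eta) = 1.3.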

#include <cstdlib>
#include <iostream>
#include <map>
#include <string>

#include "TChain.h"
#include "TCut.h"
#include "TFile.h"
#include "TTree.h"
#include "TString.h"
#include "TObjString.h"
#include "TSystem.h"
#include "TROOT.h"

#include "TMVA/MethodCategory.h"
#include "TMVA/Factory.h"
#include "TMVA/DataLoader.h"
#include "TMVA/Tools.h"
#include "TMVA/TMVAGui.h"

// two types of category methods are implemented
Bool_t UseOffsetMethod = kTRUE;

void TMVAClassificationCategory()
{
   //---------------------------------------------------------------
   // Example for usage of different event categories with classifiers

   std::cout << std::endl << "==> Start TMVAClassificationCategory" << std::endl;

   // This loads the library
   TMVA::Tools::Instance();

   bool batchMode = false;

   // Create a new root output file.
   TString outfileName( "TMVA.root" );
   TFile* outputFile = TFile::Open( outfileName, "RECREATE" );

   // Create the factory object (see TMVAClassification.C for more information)
   std::string factoryOptions( "!V:!Silent:Transformations=I;D;P;G,D" );
   if (batchMode) factoryOptions += ":!Color:!DrawProgressBar";

   TMVA::Factory *factory = new TMVA::Factory( "TMVAClassificationCategory", outputFile, factoryOptions );

   // Create DataLoader
   TMVA::DataLoader *dataloader = new TMVA::DataLoader("dataset");

   // Define the input variables used for the MVA training
   dataloader->AddVariable( "var1", 'F' );
   dataloader->AddVariable( "var2", 'F' );
   dataloader->AddVariable( "var3", 'F' );
   dataloader->AddVariable( "var4", 'F' );

   // You can add so-called "Spectator variables", which are not used in the MVA training,
   // but will appear in the final "TestTree" produced by TMVA. This TestTree will contain the
   // input variables, the response values of all trained MVAs, and the spectator variables
   dataloader->AddSpectator( "eta" );

   // Load the signal and background event samples from ROOT trees
   TFile *input(0);
   TString fname = gSystem->GetDirName(__FILE__) + "/data/";
   // Note: AccessPathName returns true when the path is NOT accessible
   if (gSystem->AccessPathName( fname + "toy_sigbkg_categ_offset.root")) {
      // if the data directory next to this macro is not found, try the tutorials dir
      fname = gROOT->GetTutorialDir() + "/tmva/data/";
   }
   if (UseOffsetMethod) fname += "toy_sigbkg_categ_offset.root";
   else                 fname += "toy_sigbkg_categ_varoff.root";

   if (!gSystem->AccessPathName( fname )) {
      // the selected data file is accessible, so open it
      std::cout << "--- TMVAClassificationCategory: Accessing " << fname << std::endl;
      input = TFile::Open( fname );
   }

   if (!input) {
      std::cout << "ERROR: could not open data file: " << fname << std::endl;
      exit(1);
   }

   TTree *signalTree = (TTree*)input->Get("TreeS");
   TTree *background = (TTree*)input->Get("TreeB");

   // Global event weights per tree (see below for setting event-wise weights)
   Double_t signalWeight     = 1.0;
   Double_t backgroundWeight = 1.0;

   // You can add an arbitrary number of signal or background trees
   dataloader->AddSignalTree    ( signalTree, signalWeight     );
   dataloader->AddBackgroundTree( background, backgroundWeight );

   // Apply additional cuts on the signal and background samples (can be different)
   TCut mycuts = ""; // for example: TCut mycuts = "abs(var1)<0.5 && abs(var2-0.5)<1";
   TCut mycutb = ""; // for example: TCut mycutb = "abs(var1)<0.5";

   // Tell the dataloader how to use the training and testing events
   dataloader->PrepareTrainingAndTestTree( mycuts, mycutb,
      "nTrain_Signal=0:nTrain_Background=0:SplitMode=Random:NormMode=NumEvents:!V" );

   // Book MVA methods

   // Fisher discriminant
   factory->BookMethod( dataloader, TMVA::Types::kFisher, "Fisher", "!H:!V:Fisher" );

   // Likelihood
   factory->BookMethod( dataloader, TMVA::Types::kLikelihood, "Likelihood",
      "!H:!V:TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50" );

   // Categorised classifier
   TMVA::MethodCategory* mcat = nullptr;

   // The variable sets
   TString theCat1Vars = "var1:var2:var3:var4";
   TString theCat2Vars = (UseOffsetMethod ? "var1:var2:var3:var4" : "var1:var2:var3");

   // Fisher with categories
   TMVA::MethodBase* fiCat = factory->BookMethod( dataloader, TMVA::Types::kCategory, "FisherCat", "" );
   mcat = dynamic_cast<TMVA::MethodCategory*>(fiCat);
   mcat->AddMethod( "abs(eta)<=1.3", theCat1Vars, TMVA::Types::kFisher, "Category_Fisher_1", "!H:!V:Fisher" );
   mcat->AddMethod( "abs(eta)>1.3",  theCat2Vars, TMVA::Types::kFisher, "Category_Fisher_2", "!H:!V:Fisher" );

   // Likelihood with categories
   TMVA::MethodBase* liCat = factory->BookMethod( dataloader, TMVA::Types::kCategory, "LikelihoodCat", "" );
   mcat = dynamic_cast<TMVA::MethodCategory*>(liCat);
   mcat->AddMethod( "abs(eta)<=1.3", theCat1Vars, TMVA::Types::kLikelihood,
      "Category_Likelihood_1", "!H:!V:TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50" );
   mcat->AddMethod( "abs(eta)>1.3",  theCat2Vars, TMVA::Types::kLikelihood,
      "Category_Likelihood_2", "!H:!V:TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50" );

   // Now you can tell the factory to train, test, and evaluate the MVAs

   // Train MVAs using the set of training events
   factory->TrainAllMethods();

   // Evaluate all MVAs using the set of test events
   factory->TestAllMethods();

   // Evaluate and compare performance of all configured MVAs
   factory->EvaluateAllMethods();

   // --------------------------------------------------------------

   // Save the output
   outputFile->Close();

   std::cout << "==> Wrote root file: " << outputFile->GetName() << std::endl;
   std::cout << "==> TMVAClassificationCategory is done!" << std::endl;

   // Clean up
   delete factory;
   delete dataloader;

   // Launch the GUI for the root macros
   if (!gROOT->IsBatch()) TMVA::TMVAGui( outfileName );
}

int main( int argc, char** argv )
{
   TMVAClassificationCategory();
   return 0;
}
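The macro configures the factory, the data split and every method through colon-separated option strings such as "!H:!V:Fisher" or "nTrain_Signal=0:...:!V". The following sketch shows how such a string breaks down into key=value settings, bare flags, and flags switched off with a leading "!". ParseOptions is a hypothetical helper written for illustration; it is not TMVA's own option handling and attaches no meaning to individual options.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Illustrative parser for colon-separated option strings (not TMVA code).
// "key=value" -> parsed[key] = value
// "!Flag"     -> parsed[Flag] = "false"
// "Flag"      -> parsed[Flag] = "true"
std::map<std::string, std::string> ParseOptions(const std::string &options)
{
   std::map<std::string, std::string> parsed;
   std::istringstream stream(options);
   std::string token;
   while (std::getline(stream, token, ':')) {
      if (token.empty()) continue; // tolerate "::" or a trailing ':'
      const std::size_t eq = token.find('=');
      if (eq != std::string::npos) {
         parsed[token.substr(0, eq)] = token.substr(eq + 1);
      } else if (token[0] == '!') {
         parsed[token.substr(1)] = "false";
      } else {
         parsed[token] = "true";
      }
   }
   return parsed;
}
```

Applied to the split options used above, this yields SplitMode mapped to "Random" and V mapped to "false". Because only ':' separates tokens, a value such as "I;D;P;G,D" in the factory's Transformations option stays in one piece, and an indexed key such as NSmoothSig[0] is kept verbatim.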
Author
Andreas Hoecker

Definition in file TMVAClassificationCategory.C.
