The input data is a toy-MC sample. In the run logged below it provides two input variables (var1, var2) and one regression target (fvalue).
The methods to be used can be switched on and off by means of booleans in the macro, or selected on the command line, for example:
root -l TMVARegression.C\(\"LD,KNN\"\)
(note that the backslashes are mandatory). If no method is given, a default set is used.
The output file "TMVAReg.root" can be analysed with dedicated macros (run: root -l <macro.C>), which can be conveniently invoked through a GUI that appears at the end of this macro's run.
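The method switches described above can be pictured as a map from method name to boolean that a comma-separated list overrides. The following is a minimal sketch in plain C++, not the macro's actual code: the helper names selectMethods, defaultMethods and demoSelectMethods are hypothetical, and the default on/off values are illustrative (only the method names are taken from the log below).

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Default switches. The names are the five methods booked in the log below;
// the on/off values are illustrative assumptions.
std::map<std::string, bool> defaultMethods()
{
    return {{"PDEFoam", true}, {"KNN", true}, {"LD", true},
            {"DNN_CPU", true}, {"BDTG", true}};
}

// Hypothetical helper: if a comma-separated list is given, enable only the
// listed methods. Returns false if the list names a method without a switch.
bool selectMethods(const std::string& methodList, std::map<std::string, bool>& use)
{
    if (methodList.empty()) return true;            // no list: keep the defaults

    for (auto& entry : use) entry.second = false;   // a list overrides all defaults

    std::istringstream stream(methodList);
    std::string name;
    while (std::getline(stream, name, ',')) {
        auto it = use.find(name);
        if (it == use.end()) return false;          // unknown method name
        it->second = true;
    }
    return true;
}

// Usage check: "LD,KNN" enables exactly those two methods.
void demoSelectMethods()
{
    auto use = defaultMethods();
    const bool ok = selectMethods("LD,KNN", use);
    assert(ok && use["LD"] && use["KNN"] && !use["BDTG"]);
    (void)ok;  // silence the unused-variable warning when NDEBUG is defined
}
```

Rejecting unknown names rather than ignoring them is a design choice of this sketch; it makes a typo in the method list visible immediately.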
Processing /mnt/build/workspace/root-makedoc-v614/rootspi/rdoc/src/v6-14-00-patches/tutorials/tmva/TMVARegression.C...
==> Start TMVARegression
--- TMVARegression : Using input file: ./files/tmva_reg_example.root
DataSetInfo : [dataset] : Added class "Regression"
: Add Tree TreeR of type Regression with 10000 events
: Dataset[dataset] : Class index : 0 name : Regression
Factory : Booking method: PDEFoam
:
DataSetFactory : [dataset] : Number of events in input trees
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Regression -- training events : 1000
: Regression -- testing events : 9000
: Regression -- training and testing events: 10000
:
DataSetInfo : Correlation matrix (Regression):
: ------------------------
: var1 var2
: var1: +1.000 -0.018
: var2: -0.018 +1.000
: ------------------------
DataSetFactory : [dataset] :
:
Factory : Booking method: KNN
:
Factory : Booking method: LD
:
Factory : Booking method: DNN_CPU
:
: Parsing option string:
: ... "!H:V:ErrorStrategy=SUMOFSQUARES:VarTransform=G:WeightInitialization=XAVIERUNIFORM:Architecture=CPU:Layout=TANH|50,Layout=TANH|50,Layout=TANH|50,LINEAR:TrainingStrategy=LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE"
: The following options are set:
: - By User:
: <none>
: - Default:
: Boost_num: "0" [Number of times the classifier will be boosted]
: Parsing option string:
: ... "!H:V:ErrorStrategy=SUMOFSQUARES:VarTransform=G:WeightInitialization=XAVIERUNIFORM:Architecture=CPU:Layout=TANH|50,Layout=TANH|50,Layout=TANH|50,LINEAR:TrainingStrategy=LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE"
: The following options are set:
: - By User:
: V: "True" [Verbose output (short form of "VerbosityLevel" below - overrides the latter one)]
: VarTransform: "G" [List of variable transformations performed before training, e.g., "D_Background,P_Signal,G,N_AllClasses" for: "Decorrelation, PCA-transformation, Gaussianisation, Normalisation, each for the given class of events ('AllClasses' denotes all events of all classes, if no class indication is given, 'All' is assumed)"]
: H: "False" [Print method-specific help message]
: Layout: "TANH|50,Layout=TANH|50,Layout=TANH|50,LINEAR" [Layout of the network.]
: ErrorStrategy: "SUMOFSQUARES" [Loss function: Mean squared error (regression) or cross entropy (binary classification).]
: WeightInitialization: "XAVIERUNIFORM" [Weight initialization strategy]
: Architecture: "CPU" [Which architecture to perform the training on.]
: TrainingStrategy: "LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE" [Defines the training strategies.]
: - Default:
: VerbosityLevel: "Default" [Verbosity level]
: CreateMVAPdfs: "False" [Create PDFs for classifier outputs (signal and background)]
: IgnoreNegWeightsInTraining: "False" [Events with negative weights are ignored in the training (but are included for testing and performance evaluation)]
: ValidationSize: "20%" [Part of the training data to use for validation. Specify as 0.2 or 20% to use a fifth of the data set as validation set. Specify as 100 to use exactly 100 events. (Default: 20%)]
DNN_CPU : [dataset] : Create Transformation "G" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Preparing the Gaussian transformation...
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.013183 1.0272 [ -3.3668 5.7307 ]
: var2: 0.0071633 1.0351 [ -4.2630 5.7307 ]
: fvalue: 164.96 82.203 [ 1.7144 391.23 ]
: -----------------------------------------------------------
Parsed Training DNN string LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=10,WeightDecay=0.01,Regularization=NONE,DropConfig=0.2+0.2+0.2+0.,DropRepetitions=2|LearningRate=1e-3,Momentum=0.9,Repetitions=1,ConvergenceSteps=20,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=L2,DropConfig=0.1+0.1+0.1,DropRepetitions=1|LearningRate=1e-4,Momentum=0.3,Repetitions=1,ConvergenceSteps=10,BatchSize=50,TestRepetitions=5,WeightDecay=0.01,Regularization=NONE
STring has size 3
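The TrainingStrategy option shown above is a list of training phases separated by '|', each phase being a comma-separated list of Key=Value settings; the log reports three phases. The sketch below splits such a string in plain C++. It is an illustration of the string format only, not TMVA's parser, and the names split, Settings and parseTrainingStrategy are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using Settings = std::map<std::string, std::string>;

// Split `text` at every occurrence of `delimiter`.
std::vector<std::string> split(const std::string& text, char delimiter)
{
    std::vector<std::string> parts;
    std::istringstream stream(text);
    std::string part;
    while (std::getline(stream, part, delimiter)) parts.push_back(part);
    return parts;
}

// Parse "Key=Value,Key=Value|Key=Value,..." into one Settings map per phase.
std::vector<Settings> parseTrainingStrategy(const std::string& strategy)
{
    std::vector<Settings> phases;
    for (const std::string& phaseText : split(strategy, '|')) {
        Settings settings;
        for (const std::string& item : split(phaseText, ',')) {
            const auto eq = item.find('=');
            assert(eq != std::string::npos && "each item must be Key=Value");
            settings[item.substr(0, eq)] = item.substr(eq + 1);
        }
        phases.push_back(settings);
    }
    return phases;
}
```

Applied to the string in the log, this yields three maps; for example, the third phase maps ConvergenceSteps to "10" and LearningRate to "1e-4". Values are kept as strings, so a caller would still have to convert them to numbers.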
Factory : Booking method: BDTG
:
<WARNING> : Value for option maxdepth was previously set to 3
: the option *InverseBoostNegWeights* does not exist for BoostType=Grad --> change
: to new default for GradBoost *Pray*
Factory : Train all methods
Factory : [dataset] : Create Transformation "I" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.4138 1.1963 [ 0.0026062 4.9957 ]
: var2: 2.4356 1.4134 [ 0.0092062 4.9990 ]
: fvalue: 164.96 82.203 [ 1.7144 391.23 ]
: -----------------------------------------------------------
: Ranking input variables (method unspecific)...
IdTransformation : Ranking result (top variable is best ranked)
: --------------------------------------------
: Rank : Variable : |Correlation with target|
: --------------------------------------------
: 1 : var2 : 7.419e-01
: 2 : var1 : 5.996e-01
: --------------------------------------------
IdTransformation : Ranking result (top variable is best ranked)
: -------------------------------------
: Rank : Variable : Mutual information
: -------------------------------------
: 1 : var2 : 2.029e+00
: 2 : var1 : 1.950e+00
: -------------------------------------
IdTransformation : Ranking result (top variable is best ranked)
: ------------------------------------
: Rank : Variable : Correlation Ratio
: ------------------------------------
: 1 : var1 : 6.538e+00
: 2 : var2 : 2.460e+00
: ------------------------------------
IdTransformation : Ranking result (top variable is best ranked)
: ----------------------------------------
: Rank : Variable : Correlation Ratio (T)
: ----------------------------------------
: 1 : var2 : 9.156e-01
: 2 : var1 : 2.981e-01
: ----------------------------------------
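The first ranking table above orders the input variables by |Correlation with target|. As a rough illustration, and assuming this quantity is the absolute Pearson correlation coefficient (the log does not state the formula), the ranking could be computed as sketched below. The names pearson, NamedColumn, RankEntry and rankByCorrelation are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Pearson correlation coefficient of two equally long samples.
double pearson(const std::vector<double>& x, const std::vector<double>& y)
{
    assert(x.size() == y.size() && x.size() > 1);
    const double n = static_cast<double>(x.size());
    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) { meanX += x[i]; meanY += y[i]; }
    meanX /= n;
    meanY /= n;

    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - meanX;
        const double dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    assert(sxx > 0.0 && syy > 0.0);  // a constant sample has no defined correlation
    return sxy / std::sqrt(sxx * syy);
}

using NamedColumn = std::pair<std::string, std::vector<double>>;
using RankEntry = std::pair<std::string, double>;

// Rank variables by |correlation with target|, best ranked first.
std::vector<RankEntry> rankByCorrelation(const std::vector<NamedColumn>& variables,
                                         const std::vector<double>& target)
{
    std::vector<RankEntry> ranking;
    for (const NamedColumn& variable : variables)
        ranking.emplace_back(variable.first,
                             std::fabs(pearson(variable.second, target)));
    std::sort(ranking.begin(), ranking.end(),
              [](const RankEntry& a, const RankEntry& b) { return a.second > b.second; });
    return ranking;
}
```

In the logged run this ordering puts var2 (7.419e-01) ahead of var1 (5.996e-01).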
Factory : Train method: PDEFoam for Regression
:
: Build mono target regression foam
: Elapsed time: 0.646 sec
: Elapsed time for training with 1000 events: 0.653 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of PDEFoam on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.0071 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_PDEFoam.weights.xml
: writing foam MonoTargetRegressionFoam to file
: Foams written to file: dataset/weights/TMVARegression_PDEFoam.weights_foams.root
Factory : Training finished
:
Factory : Train method: KNN for Regression
:
KNN : <Train> start...
: Reading 1000 events
: Number of signal events 1000
: Number of background events 0
: Creating kd-tree with 1000 events
: Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)
ModulekNN : Optimizing tree for 2 variables with 1000 values
: <Fill> Class 1 has 1000 events
: Elapsed time for training with 1000 events: 0.00153 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of KNN on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.0135 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_KNN.weights.xml
Factory : Training finished
:
Factory : Train method: LD for Regression
:
LD : Results for LD coefficients:
: -----------------------
: Variable: Coefficient:
: -----------------------
: var1: +42.104
: var2: +44.607
: (offset): -87.420
: -----------------------
: Elapsed time for training with 1000 events: 0.000373 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of LD on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.00231 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_LD.weights.xml
Factory : Training finished
:
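The LD coefficients printed above define a linear estimate of the target, fvalue ≈ 42.104·var1 + 44.607·var2 − 87.420. The sketch below evaluates such a model in plain C++; it only illustrates the arithmetic and does not use the TMVA Reader. The names LinearModel, predict and loggedLdModel are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A linear model: offset plus one coefficient per input variable.
struct LinearModel {
    std::vector<double> coefficients;
    double offset;
};

// Evaluate offset + sum_i coefficient_i * input_i.
double predict(const LinearModel& model, const std::vector<double>& inputs)
{
    assert(inputs.size() == model.coefficients.size());
    double value = model.offset;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        value += model.coefficients[i] * inputs[i];
    return value;
}

// Coefficients copied from the LD table in the log above (var1, var2, offset).
LinearModel loggedLdModel()
{
    return LinearModel{{42.104, 44.607}, -87.420};
}
```

Evaluating this model at the training-sample means of the inputs (var1 = 3.4138, var2 = 2.4356) gives about 164.96, which agrees with the logged mean of fvalue, as expected for a least-squares linear fit.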
Factory : Train method: DNN_CPU for Regression
:
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.013183 1.0272 [ -3.3668 5.7307 ]
: var2: 0.0071633 1.0351 [ -4.2630 5.7307 ]
: fvalue: 164.96 82.203 [ 1.7144 391.23 ]
: -----------------------------------------------------------
: Start of neural network training on CPU.
:
: Training phase 1 of 3:
: Epoch | Train Err. Test Err. GFLOP/s Conv. Steps
: --------------------------------------------------------------
: 10 | 1223.16 1186.64 1.0245 0
: 20 | 1913.8 1799.36 1.1141 10
: 30 | 2989.46 3125.5 1.19686 20
:
: Training phase 2 of 3:
: Epoch | Train Err. Test Err. GFLOP/s Conv. Steps
: --------------------------------------------------------------
: 5 | 7766.39 1639.33 1.20215 0
: 10 | 6828.2 1138.18 1.17525 0
: 15 | 6750.87 1258.19 1.13374 5
: 20 | 6638.98 1359.33 1.13497 10
: 25 | 6426.33 1084 1.12594 0
: 30 | 6358.57 1128.37 1.1136 5
: 35 | 6601.17 1413.66 1.08626 10
: 40 | 6383.97 1606.09 1.12923 15
: 45 | 6674.44 1582.05 1.12947 20
:
: Training phase 3 of 3:
: Epoch | Train Err. Test Err. GFLOP/s Conv. Steps
: --------------------------------------------------------------
: 5 | 1567.14 1408.25 1.57317 0
: 10 | 1524.59 1378.28 1.61924 0
: 15 | 1467.95 1334.19 1.59934 0
: 20 | 1441.84 1322.22 1.63546 0
: 25 | 1428.85 1311.5 1.6441 0
: 30 | 1416.92 1301.79 1.66205 0
: 35 | 1406.14 1293.01 1.64024 0
: 40 | 1396 1283.43 1.61055 0
: 45 | 1386.62 1276 1.64586 0
: 50 | 1377.8 1269.35 1.66474 0
: 55 | 1369.42 1263.2 1.68068 0
: 60 | 1361.67 1258.02 1.65397 0
: 65 | 1353.56 1253.52 1.64161 0
: 70 | 1346.7 1248.81 1.64646 0
: 75 | 1340.95 1245 1.6497 0
: 80 | 1335.81 1241.54 1.55391 0
: 85 | 1330.99 1237.98 1.50878 0
: 90 | 1326.55 1235.22 1.59568 0
: 95 | 1322.52 1232.47 1.58413 0
: 100 | 1318.68 1230.09 1.58085 0
: 105 | 1315.12 1227.62 1.60077 0
: 110 | 1311.76 1225.81 1.61769 0
: 115 | 1308.71 1223.71 1.63288 0
: 120 | 1305.77 1222.09 1.66172 0
: 125 | 1302.99 1220.67 1.63651 0
: 130 | 1300.39 1219.44 1.68469 0
: 135 | 1297.78 1217.81 1.63767 0
: 140 | 1295.39 1216.76 1.64437 5
: 145 | 1293.29 1215.88 1.57922 0
: 150 | 1291.21 1214.68 1.63066 5
: 155 | 1289.33 1213.91 1.67057 0
: 160 | 1287.47 1212.89 1.65315 5
: 165 | 1285.79 1212.25 1.71419 0
: 170 | 1284.07 1211.74 1.65769 5
: 175 | 1282.55 1211 1.62855 0
: 180 | 1280.92 1210.58 1.6441 5
: 185 | 1279.37 1210.09 1.66901 10
:
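The "Conv. Steps" column in the tables above behaves like an early-stopping counter: it resets to 0 when the test error improves noticeably, otherwise it grows by the test interval, and a phase ends when it reaches that phase's ConvergenceSteps (20, 20 and 10 here). The sketch below reproduces this behaviour in plain C++. The required relative improvement of 0.1% (factor 0.999) is an assumption chosen because it reproduces the logged counter values; it is not taken from the TMVA source. The names ConvergenceMonitor and stoppingEpoch are hypothetical.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Assumed threshold: an error counts as an improvement only if it is below
// 0.999 times the best error seen so far.
constexpr double kRequiredImprovement = 0.999;

// Tracks the convergence counter across successive test-error evaluations.
class ConvergenceMonitor {
public:
    ConvergenceMonitor(int convergenceSteps, int testInterval)
        : convergenceSteps_(convergenceSteps), testInterval_(testInterval)
    {
        assert(convergenceSteps > 0 && testInterval > 0);
    }

    // Feed one test error; returns true once training should stop.
    bool update(double testError)
    {
        if (testError < kRequiredImprovement * minimumError_) {
            minimumError_ = testError;  // the best error is updated only on a reset
            count_ = 0;
        } else {
            count_ += testInterval_;
        }
        return count_ >= convergenceSteps_;
    }

    int count() const { return count_; }

private:
    int convergenceSteps_;
    int testInterval_;
    int count_ = 0;
    double minimumError_ = std::numeric_limits<double>::max();
};

// Returns the epoch at which training stops, or -1 if the errors run out first.
// One test error is expected per `testInterval` epochs.
int stoppingEpoch(const std::vector<double>& testErrors,
                  int convergenceSteps, int testInterval)
{
    ConvergenceMonitor monitor(convergenceSteps, testInterval);
    int epoch = 0;
    for (double error : testErrors) {
        epoch += testInterval;
        if (monitor.update(error)) return epoch;
    }
    return -1;
}
```

Fed with the phase 2 test errors from the log (ConvergenceSteps=20, one test every 5 epochs), stoppingEpoch returns 45, the last epoch printed for that phase.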
: Elapsed time for training with 1000 events: 4.85 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of DNN_CPU on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.0267 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_DNN_CPU.weights.xml
Factory : Training finished
:
Factory : Train method: BDTG for Regression
:
: Regression Loss Function: Huber
: Training 2000 Decision Trees ... patience please
: Elapsed time for training with 1000 events: 2.15 sec
: Dataset[dataset] : Create results for training
: Dataset[dataset] : Evaluation of BDTG on training sample
: Dataset[dataset] : Elapsed time for evaluation of 1000 events: 0.377 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
: Creating xml weight file: dataset/weights/TMVARegression_BDTG.weights.xml
Factory : Training finished
:
Factory : === Destroy and recreate all methods via weight files for testing ===
:
: Reading weight file: dataset/weights/TMVARegression_PDEFoam.weights.xml
: Read foams from file: dataset/weights/TMVARegression_PDEFoam.weights_foams.root
: Reading weight file: dataset/weights/TMVARegression_KNN.weights.xml
: Creating kd-tree with 1000 events
: Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)
ModulekNN : Optimizing tree for 2 variables with 1000 values
: <Fill> Class 1 has 1000 events
: Reading weight file: dataset/weights/TMVARegression_LD.weights.xml
: Reading weight file: dataset/weights/TMVARegression_DNN_CPU.weights.xml
: Reading weight file: dataset/weights/TMVARegression_BDTG.weights.xml
Factory : Test all methods
Factory : Test method: PDEFoam for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of PDEFoam on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.0535 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: KNN for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of KNN on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.113 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: LD for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of LD on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.00821 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: DNN_CPU for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of DNN_CPU on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 0.25 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Test method: BDTG for Regression performance
:
: Dataset[dataset] : Create results for testing
: Dataset[dataset] : Evaluation of BDTG on testing sample
: Dataset[dataset] : Elapsed time for evaluation of 9000 events: 1.82 sec
: Create variable histograms
: Create regression target histograms
: Create regression average deviation
: Results created
Factory : Evaluate all methods
: Evaluate regression method: PDEFoam
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.047 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.00592 sec
TFHandler_PDEFoam : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3309 1.1858 [ 0.00020069 5.0000 ]
: var2: 2.4914 1.4393 [ 0.00071490 5.0000 ]
: fvalue: 164.02 83.932 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: KNN
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.107 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.0115 sec
TFHandler_KNN : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3309 1.1858 [ 0.00020069 5.0000 ]
: var2: 2.4914 1.4393 [ 0.00071490 5.0000 ]
: fvalue: 164.02 83.932 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: LD
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.00771 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.00122 sec
TFHandler_LD : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3309 1.1858 [ 0.00020069 5.0000 ]
: var2: 2.4914 1.4393 [ 0.00071490 5.0000 ]
: fvalue: 164.02 83.932 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: DNN_CPU
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 0.231 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.0287 sec
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.062720 1.0031 [ -3.3827 5.7307 ]
: var2: 0.031261 1.0685 [ -5.7307 5.7307 ]
: fvalue: 164.02 83.932 [ 1.6186 394.84 ]
: -----------------------------------------------------------
TFHandler_DNN_CPU : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: -0.062720 1.0031 [ -3.3827 5.7307 ]
: var2: 0.031261 1.0685 [ -5.7307 5.7307 ]
: fvalue: 164.02 83.932 [ 1.6186 394.84 ]
: -----------------------------------------------------------
: Evaluate regression method: BDTG
: TestRegression (testing)
: Calculate regression for all events
: Elapsed time for evaluation of 9000 events: 1.96 sec
: TestRegression (training)
: Calculate regression for all events
: Elapsed time for evaluation of 1000 events: 0.197 sec
TFHandler_BDTG : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 3.3309 1.1858 [ 0.00020069 5.0000 ]
: var2: 2.4914 1.4393 [ 0.00071490 5.0000 ]
: fvalue: 164.02 83.932 [ 1.6186 394.84 ]
: -----------------------------------------------------------
:
: Evaluation results ranked by smallest RMS on test sample:
: ("Bias" quotes the mean deviation of the regression from true target.
: "MutInf" is the "Mutual Information" between regression and target.
: Indicated by "_T" are the corresponding "truncated" quantities ob-
: tained when removing events deviating more than 2sigma from average.)
: --------------------------------------------------------------------------------------------------
: DataSet Name: MVA Method: <Bias> <Bias_T> RMS RMS_T | MutInf MutInf_T
: --------------------------------------------------------------------------------------------------
: dataset KNN : -0.507 0.436 5.77 3.79 | 2.871 2.903
: dataset PDEFoam : -0.831 -0.645 9.90 8.12 | 2.245 2.327
: dataset LD : -0.0644 1.63 19.7 17.9 | 1.988 1.981
: dataset DNN_CPU : -1.31 3.92 37.7 30.0 | 1.153 1.166
: dataset BDTG : 0.531 -5.57 82.5 73.9 | 2.307 2.239
: --------------------------------------------------------------------------------------------------
:
: Evaluation results ranked by smallest RMS on training sample:
: (overtraining check)
: --------------------------------------------------------------------------------------------------
: DataSet Name: MVA Method: <Bias> <Bias_T> RMS RMS_T | MutInf MutInf_T
: --------------------------------------------------------------------------------------------------
: dataset KNN : -0.523 0.298 5.55 3.82 | 2.931 2.946
: dataset PDEFoam : 7.41e-07 0.243 7.99 6.37 | 2.489 2.565
: dataset LD : 3.68e-06 1.76 18.9 16.9 | 2.101 2.099
: dataset DNN_CPU : 0.0254 4.80 35.6 28.6 | 1.235 1.247
: dataset BDTG : 0.354 -4.75 80.9 72.3 | 2.369 2.287
: --------------------------------------------------------------------------------------------------
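The Bias, RMS and "_T" columns in the tables above summarise the deviation between regression output and true target. The sketch below shows one way to compute such quantities in plain C++. It rests on assumptions the log does not spell out: the deviation is taken as prediction minus truth, RMS is the root mean square of the deviation (not mean-subtracted), and the truncated quantities keep only events within 2·RMS of the bias. The names DeviationSummary, computeDeviations, summarize and summarizeTruncated are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Bias (mean deviation), RMS of the deviation, and number of events used.
struct DeviationSummary {
    double bias;
    double rms;
    std::size_t count;
};

// Per-event deviation: prediction minus true target.
std::vector<double> computeDeviations(const std::vector<double>& predicted,
                                      const std::vector<double>& truth)
{
    assert(predicted.size() == truth.size());
    std::vector<double> result(predicted.size());
    for (std::size_t i = 0; i < predicted.size(); ++i)
        result[i] = predicted[i] - truth[i];
    return result;
}

// Bias and RMS over all given deviations.
DeviationSummary summarize(const std::vector<double>& deviations)
{
    assert(!deviations.empty());
    double sum = 0.0, sumSquares = 0.0;
    for (double d : deviations) {
        sum += d;
        sumSquares += d * d;
    }
    const double n = static_cast<double>(deviations.size());
    return DeviationSummary{sum / n, std::sqrt(sumSquares / n), deviations.size()};
}

// Same quantities after removing events more than 2*RMS away from the bias.
DeviationSummary summarizeTruncated(const std::vector<double>& deviations)
{
    const DeviationSummary full = summarize(deviations);
    std::vector<double> kept;
    for (double d : deviations)
        if (std::fabs(d - full.bias) <= 2.0 * full.rms) kept.push_back(d);
    return summarize(kept);
}
```

A single large outlier inflates RMS far more than RMS_T, which is why the truncated columns in the tables are consistently smaller than the untruncated ones.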
:
Dataset:dataset : Created tree 'TestTree' with 9000 events
:
Dataset:dataset : Created tree 'TrainTree' with 1000 events
:
Factory : Thank you for using TMVA!
: For citation information, please visit: http://tmva.sf.net/citeTMVA.html
==> Wrote root file: TMVAReg.root
==> TMVARegression is done!