MethodDT.cxx
1// @(#)root/tmva $Id$
2// Author: Andreas Hoecker, Joerg Stelzer, Helge Voss, Kai Voss
3
4/**********************************************************************************
5 * Project: TMVA - a Root-integrated toolkit for multivariate data analysis *
6 * Package: TMVA *
7 * Class : MethodDT (DT = Decision Trees) *
8 * Web : http://tmva.sourceforge.net *
9 * *
10 * Description: *
11 * Analysis of Boosted Decision Trees *
12 * *
13 * Authors (alphabetical): *
14 * Andreas Hoecker <Andreas.Hocker@cern.ch> - CERN, Switzerland *
15 * Helge Voss <Helge.Voss@cern.ch> - MPI-K Heidelberg, Germany *
16 * Or Cohen <orcohenor@gmail.com> - Weizmann Inst., Israel *
17 * *
18 * Copyright (c) 2005: *
19 * CERN, Switzerland *
20 * MPI-K Heidelberg, Germany *
21 * *
22 * Redistribution and use in source and binary forms, with or without *
23 * modification, are permitted according to the terms listed in LICENSE *
24 * (http://tmva.sourceforge.net/LICENSE) *
25 **********************************************************************************/
26
27/*! \class TMVA::MethodDT
28\ingroup TMVA
29
30Analysis of Boosted Decision Trees
31
32Boosted decision trees have been successfully used in High Energy
33Physics analysis for example by the MiniBooNE experiment
34(Yang-Roe-Zhu, physics/0508045). In Boosted Decision Trees, the
35selection is done on a majority vote on the result of several decision
36trees, which are all derived from the same training sample by
37supplying different event weights during the training.
38
39### Decision trees:
40
41successive decision nodes are used to categorize the
42events out of the sample as either signal or background. Each node
43uses only a single discriminating variable to decide if the event is
44signal-like ("goes right") or background-like ("goes left"). This
45forms a tree-like structure with "baskets" at the end (leaf nodes),
46and an event is classified as either signal or background according to
47whether the basket where it ends up has been classified signal or
48background during the training. Training of a decision tree is the
49process of defining the "cut criteria" for each node. The training
50starts with the root node. Here one takes the full training event
51sample and selects the variable and corresponding cut value that gives
52the best separation between signal and background at this stage. Using
53this cut criterion, the sample is then divided into two subsamples, a
54signal-like (right) and a background-like (left) sample. Two new nodes
55are then created for each of the two sub-samples and they are
56constructed using the same mechanism as described for the root
57node. The division is stopped once a node has reached either a
58minimum number of events or a minimum or maximum signal purity. These
59leaf nodes are then called "signal" or "background" depending on
60whether they contain more signal or more background training events.
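
The core of the node training is the search for the variable and cut value with the best
separation. A self-contained sketch of such a search for one variable (illustrative code
with hypothetical names, not TMVA's DecisionTree implementation):

    #include <algorithm>
    #include <vector>

    struct Ev { double x; bool isSignal; double w; };   // hypothetical event record

    double gini(double s, double b) {                    // node impurity p*(1-p)
       double n = s + b;
       return (n > 0) ? (s / n) * (b / n) : 0.0;
    }

    double bestCut(std::vector<Ev> evts) {
       std::sort(evts.begin(), evts.end(), [](const Ev& a, const Ev& b) { return a.x < b.x; });
       double sTot = 0, bTot = 0;
       for (const auto& e : evts) (e.isSignal ? sTot : bTot) += e.w;
       double sL = 0, bL = 0, bestGain = -1, cut = 0;
       for (size_t i = 0; i + 1 < evts.size(); ++i) {
          (evts[i].isSignal ? sL : bL) += evts[i].w;
          double sR = sTot - sL, bR = bTot - bL;
          double gain = gini(sTot, bTot)                          // parent impurity ...
                      - (sL + bL) / (sTot + bTot) * gini(sL, bL)  // ... minus the weighted
                      - (sR + bR) / (sTot + bTot) * gini(sR, bR); //     child impurities
          if (gain > bestGain) { bestGain = gain; cut = 0.5 * (evts[i].x + evts[i + 1].x); }
       }
       return cut;   // cut value with the largest separation gain
    }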
61
62### Boosting:
63
64the idea behind boosting is that signal events from the training
65sample that end up in a background node (and vice versa) are given a
66larger weight than events that are in the correct leaf node. This
67results in a re-weighted training event sample, from which a new
68decision tree is then grown. The boosting can be applied several
69times (typically 100-500 times) and one ends up with a set of decision
70trees (a forest).
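
For illustration (the weight update itself is performed by the booster, e.g. TMVA's
MethodBoost, and is not spelled out in this file): in the classic AdaBoost scheme the
weight of every misclassified training event is multiplied by (1-err)/err, where err is
the weighted misclassification rate of the previous tree, and the weights are then
renormalised so that their sum is unchanged.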
71
72### Bagging:
73
74In this particular variant of Boosted Decision Trees, the boosting
75is not done on the basis of previous training results, but by a simple
76stochastic re-sampling of the initial training event sample.
77
78### Analysis:
79
80applying an individual decision tree to a test event results in a
81classification of the event as either signal or background. For the
82boosted decision tree selection, an event is successively subjected to
83the whole set of decision trees and, depending on how often it is
84classified as signal, a "likelihood" estimator is constructed for the
85event being signal or background. The value of this estimator is
86then used to select the events from an event sample, and
87the cut value on this estimator defines the efficiency and purity of
88the selection.
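
Schematically, for a forest of N trees this estimator is just the fraction of trees whose
leaf classifies the event as signal, y(x) = (1/N) * sum_i h_i(x) with h_i(x) in {0,1}; the
working point is chosen by cutting on y(x).

A minimal booking sketch (assumes the usual TMVA headers, input trees and output file;
variable names, paths and option values are illustrative, not defaults of this class):

    TFile* output = TFile::Open("TMVA_DT.root", "RECREATE");
    TMVA::Factory factory("TMVAClassification", output, "!V:AnalysisType=Classification");
    TMVA::DataLoader loader("dataset");
    loader.AddVariable("var1", 'F');
    loader.AddVariable("var2", 'F');
    // loader.AddSignalTree(sigTree); loader.AddBackgroundTree(bkgTree);
    factory.BookMethod(&loader, TMVA::Types::kDT, "DT",
                       "SeparationType=GiniIndex:MaxDepth=3:MinNodeSize=5%:nCuts=20:PruneMethod=NoPruning");
    factory.TrainAllMethods();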
89*/
90
91#include "TMVA/MethodDT.h"
92
94#include "TMVA/CCPruner.h"
96#include "TMVA/Configurable.h"
97#include "TMVA/CrossEntropy.h"
98#include "TMVA/DataSet.h"
99#include "TMVA/DecisionTree.h"
100#include "TMVA/GiniIndex.h"
101#include "TMVA/IMethod.h"
102#include "TMVA/MethodBase.h"
103#include "TMVA/MethodBoost.h"
104#include "TMVA/MisClassificationError.h"
105#include "TMVA/MsgLogger.h"
106#include "TMVA/Ranking.h"
107#include "TMVA/SdivSqrtSplusB.h"
108#include "TMVA/SeparationBase.h"
109#include "TMVA/Timer.h"
110#include "TMVA/Tools.h"
111#include "TMVA/Types.h"
112
113#include "Riostream.h"
114#include "TRandom3.h"
115#include "TMath.h"
116
117#include <algorithm>
118
119using std::vector;
120
121REGISTER_METHOD(DT)
122
123ClassImp(TMVA::MethodDT);
124
125////////////////////////////////////////////////////////////////////////////////
126/// the standard constructor for just an ordinary "decision tree"
127
128TMVA::MethodDT::MethodDT( const TString& jobName,
129 const TString& methodTitle,
130 DataSetInfo& theData,
131 const TString& theOption) :
132 TMVA::MethodBase( jobName, Types::kDT, methodTitle, theData, theOption)
133 , fTree(0)
134 , fSepType(0)
135 , fMinNodeEvents(0)
136 , fMinNodeSize(0)
137 , fNCuts(0)
138 , fUseYesNoLeaf(kFALSE)
139 , fNodePurityLimit(0)
140 , fMaxDepth(0)
141 , fErrorFraction(0)
142 , fPruneStrength(0)
143 , fPruneMethod(DecisionTree::kNoPruning)
144 , fAutomatic(kFALSE)
145 , fRandomisedTrees(kFALSE)
146 , fUseNvars(0)
147 , fUsePoissonNvars(0) // don't use this initialisation, only here to make Coverity happy. Is set in Init()
148 , fDeltaPruneStrength(0)
149{
151}
152
153////////////////////////////////////////////////////////////////////////////////
154/// constructor from Reader
155
156TMVA::MethodDT::MethodDT( DataSetInfo& dsi,
157 const TString& theWeightFile) :
158 TMVA::MethodBase( Types::kDT, dsi, theWeightFile)
159 , fTree(0)
160 , fSepType(0)
161 , fMinNodeEvents(0)
162 , fMinNodeSize(0)
163 , fNCuts(0)
164 , fUseYesNoLeaf(kFALSE)
165 , fNodePurityLimit(0)
166 , fMaxDepth(0)
167 , fErrorFraction(0)
168 , fPruneStrength(0)
169 , fPruneMethod(DecisionTree::kNoPruning)
170 , fAutomatic(kFALSE)
171 , fRandomisedTrees(kFALSE)
172 , fUseNvars(0)
173 , fDeltaPruneStrength(0)
174{
176}
177
178////////////////////////////////////////////////////////////////////////////////
179/// DT can handle classification with 2 classes only
180
181Bool_t TMVA::MethodDT::HasAnalysisType( Types::EAnalysisType type, UInt_t numberClasses, UInt_t /*numberTargets*/ )
182{
183 if( type == Types::kClassification && numberClasses == 2 ) return kTRUE;
184 return kFALSE;
185}
186
187
188////////////////////////////////////////////////////////////////////////////////
189/// Define the options (their key words) that can be set in the option string.
190///
191/// - UseRandomisedTrees choose at each node splitting a random set of variables
192/// - UseNvars use UseNvars variables in randomised trees
193/// - SeparationType the separation criterion applied in the node splitting.
194/// known:
195/// - GiniIndex
196/// - MisClassificationError
197/// - CrossEntropy
198/// - SDivSqrtSPlusB
199/// - nEventsMin: the minimum number of events in a node (leaf criterion, stop splitting)
200/// - nCuts: the number of steps in the optimisation of the cut for a node (if < 0, then
201/// step size is determined by the events)
202/// - UseYesNoLeaf decide if the classification is done simply by the node type, or by the S/B ratio
203/// (from the training) in the leaf node
204/// - NodePurityLimit the minimum purity to classify a node as a signal node (used in pruning and boosting to determine
205/// misclassification error rate)
206/// - PruneMethod The Pruning method:
207/// known:
208/// - NoPruning // switch off pruning completely
209/// - ExpectedError
210/// - CostComplexity
211/// - PruneStrength a parameter to adjust the amount of pruning. Should be large enough such that overtraining is avoided
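///
/// A complete option string combining these keys might, for example, read (illustrative values only):
///
///     "SeparationType=GiniIndex:nCuts=20:MinNodeSize=5%:MaxDepth=3:PruneMethod=CostComplexity:PruneStrength=-1"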
212
213void TMVA::MethodDT::DeclareOptions()
214{
215 DeclareOptionRef(fRandomisedTrees,"UseRandomisedTrees","Choose at each node splitting a random set of variables and *bagging*");
216 DeclareOptionRef(fUseNvars,"UseNvars","Number of variables used if randomised Tree option is chosen");
217 DeclareOptionRef(fUsePoissonNvars,"UsePoissonNvars", "Interpret \"UseNvars\" not as fixed number but as mean of a Poisson distribution in each split with RandomisedTree option");
218 DeclareOptionRef(fUseYesNoLeaf=kTRUE, "UseYesNoLeaf",
219 "Use Sig or Bkg node type or the ratio S/B as classification in the leaf node");
220 DeclareOptionRef(fNodePurityLimit=0.5, "NodePurityLimit", "In boosting/pruning, nodes with purity > NodePurityLimit are signal; background otherwise.");
221 DeclareOptionRef(fSepTypeS="GiniIndex", "SeparationType", "Separation criterion for node splitting");
222 AddPreDefVal(TString("MisClassificationError"));
223 AddPreDefVal(TString("GiniIndex"));
224 AddPreDefVal(TString("CrossEntropy"));
225 AddPreDefVal(TString("SDivSqrtSPlusB"));
226 DeclareOptionRef(fMinNodeEvents=-1, "nEventsMin", "deprecated !!! Minimum number of events required in a leaf node");
227 DeclareOptionRef(fMinNodeSizeS, "MinNodeSize", "Minimum percentage of training events required in a leaf node (default: Classification: 10%, Regression: 1%)");
228 DeclareOptionRef(fNCuts, "nCuts", "Number of steps during node cut optimisation");
229 DeclareOptionRef(fPruneStrength, "PruneStrength", "Pruning strength (negative value == automatic adjustment)");
230 DeclareOptionRef(fPruneMethodS="NoPruning", "PruneMethod", "Pruning method: NoPruning (switched off), ExpectedError or CostComplexity");
231
232 AddPreDefVal(TString("NoPruning"));
233 AddPreDefVal(TString("ExpectedError"));
234 AddPreDefVal(TString("CostComplexity"));
235
236 if (DoRegression()) {
237 DeclareOptionRef(fMaxDepth=50,"MaxDepth","Max depth of the decision tree allowed");
238 }else{
239 DeclareOptionRef(fMaxDepth=3,"MaxDepth","Max depth of the decision tree allowed");
240 }
241}
242
243////////////////////////////////////////////////////////////////////////////////
244/// options that are used ONLY for the READER to ensure backward compatibility
245
247
249
250 DeclareOptionRef(fPruneBeforeBoost=kFALSE, "PruneBeforeBoost",
251 "--> removed option .. only kept for reader backward compatibility");
252}
253
254////////////////////////////////////////////////////////////////////////////////
255/// the option string is decoded, for available options see "DeclareOptions"
256
257void TMVA::MethodDT::ProcessOptions()
258{
259 fSepTypeS.ToLower();
260 if (fSepTypeS == "misclassificationerror") fSepType = new MisClassificationError();
261 else if (fSepTypeS == "giniindex") fSepType = new GiniIndex();
262 else if (fSepTypeS == "crossentropy") fSepType = new CrossEntropy();
263 else if (fSepTypeS == "sdivsqrtsplusb") fSepType = new SdivSqrtSplusB();
264 else {
265 Log() << kINFO << GetOptions() << Endl;
266 Log() << kFATAL << "<ProcessOptions> unknown Separation Index option called" << Endl;
267 }
268
269 // std::cout << "fSeptypes " << fSepTypeS << " fseptype " << fSepType << std::endl;
270
271 fPruneMethodS.ToLower();
272 if (fPruneMethodS == "expectederror" ) fPruneMethod = DecisionTree::kExpectedErrorPruning;
273 else if (fPruneMethodS == "costcomplexity" ) fPruneMethod = DecisionTree::kCostComplexityPruning;
274 else if (fPruneMethodS == "nopruning" ) fPruneMethod = DecisionTree::kNoPruning;
275 else {
276 Log() << kINFO << GetOptions() << Endl;
277 Log() << kFATAL << "<ProcessOptions> unknown PruneMethod option:" << fPruneMethodS <<" called" << Endl;
278 }
279
280 if (fPruneStrength < 0) fAutomatic = kTRUE;
281 else fAutomatic = kFALSE;
282 if (fAutomatic && fPruneMethod == DecisionTree::kExpectedErrorPruning){
283 Log() << kFATAL
284 << "Sorry automatic pruning strength determination is not implemented yet for ExpectedErrorPruning" << Endl;
285 }
286
287
288 if (this->Data()->HasNegativeEventWeights()){
289 Log() << kINFO << " You are using a Monte Carlo that has also negative weights. "
290 << "That should in principle be fine as long as on average you end up with "
291 << "something positive. For this you have to make sure that the minimal number "
292 << "of (un-weighted) events demanded for a tree node (currently you use: MinNodeSize="
293 <<fMinNodeSizeS
294 <<", (or the deprecated equivalent nEventsMin) you can set this via the "
295 <<"MethodDT option string when booking the "
296 << "classifier) is large enough to allow for reasonable averaging!!! "
297 << " If this does not help.. maybe you want to try the option: IgnoreNegWeightsInTraining "
298 << "which ignores events with negative weight in the training. " << Endl
299 << Endl << "Note: You'll get a WARNING message during the training if that should ever happen" << Endl;
300 }
301
302 if (fRandomisedTrees){
303 Log() << kINFO << " Randomised trees should use *bagging* as *boost* method. Did you set this in *MethodBoost*? Here I can only enforce *no pruning*" << Endl;
304 fPruneMethod = DecisionTree::kNoPruning;
305 // fBoostType = "Bagging";
306 }
307
308 if (fMinNodeEvents > 0){
309 fMinNodeSize = fMinNodeEvents / Data()->GetNTrainingEvents() * 100;
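 // illustration (assumed numbers): nEventsMin=200 with 20000 training events corresponds to MinNodeSize=1%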
310 Log() << kWARNING << "You have explicitly set *nEventsMin*, the min absolute number \n"
311 << "of events in a leaf node. This is DEPRECATED, please use the option \n"
312 << "*MinNodeSize* giving the relative number as percentage of training \n"
313 << "events instead. \n"
314 << "nEventsMin="<<fMinNodeEvents<< "--> MinNodeSize="<<fMinNodeSize<<"%"
315 << Endl;
316 }else{
317 SetMinNodeSize(fMinNodeSizeS);
318 }
319}
320
321void TMVA::MethodDT::SetMinNodeSize(Double_t sizeInPercent){
322 if (sizeInPercent > 0 && sizeInPercent < 50){
323 fMinNodeSize=sizeInPercent;
324
325 } else {
326 Log() << kERROR << "you have demanded a minimal node size of "
327 << sizeInPercent << "% of the training events.. \n"
328 << " that somehow does not make sense "<<Endl;
329 }
330
331}
332void TMVA::MethodDT::SetMinNodeSize( TString sizeInPercent ){
333 sizeInPercent.ReplaceAll("%","");
334 if (sizeInPercent.IsAlnum()) SetMinNodeSize(sizeInPercent.Atof());
335 else {
336 Log() << kERROR << "I had problems reading the option MinNodeEvents, which\n"
337 << "after removing a possible % sign now reads " << sizeInPercent << Endl;
338 }
339}
340
341////////////////////////////////////////////////////////////////////////////////
342/// common initialisation with defaults for the DT-Method
343
344void TMVA::MethodDT::Init( void )
345{
346 fMinNodeEvents = -1;
347 fMinNodeSize = 5;
348 fMinNodeSizeS = "5%";
349 fNCuts = 20;
350 fPruneMethod = DecisionTree::kNoPruning;
351 fPruneStrength = 5; // -1 means automatic determination of the prune strength using a validation sample
352 fDeltaPruneStrength=0.1;
353 fRandomisedTrees= kFALSE;
354 fUseNvars = GetNvar();
355 fUsePoissonNvars = kTRUE;
356
357 // reference cut value to distinguish signal-like from background-like events
358 SetSignalReferenceCut( 0 );
359 if (fAnalysisType == Types::kClassification || fAnalysisType == Types::kMulticlass ) {
360 fMaxDepth = 3;
361 }else {
362 fMaxDepth = 50;
363 }
364}
365
366////////////////////////////////////////////////////////////////////////////////
367/// destructor
368
369TMVA::MethodDT::~MethodDT( void )
370{
371 delete fTree;
372}
373
374////////////////////////////////////////////////////////////////////////////////
375
376void TMVA::MethodDT::Train( void )
377{
379 fTree = new DecisionTree( fSepType, fMinNodeSize, fNCuts, &(DataInfo()), 0,
380 fRandomisedTrees, fUseNvars, fUsePoissonNvars,fMaxDepth,0 );
381 fTree->SetNVars(GetNvar());
382 if (fRandomisedTrees) Log()<<kWARNING<<" randomised Trees do not work yet in this framework,"
383 << " as I do not know how to give each tree a new random seed, now they"
384 << " will be all the same and that is not good " << Endl;
385 fTree->SetAnalysisType( GetAnalysisType() );
386
387 //fTree->BuildTree(GetEventCollection(Types::kTraining));
388 Data()->SetCurrentType(Types::kTraining);
389 UInt_t nevents = Data()->GetNTrainingEvents();
390 std::vector<const TMVA::Event*> tmp;
391 for (Long64_t ievt=0; ievt<nevents; ievt++) {
392 const Event *event = GetEvent(ievt);
393 tmp.push_back(event);
394 }
395 fTree->BuildTree(tmp);
396 if (fPruneMethod != DecisionTree::kNoPruning) fTree->PruneTree();
397
399 ExitFromTraining();
400}
401
402////////////////////////////////////////////////////////////////////////////////
403/// prune the decision tree if requested. Pruning is good for individual trees, which are best grown out
404/// fully and then pruned back, while boosted decision trees work best when they are 'small' trees to start
405/// with; at least the standard "optimal pruning algorithms" do not result in 'weak enough' classifiers.
406
407Double_t TMVA::MethodDT::PruneTree( )
408{
409 // remember the number of nodes beforehand (for monitoring purposes)
410
411
412 if (fAutomatic && fPruneMethod == DecisionTree::kCostComplexityPruning) { // automatic cost complexity pruning
413 CCPruner* pruneTool = new CCPruner(fTree, this->Data() , fSepType);
414 pruneTool->Optimize();
415 std::vector<DecisionTreeNode*> nodes = pruneTool->GetOptimalPruneSequence();
416 fPruneStrength = pruneTool->GetOptimalPruneStrength();
417 for(UInt_t i = 0; i < nodes.size(); i++)
418 fTree->PruneNode(nodes[i]);
419 delete pruneTool;
420 }
421 else if (fAutomatic && fPruneMethod != DecisionTree::kCostComplexityPruning){
422 /*
423
424 Double_t alpha = 0;
425 Double_t delta = fDeltaPruneStrength;
426
427 DecisionTree* dcopy;
428 std::vector<Double_t> q;
429 multimap<Double_t,Double_t> quality;
430 Int_t nnodes=fTree->GetNNodes();
431
432 // find the maximum prune strength that still leaves some nodes
433 Bool_t forceStop = kFALSE;
434 Int_t troubleCount=0, previousNnodes=nnodes;
435
436
437 nnodes=fTree->GetNNodes();
438 while (nnodes > 3 && !forceStop) {
439 dcopy = new DecisionTree(*fTree);
440 dcopy->SetPruneStrength(alpha+=delta);
441 dcopy->PruneTree();
442 q.push_back(TestTreeQuality(dcopy));
443 quality.insert(std::pair<const Double_t,Double_t>(q.back(),alpha));
444 nnodes=dcopy->GetNNodes();
445 if (previousNnodes == nnodes) troubleCount++;
446 else {
447 troubleCount=0; // reset counter
448 if (nnodes < previousNnodes / 2 ) fDeltaPruneStrength /= 2.;
449 }
450 previousNnodes = nnodes;
451 if (troubleCount > 20) {
452 if (methodIndex == 0 && fPruneStrength <=0) {//maybe you need larger stepsize ??
453 fDeltaPruneStrength *= 5;
454 Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
455 << " for Tree " << methodIndex
456 << " --> first try to increase the step size"
457 << " currently Prunestrenght= " << alpha
458 << " stepsize " << fDeltaPruneStrength << " " << Endl;
459 troubleCount = 0; // try again
460 fPruneStrength = 1; // if it was for the first time..
461 } else if (methodIndex == 0 && fPruneStrength <=2) {//maybe you need much larger stepsize ??
462 fDeltaPruneStrength *= 5;
463 Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
464 << " for Tree " << methodIndex
465 << " --> try to increase the step size even more.. "
466 << " if that still didn't work, TRY IT BY HAND"
467 << " currently Prunestrenght= " << alpha
468 << " stepsize " << fDeltaPruneStrength << " " << Endl;
469 troubleCount = 0; // try again
470 fPruneStrength = 3; // if it was for the first time..
471 } else {
472 forceStop=kTRUE;
473 Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
474 << " for Tree " << methodIndex << " at tested prune strength: " << alpha << " --> abort forced, use same strength as for previous tree:"
475 << fPruneStrength << Endl;
476 }
477 }
478 if (fgDebugLevel==1) Log() << kINFO << "Pruneed with ("<<alpha
479 << ") give quality: " << q.back()
480 << " and #nodes: " << nnodes
481 << Endl;
482 delete dcopy;
483 }
484 if (!forceStop) {
485 multimap<Double_t,Double_t>::reverse_iterator it=quality.rend();
486 it++;
487 fPruneStrength = it->second;
488 // adjust the step size for the next tree.. think that 20 steps are sort of
489 // fine enough.. could become a tunable option later..
490 fDeltaPruneStrength *= Double_t(q.size())/20.;
491 }
492
493 fTree->SetPruneStrength(fPruneStrength);
494 fTree->PruneTree();
495 */
496 }
497 else {
498 fTree->SetPruneStrength(fPruneStrength);
499 fTree->PruneTree();
500 }
501
502 return fPruneStrength;
503}
504
505////////////////////////////////////////////////////////////////////////////////
506
507Double_t TMVA::MethodDT::TestTreeQuality( DecisionTree *dt )
508{
509 Data()->SetCurrentType(Types::kValidation);
510 // test the tree quality.. in terms of Misclassification
511 Double_t SumCorrect=0,SumWrong=0;
512 for (Long64_t ievt=0; ievt<Data()->GetNEvents(); ievt++)
513 {
514 const Event * ev = Data()->GetEvent(ievt);
515 if ((dt->CheckEvent(ev) > dt->GetNodePurityLimit() ) == DataInfo().IsSignal(ev)) SumCorrect+=ev->GetWeight();
516 else SumWrong+=ev->GetWeight();
517 }
518 Data()->SetCurrentType(Types::kTraining);
519 return SumCorrect / (SumCorrect + SumWrong);
520}
521
522////////////////////////////////////////////////////////////////////////////////
523
524void TMVA::MethodDT::AddWeightsXMLTo( void* parent ) const
525{
526 fTree->AddXMLTo(parent);
527 //Log() << kFATAL << "Please implement writing of weights as XML" << Endl;
528}
529
530////////////////////////////////////////////////////////////////////////////////
531
532void TMVA::MethodDT::ReadWeightsFromXML( void* wghtnode)
533{
534 if(fTree)
535 delete fTree;
536 fTree = new DecisionTree();
537 fTree->ReadXML(wghtnode,GetTrainingTMVAVersionCode());
538}
539
540////////////////////////////////////////////////////////////////////////////////
541
542void TMVA::MethodDT::ReadWeightsFromStream( std::istream& istr )
543{
544 delete fTree;
545 fTree = new DecisionTree();
546 fTree->Read(istr);
547}
548
549////////////////////////////////////////////////////////////////////////////////
550/// returns MVA value
551
552Double_t TMVA::MethodDT::GetMvaValue( Double_t* err, Double_t* errUpper )
553{
554 // cannot determine error
555 NoErrorCalc(err, errUpper);
556
557 return fTree->CheckEvent(GetEvent(),fUseYesNoLeaf);
558}
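// Application sketch (illustration only, not part of this file): a trained "DT" method is
// typically applied through TMVA::Reader, whose EvaluateMVA call ends up in the
// GetMvaValue above. Variable names and the weight-file path are placeholders.
//
//    TMVA::Reader reader("!Color:!Silent");
//    Float_t var1 = 0, var2 = 0;
//    reader.AddVariable("var1", &var1);
//    reader.AddVariable("var2", &var2);
//    reader.BookMVA("DT", "dataset/weights/TMVAClassification_DT.weights.xml");
//    var1 = 1.2f; var2 = -0.4f;                   // the current event's values
//    Double_t mvaValue = reader.EvaluateMVA("DT");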
559
560////////////////////////////////////////////////////////////////////////////////
561
562void TMVA::MethodDT::GetHelpMessage() const
563{
564}
565////////////////////////////////////////////////////////////////////////////////
566
567const TMVA::Ranking* TMVA::MethodDT::CreateRanking()
568{
569 return 0;
570}
Definition: TMath.h:750