MethodDT.cxx
// @(#)root/tmva $Id$
// Author: Andreas Hoecker, Joerg Stelzer, Helge Voss, Kai Voss

/**********************************************************************************
 * Project: TMVA - a Root-integrated toolkit for multivariate data analysis
 * Package: TMVA
 * Class  : MethodDT (DT = Decision Trees)
 * Web    : http://tmva.sourceforge.net
 *
 * Description:
 *      Analysis of Boosted Decision Trees
 *
 * Authors (alphabetical):
 *      Andreas Hoecker <Andreas.Hocker@cern.ch> - CERN, Switzerland
 *      Helge Voss      <Helge.Voss@cern.ch>     - MPI-K Heidelberg, Germany
 *      Or Cohen        <orcohenor@gmail.com>    - Weizmann Inst., Israel
 *
 * Copyright (c) 2005:
 *      CERN, Switzerland
 *      MPI-K Heidelberg, Germany
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted according to the terms listed in LICENSE
 * (http://tmva.sourceforge.net/LICENSE)
 **********************************************************************************/

/*! \class TMVA::MethodDT
\ingroup TMVA

Analysis of Boosted Decision Trees

Boosted decision trees have been used successfully in High Energy
Physics analyses, for example by the MiniBooNE experiment
(Yang-Roe-Zhu, physics/0508045). In boosted decision trees, the
selection is based on a majority vote over the results of several decision
trees, which are all derived from the same training sample by
supplying different event weights during the training.

### Decision trees:

Successive decision nodes are used to categorise the
events of the sample as either signal or background. Each node
uses only a single discriminating variable to decide if the event is
signal-like ("goes right") or background-like ("goes left"). This
forms a tree-like structure with "baskets" at the end (leaf nodes),
and an event is classified as either signal or background according to
whether the basket where it ends up has been classified as signal or
background during the training. Training of a decision tree is the
process of defining the "cut criteria" for each node. The training
starts with the root node: one takes the full training event
sample and selects the variable and corresponding cut value that give
the best separation between signal and background at this stage. Using
this cut criterion, the sample is then divided into two subsamples, a
signal-like (right) and a background-like (left) sample. Two new nodes
are then created for each of the two sub-samples and they are
constructed using the same mechanism as described for the root
node. The division is stopped once a certain node has reached either a
minimum number of events, or a minimum or maximum signal purity. These
leaf nodes are then called "signal" or "background" if they contain
more signal or background events, respectively, from the training sample.
A schematic sketch of this greedy growing procedure is given below.
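
For illustration only, the following sketch shows the recursive growing step described
above, using deliberately simplified, hypothetical types (Evt, Node) and an exhaustive
cut scan. The actual implementation used by this class is TMVA::DecisionTree::BuildTree,
which in addition supports binned cut optimisation (nCuts), purity limits, randomised
variable selection, and more.

~~~ {.cpp}
#include <cstddef>
#include <vector>

struct Evt  { std::vector<double> x; bool isSignal; double w; };  // simplified training event
struct Node { int ivar = -1; double cut = 0.; bool signalLeaf = false; Node *left = nullptr, *right = nullptr; };

// Weighted signal purity of a sample.
double Purity(const std::vector<Evt>& s) {
   double sig = 0., tot = 0.;
   for (const auto& e : s) { tot += e.w; if (e.isSignal) sig += e.w; }
   return tot > 0. ? sig / tot : 0.;
}

// Weighted Gini index W*p*(1-p): small for pure samples, large for mixed ones.
double WGini(const std::vector<Evt>& s) {
   double sig = 0., tot = 0.;
   for (const auto& e : s) { tot += e.w; if (e.isSignal) sig += e.w; }
   return tot > 0. ? tot * (sig / tot) * (1. - sig / tot) : 0.;
}

// Exhaustive (and slow) scan over all (variable, cut) pairs: keep the split that
// minimises the summed Gini index of the two subsamples, i.e. the best separation.
void FindBestSplit(const std::vector<Evt>& s, int& bestVar, double& bestCut) {
   double best = WGini(s);
   for (std::size_t iv = 0; iv < s.front().x.size(); ++iv)
      for (const auto& cand : s) {
         std::vector<Evt> lo, hi;
         for (const auto& e : s) (e.x[iv] > cand.x[iv] ? hi : lo).push_back(e);
         if (lo.empty() || hi.empty()) continue;
         double q = WGini(lo) + WGini(hi);
         if (q < best) { best = q; bestVar = (int)iv; bestCut = cand.x[iv]; }
      }
}

// Grow the tree node by node (nodes are intentionally leaked in this sketch).
Node* GrowTree(const std::vector<Evt>& sample, int depth, int maxDepth, std::size_t minEvents) {
   Node* node = new Node;
   if (sample.empty() || depth >= maxDepth || sample.size() < minEvents) {  // stopping criteria
      node->signalLeaf = Purity(sample) > 0.5;                              // leaf type = majority (purity) vote
      return node;
   }
   FindBestSplit(sample, node->ivar, node->cut);
   if (node->ivar < 0) { node->signalLeaf = Purity(sample) > 0.5; return node; }  // no useful split found
   std::vector<Evt> left, right;
   for (const auto& e : sample)
      (e.x[node->ivar] > node->cut ? right : left).push_back(e);            // "goes right" = signal-like
   node->left  = GrowTree(left,  depth + 1, maxDepth, minEvents);
   node->right = GrowTree(right, depth + 1, maxDepth, minEvents);
   return node;
}
~~~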

### Boosting:

The idea behind boosting is that signal events from the training
sample that end up in a background node (and vice versa) are given a
larger weight than events that end up in the correct leaf node. This
results in a re-weighted training event sample, with which a new
decision tree can then be developed. The boosting can be applied several
times (typically 100-500 times) and one ends up with a set of decision
trees (a forest). A sketch of one such re-weighting step is given below.
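
For illustration, an AdaBoost-style re-weighting step could look like the following
sketch (re-using the simplified Evt/Node types from the sketch above; the helper names
Classify and BoostWeights are hypothetical). Note that for MethodDT the boosting itself
is steered externally, e.g. by TMVA::MethodBoost.

~~~ {.cpp}
#include <vector>

// Walk one tree down to its leaf and return the leaf classification.
bool Classify(const Node* n, const Evt& e) {
   while (n->ivar >= 0) n = (e.x[n->ivar] > n->cut) ? n->right : n->left;
   return n->signalLeaf;
}

// One AdaBoost-style boosting step: enhance the weight of misclassified events.
void BoostWeights(std::vector<Evt>& sample, const Node* tree) {
   double wrong = 0., total = 0.;
   for (const auto& e : sample) {
      total += e.w;
      if (Classify(tree, e) != e.isSignal) wrong += e.w;   // weighted misclassification
   }
   if (wrong <= 0. || wrong >= total) return;              // degenerate tree: nothing to boost
   const double err   = wrong / total;
   const double alpha = (1. - err) / err;                  // boost factor, > 1 for err < 0.5
   double newTotal = 0.;
   for (auto& e : sample) {
      if (Classify(tree, e) != e.isSignal) e.w *= alpha;   // misclassified events get a larger weight
      newTotal += e.w;
   }
   for (auto& e : sample) e.w *= total / newTotal;         // renormalise the total sample weight
}
~~~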

### Bagging:

In this particular variant of the Boosted Decision Trees the boosting
is not done on the basis of previous training results, but by a simple
stochastic re-sampling of the initial training event sample.
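
Such a stochastic re-sampling step could, for instance, draw each event a
Poisson-distributed number of times, which is a common way of implementing bagging.
The helper below is again only an illustration, using the simplified Evt type from above:

~~~ {.cpp}
#include <vector>
#include "TRandom3.h"

// Build one "bagged" training sample by bootstrap re-sampling the original one.
std::vector<Evt> ResampleForBagging(const std::vector<Evt>& sample, TRandom3& rng) {
   std::vector<Evt> bagged;
   for (const auto& e : sample) {
      int n = rng.Poisson(1.0);                  // each event enters 0, 1, 2, ... times
      for (int i = 0; i < n; ++i) bagged.push_back(e);
   }
   return bagged;
}
~~~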

### Analysis:

Applying an individual decision tree to a test event results in a
classification of the event as either signal or background. For the
boosted decision tree selection, an event is successively subjected to
the whole set of decision trees, and depending on how often it is
classified as signal, a "likelihood" estimator is constructed for the
event being signal or background. The value of this estimator is then
used to select the events from an event sample, and the cut value on
this estimator defines the efficiency and purity of the selection.
A sketch of how the method is typically booked and applied is given below.
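
Booking and application could look as follows (illustrative sketch only: the factory,
data loader, variable names and the weight-file path are assumptions that have to be
adapted to the concrete analysis, as in the standard TMVA example macros):

~~~ {.cpp}
// Training side: book a single (optionally pruned) decision tree.
factory->BookMethod( dataloader, TMVA::Types::kDT, "DT",
                     "SeparationType=GiniIndex:nCuts=20:MinNodeSize=5%:"
                     "MaxDepth=3:PruneMethod=NoPruning" );

// Application side: evaluate the trained tree with the TMVA reader.
Float_t var1 = 0;
TMVA::Reader reader( "!Color:!Silent" );
reader.AddVariable( "var1", &var1 );            // same variables as used for the training
reader.BookMVA( "DT", "dataset/weights/TMVAClassification_DT.weights.xml" );
double mvaValue = reader.EvaluateMVA( "DT" );   // leaf type or S/B purity, see UseYesNoLeaf
~~~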
*/

#include "TMVA/MethodDT.h"

#include "TMVA/CCPruner.h"
#include "TMVA/ClassifierFactory.h"
#include "TMVA/Configurable.h"
#include "TMVA/CrossEntropy.h"
#include "TMVA/DataSet.h"
#include "TMVA/DecisionTree.h"
#include "TMVA/GiniIndex.h"
#include "TMVA/IMethod.h"
#include "TMVA/MethodBase.h"
#include "TMVA/MethodBoost.h"
#include "TMVA/MisClassificationError.h"
#include "TMVA/MsgLogger.h"
#include "TMVA/Ranking.h"
#include "TMVA/SdivSqrtSplusB.h"
#include "TMVA/SeparationBase.h"
#include "TMVA/Timer.h"
#include "TMVA/Tools.h"
#include "TMVA/Types.h"

#include "Riostream.h"
#include "TRandom3.h"
#include "TMath.h"
#include "TObjString.h"

#include <algorithm>

using std::vector;

REGISTER_METHOD(DT)

ClassImp(TMVA::MethodDT);

////////////////////////////////////////////////////////////////////////////////
/// the standard constructor for an ordinary "decision tree"

TMVA::MethodDT::MethodDT( const TString& jobName,
                          const TString& methodTitle,
                          DataSetInfo& theData,
                          const TString& theOption) :
   TMVA::MethodBase( jobName, Types::kDT, methodTitle, theData, theOption)
   , fTree(0)
   , fSepType(0)
   , fMinNodeEvents(0)
   , fMinNodeSize(0)
   , fNCuts(0)
   , fUseYesNoLeaf(kFALSE)
   , fNodePurityLimit(0)
   , fMaxDepth(0)
   , fErrorFraction(0)
   , fPruneStrength(0)
   , fPruneMethod(DecisionTree::kNoPruning)
   , fAutomatic(kFALSE)
   , fRandomisedTrees(kFALSE)
   , fUseNvars(0)
   , fUsePoissonNvars(0) // don't use this initialisation, only here to make Coverity happy. Is set in Init()
   , fDeltaPruneStrength(0)
{
}

////////////////////////////////////////////////////////////////////////////////
/// constructor from Reader

TMVA::MethodDT::MethodDT( DataSetInfo& dsi,
                          const TString& theWeightFile) :
   TMVA::MethodBase( Types::kDT, dsi, theWeightFile)
   , fTree(0)
   , fSepType(0)
   , fMinNodeEvents(0)
   , fMinNodeSize(0)
   , fNCuts(0)
   , fUseYesNoLeaf(kFALSE)
   , fNodePurityLimit(0)
   , fMaxDepth(0)
   , fErrorFraction(0)
   , fPruneStrength(0)
   , fPruneMethod(DecisionTree::kNoPruning)
   , fAutomatic(kFALSE)
   , fRandomisedTrees(kFALSE)
   , fUseNvars(0)
   , fDeltaPruneStrength(0)
{
}

////////////////////////////////////////////////////////////////////////////////
/// DT can handle classification with two classes only

Bool_t TMVA::MethodDT::HasAnalysisType( Types::EAnalysisType type, UInt_t numberClasses, UInt_t /*numberTargets*/ )
{
   if( type == Types::kClassification && numberClasses == 2 ) return kTRUE;
   return kFALSE;
}

////////////////////////////////////////////////////////////////////////////////
/// Define the options (their key words) that can be set in the option string.
///
/// - UseRandomisedTrees  choose at each node splitting a random set of variables
/// - UseNvars  use UseNvars variables in randomised trees
/// - SeparationType  the separation criterion applied in the node splitting.
///   known:
///    - GiniIndex
///    - MisClassificationError
///    - CrossEntropy
///    - SDivSqrtSPlusB
/// - nEventsMin: the minimum number of events in a node (leaf criterion, stop splitting)
/// - nCuts: the number of steps in the optimisation of the cut for a node (if < 0, then
///   the step size is determined by the events)
/// - UseYesNoLeaf  decide if the classification is done simply by the node type, or by the S/B
///   (from the training) in the leaf node
/// - NodePurityLimit  the minimum purity to classify a node as a signal node (used in pruning and boosting to determine
///   misclassification error rate)
/// - PruneMethod  the pruning method:
///   known:
///    - NoPruning      // switch off pruning completely
///    - ExpectedError
///    - CostComplexity
/// - PruneStrength  a parameter to adjust the amount of pruning. Should be large enough such that overtraining is avoided.
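///
/// For illustration, a booking call might pass an option string such as the following
/// (the values are placeholders to be tuned for the analysis at hand):
///
///     SeparationType=GiniIndex:nCuts=20:MinNodeSize=5%:MaxDepth=3:PruneMethod=NoPruning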

void TMVA::MethodDT::DeclareOptions()
{
   DeclareOptionRef(fRandomisedTrees,"UseRandomisedTrees","Choose at each node splitting a random set of variables and *bagging*");
   DeclareOptionRef(fUseNvars,"UseNvars","Number of variables used if randomised Tree option is chosen");
   DeclareOptionRef(fUsePoissonNvars,"UsePoissonNvars", "Interpret \"UseNvars\" not as fixed number but as mean of a Poisson distribution in each split with RandomisedTree option");
   DeclareOptionRef(fUseYesNoLeaf=kTRUE, "UseYesNoLeaf",
                    "Use Sig or Bkg node type or the ratio S/B as classification in the leaf node");
   DeclareOptionRef(fNodePurityLimit=0.5, "NodePurityLimit", "In boosting/pruning, nodes with purity > NodePurityLimit are signal; background otherwise.");
   DeclareOptionRef(fSepTypeS="GiniIndex", "SeparationType", "Separation criterion for node splitting");
   AddPreDefVal(TString("MisClassificationError"));
   AddPreDefVal(TString("GiniIndex"));
   AddPreDefVal(TString("CrossEntropy"));
   AddPreDefVal(TString("SDivSqrtSPlusB"));
   DeclareOptionRef(fMinNodeEvents=-1, "nEventsMin", "deprecated !!! Minimum number of events required in a leaf node");
   DeclareOptionRef(fMinNodeSizeS, "MinNodeSize", "Minimum percentage of training events required in a leaf node (default: Classification: 10%, Regression: 1%)");
   DeclareOptionRef(fNCuts, "nCuts", "Number of steps during node cut optimisation");
   DeclareOptionRef(fPruneStrength, "PruneStrength", "Pruning strength (negative value == automatic adjustment)");
   DeclareOptionRef(fPruneMethodS="NoPruning", "PruneMethod", "Pruning method: NoPruning (switched off), ExpectedError or CostComplexity");

   AddPreDefVal(TString("NoPruning"));
   AddPreDefVal(TString("ExpectedError"));
   AddPreDefVal(TString("CostComplexity"));

   if (DoRegression()) {
      DeclareOptionRef(fMaxDepth=50,"MaxDepth","Max depth of the decision tree allowed");
   } else {
      DeclareOptionRef(fMaxDepth=3,"MaxDepth","Max depth of the decision tree allowed");
   }
}

////////////////////////////////////////////////////////////////////////////////
/// options that are used ONLY for the READER to ensure backward compatibility

void TMVA::MethodDT::DeclareCompatibilityOptions() {
   MethodBase::DeclareCompatibilityOptions();

   DeclareOptionRef(fPruneBeforeBoost=kFALSE, "PruneBeforeBoost",
                    "--> removed option .. only kept for reader backward compatibility");
}

////////////////////////////////////////////////////////////////////////////////
/// the option string is decoded, for available options see "DeclareOptions"

void TMVA::MethodDT::ProcessOptions()
{
   fSepTypeS.ToLower();
   if      (fSepTypeS == "misclassificationerror") fSepType = new MisClassificationError();
   else if (fSepTypeS == "giniindex")              fSepType = new GiniIndex();
   else if (fSepTypeS == "crossentropy")           fSepType = new CrossEntropy();
   else if (fSepTypeS == "sdivsqrtsplusb")         fSepType = new SdivSqrtSplusB();
   else {
      Log() << kINFO << GetOptions() << Endl;
      Log() << kFATAL << "<ProcessOptions> unknown Separation Index option called" << Endl;
   }

   // std::cout << "fSeptypes " << fSepTypeS << " fseptype " << fSepType << std::endl;

   fPruneMethodS.ToLower();
   if      (fPruneMethodS == "expectederror" )  fPruneMethod = DecisionTree::kExpectedErrorPruning;
   else if (fPruneMethodS == "costcomplexity" ) fPruneMethod = DecisionTree::kCostComplexityPruning;
   else if (fPruneMethodS == "nopruning" )      fPruneMethod = DecisionTree::kNoPruning;
   else {
      Log() << kINFO << GetOptions() << Endl;
      Log() << kFATAL << "<ProcessOptions> unknown PruneMethod option:" << fPruneMethodS <<" called" << Endl;
   }

   if (fPruneStrength < 0) fAutomatic = kTRUE;
   else fAutomatic = kFALSE;
   if (fAutomatic && fPruneMethod != DecisionTree::kCostComplexityPruning){
      Log() << kFATAL
            << "Sorry automatic pruning strength determination is not implemented yet for ExpectedErrorPruning" << Endl;
   }


   if (this->Data()->HasNegativeEventWeights()){
      Log() << kINFO << " You are using a Monte Carlo that has also negative weights. "
            << "That should in principle be fine as long as on average you end up with "
            << "something positive. For this you have to make sure that the minimal number "
            << "of (un-weighted) events demanded for a tree node (currently you use: MinNodeSize="
            << fMinNodeSizeS
            << ", (or the deprecated equivalent nEventsMin) you can set this via the "
            << "MethodDT option string when booking the "
            << "classifier) is large enough to allow for reasonable averaging!!! "
            << " If this does not help.. maybe you want to try the option: IgnoreNegWeightsInTraining "
            << "which ignores events with negative weight in the training. " << Endl
            << Endl << "Note: You'll get a WARNING message during the training if that should ever happen" << Endl;
   }

   if (fRandomisedTrees){
      Log() << kINFO << " Randomised trees should use *bagging* as *boost* method. Did you set this in the *MethodBoost* ? . Here I can enforce only the *no pruning*" << Endl;
      fPruneMethod = DecisionTree::kNoPruning;
      //      fBoostType   = "Bagging";
   }

   if (fMinNodeEvents > 0){
      // convert the deprecated absolute number of events into a relative node size in percent
      fMinNodeSize = 100.0 * fMinNodeEvents / Data()->GetNTrainingEvents();
      Log() << kWARNING << "You have explicitly set *nEventsMin*, the min absolute number \n"
            << "of events in a leaf node. This is DEPRECATED, please use the option \n"
            << "*MinNodeSize* giving the relative number as percentage of training \n"
            << "events instead. \n"
            << "nEventsMin="<<fMinNodeEvents<< "--> MinNodeSize="<<fMinNodeSize<<"%"
            << Endl;
   } else {
      SetMinNodeSize(fMinNodeSizeS);
   }
}

void TMVA::MethodDT::SetMinNodeSize(Double_t sizeInPercent){
   if (sizeInPercent > 0 && sizeInPercent < 50){
      fMinNodeSize=sizeInPercent;

   } else {
      Log() << kERROR << "you have demanded a minimal node size of "
            << sizeInPercent << "% of the training events.. \n"
            << " that somehow does not make sense "<<Endl;
   }

}

void TMVA::MethodDT::SetMinNodeSize( TString sizeInPercent ){
   sizeInPercent.ReplaceAll("%","");
   if (sizeInPercent.IsAlnum()) SetMinNodeSize(sizeInPercent.Atof());
   else {
      Log() << kERROR << "I had problems reading the option MinNodeEvents, which\n"
            << "after removing a possible % sign now reads " << sizeInPercent << Endl;
   }
}

////////////////////////////////////////////////////////////////////////////////
/// common initialisation with defaults for the DT-Method

void TMVA::MethodDT::Init( void )
{
   fMinNodeEvents   = -1;
   fMinNodeSize     = 5;
   fMinNodeSizeS    = "5%";
   fNCuts           = 20;
   fPruneMethod     = DecisionTree::kNoPruning;
   fPruneStrength   = 5;     // -1 means automatic determination of the prune strength using a validation sample
   fDeltaPruneStrength=0.1;
   fRandomisedTrees = kFALSE;
   fUseNvars        = GetNvar();
   fUsePoissonNvars = kTRUE;

   // reference cut value to distinguish signal-like from background-like events
   SetSignalReferenceCut( 0 );
   if (fAnalysisType == Types::kClassification || fAnalysisType == Types::kMulticlass ) {
      fMaxDepth = 3;
   } else {
      fMaxDepth = 50;
   }
}

////////////////////////////////////////////////////////////////////////////////
/// destructor

TMVA::MethodDT::~MethodDT( void )
{
   delete fTree;
}

////////////////////////////////////////////////////////////////////////////////
/// Build (and optionally prune) the single decision tree from the training events.

void TMVA::MethodDT::Train( void )
{
   fTree = new DecisionTree( fSepType, fMinNodeSize, fNCuts, &(DataInfo()), 0,
                             fRandomisedTrees, fUseNvars, fUsePoissonNvars, fMaxDepth, 0 );
   fTree->SetNVars(GetNvar());
   if (fRandomisedTrees) Log()<<kWARNING<<" randomised Trees do not work yet in this framework,"
                              << " as I do not know how to give each tree a new random seed, now they"
                              << " will be all the same and that is not good " << Endl;
   fTree->SetAnalysisType( GetAnalysisType() );

   //fTree->BuildTree(GetEventCollection(Types::kTraining));
   Data()->SetCurrentType(Types::kTraining);
   UInt_t nevents = Data()->GetNTrainingEvents();
   std::vector<const TMVA::Event*> tmp;
   for (Long64_t ievt=0; ievt<nevents; ievt++) {
      const Event *event = GetEvent(ievt);
      tmp.push_back(event);
   }
   fTree->BuildTree(tmp);
   if (fPruneMethod != DecisionTree::kNoPruning) fTree->PruneTree();

   ExitFromTraining();
}

////////////////////////////////////////////////////////////////////////////////
/// Prune the decision tree if requested. Pruning is good for individual trees that are
/// best grown out and then pruned back, while boosted decision trees are best built as
/// small trees to start with. Well, at least the standard "optimal pruning algorithms"
/// don't result in 'weak enough' classifiers !!

Double_t TMVA::MethodDT::PruneTree( )
{
   // remember the number of nodes beforehand (for monitoring purposes)


   if (fAutomatic && fPruneMethod == DecisionTree::kCostComplexityPruning) { // automatic cost complexity pruning
      CCPruner* pruneTool = new CCPruner(fTree, this->Data() , fSepType);
      pruneTool->Optimize();
      std::vector<DecisionTreeNode*> nodes = pruneTool->GetOptimalPruneSequence();
      fPruneStrength = pruneTool->GetOptimalPruneStrength();
      for(UInt_t i = 0; i < nodes.size(); i++)
         fTree->PruneNode(nodes[i]);
      delete pruneTool;
   }
   else if (fAutomatic && fPruneMethod != DecisionTree::kCostComplexityPruning){
      /*

      Double_t alpha = 0;
      Double_t delta = fDeltaPruneStrength;

      DecisionTree* dcopy;
      std::vector<Double_t> q;
      multimap<Double_t,Double_t> quality;
      Int_t nnodes=fTree->GetNNodes();

      // find the maximum prune strength that still leaves some nodes
      Bool_t forceStop = kFALSE;
      Int_t troubleCount=0, previousNnodes=nnodes;


      nnodes=fTree->GetNNodes();
      while (nnodes > 3 && !forceStop) {
         dcopy = new DecisionTree(*fTree);
         dcopy->SetPruneStrength(alpha+=delta);
         dcopy->PruneTree();
         q.push_back(TestTreeQuality(dcopy));
         quality.insert(std::pair<const Double_t,Double_t>(q.back(),alpha));
         nnodes=dcopy->GetNNodes();
         if (previousNnodes == nnodes) troubleCount++;
         else {
            troubleCount=0; // reset counter
            if (nnodes < previousNnodes / 2 ) fDeltaPruneStrength /= 2.;
         }
         previousNnodes = nnodes;
         if (troubleCount > 20) {
            if (methodIndex == 0 && fPruneStrength <=0) { //maybe you need larger stepsize ??
               fDeltaPruneStrength *= 5;
               Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
                     << " for Tree " << methodIndex
                     << " --> first try to increase the step size"
                     << " currently PruneStrength= " << alpha
                     << " stepsize " << fDeltaPruneStrength << " " << Endl;
               troubleCount = 0;   // try again
               fPruneStrength = 1; // if it was for the first time..
            } else if (methodIndex == 0 && fPruneStrength <=2) { //maybe you need much larger stepsize ??
               fDeltaPruneStrength *= 5;
               Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
                     << " for Tree " << methodIndex
                     << " --> try to increase the step size even more.. "
                     << " if that still didn't work, TRY IT BY HAND"
                     << " currently PruneStrength= " << alpha
                     << " stepsize " << fDeltaPruneStrength << " " << Endl;
               troubleCount = 0;   // try again
               fPruneStrength = 3; // if it was for the first time..
            } else {
               forceStop=kTRUE;
               Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
                     << " for Tree " << methodIndex << " at tested prune strength: " << alpha
                     << " --> abort forced, use same strength as for previous tree:"
                     << fPruneStrength << Endl;
            }
         }
         if (fgDebugLevel==1) Log() << kINFO << "Pruned with ("<<alpha
                                    << ") give quality: " << q.back()
                                    << " and #nodes: " << nnodes
                                    << Endl;
         delete dcopy;
      }
      if (!forceStop) {
         multimap<Double_t,Double_t>::reverse_iterator it=quality.rend();
         it++;
         fPruneStrength = it->second;
         // adjust the step size for the next tree.. think that 20 steps are sort of
         // fine enough.. could become a tunable option later..
         fDeltaPruneStrength *= Double_t(q.size())/20.;
      }

      fTree->SetPruneStrength(fPruneStrength);
      fTree->PruneTree();
      */
   }
   else {
      fTree->SetPruneStrength(fPruneStrength);
      fTree->PruneTree();
   }

   return fPruneStrength;
}

////////////////////////////////////////////////////////////////////////////////
/// Returns the fraction of correctly classified (weighted) events, evaluated on
/// the validation sample; used to judge the quality of a (pruned) tree.

Double_t TMVA::MethodDT::TestTreeQuality( DecisionTree *dt )
{
   Data()->SetCurrentType(Types::kValidation);
   // test the tree quality.. in terms of Misclassification
   Double_t SumCorrect=0,SumWrong=0;
   for (Long64_t ievt=0; ievt<Data()->GetNEvents(); ievt++)
      {
         const Event * ev = Data()->GetEvent(ievt);
         if ((dt->CheckEvent(ev) > dt->GetNodePurityLimit() ) == DataInfo().IsSignal(ev)) SumCorrect+=ev->GetWeight();
         else SumWrong+=ev->GetWeight();
      }
   Data()->SetCurrentType(Types::kTraining);
   return SumCorrect / (SumCorrect + SumWrong);
}

////////////////////////////////////////////////////////////////////////////////
/// Write the decision tree to the XML weight file.

void TMVA::MethodDT::AddWeightsXMLTo( void* parent ) const
{
   fTree->AddXMLTo(parent);
   //Log() << kFATAL << "Please implement writing of weights as XML" << Endl;
}

////////////////////////////////////////////////////////////////////////////////
/// Read the decision tree from the XML weight file.

void TMVA::MethodDT::ReadWeightsFromXML( void* wghtnode)
{
   if (fTree)
      delete fTree;
   fTree = new DecisionTree();
   fTree->ReadXML(wghtnode,GetTrainingTMVAVersionCode());
}

////////////////////////////////////////////////////////////////////////////////
/// Read the decision tree from a plain text weight stream.

void TMVA::MethodDT::ReadWeightsFromStream( std::istream& istr )
{
   delete fTree;
   fTree = new DecisionTree();
   fTree->Read(istr);
}

////////////////////////////////////////////////////////////////////////////////
/// returns MVA value

Double_t TMVA::MethodDT::GetMvaValue( Double_t* err, Double_t* errUpper )
{
   // cannot determine error
   NoErrorCalc(err, errUpper);

   return fTree->CheckEvent(GetEvent(),fUseYesNoLeaf);
}

////////////////////////////////////////////////////////////////////////////////

void TMVA::MethodDT::GetHelpMessage() const
{
}

////////////////////////////////////////////////////////////////////////////////

const TMVA::Ranking* TMVA::MethodDT::CreateRanking()
{
   return 0;
}