MethodDT.cxx
// @(#)root/tmva $Id$
// Author: Andreas Hoecker, Joerg Stelzer, Helge Voss, Kai Voss

/**********************************************************************************
 * Project: TMVA - a Root-integrated toolkit for multivariate data analysis      *
 * Package: TMVA                                                                  *
 * Class  : MethodDT (DT = Decision Trees)                                        *
 * Web    : http://tmva.sourceforge.net                                           *
 *                                                                                *
 * Description:                                                                   *
 *      Analysis of Boosted Decision Trees                                        *
 *                                                                                *
 * Authors (alphabetical):                                                        *
 *      Andreas Hoecker <Andreas.Hocker@cern.ch> - CERN, Switzerland              *
 *      Helge Voss      <Helge.Voss@cern.ch>     - MPI-K Heidelberg, Germany      *
 *      Or Cohen        <orcohenor@gmail.com>    - Weizmann Inst., Israel         *
 *                                                                                *
 * Copyright (c) 2005:                                                            *
 *      CERN, Switzerland                                                         *
 *      MPI-K Heidelberg, Germany                                                 *
 *                                                                                *
 * Redistribution and use in source and binary forms, with or without            *
 * modification, are permitted according to the terms listed in LICENSE          *
 * (http://tmva.sourceforge.net/LICENSE)                                          *
 **********************************************************************************/

/*! \class TMVA::MethodDT
\ingroup TMVA

Analysis of Boosted Decision Trees

Boosted decision trees have been successfully used in High Energy
Physics analyses, for example by the MiniBooNE experiment
(Yang-Roe-Zhu, physics/0508045). In Boosted Decision Trees, the
selection is based on a majority vote over the results of several decision
trees, which are all derived from the same training sample by
supplying different event weights during the training.

### Decision trees:

Successive decision nodes are used to categorize the
events of the sample as either signal or background. Each node
uses only a single discriminating variable to decide if the event is
signal-like ("goes right") or background-like ("goes left"). This
forms a tree-like structure with "baskets" at the end (leaf nodes),
and an event is classified as either signal or background according to
whether the basket where it ends up was classified as signal or
background during the training. Training a decision tree is the
process of defining the "cut criteria" for each node. The training
starts with the root node: one takes the full training event
sample and selects the variable and corresponding cut value that give
the best separation between signal and background at this stage. Using
this cut criterion, the sample is then divided into two subsamples, a
signal-like (right) and a background-like (left) sample. Two new nodes
are then created, one for each of the two subsamples, and they are
constructed using the same mechanism as described for the root
node. The division is stopped once a node has reached either a
minimum number of events or a minimum or maximum signal purity. These
leaf nodes are then called "signal" or "background" depending on
whether they contain more signal or background events from the
training sample.
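
The per-node cut search can be pictured with the following stand-alone
toy example. This is an illustrative sketch only, not the TMVA
implementation (TMVA scans a configurable number of candidate cut values
per input variable, cf. the nCuts option, and supports several
separation criteria):

~~~ {.cpp}
#include <cstdio>
#include <vector>

// Toy event: one discriminating variable plus a signal/background label.
struct Ev { double x; bool isSignal; };

// Gini index p*(1-p) of a (sub)sample; smaller values mean purer samples.
double gini(double nSig, double nBkg) {
   double n = nSig + nBkg;
   if (n <= 0) return 0;
   double p = nSig / n;
   return p * (1 - p);
}

int main() {
   std::vector<Ev> sample = {{0.1,false},{0.3,false},{0.4,true},{0.8,true},{0.9,true}};
   double bestCut = 0, bestGain = -1;
   for (double cut = 0.05; cut < 1.0; cut += 0.05) {   // scan candidate cut values
      double sR = 0, bR = 0, sL = 0, bL = 0;
      for (const Ev& e : sample) {
         double& cnt = (e.x > cut) ? (e.isSignal ? sR : bR)
                                   : (e.isSignal ? sL : bL);
         cnt += 1;
      }
      // separation gain: parent impurity minus weighted impurity of the children
      double n    = sample.size();
      double gain = gini(sR + sL, bR + bL)
                  - (sR + bR) / n * gini(sR, bR)
                  - (sL + bL) / n * gini(sL, bL);
      if (gain > bestGain) { bestGain = gain; bestCut = cut; }
   }
   std::printf("best cut: x > %.2f (gain %.3f)\n", bestCut, bestGain);
   return 0;
}
~~~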

### Boosting:

The idea behind boosting is that signal events from the training
sample that end up in a background node (and vice versa) are given a
larger weight than events that are in the correct leaf node. This
results in a re-weighted training event sample, with which a new
decision tree can then be developed. The boosting can be applied several
times (typically 100-500 times) and one ends up with a set of decision
trees (a forest).
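
In the common AdaBoost prescription, for example, the events misclassified
by the previous tree have their weights multiplied by the factor
\f$ \alpha = (1-\mathrm{err})/\mathrm{err} \f$, where err is that tree's
weighted misclassification rate, and the sample is then renormalised.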

### Bagging:

In this particular variant of the Boosted Decision Trees the boosting
is not done on the basis of previous training results, but by a simple
stochastic re-sampling of the initial training event sample.
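
In practice such resampling with replacement is often implemented by
assigning each training event a random weight, drawn for example from a
Poisson distribution with unit mean, rather than by copying events.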

### Analysis:

Applying an individual decision tree to a test event results in a
classification of the event as either signal or background. For the
boosted decision tree selection, an event is successively subjected to
the whole set of decision trees, and depending on how often it is
classified as signal, a "likelihood" estimator is constructed for the
event being signal or background. The value of this estimator is
then used to select the events from an event sample, and
the cut value on this estimator defines the efficiency and purity of
the selection.
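
A typical way to use this classifier is to book it via the TMVA Factory.
The following snippet is an illustrative sketch only: the factory and
data-loader objects and all option values are placeholders in the style of
the standard TMVA classification tutorials, not part of this class:

~~~ {.cpp}
// Book a single decision tree; option values are examples only.
factory->BookMethod( dataloader, TMVA::Types::kDT, "DT",
                     "SeparationType=GiniIndex:MinNodeSize=5%:nCuts=20:"
                     "PruneMethod=CostComplexity:PruneStrength=-1:MaxDepth=3" );
~~~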
*/

#include "TMVA/MethodDT.h"

#include "TMVA/BinarySearchTree.h"
#include "TMVA/CCPruner.h"
#include "TMVA/ClassifierFactory.h"
#include "TMVA/Configurable.h"
#include "TMVA/CrossEntropy.h"
#include "TMVA/DataSet.h"
#include "TMVA/DecisionTree.h"
#include "TMVA/GiniIndex.h"
#include "TMVA/IMethod.h"
#include "TMVA/MethodBase.h"
#include "TMVA/MethodBoost.h"
#include "TMVA/MisClassificationError.h"
#include "TMVA/MsgLogger.h"
#include "TMVA/Ranking.h"
#include "TMVA/SdivSqrtSplusB.h"
#include "TMVA/SeparationBase.h"
#include "TMVA/Timer.h"
#include "TMVA/Tools.h"
#include "TMVA/Types.h"

#include "TRandom3.h"

#include <iostream>
#include <algorithm>

using std::vector;

REGISTER_METHOD(DT)

ClassImp(TMVA::MethodDT);

////////////////////////////////////////////////////////////////////////////////
/// the standard constructor for an ordinary "decision tree"

TMVA::MethodDT::MethodDT( const TString& jobName,
                          const TString& methodTitle,
                          DataSetInfo& theData,
                          const TString& theOption) :
   TMVA::MethodBase( jobName, Types::kDT, methodTitle, theData, theOption)
   , fTree(0)
   , fSepType(0)
   , fMinNodeEvents(0)
   , fMinNodeSize(0)
   , fNCuts(0)
   , fUseYesNoLeaf(kFALSE)
   , fNodePurityLimit(0)
   , fMaxDepth(0)
   , fErrorFraction(0)
   , fPruneStrength(0)
   , fPruneMethod(DecisionTree::kNoPruning)
   , fAutomatic(kFALSE)
   , fRandomisedTrees(kFALSE)
   , fUseNvars(0)
   , fUsePoissonNvars(0) // don't use this initialisation, only here to make Coverity happy. Is set in Init()
   , fDeltaPruneStrength(0)
{
}

////////////////////////////////////////////////////////////////////////////////
/// constructor from Reader

TMVA::MethodDT::MethodDT( DataSetInfo& dsi,
                          const TString& theWeightFile) :
   TMVA::MethodBase( Types::kDT, dsi, theWeightFile)
   , fTree(0)
   , fSepType(0)
   , fMinNodeEvents(0)
   , fMinNodeSize(0)
   , fNCuts(0)
   , fUseYesNoLeaf(kFALSE)
   , fNodePurityLimit(0)
   , fMaxDepth(0)
   , fErrorFraction(0)
   , fPruneStrength(0)
   , fPruneMethod(DecisionTree::kNoPruning)
   , fAutomatic(kFALSE)
   , fRandomisedTrees(kFALSE)
   , fUseNvars(0)
   , fDeltaPruneStrength(0)
{
}

////////////////////////////////////////////////////////////////////////////////
/// DT can handle classification with 2 classes only

Bool_t TMVA::MethodDT::HasAnalysisType( Types::EAnalysisType type, UInt_t numberClasses, UInt_t /*numberTargets*/ )
{
   if( type == Types::kClassification && numberClasses == 2 ) return kTRUE;
   return kFALSE;
}

////////////////////////////////////////////////////////////////////////////////
/// Define the options (their key words) that can be set in the option string.
///
/// - UseRandomisedTrees  choose at each node splitting a random set of variables
/// - UseNvars  use UseNvars variables in randomised trees
/// - SeparationType  the separation criterion applied in the node splitting.
///   known:
///   - GiniIndex
///   - MisClassificationError
///   - CrossEntropy
///   - SDivSqrtSPlusB
/// - nEventsMin: the minimum number of events in a node (leaf criterion, stop splitting)
/// - nCuts: the number of steps in the optimisation of the cut for a node (if < 0, then
///   the step size is determined by the events)
/// - UseYesNoLeaf  decide if the classification is done simply by the node type, or by the S/B
///   (from the training) in the leaf node
/// - NodePurityLimit  the minimum purity to classify a node as a signal node (used in pruning
///   and boosting to determine the misclassification error rate)
/// - PruneMethod  the pruning method:
///   known:
///   - NoPruning  // switch off pruning completely
///   - ExpectedError
///   - CostComplexity
/// - PruneStrength  a parameter to adjust the amount of pruning. Should be large enough
///   such that overtraining is avoided.
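///
/// An option string combining several of these settings could, for example,
/// read: "SeparationType=GiniIndex:nCuts=20:PruneMethod=CostComplexity:PruneStrength=-1"
/// (the values shown are illustrative, not defaults).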

void TMVA::MethodDT::DeclareOptions()
{
   DeclareOptionRef(fRandomisedTrees,"UseRandomisedTrees","Choose at each node splitting a random set of variables and *bagging*");
   DeclareOptionRef(fUseNvars,"UseNvars","Number of variables used if randomised Tree option is chosen");
   DeclareOptionRef(fUsePoissonNvars,"UsePoissonNvars", "Interpret \"UseNvars\" not as fixed number but as mean of a Poisson distribution in each split with RandomisedTree option");
   DeclareOptionRef(fUseYesNoLeaf=kTRUE, "UseYesNoLeaf",
                    "Use Sig or Bkg node type or the ratio S/B as classification in the leaf node");
   DeclareOptionRef(fNodePurityLimit=0.5, "NodePurityLimit", "In boosting/pruning, nodes with purity > NodePurityLimit are signal; background otherwise.");
   DeclareOptionRef(fSepTypeS="GiniIndex", "SeparationType", "Separation criterion for node splitting");
   AddPreDefVal(TString("MisClassificationError"));
   AddPreDefVal(TString("GiniIndex"));
   AddPreDefVal(TString("CrossEntropy"));
   AddPreDefVal(TString("SDivSqrtSPlusB"));
   DeclareOptionRef(fMinNodeEvents=-1, "nEventsMin", "deprecated !!! Minimum number of events required in a leaf node");
   DeclareOptionRef(fMinNodeSizeS, "MinNodeSize", "Minimum percentage of training events required in a leaf node (default: Classification: 10%, Regression: 1%)");
   DeclareOptionRef(fNCuts, "nCuts", "Number of steps during node cut optimisation");
   DeclareOptionRef(fPruneStrength, "PruneStrength", "Pruning strength (negative value == automatic adjustment)");
   DeclareOptionRef(fPruneMethodS="NoPruning", "PruneMethod", "Pruning method: NoPruning (switched off), ExpectedError or CostComplexity");

   AddPreDefVal(TString("NoPruning"));
   AddPreDefVal(TString("ExpectedError"));
   AddPreDefVal(TString("CostComplexity"));

   if (DoRegression()) {
      DeclareOptionRef(fMaxDepth=50,"MaxDepth","Max depth of the decision tree allowed");
   }
   else {
      DeclareOptionRef(fMaxDepth=3,"MaxDepth","Max depth of the decision tree allowed");
   }
}

////////////////////////////////////////////////////////////////////////////////
/// options that are used ONLY for the READER to ensure backward compatibility

void TMVA::MethodDT::DeclareCompatibilityOptions()
{
   MethodBase::DeclareCompatibilityOptions();

   DeclareOptionRef(fPruneBeforeBoost=kFALSE, "PruneBeforeBoost",
                    "--> removed option .. only kept for reader backward compatibility");
}

////////////////////////////////////////////////////////////////////////////////
/// the option string is decoded; for available options see "DeclareOptions"

void TMVA::MethodDT::ProcessOptions()
{
   fSepTypeS.ToLower();
   if      (fSepTypeS == "misclassificationerror") fSepType = new MisClassificationError();
   else if (fSepTypeS == "giniindex")              fSepType = new GiniIndex();
   else if (fSepTypeS == "crossentropy")           fSepType = new CrossEntropy();
   else if (fSepTypeS == "sdivsqrtsplusb")         fSepType = new SdivSqrtSplusB();
   else {
      Log() << kINFO << GetOptions() << Endl;
      Log() << kFATAL << "<ProcessOptions> unknown Separation Index option called" << Endl;
   }

   // std::cout << "fSeptypes " << fSepTypeS << " fseptype " << fSepType << std::endl;

   fPruneMethodS.ToLower();
   if      (fPruneMethodS == "expectederror" )  fPruneMethod = DecisionTree::kExpectedErrorPruning;
   else if (fPruneMethodS == "costcomplexity" ) fPruneMethod = DecisionTree::kCostComplexityPruning;
   else if (fPruneMethodS == "nopruning" )      fPruneMethod = DecisionTree::kNoPruning;
   else {
      Log() << kINFO << GetOptions() << Endl;
      Log() << kFATAL << "<ProcessOptions> unknown PruneMethod option: " << fPruneMethodS << " called" << Endl;
   }

   if (fPruneStrength < 0) fAutomatic = kTRUE;
   else fAutomatic = kFALSE;
   if (fAutomatic && fPruneMethod == DecisionTree::kExpectedErrorPruning){
      Log() << kFATAL
            << "Sorry, automatic pruning strength determination is not implemented yet for ExpectedErrorPruning" << Endl;
   }

   if (this->Data()->HasNegativeEventWeights()){
      Log() << kINFO << " You are using a Monte Carlo sample that also contains negative weights. "
            << "That should in principle be fine, as long as on average you end up with "
            << "something positive. For this you have to make sure that the minimal number "
            << "of (unweighted) events required in a tree node (currently MinNodeSize="
            << fMinNodeSizeS
            << ", or the deprecated equivalent nEventsMin; you can set this via the "
            << "MethodDT option string when booking the "
            << "classifier) is large enough to allow for reasonable averaging! "
            << "If this does not help, you may want to try the option IgnoreNegWeightsInTraining, "
            << "which ignores events with negative weight in the training. " << Endl
            << Endl << "Note: You'll get a WARNING message during the training if that should ever happen" << Endl;
   }

   if (fRandomisedTrees){
      Log() << kINFO << " Randomised trees should use *bagging* as *boost* method. Did you set this in the *MethodBoost*? Here I can only enforce *no pruning*." << Endl;
      fPruneMethod = DecisionTree::kNoPruning;
      //  fBoostType   = "Bagging";
   }

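   // e.g. nEventsMin = 50 with 5000 training events corresponds to MinNodeSize = 1%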
   if (fMinNodeEvents > 0){
      // 100.0 forces floating-point arithmetic; plain integer division would truncate to zero
      fMinNodeSize = 100.0 * fMinNodeEvents / Data()->GetNTrainingEvents();
      Log() << kWARNING << "You have explicitly set *nEventsMin*, the min absolute number \n"
            << "of events in a leaf node. This is DEPRECATED, please use the option \n"
            << "*MinNodeSize* giving the relative number as percentage of training \n"
            << "events instead. \n"
            << "nEventsMin=" << fMinNodeEvents << " --> MinNodeSize=" << fMinNodeSize << "%"
            << Endl;
   }
   else {
      SetMinNodeSize(fMinNodeSizeS);
   }
}

void TMVA::MethodDT::SetMinNodeSize(Double_t sizeInPercent)
{
   if (sizeInPercent > 0 && sizeInPercent < 50){
      fMinNodeSize = sizeInPercent;
   }
   else {
      Log() << kERROR << "you have demanded a minimal node size of "
            << sizeInPercent << "% of the training events, which somehow "
            << "does not make sense" << Endl;
   }
}
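
// Note: the TString variant below accepts values like "5%" or "5"; the '%'
// sign is stripped before the numeric conversion, so SetMinNodeSize("5%")
// sets fMinNodeSize to 5 (per cent).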
void TMVA::MethodDT::SetMinNodeSize(TString sizeInPercent)
{
   sizeInPercent.ReplaceAll("%","");
   // IsFloat() also accepts decimal values such as "2.5", which IsAlnum() would reject
   if (sizeInPercent.IsFloat()) SetMinNodeSize(sizeInPercent.Atof());
   else {
      Log() << kERROR << "I had problems reading the option MinNodeSize, which\n"
            << "after removing a possible % sign now reads " << sizeInPercent << Endl;
   }
}

////////////////////////////////////////////////////////////////////////////////
/// common initialisation with defaults for the DT-Method

void TMVA::MethodDT::Init( void )
{
   fMinNodeEvents      = -1;
   fMinNodeSize        = 5;
   fMinNodeSizeS       = "5%";
   fNCuts              = 20;
   fPruneMethod        = DecisionTree::kNoPruning;
   fPruneStrength      = 5; // -1 means automatic determination of the prune strength using a validation sample
   fDeltaPruneStrength = 0.1;
   fRandomisedTrees    = kFALSE;
   fUseNvars           = GetNvar();
   fUsePoissonNvars    = kTRUE;

   // reference cut value to distinguish signal-like from background-like events
   SetSignalReferenceCut( 0 );
   if (fAnalysisType == Types::kClassification || fAnalysisType == Types::kMulticlass ) {
      fMaxDepth = 3;
   }
   else {
      fMaxDepth = 50;
   }
}

////////////////////////////////////////////////////////////////////////////////
/// destructor

TMVA::MethodDT::~MethodDT( void )
{
   delete fTree;
}

////////////////////////////////////////////////////////////////////////////////

void TMVA::MethodDT::Train( void )
{
   DecisionTreeNode::SetIsTraining(true);
   fTree = new DecisionTree( fSepType, fMinNodeSize, fNCuts, &(DataInfo()), 0,
                             fRandomisedTrees, fUseNvars, fUsePoissonNvars, fMaxDepth, 0 );
   fTree->SetNVars(GetNvar());
   if (fRandomisedTrees) Log() << kWARNING << " randomised trees do not work yet in this framework,"
                               << " as I do not know how to give each tree a new random seed; now they"
                               << " will all be the same, and that is not good " << Endl;
   fTree->SetAnalysisType( GetAnalysisType() );

   //fTree->BuildTree(GetEventCollection(Types::kTraining));
   Data()->SetCurrentType(Types::kTraining);
   UInt_t nevents = Data()->GetNTrainingEvents();
   std::vector<const TMVA::Event*> tmp;
   for (Long64_t ievt=0; ievt<nevents; ievt++) {
      const Event *event = GetEvent(ievt);
      tmp.push_back(event);
   }
   fTree->BuildTree(tmp);
   if (fPruneMethod != DecisionTree::kNoPruning) fTree->PruneTree();

   DecisionTreeNode::SetIsTraining(false);
   ExitFromTraining();
}

////////////////////////////////////////////////////////////////////////////////
/// prune the decision tree if requested: good for individual trees that are best grown
/// out and then pruned back, while boosted decision trees are best as 'small' trees to
/// start with; at least the standard "optimal pruning algorithms" don't result in
/// 'weak enough' classifiers otherwise!

Double_t TMVA::MethodDT::PruneTree( )
{
   // remember the number of nodes beforehand (for monitoring purposes)

   if (fAutomatic && fPruneMethod == DecisionTree::kCostComplexityPruning) { // automatic cost complexity pruning
      CCPruner* pruneTool = new CCPruner(fTree, this->Data() , fSepType);
      pruneTool->Optimize();
      std::vector<DecisionTreeNode*> nodes = pruneTool->GetOptimalPruneSequence();
      fPruneStrength = pruneTool->GetOptimalPruneStrength();
      for(UInt_t i = 0; i < nodes.size(); i++)
         fTree->PruneNode(nodes[i]);
      delete pruneTool;
   }
   else if (fAutomatic && fPruneMethod != DecisionTree::kCostComplexityPruning){
      /*
      Double_t alpha = 0;
      Double_t delta = fDeltaPruneStrength;

      DecisionTree* dcopy;
      std::vector<Double_t> q;
      multimap<Double_t,Double_t> quality;
      Int_t nnodes = fTree->GetNNodes();

      // find the maximum prune strength that still leaves some nodes
      Bool_t forceStop = kFALSE;
      Int_t troubleCount = 0, previousNnodes = nnodes;

      nnodes = fTree->GetNNodes();
      while (nnodes > 3 && !forceStop) {
         dcopy = new DecisionTree(*fTree);
         dcopy->SetPruneStrength(alpha += delta);
         dcopy->PruneTree();
         q.push_back(TestTreeQuality(dcopy));
         quality.insert(std::pair<const Double_t,Double_t>(q.back(),alpha));
         nnodes = dcopy->GetNNodes();
         if (previousNnodes == nnodes) troubleCount++;
         else {
            troubleCount = 0; // reset counter
            if (nnodes < previousNnodes / 2 ) fDeltaPruneStrength /= 2.;
         }
         previousNnodes = nnodes;
         if (troubleCount > 20) {
            if (methodIndex == 0 && fPruneStrength <= 0) { // maybe you need a larger step size ??
               fDeltaPruneStrength *= 5;
               Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
                     << " for Tree " << methodIndex
                     << " --> first try to increase the step size"
                     << " currently PruneStrength= " << alpha
                     << " stepsize " << fDeltaPruneStrength << " " << Endl;
               troubleCount = 0;   // try again
               fPruneStrength = 1; // if it was for the first time..
            } else if (methodIndex == 0 && fPruneStrength <= 2) { // maybe you need a much larger step size ??
               fDeltaPruneStrength *= 5;
               Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
                     << " for Tree " << methodIndex
                     << " --> try to increase the step size even more.. "
                     << " if that still doesn't work, TRY IT BY HAND"
                     << " currently PruneStrength= " << alpha
                     << " stepsize " << fDeltaPruneStrength << " " << Endl;
               troubleCount = 0;   // try again
               fPruneStrength = 3; // if it was for the first time..
            } else {
               forceStop = kTRUE;
               Log() << kINFO << "<PruneTree> trouble determining optimal prune strength"
                     << " for Tree " << methodIndex << " at tested prune strength: " << alpha
                     << " --> abort forced, use same strength as for previous tree:"
                     << fPruneStrength << Endl;
            }
         }
         if (fgDebugLevel==1) Log() << kINFO << "Pruned with (" << alpha
                                    << ") gives quality: " << q.back()
                                    << " and #nodes: " << nnodes
                                    << Endl;
         delete dcopy;
      }
      if (!forceStop) {
         // best quality corresponds to the largest key in the multimap, i.e. rbegin()
         multimap<Double_t,Double_t>::reverse_iterator it = quality.rbegin();
         fPruneStrength = it->second;
         // adjust the step size for the next tree.. think that 20 steps are sort of
         // fine enough.. could become a tunable option later..
         fDeltaPruneStrength *= Double_t(q.size())/20.;
      }

      fTree->SetPruneStrength(fPruneStrength);
      fTree->PruneTree();
      */
   }
   else {
      fTree->SetPruneStrength(fPruneStrength);
      fTree->PruneTree();
   }

   return fPruneStrength;
}

////////////////////////////////////////////////////////////////////////////////

Double_t TMVA::MethodDT::TestTreeQuality( DecisionTree *dt )
{
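   // Figure of merit: the event-weighted fraction of validation events the tree
   // classifies correctly (1 = perfect separation, ~0.5 = random guessing).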
   Data()->SetCurrentType(Types::kValidation);
   // test the tree quality.. in terms of Misclassification
   Double_t SumCorrect=0, SumWrong=0;
   for (Long64_t ievt=0; ievt<Data()->GetNEvents(); ievt++)
   {
      const Event * ev = Data()->GetEvent(ievt);
      if ((dt->CheckEvent(ev) > dt->GetNodePurityLimit() ) == DataInfo().IsSignal(ev)) SumCorrect+=ev->GetWeight();
      else SumWrong+=ev->GetWeight();
   }
   Data()->SetCurrentType(Types::kTraining);
   return SumCorrect / (SumCorrect + SumWrong);
}

////////////////////////////////////////////////////////////////////////////////

void TMVA::MethodDT::AddWeightsXMLTo( void* parent ) const
{
   fTree->AddXMLTo(parent);
   //Log() << kFATAL << "Please implement writing of weights as XML" << Endl;
}

////////////////////////////////////////////////////////////////////////////////

void TMVA::MethodDT::ReadWeightsFromXML( void* wghtnode)
{
   if (fTree)
      delete fTree;
   fTree = new DecisionTree();
   fTree->ReadXML(wghtnode,GetTrainingTMVAVersionCode());
}

////////////////////////////////////////////////////////////////////////////////

void TMVA::MethodDT::ReadWeightsFromStream( std::istream& istr )
{
   delete fTree;
   fTree = new DecisionTree();
   fTree->Read(istr);
}

////////////////////////////////////////////////////////////////////////////////
/// returns MVA value

Double_t TMVA::MethodDT::GetMvaValue( Double_t* err, Double_t* errUpper )
{
   // cannot determine error
   NoErrorCalc(err, errUpper);

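   // Depending on fUseYesNoLeaf, CheckEvent returns either the node type of the
   // final leaf or its signal purity (see DecisionTree::CheckEvent).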
   return fTree->CheckEvent(GetEvent(),fUseYesNoLeaf);
}

////////////////////////////////////////////////////////////////////////////////

void TMVA::MethodDT::GetHelpMessage() const
{
}

////////////////////////////////////////////////////////////////////////////////

const TMVA::Ranking* TMVA::MethodDT::CreateRanking()
{
   return 0;
}