ROOT Reference Guide: TMVA::CCPruner Class Reference

A helper class to prune a decision tree using the Cost Complexity method (see Classification and Regression Trees by Leo Breiman et al.).

### Some definitions:

• $$T_{max}$$ - the initial, usually highly overtrained tree, that is to be pruned back
• $$R(T)$$ - quality index (Gini, misclassification rate, or other) of a tree $$T$$
• $$\sim T$$ - set of terminal nodes in $$T$$
• $$T'$$ - the pruned subtree of $$T_{max}$$ that has the best quality index $$R(T')$$
• $$\alpha$$ - the prune strength parameter in Cost Complexity pruning $$(R_{\alpha}(T) = R(T) + \alpha*|\sim T|)$$
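The definitions above can be made concrete with a small sketch. It is not TMVA code: the `Node` struct and the functions `CountLeaves`, `SubtreeR` and `CostComplexity` are hypothetical names chosen for illustration. It assumes that $$R(T)$$ is the sum of $$R(t)$$ over the terminal nodes, as stated in the description of `fQualityIndex`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical toy node for illustration; this is NOT TMVA::DecisionTreeNode.
struct Node {
    double r = 0.0;               // R(t): quality index of this node taken alone
    std::unique_ptr<Node> left;   // both children are null for a terminal node
    std::unique_ptr<Node> right;
    bool IsLeaf() const { return !left && !right; }
};

// |~T|: number of terminal nodes in the subtree rooted at n.
std::size_t CountLeaves(const Node& n) {
    if (n.IsLeaf()) return 1;
    std::size_t count = 0;
    if (n.left) count += CountLeaves(*n.left);
    if (n.right) count += CountLeaves(*n.right);
    return count;
}

// R(T): sum of R(t) over the terminal nodes of the subtree rooted at n.
double SubtreeR(const Node& n) {
    if (n.IsLeaf()) return n.r;
    double sum = 0.0;
    if (n.left) sum += SubtreeR(*n.left);
    if (n.right) sum += SubtreeR(*n.right);
    return sum;
}

// R_alpha(T) = R(T) + alpha * |~T|
double CostComplexity(const Node& root, double alpha) {
    assert(alpha >= 0.0);  // a negative penalty would reward larger trees
    return SubtreeR(root) + alpha * static_cast<double>(CountLeaves(root));
}
```

A larger $$\alpha$$ penalizes every terminal node more heavily, so the subtree minimizing $$R_{\alpha}$$ becomes smaller as $$\alpha$$ grows.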

There are two running modes in CCPruner: (i) one may select a prune strength and prune back the tree $$T_{max}$$ until the criterion:

$\alpha < \frac{R(t) - R(T_t)}{|\sim T_t| - 1}$

is true for all nodes $$t$$ in $$T$$ (here $$T_t$$ denotes the branch of $$T$$ rooted at $$t$$), or (ii) the algorithm finds the sequence of critical points $$\alpha_k < \alpha_{k+1} < \ldots < \alpha_K$$ such that $$T_K = \mathrm{root}(T_{max})$$ and then selects the optimally pruned subtree, defined as the subtree with the best quality index on the validation sample.
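The sequence of critical points in mode (ii) can be illustrated with a weakest-link sketch. It is a simplified illustration and not the CCPruner implementation: `Node`, `Stats`, `FindWeakest` and `CriticalAlphas` are hypothetical names, every internal node is assumed to have exactly two children, and the link strength is taken in the form given by Breiman et al., $$g(t) = (R(t) - R(T_t)) / (|\sim T_t| - 1)$$. At each step the internal node with the smallest $$g(t)$$ is collapsed into a terminal node and its $$g(t)$$ is recorded as the next critical $$\alpha$$.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <vector>

// Hypothetical toy node for illustration; this is NOT TMVA::DecisionTreeNode.
struct Node {
    double r = 0.0;                      // R(t) of this node taken alone
    std::unique_ptr<Node> left, right;   // both null for a terminal node
    bool IsLeaf() const { return !left && !right; }
};

struct SubtreeStats {
    double r;            // R(T_t): sum of R over the terminal nodes below t
    std::size_t leaves;  // |~T_t|
};

SubtreeStats Stats(const Node& n) {
    if (n.IsLeaf()) return {n.r, 1};
    assert(n.left && n.right);  // assumption: internal nodes have two children
    const SubtreeStats a = Stats(*n.left);
    const SubtreeStats b = Stats(*n.right);
    return {a.r + b.r, a.leaves + b.leaves};
}

// Locate the internal node with the smallest g(t) in the subtree rooted at n.
void FindWeakest(Node& n, Node*& weakest, double& gMin) {
    if (n.IsLeaf()) return;
    const SubtreeStats s = Stats(n);
    const double g = (n.r - s.r) / static_cast<double>(s.leaves - 1);
    if (g < gMin) {
        gMin = g;
        weakest = &n;
    }
    FindWeakest(*n.left, weakest, gMin);
    FindWeakest(*n.right, weakest, gMin);
}

// Prune the weakest link repeatedly until only the root remains and
// return the g(t) value recorded at each step.
std::vector<double> CriticalAlphas(Node& root) {
    std::vector<double> alphas;
    while (!root.IsLeaf()) {
        Node* weakest = nullptr;
        double gMin = std::numeric_limits<double>::infinity();
        FindWeakest(root, weakest, gMin);
        assert(weakest != nullptr);
        weakest->left.reset();   // collapsing the branch turns t into a terminal node
        weakest->right.reset();
        alphas.push_back(gMin);
    }
    return alphas;
}
```

The sketch recomputes the subtree statistics on every pass for clarity; a production implementation would typically cache them per node and update only the ancestors of the pruned branch.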

Definition at line 62 of file CCPruner.h.

## Public Types

typedef std::vector< Event * > EventList

## Public Member Functions

CCPruner (DecisionTree *t_max, const DataSet *validationSample, SeparationBase *qualityIndex=nullptr)
constructor

CCPruner (DecisionTree *t_max, const EventList *validationSample, SeparationBase *qualityIndex=nullptr)
constructor

~CCPruner ()

std::vector< TMVA::DecisionTreeNode * > GetOptimalPruneSequence () const
return the sequence of nodes to prune that corresponds to the optimal prune strength (=alpha)

Float_t GetOptimalPruneStrength () const

Float_t GetOptimalQualityIndex () const

void Optimize ()
determine the pruning sequence

void SetPruneStrength (Float_t alpha=-1.0)

## Private Attributes

Float_t fAlpha
regularization parameter in CC pruning

Bool_t fDebug
debug flag

Int_t fOptimalK
index of the optimal tree in the pruned tree sequence

Bool_t fOwnQIndex
flag indicating whether fQualityIndex is owned by this object

std::vector< TMVA::DecisionTreeNode * > fPruneSequence
map of weakest links (i.e., branches to prune) -> pruning index

std::vector< Float_t > fPruneStrengthList
map of alpha -> pruning index

SeparationBase * fQualityIndex
the quality index used to calculate R(t), R(T) = sum[t in ~T]{ R(t) }

DecisionTree * fTree
(pruned) decision tree

const DataSet * fValidationDataSet
the event sample to select the optimally-pruned tree

const EventList * fValidationSample
the event sample to select the optimally-pruned tree

#include <TMVA/CCPruner.h>

## ◆ EventList

 typedef std::vector< Event * > TMVA::CCPruner::EventList

Definition at line 64 of file CCPruner.h.

## ◆ CCPruner() [1/2]

 CCPruner::CCPruner ( DecisionTree * t_max, const EventList * validationSample, SeparationBase * qualityIndex = nullptr )

constructor

Definition at line 69 of file CCPruner.cxx.

## ◆ CCPruner() [2/2]

 CCPruner::CCPruner ( DecisionTree * t_max, const DataSet * validationSample, SeparationBase * qualityIndex = nullptr )

constructor

Definition at line 92 of file CCPruner.cxx.

## ◆ ~CCPruner()

 CCPruner::~CCPruner ( )

Definition at line 115 of file CCPruner.cxx.

## ◆ GetOptimalPruneSequence()

 std::vector< DecisionTreeNode * > CCPruner::GetOptimalPruneSequence ( ) const

return the sequence of nodes to prune that corresponds to the optimal prune strength (=alpha)

Definition at line 240 of file CCPruner.cxx.

## ◆ GetOptimalPruneStrength()

 Float_t TMVA::CCPruner::GetOptimalPruneStrength ( ) const
inline

Definition at line 89 of file CCPruner.h.

## ◆ GetOptimalQualityIndex()

 Float_t TMVA::CCPruner::GetOptimalQualityIndex ( ) const
inline

Definition at line 85 of file CCPruner.h.

## ◆ Optimize()

 void CCPruner::Optimize ( )

determine the pruning sequence

Definition at line 124 of file CCPruner.cxx.
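In mode (ii) of the class description, the optimally pruned subtree is the one with the best quality index on the validation sample. The sketch below illustrates only that selection step and is not the body of `Optimize()`: `PruneChoice` and `SelectOptimal` are hypothetical names, the two input vectors stand in for per-step lists such as `fPruneStrengthList` and `fQualityIndexList`, and a smaller quality index is assumed to be better (as for a misclassification rate).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: index k of the chosen tree in the pruning
// sequence, with the prune strength and quality index recorded at that step.
struct PruneChoice {
    std::size_t k;
    float alpha;
    float quality;
};

// Pick the pruning step whose validation quality index is smallest.
// Assumption: lower is better. Ties keep the earlier (less pruned) step.
PruneChoice SelectOptimal(const std::vector<float>& alphas,
                          const std::vector<float>& qualities) {
    assert(!alphas.empty() && alphas.size() == qualities.size());
    std::size_t best = 0;
    for (std::size_t k = 1; k < qualities.size(); ++k) {
        if (qualities[k] < qualities[best]) best = k;
    }
    return {best, alphas[best], qualities[best]};
}
```

The returned index plays the role that `fOptimalK` has in the class: it identifies which entries of the prune sequence belong to the optimally pruned tree.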

## ◆ SetPruneStrength()

 void TMVA::CCPruner::SetPruneStrength ( Float_t alpha = -1.0 )
inline

Definition at line 110 of file CCPruner.h.

## ◆ fAlpha

 Float_t TMVA::CCPruner::fAlpha
private

regularization parameter in CC pruning

Definition at line 93 of file CCPruner.h.

## ◆ fDebug

 Bool_t TMVA::CCPruner::fDebug
private

debug flag

Definition at line 106 of file CCPruner.h.

## ◆ fOptimalK

 Int_t TMVA::CCPruner::fOptimalK
private

index of the optimal tree in the pruned tree sequence

Definition at line 105 of file CCPruner.h.

## ◆ fOwnQIndex

 Bool_t TMVA::CCPruner::fOwnQIndex
private

flag indicating whether fQualityIndex is owned by this object

Definition at line 97 of file CCPruner.h.

## ◆ fPruneSequence

 std::vector< TMVA::DecisionTreeNode * > TMVA::CCPruner::fPruneSequence
private

map of weakest links (i.e., branches to prune) -> pruning index

Definition at line 101 of file CCPruner.h.

## ◆ fPruneStrengthList

 std::vector< Float_t > TMVA::CCPruner::fPruneStrengthList
private

map of alpha -> pruning index

Definition at line 102 of file CCPruner.h.

## ◆ fQualityIndex

 SeparationBase* TMVA::CCPruner::fQualityIndex
private

the quality index used to calculate R(t), R(T) = sum[t in ~T]{ R(t) }

Definition at line 96 of file CCPruner.h.

## ◆ fQualityIndexList

 std::vector< Float_t > TMVA::CCPruner::fQualityIndexList
private

map of R(T) -> pruning index

Definition at line 103 of file CCPruner.h.

## ◆ fTree

 DecisionTree* TMVA::CCPruner::fTree
private

(pruned) decision tree

Definition at line 99 of file CCPruner.h.

## ◆ fValidationDataSet

 const DataSet* TMVA::CCPruner::fValidationDataSet
private

the event sample to select the optimally-pruned tree

Definition at line 95 of file CCPruner.h.

## ◆ fValidationSample

 const EventList* TMVA::CCPruner::fValidationSample
private

the event sample to select the optimally-pruned tree

Definition at line 94 of file CCPruner.h.

The documentation for this class was generated from the following files: