Principal Components Analysis (PCA)

The current implementation is based on the LINTRA package from CERNLIB by R. Brun, H. Hansroul, and J. Kubler. The class has been implemented by Christian Holm Christensen in August 2000.

Introduction

In many fields of research, the treatment of large amounts of data requires powerful techniques capable of rapid data reduction and analysis. Usually, the quantities most conveniently measured by the experimentalist are not necessarily the most significant for classification and analysis of the data. It is then useful to have a way of selecting an optimal set of variables necessary for the recognition process and reducing the dimensionality of the problem, resulting in an easier classification procedure.

This paper describes the implementation of one such method of feature selection, namely principal components analysis. This multidimensional technique is well known in the field of pattern recognition, and its use in Particle Physics has been documented elsewhere (cf. H. Wind, Function Parameterization, CERN 72-21).

Overview

Suppose we have prototypes which are trajectories of particles passing through a spectrometer. If one measures the passage of the particle at, say, 8 fixed planes, the trajectory is described by an 8-component vector:

$x = (x_{0}, x_{1}, \dots, x_{7})$

in 8-dimensional pattern space.

One proceeds by generating a representative sample of tracks and building up the covariance matrix $C$. Its eigenvectors and eigenvalues are computed by standard methods, and thus a new basis is obtained for the original 8-dimensional space. The expansion of the prototypes,

$x_{m} = \sum_{i = 0}^{7} a_{m_{i}} e_{i} \quad \text{where} \quad a_{m_{i}} = x_{m}^{T} \cdot e_{i}$

allows the study of the behavior of the coefficients $a_{m_{i}}$ for all the tracks of the sample. The eigenvectors which are insignificant for the trajectory description in the expansion will have their corresponding coefficients $a_{m_{i}}$ close to zero for all the prototypes.

On one hand, a reduction of the dimensionality is then obtained by omitting these least significant vectors in the subsequent analysis.

On the other hand, in the analysis of real data, these least significant variables(?) can be used for the pattern recognition problem of extracting the valid combinations of coordinates describing a true trajectory from the set of all possible wrong combinations.

The program described here performs this principal components analysis on a sample of data provided by the user. It computes the covariance matrix, its eigenvalues and corresponding eigenvectors, and exhibits the behavior of the principal components $a_{m_{i}}$, thus giving the user the means to understand their data.

Principal Components Method

Let's consider a sample of $M$ prototypes each being characterized by $P$ variables $x_{0}, x_{1}, \dots, x_{P - 1}$ . Each prototype is a point, or a column vector, in a $P$ -dimensional Pattern space.

$x = \begin{bmatrix} x_{0} \\ x_{1} \\ \vdots \\ x_{P - 1} \end{bmatrix},$

where each $x_{n}$ represents the particular value associated with the $n$-th dimension.

Those $P$ variables are the quantities accessible to the experimentalist, but are not necessarily the most significant for the classification purpose.

The Principal Components Method consists of applying a linear transformation to the original variables. This transformation is described by an orthogonal matrix and is equivalent to a rotation of the original pattern space into a new set of coordinate vectors, which hopefully provide easier feature identification and dimensionality reduction.

Let's define the covariance matrix:

$C = \langle y y^{T} \rangle \quad \text{where} \quad y = x - \langle x \rangle,$

and the brackets indicate mean value over the sample of $M$ prototypes.

This matrix $C$ is real, positive definite, and symmetric, and all its eigenvalues are greater than zero. It will now be shown that, among the family of all the complete orthonormal bases of the pattern space, the basis formed by the eigenvectors of the covariance matrix belonging to the largest eigenvalues corresponds to the most significant features of the description of the original prototypes.

Let the prototypes be expanded onto a set of $N$ basis vectors $e_{n}, n = 0, \dots, N, N + 1, \dots, P - 1$:

$y_{i} = \sum_{n = 0}^{N} a_{i_{n}} e_{n}, \quad i = 1, \dots, M, \quad N < P - 1$

The ‘best’ feature coordinates $e_{n}$ , spanning a feature space, will be obtained by minimizing the error due to this truncated expansion, i.e.,

$\min (E_{N}) = \min [\langle {(y_{i} - \sum_{n = 0}^{N} a_{i_{n}} e_{n})}^{2} \rangle]$

with the conditions:

$e_{k} \cdot e_{j} = \delta_{j k} = \begin{cases} 1 & \text{for } k = j \\ 0 & \text{for } k \neq j \end{cases}$

Multiplying the expansion (3) of $y_{i}$ by $e_{n}^{T}$ and using the orthonormality condition (5), we get

$a_{i_{n}} = y_{i}^{T} ∙ e_{n},$

so the error becomes

$\begin{array}{rcl} E_{N} & = & \langle {[\sum_{n = N + 1}^{P - 1} a_{i_{n}} e_{n}]}^{2} \rangle \\ & = & \langle {[\sum_{n = N + 1}^{P - 1} (y_{i}^{T} \cdot e_{n}) e_{n}]}^{2} \rangle \\ & = & \langle \sum_{n = N + 1}^{P - 1} e_{n}^{T} y_{i} y_{i}^{T} e_{n} \rangle \\ & = & \sum_{n = N + 1}^{P - 1} e_{n}^{T} C e_{n} \end{array}$

The minimization of the sum in the last line (7) is obtained when each term $e_{n}^{T} C e_{n}$ is minimum, since $C$ is positive definite. By the method of Lagrange multipliers, and the orthonormality condition (5), we get

$E_{N} = \sum_{n = N + 1}^{P - 1} (e_{n}^{T} C e_{n} - l_{n} e_{n}^{T} ∙ e_{n} + l_{n})$

The minimum condition $\frac{d E_{N}}{d e_{n}^{T}} = 0$ leads to the equation

$C e_{n} = l_{n} e_{n},$

which shows that $e_{n}$ is an eigenvector of the covariance matrix $C$ with eigenvalue $l_{n}$ . The estimated minimum error is then given by

$E_{N} \sim \sum_{n = N + 1}^{P - 1} e_{n}^{T} ∙ l_{n} e_{n} = \sum_{n = N + 1}^{P - 1} l_{n},$

where $l_{n}, n = N + 1, \dots, P - 1$ are the eigenvalues associated with the omitted eigenvectors in the truncated expansion (3). Thus, by choosing the $N$ largest eigenvalues, and their associated eigenvectors, the error $E_{N}$ is minimized.
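The step from the Lagrangian to the eigenvalue equation can be made explicit. As a sketch, using only the symmetry of $C$ and treating each $e_{n}$ as an independent variable, the derivative of one term of the sum is

$\frac{\partial}{\partial e_{n}} (e_{n}^{T} C e_{n} - l_{n} e_{n}^{T} e_{n} + l_{n}) = 2 C e_{n} - 2 l_{n} e_{n} = 0 \quad \Rightarrow \quad C e_{n} = l_{n} e_{n}$

Substituting this back, each omitted term contributes

$e_{n}^{T} C e_{n} = l_{n} e_{n}^{T} e_{n} = l_{n}$

so the truncation error is the sum of the eigenvalues that were left out, and it is smallest when those are the smallest eigenvalues.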

The transformation matrix to go from the pattern space to the feature space consists of the ordered eigenvectors $e_{0}, \dots, e_{P - 1}$ for its columns:

$T = \begin{bmatrix} e_{0} & e_{1} & \dots & e_{P - 1} \end{bmatrix} = \begin{bmatrix} e_{0_{0}} & e_{1_{0}} & \dots & e_{{P - 1}_{0}} \\ e_{0_{1}} & e_{1_{1}} & \dots & e_{{P - 1}_{1}} \\ \vdots & \vdots & \ddots & \vdots \\ e_{0_{P - 1}} & e_{1_{P - 1}} & \dots & e_{{P - 1}_{P - 1}} \end{bmatrix}$

This is an orthogonal transformation, or rotation, of the pattern space and feature selection results in ignoring certain coordinates in the transformed space.
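For two variables the whole method can be followed by hand, because a symmetric 2x2 matrix has a closed-form eigen-decomposition. The sketch below is a standalone illustration and not part of TPrincipal; the names `Eigen2`, `eigenSymmetric2x2` and `truncationError` are hypothetical.

```cpp
#include <array>
#include <cmath>

// Eigenvalues and unit eigenvectors of the symmetric matrix [[a, b], [b, d]].
// Hypothetical helper for illustration; not part of TPrincipal.
struct Eigen2 {
    std::array<double, 2> values;                  // largest eigenvalue first
    std::array<std::array<double, 2>, 2> vectors;  // vectors[k] pairs with values[k]
};

inline Eigen2 eigenSymmetric2x2(double a, double b, double d) {
    const double centre = 0.5 * (a + d);
    const double radius = std::sqrt(0.25 * (a - d) * (a - d) + b * b);
    Eigen2 r;
    r.values = {centre + radius, centre - radius};
    for (int k = 0; k < 2; ++k) {
        double vx, vy;
        if (b != 0.0) {
            // (b, lambda - a) solves (C - lambda I) v = 0.
            vx = b;
            vy = r.values[k] - a;
        } else {
            // Diagonal matrix: the eigenvectors are the coordinate axes.
            const bool alongFirstAxis = (a >= d) == (k == 0);
            vx = alongFirstAxis ? 1.0 : 0.0;
            vy = alongFirstAxis ? 0.0 : 1.0;
        }
        const double norm = std::hypot(vx, vy);
        r.vectors[k] = {vx / norm, vy / norm};
    }
    return r;
}

// Truncation error when only the first eigenvector is kept:
// the sum of the omitted eigenvalues, here just the smaller one.
inline double truncationError(const Eigen2 &e) { return e.values[1]; }
```

Calling `eigenSymmetric2x2(2.0, 1.0, 2.0)` describes two equally spread, positively correlated variables: the leading eigenvector lies along the diagonal, and dropping the second one costs exactly the smaller eigenvalue.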

Christian Holm August 2000, CERN

Definition at line 21 of file TPrincipal.h.

Public Member Functions
	TPrincipal ()
	Empty constructor. Do not use.

	TPrincipal (Long64_t nVariables, Option_t *opt="ND")
	Constructor.

	~TPrincipal () override
	Destructor.

virtual void	AddRow (const Double_t *x)
	Add a data point and update the covariance matrix.

void	Browse (TBrowser *b) override
	Browse the TPrincipal object in the TBrowser.

void	Clear (Option_t *option="") override
	Clear the data in Object.

const TMatrixD *	GetCovarianceMatrix () const
	Return the covariance matrix.

const TVectorD *	GetEigenValues () const

const TMatrixD *	GetEigenVectors () const

TList *	GetHistograms () const

const TVectorD *	GetMeanValues () const

const Double_t *	GetRow (Long64_t row)
	Return a row of the user supplied data.

const TVectorD *	GetSigmas () const

const TVectorD *	GetUserData () const

TClass *	IsA () const override

Bool_t	IsFolder () const override
	Returns kTRUE in case object contains browsable objects (like containers or lists of other objects).

virtual void	MakeCode (const char *filename="pca", Option_t *option="")
	Generates the file `<filename>`, with `.C` appended if the argument doesn't end in .cxx or .C.

virtual void	MakeHistograms (const char *name="pca", Option_t *option="epsdx")
	Make histograms of the result of the analysis.

virtual void	MakeMethods (const char *classname="PCA", Option_t *option="")
	Generate the file `<classname>PCA.cxx`, which contains the implementation of the two methods X2P and P2X.

virtual void	MakePrincipals ()
	Perform the principal components analysis.

virtual void	P2X (const Double_t *p, Double_t *x, Int_t nTest)
	Calculate x as a function of nTest of the most significant principal components p, and return it in x.

void	Print (Option_t *opt="MSE") const override
	Print the statistics.

void	Streamer (TBuffer &) override
	Stream an object of class TObject.

void	StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)

virtual void	SumOfSquareResiduals (const Double_t *x, Double_t *s)
	Calculates the sum of the square residuals.

void	Test (Option_t *option="")
	Test the PCA, by calculating the sum square of residuals (see method SumOfSquareResiduals), and display the histogram.

virtual void	X2P (const Double_t *x, Double_t *p)
	Calculate the principal components from the original data vector x, and return it in p.

Public Member Functions inherited from TNamed
	TNamed ()

	TNamed (const char *name, const char *title)

	TNamed (const TNamed &named)
	TNamed copy ctor.

	TNamed (const TString &name, const TString &title)

virtual	~TNamed ()
	TNamed destructor.

TObject *	Clone (const char *newname="") const override
	Make a clone of an object using the Streamer facility.

Int_t	Compare (const TObject *obj) const override
	Compare two TNamed objects.

void	Copy (TObject &named) const override
	Copy this to obj.

virtual void	FillBuffer (char *&buffer)
	Encode TNamed into output buffer.

const char *	GetName () const override
	Returns name of object.

const char *	GetTitle () const override
	Returns title of object.

ULong_t	Hash () const override
	Return hash value for this object.

Bool_t	IsSortable () const override

void	ls (Option_t *option="") const override
	List TNamed name and title.

TNamed &	operator= (const TNamed &rhs)
	TNamed assignment operator.

virtual void	SetName (const char *name)
	Set the name of the TNamed.

virtual void	SetNameTitle (const char *name, const char *title)
	Set all the TNamed parameters (name and title).

virtual void	SetTitle (const char *title="")
	Set the title of the TNamed.

virtual Int_t	Sizeof () const
	Return size of the TNamed part of the TObject.

void	StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)

Public Member Functions inherited from TObject
	TObject ()
	TObject constructor.

	TObject (const TObject &object)
	TObject copy ctor.

virtual	~TObject ()
	TObject destructor.

void	AbstractMethod (const char *method) const
	Use this method to implement an "abstract" method that you don't want to leave purely abstract.

virtual void	AppendPad (Option_t *option="")
	Append graphics object to current pad.

ULong_t	CheckedHash ()
	Check and record whether this class has a consistent Hash/RecursiveRemove setup (*) and then return the regular Hash value for this object.

virtual const char *	ClassName () const
	Returns name of class to which the object belongs.

virtual void	Delete (Option_t *option="")
	Delete this object.

virtual Int_t	DistancetoPrimitive (Int_t px, Int_t py)
	Computes distance from point (px,py) to the object.

virtual void	Draw (Option_t *option="")
	Default Draw method for all objects.

virtual void	DrawClass () const
	Draw class inheritance tree of the class to which this object belongs.

virtual TObject *	DrawClone (Option_t *option="") const
	Draw a clone of this object in the current selected pad with: `gROOT->SetSelectedPad(c1)`.

virtual void	Dump () const
	Dump contents of object on stdout.

virtual void	Error (const char *method, const char *msgfmt,...) const
	Issue error message.

virtual void	Execute (const char *method, const char *params, Int_t *error=nullptr)
	Execute method on this object with the given parameter string, e.g.

virtual void	Execute (TMethod *method, TObjArray *params, Int_t *error=nullptr)
	Execute method on this object with parameters stored in the TObjArray.

virtual void	ExecuteEvent (Int_t event, Int_t px, Int_t py)
	Execute action corresponding to an event at (px,py).

virtual void	Fatal (const char *method, const char *msgfmt,...) const
	Issue fatal error message.

virtual TObject *	FindObject (const char *name) const
	Must be redefined in derived classes.

virtual TObject *	FindObject (const TObject *obj) const
	Must be redefined in derived classes.

virtual Option_t *	GetDrawOption () const
	Get option used by the graphics system to draw this object.

virtual const char *	GetIconName () const
	Returns mime type name of object.

virtual char *	GetObjectInfo (Int_t px, Int_t py) const
	Returns string containing info about the object at position (px,py).

virtual Option_t *	GetOption () const

virtual UInt_t	GetUniqueID () const
	Return the unique object id.

virtual Bool_t	HandleTimer (TTimer *timer)
	Execute action in response of a timer timing out.

Bool_t	HasInconsistentHash () const
	Return true if the type of this object is known to have an inconsistent setup for Hash and RecursiveRemove (i.e.

virtual void	Info (const char *method, const char *msgfmt,...) const
	Issue info message.

virtual Bool_t	InheritsFrom (const char *classname) const
	Returns kTRUE if object inherits from class "classname".

virtual Bool_t	InheritsFrom (const TClass *cl) const
	Returns kTRUE if object inherits from TClass cl.

virtual void	Inspect () const
	Dump contents of this object in a graphics canvas.

void	InvertBit (UInt_t f)

Bool_t	IsDestructed () const
	IsDestructed.

virtual Bool_t	IsEqual (const TObject *obj) const
	Default equal comparison (objects are equal if they have the same address in memory).

R__ALWAYS_INLINE Bool_t	IsOnHeap () const

R__ALWAYS_INLINE Bool_t	IsZombie () const

void	MayNotUse (const char *method) const
	Use this method to signal that a method (defined in a base class) may not be called in a derived class (in principle against good design since a child class should not provide less functionality than its parent, however, sometimes it is necessary).

virtual Bool_t	Notify ()
	This method must be overridden to handle object notification (the base implementation is no-op).

void	Obsolete (const char *method, const char *asOfVers, const char *removedFromVers) const
	Use this method to declare a method obsolete.

void	operator delete (void *ptr)
	Operator delete.

void	operator delete (void *ptr, void *vp)
	Only called by placement new when throwing an exception.

void	operator delete[] (void *ptr)
	Operator delete [].

void	operator delete[] (void *ptr, void *vp)
	Only called by placement new[] when throwing an exception.

void *	operator new (size_t sz)

void *	operator new (size_t sz, void *vp)

void *	operator new[] (size_t sz)

void *	operator new[] (size_t sz, void *vp)

TObject &	operator= (const TObject &rhs)
	TObject assignment operator.

virtual void	Paint (Option_t *option="")
	This method must be overridden if a class wants to paint itself.

virtual void	Pop ()
	Pop on object drawn in a pad to the top of the display list.

virtual Int_t	Read (const char *name)
	Read contents of object with specified name from the current directory.

virtual void	RecursiveRemove (TObject *obj)
	Recursively remove this object from a list.

void	ResetBit (UInt_t f)

virtual void	SaveAs (const char *filename="", Option_t *option="") const
	Save this object in the file specified by filename.

virtual void	SavePrimitive (std::ostream &out, Option_t *option="")
	Save a primitive as a C++ statement(s) on output stream "out".

void	SetBit (UInt_t f)

void	SetBit (UInt_t f, Bool_t set)
	Set or unset the user status bits as specified in f.

virtual void	SetDrawOption (Option_t *option="")
	Set drawing option for object.

virtual void	SetUniqueID (UInt_t uid)
	Set the unique object id.

void	StreamerNVirtual (TBuffer &ClassDef_StreamerNVirtual_b)

virtual void	SysError (const char *method, const char *msgfmt,...) const
	Issue system error message.

R__ALWAYS_INLINE Bool_t	TestBit (UInt_t f) const

Int_t	TestBits (UInt_t f) const

virtual void	UseCurrentStyle ()
	Set current style settings in this object. This function is called when either TCanvas::UseCurrentStyle or TROOT::ForceStyle have been invoked.

virtual void	Warning (const char *method, const char *msgfmt,...) const
	Issue warning message.

virtual Int_t	Write (const char *name=nullptr, Int_t option=0, Int_t bufsize=0)
	Write this object to the current directory.

virtual Int_t	Write (const char *name=nullptr, Int_t option=0, Int_t bufsize=0) const
	Write this object to the current directory.

Static Public Member Functions
static TClass *	Class ()

static const char *	Class_Name ()

static constexpr Version_t	Class_Version ()

static const char *	DeclFileName ()

Static Public Member Functions inherited from TNamed
static TClass *	Class ()

static const char *	Class_Name ()

static constexpr Version_t	Class_Version ()

static const char *	DeclFileName ()

Static Public Member Functions inherited from TObject
static TClass *	Class ()

static const char *	Class_Name ()

static constexpr Version_t	Class_Version ()

static const char *	DeclFileName ()

static Longptr_t	GetDtorOnly ()
	Return destructor only flag.

static Bool_t	GetObjectStat ()
	Get status of object stat flag.

static void	SetDtorOnly (void *obj)
	Set destructor only flag.

static void	SetObjectStat (Bool_t stat)
	Turn on/off tracking of objects in the TObjectTable.

Protected Member Functions
	TPrincipal (const TPrincipal &)
	Copy constructor.

void	MakeNormalised ()
	Normalize the covariance matrix.

void	MakeRealCode (const char *filename, const char *prefix, Option_t *option="")
	This is the method that actually generates the code for the transformations to and from feature space and pattern space. It's called by TPrincipal::MakeCode and TPrincipal::MakeMethods.

TPrincipal &	operator= (const TPrincipal &)
	Assignment operator.

Protected Member Functions inherited from TNamed
void	SavePrimitiveNameTitle (std::ostream &out, const char *variable_name)
	Save object name and title into the output stream "out".

Protected Member Functions inherited from TObject
virtual void	DoError (int level, const char *location, const char *fmt, va_list va) const
	Interface to ErrorHandler (protected).

void	MakeZombie ()

Protected Attributes
TMatrixD	fCovarianceMatrix
	Covariance matrix.

TVectorD	fEigenValues
	Eigenvalue vector of trans.

TMatrixD	fEigenVectors
	Eigenvector matrix of trans.

TList *	fHistograms
	List of histograms.

Bool_t	fIsNormalised
	Normalize matrix?

TVectorD	fMeanValues
	Mean value over all data points.

Int_t	fNumberOfDataPoints
	Number of data points.

Int_t	fNumberOfVariables
	Number of variables.

TVectorD	fOffDiagonal
	Elements of the tridiagonal.

TVectorD	fSigmas
	vector of sigmas

Bool_t	fStoreData
	Should we store input data?

Double_t	fTrace
	Trace of covariance matrix.

TVectorD	fUserData
	Vector of original data points.

Protected Attributes inherited from TNamed
TString	fName

TString	fTitle

Additional Inherited Members
Public Types inherited from TObject
enum	{ kIsOnHeap = 0x01000000 , kNotDeleted = 0x02000000 , kZombie = 0x04000000 , kInconsistent = 0x08000000 , kBitMask = 0x00ffffff }

enum	{ kSingleKey = (1ULL << ( 0 )) , kOverwrite = (1ULL << ( 1 )) , kWriteDelete = (1ULL << ( 2 )) }

enum	EDeprecatedStatusBits { kObjInCanvas = (1ULL << ( 3 )) }

enum	EStatusBits { kCanDelete = (1ULL << ( 0 )) , kMustCleanup = (1ULL << ( 3 )) , kIsReferenced = (1ULL << ( 4 )) , kHasUUID = (1ULL << ( 5 )) , kCannotPick = (1ULL << ( 6 )) , kNoContextMenu = (1ULL << ( 8 )) , kInvalidObject = (1ULL << ( 13 )) }

Protected Types inherited from TObject
enum	{ kOnlyPrepStep = (1ULL << ( 3 )) }

Static Protected Member Functions inherited from TObject
static void	SavePrimitiveConstructor (std::ostream &out, TClass *cl, const char *variable_name, const char *constructor_agrs="", Bool_t empty_line=kTRUE)
	Save object constructor in the output stream "out".

static void	SavePrimitiveDraw (std::ostream &out, const char *variable_name, Option_t *option=nullptr)
	Save invocation of primitive Draw() method. Skipped if option contains "nodraw" string.

static TString	SavePrimitiveVector (std::ostream &out, const char *prefix, Int_t len, Double_t *arr, Bool_t empty_line=kFALSE)
	Save array in the output stream "out" as vector.

#include <TPrincipal.h>

Inheritance: TPrincipal derives from TNamed, which derives from TObject.

Constructor & Destructor Documentation

◆ TPrincipal() [1/3]

TPrincipal::TPrincipal ( const TPrincipal & pr )

protected

Copy constructor.

Definition at line 316 of file TPrincipal.cxx.

◆ TPrincipal() [2/3]

TPrincipal::TPrincipal ( )

Empty constructor. Do not use.

Definition at line 229 of file TPrincipal.cxx.

◆ ~TPrincipal()

TPrincipal::~TPrincipal ( )

override

Destructor.

Definition at line 361 of file TPrincipal.cxx.

◆ TPrincipal() [3/3]

TPrincipal::TPrincipal	(	Long64_t	nVariables,
		Option_t *	opt = "ND" )

Constructor.

The argument is the number of variables in the sample of data. Options are:

N Normalize the covariance matrix (default)
D Store input data (default)

The created object is named "principal" by default.

Definition at line 253 of file TPrincipal.cxx.

Member Function Documentation

◆ AddRow()

void TPrincipal::AddRow ( const Double_t * p )

virtual

Add a data point and update the covariance matrix.

The input array must be fNumberOfVariables long.

The covariance matrix and mean values of the input data are calculated on the fly by the following equations:

${⟨ x_{i} ⟩}^{(0)} = x_{i 0}$

${⟨ x_{i} ⟩}^{(n)} = {⟨ x_{i} ⟩}^{(n - 1)} + \frac{1}{n} (x_{i n} - {⟨ x_{i} ⟩}^{(n - 1)})$

$C_{i j}^{(0)} = 0$

$C_{i j}^{(n)} = C_{i j}^{(n - 1)} + \frac{1}{n - 1} [(x_{i n} - {⟨ x_{i} ⟩}^{(n)}) (x_{j n} - {⟨ x_{j} ⟩}^{(n)})] - \frac{1}{n} C_{i j}^{(n - 1)}$

This incremental method is fast and keeps rounding errors small (please refer to CERN 72-21, pp. 54-106).

The data is stored internally in a TVectorD, in the following way:

$x = [(x_{0_{0}}, \dots, x_{{P - 1}_{0}}), \dots, (x_{0_{i}}, \dots, x_{{P - 1}_{i}}), \dots]$

With $P$ as defined in the class description.
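With this layout, row $i$ starts at offset $i \cdot P$ in the flat vector. A small sketch of the index arithmetic, using a plain `std::vector` in place of TVectorD; the helper name `rowBegin` is hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Return a pointer to the first variable of row `row` in a flat, row-by-row
// buffer of nVariables values per row, or nullptr if the row is out of range.
inline const double *rowBegin(const std::vector<double> &flat,
                              std::size_t nVariables, std::size_t row) {
    if (nVariables == 0 || (row + 1) * nVariables > flat.size()) return nullptr;
    return flat.data() + row * nVariables;
}
```

The returned pointer refers into the buffer itself, so it is only valid while the buffer is neither resized nor destroyed.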

Definition at line 414 of file TPrincipal.cxx.

◆ Browse()

void TPrincipal::Browse ( TBrowser * b )

override virtual

Browse the TPrincipal object in the TBrowser.

Reimplemented from TObject.

Definition at line 471 of file TPrincipal.cxx.

◆ Class()

static TClass * TPrincipal::Class ( )

static

Returns: TClass describing this class

◆ Class_Name()

static const char * TPrincipal::Class_Name ( )

static

Returns: Name of this class

◆ Class_Version()

static constexpr Version_t TPrincipal::Class_Version ( )

inline static constexpr

Returns: Version of this class

Definition at line 79 of file TPrincipal.h.

◆ Clear()

void TPrincipal::Clear ( Option_t * opt = "" )

override virtual

Clear the data in Object.

Note that it is not possible to change the dimension of the original data.

Reimplemented from TNamed.

Definition at line 494 of file TPrincipal.cxx.

◆ DeclFileName()

static const char * TPrincipal::DeclFileName ( )

inline static

Returns: Name of the file containing the class declaration

Definition at line 79 of file TPrincipal.h.

◆ GetCovarianceMatrix()

const TMatrixD * TPrincipal::GetCovarianceMatrix ( ) const

inline

Return the covariance matrix.

Note: Only the lower triangle of the covariance matrix is computed by the class.

Definition at line 60 of file TPrincipal.h.

◆ GetEigenValues()

const TVectorD * TPrincipal::GetEigenValues ( ) const

inline

Definition at line 61 of file TPrincipal.h.

◆ GetEigenVectors()

const TMatrixD * TPrincipal::GetEigenVectors ( ) const

inline

Definition at line 62 of file TPrincipal.h.

◆ GetHistograms()

TList * TPrincipal::GetHistograms ( ) const

inline

Definition at line 63 of file TPrincipal.h.

◆ GetMeanValues()

const TVectorD * TPrincipal::GetMeanValues ( ) const

inline

Definition at line 64 of file TPrincipal.h.

◆ GetRow()

const Double_t * TPrincipal::GetRow ( Long64_t row )

Return a row of the user supplied data.

If row is out of bounds, 0 is returned. It's up to the user to delete the returned array. Row 0 is the first row.

Definition at line 521 of file TPrincipal.cxx.

◆ GetSigmas()

const TVectorD * TPrincipal::GetSigmas ( ) const

inline

Definition at line 66 of file TPrincipal.h.

◆ GetUserData()

const TVectorD * TPrincipal::GetUserData ( ) const

inline

Definition at line 67 of file TPrincipal.h.

◆ IsA()

TClass * TPrincipal::IsA ( ) const

inline override virtual

Returns: TClass describing current object

Reimplemented from TNamed.

Definition at line 79 of file TPrincipal.h.

◆ IsFolder()

Bool_t TPrincipal::IsFolder ( ) const

inline override virtual

Returns kTRUE in case object contains browsable objects (like containers or lists of other objects).

Reimplemented from TObject.

Definition at line 68 of file TPrincipal.h.

◆ MakeCode()

void TPrincipal::MakeCode	(	const char *	filename = "pca",
		Option_t *	opt = "" )

virtual

Generates the file <filename>, with .C appended if the argument doesn't end in .cxx or .C.

The file contains the implementation of two functions

void X2P(Double_t *x, Double_t *p)

void P2X(Double_t *p, Double_t *x, Int_t nTest)

which do the same as TPrincipal::X2P and TPrincipal::P2X respectively. Please refer to these methods.

Further, the static variables:

Int_t    gNVariables
Double_t gEigenValues[]
Double_t gEigenVectors[]
Double_t gMeanValues[]
Double_t gSigmaValues[]

are initialized. The only ROOT header file needed is Rtypes.h.

See TPrincipal::MakeRealCode for a list of options.

Definition at line 562 of file TPrincipal.cxx.

◆ MakeHistograms()

void TPrincipal::MakeHistograms	(	const char *	name = "pca",
		Option_t *	opt = "epsdx" )

virtual

Make histograms of the result of the analysis.

The option string says which histograms to create:

X Histogram original data
P Histogram principal components corresponding to original data
D Histogram the difference between the original data and the projection of the principal components onto a lower dimensional subspace (2D histograms)
E Histogram the eigenvalues
S Histogram the square of the residues (see TPrincipal::SumOfSquareResiduals)

The histograms will be named <name>_<type><number>, where <name> is the first argument, <type> is one of X, P, D, E, S, and <number> is the variable.

Definition at line 587 of file TPrincipal.cxx.

◆ MakeMethods()

void TPrincipal::MakeMethods	(	const char *	classname = "PCA",
		Option_t *	opt = "" )

virtual

Generate the file <classname>PCA.cxx which contains the implementation of two methods:

void <classname>::X2P(Double_t *x, Double_t *p)

void <classname>::P2X(Double_t *p, Double_t *x, Int_t nTest)

which do the same as TPrincipal::X2P and TPrincipal::P2X respectively. Please refer to these methods.

Further, the public static members:

Int_t    <classname>::fgNVariables
Double_t <classname>::fgEigenValues[]
Double_t <classname>::fgEigenVectors[]
Double_t <classname>::fgMeanValues[]
Double_t <classname>::fgSigmaValues[]

are initialized, and assumed to exist. The class declaration is assumed to be in <classname>.h and assumed to be provided by the user.

See TPrincipal::MakeRealCode for a list of options.

The minimal class definition is:

class <classname> {
public:
  static Int_t    fgNVariables;
  static Double_t fgEigenVectors[];
  static Double_t fgEigenValues[];
  static Double_t fgMeanValues[];
  static Double_t fgSigmaValues[];
 
  void X2P(Double_t *x, Double_t *p);
  void P2X(Double_t *p, Double_t *x, Int_t nTest);
};

Whether the methods <classname>::X2P and <classname>::P2X should be static or not, is up to the user.

Definition at line 871 of file TPrincipal.cxx.

◆ MakeNormalised()

void TPrincipal::MakeNormalised ( )

protected

Normalize the covariance matrix.
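One common reading of normalizing a covariance matrix is to divide each element by the product of the two standard deviations, which yields the correlation matrix. The sketch below shows only that step, under that assumption; whether TPrincipal applies any further scaling is not stated here, and `normaliseCovariance` is a hypothetical name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Divide each covariance element C_ij by sigma_i * sigma_j, where
// sigma_i = sqrt(C_ii). `cov` is a row-major p x p matrix, modified in place.
// Returns the sigmas. Assumes every variance C_ii is strictly positive.
inline std::vector<double> normaliseCovariance(std::vector<double> &cov, std::size_t p) {
    std::vector<double> sigma(p);
    for (std::size_t i = 0; i < p; ++i) sigma[i] = std::sqrt(cov[i * p + i]);
    for (std::size_t i = 0; i < p; ++i)
        for (std::size_t j = 0; j < p; ++j)
            cov[i * p + j] /= sigma[i] * sigma[j];
    return sigma;
}
```

After this step the diagonal is 1, so variables measured in very different units contribute on an equal footing to the eigen-decomposition.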

Definition at line 809 of file TPrincipal.cxx.

◆ MakePrincipals()

void TPrincipal::MakePrincipals ( )

virtual

Perform the principal components analysis.

This is done in several stages in the TMatrix::EigenVectors method:

Transform the covariance matrix into a tridiagonal matrix.
Find the eigenvalues and vectors of the tridiagonal matrix.

Definition at line 884 of file TPrincipal.cxx.

◆ MakeRealCode()

void TPrincipal::MakeRealCode	(	const char *	filename,
		const char *	classname,
		Option_t *	option = "" )

protected

This is the method that actually generates the code for the transformations to and from feature space and pattern space. It's called by TPrincipal::MakeCode and TPrincipal::MakeMethods.

The options are: NONE so far

Definition at line 906 of file TPrincipal.cxx.

◆ operator=()

TPrincipal & TPrincipal::operator= ( const TPrincipal & pr )

protected

Assignment operator.

Definition at line 337 of file TPrincipal.cxx.

◆ P2X()

void TPrincipal::P2X	(	const Double_t *	p,
		Double_t *	x,
		Int_t	nTest )

virtual

Calculate x as a function of nTest of the most significant principal components p, and return it in x.

It's the user's responsibility to make sure that both x and p are of the right size (i.e., memory must be allocated for x).

Definition at line 1074 of file TPrincipal.cxx.

◆ Print()

void TPrincipal::Print ( Option_t * opt = "MSE" ) const

override virtual

Print the statistics. Options are:

M Print mean values of original data
S Print sigma values of original data
E Print eigenvalues of covariance matrix
V Print eigenvectors of covariance matrix

Default is MSE.

Reimplemented from TNamed.

Definition at line 1094 of file TPrincipal.cxx.

◆ Streamer()

void TPrincipal::Streamer ( TBuffer & R__b )

override virtual

Stream an object of class TObject.

Reimplemented from TNamed.

◆ StreamerNVirtual()

void TPrincipal::StreamerNVirtual ( TBuffer & ClassDef_StreamerNVirtual_b )

inline

Definition at line 79 of file TPrincipal.h.

◆ SumOfSquareResiduals()

void TPrincipal::SumOfSquareResiduals	(	const Double_t *	x,
		Double_t *	s )

virtual

Calculates the sum of the square residuals, that is,

$E_{N} = \sum_{i = 0}^{P - 1} {(x_{i} - x_{i}^{'})}^{2}$

where $x_{i}^{'} = \sum_{j = 0}^{N - 1} p_{j} e_{j_{i}}$ is the $i^{th}$ component of the vector reconstructed from the first $N$ principal components, corresponding to $x_{i}$, the original data, and $e_{j_{i}}$ is the $i^{th}$ component of eigenvector $e_{j}$; i.e., the square distance to the space spanned by $N$ eigenvectors.

Definition at line 1183 of file TPrincipal.cxx.

◆ Test()

void TPrincipal::Test ( Option_t * option = "" )

Test the PCA, by calculating the sum square of residuals (see method SumOfSquareResiduals), and display the histogram.

Definition at line 1205 of file TPrincipal.cxx.

◆ X2P()

void TPrincipal::X2P	(	const Double_t *	x,
		Double_t *	p )

virtual

Calculate the principal components from the original data vector x, and return it in p.

It's the user's responsibility to make sure that both x and p are of the right size (i.e., memory must be allocated for p).

Definition at line 1229 of file TPrincipal.cxx.

Member Data Documentation

◆ fCovarianceMatrix

TMatrixD TPrincipal::fCovarianceMatrix

protected

Covariance matrix.

Definition at line 29 of file TPrincipal.h.

◆ fEigenValues

TVectorD TPrincipal::fEigenValues

protected

Eigenvalue vector of trans.

Definition at line 32 of file TPrincipal.h.

◆ fEigenVectors

TMatrixD TPrincipal::fEigenVectors

protected

Eigenvector matrix of trans.

Definition at line 31 of file TPrincipal.h.

◆ fHistograms

TList* TPrincipal::fHistograms

protected

List of histograms.

Definition at line 40 of file TPrincipal.h.

◆ fIsNormalised

Bool_t TPrincipal::fIsNormalised

protected

Normalize matrix?

Definition at line 42 of file TPrincipal.h.

◆ fMeanValues

TVectorD TPrincipal::fMeanValues

protected

Mean value over all data points.

Definition at line 27 of file TPrincipal.h.

◆ fNumberOfDataPoints

Int_t TPrincipal::fNumberOfDataPoints

protected

Number of data points.

Definition at line 24 of file TPrincipal.h.

◆ fNumberOfVariables

Int_t TPrincipal::fNumberOfVariables

protected

Number of variables.

Definition at line 25 of file TPrincipal.h.

◆ fOffDiagonal

TVectorD TPrincipal::fOffDiagonal

protected

Elements of the tridiagonal.

Definition at line 34 of file TPrincipal.h.

◆ fSigmas

TVectorD TPrincipal::fSigmas

protected

vector of sigmas

Definition at line 28 of file TPrincipal.h.

◆ fStoreData

Bool_t TPrincipal::fStoreData

protected

Should we store input data?

Definition at line 43 of file TPrincipal.h.

◆ fTrace

Double_t TPrincipal::fTrace

protected

Trace of covariance matrix.

Definition at line 38 of file TPrincipal.h.

◆ fUserData

TVectorD TPrincipal::fUserData

protected

Vector of original data points.

Definition at line 36 of file TPrincipal.h.

The documentation for this class was generated from the following files:

hist/hist/inc/TPrincipal.h
hist/hist/src/TPrincipal.cxx
