TPrincipal.cxx
// @(#)root/hist:$Id$
// Author: Christian Holm Christensen 1/8/2000

/*************************************************************************
 * Copyright (C) 1995-2004, Rene Brun and Fons Rademakers.               *
 * All rights reserved.                                                  *
 *                                                                       *
 * For the licensing terms see $ROOTSYS/LICENSE.                         *
 * For the list of contributors see $ROOTSYS/README/CREDITS.             *
 *************************************************************************/

/** \class TPrincipal
    \ingroup Hist
Principal Components Analysis (PCA)

The current implementation is based on the LINTRA package from CERNLIB
by R. Brun, H. Hansroul, and J. Kubler.
The class has been implemented by Christian Holm Christensen in August 2000.

## Introduction

In many fields of research, the treatment of large amounts of data
requires powerful techniques capable of rapid data reduction and
analysis. Usually, the quantities most conveniently measured by the
experimentalist are not necessarily the most significant for
classification and analysis of the data. It is then useful to have a
way of selecting an optimal set of variables necessary for the
recognition process, reducing the dimensionality of the problem and
resulting in an easier classification procedure.

This documentation describes the implementation of one such method of
feature selection, namely principal components analysis. This
multidimensional technique is well known in the field of pattern
recognition, and its use in Particle Physics has been documented
elsewhere (cf. H. Wind, <I>Function Parameterization</I>, CERN
72-21).

## Overview
Suppose we have prototypes which are trajectories of particles,
passing through a spectrometer. If one measures the passage of the
particle at say 8 fixed planes, the trajectory is described by an
8-component vector:
\f[
  \mathbf{x} = \left(x_0, x_1, \ldots, x_7\right)
\f]
in 8-dimensional pattern space.

One proceeds by generating a representative sample of tracks and
building up the covariance matrix \f$\mathsf{C}\f$. Its eigenvectors and
eigenvalues are computed by standard methods, and thus a new basis is
obtained for the original 8-dimensional space. The expansion of the
prototypes,
\f[
  \mathbf{x}_m = \sum^7_{i=0} a_{m_i} \mathbf{e}_i
  \quad
  \mbox{where}
  \quad
  a_{m_i} = \mathbf{x}_m^T\bullet\mathbf{e}_i
\f]
allows the study of the behavior of the coefficients \f$a_{m_i}\f$ for all
the tracks of the sample. The eigenvectors which are insignificant for
the trajectory description in the expansion will have their
corresponding coefficients \f$a_{m_i}\f$ close to zero for all the
prototypes.

On one hand, a reduction of the dimensionality is then obtained by
omitting these least significant vectors in the subsequent analysis.

On the other hand, in the analysis of real data, these least
significant components can be used for the pattern
recognition problem of extracting the valid combinations of
coordinates describing a true trajectory from the set of all possible
wrong combinations.

The program described here performs this principal components analysis
on a sample of data provided by the user. It computes the covariance
matrix, its eigenvalues and corresponding eigenvectors, and exhibits
the behavior of the principal components \f$a_{m_i}\f$, thus providing
the user with all the means of understanding the data.

## Principal Components Method
Let's consider a sample of \f$M\f$ prototypes, each being characterized by
\f$P\f$ variables \f$x_0, x_1, \ldots, x_{P-1}\f$. Each prototype is a point, or a
column vector, in a \f$P\f$-dimensional *pattern space*.
\f[
  \mathbf{x} = \left[\begin{array}{c}
    x_0\\x_1\\\vdots\\x_{P-1}\end{array}\right]\,,
\f]
where each \f$x_n\f$ represents the particular value associated with the
\f$n\f$-th dimension.

Those \f$P\f$ variables are the quantities accessible to the
experimentalist, but are not necessarily the most significant for the
classification purpose.

The *Principal Components Method* consists of applying a
*linear* transformation to the original variables. This
transformation is described by an orthogonal matrix and is equivalent
to a rotation of the original pattern space into a new set of
coordinate vectors, which hopefully provide easier feature
identification and dimensionality reduction.

Let's define the covariance matrix:
\f[
  \mathsf{C} = \left\langle\mathbf{y}\mathbf{y}^T\right\rangle
  \quad\mbox{where}\quad
  \mathbf{y} = \mathbf{x} - \left\langle\mathbf{x}\right\rangle\,,
\f]
and the brackets indicate the mean value over the sample of \f$M\f$
prototypes.

This matrix \f$\mathsf{C}\f$ is real, positive definite, symmetric, and will
have all its eigenvalues greater than zero. It will now be shown that
among the family of all the complete orthonormal bases of the pattern
space, the basis formed by the eigenvectors of the covariance matrix
belonging to the largest eigenvalues corresponds to the most
significant features of the description of the original prototypes.

Let the prototypes be expanded into a set of \f$N\f$ basis vectors
\f$\mathbf{e}_n, n=0,\ldots,N,N+1, \ldots, P-1\f$:
\f[
  \mathbf{y}_i = \sum^N_{n=0} a_{i_n} \mathbf{e}_n,
  \quad
  i = 1, \ldots, M,
  \quad
  N < P-1
\f]
The "best" feature coordinates \f$\mathbf{e}_n\f$, spanning a *feature
space*, will be obtained by minimizing the error due to this
truncated expansion, i.e.,
\f[
  \min\left(E_N\right) =
  \min\left[\left\langle\left(\mathbf{y}_i - \sum^N_{n=0} a_{i_n} \mathbf{e}_n\right)^2\right\rangle\right]
\f]
with the conditions:
\f[
  \mathbf{e}_k\bullet\mathbf{e}_j = \delta_{jk} =
  \left\{\begin{array}{rcl}
    1 & \mbox{for} & k = j\\
    0 & \mbox{for} & k \neq j
  \end{array}\right.
\f]
Multiplying the expansion by \f$\mathbf{e}^T_n\f$ and using the
orthonormality conditions, we get
\f[
  a_{i_n} = \mathbf{y}_i^T\bullet\mathbf{e}_n\,,
\f]
so the error becomes
\f{eqnarray*}{
  E_N &=&
  \left\langle\left[\sum_{n=N+1}^{P-1} a_{i_n}\mathbf{e}_n\right]^2\right\rangle\nonumber\\
  &=&
  \left\langle\left[\sum_{n=N+1}^{P-1} \mathbf{y}_i^T\bullet\mathbf{e}_n\mathbf{e}_n\right]^2\right\rangle\nonumber\\
  &=&
  \left\langle\sum_{n=N+1}^{P-1} \mathbf{e}_n^T\mathbf{y}_i\mathbf{y}_i^T\mathbf{e}_n\right\rangle\nonumber\\
  &=&
  \sum_{n=N+1}^{P-1} \mathbf{e}_n^T\mathsf{C}\mathbf{e}_n
\f}
The minimization of this sum is obtained when each
term \f$\mathbf{e}_n^T\mathsf{C}\mathbf{e}_n\f$ is minimum, since \f$\mathsf{C}\f$ is
positive definite. By the method of Lagrange multipliers, and the
orthonormality conditions, we get
\f[
  E_N = \sum^{P-1}_{n=N+1} \left(\mathbf{e}_n^T\mathsf{C}\mathbf{e}_n -
  l_n\mathbf{e}_n^T\bullet\mathbf{e}_n + l_n\right)
\f]
The minimum condition \f$\frac{dE_N}{d\mathbf{e}^T_n} = 0\f$ leads to the
equation
\f[
  \mathsf{C}\mathbf{e}_n = l_n\mathbf{e}_n\,,
\f]
which shows that \f$\mathbf{e}_n\f$ is an eigenvector of the covariance
matrix \f$\mathsf{C}\f$ with eigenvalue \f$l_n\f$. The estimated minimum error is
then given by
\f[
  E_N \sim \sum^{P-1}_{n=N+1} \mathbf{e}_n^T\bullet l_n\mathbf{e}_n
  = \sum^{P-1}_{n=N+1} l_n\,,
\f]
where \f$l_n,\,n=N+1,\ldots,P-1\f$ are the eigenvalues associated with the
omitted eigenvectors in the expansion. Thus, by choosing
the \f$N\f$ largest eigenvalues, and their associated eigenvectors, the
error \f$E_N\f$ is minimized.

The transformation matrix to go from the pattern space to the feature
space consists of the ordered eigenvectors
\f$\mathbf{e}_0,\ldots,\mathbf{e}_{P-1}\f$ as its columns:
\f[
  \mathsf{T} = \left[
    \begin{array}{cccc}
      \mathbf{e}_0 &
      \mathbf{e}_1 &
      \cdots &
      \mathbf{e}_{P-1}
    \end{array}\right]
  = \left[
    \begin{array}{cccc}
      \mathbf{e}_{0_0} & \mathbf{e}_{1_0} & \cdots & \mathbf{e}_{{P-1}_0}\\
      \mathbf{e}_{0_1} & \mathbf{e}_{1_1} & \cdots & \mathbf{e}_{{P-1}_1}\\
      \vdots           & \vdots           & \ddots & \vdots \\
      \mathbf{e}_{0_{P-1}} & \mathbf{e}_{1_{P-1}} & \cdots & \mathbf{e}_{{P-1}_{P-1}}\\
    \end{array}\right]
\f]
This is an orthogonal transformation, or rotation, of the pattern
space, and feature selection results in ignoring certain coordinates
in the transformed space.
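The method above can be illustrated with a minimal standalone sketch. This is not part of TPrincipal; the helper names `covariance2` and `eigenvalues2` are invented for the example. It computes the covariance matrix of a small two-variable sample and its eigenvalues in closed form; for data lying exactly on a line, the smallest eigenvalue vanishes, so a single principal component describes the sample without error.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Mean and covariance matrix C = <y y^T>, y = x - <x>, for 2 variables.
void covariance2(const std::vector<std::array<double, 2>> &x,
                 double mean[2], double c[2][2])
{
   const double n = double(x.size());
   mean[0] = mean[1] = 0;
   for (const auto &pt : x) {
      mean[0] += pt[0] / n;
      mean[1] += pt[1] / n;
   }
   c[0][0] = c[0][1] = c[1][0] = c[1][1] = 0;
   for (const auto &pt : x) {
      const double y0 = pt[0] - mean[0];
      const double y1 = pt[1] - mean[1];
      c[0][0] += y0 * y0 / n;
      c[0][1] += y0 * y1 / n;
      c[1][1] += y1 * y1 / n;
   }
   c[1][0] = c[0][1];
}

// Eigenvalues of a symmetric 2x2 matrix, largest first:
//   l = (tr(C) +- sqrt(tr(C)^2 - 4 det(C))) / 2
void eigenvalues2(const double c[2][2], double l[2])
{
   const double tr   = c[0][0] + c[1][1];
   const double det  = c[0][0] * c[1][1] - c[0][1] * c[1][0];
   const double disc = std::sqrt(tr * tr - 4 * det);
   l[0] = (tr + disc) / 2;
   l[1] = (tr - disc) / 2;
}
```

For the perfectly correlated sample {(0,0), (1,1), (2,2), (3,3)} the covariance matrix has trace 2.5 and determinant 0, so the eigenvalues are 2.5 and 0; the error \f$E_N\f$ of the one-component expansion is the omitted eigenvalue, i.e. zero.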

Christian Holm, August 2000, CERN
*/

#include "TPrincipal.h"

#include "TVectorD.h"
#include "TMatrixD.h"
#include "TMatrixDSymEigen.h"
#include "TMath.h"
#include "TList.h"
#include "TH2.h"
#include "TDatime.h"
#include "TBrowser.h"
#include "TROOT.h"
#include "Riostream.h"

#include <limits> // for std::numeric_limits


////////////////////////////////////////////////////////////////////////////////
/// Empty constructor. Do not use.

TPrincipal::TPrincipal()
  : fMeanValues(0),
    fSigmas(0),
    fCovarianceMatrix(1,1),
    fEigenVectors(1,1),
    fEigenValues(0),
    fOffDiagonal(0),
    fStoreData(kFALSE)
{
   fTrace              = 0;
   fHistograms         = nullptr;
   fIsNormalised       = kFALSE;
   fNumberOfDataPoints = 0;
   fNumberOfVariables  = 0;
}

////////////////////////////////////////////////////////////////////////////////
/// Constructor. The argument is the number of variables in the sample of data.
/// Options are:
///  - N  Normalize the covariance matrix (default)
///  - D  Store input data (default)
///
/// The created object is named "principal" by default.

TPrincipal::TPrincipal(Long64_t nVariables, Option_t *opt)
  : fMeanValues(nVariables),
    fSigmas(nVariables),
    fCovarianceMatrix(nVariables,nVariables),
    fEigenVectors(nVariables,nVariables),
    fEigenValues(nVariables),
    fOffDiagonal(nVariables),
    fStoreData(kFALSE)
{
   if (nVariables <= 1) {
      Error("TPrincipal", "You can't be serious - nVariables <= 1!!!");
      return;
   }
   if (nVariables > std::numeric_limits<Int_t>::max()) {
      Error("TPrincipal", "`nVariables` input parameter %lld is larger than the allowed maximum %d", nVariables, std::numeric_limits<Int_t>::max());
      return;
   }

   SetName("principal");

   fTrace              = 0;
   fHistograms         = nullptr;
   fIsNormalised       = kFALSE;
   fNumberOfDataPoints = 0;
   fNumberOfVariables  = Int_t(nVariables);
   while (opt && strlen(opt) > 0) {
      switch(*opt++) {
         case 'N':
         case 'n':
            fIsNormalised = kTRUE;
            break;
         case 'D':
         case 'd':
            fStoreData = kTRUE;
            break;
         default:
            break;
      }
   }

   if (!fMeanValues.IsValid())
      Error("TPrincipal","Couldn't create vector mean values");
   if (!fSigmas.IsValid())
      Error("TPrincipal","Couldn't create vector sigmas");
   if (!fCovarianceMatrix.IsValid())
      Error("TPrincipal","Couldn't create covariance matrix");
   if (!fEigenVectors.IsValid())
      Error("TPrincipal","Couldn't create eigenvector matrix");
   if (!fEigenValues.IsValid())
      Error("TPrincipal","Couldn't create eigenvalue vector");
   if (!fOffDiagonal.IsValid())
      Error("TPrincipal","Couldn't create offdiagonal vector");
   if (fStoreData) {
      fUserData.ResizeTo(nVariables * 1000);
      fUserData.Zero();
      if (!fUserData.IsValid())
         Error("TPrincipal","Couldn't create user data vector");
   }
}

////////////////////////////////////////////////////////////////////////////////
/// Copy constructor.

TPrincipal::TPrincipal(const TPrincipal &pr) :
  TNamed(pr),
  fNumberOfDataPoints(pr.fNumberOfDataPoints),
  fNumberOfVariables(pr.fNumberOfVariables),
  fMeanValues(pr.fMeanValues),
  fSigmas(pr.fSigmas),
  fCovarianceMatrix(pr.fCovarianceMatrix),
  fEigenVectors(pr.fEigenVectors),
  fEigenValues(pr.fEigenValues),
  fOffDiagonal(pr.fOffDiagonal),
  fUserData(pr.fUserData),
  fTrace(pr.fTrace),
  fHistograms(pr.fHistograms),
  fIsNormalised(pr.fIsNormalised),
  fStoreData(pr.fStoreData)
{
}

////////////////////////////////////////////////////////////////////////////////
/// Assignment operator.

TPrincipal &TPrincipal::operator=(const TPrincipal &pr)
{
   if (this != &pr) {
      TNamed::operator=(pr);
      fNumberOfDataPoints = pr.fNumberOfDataPoints;
      fNumberOfVariables  = pr.fNumberOfVariables;
      fMeanValues         = pr.fMeanValues;
      fSigmas             = pr.fSigmas;
      fCovarianceMatrix   = pr.fCovarianceMatrix;
      fEigenVectors       = pr.fEigenVectors;
      fEigenValues        = pr.fEigenValues;
      fOffDiagonal        = pr.fOffDiagonal;
      fUserData           = pr.fUserData;
      fTrace              = pr.fTrace;
      fHistograms         = pr.fHistograms;
      fIsNormalised       = pr.fIsNormalised;
      fStoreData          = pr.fStoreData;
   }
   return *this;
}

////////////////////////////////////////////////////////////////////////////////
/// Destructor.

TPrincipal::~TPrincipal()
{
   if (fHistograms) {
      fHistograms->Delete();
      delete fHistograms;
   }
}

////////////////////////////////////////////////////////////////////////////////
/// Add a data point and update the covariance matrix. The input
/// array must be <TT>fNumberOfVariables</TT> long.
///
/// The covariance matrix and mean values of the input data are
/// calculated on the fly by the following equations:
///
/// \f[
/// \left<x_i\right>^{(0)} = x_{i0}
/// \f]
///
/// \f[
/// \left<x_i\right>^{(n)} = \left<x_i\right>^{(n-1)}
/// + \frac1n \left(x_{in} - \left<x_i\right>^{(n-1)}\right)
/// \f]
///
/// \f[
/// C_{ij}^{(0)} = 0
/// \f]
///
/// \f[
/// C_{ij}^{(n)} = C_{ij}^{(n-1)}
/// + \frac1{n-1}\left[\left(x_{in} - \left<x_i\right>^{(n)}\right)
///   \left(x_{jn} - \left<x_j\right>^{(n)}\right)\right]
/// - \frac1n C_{ij}^{(n-1)}
/// \f]
///
/// This is a fast method with good numerical stability (please
/// refer to CERN 72-21 pp. 54-106).
///
/// The data is stored internally in a <TT>TVectorD</TT>, in the following
/// way:
///
/// \f[
/// \mathbf{x} = \left[\left(x_{0_0},\ldots,x_{{P-1}_0}\right),\ldots,
///     \left(x_{0_i},\ldots,x_{{P-1}_i}\right), \ldots\right]
/// \f]
///
/// with \f$P\f$ as defined in the class description.

void TPrincipal::AddRow(const Double_t *p)
{
   if (!p)
      return;
   if (fNumberOfDataPoints == std::numeric_limits<Int_t>::max()) {
      Error("AddRow", "`fNumberOfDataPoints` has reached its allowed maximum %d, cannot add new row.", fNumberOfDataPoints);
      return;
   }

   // Increment the data point counter
   Int_t i,j;
   if (++fNumberOfDataPoints == 1) {
      for (i = 0; i < fNumberOfVariables; i++)
         fMeanValues(i) = p[i];
   }
   else {

      const Double_t invnp   = 1. / Double_t(fNumberOfDataPoints);
      const Double_t invnpM1 = 1. / Double_t(fNumberOfDataPoints - 1);
      const Double_t cor     = 1. - invnp;
      // use the vector arrays directly for faster element access
      Double_t *meanValues = fMeanValues.GetMatrixArray();
      Double_t *covMatrix  = fCovarianceMatrix.GetMatrixArray();
      for (i = 0; i < fNumberOfVariables; i++) {

         meanValues[i] *= cor;
         meanValues[i] += p[i] * invnp;
         const Double_t t1 = (p[i] - meanValues[i]) * invnpM1;

         // Setting matrix (lower triangle) elements
         for (j = 0; j < i + 1; j++) {
            const Int_t index = i * fNumberOfVariables + j;
            covMatrix[index] *= cor;
            covMatrix[index] += t1 * (p[j] - meanValues[j]);
         }
      }
   }

   // Store the data point in the internal vector. If the vector isn't
   // big enough to hold the new data, expand it by half its size.
   if (!fStoreData)
      return;
   Int_t size = fUserData.GetNrows();
   if (fNumberOfDataPoints * fNumberOfVariables > size)
      fUserData.ResizeTo(size + size/2);

   for (i = 0; i < fNumberOfVariables; i++) {
      j = (fNumberOfDataPoints - 1) * fNumberOfVariables + i;
      fUserData.GetMatrixArray()[j] = p[i];
   }

}
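The running update above can be checked in isolation. The following is a hypothetical standalone reimplementation for a single variable (P = 1), not ROOT code; it mirrors the `cor`, `invnp` and `invnpM1` factors used in `AddRow`, and converges to the sample mean and the 1/n-normalized covariance.

```cpp
#include <cassert>
#include <cmath>

// One-variable version of the running update used in AddRow:
//   mean <- mean * cor + x * invnp
//   cov  <- cov * cor + (x - mean_new)^2 * invnpM1
// with invnp = 1/n, invnpM1 = 1/(n-1), cor = 1 - 1/n.
struct RunningStats {
   long   n    = 0;
   double mean = 0;
   double cov  = 0;
   void Add(double x)
   {
      if (++n == 1) {
         mean = x;    // <x>^(0) = x_0
         return;      // C^(0) = 0
      }
      const double invnp   = 1.0 / double(n);
      const double invnpM1 = 1.0 / double(n - 1);
      const double cor     = 1.0 - invnp;
      mean = mean * cor + x * invnp;
      cov  = cov * cor + (x - mean) * (x - mean) * invnpM1;
   }
};
```

After adding {1, 2, 4} the mean is 7/3 and the covariance is 14/9, identical to the two-pass result \f$\sum_i (x_i - \left<x\right>)^2 / n\f$.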

////////////////////////////////////////////////////////////////////////////////
/// Browse the TPrincipal object in the TBrowser.

void TPrincipal::Browse(TBrowser *b)
{
   if (fHistograms) {
      TIter next(fHistograms);
      TH1* h = nullptr;
      while ((h = (TH1*)next()))
         b->Add(h,h->GetName());
   }

   if (fStoreData)
      b->Add(&fUserData,"User Data");
   b->Add(&fCovarianceMatrix,"Covariance Matrix");
   b->Add(&fMeanValues,"Mean value vector");
   b->Add(&fSigmas,"Sigma value vector");
   b->Add(&fEigenValues,"Eigenvalue vector");
   b->Add(&fEigenVectors,"Eigenvector Matrix");

}

////////////////////////////////////////////////////////////////////////////////
/// Clear the data in the object. Notice that it is not possible to change
/// the dimension of the original data.

void TPrincipal::Clear(Option_t *opt)
{
   if (fHistograms) {
      fHistograms->Delete(opt);
   }

   fNumberOfDataPoints = 0;
   fTrace              = 0;
   fCovarianceMatrix.Zero();
   fEigenVectors.Zero();
   fEigenValues.Zero();
   fMeanValues.Zero();
   fSigmas.Zero();
   fOffDiagonal.Zero();

   if (fStoreData) {
      fUserData.ResizeTo(fNumberOfVariables * 1000);
      fUserData.Zero();
   }
}

////////////////////////////////////////////////////////////////////////////////
/// Return a row of the user supplied data.
/// If the row is out of bounds, nullptr is returned.
/// The returned pointer points into the internal data vector,
/// so it must not be deleted by the caller.
/// Row 0 is the first row.

const Double_t *TPrincipal::GetRow(Long64_t row)
{
   if (row >= fNumberOfDataPoints)
      return nullptr;

   if (!fStoreData)
      return nullptr;

   Long64_t index = row * fNumberOfVariables;
   if (index > std::numeric_limits<Int_t>::max()) {
      Error("GetRow", "Input parameter `row` %lld x fNumberOfVariables %d goes into overflow (%lld>%d), returning nullptr.", row, fNumberOfVariables, index, std::numeric_limits<Int_t>::max());
      return nullptr;
   }
   return &fUserData(index);
}


////////////////////////////////////////////////////////////////////////////////
/// Generate the file `<filename>`, with `.C` appended if the
/// argument doesn't end in `.cxx` or `.C`.
///
/// The file contains the implementation of two functions
/// ~~~ {.cpp}
///    void X2P(Double_t *x, Double_t *p)
///    void P2X(Double_t *p, Double_t *x, Int_t nTest)
/// ~~~
/// which do the same as `TPrincipal::X2P` and `TPrincipal::P2X`
/// respectively. Please refer to these methods.
///
/// Further, the static variables:
/// ~~~ {.cpp}
///    Int_t    gNVariables
///    Double_t gEigenValues[]
///    Double_t gEigenVectors[]
///    Double_t gMeanValues[]
///    Double_t gSigmaValues[]
/// ~~~
/// are initialized. The only ROOT header file needed is Rtypes.h.
///
/// See TPrincipal::MakeRealCode for a list of options.

void TPrincipal::MakeCode(const char *filename, Option_t *opt)
{
   TString outName(filename);
   if (!outName.EndsWith(".C") && !outName.EndsWith(".cxx"))
      outName += ".C";

   MakeRealCode(outName.Data(),"",opt);
}
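The conventions of the generated file can be sketched in a self-contained program. This is an illustrative example, not the generator's actual output: it hard-codes a hypothetical 2-variable "generated" data block (a 45-degree rotation as the eigenvector matrix, unit sigmas) and shows that `P2X` inverts `X2P` when all components are kept.

```cpp
#include <cassert>
#include <cmath>

// Illustrative stand-ins for the generated static data (2 variables,
// eigenvector matrix = rotation by 45 degrees, stored row-wise, i.e.
// M[i][j] = gEigenVectors[i * gNVariables + j]).
static const int    gNVariables     = 2;
static const double kC              = std::sqrt(0.5);
static const double gEigenVectors[] = { kC, -kC,
                                        kC,  kC };
static const double gMeanValues[]   = { 1.0, 1.0 };
static const double gSigmaValues[]  = { 1.0, 1.0 };

// Same loop structure as the generated X2P: pattern -> feature space.
void X2P(const double *x, double *p)
{
   for (int i = 0; i < gNVariables; i++) {
      p[i] = 0;
      for (int j = 0; j < gNVariables; j++)
         p[i] += (x[j] - gMeanValues[j])
            * gEigenVectors[j * gNVariables + i] / gSigmaValues[j];
   }
}

// Same loop structure as the generated P2X: feature -> pattern space,
// using only the first nTest principal components.
void P2X(const double *p, double *x, int nTest)
{
   for (int i = 0; i < gNVariables; i++) {
      x[i] = gMeanValues[i];
      for (int j = 0; j < nTest; j++)
         x[i] += p[j] * gSigmaValues[i]
            * gEigenVectors[i * gNVariables + j];
   }
}
```

With unit sigmas and an orthogonal eigenvector matrix, `P2X(X2P(x))` reproduces `x` exactly when `nTest == gNVariables`; with `nTest < gNVariables` the reconstruction simply drops the least significant components.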

////////////////////////////////////////////////////////////////////////////////
/// Make histograms of the result of the analysis.
/// The option string says which histograms to create:
///  - X  Histogram original data
///  - P  Histogram principal components corresponding to
///       original data
///  - D  Histogram the difference between the original data
///       and the projection of the principal components onto a lower
///       dimensional subspace (2D histograms)
///  - E  Histogram the eigenvalues
///  - S  Histogram the square of the residues
///       (see `TPrincipal::SumOfSquareResiduals`)
///
/// The histograms will be named `<name>_<type><number>`, where `<name>`
/// is the first argument, `<type>` is one of X,P,D,E,S, and `<number>`
/// is the variable.

void TPrincipal::MakeHistograms(const char *name, Option_t *opt)
{
   Bool_t makeX = kFALSE;
   Bool_t makeD = kFALSE;
   Bool_t makeP = kFALSE;
   Bool_t makeE = kFALSE;
   Bool_t makeS = kFALSE;

   Int_t len = opt ? strlen(opt) : 0;
   Int_t i,j,k;
   for (i = 0; i < len; i++) {
      switch (opt[i]) {
         case 'X':
         case 'x':
            if (fStoreData)
               makeX = kTRUE;
            break;
         case 'd':
         case 'D':
            if (fStoreData)
               makeD = kTRUE;
            break;
         case 'P':
         case 'p':
            if (fStoreData)
               makeP = kTRUE;
            break;
         case 'E':
         case 'e':
            makeE = kTRUE;
            break;
         case 's':
         case 'S':
            if (fStoreData)
               makeS = kTRUE;
            break;
         default:
            Warning("MakeHistograms","Unknown option: %c",opt[i]);
      }
   }

   // If no option was given, then exit gracefully
   if (!makeX && !makeD && !makeP && !makeE && !makeS)
      return;

   // If the list of histograms doesn't exist, create it.
   if (!fHistograms)
      fHistograms = new TList;

   // Don't create the histograms if they are already in the TList.
   if (makeX && fHistograms->FindObject(TString::Format("%s_x000",name)))
      makeX = kFALSE;
   if (makeD && fHistograms->FindObject(TString::Format("%s_d000",name)))
      makeD = kFALSE;
   if (makeP && fHistograms->FindObject(TString::Format("%s_p000",name)))
      makeP = kFALSE;
   if (makeE && fHistograms->FindObject(TString::Format("%s_e",name)))
      makeE = kFALSE;
   if (makeS && fHistograms->FindObject(TString::Format("%s_s",name)))
      makeS = kFALSE;

   TH1F **hX = nullptr;
   TH2F **hD = nullptr;
   TH1F **hP = nullptr;
   TH1F *hE  = nullptr;
   TH1F *hS  = nullptr;

   // Initialize the arrays of histograms needed
   if (makeX)
      hX = new TH1F * [fNumberOfVariables];

   if (makeD)
      hD = new TH2F * [fNumberOfVariables];

   if (makeP)
      hP = new TH1F * [fNumberOfVariables];

   if (makeE){
      hE = new TH1F(TString::Format("%s_e",name), "Eigenvalues of Covariance matrix",
                    fNumberOfVariables,0,fNumberOfVariables);
      hE->SetXTitle("Eigenvalue");
      fHistograms->Add(hE);
   }

   if (makeS) {
      hS = new TH1F(TString::Format("%s_s",name),"E_{N}",
                    fNumberOfVariables,1,fNumberOfVariables+1);
      hS->SetXTitle("N");
      hS->SetYTitle("#sum_{i=1}^{M} (x_{i} - x'_{N,i})^{2}");
      fHistograms->Add(hS);
   }

   // Initialize sub elements of the histogram arrays
   for (i = 0; i < fNumberOfVariables; i++) {
      if (makeX) {
         // We allow 4 sigma spread in the original data in our
         // histogram.
         Double_t xlowb  = fMeanValues(i) - 4 * fSigmas(i);
         Double_t xhighb = fMeanValues(i) + 4 * fSigmas(i);
         Int_t    xbins  = (fNumberOfDataPoints > 0 && fNumberOfDataPoints < 100 ? 1 : fNumberOfDataPoints/100);
         hX[i] = new TH1F(TString::Format("%s_x%03d", name, i),
                          TString::Format("Pattern space, variable %d", i),
                          xbins,xlowb,xhighb);
         hX[i]->SetXTitle(Form("x_{%d}",i));
         fHistograms->Add(hX[i]);
      }

      if(makeD) {
         // The upper limit below is arbitrary!!!
         Double_t dlowb  = 0;
         Double_t dhighb = 20;
         Int_t    dbins  = (fNumberOfDataPoints > 0 && fNumberOfDataPoints < 100 ? 1 : fNumberOfDataPoints/100);
         hD[i] = new TH2F(TString::Format("%s_d%03d", name, i),
                          TString::Format("Distance from pattern to feature space, variable %d", i),
                          dbins,dlowb,dhighb,
                          fNumberOfVariables,1,fNumberOfVariables+1);
         hD[i]->SetXTitle(TString::Format("|x_{%d} - x'_{%d,N}|/#sigma_{%d}",i,i,i));
         hD[i]->SetYTitle("N");
         fHistograms->Add(hD[i]);
      }

      if(makeP) {
         // For some reason, the trace of the non-scaled matrix
         // (see TPrincipal::MakeNormalised) should enter here. Taken
         // from the LINTRA code.
         Double_t et     = TMath::Abs(fEigenValues(i) * fTrace);
         Double_t plowb  = -10 * TMath::Sqrt(et);
         Double_t phighb = -plowb;
         Int_t    pbins  = 100;
         hP[i] = new TH1F(TString::Format("%s_p%03d", name, i),
                          TString::Format("Feature space, variable %d", i),
                          pbins,plowb,phighb);
         hP[i]->SetXTitle(TString::Format("p_{%d}",i));
         fHistograms->Add(hP[i]);
      }

      if (makeE)
         // The eigenvalue histogram is easy
         hE->Fill(i,fEigenValues(i));

   }
   if (!makeX && !makeP && !makeD && !makeS) {
      if (hX)
         delete[] hX;
      if (hD)
         delete[] hD;
      if (hP)
         delete[] hP;
      return;
   }

   Double_t *x = nullptr;
   Double_t *p = new Double_t[fNumberOfVariables];
   Double_t *d = new Double_t[fNumberOfVariables];
   for (i = 0; i < fNumberOfDataPoints; i++) {

      // Zero arrays
      for (j = 0; j < fNumberOfVariables; j++)
         p[j] = d[j] = 0;

      // update the original data histogram
      x = (Double_t*)(GetRow(i));
      R__ASSERT(x);

      if (makeP||makeD||makeS)
         // calculate the corresponding principal component
         X2P(x,p);

      if (makeD || makeS) {
         // Calculate the difference between the original data, and the
         // same projected onto the principal components, and then to a
         // lower dimensional sub-space
         for (j = fNumberOfVariables; j > 0; j--) {
            P2X(p,d,j);

            for (k = 0; k < fNumberOfVariables; k++) {
               // We use the absolute value of the difference!
               d[k] = x[k] - d[k];

               if (makeS)
                  hS->Fill(j,d[k]*d[k]);

               if (makeD) {
                  d[k] = TMath::Abs(d[k]) / (fIsNormalised ? fSigmas(k) : 1);
                  (hD[k])->Fill(d[k],j);
               }
            }
         }
      }

      if (makeX||makeP) {
         // If we are asked to make any of these histograms, we have to
         // go here
         for (j = 0; j < fNumberOfVariables; j++) {
            if (makeX)
               (hX[j])->Fill(x[j]);

            if (makeP)
               (hP[j])->Fill(p[j]);
         }
      }
   }
   // Clean up
   if (hX)
      delete [] hX;
   if (hD)
      delete [] hD;
   if (hP)
      delete [] hP;
   if (d)
      delete [] d;
   if (p)
      delete [] p;

   // Normalize the residues
   if (makeS)
      hS->Scale(Double_t(1.)/fNumberOfDataPoints);
}

////////////////////////////////////////////////////////////////////////////////
/// Normalize the covariance matrix.

void TPrincipal::MakeNormalised()
{
   Int_t i,j;
   for (i = 0; i < fNumberOfVariables; i++) {
      fSigmas(i) = TMath::Sqrt(fCovarianceMatrix(i,i));
      if (fIsNormalised)
         for (j = 0; j <= i; j++)
            fCovarianceMatrix(i,j) /= (fSigmas(i) * fSigmas(j));

      fTrace += fCovarianceMatrix(i,i);
   }

   // Fill remaining parts of matrix, and scale.
   for (i = 0; i < fNumberOfVariables; i++)
      for (j = 0; j <= i; j++) {
         fCovarianceMatrix(i,j) /= fTrace;
         fCovarianceMatrix(j,i)  = fCovarianceMatrix(i,j);
      }

}

////////////////////////////////////////////////////////////////////////////////
/// Generate the file `<classname>PCA.cxx` which contains the
/// implementation of two methods:
/// ~~~ {.cpp}
///    void <classname>::X2P(Double_t *x, Double_t *p)
///    void <classname>::P2X(Double_t *p, Double_t *x, Int_t nTest)
/// ~~~
/// which do the same as TPrincipal::X2P and TPrincipal::P2X
/// respectively. Please refer to these methods.
///
/// Further, the public static members:
/// ~~~ {.cpp}
///    Int_t    <classname>::fgNVariables
///    Double_t <classname>::fgEigenValues[]
///    Double_t <classname>::fgEigenVectors[]
///    Double_t <classname>::fgMeanValues[]
///    Double_t <classname>::fgSigmaValues[]
/// ~~~
/// are initialized, and assumed to exist. The class declaration is
/// assumed to be in `<classname>.h` and assumed to be provided by the
/// user.
///
/// See TPrincipal::MakeRealCode for a list of options.
///
/// The minimal class definition is:
/// ~~~ {.cpp}
///   class <classname> {
///   public:
///      static Int_t    fgNVariables;
///      static Double_t fgEigenVectors[];
///      static Double_t fgEigenValues[];
///      static Double_t fgMeanValues[];
///      static Double_t fgSigmaValues[];
///
///      void X2P(Double_t *x, Double_t *p);
///      void P2X(Double_t *p, Double_t *x, Int_t nTest);
///   };
/// ~~~
/// Whether the methods `<classname>::%X2P` and `<classname>::%P2X` should
/// be static or not is up to the user.

void TPrincipal::MakeMethods(const char *classname, Option_t *opt)
{

   MakeRealCode(TString::Format("%sPCA.cxx", classname), classname, opt);
}


////////////////////////////////////////////////////////////////////////////////
/// Perform the principal components analysis.
/// This is done in several stages in the TMatrixDSymEigen class:
/// - Transform the covariance matrix into a tridiagonal matrix.
/// - Find the eigenvalues and vectors of the tridiagonal matrix.

void TPrincipal::MakePrincipals()
{
   // Normalize covariance matrix
   MakeNormalised();

   TMatrixDSym sym; sym.Use(fCovarianceMatrix.GetNrows(),fCovarianceMatrix.GetMatrixArray());
   TMatrixDSymEigen eigen(sym);
   fEigenVectors = eigen.GetEigenVectors();
   fEigenValues  = eigen.GetEigenValues();
   // make sure that eigenvalues are positive
   for (Int_t i = 0; i < fNumberOfVariables; i++) {
      if (fEigenValues[i] < 0) fEigenValues[i] = -fEigenValues[i];
   }
}

////////////////////////////////////////////////////////////////////////////////
/// This is the method that actually generates the code for the
/// transformations to and from feature space and pattern space.
/// It's called by TPrincipal::MakeCode and TPrincipal::MakeMethods.
///
/// The options are: NONE so far

void TPrincipal::MakeRealCode(const char *filename, const char *classname,
                              Option_t *)
{
   Bool_t  isMethod = classname[0] == '\0' ? kFALSE : kTRUE;
   TString prefix;
   const char *cv_qual = isMethod ? "" : "static ";
   if (isMethod)
      prefix.Form("%s::", classname);

   std::ofstream outFile(filename,std::ios::out|std::ios::trunc);
   if (!outFile) {
      Error("MakeRealCode","couldn't open output file '%s'",filename);
      return;
   }

   std::cout << "Writing on file \"" << filename << "\" ... " << std::flush;
   //
   // Write header of file
   //
   // Emacs mode line ;-)
   outFile << "// -*- mode: c++ -*-" << std::endl;
   // Info about creator
   outFile << "// " << std::endl
           << "// File " << filename
           << " generated by TPrincipal::MakeCode" << std::endl;
   // Time stamp
   TDatime date;
   outFile << "// on " << date.AsString() << std::endl;
   // ROOT version info
   outFile << "// ROOT version " << gROOT->GetVersion()
           << std::endl << "//" << std::endl;
   // General information on the code
   outFile << "// This file contains the functions " << std::endl
           << "//" << std::endl
           << "//    void " << prefix
           << "X2P(Double_t *x, Double_t *p); " << std::endl
           << "//    void " << prefix
           << "P2X(Double_t *p, Double_t *x, Int_t nTest);"
           << std::endl << "//" << std::endl
           << "// The first for transforming original data x in " << std::endl
           << "// pattern space, to principal components p in " << std::endl
           << "// feature space. The second function is for the" << std::endl
           << "// inverse transformation, but using only nTest" << std::endl
           << "// of the principal components in the expansion" << std::endl
           << "// " << std::endl
           << "// See TPrincipal class documentation for more "
           << "information " << std::endl << "// " << std::endl;
   // Header files
   if (isMethod)
      // If these are methods, we need the class header
      outFile << "#include \"" << classname << ".h\"" << std::endl;
   else
      // otherwise, we need the typedefs of Int_t and Double_t
      outFile << "#include <Rtypes.h> // needed for Double_t etc" << std::endl;

   //
   // Now for the data
   //
   // We make the eigenvector matrix, eigenvalue vector, sigma vector,
   // and mean value vector static, since all are needed in both
   // functions. Also, the number of variables is stored in a static
   // variable.
   outFile << "//" << std::endl
           << "// Static data variables" << std::endl
           << "//" << std::endl;
   outFile << cv_qual << "Int_t    " << prefix << "gNVariables = "
           << fNumberOfVariables << ";" << std::endl;

   // Assign the values to the eigenvector matrix. The elements are
   // stored row-wise, that is element
   //    M[i][j] = e[i * nVariables + j]
   // where i and j are zero-based.
   outFile << std::endl << "// Assignment of eigenvector matrix." << std::endl
           << "// Elements are stored row-wise, that is" << std::endl
           << "//    M[i][j] = e[i * nVariables + j] " << std::endl
           << "// where i and j are zero-based" << std::endl;
   outFile << cv_qual << "Double_t " << prefix
           << "gEigenVectors[] = {" << std::flush;
   Int_t i,j;
   for (i = 0; i < fNumberOfVariables; i++) {
      for (j = 0; j < fNumberOfVariables; j++) {
         Int_t index = i * fNumberOfVariables + j;
         outFile << (index != 0 ? "," : "" ) << std::endl
                 << "  " << fEigenVectors(i,j) << std::flush;
      }
   }
   outFile << "};" << std::endl << std::endl;

   // Assignment to eigenvalue vector. Zero-based.
   outFile << "// Assignment to eigen value vector. Zero-based." << std::endl;
   outFile << cv_qual << "Double_t " << prefix
           << "gEigenValues[] = {" << std::flush;
   for (i = 0; i < fNumberOfVariables; i++)
      outFile << (i != 0 ? "," : "") << std::endl
              << "  " << fEigenValues(i) << std::flush;
   outFile << std::endl << "};" << std::endl << std::endl;

   // Assignment to mean value vector. Zero-based.
   outFile << "// Assignment to mean value vector. Zero-based." << std::endl;
   outFile << cv_qual << "Double_t " << prefix
           << "gMeanValues[] = {" << std::flush;
   for (i = 0; i < fNumberOfVariables; i++)
      outFile << (i != 0 ? "," : "") << std::endl
              << "  " << fMeanValues(i) << std::flush;
   outFile << std::endl << "};" << std::endl << std::endl;

   // Assignment to sigma value vector. Zero-based.
   outFile << "// Assignment to sigma value vector. Zero-based." << std::endl;
   outFile << cv_qual << "Double_t " << prefix
           << "gSigmaValues[] = {" << std::flush;
   for (i = 0; i < fNumberOfVariables; i++)
      outFile << (i != 0 ? "," : "") << std::endl
              << "  " << (fIsNormalised ? fSigmas(i) : 1) << std::flush;
   //         << "  " << fSigmas(i) << std::flush;
   outFile << std::endl << "};" << std::endl << std::endl;

1021 //
1022 // Finally we reach the functions themselves
1023 //
1024 // First: void x2p(Double_t *x, Double_t *p);
1025 //
1026 outFile << "// " << std::endl
1027 << "// The "
1028 << (isMethod ? "method " : "function ")
1029 << " void " << prefix
1030 << "X2P(Double_t *x, Double_t *p)"
1031 << std::endl << "// " << std::endl;
1032 outFile << "void " << prefix
1033 << "X2P(Double_t *x, Double_t *p) {" << std::endl
1034 << " for (Int_t i = 0; i < gNVariables; i++) {" << std::endl
1035 << " p[i] = 0;" << std::endl
1036 << " for (Int_t j = 0; j < gNVariables; j++)" << std::endl
1037 << " p[i] += (x[j] - gMeanValues[j]) " << std::endl
1038 << " * gEigenVectors[j * gNVariables + i] "
1039 << "/ gSigmaValues[j];" << std::endl << std::endl << " }"
1040 << std::endl << "}" << std::endl << std::endl;
1041 //
1042 // Now: void p2x(Double_t *p, Double_t *x, Int_t nTest);
1043 //
1044 outFile << "// " << std::endl << "// The "
1045 << (isMethod ? "method " : "function ")
1046 << "void " << prefix
1047 << "P2X(Double_t *p, Double_t *x, Int_t nTest)"
1048 << std::endl << "// " << std::endl;
1049 outFile << "void " << prefix
1050 << "P2X(Double_t *p, Double_t *x, Int_t nTest) {" << std::endl
1051 << " for (Int_t i = 0; i < gNVariables; i++) {" << std::endl
1052 << " x[i] = gMeanValues[i];" << std::endl
1053 << " for (Int_t j = 0; j < nTest; j++)" << std::endl
1054 << " x[i] += p[j] * gSigmaValues[i] " << std::endl
1055 << " * gEigenVectors[i * gNVariables + j];" << std::endl
1056 << " }" << std::endl << "}" << std::endl << std::endl;
1057
1058 // EOF
1059 outFile << "// EOF for " << filename << std::endl;
1060
1061 // Close the file
1062 outFile.close();
1063
1064 std::cout << "done" << std::endl;
1065}
1066
1067////////////////////////////////////////////////////////////////////////////////
1068 /// Calculate x as a function of the nTest most significant
1069 /// principal components p, and return the result in x.
1070 /// It's the user's responsibility to make sure that both x and p are
1071 /// of the right size (i.e., memory must be allocated for x).
1072
1073 void TPrincipal::P2X(const Double_t *p, Double_t *x, Int_t nTest)
1074{
1075 for (Int_t i = 0; i < fNumberOfVariables; i++){
1076 x[i] = fMeanValues(i);
1077 for (Int_t j = 0; j < nTest; j++)
1078 x[i] += p[j] * (fIsNormalised ? fSigmas(i) : 1)
1079 * fEigenVectors(i,j);
1080 }
1081
1082}
1083
1084////////////////////////////////////////////////////////////////////////////////
1085/// Print the statistics
1086/// Options are
1087/// - M Print mean values of original data
1088/// - S Print sigma values of original data
1089/// - E Print eigenvalues of covariance matrix
1090/// - V Print eigenvectors of covariance matrix
1091/// Default is MSE
1092
1093 void TPrincipal::Print(Option_t *opt) const
1094{
1095 Bool_t printV = kFALSE;
1096 Bool_t printM = kFALSE;
1097 Bool_t printS = kFALSE;
1098 Bool_t printE = kFALSE;
1099
1100 Int_t len = opt ? strlen(opt) : 0;
1101 for (Int_t i = 0; i < len; i++) {
1102 switch (opt[i]) {
1103 case 'V':
1104 case 'v':
1105 printV = kTRUE;
1106 break;
1107 case 'M':
1108 case 'm':
1109 printM = kTRUE;
1110 break;
1111 case 'S':
1112 case 's':
1113 printS = kTRUE;
1114 break;
1115 case 'E':
1116 case 'e':
1117 printE = kTRUE;
1118 break;
1119 default:
1120 Warning("Print", "Unknown option '%c'",opt[i]);
1121 break;
1122 }
1123 }
1124
1125 if (printM||printS||printE) {
1126 std::cout << " Variable # " << std::flush;
1127 if (printM)
1128 std::cout << "| Mean Value " << std::flush;
1129 if (printS)
1130 std::cout << "| Sigma " << std::flush;
1131 if (printE)
1132 std::cout << "| Eigenvalue" << std::flush;
1133 std::cout << std::endl;
1134
1135 std::cout << "-------------" << std::flush;
1136 if (printM)
1137 std::cout << "+------------" << std::flush;
1138 if (printS)
1139 std::cout << "+------------" << std::flush;
1140 if (printE)
1141 std::cout << "+------------" << std::flush;
1142 std::cout << std::endl;
1143
1144 for (Int_t i = 0; i < fNumberOfVariables; i++) {
1145 std::cout << std::setw(12) << i << " " << std::flush;
1146 if (printM)
1147 std::cout << "| " << std::setw(10) << std::setprecision(4)
1148 << fMeanValues(i) << " " << std::flush;
1149 if (printS)
1150 std::cout << "| " << std::setw(10) << std::setprecision(4)
1151 << fSigmas(i) << " " << std::flush;
1152 if (printE)
1153 std::cout << "| " << std::setw(10) << std::setprecision(4)
1154 << fEigenValues(i) << " " << std::flush;
1155 std::cout << std::endl;
1156 }
1157 std::cout << std::endl;
1158 }
1159
1160 if(printV) {
1161 for (Int_t i = 0; i < fNumberOfVariables; i++) {
1162 std::cout << "Eigenvector # " << i << std::flush;
1163 TVectorD v(fNumberOfVariables);
1164 v = TMatrixDColumn_const(fEigenVectors,i);
1165 v.Print();
1166 }
1167 }
1168}
1169
1170////////////////////////////////////////////////////////////////////////////////
1171/// Calculates the sum of the square residuals, that is
1172///
1173/// \f[
1174/// E_N = \sum_{i=0}^{P-1} \left(x_i - x^\prime_i\right)^2
1175/// \f]
1176///
1177 /// where \f$x^\prime_i = \sum_{j=0}^{N-1} p_j e_{ij}\f$
1178 /// is the \f$i^{\mbox{th}}\f$ component of the data vector reconstructed from the
1179 /// first \f$N\f$ principal components; i.e., \f$E_N\f$ is the square of the distance
1180 /// from \f$x\f$ to the space spanned by the first \f$N\f$ eigenvectors. Note that s is accumulated with +=, so the caller should zero it first.
1181
1182 void TPrincipal::SumOfSquareResiduals(const Double_t *x, Double_t *s)
1183{
1184
1185 if (!x)
1186 return;
1187
1188 Double_t p[100];
1189 Double_t xp[100];
1190
1191 X2P(x,p);
1192 for (Int_t i = fNumberOfVariables-1; i >= 0; i--) {
1193 P2X(p,xp,i);
1194 for (Int_t j = 0; j < fNumberOfVariables; j++) {
1195 s[i] += (x[j] - xp[j])*(x[j] - xp[j]);
1196 }
1197 }
1198}
1199
1200////////////////////////////////////////////////////////////////////////////////
1201 /// Test the PCA by calculating the sum of square residuals
1202/// (see method SumOfSquareResiduals), and display the histogram
1203
1204 void TPrincipal::Test(Option_t *)
1205{
1206 MakeHistograms("pca","S");
1207
1208 if (!fStoreData)
1209 return;
1210
1211 TH1 *pca_s = nullptr;
1212 if (fHistograms) pca_s = (TH1*)fHistograms->FindObject("pca_s");
1213 if (!pca_s) {
1214 Warning("Test", "Couldn't get histogram of square residuals");
1215 return;
1216 }
1217
1218 pca_s->Draw();
1219}
1220
1221////////////////////////////////////////////////////////////////////////////////
1222/// Calculate the principal components from the original data vector
1223/// x, and return it in p.
1224///
1225 /// It's the user's responsibility to make sure that both x and p are
1226/// of the right size (i.e., memory must be allocated for p).
1227
1228 void TPrincipal::X2P(const Double_t *x, Double_t *p)
1229{
1230 for (Int_t i = 0; i < fNumberOfVariables; i++){
1231 p[i] = 0;
1232 for (Int_t j = 0; j < fNumberOfVariables; j++)
1233 p[i] += (x[j] - fMeanValues(j)) * fEigenVectors(j,i) /
1234 (fIsNormalised ? fSigmas(j) : 1);
1235 }
1236
1237}