# class TLinearFitter: public TVirtualFitter

```
The Linear Fitter - fitting functions that are LINEAR IN PARAMETERS

The linear fitter fits a set of data points with a linear
combination of specified functions. Note that "linear" in the name
refers only to the model's dependence on its parameters; the specified
functions themselves can be nonlinear.
The general form of this kind of model is

y(x) = a[0] + a[1]*f[1](x) + ... + a[n]*f[n](x)

Functions f are fixed functions of x. For example, fitting with a
polynomial is linear fitting in this sense.

The fitting method

The fit is performed using the Normal Equations method with Cholesky
decomposition.

Why should it be used?

The linear fitter is considerably faster than general non-linear
fitters and doesn't require initial parameter values to be set.

Using the fitter:

1.1 To store or not to store the input data?
- There are 2 options in the constructor - to store or not
store the input data. The advantages of storing the data
are that you'll be able to reset the fitting model without
adding all the points again, and that for very large sets
of points the chisquare is calculated more precisely.
The obvious disadvantage is the amount of memory used to
keep all the points.
- Before you start adding the points, you can change the
store/not store option with the StoreData() method.
1.2 The data can be added:
- simply point by point - AddPoint() method
- an array of points at once:
If the data is already stored in arrays, it can be assigned
to the linear fitter with the AssignData() method without
physically copying bytes, thanks to the Use() method of the
TVector and TMatrix classes.

2.Setting the formula
2.1 The linear formula syntax:
-Additive parts are separated by 2 plus signs "++"
--for example "1 ++ x" - for fitting a straight line
-All standard functions understood by TFormula can be used
--TMath functions can be used too
-Functions used as additive parts shouldn't have any parameters,
even if those parameters are set.
--for example, if normalizing a sum of a gaus(0, 1) and a
gaus(0, 2), don't use the built-in "gaus" of TFormula,
because it has parameters; use TMath::Gaus(x, 0, 1) instead.
-Polynomials can be used like "pol3", .."polN"
-When fitting a formula with more than 3 dimensions, the variables
should be numbered as follows:
-- x[0], x[1], x[2]... For example, to fit "1 ++ x[0] ++ x[1] ++ x[2] ++ x[3]*x[3]"
2.2 Setting the formula:
2.2.1 If fitting a 1-2-3-dimensional formula, one can create a
TF123 based on a linear expression and pass this function
to the fitter:
--Example:
TLinearFitter *lf = new TLinearFitter();
TF2 *f2 = new TF2("f2", "x ++ y ++ x*x*y*y", -2, 2, -2, 2);
lf->SetFormula(f2);
--The results of the fit are then stored in the function,
just like when the TH1::Fit or TGraph::Fit is used
--A linear function of this kind is by no means different
from any other function, it can be drawn, evaluated, etc.

--For multidimensional fitting, TFormulas of the form:
x[0]++...++x[n] can be used
2.2.2 There is no need to create the function if you don't want to,
the formula can be set by expression:
--Example:
// 2 is the number of dimensions
TLinearFitter *lf = new TLinearFitter(2);
lf->SetFormula("x ++ y ++ x*x*y*y");

2.2.3 The fastest functions to compute are polynomials and hyperplanes.
--Polynomials are set the usual way: "pol1", "pol2",...
--Hyperplanes are set by expression "hyp3", "hyp4", ...
---The "hypN" expressions only work when the linear fitter
is used directly, not through TH1::Fit or TGraph::Fit.
To fit a graph or a histogram with a hyperplane, define
the function as "1++x++y".
---A constant term is assumed for a hyperplane, when using
the "hypN" expression, so "hyp3" is in fact fitting with
"1++x++y++z" function.
--Fitting hyperplanes is much faster than fitting other
expressions so if performance is vital, calculate the
function values beforehand and give them to the fitter
as variables
--Example:
You want to fit "sin(x)++cos(2*x)" very fast. Calculate
sin(x) and cos(2*x) beforehand and store them in array *data.
Then:
TLinearFitter *lf=new TLinearFitter(2, "hyp2");
lf->AssignData(npoint, 2, data, y);

2.3 Resetting the formula
2.3.1 If the input data is stored (or added via AssignData() function),
the fitting formula can be reset without re-adding all the points.
--Example:
TLinearFitter *lf=new TLinearFitter(1, "1++x++x*x");
lf->AssignData(n, 1, x, y, e);
lf->Eval();
// Looking at the parameter significance, you see
// that the fit might improve if you take out
// the constant term
lf->SetFormula("x++x*x");
lf->Eval();

2.3.2 If the input data is not stored, the fitter will have to be
cleared and the data will have to be added again to try a
different formula.

3.Accessing the fit results
3.1 There are methods in the fitter to access all relevant information:
--GetParameters, GetCovarianceMatrix, etc
--the t-values of parameters and their significance can be reached by
GetParTValue() and GetParSignificance() methods
3.2 If fitting with a pre-defined TF123, the fit results are also
written into this function.

4.Robust fitting - Least Trimmed Squares regression (LTS)
Outliers are atypical (by definition), infrequent observations: data points
which do not appear to follow the characteristic distribution of the rest
of the data. These may reflect genuine properties of the underlying
phenomenon (variable), or be due to measurement errors or anomalies which
shouldn't be modelled. (StatSoft electronic textbook)

Even a single gross outlier can greatly influence the results of a
least-squares fitting procedure, and in this case the use of robust
(resistant) methods is recommended.

The method implemented here is based on the article and algorithm:
"Computing LTS Regression for Large Data Sets" by
P.J.Rousseeuw and Katrien Van Driessen
The idea of the method is to find the fitting coefficients for a subset
of h observations (out of n) with the smallest sum of squared residuals.
The size of the subset h should lie between (npoints + nparameters + 1)/2
and n, and represents the minimal number of good points in the dataset.
The default value is (npoints + nparameters + 1)/2, but if you are sure
that the data contains fewer outliers, it's better to change h according
to your data.

To perform a robust fit, call the EvalRobust() function instead of Eval() after
adding the points and setting the fitting function.
Note that standard errors on parameters are not computed!

```
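The fitting method named in the class description (normal equations solved with a Cholesky decomposition) can be illustrated with a small standalone sketch. It does not use ROOT and is not the TLinearFitter implementation: `fitLinearModel` is a hypothetical helper, the basis functions are passed as callables of one variable, and every point is given unit weight.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper, not part of TLinearFitter: fits y ~ sum_k a[k]*basis[k](x)
// by forming the normal equations (A^T A) a = A^T y and solving them with a
// Cholesky decomposition A^T A = L L^T. All points have unit weight.
std::vector<double> fitLinearModel(
    const std::vector<std::function<double(double)>>& basis,
    const std::vector<double>& x, const std::vector<double>& y)
{
    assert(x.size() == y.size());
    const std::size_t n = basis.size();
    std::vector<std::vector<double>> ata(n, std::vector<double>(n, 0.0));
    std::vector<double> atb(n, 0.0);

    // Accumulate A^T A and A^T y one point at a time.
    std::vector<double> f(n);
    for (std::size_t p = 0; p < x.size(); ++p) {
        for (std::size_t i = 0; i < n; ++i) f[i] = basis[i](x[p]);
        for (std::size_t i = 0; i < n; ++i) {
            atb[i] += f[i] * y[p];
            for (std::size_t j = 0; j < n; ++j) ata[i][j] += f[i] * f[j];
        }
    }

    // Cholesky factorisation: overwrite the lower triangle of ata with L.
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t k = 0; k < j; ++k) ata[j][j] -= ata[j][k] * ata[j][k];
        assert(ata[j][j] > 0.0);  // the matrix must be positive definite
        ata[j][j] = std::sqrt(ata[j][j]);
        for (std::size_t i = j + 1; i < n; ++i) {
            for (std::size_t k = 0; k < j; ++k) ata[i][j] -= ata[i][k] * ata[j][k];
            ata[i][j] /= ata[j][j];
        }
    }

    // Forward substitution L z = A^T y, then back substitution L^T a = z.
    std::vector<double> a(atb);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < i; ++k) a[i] -= ata[i][k] * a[k];
        a[i] /= ata[i][i];
    }
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t k = i + 1; k < n; ++k) a[i] -= ata[k][i] * a[k];
        a[i] /= ata[i][i];
    }
    return a;
}
```

Calling it with the basis {1, x, x*x} corresponds to the formula "1 ++ x ++ x*x". No starting values are needed, which is the practical advantage the class description mentions.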

## Function Members (Methods)

public:
protected:
- `virtual void TObject::DoError(int level, const char* location, const char* fmt, va_list va) const`
- `void TObject::MakeZombie()`
- `TVirtualFitter& TVirtualFitter::operator=(const TVirtualFitter& tvf)`
private:
- `void AddToDesign(Double_t* x, Double_t y, Double_t e)`
- `void ComputeTValues()`
- `void CreateSubset(Int_t ntotal, Int_t h, Int_t* index)`
- `Double_t CStep(Int_t step, Int_t h, Double_t* residuals, Int_t* index, Int_t* subdat, Int_t start, Int_t end)`
- `Int_t Graph2DLinearFitter(Double_t h)`
- `Int_t GraphLinearFitter(Double_t h)`
- `Int_t HistLinearFitter()`
- `Bool_t Linf()`
- `Int_t MultiGraphLinearFitter(Double_t h)`
- `Int_t Partition(Int_t nmini, Int_t* indsubdat)`
- `void RDraw(Int_t* subdat, Int_t* indsubdat)`

## Data Members

private:
- `enum TObject::EStatusBits { kCanDelete, kMustCleanup, kObjInCanvas, kIsReferenced, kHasUUID, kCannotPick, kNoContextMenu, kInvalidObject };`
- `enum TObject::[unnamed] { kIsOnHeap, kNotDeleted, kZombie, kBitMask, kSingleKey, kOverwrite, kWriteDelete };`
protected:
| Type | Member | Description |
|---|---|---|
| `Double_t*` | `TVirtualFitter::fCache` | [fCacheSize] array of points data (`fNpoints*fPointSize < fCacheSize` words) |
| `Int_t` | `TVirtualFitter::fCacheSize` | Size of the fCache array |
| `void` | `TVirtualFitter::fFCN` | |
| `TMethodCall*` | `TVirtualFitter::fMethodCall` | Pointer to MethodCall in case of interpreted function |
| `TString` | `TNamed::fName` | object identifier |
| `Int_t` | `TVirtualFitter::fNpoints` | Number of points to fit |
| `TObject*` | `TVirtualFitter::fObjectFit` | pointer to object being fitted |
| `Foption_t` | `TVirtualFitter::fOption` | struct with the fit options |
| `Int_t` | `TVirtualFitter::fPointSize` | Number of words per point in the cache |
| `TString` | `TNamed::fTitle` | object title |
| `TObject*` | `TVirtualFitter::fUserFunc` | pointer to user theoretical function (a TF1*) |
| `Int_t` | `TVirtualFitter::fXfirst` | first bin on X axis |
| `Int_t` | `TVirtualFitter::fXlast` | last bin on X axis |
| `Int_t` | `TVirtualFitter::fYfirst` | first bin on Y axis |
| `Int_t` | `TVirtualFitter::fYlast` | last bin on Y axis |
| `Int_t` | `TVirtualFitter::fZfirst` | first bin on Z axis |
| `Int_t` | `TVirtualFitter::fZlast` | last bin on Z axis |
| `static TString` | `TVirtualFitter::fgDefault` | name of the default fitter ("Minuit","Fumili",etc) |
| `static Double_t` | `TVirtualFitter::fgErrorDef` | Error definition (default=1) |
| `static TVirtualFitter*` | `TVirtualFitter::fgFitter` | Current fitter (default TFitter) |
| `static Int_t` | `TVirtualFitter::fgMaxiter` | Maximum number of iterations |
| `static Int_t` | `TVirtualFitter::fgMaxpar` | Maximum number of fit parameters for current fitter |
| `static Double_t` | `TVirtualFitter::fgPrecision` | maximum precision |
private:
| Type | Member | Description |
|---|---|---|
| `TVectorD` | `fAtb` | vector Atb |
| `TVectorD` | `fAtbTemp` | ! temporary vector, used for num.stability |
| `TVectorD` | `fAtbTemp2` | ! |
| `TVectorD` | `fAtbTemp3` | ! |
| `Double_t` | `fChisquare` | Chisquare of the fit |
| `TMatrixDSym` | `fDesign` | matrix AtA |
| `TMatrixDSym` | `fDesignTemp` | ! temporary matrix, used for num.stability |
| `TMatrixDSym` | `fDesignTemp2` | ! |
| `TMatrixDSym` | `fDesignTemp3` | ! |
| `TVectorD` | `fE` | the errors if they are known |
| `TBits` | `fFitsample` | indices of points, used in the robust fit |
| `Bool_t*` | `fFixedParams` | [fNfixed] array of fixed/released params |
| `char*` | `fFormula` | the formula |
| `Int_t` | `fFormulaSize` | length of the formula |
| `TObjArray` | `fFunctions` | array of basis functions |
| `Int_t` | `fH` | number of good points in robust fit |
| `TFormula*` | `fInputFunction` | the function being fit |
| `Bool_t` | `fIsSet` | Has the formula been set? |
| `Int_t` | `fNdim` | number of dimensions in the formula |
| `Int_t` | `fNfixed` | number of fixed parameters |
| `Int_t` | `fNfunctions` | number of basis functions |
| `Int_t` | `fNpoints` | number of points |
| `TMatrixDSym` | `fParCovar` | matrix of parameters' covariances |
| `TVectorD` | `fParSign` | significance levels of parameters |
| `TVectorD` | `fParams` | vector of parameters |
| `Bool_t` | `fRobust` | true when performing a robust fit |
| `Int_t` | `fSpecial` | =100+n if fitting a polynomial of deg.n |
| `Bool_t` | `fStoreData` | Is the data stored? |
| `TVectorD` | `fTValues` | T-Values of parameters |
| `Double_t` | `fVal[1000]` | ! temporary |
| `TMatrixD` | `fX` | values of x |
| `TVectorD` | `fY` | the values being fit |
| `Double_t` | `fY2` | sum of square of y, used for chisquare |
| `Double_t` | `fY2Temp` | ! temporary variable used for num.stability |

## Function documentation

```default c-tor, input data is stored
If you don't want to store the input data,
run the function StoreData(kFALSE) after constructor
```
TLinearFitter(Int_t ndim)
```The parameter stands for number of dimensions in the fitting formula
The input data is stored. If you don't want to store the input data,
run the function StoreData(kFALSE) after constructor
```
TLinearFitter(Int_t ndim, const char* formula, Option_t* opt = "D")
```First parameter stands for number of dimensions in the fitting formula
Second parameter is the fitting formula: see class description for formula syntax
Options:
The option is to store or not to store the data
If you don't want to store the data, choose "" for the option, or run
the StoreData(kFALSE) member function after the constructor
```
TLinearFitter(TFormula* function, Option_t* opt = "D")
```This constructor uses a linear function. How to create it?
TFormula now accepts formulas of the following kind:
TFormula("f", "x++y++z++x*x") or
TFormula("f", "x[0]++x[1]++x[2]*x[2]");
Other than the look, it's in no
way different from the regular formula, it can be evaluated,
drawn, etc.
The option is to store or not to store the data
If you don't want to store the data, choose "" for the option, or run
the StoreData(kFALSE) member function after the constructor
```
TLinearFitter(const TLinearFitter& tlf)
``` Copy ctor
```

``` Linear fitter cleanup.
```
TLinearFitter& operator=(const TLinearFitter& tlf)
``` Assignment operator
```
```Add another linear fitter to this linear fitter. Points and design matrices
are added, but the previous fitting results (if any) are deleted.
The fitters must have the same formulas (this is not checked). Fixed parameters are not changed.
```
void AddPoint(Double_t* x, Double_t y, Double_t e = 1)
```Adds 1 point to the fitter.
First parameter stands for the coordinates of the point, where the function is measured
Second parameter - the value being fitted
Third parameter - weight (measurement error) of this point (=1 by default)
```
void AssignData(Int_t npoints, Int_t xncols, Double_t* x, Double_t* y, Double_t* e = 0)
```Use this function when you already have all the data in arrays
and don't want to copy them into the fitter. This function relies on the Use() method
of TVectorD and TMatrixD, so no bytes are physically moved around.
First parameter - number of points to fit
Second parameter - number of variables in the model
Third parameter - the variables of the model, stored in the following way:
(x0(0), x1(0), x2(0), x3(0), x0(1), x1(1), x2(1), x3(1), ...)
```
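The storage order described for the third parameter (all variables of point 0, then all variables of point 1, and so on) can be sketched with two small standalone helpers. Both `packPointMajor` and `packedValue` are hypothetical names introduced here for illustration; they are not part of ROOT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: packs per-variable columns into the point-major layout
// described above, i.e. all variables of point 0, then all variables of point 1, ...
std::vector<double> packPointMajor(const std::vector<std::vector<double>>& columns)
{
    assert(!columns.empty());
    const std::size_t nvars = columns.size();
    const std::size_t npoints = columns[0].size();
    std::vector<double> packed(npoints * nvars);
    for (std::size_t v = 0; v < nvars; ++v) {
        assert(columns[v].size() == npoints);
        for (std::size_t p = 0; p < npoints; ++p)
            packed[p * nvars + v] = columns[v][p];
    }
    return packed;
}

// Value of variable `var` at point `point` in a packed array with `xncols` variables.
double packedValue(const std::vector<double>& packed, std::size_t xncols,
                   std::size_t point, std::size_t var)
{
    return packed[point * xncols + var];
}
```

A buffer packed this way would be passed as the `x` argument, with `xncols` equal to the number of columns. Because the fitter uses the arrays in place rather than copying them, it seems prudent to keep the buffer alive for as long as the fitter needs the data.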
void AddToDesign(Double_t* x, Double_t y, Double_t e)
```Add a point to the AtA matrix and to the Atb vector.
```
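Written out, adding one point $(x_i, y_i, e_i)$ can be seen as a rank-one update of the two accumulators. The form below assumes the usual least-squares convention in which each point is weighted by $1/e_i^2$; the source only says that $e$ is the weight (measurement error) of the point, so the exact weighting is an assumption. The $f_j$ are the basis functions of the class description, with $f_0 \equiv 1$ when a constant term is present.

```latex
(A^{T}A)_{jk} \;\leftarrow\; (A^{T}A)_{jk} + \frac{f_j(x_i)\, f_k(x_i)}{e_i^{2}},
\qquad
(A^{T}b)_{j} \;\leftarrow\; (A^{T}b)_{j} + \frac{f_j(x_i)\, y_i}{e_i^{2}}
```

Once all points have been added, the parameters $a$ follow from the normal equations $(A^{T}A)\,a = A^{T}b$.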
void Clear(Option_t* option = "")
```Clears everything. Used in TH1::Fit and TGraph::Fit().
```
void ClearPoints()
```To be used when different sets of points are fitted with the same formula.
```
void Chisquare()
```Calculates the chisquare.
```
void ComputeTValues()
``` Computes parameters' t-values and significance
```
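As a minimal sketch of what a t-value is, the standalone helper below divides each parameter by its standard error, taken as the square root of the corresponding diagonal element of the covariance matrix. This is the textbook definition; `parameterTValues` is a hypothetical name, and the significance, which requires the Student's t distribution, is not computed here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: t-value of each parameter, defined as the parameter
// divided by its standard error (square root of the covariance diagonal).
std::vector<double> parameterTValues(const std::vector<double>& params,
                                     const std::vector<std::vector<double>>& covariance)
{
    assert(covariance.size() == params.size());
    std::vector<double> t(params.size());
    for (std::size_t i = 0; i < params.size(); ++i) {
        assert(covariance[i].size() == params.size());
        assert(covariance[i][i] > 0.0);  // a variance must be positive
        t[i] = params[i] / std::sqrt(covariance[i][i]);
    }
    return t;
}
```

A t-value close to zero suggests that the corresponding term contributes little to the fit, which is the kind of reasoning used when deciding whether to drop a term from the formula.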

``` Perform the fit and evaluate the parameters
Returns 0 if the fit is ok, 1 if there are errors
```
void FixParameter(Int_t ipar)
```Fixes parameter #ipar at its current value.
```
void FixParameter(Int_t ipar, Double_t parvalue)
```Fixes parameter #ipar at value parvalue.
```
void ReleaseParameter(Int_t ipar)
```Releases parameter #ipar.
```
void GetAtbVector(TVectorD& v)
```Get the Atb vector - a vector, used for internal computations
```

``` Get the Chisquare.
```
void GetConfidenceIntervals(Int_t n, Int_t ndim, const Double_t* x, Double_t* ci, Double_t cl = 0.95)
```Computes point-by-point confidence intervals for the fitted function
Parameters:
n - number of points
ndim - dimensions of points
x - points at which to compute the intervals; for ndim > 1 they
should be in order: (x0, y0, x1, y1, ..., xn, yn)
ci - computed intervals are returned in this array
cl - confidence level, default=0.95

NOTE that this method can only be used when the fitting function inherits from TF1,
so it's not possible when the fitting function was set as a string or as a pure TFormula
```
void GetConfidenceIntervals(TObject* obj, Double_t cl = 0.95)
```Computes confidence intervals at level cl. Default is 0.95
The TObject parameter can be a TGraphErrors, a TGraph2DErrors or a TH123.
For Graphs, confidence intervals are computed for each point,
the value of the graph at that point is set to the function value at that
point, and the graph y-errors (or z-errors) are set to the value of
the confidence interval at that point
For Histograms, confidence intervals are computed for each bin center
The bin content of this bin is then set to the function value at the bin
center, and the bin error is set to the confidence interval value.
Allowed combinations:
Fitted object               Passed object
TGraph                      TGraphErrors, TH1
TGraphErrors, AsymmErrors   TGraphErrors, TH1
TH1                         TGraphErrors, TH1
TGraph2D                    TGraph2DErrors, TH2
TGraph2DErrors              TGraph2DErrors, TH2
TH2                         TGraph2DErrors, TH2
TH3                         TH3
```
Double_t* GetCovarianceMatrix() const
```Returns covariance matrix
```
void GetCovarianceMatrix(TMatrixD& matr)
```Returns covariance matrix
```
void GetDesignMatrix(TMatrixD& matr)
```Returns the internal design matrix
```
void GetErrors(TVectorD& vpar)
```Returns parameter errors
```
void GetParameters(TVectorD& vpar)
```Returns parameter values
```
Int_t GetParameter(Int_t ipar, char* name, Double_t& value, Double_t& , Double_t& , Double_t& ) const
```Returns the value and the name of the parameter #ipar
```
Double_t GetParError(Int_t ipar) const
```Returns the error of parameter #ipar
```
const char * GetParName(Int_t ipar) const
```Returns name of parameter #ipar
```

```Returns the t-value for parameter #ipar
```

```Returns the significance of parameter #ipar
```
void GetFitSample(TBits& bits)
```For robust LTS fitting, returns the sample on which the best fit was based
```
Int_t Merge(TCollection* list)
```Merge objects in list
```
void SetBasisFunctions(TObjArray* functions)
```set the basis functions in case the fitting function is not
set directly
```
void SetDim(Int_t n)
```set the number of dimensions
```
void SetFormula(const char *formula)
```Additive parts should be separated by "++".
Examples (ai are parameters to fit):
1.fitting function: a0*x0 + a1*x1 + a2*x2
input formula "x[0]++x[1]++x[2]"
2.TMath functions can be used:
fitting function: a0*TMath::Gaus(x, 0, 1) + a1*y
input formula:    "TMath::Gaus(x, 0, 1)++y"
fills the array of functions
```
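To make the "++" convention concrete, here is a standalone sketch that splits a formula string into its additive parts. It is only an illustration of the syntax, not how SetFormula is implemented: `splitAdditiveParts` is a hypothetical helper, and it does not validate the individual parts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits a linear formula into its additive parts at each
// "++" separator and trims surrounding spaces. It does no validation of the parts.
std::vector<std::string> splitAdditiveParts(const std::string& formula)
{
    assert(!formula.empty());
    std::vector<std::string> parts;
    std::size_t begin = 0;
    while (true) {
        const std::size_t sep = formula.find("++", begin);
        const std::size_t end = (sep == std::string::npos) ? formula.size() : sep;
        const std::string part = formula.substr(begin, end - begin);
        const std::size_t first = part.find_first_not_of(' ');
        const std::size_t last = part.find_last_not_of(' ');
        parts.push_back(first == std::string::npos
                            ? std::string()
                            : part.substr(first, last - first + 1));
        if (sep == std::string::npos) break;
        begin = sep + 2;  // skip over the "++" separator
    }
    return parts;
}
```

Each additive part receives one fitted coefficient, so the number of parts equals the number of parameters in the examples above.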
void SetFormula(TFormula *function)
```Set the fitting function.
```

```Update the design matrix after the formula has been changed.
```
Int_t ExecuteCommand(const char* command, Double_t* args, Int_t nargs)
```To use in TGraph::Fit and TH1::Fit().
```
void PrintResults(Int_t level, Double_t amin = 0) const
``` Level = 3 (to be consistent with Minuit) prints parameters and parameter
errors.
```

```Used in TGraph::Fit().
```

```Minimisation function for a TGraph2D
```

```Minimisation function for a TMultiGraph
```

``` Minimization function for H1s using a Chisquare method.
```
void Streamer(TBuffer& b)
Int_t EvalRobust(Double_t h = -1)
```Finds the parameters of the fitted function in case data contains
outliers.
Parameter h stands for the minimal fraction of good points in the
dataset (h < 1, e.g. for 70% of good points take h=0.7).
The default value of h*Npoints is (Npoints + Nparameters + 1)/2.
If the user provides a value of h smaller than the default, the default is taken.
See class description for the algorithm details
```
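The subset-size rule and the quantity that LTS minimizes can be sketched in a few lines of standalone C++. The helper names are hypothetical. Two details are assumptions rather than statements from the source: that h*Npoints is truncated to an integer, and that values of h outside (0, 1) fall back to the default.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Default number of points in the subset: (npoints + nparameters + 1) / 2.
int defaultSubsetSize(int npoints, int nparams)
{
    return (npoints + nparams + 1) / 2;
}

// Subset size for a user-supplied fraction h. Assumption: h * npoints is
// truncated to an integer; values outside (0, 1) or below the default fall back to it.
int subsetSize(double h, int npoints, int nparams)
{
    const int def = defaultSubsetSize(npoints, nparams);
    if (h <= 0.0 || h >= 1.0) return def;
    const int requested = static_cast<int>(h * npoints);
    return std::max(requested, def);
}

// LTS objective: the sum of the h smallest squared residuals.
double trimmedSumOfSquares(std::vector<double> residuals, int h)
{
    assert(h >= 0 && static_cast<std::size_t>(h) <= residuals.size());
    for (double& r : residuals) r *= r;
    std::partial_sort(residuals.begin(), residuals.begin() + h, residuals.end());
    double sum = 0.0;
    for (int i = 0; i < h; ++i) sum += residuals[i];
    return sum;
}
```

With 20 points and 2 parameters the default subset holds 11 points, so a request of h=0.5 (10 points) would be replaced by the default, while h=0.75 gives 15 points.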
void CreateSubset(Int_t ntotal, Int_t h, Int_t* index)
```Creates a p-subset to start
ntotal - total number of points from which the subset is chosen
```
Double_t CStep(Int_t step, Int_t h, Double_t* residuals, Int_t* index, Int_t* subdat, Int_t start, Int_t end)
```The CStep procedure, as described in the article
```
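The idea of a C-step (concentration step) from the cited article is: given a current fit, keep the h points with the smallest squared residuals and refit on exactly those points. The sketch below shows this for a straight line only. It is standalone C++ with hypothetical names (`Line`, `fitLine`, `cStep`) and does not mirror the signature or the index bookkeeping of the private CStep member.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Line { double intercept; double slope; };

// Ordinary least-squares straight line through the points listed in `subset`.
Line fitLine(const std::vector<double>& x, const std::vector<double>& y,
             const std::vector<std::size_t>& subset)
{
    assert(subset.size() >= 2);
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i : subset) {
        sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i];
    }
    const double n = static_cast<double>(subset.size());
    const double denom = n * sxx - sx * sx;
    assert(denom != 0.0);  // x values in the subset must not all coincide
    const double slope = (n * sxy - sx * sy) / denom;
    return {(sy - slope * sx) / n, slope};
}

// One C-step: keep the h points with the smallest squared residuals under
// `current`, then refit on exactly those points.
Line cStep(const std::vector<double>& x, const std::vector<double>& y,
           const Line& current, std::size_t h)
{
    assert(x.size() == y.size() && h >= 2 && h <= x.size());
    std::vector<std::size_t> order(x.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    auto sq = [&](std::size_t i) {
        const double r = y[i] - (current.intercept + current.slope * x[i]);
        return r * r;
    };
    std::partial_sort(order.begin(), order.begin() + h, order.end(),
                      [&](std::size_t a, std::size_t b) { return sq(a) < sq(b); });
    order.resize(h);
    return fitLine(x, y, order);
}
```

Repeating the step never increases the sum of the h smallest squared residuals, which is why the article iterates it from several starting subsets and keeps the best result.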

Int_t Partition(Int_t nmini, Int_t* indsubdat)
```divides the elements into approximately equal subgroups
number of elements in each subgroup is stored in indsubdat
number of subgroups is returned
```
void RDraw(Int_t* subdat, Int_t* indsubdat)
```Draws ngroup nonoverlapping subdatasets out of a dataset of size n
such that the selected case numbers are uniformly distributed from 1 to n
```
void Chisquare()
Double_t GetCovarianceMatrixElement(Int_t i, Int_t j) const
`{return fParCovar(i, j);}`
void GetErrors(TVectorD& vpar)
Int_t GetNumberTotalParameters() const
`{return fNfunctions;}`
Int_t GetNumberFreeParameters() const
`{return fNfunctions-fNfixed;}`

`{ return fNpoints; }`
Double_t GetParameter(Int_t ipar) const
`{return fParams(ipar);}`
Double_t GetY2() const
`{return fY2;}`
Bool_t IsFixed(Int_t ipar) const
`{return fFixedParams[ipar];}`
void StoreData(Bool_t store)
`{fStoreData=store;}`
Int_t GetStats(Double_t& , Double_t& , Double_t& , Int_t& , Int_t& ) const
`{return 0;}`

`{return 0;}`
void SetFitMethod(const char* )
`{;}`
Int_t SetParameter(Int_t , const char* , Double_t , Double_t , Double_t , Double_t )
`{return 0;}`