Loading [MathJax]/extensions/tex2jax.js
Logo ROOT  
Reference Guide
All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Properties Friends Macros Modules Pages
MultivariateGaussianTest.C File Reference

Detailed Description

View in nbviewer Open in SWAN Comparison of MCMC and PLC in a multi-variate gaussian problem

This tutorial produces an N-dimensional multivariate Gaussian with a non-trivial covariance matrix. By default N=4 (called "dim").

A subset of these are considered parameters of interest. This problem is tractable analytically.

We use this mainly as a test of Markov Chain Monte Carlo and we compare the result to the profile likelihood ratio.

We use the proposal helper to create a customized proposal function for this problem.

For N=4 and 2 parameters of interest it takes about 10-20 seconds and the acceptance rate is 37%

␛[1mRooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby␛[0m
Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
All rights reserved, please read http://roofit.sourceforge.net/license.txt
[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
**********
** 1 **SET PRINT 1
**********
**********
** 2 **SET NOGRAD
**********
PARAMETER DEFINITIONS:
NO. NAME VALUE STEP SIZE LIMITS
1 mu_x0 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
2 mu_x1 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
3 mu_x2 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
4 mu_x3 0.00000e+00 4.00000e-01 -2.00000e+00 2.00000e+00
**********
** 3 **SET ERR 0.5
**********
**********
** 4 **SET PRINT 1
**********
**********
** 5 **SET STR 1
**********
NOW USING STRATEGY 1: TRY TO BALANCE SPEED AGAINST RELIABILITY
**********
** 6 **MIGRAD 2000 1
**********
FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
START MIGRAD MINIMIZATION. STRATEGY 1. CONVERGENCE WHEN EDM .LT. 1.00e-03
FCN=707.822 FROM MIGRAD STATUS=INITIATE 14 CALLS 15 TOTAL
EDM= unknown STRATEGY= 1 NO ERROR MATRIX
EXT PARAMETER CURRENT GUESS STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 mu_x0 0.00000e+00 4.00000e-01 2.01358e-01 -9.83942e+00
2 mu_x1 0.00000e+00 4.00000e-01 2.01358e-01 -1.25127e+01
3 mu_x2 0.00000e+00 4.00000e-01 2.01358e-01 9.88572e+00
4 mu_x3 0.00000e+00 4.00000e-01 2.01358e-01 -4.09155e+00
ERR DEF= 0.5
MIGRAD MINIMIZATION HAS CONVERGED.
MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=706.56 FROM MIGRAD STATUS=CONVERGED 65 CALLS 66 TOTAL
EDM=1.95881e-05 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 mu_x0 1.80728e-01 1.72980e-01 1.42731e-03 -2.59592e-02
2 mu_x1 2.07351e-01 1.72978e-01 1.42884e-03 -3.69006e-02
3 mu_x2 -1.59412e-02 1.72984e-01 1.42230e-03 3.21539e-02
4 mu_x3 1.23430e-01 1.72982e-01 1.42472e-03 -8.04153e-03
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 4 ERR DEF=0.5
3.000e-02 9.997e-03 9.998e-03 9.997e-03
9.997e-03 3.000e-02 9.998e-03 9.997e-03
9.998e-03 9.998e-03 3.000e-02 9.998e-03
9.997e-03 9.997e-03 9.998e-03 3.000e-02
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.44715 1.000 0.333 0.333 0.333
2 0.44715 0.333 1.000 0.333 0.333
3 0.44717 0.333 0.333 1.000 0.333
4 0.44716 0.333 0.333 0.333 1.000
**********
** 7 **SET ERR 0.5
**********
**********
** 8 **SET PRINT 1
**********
**********
** 9 **HESSE 2000
**********
COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=706.56 FROM HESSE STATUS=OK 23 CALLS 89 TOTAL
EDM=1.95881e-05 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER INTERNAL INTERNAL
NO. NAME VALUE ERROR STEP SIZE VALUE
1 mu_x0 1.80728e-01 1.72984e-01 2.85462e-04 9.04875e-02
2 mu_x1 2.07351e-01 1.72982e-01 2.85769e-04 1.03862e-01
3 mu_x2 -1.59412e-02 1.72987e-01 2.84460e-04 -7.97069e-03
4 mu_x3 1.23430e-01 1.72986e-01 2.84944e-04 6.17540e-02
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 4 ERR DEF=0.5
3.000e-02 9.999e-03 9.999e-03 9.999e-03
9.999e-03 3.000e-02 9.999e-03 9.999e-03
9.999e-03 9.999e-03 3.000e-02 9.999e-03
9.999e-03 9.999e-03 9.999e-03 3.000e-02
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.44720 1.000 0.333 0.333 0.333
2 0.44719 0.333 1.000 0.333 0.333
3 0.44720 0.333 0.333 1.000 0.333
4 0.44720 0.333 0.333 0.333 1.000
[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization
[#1] INFO:Minization -- createNLL: caching constraint set under name CONSTR_OF_PDF_mvg_FOR_OBS_x0:x1:x2:x3 with 0 entries
Metropolis-Hastings progress: ....................................................................................................
[#1] INFO:Eval -- Proposal acceptance rate: 37.1%
[#1] INFO:Eval -- Number of steps in chain: 3710
[#1] INFO:Minization -- createNLL picked up cached consraints from workspace with 0 entries
[#0] PROGRESS:Minization -- ProfileLikelihoodCalcultor::DoGLobalFit - find MLE
[#0] PROGRESS:Minization -- ProfileLikelihoodCalcultor::DoMinimizeNLL - using Minuit / Migrad with strategy 1
[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
[#1] INFO:Minization --
RooFitResult: minimized FCN value: 706.56, estimated distance to minimum: 1.16082e-08
covariance matrix quality: Full, accurate covariance matrix
Status : MINIMIZE=0
Floating Parameter FinalValue +/- Error
-------------------- --------------------------
mu_x0 1.8119e-01 +/- 1.73e-01
mu_x1 2.0792e-01 +/- 1.73e-01
mu_x2 -1.6078e-02 +/- 1.73e-01
mu_x3 1.2370e-01 +/- 1.73e-01
[#1] INFO:Minization -- RooProfileLL::evaluate(nll_mvg_mvgData_Profile[mu_x0,mu_x1]) Creating instance of MINUIT
[#1] INFO:Minization -- RooProfileLL::evaluate(nll_mvg_mvgData_Profile[mu_x0,mu_x1]) determining minimum likelihood for current configurations w.r.t all observable
[#1] INFO:Minization -- RooProfileLL::evaluate(nll_mvg_mvgData_Profile[mu_x0,mu_x1]) minimum found at (mu_x0=0.181184, mu_x1=0.207917)
..[#1] INFO:Minization -- LikelihoodInterval - Finding the contour of mu_x0 ( 0 ) and mu_x1 ( 1 )
MCMC interval on p0: [-0.28, 0.6]
MCMC interval on p1: [-0.28, 0.6]
Real time 0:00:01, CP time 1.680
#include "RooGlobalFunc.h"
#include <stdlib.h>
#include "TMatrixDSym.h"
#include "RooArgList.h"
#include "RooRealVar.h"
#include "TH2F.h"
#include "TCanvas.h"
#include "RooAbsReal.h"
#include "RooFitResult.h"
#include "TStopwatch.h"
using namespace std;
using namespace RooFit;
using namespace RooStats;
void MultivariateGaussianTest(Int_t dim = 4, Int_t nPOI = 2)
{
// let's time this challenging example
t.Start();
RooArgList xVec;
RooArgList muVec;
RooArgSet poi;
// make the observable and means
Int_t i, j;
RooRealVar *mu_x;
for (i = 0; i < dim; i++) {
char *name = Form("x%d", i);
x = new RooRealVar(name, name, 0, -3, 3);
xVec.add(*x);
char *mu_name = Form("mu_x%d", i);
mu_x = new RooRealVar(mu_name, mu_name, 0, -2, 2);
muVec.add(*mu_x);
}
// put them into the list of parameters of interest
for (i = 0; i < nPOI; i++) {
poi.add(*muVec.at(i));
}
// make a covariance matrix that is all 1's
TMatrixDSym cov(dim);
for (i = 0; i < dim; i++) {
for (j = 0; j < dim; j++) {
if (i == j)
cov(i, j) = 3.;
else
cov(i, j) = 1.0;
}
}
// now make the multivariate Gaussian
RooMultiVarGaussian mvg("mvg", "mvg", xVec, muVec, cov);
// --------------------
// make a toy dataset
RooDataSet *data = mvg.generate(xVec, 100);
// now create the model config for this problem
RooWorkspace *w = new RooWorkspace("MVG");
ModelConfig modelConfig(w);
modelConfig.SetPdf(mvg);
modelConfig.SetParametersOfInterest(poi);
// -------------------------------------------------------
// Setup calculators
// MCMC
// we want to setup an efficient proposal function
// using the covariance matrix from a fit to the data
RooFitResult *fit = mvg.fitTo(*data, Save(true));
ProposalHelper ph;
ph.SetVariables((RooArgSet &)fit->floatParsFinal());
ph.SetCovMatrix(fit->covarianceMatrix());
ph.SetUpdateProposalParameters(true);
ph.SetCacheSize(100);
ProposalFunction *pdfProp = ph.GetProposalFunction();
// now create the calculator
MCMCCalculator mc(*data, modelConfig);
mc.SetConfidenceLevel(0.95);
mc.SetNumBurnInSteps(100);
mc.SetNumIters(10000);
mc.SetNumBins(50);
mc.SetProposalFunction(*pdfProp);
MCMCInterval *mcInt = mc.GetInterval();
RooArgList *poiList = mcInt->GetAxes();
// now setup the profile likelihood calculator
ProfileLikelihoodCalculator plc(*data, modelConfig);
plc.SetConfidenceLevel(0.95);
LikelihoodInterval *plInt = (LikelihoodInterval *)plc.GetInterval();
// make some plots
MCMCIntervalPlot mcPlot(*mcInt);
TCanvas *c1 = new TCanvas();
mcPlot.SetLineColor(kGreen);
mcPlot.SetLineWidth(2);
mcPlot.Draw();
LikelihoodIntervalPlot plPlot(plInt);
plPlot.Draw("same");
if (poiList->getSize() == 1) {
RooRealVar *p = (RooRealVar *)poiList->at(0);
Double_t ll = mcInt->LowerLimit(*p);
Double_t ul = mcInt->UpperLimit(*p);
cout << "MCMC interval: [" << ll << ", " << ul << "]" << endl;
}
if (poiList->getSize() == 2) {
RooRealVar *p0 = (RooRealVar *)poiList->at(0);
RooRealVar *p1 = (RooRealVar *)poiList->at(1);
TCanvas *scatter = new TCanvas();
Double_t ll = mcInt->LowerLimit(*p0);
Double_t ul = mcInt->UpperLimit(*p0);
cout << "MCMC interval on p0: [" << ll << ", " << ul << "]" << endl;
ll = mcInt->LowerLimit(*p0);
ul = mcInt->UpperLimit(*p0);
cout << "MCMC interval on p1: [" << ll << ", " << ul << "]" << endl;
// MCMC interval on p0: [-0.2, 0.6]
// MCMC interval on p1: [-0.2, 0.6]
mcPlot.DrawChainScatter(*p0, *p1);
scatter->Update();
}
t.Print();
}
int Int_t
Definition: RtypesCore.h:41
double Double_t
Definition: RtypesCore.h:55
@ kGreen
Definition: Rtypes.h:64
char name[80]
Definition: TGX11.cxx:109
char * Form(const char *fmt,...)
Int_t getSize() const
virtual Bool_t add(const RooAbsArg &var, Bool_t silent=kFALSE)
Add the specified argument to list.
RooArgList is a container object that can hold multiple RooAbsArg objects.
Definition: RooArgList.h:21
RooAbsArg * at(Int_t idx) const
Return object at given index, or nullptr if index is out of range.
Definition: RooArgList.h:74
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition: RooArgSet.h:28
virtual Bool_t add(const RooAbsCollection &col, Bool_t silent=kFALSE)
Add a collection of arguments to this collection by calling add() for each element in the source coll...
Definition: RooArgSet.h:88
RooDataSet is a container class to hold unbinned data.
Definition: RooDataSet.h:31
RooFitResult is a container class to hold the input and output of a PDF fit to a dataset.
Definition: RooFitResult.h:40
const TMatrixDSym & covarianceMatrix() const
Return covariance matrix.
const RooArgList & floatParsFinal() const
Definition: RooFitResult.h:110
Multivariate Gaussian p.d.f.
RooRealVar represents a variable that can be changed from the outside.
Definition: RooRealVar.h:35
The RooWorkspace is a persistable container for RooFit projects.
Definition: RooWorkspace.h:43
The Canvas class.
Definition: TCanvas.h:31
virtual void Update()
Update canvas pad buffers.
Definition: TCanvas.cxx:2339
Stopwatch class.
Definition: TStopwatch.h:28
void Start(Bool_t reset=kTRUE)
Start the stopwatch.
Definition: TStopwatch.cxx:58
void Print(Option_t *option="") const
Print the real and cpu time passed between the start and stop events.
Definition: TStopwatch.cxx:219
return c1
Definition: legend1.C:41
Double_t x[n]
Definition: legend1.C:17
The namespace RooFit contains mostly switches that change the behaviour of functions of PDFs (or othe...
RooCmdArg Save(Bool_t flag=kTRUE)
Namespace for the RooStats classes.
Definition: Asimov.h:20
Authors
Kevin Belasco and Kyle Cranmer

Definition in file MultivariateGaussianTest.C.