Logo ROOT   6.21/01
Reference Guide
rs301_splot.C File Reference

Detailed Description

View in nbviewer Open in SWAN SPlot tutorial

This tutorial shows an example of using SPlot to unfold two distributions. The physics context for the example is that we want to know the isolation distribution for real electrons from Z events and fake electrons from QCD. Isolation is our 'control' variable. To unfold them, we need a model for an uncorrelated variable that discriminates between Z and QCD. To do this, we use the invariant mass of two electrons. We model the Z with a Gaussian and the QCD with a falling exponential.

Note, since we don't have real data in this tutorial, we need to generate toy data. To do that we need a model for the isolation variable for both Z and QCD. This is only used to generate the toy data, and would not be needed if we had real data.

pict1_rs301_splot.C.png
#include "RooRealVar.h"
#include "RooStats/SPlot.h"
#include "RooDataSet.h"
#include "RooRealVar.h"
#include "RooGaussian.h"
#include "RooExponential.h"
#include "RooChebychev.h"
#include "RooAddPdf.h"
#include "RooProdPdf.h"
#include "RooAddition.h"
#include "RooProduct.h"
#include "TCanvas.h"
#include "RooAbsPdf.h"
#include "RooFit.h"
#include "RooFitResult.h"
#include "RooWorkspace.h"
#include "RooConstVar.h"
#include <iomanip>
// use this order for safety on library loading
using namespace RooFit;
using namespace RooStats;
// see below for implementation
void AddModel(RooWorkspace *);
void AddData(RooWorkspace *);
void DoSPlot(RooWorkspace *);
void MakePlots(RooWorkspace *);
void rs301_splot()
{
// Create a new workspace to manage the project.
RooWorkspace *wspace = new RooWorkspace("myWS");
// add the signal and background models to the workspace.
// Inside this function you will find a description of our model.
AddModel(wspace);
// add some toy data to the workspace
AddData(wspace);
// inspect the workspace if you wish
// wspace->Print();
// do sPlot.
// This will make a new dataset with sWeights added for every event.
DoSPlot(wspace);
// Make some plots showing the discriminating variable and
// the control variable after unfolding.
MakePlots(wspace);
// cleanup
delete wspace;
}
//____________________________________
void AddModel(RooWorkspace *ws)
{
// Make models for signal (Higgs) and background (Z+jets and QCD)
// In real life, this part requires an intelligent modeling
// of signal and background -- this is only an example.
// set range of observable
Double_t lowRange = 0., highRange = 200.;
// make a RooRealVar for the observables
RooRealVar invMass("invMass", "M_{inv}", lowRange, highRange, "GeV");
RooRealVar isolation("isolation", "isolation", 0., 20., "GeV");
// --------------------------------------
// make 2-d model for Z including the invariant mass
// distribution and an isolation distribution which we want to
// unfold from QCD.
std::cout << "make z model" << std::endl;
// mass model for Z
RooRealVar mZ("mZ", "Z Mass", 91.2, lowRange, highRange);
RooRealVar sigmaZ("sigmaZ", "Width of Gaussian", 2, 0, 10, "GeV");
RooGaussian mZModel("mZModel", "Z+jets Model", invMass, mZ, sigmaZ);
// we know Z mass
mZ.setConstant();
// we leave the width of the Z free during the fit in this example.
// isolation model for Z. Only used to generate toy MC.
// the exponential is of the form exp(c*x). If we want
// the isolation to decay an e-fold every R GeV, we use
// c = -1/R.
RooConstVar zIsolDecayConst("zIsolDecayConst", "z isolation decay constant", -1);
RooExponential zIsolationModel("zIsolationModel", "z isolation model", isolation, zIsolDecayConst);
// make the combined Z model
RooProdPdf zModel("zModel", "2-d model for Z", RooArgSet(mZModel, zIsolationModel));
// --------------------------------------
// make QCD model
std::cout << "make qcd model" << std::endl;
// mass model for QCD.
// the exponential is of the form exp(c*x). If we want
// the mass to decay an e-fold every R GeV, we use
// c = -1/R.
// We can leave this parameter free during the fit.
RooRealVar qcdMassDecayConst("qcdMassDecayConst", "Decay const for QCD mass spectrum", -0.01, -100, 100, "1/GeV");
RooExponential qcdMassModel("qcdMassModel", "qcd Mass Model", invMass, qcdMassDecayConst);
// isolation model for QCD. Only used to generate toy MC
// the exponential is of the form exp(c*x). If we want
// the isolation to decay an e-fold every R GeV, we use
// c = -1/R.
RooConstVar qcdIsolDecayConst("qcdIsolDecayConst", "Et resolution constant", -.1);
RooExponential qcdIsolationModel("qcdIsolationModel", "QCD isolation model", isolation, qcdIsolDecayConst);
// make the 2-d model
RooProdPdf qcdModel("qcdModel", "2-d model for QCD", RooArgSet(qcdMassModel, qcdIsolationModel));
// --------------------------------------
// combined model
// These variables represent the number of Z or QCD events
// They will be fitted.
RooRealVar zYield("zYield", "fitted yield for Z", 50, 0., 1000);
RooRealVar qcdYield("qcdYield", "fitted yield for QCD", 100, 0., 1000);
// now make the combined model
std::cout << "make full model" << std::endl;
RooAddPdf model("model", "z+qcd background models", RooArgList(zModel, qcdModel), RooArgList(zYield, qcdYield));
// interesting for debugging and visualizing the model
model.graphVizTree("fullModel.dot");
std::cout << "import model" << std::endl;
ws->import(model);
}
//____________________________________
void AddData(RooWorkspace *ws)
{
// Add a toy dataset
// how many events do we want?
Int_t nEvents = 1000;
// get what we need out of the workspace to make toy data
RooAbsPdf *model = ws->pdf("model");
RooRealVar *invMass = ws->var("invMass");
RooRealVar *isolation = ws->var("isolation");
// make the toy data
std::cout << "make data set and import to workspace" << std::endl;
RooDataSet *data = model->generate(RooArgSet(*invMass, *isolation), nEvents);
// import data into workspace
ws->import(*data, Rename("data"));
}
//____________________________________
void DoSPlot(RooWorkspace *ws)
{
std::cout << "Calculate sWeights" << std::endl;
// get what we need out of the workspace to do the fit
RooAbsPdf *model = ws->pdf("model");
RooRealVar *zYield = ws->var("zYield");
RooRealVar *qcdYield = ws->var("qcdYield");
RooDataSet *data = (RooDataSet *)ws->data("data");
// fit the model to the data.
model->fitTo(*data, Extended());
// The sPlot technique requires that we fix the parameters
// of the model that are not yields after doing the fit.
//
// This *could* be done with the lines below, however this is taken care of
// by the RooStats::SPlot constructor (or more precisely the AddSWeight
// method).
//
// RooRealVar* sigmaZ = ws->var("sigmaZ");
// RooRealVar* qcdMassDecayConst = ws->var("qcdMassDecayConst");
// sigmaZ->setConstant();
// qcdMassDecayConst->setConstant();
std::cout << "\n\n------------------------------------------\nThe dataset before creating sWeights:\n";
data->Print();
// Now we use the SPlot class to add SWeights to our data set
// based on our model and our yield variables
RooStats::SPlot *sData = new RooStats::SPlot("sData", "An SPlot", *data, model, RooArgList(*zYield, *qcdYield));
std::cout << "\n\nThe dataset after creating sWeights:\n";
data->Print();
// Check that our weights have the desired properties
std::cout << "\n\n------------------------------------------\n\nCheck SWeights:" << std::endl;
std::cout << std::endl
<< "Yield of Z is\t" << zYield->getVal() << ". From sWeights it is "
<< sData->GetYieldFromSWeight("zYield") << std::endl;
std::cout << "Yield of QCD is\t" << qcdYield->getVal() << ". From sWeights it is "
<< sData->GetYieldFromSWeight("qcdYield") << std::endl
<< std::endl;
for (Int_t i = 0; i < 10; i++) {
std::cout << "z Weight for event " << i << std::right << std::setw(12) << sData->GetSWeight(i, "zYield") << " qcd Weight"
<< std::setw(12) << sData->GetSWeight(i, "qcdYield") << " Total Weight" << std::setw(12) << sData->GetSumOfEventSWeight(i)
<< std::endl;
}
std::cout << std::endl;
// import this new dataset with sWeights
std::cout << "import new dataset with sWeights" << std::endl;
ws->import(*data, Rename("dataWithSWeights"));
}
void MakePlots(RooWorkspace *ws)
{
// Here we make plots of the discriminating variable (invMass) after the fit
// and of the control variable (isolation) after unfolding with sPlot.
std::cout << "make plots" << std::endl;
// make our canvas
TCanvas *cdata = new TCanvas("sPlot", "sPlot demo", 400, 600);
cdata->Divide(1, 3);
// get what we need out of the workspace
RooAbsPdf *model = ws->pdf("model");
RooAbsPdf *zModel = ws->pdf("zModel");
RooAbsPdf *qcdModel = ws->pdf("qcdModel");
RooRealVar *isolation = ws->var("isolation");
RooRealVar *invMass = ws->var("invMass");
// note, we get the dataset with sWeights
RooDataSet *data = (RooDataSet *)ws->data("dataWithSWeights");
// this shouldn't be necessary, need to fix something with workspace
// do this to set parameters back to their fitted values.
// model->fitTo(*data, Extended());
// plot invMass for data with full model and individual components overlaid
// TCanvas* cdata = new TCanvas();
cdata->cd(1);
RooPlot *frame = invMass->frame();
data->plotOn(frame);
model->plotOn(frame, Name("FullModel"));
model->plotOn(frame, Components(*zModel), LineStyle(kDashed), LineColor(kRed), Name("ZModel"));
model->plotOn(frame, Components(*qcdModel), LineStyle(kDashed), LineColor(kGreen), Name("QCDModel"));
TLegend leg(0.11, 0.5, 0.5, 0.8);
leg.AddEntry(frame->findObject("FullModel"), "Full model", "L");
leg.AddEntry(frame->findObject("ZModel"), "Z model", "L");
leg.AddEntry(frame->findObject("QCDModel"), "QCD model", "L");
leg.SetBorderSize(0);
leg.SetFillStyle(0);
frame->SetTitle("Fit of model to discriminating variable");
frame->Draw();
leg.DrawClone();
// Now use the sWeights to show isolation distribution for Z and QCD.
// The SPlot class can make this easier, but here we demonstrate in more
// detail how the sWeights are used. The SPlot class should make this
// very easy and needs some more development.
// Plot isolation for Z component.
// Do this by plotting all events weighted by the sWeight for the Z component.
// The SPlot class adds a new variable that has the name of the corresponding
// yield + "_sw".
cdata->cd(2);
// create weighted data set
RooDataSet *dataw_z = new RooDataSet(data->GetName(), data->GetTitle(), data, *data->get(), 0, "zYield_sw");
RooPlot *frame2 = isolation->frame();
// Since the data are weighted, we use SumW2 to compute the errors.
dataw_z->plotOn(frame2, DataError(RooAbsData::SumW2));
frame2->SetTitle("Isolation distribution with s weights to project out Z");
frame2->Draw();
// Plot isolation for QCD component.
// Eg. plot all events weighted by the sWeight for the QCD component.
// The SPlot class adds a new variable that has the name of the corresponding
// yield + "_sw".
cdata->cd(3);
RooDataSet *dataw_qcd = new RooDataSet(data->GetName(), data->GetTitle(), data, *data->get(), 0, "qcdYield_sw");
RooPlot *frame3 = isolation->frame();
dataw_qcd->plotOn(frame3, DataError(RooAbsData::SumW2));
frame3->SetTitle("Isolation distribution with s weights to project out QCD");
frame3->Draw();
// cdata->SaveAs("SPlot.gif");
}
RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby
Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
All rights reserved, please read http://roofit.sourceforge.net/license.txt
make z model
make qcd model
make full model
import model
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooAddPdf::model
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooProdPdf::zModel
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooGaussian::mZModel
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooRealVar::invMass
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooRealVar::mZ
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooRealVar::sigmaZ
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooExponential::zIsolationModel
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooRealVar::isolation
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooConstVar::zIsolDecayConst
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooRealVar::zYield
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooProdPdf::qcdModel
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooExponential::qcdMassModel
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooRealVar::qcdMassDecayConst
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooExponential::qcdIsolationModel
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooConstVar::qcdIsolDecayConst
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing RooRealVar::qcdYield
make data set and import to workspace
[#1] INFO:ObjectHandling -- RooWorkspace::import(myWS) importing dataset modelData
[#1] INFO:ObjectHandling -- RooWorkSpace::import(myWS) changing name of dataset from modelData to data
Calculate sWeights
[#1] INFO:Minization -- createNLL: caching constraint set under name CONSTR_OF_PDF_model_FOR_OBS_invMass:isolation with 0 entries
[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
[#1] INFO:Minization -- The following expressions have been identified as constant and will be precalculated and cached: (zIsolationModel,qcdIsolationModel)
[#1] INFO:Minization -- The following expressions will be evaluated in cache-and-track mode: (mZModel,qcdMassModel)
**********
** 1 **SET PRINT 1
**********
**********
** 2 **SET NOGRAD
**********
PARAMETER DEFINITIONS:
NO. NAME VALUE STEP SIZE LIMITS
1 qcdMassDecayConst -1.00000e-02 2.00000e+01 -1.00000e+02 1.00000e+02
2 qcdYield 1.00000e+02 5.00000e+01 0.00000e+00 1.00000e+03
3 sigmaZ 2.00000e+00 1.00000e+00 0.00000e+00 1.00000e+01
4 zYield 5.00000e+01 2.50000e+01 0.00000e+00 1.00000e+03
**********
** 3 **SET ERR 0.5
**********
**********
** 4 **SET PRINT 1
**********
**********
** 5 **SET STR 1
**********
NOW USING STRATEGY 1: TRY TO BALANCE SPEED AGAINST RELIABILITY
**********
** 6 **MIGRAD 2000 1
**********
FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
START MIGRAD MINIMIZATION. STRATEGY 1. CONVERGENCE WHEN EDM .LT. 1.00e-03
FCN=1901.86 FROM MIGRAD STATUS=INITIATE 18 CALLS 19 TOTAL
EDM= unknown STRATEGY= 1 NO ERROR MATRIX
EXT PARAMETER CURRENT GUESS STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 qcdMassDecayConst -1.00000e-02 2.00000e+01 2.01358e-01 -1.06814e+05
2 qcdYield 1.00000e+02 5.00000e+01 1.72186e-01 -1.57317e+03
3 sigmaZ 2.00000e+00 1.00000e+00 2.57889e-01 3.62916e+01
4 zYield 5.00000e+01 2.50000e+01 1.18625e-01 -1.41930e+03
ERR DEF= 0.5
MIGRAD MINIMIZATION HAS CONVERGED.
MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=849.923 FROM MIGRAD STATUS=CONVERGED 110 CALLS 111 TOTAL
EDM=3.1409e-06 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 qcdMassDecayConst -9.30176e-03 7.55562e-04 1.52115e-07 2.51004e+01
2 qcdYield 6.22140e+02 2.52789e+01 1.04861e-03 -1.45735e-02
3 sigmaZ 1.95257e+00 8.09927e-02 4.09416e-04 -7.77193e-02
4 zYield 3.77836e+02 1.98770e+01 8.24140e-04 -6.51095e-03
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 4 ERR DEF=0.5
5.709e-07 1.999e-04 -1.007e-06 -1.999e-04
1.999e-04 6.396e+02 -9.256e-02 -1.747e+01
-1.007e-06 -9.256e-02 6.561e-03 9.257e-02
-1.999e-04 -1.747e+01 9.257e-02 3.953e+02
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.02260 1.000 0.010 -0.016 -0.013
2 0.05626 0.010 1.000 -0.045 -0.035
3 0.07351 -0.016 -0.045 1.000 0.057
4 0.06697 -0.013 -0.035 0.057 1.000
**********
** 7 **SET ERR 0.5
**********
**********
** 8 **SET PRINT 1
**********
**********
** 9 **HESSE 2000
**********
COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=849.923 FROM HESSE STATUS=OK 23 CALLS 134 TOTAL
EDM=3.13931e-06 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER INTERNAL INTERNAL
NO. NAME VALUE ERROR STEP SIZE VALUE
1 qcdMassDecayConst -9.30176e-03 7.55563e-04 3.04229e-08 -9.30176e-05
2 qcdYield 6.22140e+02 2.52790e+01 2.09721e-04 2.46778e-01
3 sigmaZ 1.95257e+00 8.09935e-02 8.18832e-05 -6.55413e-01
4 zYield 3.77836e+02 1.98772e+01 1.64828e-04 -2.46827e-01
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 4 ERR DEF=0.5
5.709e-07 2.002e-04 -1.009e-06 -2.002e-04
2.002e-04 6.396e+02 -9.273e-02 -1.749e+01
-1.009e-06 -9.273e-02 6.561e-03 9.273e-02
-2.002e-04 -1.749e+01 9.273e-02 3.953e+02
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.02264 1.000 0.010 -0.016 -0.013
2 0.05634 0.010 1.000 -0.045 -0.035
3 0.07364 -0.016 -0.045 1.000 0.058
4 0.06707 -0.013 -0.035 0.058 1.000
[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization
------------------------------------------
The dataset before creating sWeights:
RooDataSet::data[invMass,isolation] = 1000 entries
The dataset after creating sWeights:
RooDataSet::data[invMass,isolation,zYield_sw,L_zYield,qcdYield_sw,L_qcdYield] = 1000 entries
------------------------------------------
Check SWeights:
Yield of Z is 377.841. From sWeights it is 377.841
Yield of QCD is 622.16. From sWeights it is 622.16
z Weight for event 0 -0.0259121 qcd Weight 1.02591 Total Weight 1
z Weight for event 1 -0.0259121 qcd Weight 1.02591 Total Weight 1
z Weight for event 2 1.02602 qcd Weight -0.0260232 Total Weight 1
z Weight for event 3 1.01358 qcd Weight -0.0135828 Total Weight 1
z Weight for event 4 -0.0259121 qcd Weight 1.02591 Total Weight 1
z Weight for event 5 1.02684 qcd Weight -0.0268354 Total Weight 1
z Weight for event 6 -0.0259121 qcd Weight 1.02591 Total Weight 1
z Weight for event 7 -0.0259121 qcd Weight 1.02591 Total Weight 1
z Weight for event 8 1.03438 qcd Weight -0.0343802 Total Weight 1
z Weight for event 9 0.910182 qcd Weight 0.0898196 Total Weight 1
import new dataset with sWeights
make plots
[#1] INFO:Plotting -- RooAbsReal::plotOn(model) plot on invMass integrates over variables (isolation)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (zModel)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: (mZModel,zIsolationModel)
[#1] INFO:Plotting -- RooAbsReal::plotOn(model) plot on invMass integrates over variables (isolation)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (qcdModel)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: (qcdMassModel,qcdIsolationModel)
[#1] INFO:Plotting -- RooAbsReal::plotOn(model) plot on invMass integrates over variables (isolation)
Author
Kyle Cranmer

Definition in file rs301_splot.C.