Logo ROOT  
Reference Guide
rf204b_extendedLikelihood_rangedFit.C
Go to the documentation of this file.
1/// \file
2/// \ingroup tutorial_roofit
3/// \notebook -nodraw
4/// This macro demonstrates how to set up a fit in two ranges
5/// such that it does not only fit the shapes in each region, but also
6/// takes into account the relative normalization of the two.
7///
8/// ### 1. Shape fits (plain likelihood)
9///
10/// If you perform a fit in two ranges in RooFit, e.g. `pdf->fitTo(data,Range("Range1,Range2"))`
11/// it will construct a simple simultaneous fit of the two regions.
12///
13/// In case the pdf is not extended, i.e., a shape fit, it will only fit the shapes in the
14/// two selected ranges, and not take into account the relative predicted yields in those ranges.
15///
16/// In certain models (like exponential decays) and configurations (e.g. narrow ranges that are far apart),
17/// the relative normalization of the ranges may carry much more information about the function parameters
18/// than the shape of the distribution inside those ranges. Therefore, it is important to take that into
19/// account.
20///
21/// This is particularly important for cases where the 2-range fit is meant to be representative of
22/// a full-range fit, but with a blinded signal region inside it.
23///
24///
25/// ### 2. Shape+rate fits (extended likelihood)
26///
27/// Also if your pdf is already extended, i.e. measuring both the distribution in the observable as well
28/// as the event count in the fitted region, some intervention is needed to make fits in ranges
29/// work in a way that corresponds to intuition.
30///
31/// If an extended fit is performed in a sub-range, the observed yield is only that of the subrange, hence
32/// the expected event count will converge to a number that is smaller than what's visible in a plot.
33/// In such cases, it is often preferred to interpret the extended term with respect to the full range
34/// that's plotted, i.e., apply a correction to the extended likelihood term in such a way
35/// that the interpretation of the expected event count remains that of the full range. This can
36/// be done by applying a correcion factor (equal to the fraction of the pdf that is contained in the
37/// fitted range) in the Poisson term that represents the extended likelihood term.
38///
39/// If an extended likelihood fit is performed over *two* sub-ranges, this correction is
40/// even more important: without it, each component likelihood would have a different interpretation
41/// of the expected event count (each corresponding to the count in its own region), and a joint
42/// fit of these regions with different interpretations of the same model parameter results
43/// in a number that is not easily interpreted.
44///
45/// If both regions correct their interpretatin such that N_expected refers to the full range,
46/// it is interpreted easily, and consistent in both regions.
47///
48/// This requires that the likelihood model is extended using RooAddPdf in the
49/// form SumPdf = Nsig * sigPdf + Nbkg * bkgPdf.
50///
51/// \macro_image
52/// \macro_code
53/// \macro_output
54///
55/// \authors Stephan Hageboeck, Wouter Verkerke
56
57#include "RooRealVar.h"
58#include "RooExponential.h"
59#include "RooGaussian.h"
60#include "RooAddPdf.h"
61#include "RooDataSet.h"
62#include "RooPlot.h"
63#include "RooExtendPdf.h"
64#include "TCanvas.h"
65using namespace RooFit;
66
67void rf204b_extendedLikelihood_rangedFit()
68{
69
70 // PART 1: Background-only fits
71 // ----------------------------
72
73 // Build plain exponential model
74 RooRealVar x("x", "x", 10, 100);
75 RooRealVar alpha("alpha", "alpha", -0.04, -0.1, -0.0);
76 RooExponential model("model", "Exponential model", x, alpha);
77
78 // Define side band regions and full range
79 x.setRange("LEFT",10,20);
80 x.setRange("RIGHT",60,100);
81
82 x.setRange("FULL",10,100);
83
84 RooDataSet* data = model.generate(x, 10000);
85
86 // Construct an extended pdf, which measures the event count N **on the full range**.
87 // If the actual domain of x that is fitted is identical to FULL, this has no affect.
88 //
89 // If the fitted domain is a subset of `FULL`, though, the expected event count is divided by
90 // \f[
91 // \mathrm{frac} = \frac{
92 // \int_{\mathrm{Fit range}} \mathrm{model}(x) \; \mathrm{d}x }{
93 // \int_{\mathrm{Full range}} \mathrm{model}(x) \; \mathrm{d}x }.
94 // \f]
95 // `N` will therefore return the count extrapolated to the full range instead of the fit range.
96 //
97 // **Note**: When using a RooAddPdf for extending the likelihood, the same effect can be achieved with
98 // [RooAddPdf::fixCoefRange()](https://root.cern.ch/doc/master/classRooAddPdf.html#ab631caf4b59e4c4221f8967aecbf2a65),
99
100 RooRealVar N("N", "Extended term", 0, 20000);
101 RooExtendPdf extmodel("extmodel", "Extended model", model, N, "FULL");
102
103
104 // It can be instructive to fit the above model to either the LEFT or RIGHT range. `N` should approximately converge to the expected number
105 // of events in the full range. One may try to leave out `"FULL"` in the constructor, o the the interpretation of `N` changes.
106 extmodel.fitTo(*data, Range("LEFT"), PrintLevel(-1));
107 N.Print();
108
109
110 // If we now do a simultaneous fit to the extended model, instead of the original model, the LEFT and RIGHT range will each correct their local
111 // `N` such that it refers to the `FULL` range.
112 //
113 // This joint fit of the extmodel will include (w.r.t. the plain model fit) a product of extended terms
114 // \f[
115 // L_\mathrm{ext} = L
116 // \cdot \mathrm{Poisson} \left( N_\mathrm{obs}^\mathrm{LEFT} | N_\mathrm{exp} / \mathrm{frac LEFT} \right)
117 // \cdot \mathrm{Poisson} \left( N_\mathrm{obs}^\mathrm{RIGHT} | N_\mathrm{exp} / \mathrm{frac RIGHT} \right)
118 // \f]
119 // that will introduce additional sensitivity of the likelihood to the slope parameter alpha of the exponential model through the `frac_LEFT` and `frac_RIGHT` integrals.
120 //
121 // In the extreme case of an exponential function and a fit in narrow LEFT and RIGHT ranges, this sensitivity may actually be larger
122 // than from the shapes.
123 //
124 // This is also nicely demonstrated in the example below where the uncertainty on alpha is almost 5x smaller if the extended term is included.
125
126
127 TCanvas* c = new TCanvas("c", "c", 2100, 700);
128 c->Divide(3);
129 c->cd(1);
130
131 RooFitResult* r = model.fitTo(*data, Range("LEFT,RIGHT"), Save());
132
133 RooPlot* frame = x.frame();
134 data->plotOn(frame);
135 model.plotOn(frame, VisualizeError(*r));
136 model.plotOn(frame);
137 model.paramOn(frame, Label("Bkg fit. Large errors since\nnormalisation ignored"));
138 frame->Draw();
139
140 c->cd(2);
141
142 RooFitResult* r2 = extmodel.fitTo(*data, Range("LEFT,RIGHT"), Save());
143 RooPlot* frame2 = x.frame();
144 data->plotOn(frame2);
145 extmodel.plotOn(frame2);
146 extmodel.plotOn(frame2, VisualizeError(*r2));
147 extmodel.paramOn(frame2, Label("Bkg fit. Normalisation\nincluded"), Layout(0.4,0.95));
148 frame2->Draw();
149
150 // PART 2: Extending with RooAddPdf
151 // --------------------------------
152 //
153 // Now we repeat the above exercise, but instead of explicitly adding an extended term to a single shape pdf (RooExponential),
154 // we assume that we already have an extended likelihood model in the form of a RooAddPdf constructed in the form `Nsig * sigPdf + Nbkg * bkgPdf`.
155 //
156 // We add a Gaussian to the previously defined exponential background.
157 // The signal shape parameters are chosen constant, since the signal region is entirely in the blinded region, i.e., the fit has no sensitivity.
158
159 c->cd(3);
160
161 RooRealVar Nsig("Nsig", "Number of signal events", 1000, 0, 2000);
162 RooRealVar Nbkg("Nbkg", "Number of background events", 10000, 0, 20000);
163
164 RooRealVar mean("mean", "Mean of signal model", 40.);
165 RooRealVar width("width", "Width of signal model", 5.);
166 RooGaussian sig("sig", "Signal model", x, mean, width);
167
168 RooAddPdf modelsum("modelsum", "NSig*signal + NBkg*background", RooArgSet(sig, model), RooArgSet(Nsig, Nbkg));
169
170 // This model will automatically insert the correction factor for the reinterpretation of Nsig and Nnbkg in the full ranges.
171 //
172 // When this happens, it reports this with lines like the following:
173 // ```
174 // [#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_modelsum_modelsumData_LEFT) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
175 // [#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_modelsum_modelsumData_RIGHT) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
176 // ```
177
178 RooFitResult* r3 = modelsum.fitTo(*data, Range("LEFT,RIGHT"), Save());
179
180 RooPlot* frame3 = x.frame();
181 data->plotOn(frame3);
182 modelsum.plotOn(frame3);
183 modelsum.plotOn(frame3, VisualizeError(*r3));
184 modelsum.paramOn(frame3, Label("S+B fit with RooAddPdf"), Layout(0.3,0.95));
185 frame3->Draw();
186
187 c->Draw();
188}
ROOT::R::TRInterface & r
Definition: Object.C:4
#define c(i)
Definition: RSha256.hxx:101
include TDocParser_001 C image html pict1_TDocParser_001 png width
Definition: TDocParser.cxx:121
#define N
virtual RooPlot * plotOn(RooPlot *frame, const RooCmdArg &arg1=RooCmdArg::none(), const RooCmdArg &arg2=RooCmdArg::none(), const RooCmdArg &arg3=RooCmdArg::none(), const RooCmdArg &arg4=RooCmdArg::none(), const RooCmdArg &arg5=RooCmdArg::none(), const RooCmdArg &arg6=RooCmdArg::none(), const RooCmdArg &arg7=RooCmdArg::none(), const RooCmdArg &arg8=RooCmdArg::none()) const
Definition: RooAbsData.cxx:547
RooAddPdf is an efficient implementation of a sum of PDFs of the form.
Definition: RooAddPdf.h:32
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition: RooArgSet.h:29
RooDataSet is a container class to hold unbinned data.
Definition: RooDataSet.h:33
Exponential PDF.
RooExtendPdf is a wrapper around an existing PDF that adds a parameteric extended likelihood term to ...
Definition: RooExtendPdf.h:22
RooFitResult is a container class to hold the input and output of a PDF fit to a dataset.
Definition: RooFitResult.h:40
Plain Gaussian p.d.f.
Definition: RooGaussian.h:24
A RooPlot is a plot frame and a container for graphics objects within that frame.
Definition: RooPlot.h:44
virtual void Draw(Option_t *options=0)
Draw this plot and all of the elements it contains.
Definition: RooPlot.cxx:691
RooRealVar represents a variable that can be changed from the outside.
Definition: RooRealVar.h:39
The Canvas class.
Definition: TCanvas.h:23
RooCmdArg Layout(Double_t xmin, Double_t xmax=0.99, Double_t ymin=0.95)
RooCmdArg Save(Bool_t flag=kTRUE)
RooCmdArg PrintLevel(Int_t code)
RooCmdArg VisualizeError(const RooDataSet &paramData, Double_t Z=1)
Double_t x[n]
Definition: legend1.C:17
const Rcpp::internal::NamedPlaceHolder & Label
The namespace RooFit contains mostly switches that change the behaviour of functions of PDFs (or othe...
Ta Range(0, 0, 1, 1)