Logo ROOT  
Reference Guide
rf204b_extendedLikelihood_rangedFit.C
Go to the documentation of this file.
1 /// \file
2 /// \ingroup tutorial_roofit
3 /// \notebook -nodraw
4 /// This macro demonstrates how to set up a fit in two ranges
5 /// such that it does not only fit the shapes in each region, but also
6 /// takes into account the relative normalization of the two.
7 ///
8 /// ### 1. Shape fits (plain likelihood)
9 ///
10 /// If you perform a fit in two ranges in RooFit, e.g. `pdf->fitTo(data,Range("Range1,Range2"))`
11 /// it will construct a simple simultaneous fit of the two regions.
12 ///
13 /// In case the pdf is not extended, i.e., a shape fit, it will only fit the shapes in the
14 /// two selected ranges, and not take into account the relative predicted yields in those ranges.
15 ///
16 /// In certain models (like exponential decays) and configurations (e.g. narrow ranges that are far apart),
17 /// the relative normalization of the ranges may carry much more information about the function parameters
18 /// than the shape of the distribution inside those ranges. Therefore, it is important to take that into
19 /// account.
20 ///
21 /// This is particularly important for cases where the 2-range fit is meant to be representative of
22 /// a full-range fit, but with a blinded signal region inside it.
23 ///
24 ///
25 /// ### 2. Shape+rate fits (extended likelihood)
26 ///
27 /// Also if your pdf is already extended, i.e. measuring both the distribution in the observable as well
28 /// as the event count in the fitted region, some intervention is needed to make fits in ranges
29 /// work in a way that corresponds to intuition.
30 ///
31 /// If an extended fit is performed in a sub-range, the observed yield is only that of the subrange, hence
32 /// the expected event count will converge to a number that is smaller than what's visible in a plot.
33 /// In such cases, it is often preferred to interpret the extended term with respect to the full range
34 /// that's plotted, i.e., apply a correction to the extended likelihood term in such a way
35 /// that the interpretation of the expected event count remains that of the full range. This can
36 /// be done by applying a correcion factor (equal to the fraction of the pdf that is contained in the
37 /// fitted range) in the Poisson term that represents the extended likelihood term.
38 ///
39 /// If an extended likelihood fit is performed over *two* sub-ranges, this correction is
40 /// even more important: without it, each component likelihood would have a different interpretation
41 /// of the expected event count (each corresponding to the count in its own region), and a joint
42 /// fit of these regions with different interpretations of the same model parameter results
43 /// in a number that is not easily interpreted.
44 ///
45 /// If both regions correct their interpretatin such that N_expected refers to the full range,
46 /// it is interpreted easily, and consistent in both regions.
47 ///
48 /// This requires that the likelihood model is extended using RooAddPdf in the
49 /// form SumPdf = Nsig * sigPdf + Nbkg * bkgPdf.
50 ///
51 /// \macro_image
52 /// \macro_code
53 /// \macro_output
54 ///
55 /// \authors Stephan Hageboeck, Wouter Verkerke
56 
57 #include "RooRealVar.h"
58 #include "RooExponential.h"
59 #include "RooGaussian.h"
60 #include "RooAddPdf.h"
61 #include "RooDataSet.h"
62 #include "RooPlot.h"
63 #include "RooExtendPdf.h"
64 #include "TCanvas.h"
65 using namespace RooFit;
66 
67 void rf204b_extendedLikelihood_rangedFit()
68 {
69 
70  // PART 1: Background-only fits
71  // ----------------------------
72 
73  // Build plain exponential model
74  RooRealVar x("x", "x", 10, 100);
75  RooRealVar alpha("alpha", "alpha", -0.04, -0.1, -0.0);
76  RooExponential model("model", "Exponential model", x, alpha);
77 
78  // Define side band regions and full range
79  x.setRange("LEFT",10,20);
80  x.setRange("RIGHT",60,100);
81 
82  x.setRange("FULL",10,100);
83 
84  RooDataSet* data = model.generate(x, 10000);
85 
86  // Construct an extended pdf, which measures the event count N **on the full range**.
87  // If the actual domain of x that is fitted is identical to FULL, this has no affect.
88  //
89  // If the fitted domain is a subset of `FULL`, though, the expected event count is divided by
90  // \f[
91  // \mathrm{frac} = \frac{
92  // \int_{\mathrm{Fit range}} \mathrm{model}(x) \; \mathrm{d}x }{
93  // \int_{\mathrm{Full range}} \mathrm{model}(x) \; \mathrm{d}x }.
94  // \f]
95  // `N` will therefore return the count extrapolated to the full range instead of the fit range.
96  //
97  // **Note**: When using a RooAddPdf for extending the likelihood, the same effect can be achieved with
98  // [RooAddPdf::fixCoefRange()](https://root.cern.ch/doc/master/classRooAddPdf.html#ab631caf4b59e4c4221f8967aecbf2a65),
99 
100  RooRealVar N("N", "Extended term", 0, 20000);
101  RooExtendPdf extmodel("extmodel", "Extended model", model, N, "FULL");
102 
103 
104  // It can be instructive to fit the above model to either the LEFT or RIGHT range. `N` should approximately converge to the expected number
105  // of events in the full range. One may try to leave out `"FULL"` in the constructor, o the the interpretation of `N` changes.
106  extmodel.fitTo(*data, Range("LEFT"), PrintLevel(-1));
107  N.Print();
108 
109 
110  // If we now do a simultaneous fit to the extended model, instead of the original model, the LEFT and RIGHT range will each correct their local
111  // `N` such that it refers to the `FULL` range.
112  //
113  // This joint fit of the extmodel will include (w.r.t. the plain model fit) a product of extended terms
114  // \f[
115  // L_\mathrm{ext} = L
116  // \cdot \mathrm{Poisson} \left( N_\mathrm{obs}^\mathrm{LEFT} | N_\mathrm{exp} / \mathrm{frac LEFT} \right)
117  // \cdot \mathrm{Poisson} \left( N_\mathrm{obs}^\mathrm{RIGHT} | N_\mathrm{exp} / \mathrm{frac RIGHT} \right)
118  // \f]
119  // that will introduce additional sensitivity of the likelihood to the slope parameter alpha of the exponential model through the `frac_LEFT` and `frac_RIGHT` integrals.
120  //
121  // In the extreme case of an exponential function and a fit in narrow LEFT and RIGHT ranges, this sensitivity may actually be larger
122  // than from the shapes.
123  //
124  // This is also nicely demonstrated in the example below where the uncertainty on alpha is almost 5x smaller if the extended term is included.
125 
126 
127  TCanvas* c = new TCanvas("c", "c", 2100, 700);
128  c->Divide(3);
129  c->cd(1);
130 
131  RooFitResult* r = model.fitTo(*data, Range("LEFT,RIGHT"), Save());
132 
133  RooPlot* frame = x.frame();
134  data->plotOn(frame);
135  model.plotOn(frame, VisualizeError(*r));
136  model.plotOn(frame);
137  model.paramOn(frame, Label("Bkg fit. Large errors since\nnormalisation ignored"));
138  frame->Draw();
139 
140  c->cd(2);
141 
142  RooFitResult* r2 = extmodel.fitTo(*data, Range("LEFT,RIGHT"), Save());
143  RooPlot* frame2 = x.frame();
144  data->plotOn(frame2);
145  extmodel.plotOn(frame2);
146  extmodel.plotOn(frame2, VisualizeError(*r2));
147  extmodel.paramOn(frame2, Label("Bkg fit. Normalisation\nincluded"), Layout(0.4,0.95));
148  frame2->Draw();
149 
150  // PART 2: Extending with RooAddPdf
151  // --------------------------------
152  //
153  // Now we repeat the above exercise, but instead of explicitly adding an extended term to a single shape pdf (RooExponential),
154  // we assume that we already have an extended likelihood model in the form of a RooAddPdf constructed in the form `Nsig * sigPdf + Nbkg * bkgPdf`.
155  //
156  // We add a Gaussian to the previously defined exponential background.
157  // The signal shape parameters are chosen constant, since the signal region is entirely in the blinded region, i.e., the fit has no sensitivity.
158 
159  c->cd(3);
160 
161  RooRealVar Nsig("Nsig", "Number of signal events", 1000, 0, 2000);
162  RooRealVar Nbkg("Nbkg", "Number of background events", 10000, 0, 20000);
163 
164  RooRealVar mean("mean", "Mean of signal model", 40.);
165  RooRealVar width("width", "Width of signal model", 5.);
166  RooGaussian sig("sig", "Signal model", x, mean, width);
167 
168  RooAddPdf modelsum("modelsum", "NSig*signal + NBkg*background", RooArgSet(sig, model), RooArgSet(Nsig, Nbkg));
169 
170  // This model will automatically insert the correction factor for the reinterpretation of Nsig and Nnbkg in the full ranges.
171  //
172  // When this happens, it reports this with lines like the following:
173  // ```
174  // [#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_modelsum_modelsumData_LEFT) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
175  // [#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_modelsum_modelsumData_RIGHT) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
176  // ```
177 
178  RooFitResult* r3 = modelsum.fitTo(*data, Range("LEFT,RIGHT"), Save());
179 
180  RooPlot* frame3 = x.frame();
181  data->plotOn(frame3);
182  modelsum.plotOn(frame3);
183  modelsum.plotOn(frame3, VisualizeError(*r3));
184  modelsum.paramOn(frame3, Label("S+B fit with RooAddPdf"), Layout(0.3,0.95));
185  frame3->Draw();
186 
187  c->Draw();
188 }
c
#define c(i)
Definition: RSha256.hxx:101
RooPlot::Draw
virtual void Draw(Option_t *options=0)
Draw this plot and all of the elements it contains.
Definition: RooPlot.cxx:691
RooAddPdf
RooAddPdf is an efficient implementation of a sum of PDFs of the form.
Definition: RooAddPdf.h:32
RooExtendPdf.h
RooFit::PrintLevel
RooCmdArg PrintLevel(Int_t code)
Definition: RooGlobalFunc.cxx:192
RooFit::VisualizeError
RooCmdArg VisualizeError(const RooDataSet &paramData, Double_t Z=1)
Definition: RooGlobalFunc.cxx:71
r
ROOT::R::TRInterface & r
Definition: Object.C:4
RooGaussian.h
width
include TDocParser_001 C image html pict1_TDocParser_001 png width
Definition: TDocParser.cxx:121
N
#define N
RooExtendPdf
RooExtendPdf is a wrapper around an existing PDF that adds a parameteric extended likelihood term to ...
Definition: RooExtendPdf.h:22
x
Double_t x[n]
Definition: legend1.C:17
RooGaussian
Plain Gaussian p.d.f.
Definition: RooGaussian.h:24
RooAddPdf.h
TCanvas.h
RooDataSet.h
RooFitResult
RooFitResult is a container class to hold the input and output of a PDF fit to a dataset.
Definition: RooFitResult.h:40
RooFit::Label
RooCmdArg Label(const char *str)
Definition: RooGlobalFunc.cxx:222
RooFit::Layout
RooCmdArg Layout(Double_t xmin, Double_t xmax=0.99, Double_t ymin=0.95)
Definition: RooGlobalFunc.cxx:223
RooFit
The namespace RooFit contains mostly switches that change the behaviour of functions of PDFs (or othe...
Definition: RooCFunction1Binding.h:29
RooAbsData::plotOn
virtual RooPlot * plotOn(RooPlot *frame, const RooCmdArg &arg1=RooCmdArg::none(), const RooCmdArg &arg2=RooCmdArg::none(), const RooCmdArg &arg3=RooCmdArg::none(), const RooCmdArg &arg4=RooCmdArg::none(), const RooCmdArg &arg5=RooCmdArg::none(), const RooCmdArg &arg6=RooCmdArg::none(), const RooCmdArg &arg7=RooCmdArg::none(), const RooCmdArg &arg8=RooCmdArg::none()) const
Definition: RooAbsData.cxx:547
RooPlot.h
RooPlot
A RooPlot is a plot frame and a container for graphics objects within that frame.
Definition: RooPlot.h:44
RooRealVar.h
RooExponential.h
TCanvas
The Canvas class.
Definition: TCanvas.h:23
RooFit::Range
RooCmdArg Range(const char *rangeName, Bool_t adjustNorm=kTRUE)
Definition: RooGlobalFunc.cxx:53
RooDataSet
RooDataSet is a container class to hold unbinned data.
Definition: RooDataSet.h:33
make_cnn_model.model
model
Definition: make_cnn_model.py:6
RooRealVar
RooRealVar represents a variable that can be changed from the outside.
Definition: RooRealVar.h:37
RooFit::Save
RooCmdArg Save(Bool_t flag=kTRUE)
Definition: RooGlobalFunc.cxx:190
RooExponential
Exponential PDF.
Definition: RooExponential.h:25
RooArgSet
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition: RooArgSet.h:29