Logo ROOT  
Reference Guide
rf204b_extendedLikelihood_rangedFit.py
Go to the documentation of this file.
1## \file
2## \ingroup tutorial_roofit
3## \notebook -nodraw
4## This macro demonstrates how to set up a fit in two ranges for plain
5## likelihoods and extended likelihoods.
6##
7## ### 1. Shape fits (plain likelihood)
8##
9## If you fit a non-extended pdf in two ranges, e.g. `pdf.fitTo(data,Range="Range1,Range2")`,
10## it will fit the shapes in the two selected ranges and also take into account the relative
11## predicted yields in those ranges.
12##
13## This is useful for example to represent a full-range fit, but with a
14## blinded signal region inside it.
15##
16##
17## ### 2. Shape+rate fits (extended likelihood)
18##
19## If your pdf is extended, i.e. measuring both the distribution in the observable as well
20## as the event count in the fitted region, some intervention is needed to make fits in ranges
21## work in a way that corresponds to intuition.
22##
23## If an extended fit is performed in a sub-range, the observed yield is only that of the subrange, hence
24## the expected event count will converge to a number that is smaller than what's visible in a plot.
25## In such cases, it is often preferred to interpret the extended term with respect to the full range
26## that's plotted, i.e., apply a correction to the extended likelihood term in such a way
27## that the interpretation of the expected event count remains that of the full range. This can
28## be done by applying a correcion factor (equal to the fraction of the pdf that is contained in the
29## fitted range) in the Poisson term that represents the extended likelihood term.
30##
31## If an extended likelihood fit is performed over *two* sub-ranges, this correction is
32## even more important: without it, each component likelihood would have a different interpretation
33## of the expected event count (each corresponding to the count in its own region), and a joint
34## fit of these regions with different interpretations of the same model parameter results
35## in a number that is not easily interpreted.
36##
37## If both regions correct their interpretatin such that N_expected refers to the full range,
38## it is interpreted easily, and consistent in both regions.
39##
40## This requires that the likelihood model is extended using RooAddPdf in the
41## form SumPdf = Nsig * sigPdf + Nbkg * bkgPdf.
42##
43## \macro_image
44## \macro_code
45## \macro_output
46##
47## \authors Stephan Hageboeck, Wouter Verkerke
48
49import ROOT
50
51ROOT.gROOT.SetBatch(True)
52
53# PART 1: Background-only fits
54# ----------------------------
55
56# Build plain exponential model
57x = ROOT.RooRealVar("x", "x", 10, 100)
58alpha = ROOT.RooRealVar("alpha", "alpha", -0.04, -0.1, -0.0)
59model = ROOT.RooExponential("model", "Exponential model", x, alpha)
60
61# Define side band regions and full range
62x.setRange("LEFT", 10, 20)
63x.setRange("RIGHT", 60, 100)
64
65x.setRange("FULL", 10, 100)
66
67data = model.generate(x, 10000)
68
69# Construct an extended pdf, which measures the event count N **on the full range**.
70# If the actual domain of x that is fitted is identical to FULL, this has no affect.
71#
72# If the fitted domain is a subset of `FULL`, though, the expected event count is divided by
73# \f[
74# \mathrm{frac} = \frac{
75# \int_{\mathrm{Fit range}} \mathrm{model}(x) \; \mathrm{d}x }{
76# \int_{\mathrm{Full range}} \mathrm{model}(x) \; \mathrm{d}x }.
77# \f]
78# `N` will therefore return the count extrapolated to the full range instead of the fit range.
79#
80# **Note**: When using a RooAddPdf for extending the likelihood, the same effect can be achieved with
81# [RooAddPdf::fixCoefRange()](https://root.cern.ch/doc/master/classRooAddPdf.html#ab631caf4b59e4c4221f8967aecbf2a65),
82
83N = ROOT.RooRealVar("N", "Extended term", 0, 20000)
84extmodel = ROOT.RooExtendPdf("extmodel", "Extended model", model, N, "FULL")
85
86
87# It can be instructive to fit the above model to either the LEFT or RIGHT
88# range. `N` should approximately converge to the expected number of events in
89# the full range. One may try to leave out `"FULL"` in the constructor, or the
90# interpretation of `N` changes.
91extmodel.fitTo(data, Range="LEFT", PrintLevel=-1)
92N.Print()
93
94
95# If we now do a simultaneous fit to the extended model, instead of the
96# original model, the LEFT and RIGHT range will each correct their local `N`
97# such that it refers to the `FULL` range.
98#
99# This joint fit of the extmodel will include (w.r.t. the plain model fit) a product of extended terms
100# \f[
101# L_\mathrm{ext} = L
102# \cdot \mathrm{Poisson} \left( N_\mathrm{obs}^\mathrm{LEFT} | N_\mathrm{exp} / \mathrm{frac LEFT} \right)
103# \cdot \mathrm{Poisson} \left( N_\mathrm{obs}^\mathrm{RIGHT} | N_\mathrm{exp} / \mathrm{frac RIGHT} \right)
104# \f]
105
106
107c = ROOT.TCanvas("c", "c", 2100, 700)
108c.Divide(3)
109c.cd(1)
110
111r = model.fitTo(data, Range="LEFT,RIGHT", PrintLevel=-1, Save=True)
112r.Print()
113
114frame = x.frame()
115data.plotOn(frame)
116model.plotOn(frame, VisualizeError=r)
117model.plotOn(frame)
118model.paramOn(frame, Label="Non-extended fit")
119frame.Draw()
120
121c.cd(2)
122
123r2 = extmodel.fitTo(data, Range="LEFT,RIGHT", PrintLevel=-1, Save=True)
124r2.Print()
125frame2 = x.frame()
126data.plotOn(frame2)
127extmodel.plotOn(frame2)
128extmodel.plotOn(frame2, VisualizeError=r2)
129extmodel.paramOn(frame2, Label="Extended fit", Layout=(0.4, 0.95))
130frame2.Draw()
131
132# PART 2: Extending with RooAddPdf
133# --------------------------------
134#
135# Now we repeat the above exercise, but instead of explicitly adding an extended term to a single shape pdf (RooExponential),
136# we assume that we already have an extended likelihood model in the form of a RooAddPdf constructed in the form `Nsig * sigPdf + Nbkg * bkgPdf`.
137#
138# We add a Gaussian to the previously defined exponential background.
139# The signal shape parameters are chosen constant, since the signal region is entirely in the blinded region, i.e., the fit has no sensitivity.
140
141c.cd(3)
142
143Nsig = ROOT.RooRealVar("Nsig", "Number of signal events", 1000, 0, 2000)
144Nbkg = ROOT.RooRealVar("Nbkg", "Number of background events", 10000, 0, 20000)
145
146mean = ROOT.RooRealVar("mean", "Mean of signal model", 40.0)
147width = ROOT.RooRealVar("width", "Width of signal model", 5.0)
148sig = ROOT.RooGaussian("sig", "Signal model", x, mean, width)
149
150modelsum = ROOT.RooAddPdf("modelsum", "NSig*signal + NBkg*background", [sig, model], [Nsig, Nbkg])
151
152# This model will automatically insert the correction factor for the
153# reinterpretation of Nsig and Nnbkg in the full ranges.
154#
155# When this happens, it reports this with lines like the following:
156# ```
157# [#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_modelsum_modelsumData_LEFT) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
158# [#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_modelsum_modelsumData_RIGHT) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
159# ```
160
161r3 = modelsum.fitTo(data, Range="LEFT,RIGHT", PrintLevel=-1, Save=True)
162r3.Print()
163
164frame3 = x.frame()
165data.plotOn(frame3)
166modelsum.plotOn(frame3)
167modelsum.plotOn(frame3, VisualizeError=r3)
168modelsum.paramOn(frame3, Label="S+B fit with RooAddPdf", Layout=(0.3, 0.95))
169frame3.Draw()
170
171c.Draw()
172
173c.SaveAs("rf204b_extendedLikelihood_rangedFit.png")