Logo ROOT  
Reference Guide
RooBinSamplingPdf.cxx
Go to the documentation of this file.
1 // Authors: Stephan Hageboeck, CERN; Andrea Sciandra, SCIPP-UCSC/Atlas; Nov 2020
2 
3 /*****************************************************************************
4  * RooFit
5  * Authors: *
6  * WV, Wouter Verkerke, UC Santa Barbara, verkerke@slac.stanford.edu *
7  * DK, David Kirkby, UC Irvine, dkirkby@uci.edu *
8  * *
9  * Copyright (c) 2000-2020, Regents of the University of California *
10  * and Stanford University. All rights reserved. *
11  * *
12  * Redistribution and use in source and binary forms, *
13  * with or without modification, are permitted according to the terms *
14  * listed in LICENSE (http://roofit.sourceforge.net/license.txt) *
15  *****************************************************************************/
16 
17 /**
18  * \class RooBinSamplingPdf
19  * The RooBinSamplingPdf is supposed to be used as an adapter between a continuous PDF
20  * and a binned distribution.
21  * When RooFit is used to fit binned data, and the PDF is continuous, it takes the probability density
22  * at the bin centre as a proxy for the probability averaged (integrated) over the entire bin. This is
23  * correct only if the second derivative of the function vanishes, though. This is shown in the plots
24  * below.
25  *
26  * For PDFs that have larger curvatures, the RooBinSamplingPdf can be used. It integrates the PDF in each
27  * bin using an adaptive integrator. This usually requires 21 times more function evaluations, but significantly
28  * reduces biases due to better sampling of the PDF. The integrator can be accessed from the outside
29  * using integrator(). This can be used to change the integration rules, so less/more function evaluations are
30  * performed. The target precision of the integrator can be set in the constructor.
31  *
32  *
33  * ### How to use it
34  * There are two ways to use this class:
35  * - Manually wrap a PDF:
36  * ```
37  * RooBinSamplingPdf binSampler("<name>", "title", <binned observable of PDF>, <original PDF> [, <precision for integrator>]);
38  * binSampler.fitTo(data);
39  * ```
40  * When a PDF is wrapped with a RooBinSamplingPDF, just use the bin sampling PDF instead of the original one for fits
41  * or plotting etc.
42  * \note The binning will be taken from the observable. Make sure that this binning is the same as the one of the dataset that should be fit.
43  * Use RooRealVar::setBinning() to adapt it.
44  * - Instruct test statistics to carry out this wrapping automatically:
45  * ```
46  * pdf.fitTo(data, IntegrateBins(<precision>));
47  * ```
48  * This method is especially useful when used with a simultaneous PDF, since each component will automatically be wrapped,
49  * depending on the value of `precision`:
50  * - `precision < 0.`: None of the PDFs are touched, bin sampling is off.
51  * - `precision = 0.`: Continuous PDFs that are fit to a RooDataHist are wrapped into a RooBinSamplingPdf. The target precision
52  * forwarded to the integrator is 1.E-4 (the default argument of the constructor).
53  * - `precision > 0.`: All continuous PDFs are automatically wrapped into a RooBinSamplingPdf, regardless of what data they are
54  * fit to (see next paragraph). The same `'precision'` is used for all integrators.
55  *
56  * ### Simulating a binned fit using RooDataSet
57  * Some frameworks use unbinned data (RooDataSet) to simulate binned datasets. By adding one entry for each bin centre with the
58  * appropriate weight, one can achieve the same result as fitting with RooDataHist. In this case, however, RooFit cannot
59  * auto-detect that a binned fit is running, and that an integration over the bin is desired (note that there are no bins to
60  * integrate over in this kind of dataset).
61  *
62  * In this case, `IntegrateBins(>0.)` needs to be used, and the desired binning needs to be assigned to the observable
63  * of the dataset:
64  * ```
65  * RooRealVar x("x", "x", 0., 5.);
66  * x.setBins(10);
67  *
68  * // <create dataset and model>
69  *
70  * model.fitTo(data, IntegrateBins(>0.));
71  * ```
72  *
73  * \see RooAbsPdf::fitTo()
74  * \see IntegrateBins()
75  *
76  * \note This feature is currently limited to one-dimensional PDFs.
77  *
78  *
79  * \htmlonly <style>div.image img[src="RooBinSamplingPdf_OFF.png"]{width:12cm;}</style> \endhtmlonly
80  * \htmlonly <style>div.image img[src="RooBinSamplingPdf_ON.png" ]{width:12cm;}</style> \endhtmlonly
81  * <table>
82  * <tr><th> Binned fit without %RooBinSamplingPdf <th> Binned fit with %RooBinSamplingPdf </td></tr>
83  * <tr><td> \image html RooBinSamplingPdf_OFF.png ""
84  * </td>
85  * <td> \image html RooBinSamplingPdf_ON.png ""
86  * </td></tr>
87  * </table>
88  *
89  */
90 
91 
92 #include "RooBinSamplingPdf.h"
93 
94 #include "RooHelpers.h"
95 #include "RooRealBinding.h"
96 #include "RunContext.h"
97 
98 #include "Math/Integrator.h"
99 
100 #include <algorithm>
101 
102 ////////////////////////////////////////////////////////////////////////////////
103 /// Construct a new RooBinSamplingPdf.
104 /// \param[in] name A name to identify this object.
105 /// \param[in] title Title (for e.g. plotting)
106 /// \param[in] observable Observable to integrate over (the one that is binned).
107 /// \param[in] inputPdf A PDF whose bins should be sampled with higher precision.
108 /// \param[in] epsilon Relative precision for the integrator, which is used to sample the bins.
109 /// Note that ROOT's default is to use an adaptive integrator, which in its first iteration usually reaches
110 /// relative precision of 1.E-4 or better. Therefore, asking for lower precision rarely has an effect.
111 RooBinSamplingPdf::RooBinSamplingPdf(const char *name, const char *title, RooAbsRealLValue& observable,
112  RooAbsPdf& inputPdf, double epsilon) :
113  RooAbsPdf(name, title),
114  _pdf("inputPdf", "Function to be converted into a PDF", this, inputPdf),
115  _observable("observable", "Observable to integrate over", this, observable, true, true),
116  _relEpsilon(epsilon) {
117  if (!_pdf->dependsOn(*_observable)) {
118  throw std::invalid_argument(std::string("RooBinSamplingPDF(") + GetName()
119  + "): The PDF " + _pdf->GetName() + " needs to depend on the observable "
120  + _observable->GetName());
121  }
122 }
123 
124 
125  ////////////////////////////////////////////////////////////////////////////////
126  /// Copy a RooBinSamplingPdf.
127  /// \param[in] other PDF to copy.
128  /// \param[in] name Optionally rename the copy.
130  RooAbsPdf(other, name),
131  _pdf("inputPdf", this, other._pdf),
132  _observable("observable", this, other._observable),
133  _relEpsilon(other._relEpsilon) { }
134 
135 
136 ////////////////////////////////////////////////////////////////////////////////
137 /// Integrate the PDF over the current bin of the observable.
139  const unsigned int bin = _observable->getBin();
140  const double low = _observable->getBinning().binLow(bin);
141  const double high = _observable->getBinning().binHigh(bin);
142 
143  const double oldX = _observable->getVal();
144  double result;
145  {
146  // Important: When the integrator samples x, caching of sub-tree values needs to be off.
148  result = integrate(_normSet, low, high) / (high-low);
149  }
150 
151  _observable->setVal(oldX);
152 
153  return result;
154 }
155 
156 
157 ////////////////////////////////////////////////////////////////////////////////
158 /// Integrate the PDF over all its bins, and return a batch with those values.
159 /// \param[in,out] evalData Struct with evaluation data.
160 /// \param[in] normSet Normalisation set that's used to evaluate the PDF.
162  // Retrieve binning, which we need to compute the probabilities
163  auto boundaries = binBoundaries();
164  auto xValues = _observable->getValues(evalData, normSet);
165  auto results = evalData.makeBatch(this, xValues.size());
166 
167  // Important: When the integrator samples x, caching of sub-tree values needs to be off.
169 
170  // Now integrate PDF in each bin:
171  for (unsigned int i=0; i < xValues.size(); ++i) {
172  const double x = xValues[i];
173  const auto upperIt = std::upper_bound(boundaries.begin(), boundaries.end(), x);
174  const unsigned int bin = std::distance(boundaries.begin(), upperIt) - 1;
175  assert(bin < boundaries.size());
176 
177  results[i] = integrate(normSet, boundaries[bin], boundaries[bin+1]) / (boundaries[bin+1]-boundaries[bin]);
178  }
179 
180  return results;
181 }
182 
183 
184 ////////////////////////////////////////////////////////////////////////////////
185 /// Get the bin boundaries for the observable.
186 /// These will be recomputed whenever the shape of this object is dirty.
188  if (isShapeDirty() || _binBoundaries.empty()) {
189  _binBoundaries.clear();
190  const RooAbsBinning& binning = _observable->getBinning(nullptr);
191  const double* boundaries = binning.array();
192 
193  for (int i=0; i < binning.numBoundaries(); ++i) {
194  _binBoundaries.push_back(boundaries[i]);
195  }
196 
197  assert(std::is_sorted(_binBoundaries.begin(), _binBoundaries.end()));
198 
199  clearShapeDirty();
200  }
201 
202  return {_binBoundaries};
203 }
204 
205 
206 ////////////////////////////////////////////////////////////////////////////////
207 /// Return a list of all bin boundaries, so the PDF is plotted correctly.
208 /// \param[in] obs Observable to generate the boundaries for.
209 /// \param[in] xlo Beginning of range to create list of boundaries for.
210 /// \param[in] xhi End of range to create to create list of boundaries for.
211 /// \return Pointer to a list to be deleted by caller.
212 std::list<double>* RooBinSamplingPdf::binBoundaries(RooAbsRealLValue& obs, Double_t xlo, Double_t xhi) const {
213  if (obs.namePtr() != _observable->namePtr()) {
214  coutE(Plotting) << "RooBinSamplingPdf::binBoundaries(" << GetName() << "): observable '" << obs.GetName()
215  << "' is not the observable of this PDF ('" << _observable->GetName() << "')." << std::endl;
216  return nullptr;
217  }
218 
219  auto list = new std::list<double>;
220  for (double val : binBoundaries()) {
221  if (xlo <= val && val < xhi)
222  list->push_back(val);
223  }
224 
225  return list;
226 }
227 
228 
229 ////////////////////////////////////////////////////////////////////////////////
230 /// Return a list of all bin edges, so the PDF is plotted as a step function.
231 /// \param[in] obs Observable to generate the sampling hint for.
232 /// \param[in] xlo Beginning of range to create sampling hint for.
233 /// \param[in] xhi End of range to create sampling hint for.
234 /// \return Pointer to a list to be deleted by caller.
235 std::list<double>* RooBinSamplingPdf::plotSamplingHint(RooAbsRealLValue& obs, Double_t xlo, Double_t xhi) const {
236  if (obs.namePtr() != _observable->namePtr()) {
237  coutE(Plotting) << "RooBinSamplingPdf::plotSamplingHint(" << GetName() << "): observable '" << obs.GetName()
238  << "' is not the observable of this PDF ('" << _observable->GetName() << "')." << std::endl;
239  return nullptr;
240  }
241 
242  auto binEdges = new std::list<double>;
243  const auto& binning = obs.getBinning();
244 
245  for (unsigned int bin=0, numBins = static_cast<unsigned int>(binning.numBins()); bin < numBins; ++bin) {
246  const double low = std::max(binning.binLow(bin), xlo);
247  const double high = std::min(binning.binHigh(bin), xhi);
248  const double width = high - low;
249 
250  // Check if this bin is in plotting range at all
251  if (low >= high)
252  continue;
253 
254  // Move support points slightly inside the bin, so step function is plotted correctly.
255  binEdges->push_back(low + 0.001 * width);
256  binEdges->push_back(high - 0.001 * width);
257  }
258 
259  return binEdges;
260 }
261 
262 
263 ////////////////////////////////////////////////////////////////////////////////
264 /// Direct access to the unique_ptr holding the integrator that's used to sample the bins.
265 /// This can be used to change options such as sampling accuracy or to entirely exchange the integrator.
266 ///
267 /// #### Example: Use the 61-point Gauss-Kronrod integration rule
268 /// ```{.cpp}
269 /// ROOT::Math::IntegratorOneDimOptions intOptions = pdf.integrator()->Options();
270 /// intOptions.SetNPoints(6); // 61-point integration rule
271 /// intOptions.SetRelTolerance(1.E-9); // Smaller tolerance -> more subdivisions
272 /// pdf.integrator()->SetOptions(intOptions);
273 /// ```
274 /// \see ROOT::Math::IntegratorOneDim::SetOptions for more details on integration options.
275 /// \note When RooBinSamplingPdf is loaded from files, integrator options will fall back to the default values.
276 std::unique_ptr<ROOT::Math::IntegratorOneDim>& RooBinSamplingPdf::integrator() const {
277  if (!_integrator) {
279  ROOT::Math::IntegrationOneDim::kADAPTIVE, // GSL Integrator. Will really get it only if MathMore enabled.
280  -1., _relEpsilon, // Abs epsilon = default, rel epsilon set by us.
281  0, // We don't limit the sub-intervals. Steer run time via _relEpsilon.
282  2 // This should read ROOT::Math::Integration::kGAUSS21, but this is in MathMore, so we cannot include it here.
283  ));
284  }
285 
286  return _integrator;
287 }
288 
289 
290 ////////////////////////////////////////////////////////////////////////////////
291 /// Binding used by the integrator to evaluate the PDF.
292 double RooBinSamplingPdf::operator()(double x) const {
293  _observable->setVal(x);
295 }
296 
297 
298 ////////////////////////////////////////////////////////////////////////////////
299 /// Integrate the wrapped PDF using our current integrator, with given norm set and limits.
300 double RooBinSamplingPdf::integrate(const RooArgSet* normSet, double low, double high) const {
301  // Need to set this because operator() only takes one argument.
302  _normSetForIntegrator = normSet;
303 
304  return integrator()->Integral(low, high);
305 }
306 
RooAbsPdf::_normSet
RooArgSet * _normSet
Normalization integral (owned by _normMgr)
Definition: RooAbsPdf.h:325
RooBinSamplingPdf::RooBinSamplingPdf
RooBinSamplingPdf()
Definition: RooBinSamplingPdf.h:31
RooHelpers.h
RooBatchCompute::RunContext::makeBatch
RooSpan< double > makeBatch(const RooAbsReal *owner, std::size_t size)
Create a writable batch.
Definition: RunContext.cxx:87
Integrator.h
coutE
#define coutE(a)
Definition: RooMsgService.h:33
RooAbsArg::namePtr
const TNamed * namePtr() const
De-duplicated pointer to this object's name.
Definition: RooAbsArg.h:543
RooAbsReal::getVal
Double_t getVal(const RooArgSet *normalisationSet=nullptr) const
Evaluate object.
Definition: RooAbsReal.h:91
width
include TDocParser_001 C image html pict1_TDocParser_001 png width
Definition: TDocParser.cxx:121
RooBinSamplingPdf::_observable
RooTemplateProxy< RooAbsRealLValue > _observable
Definition: RooBinSamplingPdf.h:111
x
Double_t x[n]
Definition: legend1.C:17
RooAbsArg::inhibitDirty
Bool_t inhibitDirty() const
Delete watch flag.
Definition: RooAbsArg.cxx:116
RooBinSamplingPdf.h
RooBinSamplingPdf::_integrator
std::unique_ptr< ROOT::Math::IntegratorOneDim > _integrator
Default integrator precision.
Definition: RooBinSamplingPdf.h:114
RooAbsBinning::array
virtual Double_t * array() const =0
epsilon
REAL epsilon
Definition: triangle.c:617
RooAbsBinning::binLow
virtual Double_t binLow(Int_t bin) const =0
RooAbsBinning::binHigh
virtual Double_t binHigh(Int_t bin) const =0
RooFit::Plotting
@ Plotting
Definition: RooGlobalFunc.h:60
RooAbsBinning
RooAbsBinning is the abstract base class for RooRealVar binning definitions.
Definition: RooAbsBinning.h:26
RooBinSamplingPdf::_pdf
RooTemplateProxy< RooAbsPdf > _pdf
Definition: RooBinSamplingPdf.h:110
RooBinSamplingPdf::_binBoundaries
std::vector< double > _binBoundaries
Integrator used to sample bins.
Definition: RooBinSamplingPdf.h:115
RooBinSamplingPdf::operator()
double operator()(double x) const
Binding used by the integrator to evaluate the PDF.
Definition: RooBinSamplingPdf.cxx:292
RooAbsArg::clearShapeDirty
void clearShapeDirty() const
Definition: RooAbsArg.h:577
RooAbsArg::isShapeDirty
Bool_t isShapeDirty() const
Definition: RooAbsArg.h:416
RooAbsRealLValue::getBin
virtual Int_t getBin(const char *rangeName=0) const
Definition: RooAbsRealLValue.h:53
Double_t
double Double_t
Definition: RtypesCore.h:59
RooBinSamplingPdf::_relEpsilon
double _relEpsilon
Definition: RooBinSamplingPdf.h:112
RooBinSamplingPdf::evaluate
double evaluate() const override
Integrate the PDF over the current bin of the observable.
Definition: RooBinSamplingPdf.cxx:138
RooAbsBinning::numBoundaries
virtual Int_t numBoundaries() const =0
ROOT::Math::IntegrationOneDim::kADAPTIVE
@ kADAPTIVE
to be used for general functions without singularities
Definition: AllIntegrationTypes.h:36
RooBinSamplingPdf::plotSamplingHint
std::list< double > * plotSamplingHint(RooAbsRealLValue &obs, double xlo, double xhi) const override
Return a list of all bin edges, so the PDF is plotted as a step function.
Definition: RooBinSamplingPdf.cxx:235
name
char name[80]
Definition: TGX11.cxx:110
RooAbsArg::dependsOn
Bool_t dependsOn(const RooAbsCollection &serverList, const RooAbsArg *ignoreArg=0, Bool_t valueOnly=kFALSE) const
Test whether we depend on (ie, are served by) any object in the specified collection.
Definition: RooAbsArg.cxx:799
RooBinSamplingPdf
The RooBinSamplingPdf is supposed to be used as an adapter between a continuous PDF and a binned dist...
Definition: RooBinSamplingPdf.h:28
RunContext.h
ROOT::Math::IntegratorOneDim
User Class for performing numerical integration of a function in one dimension.
Definition: Integrator.h:98
RooBinSamplingPdf::evaluateSpan
RooSpan< double > evaluateSpan(RooBatchCompute::RunContext &evalData, const RooArgSet *normSet) const override
Integrate the PDF over all its bins, and return a batch with those values.
Definition: RooBinSamplingPdf.cxx:161
RooRealBinding.h
RooAbsReal::getValues
virtual RooSpan< const double > getValues(RooBatchCompute::RunContext &evalData, const RooArgSet *normSet=nullptr) const
by this change, please consult the release notes for ROOT 6.24 for guidance on how to make this trans...
Definition: RooAbsReal.cxx:312
RooBinSamplingPdf::integrator
std::unique_ptr< ROOT::Math::IntegratorOneDim > & integrator() const
Direct access to the unique_ptr holding the integrator that's used to sample the bins.
Definition: RooBinSamplingPdf.cxx:276
RooAbsPdf
Definition: RooAbsPdf.h:41
TNamed::GetName
virtual const char * GetName() const
Returns name of object.
Definition: TNamed.h:47
RooBinSamplingPdf::binBoundaries
RooSpan< const double > binBoundaries() const
Get the bin boundaries for the observable.
Definition: RooBinSamplingPdf.cxx:187
RooAbsRealLValue::setVal
virtual void setVal(Double_t value)=0
Set the current value of the object. Needs to be overridden by implementations.
RooAbsRealLValue
RooAbsRealLValue is the common abstract base class for objects that represent a real value that may a...
Definition: RooAbsRealLValue.h:31
RooBatchCompute::RunContext
This struct enables passing computation data around between elements of a computation graph.
Definition: RunContext.h:31
RooAbsRealLValue::getBinning
virtual const RooAbsBinning & getBinning(const char *name=0, Bool_t verbose=kTRUE, Bool_t createOnTheFly=kFALSE) const =0
Retrive binning configuration with given name or default binning.
RooSpan
A simple container to hold a batch of data values.
Definition: RooSpan.h:34
RooArgSet
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition: RooArgSet.h:33
RooBinSamplingPdf::_normSetForIntegrator
const RooArgSet * _normSetForIntegrator
Workspace to store data for bin sampling.
Definition: RooBinSamplingPdf.h:116
RooHelpers::DisableCachingRAII
Disable all caches for sub-branches in an expression tree.
Definition: RooHelpers.h:105
RooBinSamplingPdf::integrate
double integrate(const RooArgSet *normSet, double low, double high) const
Integrate the wrapped PDF using our current integrator, with given norm set and limits.
Definition: RooBinSamplingPdf.cxx:300