Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
RooFormula.cxx
Go to the documentation of this file.
1/*****************************************************************************
2 * Project: RooFit *
3 * Package: RooFitCore *
4 * @(#)root/roofitcore:$Id$
5 * Authors: *
6 * WV, Wouter Verkerke, UC Santa Barbara, verkerke@slac.stanford.edu *
7 * DK, David Kirkby, UC Irvine, dkirkby@uci.edu *
8 * *
9 * Copyright (c) 2000-2005, Regents of the University of California *
10 * and Stanford University. All rights reserved. *
11 * *
12 * Redistribution and use in source and binary forms, *
13 * with or without modification, are permitted according to the terms *
14 * listed in LICENSE (http://roofit.sourceforge.net/license.txt) *
15 *****************************************************************************/
16
17/**
18\file RooFormula.cxx
19\class RooFormula
20\ingroup Roofitcore
21
22Internally uses ROOT's TFormula to compute user-defined expressions of RooAbsArgs.
23
24The string expression can be any valid TFormula expression referring to the
25listed servers either by name or by their ordinal list position. These three are
26forms equivalent:
27```
28 RooFormula("formula", "x*y", RooArgList(x,y)) or
29 RooFormula("formula", "@0*@1", RooArgList(x,y))
30 RooFormula("formula", "x[0]*x[1]", RooArgList(x,y))
31```
32Note that `x[i]` is an expression reserved for TFormula. If a variable with
33the name `x` is given, the RooFormula interprets `x` as a variable name,
34but `x[i]` as an index in the list of variables.
35
36### Category expressions
37State information of RooAbsCategories can be accessed using the '::' operator,
38*i.e.*, `tagCat::Kaon` will resolve to the numerical value of
39the `Kaon` state of the RooAbsCategory object named `tagCat`.
40
41A formula to switch between lepton categories could look like this:
42```
43 RooFormula("formulaWithCat",
44 "x * (leptonMulti == leptonMulti::one) + y * (leptonMulti == leptonMulti::two)",
45 RooArgList(x, y, leptonMulti));
46```
47
48### Debugging a formula that won't compile
49When the formula is preprocessed, RooFit can print information in the debug stream.
50These can be retrieved by activating the RooFit::MsgLevel `RooFit::DEBUG`
51and the RooFit::MsgTopic `RooFit::InputArguments`.
52Check the tutorial rf506_msgservice.C for details.
53**/
54
55#include "RooFormula.h"
56#include "RooAbsReal.h"
57#include "RooAbsCategory.h"
58#include "RooArgList.h"
59#include "RooMsgService.h"
60#include "RooBatchCompute.h"
61
62#include "TObjString.h"
63
64#include <memory>
65#include <regex>
66#include <sstream>
67#include <cctype>
68
69using namespace std;
70
71namespace {
72
73////////////////////////////////////////////////////////////////////////////////
74/// Find all input arguments which are categories, and save this information in
75/// with the names of the variables that are being used to evaluate it.
76std::vector<bool> findCategoryServers(const RooAbsCollection& collection) {
77 std::vector<bool> output;
78 output.reserve(collection.size());
79
80 for (unsigned int i = 0; i < collection.size(); ++i) {
81 output.push_back(collection[i]->InheritsFrom(RooAbsCategory::Class()));
82 }
83
84 return output;
85}
86
87/// Convert `@i`-style references to `x[i]`.
88void convertArobaseReferences(std::string &formula)
89{
90 bool match = false;
91 for (std::size_t i = 0; i < formula.size(); ++i) {
92 if (match && !isdigit(formula[i])) {
93 formula.insert(formula.begin() + i, ']');
94 i += 1;
95 match = false;
96 } else if (!match && formula[i] == '@') {
97 formula[i] = 'x';
98 formula.insert(formula.begin() + i + 1, '[');
99 i += 1;
100 match = true;
101 }
102 }
103 if (match)
104 formula += ']';
105}
106
107/// Replace all occurrences of `what` with `with` inside of `inOut`.
108void replaceAll(std::string &inOut, std::string_view what, std::string_view with)
109{
110 for (std::string::size_type pos{}; inOut.npos != (pos = inOut.find(what.data(), pos, what.length()));
111 pos += with.length()) {
112 inOut.replace(pos, what.length(), with.data(), with.length());
113 }
114}
115
116/// Find the word boundaries with a static std::regex and return a bool vector
117/// flagging their positions. The end of the string is considered a word
118/// boundary.
119std::vector<bool> getWordBoundaryFlags(std::string const &s)
120{
121 static const std::regex r{"\\b"};
122 std::vector<bool> out(s.size() + 1);
123
124 for (auto i = std::sregex_iterator(s.begin(), s.end(), r); i != std::sregex_iterator(); ++i) {
125 std::smatch m = *i;
126 out[m.position()] = true;
127 }
128
129 // The end of a string is also a word boundary
130 out[s.size()] = true;
131
132 return out;
133}
134
135/// Replace all named references with "x[i]"-style.
136void replaceVarNamesWithIndexStyle(std::string &formula, RooArgList const &varList)
137{
138 std::vector<bool> isWordBoundary = getWordBoundaryFlags(formula);
139 for (unsigned int i = 0; i < varList.size(); ++i) {
140 std::string_view varName = varList[i].GetName();
141
142 std::stringstream replacementStream;
143 replacementStream << "x[" << i << "]";
144 std::string replacement = replacementStream.str();
145
146 for (std::string::size_type pos{}; formula.npos != (pos = formula.find(varName.data(), pos, varName.length()));
147 pos += replacement.size()) {
148
149 std::string::size_type next = pos + varName.length();
150
151 // The matched variable name has to be surrounded by word boundaries
152 // std::cout << pos << " " << next << std::endl;
153 if (!isWordBoundary[pos] || !isWordBoundary[next])
154 continue;
155
156 // Veto '[' and ']' as next characters. If the variable is called `x`
157 // or `0`, this might otherwise replace `x[0]`.
158 if (next < formula.size() && (formula[next] == '[' || formula[next] == ']')) {
159 continue;
160 }
161
162 // As we replace substrings in the middle of the string, we also have
163 // to update the word boundary flag vector. Note that we don't care
164 // the word boundaries in the `x[i]` are correct, as it has already
165 // been replaced.
166 std::size_t nOld = varName.length();
167 std::size_t nNew = replacement.size();
168 auto wbIter = isWordBoundary.begin() + pos;
169 if (nNew > nOld) {
170 isWordBoundary.insert(wbIter + nOld, nNew - nOld, false);
171 } else if (nNew < nOld) {
172 isWordBoundary.erase(wbIter + nNew, wbIter + nOld);
173 }
174
175 // Do the actual replacement
176 formula.replace(pos, varName.length(), replacement);
177 }
178
179 oocxcoutD(static_cast<TObject *>(nullptr), InputArguments)
180 << "Preprocessing formula: replace named references: " << varName << " --> " << replacement << "\n\t"
181 << formula << std::endl;
182 }
183}
184
185}
186
187
188////////////////////////////////////////////////////////////////////////////////
189/// Construct a new formula.
190/// \param[in] name Name of the formula.
191/// \param[in] formula Formula to be evaluated. Parameters/observables are identified by name
192/// or ordinal position in `varList`.
193/// \param[in] varList List of variables to be passed to the formula.
194/// \param[in] checkVariables Check that the variables being passed in the `varList` are used in
195/// the formula expression.
196RooFormula::RooFormula(const char* name, const char* formula, const RooArgList& varList,
197 bool checkVariables) :
198 TNamed(name, formula)
199{
200 _origList.add(varList);
201 _isCategory = findCategoryServers(_origList);
202
203 installFormulaOrThrow(formula);
204
205 RooArgList useList = usedVariables();
206 if (checkVariables && _origList.size() != useList.size()) {
207 coutI(InputArguments) << "The formula " << GetName() << " claims to use the variables " << _origList
208 << " but only " << useList << " seem to be in use."
209 << "\n inputs: " << formula << std::endl;
210 }
211}
212
213
214
215////////////////////////////////////////////////////////////////////////////////
216/// Copy constructor
217RooFormula::RooFormula(const RooFormula& other, const char* name) :
218 TNamed(name ? name : other.GetName(), other.GetTitle()), RooPrintable(other)
219{
220 _origList.add(other._origList);
221 _isCategory = findCategoryServers(_origList);
222
223 std::unique_ptr<TFormula> newTF;
224 if (other._tFormula) {
225 newTF = std::make_unique<TFormula>(*other._tFormula);
226 newTF->SetName(GetName());
227 }
228
229 _tFormula = std::move(newTF);
230}
231
232
233////////////////////////////////////////////////////////////////////////////////
234/// Process given formula by replacing all ordinal and name references by
235/// `x[i]`, where `i` matches the position of the argument in `_origList`.
236/// Further, references to category states such as `leptonMulti:one` are replaced
237/// by the category index.
238std::string RooFormula::processFormula(std::string formula) const {
239
240 // WARNING to developers: people use RooFormula a lot via RooGenericPdf and
241 // RooFormulaVar! Performance matters here. Avoid non-static std::regex,
242 // because constructing these can become a bottleneck because of the regex
243 // compilation.
244
245 cxcoutD(InputArguments) << "Preprocessing formula step 1: find category tags (catName::catState) in "
246 << formula << std::endl;
247
248 // Step 1: Find all category tags and the corresponding index numbers
249 static const std::regex categoryReg("(\\w+)::(\\w+)");
250 std::map<std::string, int> categoryStates;
251 for (sregex_iterator matchIt = sregex_iterator(formula.begin(), formula.end(), categoryReg);
252 matchIt != sregex_iterator(); ++matchIt) {
253 assert(matchIt->size() == 3);
254 const std::string fullMatch = (*matchIt)[0];
255 const std::string catName = (*matchIt)[1];
256 const std::string catState = (*matchIt)[2];
257
258 const auto catVariable = dynamic_cast<const RooAbsCategory*>(_origList.find(catName.c_str()));
259 if (!catVariable) {
260 cxcoutD(InputArguments) << "Formula " << GetName() << " uses '::' to reference a category state as '" << fullMatch
261 << "' but a category '" << catName << "' cannot be found in the input variables." << std::endl;
262 continue;
263 }
264
265 if (!catVariable->hasLabel(catState)) {
266 coutE(InputArguments) << "Formula " << GetName() << " uses '::' to reference a category state as '" << fullMatch
267 << "' but the category '" << catName << "' does not seem to have the state '" << catState << "'." << std::endl;
268 throw std::invalid_argument(formula);
269 }
270 const int catNum = catVariable->lookupIndex(catState);
271
272 categoryStates[fullMatch] = catNum;
273 cxcoutD(InputArguments) << "\n\t" << fullMatch << "\tname=" << catName << "\tstate=" << catState << "=" << catNum;
274 }
275 cxcoutD(InputArguments) << "-- End of category tags --"<< std::endl;
276
277 // Step 2: Replace all category tags
278 for (const auto& catState : categoryStates) {
279 replaceAll(formula, catState.first, std::to_string(catState.second));
280 }
281
282 cxcoutD(InputArguments) << "Preprocessing formula step 2: replace category tags\n\t" << formula << std::endl;
283
284 // Step 3: Convert `@i`-style references to `x[i]`
285 convertArobaseReferences(formula);
286
287 cxcoutD(InputArguments) << "Preprocessing formula step 3: replace '@'-references\n\t" << formula << std::endl;
288
289 // Step 4: Replace all named references with "x[i]"-style
290 replaceVarNamesWithIndexStyle(formula, _origList);
291
292 cxcoutD(InputArguments) << "Final formula:\n\t" << formula << std::endl;
293
294 return formula;
295}
296
297
298////////////////////////////////////////////////////////////////////////////////
299/// Analyse internal formula to find out which variables are actually in use.
301 RooArgList useList;
302 if (_tFormula == nullptr)
303 return useList;
304
305 const std::string formula(_tFormula->GetTitle());
306
307 std::set<unsigned int> matchedOrdinals;
308 static const std::regex newOrdinalRegex("\\bx\\[([0-9]+)\\]");
309 for (sregex_iterator matchIt = sregex_iterator(formula.begin(), formula.end(), newOrdinalRegex);
310 matchIt != sregex_iterator(); ++matchIt) {
311 assert(matchIt->size() == 2);
312 std::stringstream matchString((*matchIt)[1]);
313 unsigned int i;
314 matchString >> i;
315
316 matchedOrdinals.insert(i);
317 }
318
319 for (unsigned int i : matchedOrdinals) {
320 useList.add(_origList[i]);
321 }
322
323 return useList;
324}
325
326
327////////////////////////////////////////////////////////////////////////////////
328/// From the internal representation, construct a formula by replacing all index place holders
329/// with the names of the variables that are being used to evaluate it.
330std::string RooFormula::reconstructFormula(std::string internalRepr) const {
331 for (unsigned int i = 0; i < _origList.size(); ++i) {
332 const auto& var = _origList[i];
333 std::stringstream regexStr;
334 regexStr << "x\\[" << i << "\\]|@" << i;
335 std::regex regex(regexStr.str());
336
337 std::string replacement = std::string("[") + var.GetName() + "]";
338 internalRepr = std::regex_replace(internalRepr, regex, replacement);
339 }
340
341 return internalRepr;
342}
343
344
346{
347 printMultiline(std::cout, 0);
348}
349
350
351////////////////////////////////////////////////////////////////////////////////
352/// Change used variables to those with the same name in given list.
353/// \param[in] newDeps New dependents to replace the old ones.
354/// \param[in] mustReplaceAll Will yield an error if one dependent does not have a replacement.
355/// \param[in] nameChange Passed down to RooAbsArg::findNewServer(const RooAbsCollection&, bool) const.
356bool RooFormula::changeDependents(const RooAbsCollection& newDeps, bool mustReplaceAll, bool nameChange)
357{
358 //Change current servers to new servers with the same name given in list
359 bool errorStat = false;
360
361 // We only consider the usedVariables() for replacement, because only these
362 // are registered as servers.
363 for (const auto arg : usedVariables()) {
364 RooAbsReal* replace = static_cast<RooAbsReal*>(arg->findNewServer(newDeps,nameChange)) ;
365 if (replace) {
366 _origList.replace(*arg, *replace);
367
368 if (arg->getStringAttribute("origName")) {
369 replace->setStringAttribute("origName",arg->getStringAttribute("origName")) ;
370 } else {
371 replace->setStringAttribute("origName",arg->GetName()) ;
372 }
373
374 } else if (mustReplaceAll) {
375 coutE(LinkStateMgmt) << __func__ << ": cannot find replacement for " << arg->GetName() << std::endl;
376 errorStat = true;
377 }
378 }
379
380 _isCategory = findCategoryServers(_origList);
381
382 return errorStat;
383}
384
385
386
387////////////////////////////////////////////////////////////////////////////////
388/// Evaluate the internal TFormula.
389///
390/// First, all variables serving this instance are evaluated given the normalisation set,
391/// and then the formula is evaluated.
392/// \param[in] nset Normalisation set passed to evaluation of arguments serving values.
393/// \return The result of the evaluation.
394double RooFormula::eval(const RooArgSet* nset) const
395{
396 if (!_tFormula) {
397 coutF(Eval) << __func__ << " (" << GetName() << "): Formula didn't compile: " << GetTitle() << std::endl;
398 std::string what = "Formula ";
399 what += GetTitle();
400 what += " didn't compile.";
401 throw std::runtime_error(what);
402 }
403
404 std::vector<double> pars;
405 pars.reserve(_origList.size());
406 for (unsigned int i = 0; i < _origList.size(); ++i) {
407 if (_isCategory[i]) {
408 const auto& cat = static_cast<RooAbsCategory&>(_origList[i]);
409 pars.push_back(cat.getCurrentIndex());
410 } else {
411 const auto& real = static_cast<RooAbsReal&>(_origList[i]);
412 pars.push_back(real.getVal(nset));
413 }
414 }
415
416 return _tFormula->EvalPar(pars.data());
417}
418
419void RooFormula::computeBatch(double* output, size_t nEvents, RooFit::Detail::DataMap const& dataMap) const
420{
421 const int nPars=_origList.size();
422 std::vector<std::span<const double>> inputSpans(nPars);
423 for (int i=0; i<nPars; i++) {
424 std::span<const double> rhs = dataMap.at( static_cast<const RooAbsReal*>(&_origList[i]) );
425 inputSpans[i] = rhs;
426 }
427
428 std::vector<double> pars(nPars);
429 for (size_t i=0; i<nEvents; i++)
430 {
431 for (int j=0; j<nPars; j++) pars[j] = inputSpans[j].size()>1 ? inputSpans[j][i] : inputSpans[j][0];
432 output[i] = _tFormula->EvalPar( pars.data() );
433 }
434}
435
436////////////////////////////////////////////////////////////////////////////////
437/// Printing interface
438
439void RooFormula::printMultiline(ostream& os, Int_t /*contents*/, bool /*verbose*/, TString indent) const
440{
441 os << indent << "--- RooFormula ---" << std::endl;
442 os << indent << " Formula: '" << GetTitle() << "'" << std::endl;
443 os << indent << " Interpretation: '" << reconstructFormula(GetTitle()) << "'" << std::endl;
444 indent.Append(" ");
445 os << indent << "Servers: " << _origList << "\n";
446 os << indent << "In use : " << actualDependents() << std::endl;
447}
448
449
450////////////////////////////////////////////////////////////////////////////////
451/// Print value of formula
452
453void RooFormula::printValue(ostream& os) const
454{
455 os << const_cast<RooFormula*>(this)->eval(nullptr) ;
456}
457
458
459////////////////////////////////////////////////////////////////////////////////
460/// Print name of formula
461
462void RooFormula::printName(ostream& os) const
463{
464 os << GetName() ;
465}
466
467
468////////////////////////////////////////////////////////////////////////////////
469/// Print title of formula
470
471void RooFormula::printTitle(ostream& os) const
472{
473 os << GetTitle() ;
474}
475
476
477////////////////////////////////////////////////////////////////////////////////
478/// Print class name of formula
479
480void RooFormula::printClassName(ostream& os) const
481{
482 os << ClassName() ;
483}
484
485
486////////////////////////////////////////////////////////////////////////////////
487/// Print arguments of formula, i.e. dependents that are actually used
488
489void RooFormula::printArgs(ostream& os) const
490{
491 os << "[ actualVars=";
492 for (const auto arg : usedVariables()) {
493 os << " " << arg->GetName();
494 }
495 os << " ]";
496}
497
498
499////////////////////////////////////////////////////////////////////////////////
500/// Check that the formula compiles, and also fulfills the assumptions.
501///
502void RooFormula::installFormulaOrThrow(const std::string& formula) {
503 const std::string processedFormula = processFormula(formula);
504
505 cxcoutD(InputArguments) << "RooFormula '" << GetName() << "' will be compiled as "
506 << "\n\t" << processedFormula
507 << "\n and used as"
508 << "\n\t" << reconstructFormula(processedFormula)
509 << "\n with the parameters " << _origList << std::endl;
510
511 auto theFormula = std::make_unique<TFormula>(GetName(), processedFormula.c_str(), false);
512
513 if (!theFormula || !theFormula->IsValid()) {
514 std::stringstream msg;
515 msg << "RooFormula '" << GetName() << "' did not compile or is invalid."
516 << "\nInput:\n\t" << formula
517 << "\nPassed over to TFormula:\n\t" << processedFormula << std::endl;
518 coutF(InputArguments) << msg.str();
519 throw std::runtime_error(msg.str());
520 }
521
522 if (theFormula && theFormula->GetNdim() != 1) {
523 // TFormula thinks that we have a multi-dimensional formula, e.g. with variables x,y,z,t.
524 // We have to check now that this is not the case, as RooFit only uses the syntax x[0], x[1], x[2], ...
525 bool haveProblem = false;
526 std::stringstream msg;
527 msg << "TFormula interprets the formula " << formula << " as " << theFormula->GetNdim() << "-dimensional with the variable(s) {";
528 for (int i=1; i < theFormula->GetNdim(); ++i) {
529 const TString varName = theFormula->GetVarName(i);
530 if (varName.BeginsWith("x[") && varName[varName.Length()-1] == ']')
531 continue;
532
533 haveProblem = true;
534 msg << theFormula->GetVarName(i) << ",";
535 }
536 if (haveProblem) {
537 msg << "}, which could not be supplied by RooFit."
538 << "\nThe formula must be modified, or those variables must be supplied in the list of variables." << std::endl;
539 coutF(InputArguments) << msg.str();
540 throw std::invalid_argument(msg.str());
541 }
542 }
543
544 _tFormula = std::move(theFormula);
545}
size_t size(const MatrixT &matrix)
retrieve the size of a square matrix
#define coutI(a)
#define cxcoutD(a)
#define oocxcoutD(o, a)
#define coutF(a)
#define coutE(a)
static void indent(ostringstream &buf, int indent_level)
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t r
char name[80]
Definition TGX11.cxx:110
void setStringAttribute(const Text_t *key, const Text_t *value)
Associate string 'value' to this object under key 'key'.
A space to attach TBranches.
static TClass * Class()
Abstract container object that can hold multiple RooAbsArg objects.
const char * GetName() const override
Returns name of object.
virtual bool add(const RooAbsArg &var, bool silent=false)
Add the specified argument to list.
Storage_t::size_type size() const
virtual bool replace(const RooAbsArg &var1, const RooAbsArg &var2)
Replace var1 with var2 and return true for success.
RooAbsArg * find(const char *name) const
Find object with given name in list.
Abstract base class for objects that represent a real value and implements functionality common to al...
Definition RooAbsReal.h:59
RooArgList is a container object that can hold multiple RooAbsArg objects.
Definition RooArgList.h:22
RooArgSet is a container object that can hold multiple RooAbsArg objects.
Definition RooArgSet.h:55
std::span< const double > at(RooAbsArg const *arg, RooAbsArg const *caller=nullptr)
Definition DataMap.cxx:22
Internally uses ROOT's TFormula to compute user-defined expressions of RooAbsArgs.
Definition RooFormula.h:27
RooFormula(const char *name, const char *formula, const RooArgList &varList, bool checkVariables=true)
Construct a new formula.
void printTitle(std::ostream &os) const override
Print title of formula.
void printMultiline(std::ostream &os, Int_t contents, bool verbose=false, TString indent="") const override
Printing interface.
RooArgSet actualDependents() const
Return list of arguments which are used in the formula.
Definition RooFormula.h:39
RooArgList usedVariables() const
Analyse internal formula to find out which variables are actually in use.
void printValue(std::ostream &os) const override
Print value of formula.
void printName(std::ostream &os) const override
Print name of formula.
void computeBatch(double *output, size_t nEvents, RooFit::Detail::DataMap const &) const
void installFormulaOrThrow(const std::string &formulaa)
Check that the formula compiles, and also fulfills the assumptions.
void dump() const
DEBUG: Dump state information.
std::vector< bool > _isCategory
! Whether an element of the _origList is a category.
Definition RooFormula.h:80
std::string processFormula(std::string origFormula) const
Process given formula by replacing all ordinal and name references by x[i], where i matches the posit...
double eval(const RooArgSet *nset=nullptr) const
Evaluate all parameters/observables, and then evaluate formula.
bool changeDependents(const RooAbsCollection &newDeps, bool mustReplaceAll, bool nameChange)
Change used variables to those with the same name in given list.
std::string reconstructFormula(std::string internalRepr) const
From the internal representation, construct a formula by replacing all index place holders with the n...
void printClassName(std::ostream &os) const override
Print class name of formula.
RooArgList _origList
! Original list of dependents
Definition RooFormula.h:79
std::unique_ptr< TFormula > _tFormula
! The formula used to compute values
Definition RooFormula.h:81
void printArgs(std::ostream &os) const override
Print arguments of formula, i.e. dependents that are actually used.
A 'mix-in' base class that define the standard RooFit plotting and printing methods.
The TNamed class is the base class for all named ROOT classes.
Definition TNamed.h:29
const char * GetName() const override
Returns name of object.
Definition TNamed.h:47
const char * GetTitle() const override
Returns title of object.
Definition TNamed.h:48
Mother of all ROOT objects.
Definition TObject.h:41
virtual const char * ClassName() const
Returns name of class to which the object belongs.
Definition TObject.cxx:207
Basic string class.
Definition TString.h:139
Ssiz_t Length() const
Definition TString.h:418
Bool_t BeginsWith(const char *s, ECaseCompare cmp=kExact) const
Definition TString.h:624
static const char * what
Definition stlLoader.cc:6
TMarker m
Definition textangle.C:8
static void output()