{
"cells": [
{
"cell_type": "markdown",
"id": "2c04b311",
"metadata": {},
"source": [
"# rs_bernsteinCorrection\n",
"Example of the BernsteinCorrection utility in RooStats.\n",
"\n",
"The idea is that one has a distribution coming either from data or Monte Carlo\n",
"(called \"reality\" in the macro) and a nominal model that is not sufficiently\n",
"flexible to take into account the real distribution. One wants to take into\n",
"account the systematic associated with this imperfect modeling by augmenting\n",
"the nominal model with some correction term (in this case a polynomial).\n",
"The BernsteinCorrection utility will import into your workspace a corrected model\n",
"given by nominal(x) * poly_N(x), where poly_N is an n-th order polynomial in\n",
"the Bernstein basis. The degree N of the polynomial is chosen by specifying the tolerance\n",
"one has in adding an extra term to the polynomial.\n",
"The Bernstein basis is nice because it only has positive-definite terms\n",
"and works well with PDFs.\n",
"Finally, the macro makes a plot of:\n",
" - the data (drawn from 'reality'),\n",
" - the best fit of the nominal model (blue)\n",
" - and the best fit corrected model.\n",
"\n",
"\n",
"\n",
"\n",
"**Author:** Kyle Cranmer \n",
"This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:36 PM."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e90bebbe",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:06.299834Z",
"iopub.status.busy": "2026-05-19T20:36:06.299705Z",
"iopub.status.idle": "2026-05-19T20:36:06.316411Z",
"shell.execute_reply": "2026-05-19T20:36:06.315897Z"
}
},
"outputs": [],
"source": [
"%%cpp -d\n",
"#include \"RooDataSet.h\"\n",
"#include \"RooRealVar.h\"\n",
"#include \"RooConstVar.h\"\n",
"#include \"RooBernstein.h\"\n",
"#include \"TCanvas.h\"\n",
"#include \"RooAbsPdf.h\"\n",
"#include \"RooFitResult.h\"\n",
"#include \"RooPlot.h\"\n",
"#include \n",
"#include \n",
"#include \n",
"#include \n",
"#include \n",
"\n",
"#include \"RooProdPdf.h\"\n",
"#include \"RooAddPdf.h\"\n",
"#include \"RooGaussian.h\"\n",
"#include \"RooProfileLL.h\"\n",
"#include \"RooWorkspace.h\"\n",
"\n",
"#include \"RooStats/BernsteinCorrection.h\"\n",
"\n",
"using namespace RooFit;\n",
"using namespace RooStats;"
]
},
{
"cell_type": "markdown",
"id": "da78309b",
"metadata": {},
"source": [
"set range of observable"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "69bea550",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:06.318004Z",
"iopub.status.busy": "2026-05-19T20:36:06.317890Z",
"iopub.status.idle": "2026-05-19T20:36:06.643078Z",
"shell.execute_reply": "2026-05-19T20:36:06.641756Z"
}
},
"outputs": [],
"source": [
"Double_t lowRange = -1, highRange = 5;"
]
},
{
"cell_type": "markdown",
"id": "45a95b5b",
"metadata": {},
"source": [
"make a RooRealVar for the observable"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "1bda47eb",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:06.644650Z",
"iopub.status.busy": "2026-05-19T20:36:06.644520Z",
"iopub.status.idle": "2026-05-19T20:36:06.853554Z",
"shell.execute_reply": "2026-05-19T20:36:06.853148Z"
}
},
"outputs": [],
"source": [
"RooRealVar x(\"x\", \"x\", lowRange, highRange);"
]
},
{
"cell_type": "markdown",
"id": "424533c3",
"metadata": {},
"source": [
"true model"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "6d44f98d",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:06.866265Z",
"iopub.status.busy": "2026-05-19T20:36:06.866129Z",
"iopub.status.idle": "2026-05-19T20:36:07.196235Z",
"shell.execute_reply": "2026-05-19T20:36:07.195487Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_51:6:1: warning: 'data' shadows a declaration with the same name in the 'std' namespace; use '::data' to reference this declaration\n",
"std::unique_ptr data{reality.generate(x, 1000)};\n",
"^\n"
]
}
],
"source": [
"RooGaussian narrow(\"narrow\", \"\", x, RooConst(0.), RooConst(.8));\n",
"RooGaussian wide(\"wide\", \"\", x, RooConst(0.), RooConst(2.));\n",
"RooAddPdf reality(\"reality\", \"\", RooArgList(narrow, wide), RooConst(0.8));\n",
"\n",
"std::unique_ptr data{reality.generate(x, 1000)};"
]
},
{
"cell_type": "markdown",
"id": "7430e0b9",
"metadata": {},
"source": [
"nominal model"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "52ee23ad",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:07.198036Z",
"iopub.status.busy": "2026-05-19T20:36:07.197918Z",
"iopub.status.idle": "2026-05-19T20:36:07.427825Z",
"shell.execute_reply": "2026-05-19T20:36:07.421674Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_52:7:14: error: reference to 'data' is ambiguous\n",
"wks->import(*data, Rename(\"data\"));\n",
" ^\n",
"input_line_51:6:29: note: candidate found by name lookup is 'data'\n",
"std::unique_ptr data{reality.generate(x, 1000)};\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:344:5: note: candidate found by name lookup is 'std::data'\n",
" data(initializer_list<_Tp> __il) noexcept\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:312:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:323:5: note: candidate found by name lookup is 'std::data'\n",
" data(const _Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:334:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Tp (&__array)[_Nm]) noexcept\n",
" ^\n"
]
}
],
"source": [
"RooRealVar sigma(\"sigma\", \"\", 1., 0, 10);\n",
"RooGaussian nominal(\"nominal\", \"\", x, RooConst(0.), sigma);\n",
"\n",
"RooWorkspace *wks = new RooWorkspace(\"myWorksspace\");\n",
"\n",
"wks->import(*data, Rename(\"data\"));\n",
"wks->import(nominal);\n",
"\n",
"if (TClass::GetClass(\"ROOT::Minuit2::Minuit2Minimizer\")) {\n",
" // use Minuit2 if ROOT was built with support for it:\n",
" ROOT::Math::MinimizerOptions::SetDefaultMinimizer(\"Minuit2\");\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "1a1eea8c",
"metadata": {},
"source": [
"The tolerance sets the probability to add an unnecessary term.\n",
"lower tolerance will add fewer terms, while higher tolerance\n",
"will add more terms and provide a more flexible function."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "306d9ff3",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:07.445487Z",
"iopub.status.busy": "2026-05-19T20:36:07.445330Z",
"iopub.status.idle": "2026-05-19T20:36:07.655905Z",
"shell.execute_reply": "2026-05-19T20:36:07.655370Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_53:14:1: error: reference to 'data' is ambiguous\n",
"data->plotOn(frame);\n",
"^\n",
"input_line_51:6:29: note: candidate found by name lookup is 'data'\n",
"std::unique_ptr data{reality.generate(x, 1000)};\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:344:5: note: candidate found by name lookup is 'std::data'\n",
" data(initializer_list<_Tp> __il) noexcept\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:312:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:323:5: note: candidate found by name lookup is 'std::data'\n",
" data(const _Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:334:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Tp (&__array)[_Nm]) noexcept\n",
" ^\n"
]
}
],
"source": [
"Double_t tolerance = 0.05;\n",
"BernsteinCorrection bernsteinCorrection(tolerance);\n",
"Int_t degree = bernsteinCorrection.ImportCorrectedPdf(wks, \"nominal\", \"x\", \"data\");\n",
"\n",
"if (degree < 0) {\n",
" Error(\"rs_bernsteinCorrection\", \"Bernstein correction failed ! \");\n",
" return;\n",
"}\n",
"\n",
"cout << \" Correction based on Bernstein Poly of degree \" << degree << endl;\n",
"\n",
"RooPlot *frame = x.frame();\n",
"data->plotOn(frame);"
]
},
{
"cell_type": "markdown",
"id": "39880f06",
"metadata": {},
"source": [
"plot the best fit nominal model in blue"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "52231b77",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:07.664262Z",
"iopub.status.busy": "2026-05-19T20:36:07.664135Z",
"iopub.status.idle": "2026-05-19T20:36:07.874990Z",
"shell.execute_reply": "2026-05-19T20:36:07.874362Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_54:2:18: error: reference to 'data' is ambiguous\n",
" nominal.fitTo(*data, PrintLevel(0));\n",
" ^\n",
"input_line_51:6:29: note: candidate found by name lookup is 'data'\n",
"std::unique_ptr data{reality.generate(x, 1000)};\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:344:5: note: candidate found by name lookup is 'std::data'\n",
" data(initializer_list<_Tp> __il) noexcept\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:312:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:323:5: note: candidate found by name lookup is 'std::data'\n",
" data(const _Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:334:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Tp (&__array)[_Nm]) noexcept\n",
" ^\n",
"input_line_54:3:1: error: use of undeclared identifier 'nominal'\n",
"nominal.plotOn(frame);\n",
"^\n",
"input_line_54:3:16: error: use of undeclared identifier 'frame'\n",
"nominal.plotOn(frame);\n",
" ^\n"
]
}
],
"source": [
" nominal.fitTo(*data, PrintLevel(0));\n",
"nominal.plotOn(frame);"
]
},
{
"cell_type": "markdown",
"id": "8b386014",
"metadata": {},
"source": [
"plot the best fit corrected model in red"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "de5ba8c7",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:07.887405Z",
"iopub.status.busy": "2026-05-19T20:36:07.887275Z",
"iopub.status.idle": "2026-05-19T20:36:08.101864Z",
"shell.execute_reply": "2026-05-19T20:36:08.101278Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_56:2:3: error: use of undeclared identifier 'wks'\n",
" (wks->pdf(\"corrected\"))\n",
" ^\n",
"Error in : Error evaluating expression (wks->pdf(\"corrected\"))\n",
"Execution of your code was aborted.\n"
]
}
],
"source": [
"RooAbsPdf *corrected = wks->pdf(\"corrected\");\n",
"if (!corrected)\n",
" return;"
]
},
{
"cell_type": "markdown",
"id": "25760743",
"metadata": {},
"source": [
"fit corrected model"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "466f3a66",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:08.103690Z",
"iopub.status.busy": "2026-05-19T20:36:08.103554Z",
"iopub.status.idle": "2026-05-19T20:36:08.314371Z",
"shell.execute_reply": "2026-05-19T20:36:08.313881Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_57:2:20: error: reference to 'data' is ambiguous\n",
" corrected->fitTo(*data, PrintLevel(0));\n",
" ^\n",
"input_line_51:6:29: note: candidate found by name lookup is 'data'\n",
"std::unique_ptr data{reality.generate(x, 1000)};\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:344:5: note: candidate found by name lookup is 'std::data'\n",
" data(initializer_list<_Tp> __il) noexcept\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:312:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:323:5: note: candidate found by name lookup is 'std::data'\n",
" data(const _Container& __cont) noexcept(noexcept(__cont.data()))\n",
" ^\n",
"/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:334:5: note: candidate found by name lookup is 'std::data'\n",
" data(_Tp (&__array)[_Nm]) noexcept\n",
" ^\n",
"input_line_57:3:19: error: use of undeclared identifier 'frame'\n",
"corrected->plotOn(frame, LineColor(kRed));\n",
" ^\n"
]
}
],
"source": [
"corrected->fitTo(*data, PrintLevel(0));\n",
"corrected->plotOn(frame, LineColor(kRed));"
]
},
{
"cell_type": "markdown",
"id": "4d685e23",
"metadata": {},
"source": [
"plot the correction term (* norm constant) in dashed green\n",
"should make norm constant just be 1, not depend on binning of data"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e9ffad15",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:08.316215Z",
"iopub.status.busy": "2026-05-19T20:36:08.316093Z",
"iopub.status.idle": "2026-05-19T20:36:08.530421Z",
"shell.execute_reply": "2026-05-19T20:36:08.529850Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_58:4:34: error: cannot take the address of an rvalue of type 'EColor'\n",
" poly->plotOn(frame, LineColor(kGreen), LineStyle(kDashed));\n",
" ^~~~~~\n",
"Error while creating dynamic expression for:\n",
" poly->plotOn(frame, LineColor(kGreen), LineStyle(kDashed))\n"
]
}
],
"source": [
"RooAbsPdf *poly = wks->pdf(\"poly\");\n",
"if (poly)\n",
" poly->plotOn(frame, LineColor(kGreen), LineStyle(kDashed));"
]
},
{
"cell_type": "markdown",
"id": "6d63f7c0",
"metadata": {},
"source": [
"this is a switch to check the sampling distribution\n",
"of -2 log LR for two comparisons:\n",
"the first is for n-1 vs. n degree polynomial corrections\n",
"the second is for n vs. n+1 degree polynomial corrections\n",
"Here we choose n to be the one chosen by the tolerance\n",
"criterion above, eg. n = \"degree\" in the code.\n",
"Setting this to true is takes about 10 min."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "21906b7a",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:08.532333Z",
"iopub.status.busy": "2026-05-19T20:36:08.532210Z",
"iopub.status.idle": "2026-05-19T20:36:08.779775Z",
"shell.execute_reply": "2026-05-19T20:36:08.779415Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[runStaticInitializersOnce]: Failed to materialize symbols: { (main, { _ZN12__cling_N52816__cling_Un1Qu328EPv, _GLOBAL__sub_I_cling_module_286, $.cling-module-286.__inits.0, _ZN7TObjectnwEm, _ZN12__cling_N52817checkSamplingDistE, _ZN12__cling_N5288numToyMCE, __orc_init_func.cling-module-286, cling_module_286_, _ZN12__cling_N5282c1E }) }\n",
"IncrementalExecutor::executeFunction: symbol '_ZN5cling7runtime8internal9EvaluateTIvEET_PNS1_15DynamicExprInfoEPN5clang11DeclContextE' unresolved while linking [cling interface function]!\n",
"You are probably missing the definition of void cling::runtime::internal::EvaluateT(cling::runtime::internal::DynamicExprInfo*, clang::DeclContext*)\n",
"Maybe you need to load the corresponding shared library?\n"
]
}
],
"source": [
"bool checkSamplingDist = true;\n",
"int numToyMC = 20; // increase this value for sensible results\n",
"\n",
"TCanvas *c1 = new TCanvas();\n",
"if (checkSamplingDist) {\n",
" c1->Divide(1, 2);\n",
" c1->cd(1);\n",
"}\n",
"frame->Draw();\n",
"gPad->Update();\n",
"\n",
"if (checkSamplingDist) {\n",
" // check sampling dist\n",
" ROOT::Math::MinimizerOptions::SetDefaultPrintLevel(-1);\n",
" TH1F *samplingDist = new TH1F(\"samplingDist\", \"\", 20, 0, 10);\n",
" TH1F *samplingDistExtra = new TH1F(\"samplingDistExtra\", \"\", 20, 0, 10);\n",
" bernsteinCorrection.CreateQSamplingDist(wks, \"nominal\", \"x\", \"data\", samplingDist, samplingDistExtra, degree,\n",
" numToyMC);\n",
"\n",
" c1->cd(2);\n",
" samplingDistExtra->SetLineColor(kRed);\n",
" samplingDistExtra->Draw();\n",
" samplingDist->Draw(\"same\");\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "d2e0f846",
"metadata": {},
"source": [
"Draw all canvases "
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "7766af98",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2026-05-19T20:36:08.789384Z",
"iopub.status.busy": "2026-05-19T20:36:08.789257Z",
"iopub.status.idle": "2026-05-19T20:36:09.021204Z",
"shell.execute_reply": "2026-05-19T20:36:09.020906Z"
}
},
"outputs": [],
"source": [
"%jsroot on\n",
"gROOT->GetListOfCanvases()->Draw()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ROOT C++",
"language": "c++",
"name": "root"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"file_extension": ".C",
"mimetype": " text/x-c++src",
"name": "c++"
}
},
"nbformat": 4,
"nbformat_minor": 5
}