{ "cells": [ { "cell_type": "markdown", "id": "6f33a3fd", "metadata": {}, "source": [ "# HybridStandardForm\n", "A hypothesis testing example based on number counting with background uncertainty.\n", "\n", "A hypothesis testing example based on number counting\n", "with background uncertainty.\n", "\n", "NOTE: This example is like HybridInstructional, but the model is more clearly\n", "generalizable to an analysis with shapes. There is a lot of flexibility\n", "for how one models a problem in RooFit/RooStats. Models come in a few\n", "common forms:\n", " - standard form: extended PDF of some discriminating variable m:\n", " eg: P(m) ~ S*fs(m) + B*fb(m), with S+B events expected\n", " in this case the dataset has N rows corresponding to N events\n", " and the extended term is Pois(N|S+B)\n", "\n", " - fractional form: non-extended PDF of some discriminating variable m:\n", " eg: P(m) ~ s*fs(m) + (1-s)*fb(m), where s is a signal fraction\n", " in this case the dataset has N rows corresponding to N events\n", " and there is no extended term\n", "\n", " - number counting form: in which there is no discriminating variable\n", " and the counts are modeled directly (see HybridInstructional)\n", " eg: P(N) = Pois(N|S+B)\n", " in this case the dataset has 1 row corresponding to N events\n", " and the extended term is the PDF itself.\n", "\n", "Here we convert the number counting form into the standard form by\n", "introducing a dummy discriminating variable m with a uniform distribution.\n", "\n", "This example:\n", " - demonstrates the usage of the HybridCalcultor (Part 4-6)\n", " - demonstrates the numerical integration of RooFit (Part 2)\n", " - validates the RooStats against an example with a known analytic answer\n", " - demonstrates usage of different test statistics\n", " - explains subtle choices in the prior used for hybrid methods\n", " - demonstrates usage of different priors for the nuisance parameters\n", "\n", "The basic setup here is that a main measurement has observed x events with an\n", "expectation of s+b. One can choose an ad hoc prior for the uncertainty on b,\n", "or try to base it on an auxiliary measurement. In this case, the auxiliary\n", "measurement (aka control measurement, sideband) is another counting experiment\n", "with measurement y and expectation tau*b. With an 'original prior' on b,\n", "called $ \\eta(b) $ then one can obtain a posterior from the auxiliary measurement\n", "on b in the main measurement of x, which can then be treated in a hybrid\n", "Bayesian/Frequentist way. Additionally, one can try to treat the two\n", "measurements simultaneously, which is detailed in Part 6 of the tutorial.\n", "\n", "This tutorial is related to the FourBin.C tutorial in the modeling, but\n", "focuses on hypothesis testing instead of interval estimation.\n", "\n", "More background on this 'prototype problem' can be found in the\n", "following papers:\n", "\n", " - Evaluation of three methods for calculating statistical significance\n", " when incorporating a systematic uncertainty into a test of the\n", " background-only hypothesis for a Poisson process\n", " Authors: Robert D. Cousins, James T. Linnemann, Jordan Tucker\n", " http://arxiv.org/abs/physics/0702156\n", " NIM A 595 (2008) 480--501\n", "\n", " - Statistical Challenges for Searches for New Physics at the LHC\n", " Author: Kyle Cranmer\n", " http://arxiv.org/abs/physics/0511028\n", "\n", " - Measures of Significance in HEP and Astrophysics\n", " Author: J. T. Linnemann\n", " http://arxiv.org/abs/physics/0312059\n", "\n", "\n", "\n", "\n", "**Author:** Kyle Cranmer, Wouter Verkerke, and Sven Kreiss \n", "This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:35 PM." ] }, { "cell_type": "code", "execution_count": 1, "id": "c561559e", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:39.475151Z", "iopub.status.busy": "2026-05-19T20:35:39.475058Z", "iopub.status.idle": "2026-05-19T20:35:39.491500Z", "shell.execute_reply": "2026-05-19T20:35:39.491141Z" } }, "outputs": [], "source": [ "%%cpp -d\n", "\n", "#include \"RooGlobalFunc.h\"\n", "#include \"RooRealVar.h\"\n", "#include \"RooProdPdf.h\"\n", "#include \"RooWorkspace.h\"\n", "#include \"RooDataSet.h\"\n", "#include \"RooDataHist.h\"\n", "#include \"TCanvas.h\"\n", "#include \"TStopwatch.h\"\n", "#include \"TH1.h\"\n", "#include \"RooPlot.h\"\n", "#include \"RooMsgService.h\"\n", "\n", "#include \"RooStats/NumberCountingUtils.h\"\n", "\n", "#include \"RooStats/HybridCalculator.h\"\n", "#include \"RooStats/ToyMCSampler.h\"\n", "#include \"RooStats/HypoTestPlot.h\"\n", "\n", "#include \"RooStats/NumEventsTestStat.h\"\n", "#include \"RooStats/ProfileLikelihoodTestStat.h\"\n", "#include \"RooStats/SimpleLikelihoodRatioTestStat.h\"\n", "#include \"RooStats/RatioOfProfiledLikelihoodsTestStat.h\"\n", "#include \"RooStats/MaxLikelihoodEstimateTestStat.h\"\n", "\n", "using namespace RooFit;\n", "using namespace RooStats;" ] }, { "cell_type": "markdown", "id": "656cb135", "metadata": {}, "source": [ "-------------------------------------------------------\n", "A New Test Statistic Class for this example.\n", "It simply returns the sum of the values in a particular\n", "column of a dataset.\n", "You can ignore this class and focus on the macro below" ] }, { "cell_type": "code", "execution_count": 2, "id": "9f1923f0", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:39.502297Z", "iopub.status.busy": "2026-05-19T20:35:39.502169Z", "iopub.status.idle": "2026-05-19T20:35:39.836017Z", "shell.execute_reply": "2026-05-19T20:35:39.835501Z" } }, "outputs": [], "source": [ "class BinCountTestStat : public TestStatistic {\n", "public:\n", " BinCountTestStat(void) : fColumnName(\"tmp\") {}\n", " BinCountTestStat(string columnName) : fColumnName(columnName) {}\n", "\n", " virtual Double_t Evaluate(RooAbsData &data, RooArgSet & /*nullPOI*/)\n", " {\n", " // This is the main method in the interface\n", " Double_t value = 0.0;\n", " for (int i = 0; i < data.numEntries(); i++) {\n", " value += data.get(i)->getRealValue(fColumnName.c_str());\n", " }\n", " return value;\n", " }\n", " virtual const TString GetVarName() const { return fColumnName; }\n", "\n", "private:\n", " string fColumnName;\n", "\n", "protected:\n", " ClassDef(BinCountTestStat, 1)\n", "};" ] }, { "cell_type": "markdown", "id": "a3ab0429", "metadata": {}, "source": [ " Arguments are defined. " ] }, { "cell_type": "code", "execution_count": 3, "id": "4802cbda", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:39.837785Z", "iopub.status.busy": "2026-05-19T20:35:39.837662Z", "iopub.status.idle": "2026-05-19T20:35:40.046726Z", "shell.execute_reply": "2026-05-19T20:35:40.046190Z" } }, "outputs": [], "source": [ "int ntoys = 6000;" ] }, { "cell_type": "code", "execution_count": 4, "id": "dd9b79f4", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:40.048332Z", "iopub.status.busy": "2026-05-19T20:35:40.048214Z", "iopub.status.idle": "2026-05-19T20:35:40.254106Z", "shell.execute_reply": "2026-05-19T20:35:40.253587Z" } }, "outputs": [], "source": [ "double nToysRatio = 20; // ratio Ntoys Null/ntoys ALT" ] }, { "cell_type": "markdown", "id": "3facec4d", "metadata": {}, "source": [ "This tutorial has 6 parts\n", "Table of Contents\n", "Setup\n", "1. Make the model for the 'prototype problem'\n", "Special cases\n", "2. NOT RELEVANT HERE\n", "3. Use RooStats analytic solution for this problem\n", "RooStats HybridCalculator -- can be generalized\n", "4. RooStats ToyMC version of 2. & 3.\n", "5. RooStats ToyMC with an equivalent test statistic\n", "6. RooStats ToyMC with simultaneous control & main measurement" ] }, { "cell_type": "markdown", "id": "f125eb48", "metadata": {}, "source": [ "Part 4 takes ~4 min.\n", "Of course, everything looks nicer with more toys, which takes longer." ] }, { "cell_type": "code", "execution_count": 5, "id": "75a89e7f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:40.256028Z", "iopub.status.busy": "2026-05-19T20:35:40.255895Z", "iopub.status.idle": "2026-05-19T20:35:40.465002Z", "shell.execute_reply": "2026-05-19T20:35:40.464444Z" } }, "outputs": [], "source": [ "TStopwatch t;\n", "t.Start();\n", "TCanvas *c = new TCanvas;\n", "c->Divide(2, 2);" ] }, { "cell_type": "markdown", "id": "0b907d0d", "metadata": {}, "source": [ "-----------------------------------------------------\n", "PART 1 : DIRECT INTEGRATION\n", "====================================================\n", "Make model for prototype on/off problem\n", "Pois(x | s+b) * Pois(y | tau b )\n", "for Z_Gamma, use uniform prior on b." ] }, { "cell_type": "code", "execution_count": 6, "id": "7d2b1b64", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:40.466623Z", "iopub.status.busy": "2026-05-19T20:35:40.466494Z", "iopub.status.idle": "2026-05-19T20:35:40.672406Z", "shell.execute_reply": "2026-05-19T20:35:40.671893Z" } }, "outputs": [], "source": [ "RooWorkspace *w = new RooWorkspace(\"w\");" ] }, { "cell_type": "markdown", "id": "e1d8b3ed", "metadata": {}, "source": [ "replace the pdf in 'number counting form'\n", "w->factory(\"Poisson::px(x[150,0,500],sum::splusb(s[0,0,100],b[100,0,300]))\");\n", "with one in standard form. Now x is encoded in event count" ] }, { "cell_type": "code", "execution_count": 7, "id": "3e4ceb1f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:40.677399Z", "iopub.status.busy": "2026-05-19T20:35:40.677279Z", "iopub.status.idle": "2026-05-19T20:35:40.893101Z", "shell.execute_reply": "2026-05-19T20:35:40.892666Z" } }, "outputs": [], "source": [ "w->factory(\"Uniform::f(m[0,1])\"); // m is a dummy discriminating variable\n", "w->factory(\"ExtendPdf::px(f,sum::splusb(s[0,0,100],b[100,0.1,300]))\");\n", "w->factory(\"Poisson::py(y[100,0.1,500],prod::taub(tau[1.],b))\");\n", "w->factory(\"PROD::model(px,py)\");\n", "w->factory(\"Uniform::prior_b(b)\");" ] }, { "cell_type": "markdown", "id": "a46e379b", "metadata": {}, "source": [ "We will control the output level in a few places to avoid\n", "verbose progress messages. We start by keeping track\n", "of the current threshold on messages." ] }, { "cell_type": "code", "execution_count": 8, "id": "513e1dfc", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:40.894862Z", "iopub.status.busy": "2026-05-19T20:35:40.894742Z", "iopub.status.idle": "2026-05-19T20:35:41.103886Z", "shell.execute_reply": "2026-05-19T20:35:41.102984Z" } }, "outputs": [], "source": [ "RooFit::MsgLevel msglevel = RooMsgService::instance().globalKillBelow();" ] }, { "cell_type": "markdown", "id": "997f3a66", "metadata": {}, "source": [ "-----------------------------------------------\n", "PART 3 : ANALYTIC RESULT\n", "==============================================\n", "In this special case, the integrals are known analytically\n", "and they are implemented in RooStats::NumberCountingUtils" ] }, { "cell_type": "markdown", "id": "33193b3e", "metadata": {}, "source": [ "analytic Z_Bi" ] }, { "cell_type": "code", "execution_count": 9, "id": "8f42cf65", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:41.105361Z", "iopub.status.busy": "2026-05-19T20:35:41.105246Z", "iopub.status.idle": "2026-05-19T20:35:41.314127Z", "shell.execute_reply": "2026-05-19T20:35:41.313756Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-----------------------------------------\n", "Part 3\n", "Z_Bi p-value (analytic): 0.00094165\n", "Z_Bi significance (analytic): 3.10804\n", "Real time 0:00:00, CP time 0.140\n" ] } ], "source": [ "double p_Bi = NumberCountingUtils::BinomialWithTauObsP(150, 100, 1);\n", "double Z_Bi = NumberCountingUtils::BinomialWithTauObsZ(150, 100, 1);\n", "cout << \"-----------------------------------------\" << endl;\n", "cout << \"Part 3\" << endl;\n", "std::cout << \"Z_Bi p-value (analytic): \" << p_Bi << std::endl;\n", "std::cout << \"Z_Bi significance (analytic): \" << Z_Bi << std::endl;\n", "t.Stop();\n", "t.Print();\n", "t.Reset();\n", "t.Start();" ] }, { "cell_type": "markdown", "id": "65208d98", "metadata": {}, "source": [ "--------------------------------------------------------------\n", "PART 4 : USING HYBRID CALCULATOR\n", "==============================================================\n", "Now we demonstrate the RooStats HybridCalculator.\n", "\n", "Like all RooStats calculators it needs the data and a ModelConfig\n", "for the relevant hypotheses. Since we are doing hypothesis testing\n", "we need a ModelConfig for the null (background only) and the alternate\n", "(signal+background) hypotheses. We also need to specify the PDF,\n", "the parameters of interest, and the observables. Furthermore, since\n", "the parameter of interest is floating, we need to specify which values\n", "of the parameter corresponds to the null and alternate (eg. s=0 and s=50)\n", "\n", "define some sets of variables obs={x} and poi={s}\n", "note here, x is the only observable in the main measurement\n", "and y is treated as a separate measurement, which is used\n", "to produce the prior that will be used in this calculation\n", "to randomize the nuisance parameters." ] }, { "cell_type": "code", "execution_count": 10, "id": "cf17c85f", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:41.328184Z", "iopub.status.busy": "2026-05-19T20:35:41.328060Z", "iopub.status.idle": "2026-05-19T20:35:41.534113Z", "shell.execute_reply": "2026-05-19T20:35:41.533152Z" } }, "outputs": [], "source": [ "w->defineSet(\"obs\", \"m\");\n", "w->defineSet(\"poi\", \"s\");" ] }, { "cell_type": "markdown", "id": "08467916", "metadata": {}, "source": [ "create a toy dataset with the x=150\n", "RooDataSet *data = new RooDataSet(\"d\", \"d\", *w->set(\"obs\"));\n", "data->add(*w->set(\"obs\"));" ] }, { "cell_type": "code", "execution_count": 11, "id": "35ca5157", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:41.535566Z", "iopub.status.busy": "2026-05-19T20:35:41.535449Z", "iopub.status.idle": "2026-05-19T20:35:41.745970Z", "shell.execute_reply": "2026-05-19T20:35:41.745493Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_65:2:2: warning: 'data' shadows a declaration with the same name in the 'std' namespace; use '::data' to reference this declaration\n", " std::unique_ptr data{w->pdf(\"px\")->generate(*w->set(\"obs\"), 150)};\n", " ^\n" ] } ], "source": [ "std::unique_ptr data{w->pdf(\"px\")->generate(*w->set(\"obs\"), 150)};" ] }, { "cell_type": "markdown", "id": "1c54e1cd", "metadata": {}, "source": [ "Part 3a : Setup ModelConfigs\n", "-------------------------------------------------------\n", "create the null (background-only) ModelConfig with s=0" ] }, { "cell_type": "code", "execution_count": 12, "id": "61830349", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:41.747519Z", "iopub.status.busy": "2026-05-19T20:35:41.747402Z", "iopub.status.idle": "2026-05-19T20:35:41.953479Z", "shell.execute_reply": "2026-05-19T20:35:41.952536Z" } }, "outputs": [], "source": [ "ModelConfig b_model(\"B_model\", w);\n", "b_model.SetPdf(*w->pdf(\"px\"));\n", "b_model.SetObservables(*w->set(\"obs\"));\n", "b_model.SetParametersOfInterest(*w->set(\"poi\"));\n", "w->var(\"s\")->setVal(0.0); // important!\n", "b_model.SetSnapshot(*w->set(\"poi\"));" ] }, { "cell_type": "markdown", "id": "05b9d52d", "metadata": {}, "source": [ "create the alternate (signal+background) ModelConfig with s=50" ] }, { "cell_type": "code", "execution_count": 13, "id": "cac1bd3c", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:41.954991Z", "iopub.status.busy": "2026-05-19T20:35:41.954870Z", "iopub.status.idle": "2026-05-19T20:35:42.161064Z", "shell.execute_reply": "2026-05-19T20:35:42.160125Z" } }, "outputs": [], "source": [ "ModelConfig sb_model(\"S+B_model\", w);\n", "sb_model.SetPdf(*w->pdf(\"px\"));\n", "sb_model.SetObservables(*w->set(\"obs\"));\n", "sb_model.SetParametersOfInterest(*w->set(\"poi\"));\n", "w->var(\"s\")->setVal(50.0); // important!\n", "sb_model.SetSnapshot(*w->set(\"poi\"));" ] }, { "cell_type": "markdown", "id": "a958991a", "metadata": {}, "source": [ "Part 3b : Choose Test Statistic\n", "--------------------------------------------------------------\n", "To make an equivalent calculation we need to use x as the test\n", "statistic. This is not a built-in test statistic in RooStats\n", "so we define it above. The new class inherits from the\n", "RooStats::TestStatistic interface, and simply returns the value\n", "of x in the dataset." ] }, { "cell_type": "code", "execution_count": 14, "id": "47c6d65a", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:42.162559Z", "iopub.status.busy": "2026-05-19T20:35:42.162436Z", "iopub.status.idle": "2026-05-19T20:35:42.368719Z", "shell.execute_reply": "2026-05-19T20:35:42.367768Z" } }, "outputs": [], "source": [ "NumEventsTestStat eventCount(*w->pdf(\"px\"));" ] }, { "cell_type": "markdown", "id": "56c7555a", "metadata": {}, "source": [ "Part 3c : Define Prior used to randomize nuisance parameters\n", "-------------------------------------------------------------\n", "\n", "The prior used for the hybrid calculator is the posterior\n", "from the auxiliary measurement y. The model for the aux.\n", "measurement is Pois(y|tau*b), thus the likelihood function\n", "is proportional to (has the form of) a Gamma distribution.\n", "if the 'original prior' $\\eta(b)$ is uniform, then from\n", "Bayes's theorem we have the posterior:\n", "$\\pi(b) = Pois(y|tau*b) * \\eta(b)$\n", "If $\\eta(b)$ is flat, then we arrive at a Gamma distribution.\n", "Since RooFit will normalize the PDF we can actually supply\n", "py=Pois(y,tau*b) that will be equivalent to multiplying by a uniform.\n", "\n", "Alternatively, we could explicitly use a gamma distribution:\n", "\n", "`w->factory(\"Gamma::gamma(b,sum::temp(y,1),1,0)\");`\n", "\n", "or we can use some other ad hoc prior that do not naturally\n", "follow from the known form of the auxiliary measurement.\n", "The common choice is the equivalent Gaussian:" ] }, { "cell_type": "code", "execution_count": 15, "id": "1106fde3", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:42.370189Z", "iopub.status.busy": "2026-05-19T20:35:42.370068Z", "iopub.status.idle": "2026-05-19T20:35:42.576321Z", "shell.execute_reply": "2026-05-19T20:35:42.575381Z" } }, "outputs": [], "source": [ "w->factory(\"Gaussian::gauss_prior(b,y, expr::sqrty('sqrt(y)',y))\");" ] }, { "cell_type": "markdown", "id": "cda8a6c5", "metadata": {}, "source": [ "this corresponds to the \"Z_N\" calculation.\n", "\n", "or one could use the analogous log-normal prior" ] }, { "cell_type": "code", "execution_count": 16, "id": "a4cf3ce2", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:42.577819Z", "iopub.status.busy": "2026-05-19T20:35:42.577690Z", "iopub.status.idle": "2026-05-19T20:35:42.779901Z", "shell.execute_reply": "2026-05-19T20:35:42.779467Z" } }, "outputs": [], "source": [ "w->factory(\"Lognormal::lognorm_prior(b,y, expr::kappa('1+1./sqrt(y)',y))\");" ] }, { "cell_type": "markdown", "id": "8069e9a0", "metadata": {}, "source": [ "Ideally, the HybridCalculator would be able to inspect the full\n", "model Pois(x | s+b) * Pois(y | tau b ) and be given the original\n", "prior $\\eta(b)$ to form $\\pi(b) = Pois(y|tau*b) * \\eta(b)$.\n", "This is not yet implemented because in the general case\n", "it is not easy to identify the terms in the PDF that correspond\n", "to the auxiliary measurement. So for now, it must be set\n", "explicitly with:\n", "- ForcePriorNuisanceNull()\n", "- ForcePriorNuisanceAlt()\n", "the name \"ForcePriorNuisance\" was chosen because we anticipate\n", "this to be auto-detected, but will leave the option open\n", "to force to a different prior for the nuisance parameters." ] }, { "cell_type": "markdown", "id": "1fb1458e", "metadata": {}, "source": [ "Part 3d : Construct and configure the HybridCalculator\n", "-------------------------------------------------------" ] }, { "cell_type": "code", "execution_count": 17, "id": "e55aab1b", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:42.782299Z", "iopub.status.busy": "2026-05-19T20:35:42.782179Z", "iopub.status.idle": "2026-05-19T20:35:42.992838Z", "shell.execute_reply": "2026-05-19T20:35:42.992123Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_87:2:24: error: reference to 'data' is ambiguous\n", " HybridCalculator hc1(*data, sb_model, b_model);\n", " ^\n", "input_line_65:2:30: note: candidate found by name lookup is 'data'\n", " std::unique_ptr data{w->pdf(\"px\")->generate(*w->set(\"obs\"), 150)};\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:344:5: note: candidate found by name lookup is 'std::data'\n", " data(initializer_list<_Tp> __il) noexcept\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:312:5: note: candidate found by name lookup is 'std::data'\n", " data(_Container& __cont) noexcept(noexcept(__cont.data()))\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:323:5: note: candidate found by name lookup is 'std::data'\n", " data(const _Container& __cont) noexcept(noexcept(__cont.data()))\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:334:5: note: candidate found by name lookup is 'std::data'\n", " data(_Tp (&__array)[_Nm]) noexcept\n", " ^\n" ] } ], "source": [ "HybridCalculator hc1(*data, sb_model, b_model);\n", "ToyMCSampler *toymcs1 = (ToyMCSampler *)hc1.GetTestStatSampler();" ] }, { "cell_type": "markdown", "id": "a5494b05", "metadata": {}, "source": [ "toymcs1->SetNEventsPerToy(1); // because the model is in number counting form" ] }, { "cell_type": "code", "execution_count": 18, "id": "38dec491", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:42.994163Z", "iopub.status.busy": "2026-05-19T20:35:42.994050Z", "iopub.status.idle": "2026-05-19T20:35:43.201087Z", "shell.execute_reply": "2026-05-19T20:35:43.200355Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_89:2:3: error: use of undeclared identifier 'toymcs1'\n", " (toymcs1->SetTestStatistic(&((*(NumEventsTestStat*)0x7f761000b000))))\n", " ^\n", "Error in : Error evaluating expression (toymcs1->SetTestStatistic(&((*(NumEventsTestStat*)0x7f761000b000))))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "toymcs1->SetTestStatistic(&eventCount); // set the test statistic" ] }, { "cell_type": "markdown", "id": "a51fa86b", "metadata": {}, "source": [ "toymcs1->SetGenerateBinned();" ] }, { "cell_type": "code", "execution_count": 19, "id": "42a2b5ec", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:43.202427Z", "iopub.status.busy": "2026-05-19T20:35:43.202313Z", "iopub.status.idle": "2026-05-19T20:35:43.409051Z", "shell.execute_reply": "2026-05-19T20:35:43.408630Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_91:2:3: error: use of undeclared identifier 'hc1'\n", " (hc1.SetToys(((*(int*)0x7f7635487000)), ((*(int*)0x7f7635487000)) / ((*(double*)0x7f7635484000))))\n", " ^\n", "Error in : Error evaluating expression (hc1.SetToys(((*(int*)0x7f7635487000)), ((*(int*)0x7f7635487000)) / ((*(double*)0x7f7635484000))))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "hc1.SetToys(ntoys, ntoys / nToysRatio);\n", "hc1.ForcePriorNuisanceAlt(*w->pdf(\"py\"));\n", "hc1.ForcePriorNuisanceNull(*w->pdf(\"py\"));" ] }, { "cell_type": "markdown", "id": "6c48c5ec", "metadata": {}, "source": [ "if you wanted to use the ad hoc Gaussian prior instead\n", "~~~\n", "hc1.ForcePriorNuisanceAlt(*w->pdf(\"gauss_prior\"));\n", "hc1.ForcePriorNuisanceNull(*w->pdf(\"gauss_prior\"));\n", "~~~\n", "if you wanted to use the ad hoc log-normal prior instead\n", "~~~\n", "hc1.ForcePriorNuisanceAlt(*w->pdf(\"lognorm_prior\"));\n", "hc1.ForcePriorNuisanceNull(*w->pdf(\"lognorm_prior\"));\n", "~~~" ] }, { "cell_type": "markdown", "id": "6c723cfd", "metadata": {}, "source": [ "these lines save current msg level and then kill any messages below ERROR" ] }, { "cell_type": "code", "execution_count": 20, "id": "d9278761", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:43.410537Z", "iopub.status.busy": "2026-05-19T20:35:43.410427Z", "iopub.status.idle": "2026-05-19T20:35:43.620768Z", "shell.execute_reply": "2026-05-19T20:35:43.619756Z" } }, "outputs": [], "source": [ "RooMsgService::instance().setGlobalKillBelow(RooFit::ERROR);" ] }, { "cell_type": "markdown", "id": "096e3a09", "metadata": {}, "source": [ "Get the result" ] }, { "cell_type": "code", "execution_count": 21, "id": "61961d9d", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:43.622375Z", "iopub.status.busy": "2026-05-19T20:35:43.622258Z", "iopub.status.idle": "2026-05-19T20:35:43.829157Z", "shell.execute_reply": "2026-05-19T20:35:43.828668Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_94:2:3: error: use of undeclared identifier 'hc1'\n", " (hc1.GetHypoTest())\n", " ^\n", "Error in : Error evaluating expression (hc1.GetHypoTest())\n", "Execution of your code was aborted.\n" ] } ], "source": [ "HypoTestResult *r1 = hc1.GetHypoTest();\n", "RooMsgService::instance().setGlobalKillBelow(msglevel); // set it back\n", "cout << \"-----------------------------------------\" << endl;\n", "cout << \"Part 4\" << endl;\n", "r1->Print();\n", "t.Stop();\n", "t.Print();\n", "t.Reset();\n", "t.Start();\n", "\n", "c->cd(2);\n", "HypoTestPlot *p1 = new HypoTestPlot(*r1, 30); // 30 bins, TS is discrete\n", "p1->Draw();\n", "\n", "return; // keep the running time sort by default" ] }, { "cell_type": "markdown", "id": "00dc6c5c", "metadata": {}, "source": [ "-------------------------------------------------------------------------\n", "# PART 5 : USING HYBRID CALCULATOR WITH\n", "# A N A L T E R N A T I V E T E S T S T A T I S T I C\n", "\n", "A likelihood ratio test statistics should be 1-to-1 with the count x\n", "when the value of b is fixed in the likelihood. This is implemented\n", "by the SimpleLikelihoodRatioTestStat" ] }, { "cell_type": "code", "execution_count": 22, "id": "2dc24cb1", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:43.830887Z", "iopub.status.busy": "2026-05-19T20:35:43.830769Z", "iopub.status.idle": "2026-05-19T20:35:44.039759Z", "shell.execute_reply": "2026-05-19T20:35:44.039227Z" } }, "outputs": [], "source": [ "SimpleLikelihoodRatioTestStat slrts(*b_model.GetPdf(), *sb_model.GetPdf());\n", "slrts.SetNullParameters(*b_model.GetSnapshot());\n", "slrts.SetAltParameters(*sb_model.GetSnapshot());" ] }, { "cell_type": "markdown", "id": "270066cc", "metadata": {}, "source": [ "HYBRID CALCULATOR" ] }, { "cell_type": "code", "execution_count": 23, "id": "6e6322ed", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:44.051359Z", "iopub.status.busy": "2026-05-19T20:35:44.051229Z", "iopub.status.idle": "2026-05-19T20:35:44.261821Z", "shell.execute_reply": "2026-05-19T20:35:44.260724Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_96:2:24: error: reference to 'data' is ambiguous\n", " HybridCalculator hc2(*data, sb_model, b_model);\n", " ^\n", "input_line_65:2:30: note: candidate found by name lookup is 'data'\n", " std::unique_ptr data{w->pdf(\"px\")->generate(*w->set(\"obs\"), 150)};\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:344:5: note: candidate found by name lookup is 'std::data'\n", " data(initializer_list<_Tp> __il) noexcept\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:312:5: note: candidate found by name lookup is 'std::data'\n", " data(_Container& __cont) noexcept(noexcept(__cont.data()))\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:323:5: note: candidate found by name lookup is 'std::data'\n", " data(const _Container& __cont) noexcept(noexcept(__cont.data()))\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/range_access.h:334:5: note: candidate found by name lookup is 'std::data'\n", " data(_Tp (&__array)[_Nm]) noexcept\n", " ^\n" ] } ], "source": [ "HybridCalculator hc2(*data, sb_model, b_model);\n", "ToyMCSampler *toymcs2 = (ToyMCSampler *)hc2.GetTestStatSampler();" ] }, { "cell_type": "markdown", "id": "223805aa", "metadata": {}, "source": [ "toymcs2->SetNEventsPerToy(1);" ] }, { "cell_type": "code", "execution_count": 24, "id": "984debd6", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:44.268384Z", "iopub.status.busy": "2026-05-19T20:35:44.268253Z", "iopub.status.idle": "2026-05-19T20:35:44.478351Z", "shell.execute_reply": "2026-05-19T20:35:44.477866Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_98:2:3: error: use of undeclared identifier 'toymcs2'\n", " (toymcs2->SetTestStatistic(&((*(SimpleLikelihoodRatioTestStat*)0x7f75e3926000))))\n", " ^\n", "Error in : Error evaluating expression (toymcs2->SetTestStatistic(&((*(SimpleLikelihoodRatioTestStat*)0x7f75e3926000))))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "toymcs2->SetTestStatistic(&slrts);" ] }, { "cell_type": "markdown", "id": "37d01430", "metadata": {}, "source": [ "toymcs2->SetGenerateBinned();" ] }, { "cell_type": "code", "execution_count": 25, "id": "d98c87ee", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:44.480005Z", "iopub.status.busy": "2026-05-19T20:35:44.479887Z", "iopub.status.idle": "2026-05-19T20:35:44.686859Z", "shell.execute_reply": "2026-05-19T20:35:44.686435Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_100:2:3: error: use of undeclared identifier 'hc2'\n", " (hc2.SetToys(((*(int*)0x7f7635487000)), ((*(int*)0x7f7635487000)) / ((*(double*)0x7f7635484000))))\n", " ^\n", "Error in : Error evaluating expression (hc2.SetToys(((*(int*)0x7f7635487000)), ((*(int*)0x7f7635487000)) / ((*(double*)0x7f7635484000))))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "hc2.SetToys(ntoys, ntoys / nToysRatio);\n", "hc2.ForcePriorNuisanceAlt(*w->pdf(\"py\"));\n", "hc2.ForcePriorNuisanceNull(*w->pdf(\"py\"));" ] }, { "cell_type": "markdown", "id": "27a88467", "metadata": {}, "source": [ "if you wanted to use the ad hoc Gaussian prior instead\n", "~~~\n", "hc2.ForcePriorNuisanceAlt(*w->pdf(\"gauss_prior\"));\n", "hc2.ForcePriorNuisanceNull(*w->pdf(\"gauss_prior\"));\n", "~~~\n", "if you wanted to use the ad hoc log-normal prior instead\n", "~~~\n", "hc2.ForcePriorNuisanceAlt(*w->pdf(\"lognorm_prior\"));\n", "hc2.ForcePriorNuisanceNull(*w->pdf(\"lognorm_prior\"));\n", "~~~" ] }, { "cell_type": "markdown", "id": "75166dfb", "metadata": {}, "source": [ "these lines save current msg level and then kill any messages below ERROR" ] }, { "cell_type": "code", "execution_count": 26, "id": "af2447bd", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:44.688708Z", "iopub.status.busy": "2026-05-19T20:35:44.688568Z", "iopub.status.idle": "2026-05-19T20:35:44.898036Z", "shell.execute_reply": "2026-05-19T20:35:44.896987Z" } }, "outputs": [], "source": [ "RooMsgService::instance().setGlobalKillBelow(RooFit::ERROR);" ] }, { "cell_type": "markdown", "id": "fd40f397", "metadata": {}, "source": [ "Get the result" ] }, { "cell_type": "code", "execution_count": 27, "id": "e280dfa1", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:44.899677Z", "iopub.status.busy": "2026-05-19T20:35:44.899544Z", "iopub.status.idle": "2026-05-19T20:35:45.106457Z", "shell.execute_reply": "2026-05-19T20:35:45.105948Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_103:2:3: error: use of undeclared identifier 'hc2'\n", " (hc2.GetHypoTest())\n", " ^\n", "Error in : Error evaluating expression (hc2.GetHypoTest())\n", "Execution of your code was aborted.\n" ] } ], "source": [ "HypoTestResult *r2 = hc2.GetHypoTest();\n", "cout << \"-----------------------------------------\" << endl;\n", "cout << \"Part 5\" << endl;\n", "r2->Print();\n", "t.Stop();\n", "t.Print();\n", "t.Reset();\n", "t.Start();\n", "RooMsgService::instance().setGlobalKillBelow(msglevel);\n", "\n", "c->cd(3);\n", "HypoTestPlot *p2 = new HypoTestPlot(*r2, 30); // 30 bins\n", "p2->Draw();\n", "\n", "return; // so standard tutorial runs faster" ] }, { "cell_type": "markdown", "id": "b61b0f4a", "metadata": {}, "source": [ "---------------------------------------------\n", "OUTPUT (2.66 GHz Intel Core i7)\n", "============================================" ] }, { "cell_type": "markdown", "id": "2d374f59", "metadata": {}, "source": [ "-----------------------------------------\n", "Part 3\n", "Z_Bi p-value (analytic): 0.00094165\n", "Z_Bi significance (analytic): 3.10804\n", "Real time 0:00:00, CP time 0.610" ] }, { "cell_type": "markdown", "id": "c3a2e279", "metadata": {}, "source": [ "Results HybridCalculator_result:\n", "- Null p-value = 0.00103333 +/- 0.000179406\n", "- Significance = 3.08048 sigma\n", "- Number of S+B toys: 1000\n", "- Number of B toys: 30000\n", "- Test statistic evaluated on data: 150\n", "- CL_b: 0.998967 +/- 0.000185496\n", "- CL_s+b: 0.495 +/- 0.0158106\n", "- CL_s: 0.495512 +/- 0.0158272\n", "Real time 0:04:43, CP time 283.780" ] }, { "cell_type": "markdown", "id": "5772517f", "metadata": {}, "source": [ "-------------------------------------------------------\n", "Comparison\n", "-------------------------------------------------------\n", "LEPStatToolsForLHC\n", "https://plone4.fnal.gov:4430/P0/phystat/packages/0703002\n", "Uses Gaussian prior\n", "CL_b = 6.218476e-04, Significance = 3.228665 sigma\n", "\n", "-------------------------------------------------------\n", "Comparison\n", "-------------------------------------------------------\n", "Asymptotic\n", "From the value of the profile likelihood ratio (5.0338)\n", "The significance can be estimated using Wilks's theorem\n", "significance = sqrt(2*profileLR) = 3.1729 sigma" ] }, { "cell_type": "markdown", "id": "8fb3d78d", "metadata": {}, "source": [ "Draw all canvases " ] }, { "cell_type": "code", "execution_count": 28, "id": "433bac0a", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2026-05-19T20:35:45.108236Z", "iopub.status.busy": "2026-05-19T20:35:45.108113Z", "iopub.status.idle": "2026-05-19T20:35:45.341020Z", "shell.execute_reply": "2026-05-19T20:35:45.340462Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%jsroot on\n", "gROOT->GetListOfCanvases()->Draw()" ] } ], "metadata": { "kernelspec": { "display_name": "ROOT C++", "language": "c++", "name": "root" }, "language_info": { "codemirror_mode": "text/x-c++src", "file_extension": ".C", "mimetype": " text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 5 }