// This example is a generalization of the on/off problem. /* FourBin Instructional Tutorial: authors: Kyle Cranmer <cranmer@cern.ch> Tanja Rommerskirchen <tanja.rommerskirchen@cern.ch> date: June 1, 2010 This example is a generalization of the on/off problem. It's a common setup for SUSY searches. Imagine that one has two variables "x" and "y" (eg. missing ET and SumET), see figure. The signal region has high values of both of these variables (top right). One can see low values of "x" or "y" acting as side-bands. If we just used "y" as a sideband, we would have the on/off problem. - In the signal region we observe non events and expect s+b events. - In the region with low values of "y" (bottom right) we observe noff events and expect tau*b events. Note the significance of tau. In the background only case: tau ~ <expectation off> / <expectation on> If tau is known, this model is sufficient, but often tau is not known exactly. So one can use low values of "x" as an additional constraint for tau. Note that this technique critically depends on the notion that the joint distribution for "x" and "y" can be factorized. Generally, these regions have many events, so it the ratio can be measured very precisely there. So we extend the model to describe the left two boxes... denoted with "bar". - In the upper left we observe nonbar events and expect bbar events - In the bottom left we observe noffbar events and expect tau bbar events Note again we have: tau ~ <expecation off bar> / <expectation on bar> One can further expand the model to account for the systematic associated to assuming the distribution of "x" and "y" factorizes (eg. that tau is the same for off/on and offbar/onbar). This can be done in several ways, but here we introduce an additional parameter rho, which so that one set of models will use tau and the other tau*rho. The choice is arbitary, but it has consequences on the numerical stability of the algorithms. The "bar" measurements typically have more events (& smaller relative errors). If we choose <expectation noffbar> = tau * rho * <expectation noonbar>, the product tau*rho will be known very precisely (~1/sqrt(bbar)) and the contour in those parameters will be narrow and have a non-trivial tau~1/rho shape. However, if we choose to put rho on the non/noff measurements (where the product will have an error ~1/sqrt(b)), the contours will be more ameanable to numerical techniques. Thus, here we choose to define tau := <expecation off bar> / (<expectation on bar>) rho := <expecation off> / (<expectation on> * tau) ^ y | |---------------------------+ | | | | nonbar | non | | bbar | s+b | | | | |---------------+-----------| | | | | noffbar | noff | | tau bbar | tau b rho | | | | +-----------------------------> x Left in this way, the problem is under-constrained. However, one may have some auxiliary measurement (usually based on Monte Carlo) to constrain rho. Let us call this auxiliary measurement that gives the nominal value of rho "rhonom". Thus, there is a 'constraint' term in the full model: P(rhonom | rho). In this case, we consider a Gaussian constraint with standard deviation sigma. In the example, the initial values of the parameters are - s = 40 - b = 100 - tau = 5 - bbar = 1000 - rho = 1 (sigma for rho = 20%) and in the toy dataset: - non = 139 - noff = 528 - nonbar = 993 - noffbar = 4906 - rhonom = 1.27824 Note, the covariance matrix of the parameters has large off-diagonal terms. Clearly s,b are anti-correlated. Similary, since noffbar >> nonbar, one would expect bbar,tau to be anti-correlated. This can be seen below. GLOBAL b bbar rho s tau b 0.96820 1.000 0.191 -0.942 -0.762 -0.209 bbar 0.91191 0.191 1.000 0.000 -0.146 -0.912 rho 0.96348 -0.942 0.000 1.000 0.718 -0.000 s 0.76250 -0.762 -0.146 0.718 1.000 0.160 tau 0.92084 -0.209 -0.912 -0.000 0.160 1.000 Similarly, since tau*rho appears as a product, we expect rho,tau to be anti-correlated. When the error on rho is significantly larger than 1/sqrt(bbar), tau is essentially known and the correlation is minimal (tau mainly cares about bbar, and rho about b,s). In the alternate parametrizaton (bbar* tau * rho) the correlation coefficient for rho,tau is large (and negative). The code below uses best-practices for RooFit & RooStats as of June 2010. It proceeds as follows: - create a workspace to hold the model - use workspace factory to quickly create the terms of the model - use workspace factory to define total model (a prod pdf) - create a RooStats ModelConfig to specify observables, parameters of interest - add to the ModelConfig a prior on the parameters for Bayesian techniques - note, the pdf it is factorized for parameters of interest & nuisance params - visualize the model - write the workspace to a file - use several of RooStats IntervalCalculators & compare results */ #include "TStopwatch.h" #include "TCanvas.h" #include "TROOT.h" #include "RooPlot.h" #include "RooAbsPdf.h" #include "RooWorkspace.h" #include "RooDataSet.h" #include "RooGlobalFunc.h" #include "RooFitResult.h" #include "RooRandom.h" #include "RooStats/ProfileLikelihoodCalculator.h" #include "RooStats/LikelihoodInterval.h" #include "RooStats/LikelihoodIntervalPlot.h" #include "RooStats/BayesianCalculator.h" #include "RooStats/MCMCCalculator.h" #include "RooStats/MCMCInterval.h" #include "RooStats/MCMCIntervalPlot.h" #include "RooStats/ProposalHelper.h" #include "RooStats/SimpleInterval.h" #include "RooStats/FeldmanCousins.h" #include "RooStats/PointSetInterval.h" using namespace RooFit; using namespace RooStats; void FourBinInstructional(bool doBayesian=false, bool doFeldmanCousins=false, bool doMCMC=false){ // let's time this challenging example TStopwatch t; t.Start(); // set RooFit random seed for reproducible results RooRandom::randomGenerator()->SetSeed(4357); // make model RooWorkspace* wspace = new RooWorkspace("wspace"); wspace->factory("Poisson::on(non[0,1000], sum::splusb(s[40,0,100],b[100,0,300]))"); wspace->factory("Poisson::off(noff[0,5000], prod::taub(b,tau[5,3,7],rho[1,0,2]))"); wspace->factory("Poisson::onbar(nonbar[0,10000], bbar[1000,500,2000])"); wspace->factory("Poisson::offbar(noffbar[0,1000000], prod::lambdaoffbar(bbar, tau))"); wspace->factory("Gaussian::mcCons(rhonom[1.,0,2], rho, sigma[.2])"); wspace->factory("PROD::model(on,off,onbar,offbar,mcCons)"); wspace->defineSet("obs","non,noff,nonbar,noffbar,rhonom"); wspace->factory("Uniform::prior_poi({s})"); wspace->factory("Uniform::prior_nuis({b,bbar,tau, rho})"); wspace->factory("PROD::prior(prior_poi,prior_nuis)"); /////////////////////////////////////////// // Control some interesting variations // define parameers of interest // for 1-d plots wspace->defineSet("poi","s"); wspace->defineSet("nuis","b,tau,rho,bbar"); // for 2-d plots to inspect correlations: // wspace->defineSet("poi","s,rho"); // test simpler cases where parameters are known. // wspace->var("tau")->setConstant(); // wspace->var("rho")->setConstant(); // wspace->var("b")->setConstant(); // wspace->var("bbar")->setConstant(); // inspect workspace // wspace->Print(); //////////////////////////////////////////////////////////// // Generate toy data // generate toy data assuming current value of the parameters // import into workspace. // add Verbose() to see how it's being generated RooDataSet* data = wspace->pdf("model")->generate(*wspace->set("obs"),1); // data->Print("v"); wspace->import(*data); ///////////////////////////////////////////////////// // Now the statistical tests // model config ModelConfig* modelConfig = new ModelConfig("FourBins"); modelConfig->SetWorkspace(*wspace); modelConfig->SetPdf(*wspace->pdf("model")); modelConfig->SetPriorPdf(*wspace->pdf("prior")); modelConfig->SetParametersOfInterest(*wspace->set("poi")); modelConfig->SetNuisanceParameters(*wspace->set("nuis")); wspace->import(*modelConfig); wspace->writeToFile("FourBin.root"); ////////////////////////////////////////////////// // If you want to see the covariance matrix uncomment // wspace->pdf("model")->fitTo(*data); // use ProfileLikelihood ProfileLikelihoodCalculator plc(*data, *modelConfig); plc.SetConfidenceLevel(0.95); LikelihoodInterval* plInt = plc.GetInterval(); RooFit::MsgLevel msglevel = RooMsgService::instance().globalKillBelow(); RooMsgService::instance().setGlobalKillBelow(RooFit::FATAL); plInt->LowerLimit( *wspace->var("s") ); // get ugly print out of the way. Fix. RooMsgService::instance().setGlobalKillBelow(msglevel); // use FeldmaCousins (takes ~20 min) FeldmanCousins fc(*data, *modelConfig); fc.SetConfidenceLevel(0.95); //number counting: dataset always has 1 entry with N events observed fc.FluctuateNumDataEntries(false); fc.UseAdaptiveSampling(true); fc.SetNBins(40); PointSetInterval* fcInt = NULL; if(doFeldmanCousins){ // takes 7 minutes fcInt = (PointSetInterval*) fc.GetInterval(); // fix cast } // use BayesianCalculator (only 1-d parameter of interest, slow for this problem) BayesianCalculator bc(*data, *modelConfig); bc.SetConfidenceLevel(0.95); SimpleInterval* bInt = NULL; if(doBayesian && wspace->set("poi")->getSize() == 1) { bInt = bc.GetInterval(); } else{ cout << "Bayesian Calc. only supports on parameter of interest" << endl; } // use MCMCCalculator (takes about 1 min) // Want an efficient proposal function, so derive it from covariance // matrix of fit RooFitResult* fit = wspace->pdf("model")->fitTo(*data,Save()); ProposalHelper ph; ph.SetVariables((RooArgSet&)fit->floatParsFinal()); ph.SetCovMatrix(fit->covarianceMatrix()); ph.SetUpdateProposalParameters(kTRUE); // auto-create mean vars and add mappings ph.SetCacheSize(100); ProposalFunction* pf = ph.GetProposalFunction(); MCMCCalculator mc(*data, *modelConfig); mc.SetConfidenceLevel(0.95); mc.SetProposalFunction(*pf); mc.SetNumBurnInSteps(500); // first N steps to be ignored as burn-in mc.SetNumIters(50000); mc.SetLeftSideTailFraction(0.5); // make a central interval MCMCInterval* mcInt = NULL; if(doMCMC) mcInt = mc.GetInterval(); ////////////////////////////////////// // Make some plots TCanvas* c1 = (TCanvas*) gROOT->Get("c1"); if(!c1) c1 = new TCanvas("c1"); if(doBayesian && doMCMC){ c1->Divide(3); c1->cd(1); } else if (doBayesian || doMCMC){ c1->Divide(2); c1->cd(1); } LikelihoodIntervalPlot* lrplot = new LikelihoodIntervalPlot(plInt); lrplot->Draw(); if(doBayesian && wspace->set("poi")->getSize() == 1) { c1->cd(2); // the plot takes a long time and print lots of error // using a scan it is better bc.SetScanOfPosterior(20); RooPlot* bplot = bc.GetPosteriorPlot(); bplot->Draw(); } if(doMCMC){ if(doBayesian && wspace->set("poi")->getSize() == 1) c1->cd(3); else c1->cd(2); MCMCIntervalPlot mcPlot(*mcInt); mcPlot.Draw(); } //////////////////////////////////// // querry intervals cout << "Profile Likelihood interval on s = [" << plInt->LowerLimit( *wspace->var("s") ) << ", " << plInt->UpperLimit( *wspace->var("s") ) << "]" << endl; //Profile Likelihood interval on s = [12.1902, 88.6871] if(doBayesian && wspace->set("poi")->getSize() == 1) { cout << "Bayesian interval on s = [" << bInt->LowerLimit( ) << ", " << bInt->UpperLimit( ) << "]" << endl; } if(doFeldmanCousins){ cout << "Feldman Cousins interval on s = [" << fcInt->LowerLimit( *wspace->var("s") ) << ", " << fcInt->UpperLimit( *wspace->var("s") ) << "]" << endl; //Feldman Cousins interval on s = [18.75 +/- 2.45, 83.75 +/- 2.45] } if(doMCMC){ cout << "MCMC interval on s = [" << mcInt->LowerLimit(*wspace->var("s") ) << ", " << mcInt->UpperLimit(*wspace->var("s") ) << "]" << endl; //MCMC interval on s = [15.7628, 84.7266] } t.Print(); } FourBinInstructional.C:1 FourBinInstructional.C:2 FourBinInstructional.C:3 FourBinInstructional.C:4 FourBinInstructional.C:5 FourBinInstructional.C:6 FourBinInstructional.C:7 FourBinInstructional.C:8 FourBinInstructional.C:9 FourBinInstructional.C:10 FourBinInstructional.C:11 FourBinInstructional.C:12 FourBinInstructional.C:13 FourBinInstructional.C:14 FourBinInstructional.C:15 FourBinInstructional.C:16 FourBinInstructional.C:17 FourBinInstructional.C:18 FourBinInstructional.C:19 FourBinInstructional.C:20 FourBinInstructional.C:21 FourBinInstructional.C:22 FourBinInstructional.C:23 FourBinInstructional.C:24 FourBinInstructional.C:25 FourBinInstructional.C:26 FourBinInstructional.C:27 FourBinInstructional.C:28 FourBinInstructional.C:29 FourBinInstructional.C:30 FourBinInstructional.C:31 FourBinInstructional.C:32 FourBinInstructional.C:33 FourBinInstructional.C:34 FourBinInstructional.C:35 FourBinInstructional.C:36 FourBinInstructional.C:37 FourBinInstructional.C:38 FourBinInstructional.C:39 FourBinInstructional.C:40 FourBinInstructional.C:41 FourBinInstructional.C:42 FourBinInstructional.C:43 FourBinInstructional.C:44 FourBinInstructional.C:45 FourBinInstructional.C:46 FourBinInstructional.C:47 FourBinInstructional.C:48 FourBinInstructional.C:49 FourBinInstructional.C:50 FourBinInstructional.C:51 FourBinInstructional.C:52 FourBinInstructional.C:53 FourBinInstructional.C:54 FourBinInstructional.C:55 FourBinInstructional.C:56 FourBinInstructional.C:57 FourBinInstructional.C:58 FourBinInstructional.C:59 FourBinInstructional.C:60 FourBinInstructional.C:61 FourBinInstructional.C:62 FourBinInstructional.C:63 FourBinInstructional.C:64 FourBinInstructional.C:65 FourBinInstructional.C:66 FourBinInstructional.C:67 FourBinInstructional.C:68 FourBinInstructional.C:69 FourBinInstructional.C:70 FourBinInstructional.C:71 FourBinInstructional.C:72 FourBinInstructional.C:73 FourBinInstructional.C:74 FourBinInstructional.C:75 FourBinInstructional.C:76 FourBinInstructional.C:77 FourBinInstructional.C:78 FourBinInstructional.C:79 FourBinInstructional.C:80 FourBinInstructional.C:81 FourBinInstructional.C:82 FourBinInstructional.C:83 FourBinInstructional.C:84 FourBinInstructional.C:85 FourBinInstructional.C:86 FourBinInstructional.C:87 FourBinInstructional.C:88 FourBinInstructional.C:89 FourBinInstructional.C:90 FourBinInstructional.C:91 FourBinInstructional.C:92 FourBinInstructional.C:93 FourBinInstructional.C:94 FourBinInstructional.C:95 FourBinInstructional.C:96 FourBinInstructional.C:97 FourBinInstructional.C:98 FourBinInstructional.C:99 FourBinInstructional.C:100 FourBinInstructional.C:101 FourBinInstructional.C:102 FourBinInstructional.C:103 FourBinInstructional.C:104 FourBinInstructional.C:105 FourBinInstructional.C:106 FourBinInstructional.C:107 FourBinInstructional.C:108 FourBinInstructional.C:109 FourBinInstructional.C:110 FourBinInstructional.C:111 FourBinInstructional.C:112 FourBinInstructional.C:113 FourBinInstructional.C:114 FourBinInstructional.C:115 FourBinInstructional.C:116 FourBinInstructional.C:117 FourBinInstructional.C:118 FourBinInstructional.C:119 FourBinInstructional.C:120 FourBinInstructional.C:121 FourBinInstructional.C:122 FourBinInstructional.C:123 FourBinInstructional.C:124 FourBinInstructional.C:125 FourBinInstructional.C:126 FourBinInstructional.C:127 FourBinInstructional.C:128 FourBinInstructional.C:129 FourBinInstructional.C:130 FourBinInstructional.C:131 FourBinInstructional.C:132 FourBinInstructional.C:133 FourBinInstructional.C:134 FourBinInstructional.C:135 FourBinInstructional.C:136 FourBinInstructional.C:137 FourBinInstructional.C:138 FourBinInstructional.C:139 FourBinInstructional.C:140 FourBinInstructional.C:141 FourBinInstructional.C:142 FourBinInstructional.C:143 FourBinInstructional.C:144 FourBinInstructional.C:145 FourBinInstructional.C:146 FourBinInstructional.C:147 FourBinInstructional.C:148 FourBinInstructional.C:149 FourBinInstructional.C:150 FourBinInstructional.C:151 FourBinInstructional.C:152 FourBinInstructional.C:153 FourBinInstructional.C:154 FourBinInstructional.C:155 FourBinInstructional.C:156 FourBinInstructional.C:157 FourBinInstructional.C:158 FourBinInstructional.C:159 FourBinInstructional.C:160 FourBinInstructional.C:161 FourBinInstructional.C:162 FourBinInstructional.C:163 FourBinInstructional.C:164 FourBinInstructional.C:165 FourBinInstructional.C:166 FourBinInstructional.C:167 FourBinInstructional.C:168 FourBinInstructional.C:169 FourBinInstructional.C:170 FourBinInstructional.C:171 FourBinInstructional.C:172 FourBinInstructional.C:173 FourBinInstructional.C:174 FourBinInstructional.C:175 FourBinInstructional.C:176 FourBinInstructional.C:177 FourBinInstructional.C:178 FourBinInstructional.C:179 FourBinInstructional.C:180 FourBinInstructional.C:181 FourBinInstructional.C:182 FourBinInstructional.C:183 FourBinInstructional.C:184 FourBinInstructional.C:185 FourBinInstructional.C:186 FourBinInstructional.C:187 FourBinInstructional.C:188 FourBinInstructional.C:189 FourBinInstructional.C:190 FourBinInstructional.C:191 FourBinInstructional.C:192 FourBinInstructional.C:193 FourBinInstructional.C:194 FourBinInstructional.C:195 FourBinInstructional.C:196 FourBinInstructional.C:197 FourBinInstructional.C:198 FourBinInstructional.C:199 FourBinInstructional.C:200 FourBinInstructional.C:201 FourBinInstructional.C:202 FourBinInstructional.C:203 FourBinInstructional.C:204 FourBinInstructional.C:205 FourBinInstructional.C:206 FourBinInstructional.C:207 FourBinInstructional.C:208 FourBinInstructional.C:209 FourBinInstructional.C:210 FourBinInstructional.C:211 FourBinInstructional.C:212 FourBinInstructional.C:213 FourBinInstructional.C:214 FourBinInstructional.C:215 FourBinInstructional.C:216 FourBinInstructional.C:217 FourBinInstructional.C:218 FourBinInstructional.C:219 FourBinInstructional.C:220 FourBinInstructional.C:221 FourBinInstructional.C:222 FourBinInstructional.C:223 FourBinInstructional.C:224 FourBinInstructional.C:225 FourBinInstructional.C:226 FourBinInstructional.C:227 FourBinInstructional.C:228 FourBinInstructional.C:229 FourBinInstructional.C:230 FourBinInstructional.C:231 FourBinInstructional.C:232 FourBinInstructional.C:233 FourBinInstructional.C:234 FourBinInstructional.C:235 FourBinInstructional.C:236 FourBinInstructional.C:237 FourBinInstructional.C:238 FourBinInstructional.C:239 FourBinInstructional.C:240 FourBinInstructional.C:241 FourBinInstructional.C:242 FourBinInstructional.C:243 FourBinInstructional.C:244 FourBinInstructional.C:245 FourBinInstructional.C:246 FourBinInstructional.C:247 FourBinInstructional.C:248 FourBinInstructional.C:249 FourBinInstructional.C:250 FourBinInstructional.C:251 FourBinInstructional.C:252 FourBinInstructional.C:253 FourBinInstructional.C:254 FourBinInstructional.C:255 FourBinInstructional.C:256 FourBinInstructional.C:257 FourBinInstructional.C:258 FourBinInstructional.C:259 FourBinInstructional.C:260 FourBinInstructional.C:261 FourBinInstructional.C:262 FourBinInstructional.C:263 FourBinInstructional.C:264 FourBinInstructional.C:265 FourBinInstructional.C:266 FourBinInstructional.C:267 FourBinInstructional.C:268 FourBinInstructional.C:269 FourBinInstructional.C:270 FourBinInstructional.C:271 FourBinInstructional.C:272 FourBinInstructional.C:273 FourBinInstructional.C:274 FourBinInstructional.C:275 FourBinInstructional.C:276 FourBinInstructional.C:277 FourBinInstructional.C:278 FourBinInstructional.C:279 FourBinInstructional.C:280 FourBinInstructional.C:281 FourBinInstructional.C:282 FourBinInstructional.C:283 FourBinInstructional.C:284 FourBinInstructional.C:285 FourBinInstructional.C:286 FourBinInstructional.C:287 FourBinInstructional.C:288 FourBinInstructional.C:289 FourBinInstructional.C:290 FourBinInstructional.C:291 FourBinInstructional.C:292 FourBinInstructional.C:293 FourBinInstructional.C:294 FourBinInstructional.C:295 FourBinInstructional.C:296 FourBinInstructional.C:297 FourBinInstructional.C:298 FourBinInstructional.C:299 FourBinInstructional.C:300 FourBinInstructional.C:301 FourBinInstructional.C:302 FourBinInstructional.C:303 FourBinInstructional.C:304 FourBinInstructional.C:305 FourBinInstructional.C:306 FourBinInstructional.C:307 FourBinInstructional.C:308 FourBinInstructional.C:309 FourBinInstructional.C:310 FourBinInstructional.C:311 FourBinInstructional.C:312 FourBinInstructional.C:313 FourBinInstructional.C:314 FourBinInstructional.C:315 FourBinInstructional.C:316 FourBinInstructional.C:317 FourBinInstructional.C:318 FourBinInstructional.C:319 FourBinInstructional.C:320 FourBinInstructional.C:321 FourBinInstructional.C:322 FourBinInstructional.C:323 FourBinInstructional.C:324 FourBinInstructional.C:325 FourBinInstructional.C:326 FourBinInstructional.C:327 FourBinInstructional.C:328 FourBinInstructional.C:329 FourBinInstructional.C:330 FourBinInstructional.C:331 FourBinInstructional.C:332 |
|