Logo ROOT   6.14/05
Reference Guide
TSPlot.cxx
Go to the documentation of this file.
1 // @(#)root/splot:$Id$
2 // Author: Muriel Pivk, Anna Kreshuk 10/2005
3 
4 /**********************************************************************
5  * *
6  * Copyright (c) 2005 ROOT Foundation, CERN/PH-SFT *
7  * *
8  **********************************************************************/
9 
10 #include "TSPlot.h"
11 #include "TVirtualFitter.h"
12 #include "TH1.h"
13 #include "TTreePlayer.h"
14 #include "TTreeFormula.h"
15 #include "TTreeFormulaManager.h"
16 #include "TSelectorDraw.h"
17 #include "TBrowser.h"
18 #include "TClass.h"
19 #include "TMath.h"
20 
21 extern void Yields(Int_t &, Double_t *, Double_t &f, Double_t *x, Int_t iflag);
22 
24 
25 //____________________________________________________________________
26 //Begin_Html <!--
27 /* -->
28 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
29 <p>
30 <b><font size="+2">Overview</font></b>
31 
32 </p><p>
33 A common method used in High Energy Physics to perform measurements is
34 the maximum Likelihood method, exploiting discriminating variables to
35 disentangle signal from background. The crucial point for such an
36 analysis to be reliable is to use an exhaustive list of sources of
37 events combined with an accurate description of all the Probability
38 Density Functions (PDF).
39 </p><p>To assess the validity of the fit, a convincing quality check
40 is to explore further the data sample by examining the distributions of
41 control variables. A control variable can be obtained for instance by
42 removing one of the discriminating variables before performing again
43 the maximum Likelihood fit: this removed variable is a control
44 variable. The expected distribution of this control variable, for
45 signal, is to be compared to the one extracted, for signal, from the
46 data sample. In order to be able to do so, one must be able to unfold
47 from the distribution of the whole data sample.
48 </p><p>The TSPlot method allows to reconstruct the distributions for
49 the control variable, independently for each of the various sources of
50 events, without making use of any <em>a priori</em> knowledge on <u>this</u>
51 variable. The aim is thus to use the knowledge available for the
52 discriminating variables to infer the behaviour of the individual
53 sources of events with respect to the control variable.
54 </p><p>
55 TSPlot is optimal if the control variable is uncorrelated with the discriminating variables.
56 
57 </p><p>
58 A detail description of the formalism itself, called <!-- MATH
59  $\hbox{$_s$}{\cal P}lot$
60  -->
61 <img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48">, is given in&nbsp;[<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/node1.html#bib:sNIM">1</a>].
62 
63 </p><p>
64 <b><font size="+2">The method</font></b>
65 
66 </p><p>
67 The <!-- MATH
68  $\hbox{$_s$}{\cal P}lot$
69  -->
70 <img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> technique is developped in the above context of a maximum Likelihood method making use of discriminating variables.
71 
72 </p><p>One considers a data sample in which are merged several species
73 of events. These species represent various signal components and
74 background components which all together account for the data sample.
75 The different terms of the log-Likelihood are:
76 </p><ul>
77 <li><img src="gif/sPlot_img6.png" alt="$N$" align="bottom" border="0" height="17" width="22">: the total number of events in the data sample,
78 </li>
79 <li><!-- MATH
80  ${\rm N}_{\rm s}$
81  -->
82 <img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25">: the number of species of events populating the data sample,
83 </li>
84 <li><img src="gif/sPlot_img8.png" alt="$N_i$" align="middle" border="0" height="34" width="25">: the number of events expected on the average for the <img src="gif/sPlot_img9.png" alt="$i^{\rm th}$" align="bottom" border="0" height="20" width="25"> species,
85 </li>
86 <li><!-- MATH
87  ${\rm f}_i(y_e)$
88  -->
89 <img src="gif/sPlot_img10.png" alt="${\rm f}_i(y_e)$" align="middle" border="0" height="37" width="47">: the value of the PDFs of the discriminating variables <img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> for the <img src="gif/sPlot_img12.png" alt="$i^{th}$" align="bottom" border="0" height="20" width="25"> species and for event <img src="gif/sPlot_img13.png" alt="$e$" align="bottom" border="0" height="17" width="13">,
90 </li>
91 <li><img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">: the set of control variables which, by definition, do not appear in the expression of the Likelihood function <img src="gif/sPlot_img15.png" alt="${\cal L}$" align="bottom" border="0" height="18" width="18">.
92 </li>
93 </ul>
94 The extended log-Likelihood reads:
95 <br>
96 <div align="right">
97 
98 <!-- MATH
99  \begin{equation}
100 {\cal L}=\sum_{e=1}^{N}\ln \Big\{ \sum_{i=1}^{{\rm N}_{\rm s}}N_i{\rm f}_i(y_e) \Big\} -\sum_{i=1}^{{\rm N}_{\rm s}}N_i ~.
101 \end{equation}
102  -->
103 <table align="center" width="100%">
104 <tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:eLik"></a><img src="gif/sPlot_img16.png" alt="\begin{displaymath}
105 {\cal L}=\sum_{e=1}^{N}\ln \Big\{ \sum_{i=1}^{{\rm N}_{\rm s}}N_i{\rm f}_i(y_e) \Big\} -\sum_{i=1}^{{\rm N}_{\rm s}}N_i ~.
106 \end{displaymath}" border="0" height="59" width="276"></td>
107 <td align="right" width="10">
108 (1)</td></tr>
109 </tbody></table>
110 <br clear="all"></div><p></p>
111 From this expression, after maximization of <img src="gif/sPlot_img15.png" alt="${\cal L}$" align="bottom" border="0" height="18" width="18"> with respect to the <img src="gif/sPlot_img8.png" alt="$N_i$" align="middle" border="0" height="34" width="25"> parameters, a weight can be computed for every event and each species, in order to obtain later the true distribution <!-- MATH
112  ${\hbox{\bf {M}}}_i(x)$
113  -->
114 <img src="gif/sPlot_img17.png" alt="${\hbox{\bf {M}}}_i(x)$" align="middle" border="0" height="37" width="56"> of variable <img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">. If <img src="gif/sPlot_img18.png" alt="${\rm n}$" align="bottom" border="0" height="17" width="15"> is one of the <!-- MATH
115  ${\rm N}_{\rm s}$
116  -->
117 <img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25"> species present in the data sample, the weight for this species is defined by:
118 <br>
119 <div align="right">
120 
121 <!-- MATH
122  \begin{equation}
123 \begin{Large}\fbox{$
124 {_s{\cal P}}_{\rm n}(y_e)={\sum_{j=1}^{{\rm N}_{\rm s}} \hbox{\bf V}_{{\rm n}j}{\rm f}_j(y_e)\over\sum_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e) } $}\end{Large} ~,
125 \end{equation}
126  -->
127 <table align="center" width="100%">
128 <tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:weightxnotiny"></a><img src="gif/sPlot_img19.png" alt="\begin{displaymath}
129 \begin{Large}
130 \fbox{$
131 {_s{\cal P}}_{\rm n}(y_e)={\sum_{j=1}^...
132 ...um_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e) } $}\end{Large} ~,
133 \end{displaymath}" border="0" height="76" width="279"></td>
134 <td align="right" width="10">
135 (2)</td></tr>
136 </tbody></table>
137 <br clear="all"></div><p></p>
138 where <!-- MATH
139  $\hbox{\bf V}_{{\rm n}j}$
140  -->
141 <img src="gif/sPlot_img20.png" alt="$\hbox{\bf V}_{{\rm n}j}$" align="middle" border="0" height="34" width="35">
142 is the covariance matrix resulting from the Likelihood maximization.
143 This matrix can be used directly from the fit, but this is numerically
144 less accurate than the direct computation:
145 <br>
146 <div align="right">
147 
148 <!-- MATH
149  \begin{equation}
150 \hbox{\bf V}^{-1}_{{\rm n}j}~=~
151 {\partial^2(-{\cal L})\over\partial N_{\rm n}\partial N_j}~=~
152 \sum_{e=1}^N {{\rm f}_{\rm n}(y_e){\rm f}_j(y_e)\over(\sum_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e))^2} ~.
153 \end{equation}
154  -->
155 <table align="center" width="100%">
156 <tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:VarianceMatrixDirect"></a><img src="gif/sPlot_img21.png" alt="\begin{displaymath}
157 \hbox{\bf V}^{-1}_{{\rm n}j}~=~
158 {\partial^2(-{\cal L})\over\...
159 ...y_e)\over(\sum_{k=1}^{{\rm N}_{\rm s}}N_k{\rm f}_k(y_e))^2} ~.
160 \end{displaymath}" border="0" height="58" width="360"></td>
161 <td align="right" width="10">
162 (3)</td></tr>
163 </tbody></table>
164 <br clear="all"></div><p></p>
165 The distribution of the control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> obtained by histogramming the weighted events reproduces, on average, the true distribution <!-- MATH
166  ${\hbox{\bf {M}}}_{\rm n}(x)$
167  -->
168 <img src="gif/sPlot_img22.png" alt="${\hbox{\bf {M}}}_{\rm n}(x)$" align="middle" border="0" height="37" width="59">.
169 
170 <p>
171 The class TSPlot allows to reconstruct the true distribution <!-- MATH
172  ${\hbox{\bf {M}}}_{\rm n}(x)$
173  -->
174 <img src="gif/sPlot_img22.png" alt="${\hbox{\bf {M}}}_{\rm n}(x)$" align="middle" border="0" height="37" width="59"> of a control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> for each of the <!-- MATH
175  ${\rm N}_{\rm s}$
176  -->
177 <img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25"> species from the sole knowledge of the PDFs of the discriminating variables <img src="gif/sPlot_img23.png" alt="${\rm f}_i(y)$" align="middle" border="0" height="37" width="40">. The plots obtained thanks to the TSPlot class are called <!-- MATH
178  $\hbox{$_s$}{\cal P}lots$
179  -->
180 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">.
181 
182 </p><p>
183 <b><font size="+2">Some properties and checks</font></b>
184 
185 </p><p>
186 Beside reproducing the true distribution, <!-- MATH
187  $\hbox{$_s$}{\cal P}lots$
188  -->
189 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> bear remarkable properties:
190 
191 </p><ul>
192 <li>
193 Each <img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">-distribution is properly normalized:
194 <br>
195 <div align="right">
196 
197 <!-- MATH
198  \begin{equation}
199 \sum_{e=1}^{N} {_s{\cal P}}_{\rm n}(y_e)~=~N_{\rm n}~.
200 \end{equation}
201  -->
202 <table align="center" width="100%">
203 <tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:NormalizationOK"></a><img src="gif/sPlot_img24.png" alt="\begin{displaymath}
204 \sum_{e=1}^{N} {_s{\cal P}}_{\rm n}(y_e)~=~N_{\rm n}~.
205 \end{displaymath}" border="0" height="58" width="158"></td>
206 <td align="right" width="10">
207 (4)</td></tr>
208 </tbody></table>
209 <br clear="all"></div><p></p>
210 </li>
211 <li>
212 For any event:
213 <br>
214 <div align="right">
215 
216 <!-- MATH
217  \begin{equation}
218 \sum_{l=1}^{{\rm N}_{\rm s}} {_s{\cal P}}_l(y_e) ~=~1 ~.
219 \end{equation}
220  -->
221 <table align="center" width="100%">
222 <tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:numberconservation"></a><img src="gif/sPlot_img25.png" alt="\begin{displaymath}
223 \sum_{l=1}^{{\rm N}_{\rm s}} {_s{\cal P}}_l(y_e) ~=~1 ~.
224 \end{displaymath}" border="0" height="59" width="140"></td>
225 <td align="right" width="10">
226 (5)</td></tr>
227 </tbody></table>
228 <br clear="all"></div><p></p>
229 That is to say that, summing up the <!-- MATH
230  ${\rm N}_{\rm s}$
231  -->
232 <img src="gif/sPlot_img7.png" alt="${\rm N}_{\rm s}$" align="middle" border="0" height="34" width="25"> <!-- MATH
233  $\hbox{$_s$}{\cal P}lots$
234  -->
235 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">, one recovers the data sample distribution in&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">, and summing up the number of events entering in a <!-- MATH
236  $\hbox{$_s$}{\cal P}lot$
237  -->
238 <img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> for a given species, one recovers the yield of the species, as provided by the fit. The property&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:NormalizationOK">4</a> is implemented in the TSPlot class as a check.
239 </li>
240 <li>the sum of the statistical uncertainties per bin
241 <br>
242 <div align="right">
243 
244 <!-- MATH
245  \begin{equation}
246 \sigma[N_{\rm n}\ _s\tilde{\rm M}_{\rm n}(x) {\delta x}]~=~\sqrt{\sum_{e \subset {\delta x}} ({_s{\cal P}}_{\rm n})^2} ~.
247 \end{equation}
248  -->
249 <table align="center" width="100%">
250 <tbody><tr valign="middle"><td align="center" nowrap="nowrap"><a name="eq:ErrorPerBin"></a><img src="gif/sPlot_img26.png" alt="\begin{displaymath}
251 \sigma[N_{\rm n}\ _s\tilde{\rm M}_{\rm n}(x) {\delta x}]~=~\sqrt{\sum_{e \subset {\delta x}} ({_s{\cal P}}_{\rm n})^2} ~.
252 \end{displaymath}" border="0" height="55" width="276"></td>
253 <td align="right" width="10">
254 (6)</td></tr>
255 </tbody></table>
256 <br clear="all"></div><p></p>
257 reproduces the statistical uncertainty on the yield <img src="gif/sPlot_img27.png" alt="$N_{\rm n}$" align="middle" border="0" height="34" width="28">, as provided by the fit: <!-- MATH
258  $\sigma[N_{\rm n}]\equiv\sqrt{\hbox{\bf V}_{{\rm n}{\rm n}}}$
259  -->
260 <img src="gif/sPlot_img28.png" alt="$\sigma[N_{\rm n}]\equiv\sqrt{\hbox{\bf V}_{{\rm n}{\rm n}}}$" align="middle" border="0" height="40" width="123">.
261 Because of that and since the determination of the yields is optimal
262 when obtained using a Likelihood fit, one can conclude that the<!-- MATH
263  $\hbox{$_s$}{\cal P}lot$
264  -->
265  <img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> technique is itself an optimal method to reconstruct distributions of control variables.
266 </li>
267 </ul>
268 
269 <p>
270 <b><font size="+2">Different steps followed by TSPlot</font></b>
271 
272 </p><p>
273 
274 </p><ol>
275 <li>A maximum Likelihood fit is performed to obtain the yields <img src="gif/sPlot_img8.png" alt="$N_i$" align="middle" border="0" height="34" width="25"> of the various species.
276 The fit relies on discriminating variables&nbsp;<img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> uncorrelated with a control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">:
277 the later is therefore totally absent from the fit.
278 </li>
279 <li>The weights <img src="gif/sPlot_img29.png" alt="${_s{\cal P}}$" align="middle" border="0" height="34" width="27"> are calculated using Eq.&nbsp;(<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:weightxnotiny">2</a>) where the covariance matrix is taken from Minuit.
280 </li>
281 <li>Histograms of&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> are filled by weighting the events with <img src="gif/sPlot_img29.png" alt="${_s{\cal P}}$" align="middle" border="0" height="34" width="27">.
282 </li>
283 <li>Error bars per bin are given by Eq.&nbsp;(<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:ErrorPerBin">6</a>).
284 </li>
285 </ol>
286 The <!-- MATH
287  $\hbox{$_s$}{\cal P}lots$
288  -->
289 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> reproduce the true distributions of the species in the control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">, within the above defined statistical uncertainties.
290 
291 <p>
292 <b><font size="+2">Illustrations</font></b>
293 
294 </p><p>
295 To illustrate the technique, one considers an example derived from the analysis where <!-- MATH
296  $\hbox{$_s$}{\cal P}lots$
297  -->
298 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">
299 have been first used (charmless B decays). One is dealing with a data
300 sample in which two species are present: the first is termed signal and
301 the second background. A maximum Likelihood fit is performed to obtain
302 the two yields <img src="gif/sPlot_img30.png" alt="$N_1$" align="middle" border="0" height="34" width="27"> and <img src="gif/sPlot_img31.png" alt="$N_2$" align="middle" border="0" height="34" width="27">. The fit relies on two discriminating variables collectively denoted&nbsp;<img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> which are chosen within three possible variables denoted <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39">, <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20">.
303 The variable which is not incorporated in&nbsp;<img src="gif/sPlot_img11.png" alt="$y$" align="middle" border="0" height="33" width="15"> is used as the control variable&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15">. The six distributions of the three variables are assumed to be the ones depicted in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfs">1</a>.
304 
305 </p><p>
306 
307 </p><div align="center"><a name="fig:pdfs"></a><a name="106"></a>
308 <table>
309 <caption align="bottom"><strong>Figure 1:</strong>
310 Distributions of the three discriminating variables available to perform the Likelihood fit:
311 <img src="gif/sPlot_img32.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39">, <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35">, <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20">.
312 Among the three variables, two are used to perform the fit while one is
313 kept out of the fit to serve the purpose of a control variable. The
314 three distributions on the top (resp. bottom) of the figure correspond
315 to the signal (resp. background). The unit of the vertical axis is
316 chosen such that it indicates the number of entries per bin, if one
317 slices the histograms in 25 bins.</caption>
318 <tbody><tr><td><img src="gif/sPlot_img33.png" alt="\begin{figure}\begin{center}
319 \mbox{{\psfig{file=pdfmesNIM.eps,width=0.33\linewi...
320 ...th}}
321 {\psfig{file=pdffiNIM.eps,width=0.33\linewidth}}}
322 \end{center}\end{figure}" border="0" height="162" width="544"></td></tr>
323 </tbody></table>
324 </div>
325 
326 <p>
327 A data sample being built through a Monte Carlo simulation based on the distributions shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfs">1</a>, one obtains the three distributions of Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfstot">2</a>. Whereas the distribution of&nbsp;<img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> clearly indicates the presence of the signal, the distribution of <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> are less obviously populated by signal.
328 
329 </p><p>
330 
331 </p><div align="center"><a name="fig:pdfstot"></a><a name="169"></a>
332 <table>
333 <caption align="bottom"><strong>Figure 2:</strong>
334 Distributions of the three discriminating variables for signal plus
335 background. The three distributions are the ones obtained from a data
336 sample obtained through a Monte Carlo simulation based on the
337 distributions shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:pdfs">1</a>. The data sample consists of 500 signal events and 5000 background events.</caption>
338 <tbody><tr><td><img src="gif/sPlot_img34.png" alt="\begin{figure}\begin{center}
339 \mbox{{\psfig{file=genmesTOTNIM.eps,width=0.33\lin...
340 ...}
341 {\psfig{file=genfiTOTNIM.eps,width=0.33\linewidth}}}
342 \end{center}\end{figure}" border="0" height="160" width="545"></td></tr>
343 </tbody></table>
344 </div>
345 
346 <p>
347 Chosing <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> as discriminating variables to determine <img src="gif/sPlot_img30.png" alt="$N_1$" align="middle" border="0" height="34" width="27"> and <img src="gif/sPlot_img31.png" alt="$N_2$" align="middle" border="0" height="34" width="27"> through a maximum Likelihood fit, one builds, for the control variable <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> which is unknown to the fit, the two <!-- MATH
348  $\hbox{$_s$}{\cal P}lots$
349  -->
350 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> for signal and background shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:messPlots">3</a>. One observes that the <!-- MATH
351  $\hbox{$_s$}{\cal P}lot$
352  -->
353 <img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48">
354 for signal reproduces correctly the PDF even where the latter vanishes,
355 although the error bars remain sizeable. This results from the almost
356 complete cancellation between positive and negative weights: the sum of
357 weights is close to zero while the sum of weights squared is not. The
358 occurence of negative weights occurs through the appearance of the
359 covariance matrix, and its negative components, in the definition of
360 Eq.&nbsp;(<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#eq:weightxnotiny">2</a>).
361 
362 </p><p>
363 A word of caution is in order with respect to the error bars. Whereas
364 their sum in quadrature is identical to the statistical uncertainties
365 of the yields determined by the fit, and if, in addition, they are
366 asymptotically correct, the error bars should be handled with care for
367 low statistics and/or for too fine binning. This is because the error
368 bars do not incorporate two known properties of the PDFs: PDFs are
369 positive definite and can be non-zero in a given x-bin, even if in the
370 particular data sample at hand, no event is observed in this bin. The
371 latter limitation is not specific to<!-- MATH
372  $\hbox{$_s$}{\cal P}lots$
373  -->
374  <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57">,
375 rather it is always present when one is willing to infer the PDF at the
376 origin of an histogram, when, for some bins, the number of entries does
377 not guaranty the applicability of the Gaussian regime. In such
378 situations, a satisfactory practice is to attach allowed ranges to the
379 histogram to indicate the upper and lower limits of the PDF value which
380 are consistent with the actual observation, at a given confidence
381 level.
382 </p><p>
383 
384 </p><div align="center"><a name="fig:messPlots"></a><a name="127"></a>
385 <table>
386 <caption align="bottom"><strong>Figure 3:</strong>
387 The <!-- MATH
388  $\hbox{$_s$}{\cal P}lots$
389  -->
390 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> (signal on the left, background on the right) obtained for <img src="gif/sPlot_img32.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> are represented as dots with error bars. They are obtained from a fit using only information from <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> and <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20">.</caption>
391 <tbody><tr><td><img src="gif/sPlot_img35.png" alt="\begin{figure}\begin{center}
392 \mbox{\psfig{file=mass-sig-sPlot.eps,width=0.48\li...
393 ... \psfig{file=mass-bkg-sPlot.eps,width=0.48\linewidth}}
394 \end{center}\end{figure}" border="0" height="181" width="539"></td></tr>
395 </tbody></table>
396 </div>
397 
398 <p>
399 Chosing <img src="gif/sPlot_img1.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> and <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35"> as discriminating variables to determine <img src="gif/sPlot_img30.png" alt="$N_1$" align="middle" border="0" height="34" width="27"> and <img src="gif/sPlot_img31.png" alt="$N_2$" align="middle" border="0" height="34" width="27"> through a maximum Likelihood fit, one builds, for the control variable <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> which is unknown to the fit, the two <!-- MATH
400  $\hbox{$_s$}{\cal P}lots$
401  -->
402 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> for signal and background shown in Fig.&nbsp;<a href="http://www.slac.stanford.edu/%7Epivk/sPlot/sPlot_ROOT/sPlot_ROOT.html#fig:FisPlots">4</a>. In the <!-- MATH
403  $\hbox{$_s$}{\cal P}lot$
404  -->
405 <img src="gif/sPlot_img5.png" alt="$\hbox{$_s$}{\cal P}lot$" align="middle" border="0" height="34" width="48"> for signal one observes that error bars are the largest in the&nbsp;<img src="gif/sPlot_img14.png" alt="$x$" align="bottom" border="0" height="17" width="15"> regions where the background is the largest.
406 
407 </p><p>
408 
409 </p><div align="center"><a name="fig:FisPlots"></a><a name="136"></a>
410 <table>
411 <caption align="bottom"><strong>Figure 4:</strong>
412 The <!-- MATH
413  $\hbox{$_s$}{\cal P}lots$
414  -->
415 <img src="gif/sPlot_img4.png" alt="$\hbox {$_s$}{\cal P}lots$" align="middle" border="0" height="34" width="57"> (signal on the left, background on the right) obtained for <img src="gif/sPlot_img3.png" alt="${\cal F}$" align="bottom" border="0" height="18" width="20"> are represented as dots with error bars. They are obtained from a fit using only information from <img src="gif/sPlot_img32.png" alt="${m_{\rm ES}}$" align="middle" border="0" height="33" width="39"> and <img src="gif/sPlot_img2.png" alt="$\Delta E$" align="bottom" border="0" height="17" width="35">.</caption>
416 <tbody><tr><td><img src="gif/sPlot_img36.png" alt="\begin{figure}\begin{center}
417 \mbox{\psfig{file=fisher-sig-sPlot.eps,width=0.48\...
418 ...psfig{file=fisher-bkg-sPlot.eps,width=0.48\linewidth}}
419 \end{center}\end{figure}" border="0" height="180" width="539"></td></tr>
420 </tbody></table>
421 </div>
422 
423 <p>
424 The results above can be obtained by running the tutorial TestSPlot.C
425 </p>
426 <!--*/
427 //-->End_Html
428 
429 
430 ////////////////////////////////////////////////////////////////////////////////
431 /// default constructor (used by I/O only)
432 
434  fTree(0),
435  fTreename(0),
436  fVarexp(0),
437  fSelection(0)
438 {
439  fNx = 0;
440  fNy=0;
441  fNevents = 0;
442  fNSpecies=0;
444 }
445 
446 ////////////////////////////////////////////////////////////////////////////////
447 
449  fTreename(0),
450  fVarexp(0),
451  fSelection(0)
452 
453 {
454  //normal TSPlot constructor
455  // nx : number of control variables
456  // ny : number of discriminating variables
457  // ne : total number of events
458  // ns : number of species
459  // tree: input data
460 
461  fNx = nx;
462  fNy=ny;
463  fNevents = ne;
464  fNSpecies=ns;
465 
470  fTree = tree;
471  fNumbersOfEvents = 0;
472 }
473 
474 ////////////////////////////////////////////////////////////////////////////////
475 /// destructor
476 
478 {
479  if (fNumbersOfEvents)
480  delete [] fNumbersOfEvents;
481  if (!fXvarHists.IsEmpty())
482  fXvarHists.Delete();
483  if (!fYvarHists.IsEmpty())
484  fYvarHists.Delete();
485  if (!fYpdfHists.IsEmpty())
486  fYpdfHists.Delete();
487 }
488 
489 ////////////////////////////////////////////////////////////////////////////////
490 ///To browse the histograms
491 
493 {
494  if (!fSWeightsHists.IsEmpty()) {
495  TIter next(&fSWeightsHists);
496  TH1D* h = 0;
497  while ((h = (TH1D*)next()))
498  b->Add(h,h->GetName());
499  }
500 
501  if (!fYpdfHists.IsEmpty()) {
502  TIter next(&fYpdfHists);
503  TH1D* h = 0;
504  while ((h = (TH1D*)next()))
505  b->Add(h,h->GetName());
506  }
507  if (!fYvarHists.IsEmpty()) {
508  TIter next(&fYvarHists);
509  TH1D* h = 0;
510  while ((h = (TH1D*)next()))
511  b->Add(h,h->GetName());
512  }
513  if (!fXvarHists.IsEmpty()) {
514  TIter next(&fXvarHists);
515  TH1D* h = 0;
516  while ((h = (TH1D*)next()))
517  b->Add(h,h->GetName());
518  }
519  b->Add(&fSWeights, "sWeights");
520 }
521 
522 
523 ////////////////////////////////////////////////////////////////////////////////
524 ///Set the initial number of events of each species - used
525 ///as initial estimates in minuit
526 
528 {
529  if (!fNumbersOfEvents)
531  for (Int_t i=0; i<fNSpecies; i++)
532  fNumbersOfEvents[i]=numbers[i];
533 }
534 
535 ////////////////////////////////////////////////////////////////////////////////
536 ///Calculates the sWeights
537 ///The option controls the print level
538 ///"Q" - no print out
539 ///"V" - prints the estimated #of events in species - default
540 ///"VV" - as "V" + the minuit printing + sums of weights for control
541 
543 {
544 
545  if (!fNumbersOfEvents){
546  Error("MakeSPlot","Initial numbers of events in species have not been set");
547  return;
548  }
549  Int_t i, j, ispecies;
550 
551  TString opt = option;
552  opt.ToUpper();
553  opt.ReplaceAll("VV", "W");
554 
555  //make sure that global fitter is minuit
556  char s[]="TFitter";
558  Int_t strdiff=strcmp(TVirtualFitter::GetFitter()->IsA()->GetName(), s);
559  if (strdiff!=0)
560  delete TVirtualFitter::GetFitter();
561  }
562 
563 
564  TVirtualFitter *minuit = TVirtualFitter::Fitter(0, 2);
566 
567  //now let's do it, excluding different yvars
568  //for iplot = -1 none is excluded
569  for (Int_t iplot=-1; iplot<fNy; iplot++){
570  for (i=0; i<fNevents; i++){
571  for (ispecies=0; ispecies<fNSpecies; ispecies++){
572  fPdfTot(i, ispecies)=1;
573  for (j=0; j<fNy; j++){
574  if (j!=iplot)
575  fPdfTot(i, ispecies)*=fYpdf(i, ispecies*fNy+j);
576  }
577  }
578  }
579  minuit->Clear();
580  minuit->SetFCN(Yields);
581  Double_t arglist[10];
582  //set the print level
583  if (opt.Contains("Q")||opt.Contains("V")){
584  arglist[0]=-1;
585  }
586  if (opt.Contains("W"))
587  arglist[0]=0;
588  minuit->ExecuteCommand("SET PRINT", arglist, 1);
589 
590  minuit->SetObjectFit(&fPdfTot); //a tricky way to get fPdfTot matrix to fcn
591  for (ispecies=0; ispecies<fNSpecies; ispecies++)
592  minuit->SetParameter(ispecies, "", fNumbersOfEvents[ispecies], 1, 0, 0);
593 
594  minuit->ExecuteCommand("MIGRAD", arglist, 0);
595  for (ispecies=0; ispecies<fNSpecies; ispecies++){
596  fNumbersOfEvents[ispecies]=minuit->GetParameter(ispecies);
597  if (!opt.Contains("Q"))
598  printf("estimated #of events in species %d = %f\n", ispecies, fNumbersOfEvents[ispecies]);
599  }
600  if (!opt.Contains("Q"))
601  printf("\n");
602  Double_t *covmat = minuit->GetCovarianceMatrix();
603  SPlots(covmat, iplot);
604 
605  if (opt.Contains("W")){
606  Double_t *sumweight = new Double_t[fNSpecies];
607  for (i=0; i<fNSpecies; i++){
608  sumweight[i]=0;
609  for (j=0; j<fNevents; j++)
610  sumweight[i]+=fSWeights(j, (iplot+1)*fNSpecies + i);
611  printf("checking sum of weights[%d]=%f\n", i, sumweight[i]);
612  }
613  printf("\n");
614  delete [] sumweight;
615  }
616  }
617 }
618 
619 ////////////////////////////////////////////////////////////////////////////////
620 ///Computes the sWeights from the covariance matrix
621 
622 void TSPlot::SPlots(Double_t *covmat, Int_t i_excl)
623 {
624  Int_t i, ispecies, k;
625  Double_t numerator, denominator;
626  for (i=0; i<fNevents; i++){
627  denominator=0;
628  for (ispecies=0; ispecies<fNSpecies; ispecies++)
629  denominator+=fNumbersOfEvents[ispecies]*fPdfTot(i, ispecies);
630  for (ispecies=0; ispecies<fNSpecies; ispecies++){
631  numerator=0;
632  for (k=0; k<fNSpecies; k++)
633  numerator+=covmat[ispecies*fNSpecies+k]*fPdfTot(i, k);
634  fSWeights(i, (i_excl+1)*fNSpecies + ispecies)=numerator/denominator;
635  }
636  }
637 
638 }
639 
640 ////////////////////////////////////////////////////////////////////////////////
641 ///Returns the matrix of sweights
642 
644 {
645  if (weights.GetNcols()!=fNSpecies*(fNy+1) || weights.GetNrows()!=fNevents)
646  weights.ResizeTo(fNevents, fNSpecies*(fNy+1));
647  weights = fSWeights;
648 }
649 
650 ////////////////////////////////////////////////////////////////////////////////
651 ///Returns the matrix of sweights. It is assumed that the array passed in the argurment is big enough
652 
654 {
655  for (Int_t i=0; i<fNevents; i++){
656  for (Int_t j=0; j<fNSpecies; j++){
657  weights[i*fNSpecies+j]=fSWeights(i, j);
658  }
659  }
660 }
661 
662 ////////////////////////////////////////////////////////////////////////////////
663 ///Fills the histograms of x variables (not weighted) with nbins
664 
666 {
667  Int_t i, j;
668 
669  if (!fXvarHists.IsEmpty()){
670  if (((TH1D*)fXvarHists.First())->GetNbinsX()!=nbins)
671  fXvarHists.Delete();
672  else
673  return;
674  }
675 
676  //make the histograms
677  char name[12];
678  for (i=0; i<fNx; i++){
679  snprintf(name,sizeof(name), "x%d", i);
680  TH1D *h = new TH1D(name, name, nbins, fMinmax(0, i), fMinmax(1, i));
681  for (j=0; j<fNevents; j++)
682  h->Fill(fXvar(j, i));
683  fXvarHists.Add(h);
684  }
685 
686 }
687 
688 ////////////////////////////////////////////////////////////////////////////////
689 ///Returns the array of histograms of x variables (not weighted)
690 ///If histograms have not already
691 ///been filled, they are filled with default binning 100.
692 
694 {
695  Int_t nbins = 100;
696  if (fXvarHists.IsEmpty())
697  FillXvarHists(nbins);
698  else if (((TH1D*)fXvarHists.First())->GetNbinsX()!=nbins)
699  FillXvarHists(nbins);
700  return &fXvarHists;
701 }
702 
703 ////////////////////////////////////////////////////////////////////////////////
704 ///Returns the histogram of variable #ixvar
705 ///If histograms have not already
706 ///been filled, they are filled with default binning 100.
707 
709 {
710  Int_t nbins = 100;
711  if (fXvarHists.IsEmpty())
712  FillXvarHists(nbins);
713  else if (((TH1D*)fXvarHists.First())->GetNbinsX()!=nbins)
714  FillXvarHists(nbins);
715 
716  return (TH1D*)fXvarHists.UncheckedAt(ixvar);
717 }
718 
719 ////////////////////////////////////////////////////////////////////////////////
720 ///Fill the histograms of y variables
721 
723 {
724  Int_t i, j;
725 
726  if (!fYvarHists.IsEmpty()){
727  if (((TH1D*)fYvarHists.First())->GetNbinsX()!=nbins)
728  fYvarHists.Delete();
729  else
730  return;
731  }
732 
733  //make the histograms
734  char name[12];
735  for (i=0; i<fNy; i++){
736  snprintf(name,sizeof(name), "y%d", i);
737  TH1D *h=new TH1D(name, name, nbins, fMinmax(0, fNx+i), fMinmax(1, fNx+i));
738  for (j=0; j<fNevents; j++)
739  h->Fill(fYvar(j, i));
740  fYvarHists.Add(h);
741  }
742 }
743 
744 ////////////////////////////////////////////////////////////////////////////////
745 ///Returns the array of histograms of y variables. If histograms have not already
746 ///been filled, they are filled with default binning 100.
747 
749 {
750  Int_t nbins = 100;
751  if (fYvarHists.IsEmpty())
752  FillYvarHists(nbins);
753  else if (((TH1D*)fYvarHists.First())->GetNbinsX()!=nbins)
754  FillYvarHists(nbins);
755  return &fYvarHists;
756 }
757 
758 ////////////////////////////////////////////////////////////////////////////////
759 ///Returns the histogram of variable iyvar.If histograms have not already
760 ///been filled, they are filled with default binning 100.
761 
763 {
764  Int_t nbins = 100;
765  if (fYvarHists.IsEmpty())
766  FillYvarHists(nbins);
767  else if (((TH1D*)fYvarHists.First())->GetNbinsX()!=nbins)
768  FillYvarHists(nbins);
769  return (TH1D*)fYvarHists.UncheckedAt(iyvar);
770 }
771 
772 ////////////////////////////////////////////////////////////////////////////////
773 ///Fills the histograms of pdf-s of y variables with binning nbins
774 
776 {
777  Int_t i, j, ispecies;
778 
779  if (!fYpdfHists.IsEmpty()){
780  if (((TH1D*)fYpdfHists.First())->GetNbinsX()!=nbins)
781  fYpdfHists.Delete();
782  else
783  return;
784  }
785 
786  char name[34];
787  for (ispecies=0; ispecies<fNSpecies; ispecies++){
788  for (i=0; i<fNy; i++){
789  snprintf(name,sizeof(name), "pdf_species%d_y%d", ispecies, i);
790  //TH1D *h = new TH1D(name, name, nbins, ypdfmin[ispecies*fNy+i], ypdfmax[ispecies*fNy+i]);
791  TH1D *h = new TH1D(name, name, nbins, fMinmax(0, fNx+fNy+ispecies*fNy+i), fMinmax(1, fNx+fNy+ispecies*fNy+i));
792  for (j=0; j<fNevents; j++)
793  h->Fill(fYpdf(j, ispecies*fNy+i));
794  fYpdfHists.Add(h);
795  }
796  }
797 }
798 
799 ////////////////////////////////////////////////////////////////////////////////
800 ///Returns the array of histograms of pdf's of y variables with binning nbins
801 ///If histograms have not already
802 ///been filled, they are filled with default binning 100.
803 
805 {
806  Int_t nbins = 100;
807  if (fYpdfHists.IsEmpty())
808  FillYpdfHists(nbins);
809 
810  return &fYpdfHists;
811 }
812 
813 ////////////////////////////////////////////////////////////////////////////////
814 ///Returns the histogram of the pdf of variable #iyvar for species #ispecies, binning nbins
815 ///If histograms have not already
816 ///been filled, they are filled with default binning 100.
817 
819 {
820  Int_t nbins = 100;
821  if (fYpdfHists.IsEmpty())
822  FillYpdfHists(nbins);
823 
824  return (TH1D*)fYpdfHists.UncheckedAt(fNy*ispecies+iyvar);
825 }
826 
827 ////////////////////////////////////////////////////////////////////////////////
828 ///The order of histograms in the array:
829 ///x0_species0, x0_species1,..., x1_species0, x1_species1,..., y0_species0, y0_species1,...
830 ///If the histograms have already been filled with a different binning, they are refilled
831 ///and all histograms are deleted
832 
834 {
835  if (fSWeights.GetNoElements()==0){
836  Error("GetSWeightsHists", "SWeights were not computed");
837  return;
838  }
839 
840  if (!fSWeightsHists.IsEmpty()){
841  if (((TH1D*)fSWeightsHists.First())->GetNbinsX()!=nbins)
843  else
844  return;
845  }
846 
847  char name[30];
848 
849  //Fill histograms of x-variables weighted with sWeights
850  for (Int_t ivar=0; ivar<fNx; ivar++){
851  for (Int_t ispecies=0; ispecies<fNSpecies; ispecies++){
852  snprintf(name,30, "x%d_species%d", ivar, ispecies);
853  TH1D *h = new TH1D(name, name, nbins, fMinmax(0, ivar), fMinmax(1, ivar));
854  h->Sumw2();
855  for (Int_t ievent=0; ievent<fNevents; ievent++)
856  h->Fill(fXvar(ievent, ivar), fSWeights(ievent, ispecies));
858  }
859  }
860 
861  //Fill histograms of y-variables (exluded from the fit), weighted with sWeights
862  for (Int_t iexcl=0; iexcl<fNy; iexcl++){
863  for(Int_t ispecies=0; ispecies<fNSpecies; ispecies++){
864  snprintf(name,30, "y%d_species%d", iexcl, ispecies);
865  TH1D *h = new TH1D(name, name, nbins, fMinmax(0, fNx+iexcl), fMinmax(1, fNx+iexcl));
866  h->Sumw2();
867  for (Int_t ievent=0; ievent<fNevents; ievent++)
868  h->Fill(fYvar(ievent, iexcl), fSWeights(ievent, fNSpecies*(iexcl+1)+ispecies));
870  }
871  }
872 }
873 
874 ////////////////////////////////////////////////////////////////////////////////
875 ///Returns an array of all histograms of variables, weighted with sWeights
876 ///If histograms have not been already filled, they are filled with default binning 50
877 ///The order of histograms in the array:
878 ///x0_species0, x0_species1,..., x1_species0, x1_species1,..., y0_species0, y0_species1,...
879 
881 {
882  Int_t nbins = 50; //default binning
883  if (fSWeightsHists.IsEmpty())
884  FillSWeightsHists(nbins);
885 
886  return &fSWeightsHists;
887 }
888 
889 ////////////////////////////////////////////////////////////////////////////////
890 ///The Fill...Hist() methods fill the histograms with the real limits on the variables
891 ///This method allows to refill the specified histogram with user-set boundaries min and max
892 ///Parameters:
893 ///type = 1 - histogram of x variable #nvar
894 /// = 2 - histogram of y variable #nvar
895 /// = 3 - histogram of y_pdf for y #nvar and species #nspecies
896 /// = 4 - histogram of x variable #nvar, species #nspecies, WITH sWeights
897 /// = 5 - histogram of y variable #nvar, species #nspecies, WITH sWeights
898 
899 void TSPlot::RefillHist(Int_t type, Int_t nvar, Int_t nbins, Double_t min, Double_t max, Int_t nspecies)
900 {
901  if (type<1 || type>5){
902  Error("RefillHist", "type must lie between 1 and 5");
903  return;
904  }
905  char name[20];
906  Int_t j;
907  TH1D *hremove;
908  if (type==1){
909  hremove = (TH1D*)fXvarHists.RemoveAt(nvar);
910  delete hremove;
911  snprintf(name,20,"x%d",nvar);
912  TH1D *h = new TH1D(name, name, nbins, min, max);
913  for (j=0; j<fNevents;j++)
914  h->Fill(fXvar(j, nvar));
915  fXvarHists.AddAt(h, nvar);
916  }
917  if (type==2){
918  hremove = (TH1D*)fYvarHists.RemoveAt(nvar);
919  delete hremove;
920  snprintf(name,20, "y%d", nvar);
921  TH1D *h = new TH1D(name, name, nbins, min, max);
922  for (j=0; j<fNevents;j++)
923  h->Fill(fYvar(j, nvar));
924  fXvarHists.AddAt(h, nvar);
925  }
926  if (type==3){
927  hremove = (TH1D*)fYpdfHists.RemoveAt(nspecies*fNy+nvar);
928  delete hremove;
929  snprintf(name,20, "pdf_species%d_y%d", nspecies, nvar);
930  TH1D *h=new TH1D(name, name, nbins, min, max);
931  for (j=0; j<fNevents; j++)
932  h->Fill(fYpdf(j, nspecies*fNy+nvar));
933  fYpdfHists.AddAt(h, nspecies*fNy+nvar);
934  }
935  if (type==4){
936  hremove = (TH1D*)fSWeightsHists.RemoveAt(fNSpecies*nvar+nspecies);
937  delete hremove;
938  snprintf(name,20, "x%d_species%d", nvar, nspecies);
939  TH1D *h = new TH1D(name, name, nbins, min, max);
940  h->Sumw2();
941  for (Int_t ievent=0; ievent<fNevents; ievent++)
942  h->Fill(fXvar(ievent, nvar), fSWeights(ievent, nspecies));
943  fSWeightsHists.AddAt(h, fNSpecies*nvar+nspecies);
944  }
945  if (type==5){
946  hremove = (TH1D*)fSWeightsHists.RemoveAt(fNx*fNSpecies + fNSpecies*nvar+nspecies);
947  delete hremove;
948  snprintf(name,20, "y%d_species%d", nvar, nspecies);
949  TH1D *h = new TH1D(name, name, nbins, min, max);
950  h->Sumw2();
951  for (Int_t ievent=0; ievent<fNevents; ievent++)
952  h->Fill(fYvar(ievent, nvar), fSWeights(ievent, nspecies));
953  fSWeightsHists.AddAt(h, fNx*fNSpecies + fNSpecies*nvar+nspecies);
954  }
955 }
956 ////////////////////////////////////////////////////////////////////////////////
957 ///Returns the histogram of a variable, weithed with sWeights
958 ///If histograms have not been already filled, they are filled with default binning 50
959 ///If parameter ixvar!=-1, the histogram of x-variable #ixvar is returned for species ispecies
960 ///If parameter ixvar==-1, the histogram of y-variable #iyexcl is returned for species ispecies
961 ///If the histogram has already been filled and the binning is different from the parameter nbins
962 ///all histograms with old binning will be deleted and refilled.
963 
964 TH1D *TSPlot::GetSWeightsHist(Int_t ixvar, Int_t ispecies,Int_t iyexcl)
965 {
966 
967  Int_t nbins = 50; //default binning
968  if (fSWeightsHists.IsEmpty())
969  FillSWeightsHists(nbins);
970 
971  if (ixvar==-1)
972  return (TH1D*)fSWeightsHists.UncheckedAt(fNx*fNSpecies + fNSpecies*iyexcl+ispecies);
973  else
974  return (TH1D*)fSWeightsHists.UncheckedAt(fNSpecies*ixvar + ispecies);
975 
976 }
977 
978 
979 ////////////////////////////////////////////////////////////////////////////////
980 /// Set the input Tree
981 
983 {
984  fTree = tree;
985 }
986 
987 ////////////////////////////////////////////////////////////////////////////////
988 ///Specifies the variables from the tree to be used for splot
989 ///
990 ///Variables fNx, fNy, fNSpecies and fNEvents should already be set!
991 ///
992 ///In the 1st parameter it is assumed that first fNx variables are x(control variables),
993 ///then fNy y variables (discriminating variables),
994 ///then fNy*fNSpecies ypdf variables (probability distribution functions of dicriminating
995 ///variables for different species). The order of pdfs should be: species0_y0, species0_y1,...
996 ///species1_y0, species1_y1,...species[fNSpecies-1]_y0...
997 ///The 2nd parameter allows to make a cut
998 ///TTree::Draw method description contains more details on specifying expression and selection
999 
1000 void TSPlot::SetTreeSelection(const char* varexp, const char *selection, Long64_t firstentry)
1001 {
1002  TTreeFormula **var;
1003  std::vector<TString> cnames;
1004  TList *formulaList = new TList();
1005  TSelectorDraw *selector = (TSelectorDraw*)(((TTreePlayer*)fTree->GetPlayer())->GetSelector());
1006 
1007  Long64_t entry, entryNumber;
1008  Int_t i,nch;
1009  Int_t ncols;
1010  TObjArray *leaves = fTree->GetListOfLeaves();
1011 
1012  fTreename= new TString(fTree->GetName());
1013  if (varexp)
1014  fVarexp = new TString(varexp);
1015  if (selection)
1016  fSelection= new TString(selection);
1017 
1018  nch = varexp ? strlen(varexp) : 0;
1019 
1020 
1021 //*-*- Compile selection expression if there is one
1022  TTreeFormula *select = 0;
1023  if (selection && strlen(selection)) {
1024  select = new TTreeFormula("Selection",selection,fTree);
1025  if (!select) return;
1026  if (!select->GetNdim()) { delete select; return; }
1027  formulaList->Add(select);
1028  }
1029 //*-*- if varexp is empty, take first nx + ny + ny*nspecies columns by default
1030 
1031  if (nch == 0) {
1032  ncols = fNx + fNy + fNy*fNSpecies;
1033  for (i=0;i<ncols;i++) {
1034  cnames.push_back( leaves->At(i)->GetName() );
1035  }
1036 //*-*- otherwise select only the specified columns
1037  } else {
1038  ncols = selector->SplitNames(varexp,cnames);
1039  }
1040  var = new TTreeFormula* [ncols];
1041  Double_t *xvars = new Double_t[ncols];
1042 
1043  fMinmax.ResizeTo(2, ncols);
1044  for (i=0; i<ncols; i++){
1045  fMinmax(0, i)=1e30;
1046  fMinmax(1, i)=-1e30;
1047  }
1048 
1049 //*-*- Create the TreeFormula objects corresponding to each column
1050  for (i=0;i<ncols;i++) {
1051  var[i] = new TTreeFormula("Var1",cnames[i].Data(),fTree);
1052  formulaList->Add(var[i]);
1053  }
1054 
1055 //*-*- Create a TreeFormulaManager to coordinate the formulas
1056  TTreeFormulaManager *manager=0;
1057  if (formulaList->LastIndex()>=0) {
1058  manager = new TTreeFormulaManager;
1059  for(i=0;i<=formulaList->LastIndex();i++) {
1060  manager->Add((TTreeFormula*)formulaList->At(i));
1061  }
1062  manager->Sync();
1063  }
1064 //*-*- loop on all selected entries
1065  // fSelectedRows = 0;
1066  Int_t tnumber = -1;
1067  Long64_t selectedrows=0;
1068  for (entry=firstentry;entry<firstentry+fNevents;entry++) {
1069  entryNumber = fTree->GetEntryNumber(entry);
1070  if (entryNumber < 0) break;
1071  Long64_t localEntry = fTree->LoadTree(entryNumber);
1072  if (localEntry < 0) break;
1073  if (tnumber != fTree->GetTreeNumber()) {
1074  tnumber = fTree->GetTreeNumber();
1075  if (manager) manager->UpdateFormulaLeaves();
1076  }
1077  Int_t ndata = 1;
1078  if (manager && manager->GetMultiplicity()) {
1079  ndata = manager->GetNdata();
1080  }
1081 
1082  for(Int_t inst=0;inst<ndata;inst++) {
1083  Bool_t loaded = kFALSE;
1084  if (select) {
1085  if (select->EvalInstance(inst) == 0) {
1086  continue;
1087  }
1088  }
1089 
1090  if (inst==0) loaded = kTRUE;
1091  else if (!loaded) {
1092  // EvalInstance(0) always needs to be called so that
1093  // the proper branches are loaded.
1094  for (i=0;i<ncols;i++) {
1095  var[i]->EvalInstance(0);
1096  }
1097  loaded = kTRUE;
1098  }
1099 
1100  for (i=0;i<ncols;i++) {
1101  xvars[i] = var[i]->EvalInstance(inst);
1102  }
1103 
1104  // curentry = entry-firstentry;
1105  //printf("event#%d\n", curentry);
1106  //for (i=0; i<ncols; i++)
1107  // printf("xvars[%d]=%f\n", i, xvars[i]);
1108  //selectedrows++;
1109  for (i=0; i<fNx; i++){
1110  fXvar(selectedrows, i) = xvars[i];
1111  if (fXvar(selectedrows, i) < fMinmax(0, i))
1112  fMinmax(0, i)=fXvar(selectedrows, i);
1113  if (fXvar(selectedrows, i) > fMinmax(1, i))
1114  fMinmax(1, i)=fXvar(selectedrows, i);
1115  }
1116  for (i=0; i<fNy; i++){
1117  fYvar(selectedrows, i) = xvars[i+fNx];
1118  //printf("y_in_loop(%d, %d)=%f, xvars[%d]=%f\n", selectedrows, i, fYvar(selectedrows, i), i+fNx, xvars[i+fNx]);
1119  if (fYvar(selectedrows, i) < fMinmax(0, i+fNx))
1120  fMinmax(0, i+fNx) = fYvar(selectedrows, i);
1121  if (fYvar(selectedrows, i) > fMinmax(1, i+fNx))
1122  fMinmax(1, i+fNx) = fYvar(selectedrows, i);
1123  for (Int_t j=0; j<fNSpecies; j++){
1124  fYpdf(selectedrows, j*fNy + i)=xvars[j*fNy + i+fNx+fNy];
1125  if (fYpdf(selectedrows, j*fNy+i) < fMinmax(0, j*fNy+i+fNx+fNy))
1126  fMinmax(0, j*fNy+i+fNx+fNy) = fYpdf(selectedrows, j*fNy+i);
1127  if (fYpdf(selectedrows, j*fNy+i) > fMinmax(1, j*fNy+i+fNx+fNy))
1128  fMinmax(1, j*fNy+i+fNx+fNy) = fYpdf(selectedrows, j*fNy+i);
1129  }
1130  }
1131  selectedrows++;
1132  }
1133  }
1134  fNevents=selectedrows;
1135  // for (i=0; i<fNevents; i++){
1136  // printf("event#%d\n", i);
1137  //for (Int_t iy=0; iy<fNy; iy++)
1138  // printf("y[%d]=%f\n", iy, fYvar(i, iy));
1139  //for (Int_t ispecies=0; ispecies<fNSpecies; ispecies++){
1140  // for (Int_t iy=0; iy<fNy; iy++)
1141  // printf("ypdf[sp. %d, y %d]=%f\n", ispecies, iy, fYpdf(i, ispecies*fNy+iy));
1142  // }
1143  //}
1144  delete [] xvars;
1145  delete [] var;
1146 }
1147 
1148 ////////////////////////////////////////////////////////////////////////////////
1149 /// FCN-function for Minuit
1150 
1151 void Yields(Int_t &, Double_t *, Double_t &f, Double_t *x, Int_t /*iflag*/)
1152 {
1153  Double_t lik;
1154  Int_t i, ispecies;
1155 
1157  TMatrixD *pdftot = (TMatrixD*)fitter->GetObjectFit();
1158  Int_t nev = pdftot->GetNrows();
1159  Int_t nes = pdftot->GetNcols();
1160  f=0;
1161  for (i=0; i<nev; i++){
1162  lik=0;
1163  for (ispecies=0; ispecies<nes; ispecies++)
1164  lik+=x[ispecies]*(*pdftot)(i, ispecies);
1165  if (lik<0) lik=1;
1166  f+=TMath::Log(lik);
1167  }
1168  //extended likelihood, equivalent to chi2
1169  Double_t ntot=0;
1170  for (i=0; i<nes; i++)
1171  ntot += x[i];
1172  f = -2*(f-ntot);
1173 }
1174 
void Add(TObject *obj, const char *name=0, Int_t check=-1)
Add object with name to browser.
Definition: TBrowser.cxx:261
virtual const char * GetName() const
Returns name of object.
Definition: TNamed.h:47
void SPlots(Double_t *covmat, Int_t i_excl)
Computes the sWeights from the covariance matrix.
Definition: TSPlot.cxx:622
virtual Int_t Fill(Double_t x)
Increment bin with abscissa X by 1.
Definition: TH1.cxx:3251
An array of TObjects.
Definition: TObjArray.h:37
Int_t fNevents
Definition: TSPlot.h:44
long long Long64_t
Definition: RtypesCore.h:69
TH1D * GetYpdfHist(Int_t iyvar, Int_t ispecies)
Returns the histogram of the pdf of variable #iyvar for species #ispecies, binning nbins If histogram...
Definition: TSPlot.cxx:818
Double_t Log(Double_t x)
Definition: TMath.h:759
void Yields(Int_t &, Double_t *, Double_t &f, Double_t *x, Int_t iflag)
FCN-function for Minuit.
Definition: TSPlot.cxx:1151
void SetInitialNumbersOfSpecies(Int_t *numbers)
Set the initial number of events of each species - used as initial estimates in minuit.
Definition: TSPlot.cxx:527
const char Option_t
Definition: RtypesCore.h:62
virtual Bool_t Sync()
Synchronize all the formulae.
void MakeSPlot(Option_t *option="v")
Calculates the sWeights The option controls the print level "Q" - no print out "V" - prints the estim...
Definition: TSPlot.cxx:542
virtual void Delete(Option_t *option="")
Remove all objects from the array AND delete all heap based objects.
Definition: TObjArray.cxx:355
virtual Int_t SetParameter(Int_t ipar, const char *parname, Double_t value, Double_t verr, Double_t vlow, Double_t vhigh)=0
TString & ReplaceAll(const TString &s1, const TString &s2)
Definition: TString.h:687
virtual Double_t * GetCovarianceMatrix() const =0
Int_t GetNcols() const
Definition: TMatrixTBase.h:125
TMatrixD fYpdf
Definition: TSPlot.h:25
TObjArray fYpdfHists
Definition: TSPlot.h:32
void ToUpper()
Change string to upper case.
Definition: TString.cxx:1113
virtual Double_t GetParameter(Int_t ipar) const =0
TMatrixD fXvar
Definition: TSPlot.h:23
Basic string class.
Definition: TString.h:131
#define f(i)
Definition: RSha256.hxx:104
Bool_t IsEmpty() const
Definition: TObjArray.h:70
int Int_t
Definition: RtypesCore.h:41
bool Bool_t
Definition: RtypesCore.h:59
virtual void UpdateFormulaLeaves()
This function could be called TTreePlayer::UpdateFormulaLeaves, itself called by TChain::LoadTree whe...
TObject * At(Int_t idx) const
Definition: TObjArray.h:165
virtual TMatrixTBase< Element > & ResizeTo(Int_t nrows, Int_t ncols, Int_t=-1)
Set size of the matrix to nrows x ncols New dynamic elements are created, the overlapping part of the...
Definition: TMatrixT.cxx:1210
TVirtualTreePlayer * GetPlayer()
Load the TTreePlayer (if not already done).
Definition: TTree.cxx:6023
virtual void Add(TTreeFormula *)
Add a new formula to the list of formulas managed The manager of the formula will be changed and the ...
TMatrixD fPdfTot
Definition: TSPlot.h:26
virtual Int_t GetNdim() const
Definition: TFormula.h:237
Int_t GetNoElements() const
Definition: TMatrixTBase.h:126
Double_t x[n]
Definition: legend1.C:17
Used to coordinate one or more TTreeFormula objects.
TString * fVarexp
Definition: TSPlot.h:37
virtual Long64_t LoadTree(Long64_t entry)
Set current entry.
Definition: TTree.cxx:6190
Definition: TSPlot.h:21
void FillYvarHists(Int_t nbins=100)
Fill the histograms of y variables.
Definition: TSPlot.cxx:722
virtual ~TSPlot()
destructor
Definition: TSPlot.cxx:477
void FillYpdfHists(Int_t nbins=100)
Fills the histograms of pdf-s of y variables with binning nbins.
Definition: TSPlot.cxx:775
virtual void Clear(Option_t *option="")=0
Set name and title to empty strings ("").
Used to pass a selection expression to the Tree drawing routine.
Definition: TTreeFormula.h:58
virtual Int_t GetTreeNumber() const
Definition: TTree.h:440
TObjArray * GetYpdfHists()
Returns the array of histograms of pdf&#39;s of y variables with binning nbins If histograms have not alr...
Definition: TSPlot.cxx:804
A doubly linked list.
Definition: TList.h:44
TObject * First() const
Return the object in the first slot.
Definition: TObjArray.cxx:495
void RefillHist(Int_t type, Int_t var, Int_t nbins, Double_t min, Double_t max, Int_t nspecies=-1)
The Fill...Hist() methods fill the histograms with the real limits on the variables This method allow...
Definition: TSPlot.cxx:899
Using a TBrowser one can browse all ROOT objects.
Definition: TBrowser.h:37
virtual void SetObjectFit(TObject *obj)
virtual Long64_t GetEntryNumber(Long64_t entry) const
Return entry number corresponding to entry.
Definition: TTree.cxx:5584
TMatrixD fYvar
Definition: TSPlot.h:24
virtual TObject * RemoveAt(Int_t idx)
Remove object at index idx.
Definition: TObjArray.cxx:678
void GetSWeights(TMatrixD &weights)
Returns the matrix of sweights.
Definition: TSPlot.cxx:643
TObjArray fXvarHists
Definition: TSPlot.h:30
virtual void Error(const char *method, const char *msgfmt,...) const
Issue error message.
Definition: TObject.cxx:880
TString * fSelection
Definition: TSPlot.h:38
static TVirtualFitter * GetFitter()
static: return the current Fitter
Int_t fNSpecies
Definition: TSPlot.h:43
virtual Int_t ExecuteCommand(const char *command, Double_t *args, Int_t nargs)=0
TObjArray fYvarHists
Definition: TSPlot.h:31
virtual TObject * At(Int_t idx) const
Returns the object at position idx. Returns 0 if idx is out of range.
Definition: TList.cxx:354
1-D histogram with a double per channel (see TH1 documentation)}
Definition: TH1.h:610
void SetTree(TTree *tree)
Set the input Tree.
Definition: TSPlot.cxx:982
TObjArray * GetYvarHists()
Returns the array of histograms of y variables.
Definition: TSPlot.cxx:748
virtual void AddAt(TObject *obj, Int_t idx)
Add object at position ids.
Definition: TObjArray.cxx:253
Int_t LastIndex() const
TMatrixD fSWeights
Definition: TSPlot.h:28
#define h(i)
Definition: RSha256.hxx:106
A specialized TSelector for TTree::Draw.
Definition: TSelectorDraw.h:31
Int_t GetNrows() const
Definition: TMatrixTBase.h:122
const Bool_t kFALSE
Definition: RtypesCore.h:88
TH1D * GetYvarHist(Int_t iyvar)
Returns the histogram of variable iyvar.If histograms have not already been filled, they are filled with default binning 100.
Definition: TSPlot.cxx:762
TSPlot()
default constructor (used by I/O only)
Definition: TSPlot.cxx:433
TObject * UncheckedAt(Int_t i) const
Definition: TObjArray.h:89
void Browse(TBrowser *b)
To browse the histograms.
Definition: TSPlot.cxx:492
#define ClassImp(name)
Definition: Rtypes.h:359
double Double_t
Definition: RtypesCore.h:55
int type
Definition: TGX11.cxx:120
Double_t * fNumbersOfEvents
Definition: TSPlot.h:46
TTree * fTree
Definition: TSPlot.h:35
Bool_t Contains(const char *pat, ECaseCompare cmp=kExact) const
Definition: TString.h:619
static constexpr double s
virtual void SetFCN(void(*fcn)(Int_t &, Double_t *, Double_t &f, Double_t *, Int_t))
To set the address of the minimization objective function called by the native compiler (see function...
T EvalInstance(Int_t i=0, const char *stringStack[]=0)
Evaluate this treeformula.
Abstract Base Class for Fitting.
void SetTreeSelection(const char *varexp="", const char *selection="", Long64_t firstentry=0)
Specifies the variables from the tree to be used for splot.
Definition: TSPlot.cxx:1000
TObjArray fSWeightsHists
Definition: TSPlot.h:33
void FillXvarHists(Int_t nbins=100)
Fills the histograms of x variables (not weighted) with nbins.
Definition: TSPlot.cxx:665
static TVirtualFitter * Fitter(TObject *obj, Int_t maxpar=25)
Static function returning a pointer to the current fitter.
virtual void Add(TObject *obj)
Definition: TList.h:87
TMatrixD fMinmax
Definition: TSPlot.h:27
virtual void Sumw2(Bool_t flag=kTRUE)
Create structure to store sum of squares of weights.
Definition: TH1.cxx:8313
you should not use this method at all Int_t Int_t Double_t Double_t Double_t Int_t Double_t Double_t Double_t Double_t b
Definition: TRolke.cxx:630
Int_t fNx
Definition: TSPlot.h:41
Int_t fNy
Definition: TSPlot.h:42
#define snprintf
Definition: civetweb.c:1351
THist< 1, double, THistStatContent, THistStatUncertainty > TH1D
Definition: THist.hxx:284
virtual UInt_t SplitNames(const TString &varexp, std::vector< TString > &names)
Build Index array for names in varexp.
Definition: tree.py:1
virtual void AddLast(TObject *obj)
Add object in the next empty slot in the array.
Definition: TObjArray.cxx:177
void Add(TObject *obj)
Definition: TObjArray.h:73
A TTree object has a header with a name and a title.
Definition: TTree.h:70
virtual Int_t GetNdata(Bool_t forceLoadDim=kFALSE)
Return number of available instances in the formulas.
Implement some of the functionality of the class TTree requiring access to extra libraries (Histogram...
Definition: TTreePlayer.h:37
virtual const char * GetName() const
Returns name of object.
Definition: TObject.cxx:357
virtual TObject * GetObjectFit() const
const Bool_t kTRUE
Definition: RtypesCore.h:87
TObjArray * GetXvarHists()
Returns the array of histograms of x variables (not weighted) If histograms have not already been fil...
Definition: TSPlot.cxx:693
virtual Int_t GetMultiplicity() const
static constexpr double ns
TH1D * GetXvarHist(Int_t ixvar)
Returns the histogram of variable #ixvar If histograms have not already been filled, they are filled with default binning 100.
Definition: TSPlot.cxx:708
char name[80]
Definition: TGX11.cxx:109
void FillSWeightsHists(Int_t nbins=50)
The order of histograms in the array: x0_species0, x0_species1,..., x1_species0, x1_species1,..., y0_species0, y0_species1,...
Definition: TSPlot.cxx:833
TString * fTreename
Definition: TSPlot.h:36
TObjArray * GetSWeightsHists()
Returns an array of all histograms of variables, weighted with sWeights If histograms have not been a...
Definition: TSPlot.cxx:880
virtual TObjArray * GetListOfLeaves()
Definition: TTree.h:410
TH1D * GetSWeightsHist(Int_t ixvar, Int_t ispecies, Int_t iyexcl=-1)
Returns the histogram of a variable, weithed with sWeights If histograms have not been already filled...
Definition: TSPlot.cxx:964