{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6c9f7944",
   "metadata": {},
   "source": [
    "# rf404_categories\n",
    "Data and categories: working with ROOT.RooCategory objects to describe discrete variables\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "**Author:**  Clemens Lange, Wouter Verkerke (C++ version)  \n",
    "<i><small>This notebook tutorial was automatically generated with <a href= \"https://github.com/root-project/root/blob/master/documentation/doxygen/converttonotebook.py\">ROOTBOOK-izer</a> from the macro found in the ROOT repository  on Tuesday, May 19, 2026 at 08:31 PM.</small></i>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "55cfc809",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:43.840667Z",
     "iopub.status.busy": "2026-05-19T20:31:43.840524Z",
     "iopub.status.idle": "2026-05-19T20:31:44.828458Z",
     "shell.execute_reply": "2026-05-19T20:31:44.827784Z"
    }
   },
   "outputs": [],
   "source": [
    "import ROOT"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "70324421",
   "metadata": {},
   "source": [
    "Construct a category with labels\n",
    "--------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f7069135",
   "metadata": {},
   "source": [
    "Define a category with labels only; indices are assigned automatically, starting from 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "f2918d04",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:44.830587Z",
     "iopub.status.busy": "2026-05-19T20:31:44.830446Z",
     "iopub.status.idle": "2026-05-19T20:31:45.001242Z",
     "shell.execute_reply": "2026-05-19T20:31:44.995412Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RooCategory::tagCat = Lepton(idx = 0)\n",
      "\n"
     ]
    }
   ],
   "source": [
    "tagCat = ROOT.RooCategory(\"tagCat\", \"Tagging category\")\n",
    "tagCat.defineType(\"Lepton\")\n",
    "tagCat.defineType(\"Kaon\")\n",
    "tagCat.defineType(\"NetTagger-1\")\n",
    "tagCat.defineType(\"NetTagger-2\")\n",
    "tagCat.Print()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8964218",
   "metadata": {},
   "source": [
    "Construct a category with labels and indices\n",
    "------------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8a35504e",
   "metadata": {},
   "source": [
    "Define a category with explicitly numbered states by passing a map from state label to index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "e8eb426e",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.002966Z",
     "iopub.status.busy": "2026-05-19T20:31:45.002838Z",
     "iopub.status.idle": "2026-05-19T20:31:45.111854Z",
     "shell.execute_reply": "2026-05-19T20:31:45.111088Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RooCategory::b0flav = B0(idx = -1)\n",
      "\n"
     ]
    }
   ],
   "source": [
    "b0flav = ROOT.RooCategory(\"b0flav\", \"B0 flavour eigenstate\", {\"B0\": -1, \"B0bar\": 1})\n",
    "b0flav.Print()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "03a4722d",
   "metadata": {},
   "source": [
    "Generate dummy data for tabulation demo\n",
    "------------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b46e1de7",
   "metadata": {},
   "source": [
    "Generate a dummy dataset of 10000 events in x, b0flav and tagCat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "c0ed185f",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.113430Z",
     "iopub.status.busy": "2026-05-19T20:31:45.113307Z",
     "iopub.status.idle": "2026-05-19T20:31:45.303262Z",
     "shell.execute_reply": "2026-05-19T20:31:45.302464Z"
    }
   },
   "outputs": [],
   "source": [
    "x = ROOT.RooRealVar(\"x\", \"x\", 0, 10)\n",
    "data = ROOT.RooPolynomial(\"p\", \"p\", x).generate({x, b0flav, tagCat}, 10000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24f67e5c",
   "metadata": {},
   "source": [
    "Print tables of category contents of datasets\n",
    "--------------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eae32f64",
   "metadata": {},
   "source": [
    "Tables are the equivalent of plots for categories"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "269d12c9",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.305126Z",
     "iopub.status.busy": "2026-05-19T20:31:45.304995Z",
     "iopub.status.idle": "2026-05-19T20:31:45.419678Z",
     "shell.execute_reply": "2026-05-19T20:31:45.418904Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Roo1DTable::b0flav = (B0=5040,B0bar=4960)\n",
      "\n",
      "  Table b0flav : pData\n",
      "  +-------+------+\n",
      "  |    B0 | 5040 |\n",
      "  | B0bar | 4960 |\n",
      "  +-------+------+\n",
      "\n"
     ]
    }
   ],
   "source": [
    "btable = data.table(b0flav)\n",
    "btable.Print()\n",
    "btable.Print(\"v\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c547e9b",
   "metadata": {},
   "source": [
    "Create a table for the subset of events matching a cut expression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "0e9837f6",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.421217Z",
     "iopub.status.busy": "2026-05-19T20:31:45.421092Z",
     "iopub.status.idle": "2026-05-19T20:31:45.558043Z",
     "shell.execute_reply": "2026-05-19T20:31:45.557276Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Roo1DTable::tagCat = (Lepton=487,Kaon=433,NetTagger-1=439,NetTagger-2=406)\n",
      "\n",
      "  Table tagCat : pData(x>8.23)\n",
      "  +-------------+-----+\n",
      "  |      Lepton | 487 |\n",
      "  |        Kaon | 433 |\n",
      "  | NetTagger-1 | 439 |\n",
      "  | NetTagger-2 | 406 |\n",
      "  +-------------+-----+\n",
      "\n"
     ]
    }
   ],
   "source": [
    "ttable = data.table(tagCat, \"x>8.23\")\n",
    "ttable.Print()\n",
    "ttable.Print(\"v\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fcc3be85",
   "metadata": {},
   "source": [
    "Create a table for all (tagCat x b0flav) state combinations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "24ff1990",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.559594Z",
     "iopub.status.busy": "2026-05-19T20:31:45.559462Z",
     "iopub.status.idle": "2026-05-19T20:31:45.676301Z",
     "shell.execute_reply": "2026-05-19T20:31:45.675579Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "  Table (tagCat x b0flav) : pData\n",
      "  +---------------------+------+\n",
      "  |         {Lepton;B0} | 1281 |\n",
      "  |           {Kaon;B0} | 1253 |\n",
      "  |    {NetTagger-1;B0} | 1234 |\n",
      "  |    {NetTagger-2;B0} | 1272 |\n",
      "  |      {Lepton;B0bar} | 1269 |\n",
      "  |        {Kaon;B0bar} | 1255 |\n",
      "  | {NetTagger-1;B0bar} | 1219 |\n",
      "  | {NetTagger-2;B0bar} | 1217 |\n",
      "  +---------------------+------+\n",
      "\n"
     ]
    }
   ],
   "source": [
    "bttable = data.table({tagCat, b0flav})\n",
    "bttable.Print(\"v\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0dd615b8",
   "metadata": {},
   "source": [
    "Retrieve the number of events from the table.\n",
    "The number can be non-integer if the source dataset has weighted events."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "5b176704",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.680900Z",
     "iopub.status.busy": "2026-05-19T20:31:45.680770Z",
     "iopub.status.idle": "2026-05-19T20:31:45.788984Z",
     "shell.execute_reply": "2026-05-19T20:31:45.788362Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of events with B0 flavor is  5040.0\n"
     ]
    }
   ],
   "source": [
    "nb0 = btable.get(\"B0\")\n",
    "print(\"Number of events with B0 flavor is \", nb0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4042084",
   "metadata": {},
   "source": [
    "Retrieve the fraction of events with the \"Lepton\" tag from ttable, i.e. among the events passing the cut x>8.23"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "fd1cd440",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.793542Z",
     "iopub.status.busy": "2026-05-19T20:31:45.793414Z",
     "iopub.status.idle": "2026-05-19T20:31:45.901638Z",
     "shell.execute_reply": "2026-05-19T20:31:45.900924Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fraction of events tagged with Lepton tag is  0.27592067988668556\n"
     ]
    }
   ],
   "source": [
    "fracLep = ttable.getFrac(\"Lepton\")\n",
    "print(\"Fraction of events tagged with Lepton tag is \", fracLep)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d462ff0",
   "metadata": {},
   "source": [
    "Defining ranges for plotting and fitting on categories\n",
    "------------------------------------------------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c09fff5a",
   "metadata": {},
   "source": [
    "Define a named range as a comma-separated list of labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "1cd789a8",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:45.903391Z",
     "iopub.status.busy": "2026-05-19T20:31:45.903264Z",
     "iopub.status.idle": "2026-05-19T20:31:46.010433Z",
     "shell.execute_reply": "2026-05-19T20:31:46.010085Z"
    }
   },
   "outputs": [],
   "source": [
    "tagCat.setRange(\"good\", \"Lepton,Kaon\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d165e657",
   "metadata": {},
   "source": [
    "Or add state names one by one"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "ec592d82",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:46.012407Z",
     "iopub.status.busy": "2026-05-19T20:31:46.012288Z",
     "iopub.status.idle": "2026-05-19T20:31:46.119566Z",
     "shell.execute_reply": "2026-05-19T20:31:46.119139Z"
    }
   },
   "outputs": [],
   "source": [
    "tagCat.addToRange(\"soso\", \"NetTagger-1\")\n",
    "tagCat.addToRange(\"soso\", \"NetTagger-2\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05712932",
   "metadata": {},
   "source": [
    "Use a category range in a dataset reduction specification"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "ed751899",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:31:46.121219Z",
     "iopub.status.busy": "2026-05-19T20:31:46.121103Z",
     "iopub.status.idle": "2026-05-19T20:31:46.254715Z",
     "shell.execute_reply": "2026-05-19T20:31:46.254285Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "  Table tagCat : pData\n",
      "  +-------------+------+\n",
      "  |      Lepton | 2550 |\n",
      "  |        Kaon | 2508 |\n",
      "  | NetTagger-1 |    0 |\n",
      "  | NetTagger-2 |    0 |\n",
      "  +-------------+------+\n",
      "\n"
     ]
    }
   ],
   "source": [
    "goodData = data.reduce(CutRange=\"good\")\n",
    "goodData.table(tagCat).Print(\"v\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
