{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "24f07c57",
   "metadata": {},
   "source": [
    "# TMVARegression\n",
    "This macro provides examples for the training and testing of the\n",
    "TMVA classifiers.\n",
    "\n",
    "As input data is used a toy-MC sample consisting of four Gaussian-distributed\n",
    "and linearly correlated input variables.\n",
    "\n",
    "The methods to be used can be switched on and off by means of booleans, or\n",
    "via the prompt command, for example:\n",
    "\n",
    "    root -l TMVARegression.C\\(\\\"LD,MLP\\\"\\)\n",
    "\n",
    "(note that the backslashes are mandatory)\n",
    "If no method given, a default set is used.\n",
    "\n",
    "The output file \"TMVAReg.root\" can be analysed with the use of dedicated\n",
    "macros (simply say: root -l <macro.C>), which can be conveniently\n",
    "invoked through a GUI that will appear at the end of the run of this macro.\n",
    "- Project   : TMVA - a Root-integrated toolkit for multivariate data analysis\n",
    "- Package   : TMVA\n",
    "- Root Macro: TMVARegression\n",
    "\n",
    "\n",
    "\n",
    "**Author:** Andreas Hoecker  \n",
    "<i><small>This notebook tutorial was automatically generated with <a href= \"https://github.com/root-project/root/blob/master/documentation/doxygen/converttonotebook.py\">ROOTBOOK-izer</a> from the macro found in the ROOT repository  on Tuesday, May 19, 2026 at 08:08 PM.</small></i>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "29c919d4",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:02.068402Z",
     "iopub.status.busy": "2026-05-19T20:09:02.068295Z",
     "iopub.status.idle": "2026-05-19T20:09:02.072491Z",
     "shell.execute_reply": "2026-05-19T20:09:02.071938Z"
    }
   },
   "outputs": [],
   "source": [
    "%%cpp -d\n",
    "#include <cstdlib>\n",
    "#include <iostream>\n",
    "#include <map>\n",
    "#include <string>\n",
    "\n",
    "#include \"TChain.h\"\n",
    "#include \"TFile.h\"\n",
    "#include \"TTree.h\"\n",
    "#include \"TString.h\"\n",
    "#include \"TObjString.h\"\n",
    "#include \"TSystem.h\"\n",
    "#include \"TROOT.h\"\n",
    "\n",
    "#include \"TMVA/Tools.h\"\n",
    "#include \"TMVA/Factory.h\"\n",
    "#include \"TMVA/DataLoader.h\"\n",
    "#include \"TMVA/TMVARegGui.h\"\n",
    "\n",
    "\n",
    "using namespace TMVA;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d6cbc8d",
   "metadata": {},
   "source": [
    " Arguments are defined. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "8682ffbd",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:02.073713Z",
     "iopub.status.busy": "2026-05-19T20:09:02.073579Z",
     "iopub.status.idle": "2026-05-19T20:09:02.276184Z",
     "shell.execute_reply": "2026-05-19T20:09:02.275487Z"
    }
   },
   "outputs": [],
   "source": [
    "TString myMethodList = \"\";"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f59907a",
   "metadata": {},
   "source": [
    "The explicit loading of the shared libTMVA is done in TMVAlogon.C, defined in .rootrc\n",
    "if you use your private .rootrc, or run from a different directory, please copy the\n",
    "corresponding lines from .rootrc"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c9995fc",
   "metadata": {},
   "source": [
    "methods to be processed can be given as an argument; use format:\n",
    "\n",
    "mylinux~> root -l TMVARegression.C\\(\\\"myMethod1,myMethod2,myMethod3\\\"\\)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d31d2b3",
   "metadata": {},
   "source": [
    "---------------------------------------------------------------\n",
    "This loads the library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "98d53ea1",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:02.278015Z",
     "iopub.status.busy": "2026-05-19T20:09:02.277902Z",
     "iopub.status.idle": "2026-05-19T20:09:02.480709Z",
     "shell.execute_reply": "2026-05-19T20:09:02.479976Z"
    }
   },
   "outputs": [],
   "source": [
    "TMVA::Tools::Instance();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30a20816",
   "metadata": {},
   "source": [
    "Default MVA methods to be trained + tested"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "4f5197f7",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:02.482737Z",
     "iopub.status.busy": "2026-05-19T20:09:02.482596Z",
     "iopub.status.idle": "2026-05-19T20:09:02.685320Z",
     "shell.execute_reply": "2026-05-19T20:09:02.684647Z"
    }
   },
   "outputs": [],
   "source": [
    "std::map<std::string,int> Use;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "759453ca",
   "metadata": {},
   "source": [
    "Mutidimensional likelihood and Nearest-Neighbour methods"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "cf89165e",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:02.687280Z",
     "iopub.status.busy": "2026-05-19T20:09:02.687167Z",
     "iopub.status.idle": "2026-05-19T20:09:02.889975Z",
     "shell.execute_reply": "2026-05-19T20:09:02.889318Z"
    }
   },
   "outputs": [],
   "source": [
    "Use[\"PDERS\"]           = 0;\n",
    "Use[\"PDEFoam\"]         = 1;\n",
    "Use[\"KNN\"]             = 1;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19e98e53",
   "metadata": {},
   "source": [
    "Linear Discriminant Analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "bc85b307",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:02.891944Z",
     "iopub.status.busy": "2026-05-19T20:09:02.891828Z",
     "iopub.status.idle": "2026-05-19T20:09:03.094615Z",
     "shell.execute_reply": "2026-05-19T20:09:03.093886Z"
    }
   },
   "outputs": [],
   "source": [
    "Use[\"LD\"]              = 1;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8e82af07",
   "metadata": {},
   "source": [
    "Function Discriminant analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "84d20934",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:03.096253Z",
     "iopub.status.busy": "2026-05-19T20:09:03.096138Z",
     "iopub.status.idle": "2026-05-19T20:09:03.298986Z",
     "shell.execute_reply": "2026-05-19T20:09:03.298272Z"
    }
   },
   "outputs": [],
   "source": [
    "Use[\"FDA_GA\"]          = 0;\n",
    "Use[\"FDA_MC\"]          = 0;\n",
    "Use[\"FDA_MT\"]          = 0;\n",
    "Use[\"FDA_GAMT\"]        = 0;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c9815fe6",
   "metadata": {},
   "source": [
    "Neural Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "21433707",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:03.300734Z",
     "iopub.status.busy": "2026-05-19T20:09:03.300601Z",
     "iopub.status.idle": "2026-05-19T20:09:03.503393Z",
     "shell.execute_reply": "2026-05-19T20:09:03.502648Z"
    }
   },
   "outputs": [],
   "source": [
    "Use[\"MLP\"]             = 0;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3274d430",
   "metadata": {},
   "source": [
    "Deep neural network (with CPU or GPU)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "7c06e01f",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:03.505107Z",
     "iopub.status.busy": "2026-05-19T20:09:03.504987Z",
     "iopub.status.idle": "2026-05-19T20:09:03.707761Z",
     "shell.execute_reply": "2026-05-19T20:09:03.707131Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Unbalanced braces. This cell was not processed.\n"
     ]
    }
   ],
   "source": [
    "#ifdef R__HAS_TMVAGPU\n",
    "Use[\"DNN_GPU\"] = 1;\n",
    "Use[\"DNN_CPU\"] = 0;\n",
    "#else\n",
    "Use[\"DNN_GPU\"] = 0;\n",
    "#ifdef R__HAS_TMVACPU\n",
    "Use[\"DNN_CPU\"] = 1;\n",
    "#else\n",
    "Use[\"DNN_CPU\"] = 0;\n",
    "#endif\n",
    "#endif"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44c833d1",
   "metadata": {},
   "source": [
    "Support Vector Machine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "b104b544",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:03.709288Z",
     "iopub.status.busy": "2026-05-19T20:09:03.709144Z",
     "iopub.status.idle": "2026-05-19T20:09:03.912110Z",
     "shell.execute_reply": "2026-05-19T20:09:03.911381Z"
    }
   },
   "outputs": [],
   "source": [
    "Use[\"SVM\"]             = 0;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb92ab1a",
   "metadata": {},
   "source": [
    "Boosted Decision Trees"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "c708a30e",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:03.913747Z",
     "iopub.status.busy": "2026-05-19T20:09:03.913624Z",
     "iopub.status.idle": "2026-05-19T20:09:04.116279Z",
     "shell.execute_reply": "2026-05-19T20:09:04.115694Z"
    }
   },
   "outputs": [],
   "source": [
    "Use[\"BDT\"]             = 0;\n",
    "Use[\"BDTG\"]            = 1;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d02901fd",
   "metadata": {},
   "source": [
    "---------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "75bb3247",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:04.118380Z",
     "iopub.status.busy": "2026-05-19T20:09:04.118266Z",
     "iopub.status.idle": "2026-05-19T20:09:04.321235Z",
     "shell.execute_reply": "2026-05-19T20:09:04.320572Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "==> Start TMVARegression\n"
     ]
    }
   ],
   "source": [
    "std::cout << std::endl;\n",
    "std::cout << \"==> Start TMVARegression\" << std::endl;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aee9172b",
   "metadata": {},
   "source": [
    "Select methods (don't look at this code - not of interest)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "cd2d5e9c",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:04.322647Z",
     "iopub.status.busy": "2026-05-19T20:09:04.322518Z",
     "iopub.status.idle": "2026-05-19T20:09:04.524892Z",
     "shell.execute_reply": "2026-05-19T20:09:04.524102Z"
    }
   },
   "outputs": [],
   "source": [
    "if (myMethodList != \"\") {\n",
    "   for (std::map<std::string,int>::iterator it = Use.begin(); it != Use.end(); it++) it->second = 0;\n",
    "\n",
    "   std::vector<TString> mlist = gTools().SplitString( myMethodList, ',' );\n",
    "   for (UInt_t i=0; i<mlist.size(); i++) {\n",
    "      std::string regMethod(mlist[i].Data());\n",
    "\n",
    "      if (Use.find(regMethod) == Use.end()) {\n",
    "         std::cout << \"Method \\\"\" << regMethod << \"\\\" not known in TMVA under this name. Choose among the following:\" << std::endl;\n",
    "         for (std::map<std::string,int>::iterator it = Use.begin(); it != Use.end(); it++) std::cout << it->first << \" \";\n",
    "         std::cout << std::endl;\n",
    "         return;\n",
    "      }\n",
    "      Use[regMethod] = 1;\n",
    "   }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc397baa",
   "metadata": {},
   "source": [
    "--------------------------------------------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24637cf1",
   "metadata": {},
   "source": [
    "Here the preparation phase begins"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f6eac84",
   "metadata": {},
   "source": [
    "Create a new root output file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "0b988d44",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:04.526640Z",
     "iopub.status.busy": "2026-05-19T20:09:04.526514Z",
     "iopub.status.idle": "2026-05-19T20:09:04.728874Z",
     "shell.execute_reply": "2026-05-19T20:09:04.728119Z"
    }
   },
   "outputs": [],
   "source": [
    "TString outfileName( \"TMVAReg.root\" );\n",
    "TFile* outputFile = TFile::Open( outfileName, \"RECREATE\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e318149",
   "metadata": {},
   "source": [
    "Create the factory object. Later you can choose the methods\n",
    "whose performance you'd like to investigate. The factory will\n",
    "then run the performance analysis for you.\n",
    "\n",
    "The first argument is the base of the name of all the\n",
    "weightfiles in the directory weight\n",
    "\n",
    "The second argument is the output file for the training results\n",
    "All TMVA output can be suppressed by removing the \"!\" (not) in\n",
    "front of the \"Silent\" argument in the option string"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "55e4c5bd",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:04.730419Z",
     "iopub.status.busy": "2026-05-19T20:09:04.730304Z",
     "iopub.status.idle": "2026-05-19T20:09:04.932949Z",
     "shell.execute_reply": "2026-05-19T20:09:04.932285Z"
    }
   },
   "outputs": [],
   "source": [
    "TMVA::Factory *factory = new TMVA::Factory( \"TMVARegression\", outputFile,\n",
    "                                            \"!V:!Silent:Color:!DrawProgressBar:AnalysisType=Regression\" );\n",
    "\n",
    "\n",
    "TMVA::DataLoader *dataloader=new TMVA::DataLoader(\"datasetreg\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "407441ac",
   "metadata": {},
   "source": [
    "If you wish to modify default settings\n",
    "(please check \"src/Config.h\" to see all available global options)\n",
    "\n",
    "(TMVA::gConfig().GetVariablePlotting()).fTimesRMS = 8.0;\n",
    "(TMVA::gConfig().GetIONames()).fWeightFileDir = \"myWeightDirectory\";"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "718e9b51",
   "metadata": {},
   "source": [
    "Define the input variables that shall be used for the MVA training\n",
    "note that you may also use variable expressions, such as: \"3*var1/var2*abs(var3)\"\n",
    "[all types of expressions that can also be parsed by TTree::Draw( \"expression\" )]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "241de486",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:04.934760Z",
     "iopub.status.busy": "2026-05-19T20:09:04.934619Z",
     "iopub.status.idle": "2026-05-19T20:09:05.137188Z",
     "shell.execute_reply": "2026-05-19T20:09:05.136352Z"
    }
   },
   "outputs": [],
   "source": [
    "dataloader->AddVariable( \"var1\", \"Variable 1\", \"units\", 'F' );\n",
    "dataloader->AddVariable( \"var2\", \"Variable 2\", \"units\", 'F' );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b1c3d4a",
   "metadata": {},
   "source": [
    "You can add so-called \"Spectator variables\", which are not used in the MVA training,\n",
    "but will appear in the final \"TestTree\" produced by TMVA. This TestTree will contain the\n",
    "input variables, the response values of all trained MVAs, and the spectator variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "f7faad7f",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:05.138802Z",
     "iopub.status.busy": "2026-05-19T20:09:05.138666Z",
     "iopub.status.idle": "2026-05-19T20:09:05.341127Z",
     "shell.execute_reply": "2026-05-19T20:09:05.340368Z"
    }
   },
   "outputs": [],
   "source": [
    "dataloader->AddSpectator( \"spec1:=var1*2\",  \"Spectator 1\", \"units\", 'F' );\n",
    "dataloader->AddSpectator( \"spec2:=var1*3\",  \"Spectator 2\", \"units\", 'F' );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fe4cd109",
   "metadata": {},
   "source": [
    "Add the variable carrying the regression target"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "519d4c8b",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:05.342704Z",
     "iopub.status.busy": "2026-05-19T20:09:05.342572Z",
     "iopub.status.idle": "2026-05-19T20:09:05.545006Z",
     "shell.execute_reply": "2026-05-19T20:09:05.544252Z"
    }
   },
   "outputs": [],
   "source": [
    "dataloader->AddTarget( \"fvalue\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da8f8a7a",
   "metadata": {},
   "source": [
    "It is also possible to declare additional targets for multi-dimensional regression, ie:\n",
    "factory->AddTarget( \"fvalue2\" );\n",
    "BUT: this is currently ONLY implemented for MLP"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d110bf1",
   "metadata": {},
   "source": [
    "Read training and test data (see TMVAClassification for reading ASCII files)\n",
    "load the signal and background event samples from ROOT trees"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "5a6bb171",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:05.546519Z",
     "iopub.status.busy": "2026-05-19T20:09:05.546403Z",
     "iopub.status.idle": "2026-05-19T20:09:05.749400Z",
     "shell.execute_reply": "2026-05-19T20:09:05.748853Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- TMVARegression           : Using input file: /github/home/ROOT-CI/build/tutorials/machine_learning/data/tmva_reg_example.root\n"
     ]
    }
   ],
   "source": [
    "TFile *input(nullptr);\n",
    "TString fname =  gROOT->GetTutorialDir() + \"/machine_learning/data/tmva_reg_example.root\";\n",
    "if (!gSystem->AccessPathName( fname )) {\n",
    "   input = TFile::Open( fname ); // check if file in local directory exists\n",
    "}\n",
    "if (!input) {\n",
    "   std::cout << \"ERROR: could not open data file\" << std::endl;\n",
    "   exit(1);\n",
    "}\n",
    "std::cout << \"--- TMVARegression           : Using input file: \" << input->GetName() << std::endl;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "978a8b86",
   "metadata": {},
   "source": [
    "Register the regression tree"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "59b312d6",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:05.750810Z",
     "iopub.status.busy": "2026-05-19T20:09:05.750696Z",
     "iopub.status.idle": "2026-05-19T20:09:05.953363Z",
     "shell.execute_reply": "2026-05-19T20:09:05.952633Z"
    }
   },
   "outputs": [],
   "source": [
    "TTree *regTree = (TTree*)input->Get(\"TreeR\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24314283",
   "metadata": {},
   "source": [
    "global event weights per tree (see below for setting event-wise weights)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "feb7f441",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:05.954861Z",
     "iopub.status.busy": "2026-05-19T20:09:05.954748Z",
     "iopub.status.idle": "2026-05-19T20:09:06.157435Z",
     "shell.execute_reply": "2026-05-19T20:09:06.156792Z"
    }
   },
   "outputs": [],
   "source": [
    "Double_t regWeight  = 1.0;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7a5c4a43",
   "metadata": {},
   "source": [
    "You can add an arbitrary number of regression trees"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "cf931012",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:06.159431Z",
     "iopub.status.busy": "2026-05-19T20:09:06.159317Z",
     "iopub.status.idle": "2026-05-19T20:09:06.362224Z",
     "shell.execute_reply": "2026-05-19T20:09:06.361586Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DataSetInfo              : [datasetreg] : Added class \"Regression\"\n",
      "                         : Add Tree TreeR of type Regression with 10000 events\n"
     ]
    }
   ],
   "source": [
    "dataloader->AddRegressionTree( regTree, regWeight );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c78113e4",
   "metadata": {},
   "source": [
    "This would set individual event weights (the variables defined in the\n",
    "expression need to exist in the original TTree)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "9a085739",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:06.363639Z",
     "iopub.status.busy": "2026-05-19T20:09:06.363514Z",
     "iopub.status.idle": "2026-05-19T20:09:06.565916Z",
     "shell.execute_reply": "2026-05-19T20:09:06.565156Z"
    }
   },
   "outputs": [],
   "source": [
    "dataloader->SetWeightExpression( \"var1\", \"Regression\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7627a16",
   "metadata": {},
   "source": [
    "Apply additional cuts on the signal and background samples (can be different)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "8114cb94",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:06.567522Z",
     "iopub.status.busy": "2026-05-19T20:09:06.567410Z",
     "iopub.status.idle": "2026-05-19T20:09:06.770103Z",
     "shell.execute_reply": "2026-05-19T20:09:06.769401Z"
    }
   },
   "outputs": [],
   "source": [
    "TCut mycut = \"\"; // for example: TCut mycut = \"abs(var1)<0.5 && abs(var2-0.5)<1\";"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ba9250a7",
   "metadata": {},
   "source": [
    "tell the DataLoader to use all remaining events in the trees after training for testing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "ed3b0e4a",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:06.771837Z",
     "iopub.status.busy": "2026-05-19T20:09:06.771697Z",
     "iopub.status.idle": "2026-05-19T20:09:06.974136Z",
     "shell.execute_reply": "2026-05-19T20:09:06.973720Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                         : Dataset[datasetreg] : Class index : 0  name : Regression\n"
     ]
    }
   ],
   "source": [
    "dataloader->PrepareTrainingAndTestTree( mycut,\n",
    "                                      \"nTrain_Regression=1000:nTest_Regression=0:SplitMode=Random:NormMode=NumEvents:!V\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0f2ea7c0",
   "metadata": {},
   "source": [
    "dataloader->PrepareTrainingAndTestTree( mycut,\n",
    "\"nTrain_Regression=0:nTest_Regression=0:SplitMode=Random:NormMode=NumEvents:!V\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a0f6b1b9",
   "metadata": {},
   "source": [
    "If no numbers of events are given, half of the events in the tree are used\n",
    "for training, and the other half for testing:\n",
    "\n",
    "dataloader->PrepareTrainingAndTestTree( mycut, \"SplitMode=random:!V\" );"
   ]
  },
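  {
   "cell_type": "markdown",
   "id": "c4d1a7e2",
   "metadata": {},
   "source": [
    "The splitting rule described above can be sketched in plain C++. This is only an illustration under stated assumptions, not TMVA's actual implementation: `splitSizes` is a hypothetical helper, a requested size of 0 is taken to mean that no size was specified, and the handling of the case where only a test size is given is an assumption.\n",
    "\n",
    "```cpp\n",
    "#include <algorithm>\n",
    "#include <cstddef>\n",
    "#include <utility>\n",
    "\n",
    "// Hypothetical helper, not part of TMVA: returns {nTrain, nTest} for a tree\n",
    "// holding nTotal events. A requested size of 0 means that no size was specified.\n",
    "std::pair<std::size_t, std::size_t> splitSizes(std::size_t nTotal,\n",
    "                                               std::size_t nTrainReq,\n",
    "                                               std::size_t nTestReq)\n",
    "{\n",
    "   if (nTrainReq == 0 && nTestReq == 0) {\n",
    "      // No numbers given: half for training, the other half for testing.\n",
    "      const std::size_t nTrain = nTotal / 2;\n",
    "      return {nTrain, nTotal - nTrain};\n",
    "   }\n",
    "   if (nTrainReq == 0) {\n",
    "      // Assumption: only the test size is given, the rest is used for training.\n",
    "      const std::size_t nTest = std::min(nTestReq, nTotal);\n",
    "      return {nTotal - nTest, nTest};\n",
    "   }\n",
    "   const std::size_t nTrain = std::min(nTrainReq, nTotal);\n",
    "   if (nTestReq == 0) {\n",
    "      // Only the training size is given: all remaining events are used for testing.\n",
    "      return {nTrain, nTotal - nTrain};\n",
    "   }\n",
    "   // Both sizes given: the test sample cannot exceed what is left after training.\n",
    "   return {nTrain, std::min(nTestReq, nTotal - nTrain)};\n",
    "}\n",
    "```\n",
    "\n",
    "With the settings used in this notebook, `splitSizes(10000, 1000, 0)` yields 1000 training and 9000 testing events for the 10000-event TreeR."
   ]
  },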
  {
   "cell_type": "markdown",
   "id": "8b32c193",
   "metadata": {},
   "source": [
    "Book MVA methods\n",
    "\n",
    "Please lookup the various method configuration options in the corresponding cxx files, eg:\n",
    "src/MethoCuts.cxx, etc, or here: http://tmva.sourceforge.net/old_site/optionRef.html\n",
    "it is possible to preset ranges in the option string in which the cut optimisation should be done:\n",
    "\"...:CutRangeMin[2]=-1:CutRangeMax[2]=1\"...\", where [2] is the third input variable"
   ]
  },
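  {
   "cell_type": "markdown",
   "id": "e9b2f05a",
   "metadata": {},
   "source": [
    "The option strings referred to above are colon-separated lists of `key=value` settings and boolean flags, where a leading `!` switches a flag off. As a rough illustration only, the following plain C++ sketch splits such a string into key/value pairs. `parseOptions` is a hypothetical helper and not a TMVA function; TMVA's own parser handles more cases (for example defaults and per-variable indices such as `[2]`).\n",
    "\n",
    "```cpp\n",
    "#include <cstddef>\n",
    "#include <map>\n",
    "#include <sstream>\n",
    "#include <string>\n",
    "\n",
    "// Hypothetical helper, not part of TMVA: splits a colon-separated option string.\n",
    "// Plain flags are stored as T, flags negated with a leading '!' as F.\n",
    "std::map<std::string, std::string> parseOptions(const std::string &options)\n",
    "{\n",
    "   std::map<std::string, std::string> result;\n",
    "   std::istringstream stream(options);\n",
    "   std::string token;\n",
    "   while (std::getline(stream, token, ':')) {\n",
    "      if (token.empty()) continue; // skip empty fields such as a trailing ':'\n",
    "      const std::size_t eq = token.find('=');\n",
    "      if (eq != std::string::npos) {\n",
    "         // key=value setting; everything after the first '=' is the value\n",
    "         result[token.substr(0, eq)] = token.substr(eq + 1);\n",
    "      } else if (token[0] == '!') {\n",
    "         result[token.substr(1)] = \"F\"; // negated boolean flag\n",
    "      } else {\n",
    "         result[token] = \"T\"; // plain boolean flag\n",
    "      }\n",
    "   }\n",
    "   return result;\n",
    "}\n",
    "```\n",
    "\n",
    "For the LD option string used in this notebook, `parseOptions(\"!H:!V:VarTransform=None\")` gives three entries: `H` and `V` switched off, and `VarTransform` set to `None`."
   ]
  },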
  {
   "cell_type": "markdown",
   "id": "1ebf380e",
   "metadata": {},
   "source": [
    "PDE - RS method"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "8f9ffbef",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:06.975716Z",
     "iopub.status.busy": "2026-05-19T20:09:06.975586Z",
     "iopub.status.idle": "2026-05-19T20:09:07.177796Z",
     "shell.execute_reply": "2026-05-19T20:09:07.177325Z"
    }
   },
   "outputs": [],
   "source": [
    "if (Use[\"PDERS\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kPDERS, \"PDERS\",\n",
    "                        \"!H:!V:NormTree=T:VolumeRangeMode=Adaptive:KernelEstimator=Gauss:GaussSigma=0.3:NEventsMin=40:NEventsMax=60:VarTransform=None\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "385ebd87",
   "metadata": {},
   "source": [
    "And the options strings for the MinMax and RMS methods, respectively:\n",
    "\n",
    "\"!H:!V:VolumeRangeMode=MinMax:DeltaFrac=0.2:KernelEstimator=Gauss:GaussSigma=0.3\" );\n",
    "\"!H:!V:VolumeRangeMode=RMS:DeltaFrac=3:KernelEstimator=Gauss:GaussSigma=0.3\" );"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "dc51db1a",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:07.179721Z",
     "iopub.status.busy": "2026-05-19T20:09:07.179589Z",
     "iopub.status.idle": "2026-05-19T20:09:07.382922Z",
     "shell.execute_reply": "2026-05-19T20:09:07.382443Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Factory                  : Booking method: \u001b[1mPDEFoam\u001b[0m\n",
      "                         : \n",
      "                         : Rebuilding Dataset datasetreg\n",
      "                         : Building event vectors for type 2 Regression\n",
      "                         : Dataset[datasetreg] :  create input formulas for tree TreeR\n",
      "DataSetFactory           : [datasetreg] : Number of events in input trees\n",
      "                         : \n",
      "                         : Number of training and testing events\n",
      "                         : ---------------------------------------------------------------------------\n",
      "                         : Regression -- training events            : 1000\n",
      "                         : Regression -- testing events             : 9000\n",
      "                         : Regression -- training and testing events: 10000\n",
      "                         : \n",
      "DataSetInfo              : Correlation matrix (Regression):\n",
      "                         : ------------------------\n",
      "                         :             var1    var2\n",
      "                         :    var1:  +1.000  -0.032\n",
      "                         :    var2:  -0.032  +1.000\n",
      "                         : ------------------------\n",
      "DataSetFactory           : [datasetreg] :  \n",
      "                         : \n"
     ]
    }
   ],
   "source": [
    "if (Use[\"PDEFoam\"])\n",
    "    factory->BookMethod( dataloader,  TMVA::Types::kPDEFoam, \"PDEFoam\",\n",
    "          \"!H:!V:MultiTargetRegression=F:TargetSelection=Mpv:TailCut=0.001:VolFrac=0.0666:nActiveCells=500:nSampl=2000:nBin=5:Compress=T:Kernel=None:Nmin=10:VarTransform=None\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2828aa8",
   "metadata": {},
   "source": [
    "K-Nearest Neighbour classifier (KNN)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "d3959d1c",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:07.384220Z",
     "iopub.status.busy": "2026-05-19T20:09:07.384109Z",
     "iopub.status.idle": "2026-05-19T20:09:07.586477Z",
     "shell.execute_reply": "2026-05-19T20:09:07.586018Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Factory                  : Booking method: \u001b[1mKNN\u001b[0m\n",
      "                         : \n"
     ]
    }
   ],
   "source": [
    "if (Use[\"KNN\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kKNN, \"KNN\",\n",
    "                        \"nkNN=20:ScaleFrac=0.8:SigmaFact=1.0:Kernel=Gaus:UseKernel=F:UseWeight=T:!Trim\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54151de5",
   "metadata": {},
   "source": [
    "Linear discriminant"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "66f4bce3",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:07.587980Z",
     "iopub.status.busy": "2026-05-19T20:09:07.587872Z",
     "iopub.status.idle": "2026-05-19T20:09:07.790245Z",
     "shell.execute_reply": "2026-05-19T20:09:07.789718Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Factory                  : Booking method: \u001b[1mLD\u001b[0m\n",
      "                         : \n"
     ]
    }
   ],
   "source": [
    "if (Use[\"LD\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kLD, \"LD\",\n",
    "                        \"!H:!V:VarTransform=None\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea79d309",
   "metadata": {},
   "source": [
    "Function discrimination analysis (FDA) -- test of various fitters - the recommended one is Minuit (or GA or SA)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "5493c3a7",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:07.791458Z",
     "iopub.status.busy": "2026-05-19T20:09:07.791349Z",
     "iopub.status.idle": "2026-05-19T20:09:07.993995Z",
     "shell.execute_reply": "2026-05-19T20:09:07.993229Z"
    }
   },
   "outputs": [],
   "source": [
    "if (Use[\"FDA_MC\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kFDA, \"FDA_MC\",\n",
    "                       \"!H:!V:Formula=(0)+(1)*x0+(2)*x1:ParRanges=(-100,100);(-100,100);(-100,100):FitMethod=MC:SampleSize=100000:Sigma=0.1:VarTransform=D\" );\n",
    "\n",
    "if (Use[\"FDA_GA\"]) // can also use Simulated Annealing (SA) algorithm (see Cuts_SA options) .. the formula of this example is good for parabolas\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kFDA, \"FDA_GA\",\n",
    "                        \"!H:!V:Formula=(0)+(1)*x0+(2)*x1:ParRanges=(-100,100);(-100,100);(-100,100):FitMethod=GA:PopSize=100:Cycles=3:Steps=30:Trim=True:SaveBestGen=1:VarTransform=Norm\" );\n",
    "\n",
    "if (Use[\"FDA_MT\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kFDA, \"FDA_MT\",\n",
    "                        \"!H:!V:Formula=(0)+(1)*x0+(2)*x1:ParRanges=(-100,100);(-100,100);(-100,100);(-10,10):FitMethod=MINUIT:ErrorLevel=1:PrintLevel=-1:FitStrategy=2:UseImprove:UseMinos:SetBatch\" );\n",
    "\n",
    "if (Use[\"FDA_GAMT\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kFDA, \"FDA_GAMT\",\n",
    "                        \"!H:!V:Formula=(0)+(1)*x0+(2)*x1:ParRanges=(-100,100);(-100,100);(-100,100):FitMethod=GA:Converger=MINUIT:ErrorLevel=1:PrintLevel=-1:FitStrategy=0:!UseImprove:!UseMinos:SetBatch:Cycles=1:PopSize=5:Steps=5:Trim\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b5e95ec0",
   "metadata": {},
   "source": [
    "Neural network (MLP)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "9516df90",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:07.995359Z",
     "iopub.status.busy": "2026-05-19T20:09:07.995247Z",
     "iopub.status.idle": "2026-05-19T20:09:08.197368Z",
     "shell.execute_reply": "2026-05-19T20:09:08.196860Z"
    }
   },
   "outputs": [],
   "source": [
    "if (Use[\"MLP\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kMLP, \"MLP\", \"!H:!V:VarTransform=Norm:NeuronType=tanh:NCycles=20000:HiddenLayers=N+20:TestRate=6:TrainingMethod=BFGS:Sampling=0.3:SamplingEpoch=0.8:ConvergenceImprove=1e-6:ConvergenceTests=15:!UseRegulator\" );\n",
    "\n",
    "if (Use[\"DNN_CPU\"] || Use[\"DNN_GPU\"]) {\n",
    "\n",
    "   TString archOption =  Use[\"DNN_GPU\"] ? \"GPU\" : \"CPU\";\n",
    "\n",
    "   TString layoutString(\"Layout=TANH|50,TANH|50,TANH|50,LINEAR\");\n",
    "\n",
    "\n",
    "   TString trainingStrategyString(\"TrainingStrategy=\");\n",
    "\n",
    "   trainingStrategyString +=\"LearningRate=1e-3,Momentum=0.3,ConvergenceSteps=20,BatchSize=50,TestRepetitions=1,WeightDecay=0.0,Regularization=None,Optimizer=Adam\";\n",
    "\n",
    "   TString nnOptions(\"!H:V:ErrorStrategy=SUMOFSQUARES:VarTransform=G:WeightInitialization=XAVIERUNIFORM:Architecture=\");\n",
    "   nnOptions.Append(archOption);\n",
    "   nnOptions.Append(\":\");\n",
    "   nnOptions.Append(layoutString);\n",
    "   nnOptions.Append(\":\");\n",
    "   nnOptions.Append(trainingStrategyString);\n",
    "\n",
    "   TString methodName = TString(\"DNN_\") + archOption;\n",
    "\n",
    "   factory->BookMethod(dataloader, TMVA::Types::kDL, methodName, nnOptions); // NN\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6ce8f22a",
   "metadata": {},
   "source": [
    "Support Vector Machine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "bf3c56c3",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:08.198942Z",
     "iopub.status.busy": "2026-05-19T20:09:08.198833Z",
     "iopub.status.idle": "2026-05-19T20:09:08.401072Z",
     "shell.execute_reply": "2026-05-19T20:09:08.400550Z"
    }
   },
   "outputs": [],
   "source": [
    "if (Use[\"SVM\"])\n",
    "   factory->BookMethod( dataloader,  TMVA::Types::kSVM, \"SVM\", \"Gamma=0.25:Tol=0.001:VarTransform=Norm\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c097199e",
   "metadata": {},
   "source": [
    "Boosted Decision Trees"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "476d5e28",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:08.402924Z",
     "iopub.status.busy": "2026-05-19T20:09:08.402815Z",
     "iopub.status.idle": "2026-05-19T20:09:08.605416Z",
     "shell.execute_reply": "2026-05-19T20:09:08.604983Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Factory                  : Booking method: \u001b[1mBDTG\u001b[0m\n",
      "                         : \n",
      "<WARNING>                : Value for option maxdepth was previously set to 3\n",
      "                         : the option NegWeightTreatment=InverseBoostNegWeights does not exist for BoostType=Grad\n",
      "                         : --> change to new default NegWeightTreatment=Pray\n"
     ]
    }
   ],
   "source": [
    "if (Use[\"BDT\"])\n",
    "  factory->BookMethod( dataloader,  TMVA::Types::kBDT, \"BDT\",\n",
    "                        \"!H:!V:NTrees=100:MinNodeSize=1.0%:BoostType=AdaBoostR2:SeparationType=RegressionVariance:nCuts=20:PruneMethod=CostComplexity:PruneStrength=30\" );\n",
    "\n",
    "if (Use[\"BDTG\"])\n",
    "  factory->BookMethod( dataloader,  TMVA::Types::kBDT, \"BDTG\",\n",
    "                        \"!H:!V:NTrees=2000::BoostType=Grad:Shrinkage=0.1:UseBaggedBoost:BaggedSampleFraction=0.5:nCuts=20:MaxDepth=3:MaxDepth=4\" );"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2fc61feb",
   "metadata": {},
   "source": [
    "--------------------------------------------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d90b1abe",
   "metadata": {},
   "source": [
    "Now you can tell the factory to train, test, and evaluate the MVAs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41617bfb",
   "metadata": {},
   "source": [
    "Train MVAs using the set of training events"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "c903218a",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:08.606903Z",
     "iopub.status.busy": "2026-05-19T20:09:08.606794Z",
     "iopub.status.idle": "2026-05-19T20:09:10.489128Z",
     "shell.execute_reply": "2026-05-19T20:09:10.488646Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Factory                  : \u001b[1mTrain all methods\u001b[0m\n",
      "Factory                  : [datasetreg] : Create Transformation \"I\" with events from all classes.\n",
      "                         : \n",
      "                         : Transformation, Variable selection : \n",
      "                         : Input : variable 'var1' <---> Output : variable 'var1'\n",
      "                         : Input : variable 'var2' <---> Output : variable 'var2'\n",
      "TFHandler_Factory        : Variable        Mean        RMS   [        Min        Max ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         :     var1:     3.3615     1.1815   [  0.0010317     4.9864 ]\n",
      "                         :     var2:     2.4456     1.4269   [  0.0039980     4.9846 ]\n",
      "                         :   fvalue:     163.04     79.540   [     1.8147     358.73 ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         : Ranking input variables (method unspecific)...\n",
      "IdTransformation         : Ranking result (top variable is best ranked)\n",
      "                         : --------------------------------------------\n",
      "                         : Rank : Variable  : |Correlation with target|\n",
      "                         : --------------------------------------------\n",
      "                         :    1 : var2      : 7.559e-01\n",
      "                         :    2 : var1      : 6.143e-01\n",
      "                         : --------------------------------------------\n",
      "IdTransformation         : Ranking result (top variable is best ranked)\n",
      "                         : -------------------------------------\n",
      "                         : Rank : Variable  : Mutual information\n",
      "                         : -------------------------------------\n",
      "                         :    1 : var2      : 2.014e+00\n",
      "                         :    2 : var1      : 1.978e+00\n",
      "                         : -------------------------------------\n",
      "IdTransformation         : Ranking result (top variable is best ranked)\n",
      "                         : ------------------------------------\n",
      "                         : Rank : Variable  : Correlation Ratio\n",
      "                         : ------------------------------------\n",
      "                         :    1 : var1      : 6.270e+00\n",
      "                         :    2 : var2      : 2.543e+00\n",
      "                         : ------------------------------------\n",
      "IdTransformation         : Ranking result (top variable is best ranked)\n",
      "                         : ----------------------------------------\n",
      "                         : Rank : Variable  : Correlation Ratio (T)\n",
      "                         : ----------------------------------------\n",
      "                         :    1 : var2      : 1.051e+00\n",
      "                         :    2 : var1      : 5.263e-01\n",
      "                         : ----------------------------------------\n",
      "Factory                  : Train method: PDEFoam for Regression\n",
      "                         : \n",
      "                         : Build mono target regression foam\n",
      "                         : Elapsed time: 0.279 sec                                 \n",
      "                         : Elapsed time for training with 1000 events: 0.282 sec         \n",
      "                         : Dataset[datasetreg] : Create results for training\n",
      "                         : Dataset[datasetreg] : Evaluation of PDEFoam on training sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 1000 events: 0.00237 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n",
      "                         : Creating xml weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_PDEFoam.weights.xml\u001b[0m\n",
      "                         : writing foam MonoTargetRegressionFoam to file\n",
      "                         : Foams written to file: \u001b[0;36mdatasetreg/weights/TMVARegression_PDEFoam.weights_foams.root\u001b[0m\n",
      "Factory                  : Training finished\n",
      "                         : \n",
      "Factory                  : Train method: KNN for Regression\n",
      "                         : \n",
      "KNN                      : <Train> start...\n",
      "                         : Reading 1000 events\n",
      "                         : Number of signal events 1000\n",
      "                         : Number of background events 0\n",
      "                         : Creating kd-tree with 1000 events\n",
      "                         : Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)\n",
      "ModulekNN                : Optimizing tree for 2 variables with 1000 values\n",
      "                         : <Fill> Class 1 has     1000 events\n",
      "                         : Elapsed time for training with 1000 events: 0.000713 sec         \n",
      "                         : Dataset[datasetreg] : Create results for training\n",
      "                         : Dataset[datasetreg] : Evaluation of KNN on training sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 1000 events: 0.00389 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n",
      "                         : Creating xml weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_KNN.weights.xml\u001b[0m\n",
      "Factory                  : Training finished\n",
      "                         : \n",
      "Factory                  : Train method: LD for Regression\n",
      "                         : \n",
      "LD                       : Results for LD coefficients:\n",
      "                         : -----------------------\n",
      "                         : Variable:  Coefficient:\n",
      "                         : -----------------------\n",
      "                         :     var1:      +41.434\n",
      "                         :     var2:      +42.995\n",
      "                         : (offset):      -81.387\n",
      "                         : -----------------------\n",
      "                         : Elapsed time for training with 1000 events: 0.00019 sec         \n",
      "                         : Dataset[datasetreg] : Create results for training\n",
      "                         : Dataset[datasetreg] : Evaluation of LD on training sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 1000 events: 0.000148 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n",
      "                         : Creating xml weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_LD.weights.xml\u001b[0m\n",
      "Factory                  : Training finished\n",
      "                         : \n",
      "Factory                  : Train method: BDTG for Regression\n",
      "                         : \n",
      "                         : Regression Loss Function: Huber\n",
      "                         : Training 2000 Decision Trees ... patience please\n",
      "                         : Elapsed time for training with 1000 events: 0.817 sec         \n",
      "                         : Dataset[datasetreg] : Create results for training\n",
      "                         : Dataset[datasetreg] : Evaluation of BDTG on training sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 1000 events: 0.158 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n",
      "                         : Creating xml weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_BDTG.weights.xml\u001b[0m\n",
      "                         : TMVAReg.root:/datasetreg/Method_BDT/BDTG\n",
      "Factory                  : Training finished\n",
      "                         : \n",
      "Factory                  : === Destroy and recreate all methods via weight files for testing ===\n",
      "                         : \n",
      "                         : Reading weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_PDEFoam.weights.xml\u001b[0m\n",
      "                         : Read foams from file: \u001b[0;36mdatasetreg/weights/TMVARegression_PDEFoam.weights_foams.root\u001b[0m\n",
      "                         : Reading weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_KNN.weights.xml\u001b[0m\n",
      "                         : Creating kd-tree with 1000 events\n",
      "                         : Computing scale factor for 1d distributions: (ifrac, bottom, top) = (80%, 10%, 90%)\n",
      "ModulekNN                : Optimizing tree for 2 variables with 1000 values\n",
      "                         : <Fill> Class 1 has     1000 events\n",
      "                         : Reading weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_LD.weights.xml\u001b[0m\n",
      "                         : Reading weight file: \u001b[0;36mdatasetreg/weights/TMVARegression_BDTG.weights.xml\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "factory->TrainAllMethods();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9032258d",
   "metadata": {},
   "source": [
    "Evaluate all MVAs using the set of test events"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "196af4aa",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:10.490459Z",
     "iopub.status.busy": "2026-05-19T20:09:10.490352Z",
     "iopub.status.idle": "2026-05-19T20:09:11.669597Z",
     "shell.execute_reply": "2026-05-19T20:09:11.669125Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Factory                  : \u001b[1mTest all methods\u001b[0m\n",
      "Factory                  : Test method: PDEFoam for Regression performance\n",
      "                         : \n",
      "                         : Dataset[datasetreg] : Create results for testing\n",
      "                         : Dataset[datasetreg] : Evaluation of PDEFoam on testing sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 9000 events: 0.0202 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n",
      "Factory                  : Test method: KNN for Regression performance\n",
      "                         : \n",
      "                         : Dataset[datasetreg] : Create results for testing\n",
      "                         : Dataset[datasetreg] : Evaluation of KNN on testing sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 9000 events: 0.0366 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n",
      "Factory                  : Test method: LD for Regression performance\n",
      "                         : \n",
      "                         : Dataset[datasetreg] : Create results for testing\n",
      "                         : Dataset[datasetreg] : Evaluation of LD on testing sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 9000 events: 0.00122 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n",
      "Factory                  : Test method: BDTG for Regression performance\n",
      "                         : \n",
      "                         : Dataset[datasetreg] : Create results for testing\n",
      "                         : Dataset[datasetreg] : Evaluation of BDTG on testing sample\n",
      "                         : Dataset[datasetreg] : Elapsed time for evaluation of 9000 events: 0.899 sec       \n",
      "                         : Create variable histograms\n",
      "                         : Create regression target histograms\n",
      "                         : Create regression average deviation\n",
      "                         : Results created\n"
     ]
    }
   ],
   "source": [
    "factory->TestAllMethods();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a3484ba5",
   "metadata": {},
   "source": [
    "Evaluate and compare performance of all configured MVAs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "c7d8a680",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:11.670816Z",
     "iopub.status.busy": "2026-05-19T20:09:11.670708Z",
     "iopub.status.idle": "2026-05-19T20:09:13.486375Z",
     "shell.execute_reply": "2026-05-19T20:09:13.485933Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Factory                  : \u001b[1mEvaluate all methods\u001b[0m\n",
      "                         : Evaluate regression method: PDEFoam\n",
      "                         : TestRegression (testing)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 9000 events: 0.0199 sec       \n",
      "                         : TestRegression (training)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 1000 events: 0.00222 sec       \n",
      "TFHandler_PDEFoam        : Variable        Mean        RMS   [        Min        Max ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         :     var1:     3.3370     1.1877   [ 0.00020069     5.0000 ]\n",
      "                         :     var2:     2.4902     1.4378   [ 0.00071490     5.0000 ]\n",
      "                         :   fvalue:     164.24     84.217   [     1.6186     394.84 ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         : Evaluate regression method: KNN\n",
      "                         : TestRegression (testing)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 9000 events: 0.0374 sec       \n",
      "                         : TestRegression (training)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 1000 events: 0.00412 sec       \n",
      "TFHandler_KNN            : Variable        Mean        RMS   [        Min        Max ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         :     var1:     3.3370     1.1877   [ 0.00020069     5.0000 ]\n",
      "                         :     var2:     2.4902     1.4378   [ 0.00071490     5.0000 ]\n",
      "                         :   fvalue:     164.24     84.217   [     1.6186     394.84 ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         : Evaluate regression method: LD\n",
      "                         : TestRegression (testing)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 9000 events: 0.00184 sec       \n",
      "                         : TestRegression (training)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 1000 events: 0.000178 sec       \n",
      "TFHandler_LD             : Variable        Mean        RMS   [        Min        Max ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         :     var1:     3.3370     1.1877   [ 0.00020069     5.0000 ]\n",
      "                         :     var2:     2.4902     1.4378   [ 0.00071490     5.0000 ]\n",
      "                         :   fvalue:     164.24     84.217   [     1.6186     394.84 ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         : Evaluate regression method: BDTG\n",
      "                         : TestRegression (testing)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 9000 events: 0.898 sec       \n",
      "                         : TestRegression (training)\n",
      "                         : Calculate regression for all events\n",
      "                         : Elapsed time for evaluation of 1000 events: 0.0981 sec       \n",
      "TFHandler_BDTG           : Variable        Mean        RMS   [        Min        Max ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         :     var1:     3.3370     1.1877   [ 0.00020069     5.0000 ]\n",
      "                         :     var2:     2.4902     1.4378   [ 0.00071490     5.0000 ]\n",
      "                         :   fvalue:     164.24     84.217   [     1.6186     394.84 ]\n",
      "                         : -----------------------------------------------------------\n",
      "                         : \n",
      "                         : Evaluation results ranked by smallest RMS on test sample:\n",
      "                         : (\"Bias\" quotes the mean deviation of the regression from true target.\n",
      "                         :  \"MutInf\" is the \"Mutual Information\" between regression and target.\n",
      "                         :  Indicated by \"_T\" are the corresponding \"truncated\" quantities ob-\n",
      "                         :  tained when removing events deviating more than 2sigma from average.)\n",
      "                         : --------------------------------------------------------------------------------------------------\n",
      "                         : --------------------------------------------------------------------------------------------------\n",
      "                         : datasetreg           BDTG           :   0.0489   0.0694     2.42     1.86  |  3.157  3.194\n",
      "                         : datasetreg           KNN            :    -1.25   0.0612     7.84     4.47  |  2.870  2.864\n",
      "                         : datasetreg           PDEFoam        :    -1.10   -0.585     10.2     8.00  |  2.281  2.331\n",
      "                         : datasetreg           LD             :   -0.301     1.50     19.9     17.9  |  1.984  1.960\n",
      "                         : --------------------------------------------------------------------------------------------------\n",
      "                         : \n",
      "                         : Evaluation results ranked by smallest RMS on training sample:\n",
      "                         : (overtraining check)\n",
      "                         : --------------------------------------------------------------------------------------------------\n",
      "                         : DataSet Name:         MVA Method:        <Bias>   <Bias_T>    RMS    RMS_T  |  MutInf MutInf_T\n",
      "                         : --------------------------------------------------------------------------------------------------\n",
      "                         : datasetreg           BDTG           :   0.0188   0.0107    0.445    0.254  |  3.483  3.514\n",
      "                         : datasetreg           KNN            :   -0.486    0.354     5.18     3.61  |  2.948  2.988\n",
      "                         : datasetreg           PDEFoam        :-3.12e-07    0.265     7.58     6.09  |  2.514  2.592\n",
      "                         : datasetreg           LD             :-9.54e-07     1.31     19.0     17.5  |  2.081  2.113\n",
      "                         : --------------------------------------------------------------------------------------------------\n",
      "                         : \n",
      "Dataset:datasetreg       : Created tree 'TestTree' with 9000 events\n",
      "                         : \n",
      "Dataset:datasetreg       : Created tree 'TrainTree' with 1000 events\n",
      "                         : \n",
      "Factory                  : \u001b[1mThank you for using TMVA!\u001b[0m\n",
      "                         : \u001b[1mFor citation information, please visit: http://tmva.sf.net/citeTMVA.html\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "factory->EvaluateAllMethods();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "de5f5b24",
   "metadata": {},
   "source": [
    "--------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73f01c20",
   "metadata": {},
   "source": [
    "Save the output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "56c9d330",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:13.487746Z",
     "iopub.status.busy": "2026-05-19T20:09:13.487630Z",
     "iopub.status.idle": "2026-05-19T20:09:13.690337Z",
     "shell.execute_reply": "2026-05-19T20:09:13.689963Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==> Wrote root file: TMVAReg.root\n",
      "==> TMVARegression is done!\n"
     ]
    }
   ],
   "source": [
    "outputFile->Close();\n",
    "\n",
    "std::cout << \"==> Wrote root file: \" << outputFile->GetName() << std::endl;\n",
    "std::cout << \"==> TMVARegression is done!\" << std::endl;\n",
    "\n",
    "delete factory;\n",
    "delete dataloader;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9b732fb",
   "metadata": {},
   "source": [
    "Launch the GUI for the root macros"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "60970064",
   "metadata": {
    "collapsed": false,
    "execution": {
     "iopub.execute_input": "2026-05-19T20:09:13.691849Z",
     "iopub.status.busy": "2026-05-19T20:09:13.691740Z",
     "iopub.status.idle": "2026-05-19T20:09:13.894215Z",
     "shell.execute_reply": "2026-05-19T20:09:13.893693Z"
    }
   },
   "outputs": [],
   "source": [
    "if (!gROOT->IsBatch()) TMVA::TMVARegGui( outfileName );"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "ROOT C++",
   "language": "c++",
   "name": "root"
  },
  "language_info": {
   "codemirror_mode": "text/x-c++src",
   "file_extension": ".C",
   "mimetype": " text/x-c++src",
   "name": "c++"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}