{ "cells": [ { "cell_type": "markdown", "id": "6106380a", "metadata": {}, "source": [ "# TMVAMinimalClassification\n", "Minimal self-contained example for setting up TMVA with binary\n", "classification.\n", "\n", "This is intended as a simple foundation to build on. It assumes you are\n", "already familiar with TMVA, so concepts such as the Factory and the DataLoader\n", "are not explained. For descriptions and tutorials, see the TMVA online manual\n", "https://root.cern/manual/tmva/ or the more detailed examples provided with TMVA,\n", "e.g. TMVAClassification.C, or the TMVA Users Guide\n", "https://github.com/root-project/root/blob/master/documentation/tmva/UsersGuide/TMVAUsersGuide.pdf\n", "\n", "Sets up a minimal binary classification example with two slightly overlapping\n", "2-D Gaussian distributions and trains a BDT classifier to discriminate between\n", "them.\n", "\n", "- Project : TMVA - a ROOT-integrated toolkit for multivariate data analysis\n", "- Package : TMVA\n", "- Root Macro: TMVAMinimalClassification.C\n", "\n", "\n", "\n", "**Author:** Kim Albertsson \n", "This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, May 19, 2026 at 08:24 PM." ] }, { "cell_type": "code", "execution_count": null, "id": "261f4a77", "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%cpp -d\n", "#include \"TMVA/DataLoader.h\"\n", "#include \"TMVA/Factory.h\"\n", "\n", "#include \"TCut.h\"\n", "#include \"TFile.h\"\n", "#include \"TRandom.h\"\n", "#include \"TString.h\"\n", "#include \"TTree.h\"" ] }, { "cell_type": "markdown", "id": "86d8190f", "metadata": {}, "source": [ " \n", "\n", "Helper function to generate 2-D Gaussian data points and fill them into a ROOT\n", "TTree.\n", "\n", "\n", "Arguments:\n", "\n", "- nPoints: Number of points to generate.\n", "- offset: Mean of the generated numbers.\n", "- scale: Standard deviation of the generated numbers.\n", "- seed: Seed for the random number generator. 
Use `seed=0` for a random\n", "seed.\n", "\n", "Returns a TTree ready to be used as input to TMVA.\n", "\n", "\n", " " ] }, { "cell_type": "code", "execution_count": null, "id": "9249c14c", "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%cpp -d\n", "TTree *genTree(Int_t nPoints, Double_t offset, Double_t scale, UInt_t seed = 100)\n", "{\n", "   TRandom rng(seed);\n", "   Double_t x = 0;\n", "   Double_t y = 0;\n", "\n", "   TTree *data = new TTree();\n", "   data->Branch(\"x\", &x, \"x/D\");\n", "   data->Branch(\"y\", &y, \"y/D\");\n", "\n", "   for (Int_t n = 0; n < nPoints; ++n) {\n", "      // Gaussian draws: x is centred at 0 and y at offset, both with standard deviation scale.\n", "      x = rng.Gaus(0.0, scale);\n", "      y = rng.Gaus(offset, scale);\n", "      data->Fill();\n", "   }\n", "\n", "   // Important: Disconnects the tree from the memory locations of x and y.\n", "   data->ResetBranchAddresses();\n", "   return data;\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "3530adcd", "metadata": { "collapsed": false }, "outputs": [], "source": [ "TString outputFilename = \"out.root\";\n", "TFile *outFile = new TFile(outputFilename, \"RECREATE\");" ] }, { "cell_type": "markdown", "id": "3796acae", "metadata": {}, "source": [ "Data generation" ] }, { "cell_type": "code", "execution_count": null, "id": "c0b281d0", "metadata": { "collapsed": false }, "outputs": [], "source": [ "// Signal: y centred at 0. Background: y centred at 1. Both use a scale of 2.\n", "TTree *signalTree = genTree(1000, 0.0, 2.0, 100);\n", "TTree *backgroundTree = genTree(1000, 1.0, 2.0, 101);\n", "\n", "TString factoryOptions = \"AnalysisType=Classification\";\n", "TMVA::Factory factory{\"\", outFile, factoryOptions};\n", "\n", "TMVA::DataLoader dataloader{\"dataset\"};" ] }, { "cell_type": "markdown", "id": "5cf8ef07", "metadata": {}, "source": [ "Data specification" ] }, { "cell_type": "code", "execution_count": null, "id": "5c703d9d", "metadata": { "collapsed": false }, "outputs": [], "source": [ "dataloader.AddVariable(\"x\", 'D');\n", "dataloader.AddVariable(\"y\", 'D');\n", "\n", "dataloader.AddSignalTree(signalTree, 1.0);\n", 
"dataloader.AddBackgroundTree(backgroundTree, 1.0);\n", "\n", "TCut signalCut = \"\";\n", "TCut backgroundCut = \"\";\n", "TString datasetOptions = \"SplitMode=Random\";\n", "dataloader.PrepareTrainingAndTestTree(signalCut, backgroundCut, datasetOptions);" ] }, { "cell_type": "markdown", "id": "0e3484ac", "metadata": {}, "source": [ "Method specification" ] }, { "cell_type": "code", "execution_count": null, "id": "ae749667", "metadata": { "collapsed": false }, "outputs": [], "source": [ "TString methodOptions = \"\";\n", "factory.BookMethod(&dataloader, TMVA::Types::kBDT, \"BDT\", methodOptions);" ] }, { "cell_type": "markdown", "id": "21fc298e", "metadata": {}, "source": [ "Training and Evaluation" ] }, { "cell_type": "code", "execution_count": null, "id": "c8e1c3fe", "metadata": { "collapsed": false }, "outputs": [], "source": [ "factory.TrainAllMethods();\n", "factory.TestAllMethods();\n", "factory.EvaluateAllMethods();" ] }, { "cell_type": "markdown", "id": "d5c59900", "metadata": {}, "source": [ "Clean up" ] }, { "cell_type": "code", "execution_count": null, "id": "d1f5360b", "metadata": { "collapsed": false }, "outputs": [], "source": [ "outFile->Close();\n", "\n", "delete outFile;\n", "delete signalTree;\n", "delete backgroundTree;" ] } ], "metadata": { "kernelspec": { "display_name": "ROOT C++", "language": "c++", "name": "root" }, "language_info": { "codemirror_mode": "text/x-c++src", "file_extension": ".C", "mimetype": " text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 5 }