Logo ROOT   6.16/01
Reference Guide
df014_CSVDataSource.py
Go to the documentation of this file.
1## \file
2## \ingroup tutorial_dataframe
3## \notebook -draw
4## This tutorial illustrates how use the RDataFrame in combination with a
5## RDataSource. In this case we use a TCsvDS. This data source allows to read
6## a CSV file from a RDataFrame.
7## As a result of running this tutorial, we will produce plots of the dimuon
8## spectrum starting from a subset of the CMS collision events of Run2010B.
9## Dataset Reference:
10## McCauley, T. (2014). Dimuon event information derived from the Run2010B
11## public Mu dataset. CERN Open Data Portal.
12## DOI: [10.7483/OPENDATA.CMS.CB8H.MFFA](http://opendata.cern.ch/record/700).
13##
14## \macro_code
15## \macro_image
16##
17## \date October 2017
18## \author Enric Tejedor
19
20import ROOT
21
22# Let's first create a RDF that will read from the CSV file.
23# The types of the columns will be automatically inferred.
24fileName = "df014_CsvDataSource_MuRun2010B.csv"
25MakeCsvDataFrame = ROOT.ROOT.RDF.MakeCsvDataFrame
26tdf = MakeCsvDataFrame(fileName)
27
28# Now we will apply a first filter based on two columns of the CSV,
29# and we will define a new column that will contain the invariant mass.
30# Note how the new invariant mass column is defined from several other
31# columns that already existed in the CSV file.
32filteredEvents = tdf.Filter("Q1 * Q2 == -1") \
33 .Define("m", "sqrt(pow(E1 + E2, 2) - (pow(px1 + px2, 2) + pow(py1 + py2, 2) + pow(pz1 + pz2, 2)))")
34
35# Next we create a histogram to hold the invariant mass values and we draw it.
36invMass = filteredEvents.Histo1D(("invMass", "CMS Opendata: #mu#mu mass;#mu#mu mass [GeV];Events", 512, 2, 110), "m")
37
38c = ROOT.TCanvas()
39c.SetLogx()
40c.SetLogy()
41invMass.Draw()
42
43# We will now produce a plot also for the J/Psi particle. We will plot
44# on the same canvas the full spectrum and the zoom in the J/psi particle.
45# First we will create the full spectrum histogram from the invariant mass
46# column, using a different histogram model than before.
47fullSpectrum = filteredEvents.Histo1D(("Spectrum", "Subset of CMS Run 2010B;#mu#mu mass [GeV];Events", 1024, 2, 110), "m")
48
49# Next we will create the histogram for the J/psi particle, applying first
50# the corresponding cut.
51jpsiLow = 2.95
52jpsiHigh = 3.25
53jpsiCut = 'm < %s && m > %s' % (jpsiHigh, jpsiLow)
54jpsi = filteredEvents.Filter(jpsiCut) \
55 .Histo1D(("jpsi", "Subset of CMS Run 2010B: J/#psi window;#mu#mu mass [GeV];Events", 128, jpsiLow, jpsiHigh), "m")
56
57# Finally we draw the two histograms side by side.
58dualCanvas = ROOT.TCanvas("DualCanvas", "DualCanvas", 800, 512)
59dualCanvas.Divide(2, 1)
60leftPad = dualCanvas.cd(1)
61leftPad.SetLogx()
62leftPad.SetLogy()
63fullSpectrum.Draw("Hist")
64dualCanvas.cd(2)
65jpsi.Draw("HistP")
RDataFrame MakeCsvDataFrame(std::string_view fileName, bool readHeaders=true, char delimiter=',', Long64_t linesChunkSize=-1LL)
Factory method to create a CSV RDataFrame.
Definition: RCsvDS.cxx:454