Logo ROOT  
Reference Guide
 
All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Properties Friends Macros Modules Pages
Loading...
Searching...
No Matches
df014_CSVDataSource.py
Go to the documentation of this file.
1## \file
2## \ingroup tutorial_dataframe
3## \notebook -draw
4## Process a CSV file with RDataFrame and the CSV data source.
5##
6## This tutorial illustrates how use the RDataFrame in combination with a
7## RDataSource. In this case we use a TCsvDS. This data source allows to read
8## a CSV file from a RDataFrame.
9## As a result of running this tutorial, we will produce plots of the dimuon
10## spectrum starting from a subset of the CMS collision events of Run2010B.
11## Dataset Reference:
12## McCauley, T. (2014). Dimuon event information derived from the Run2010B
13## public Mu dataset. CERN Open Data Portal.
14## DOI: [10.7483/OPENDATA.CMS.CB8H.MFFA](http://opendata.cern.ch/record/700).
15##
16## \macro_code
17## \macro_image
18##
19## \date October 2017
20## \author Enric Tejedor (CERN)
21
22import ROOT
23import urllib.request
24import os
25
26# Let's first create a RDF that will read from the CSV file.
27# The types of the columns will be automatically inferred.
28fileNameUrl = "http://root.cern/files/tutorials/df014_CsvDataSource_MuRun2010B.csv"
29fileName = "CsvDataSource_MuRun2010B.csv"
30if not os.path.isfile(fileName):
31 urllib.request.urlretrieve(fileNameUrl, fileName)
32
33df = ROOT.RDF.FromCSV(fileName)
34
35# Now we will apply a first filter based on two columns of the CSV,
36# and we will define a new column that will contain the invariant mass.
37# Note how the new invariant mass column is defined from several other
38# columns that already existed in the CSV file.
39filteredEvents = df.Filter("Q1 * Q2 == -1") \
40 .Define("m", "sqrt(pow(E1 + E2, 2) - (pow(px1 + px2, 2) + pow(py1 + py2, 2) + pow(pz1 + pz2, 2)))")
41
42# Next we create a histogram to hold the invariant mass values and we draw it.
43invMass = filteredEvents.Histo1D(("invMass", "CMS Opendata: #mu#mu mass;#mu#mu mass [GeV];Events", 512, 2, 110), "m")
44
45c = ROOT.TCanvas()
49c.SaveAs("df014_invMass.png")
50
51# We will now produce a plot also for the J/Psi particle. We will plot
52# on the same canvas the full spectrum and the zoom in on the J/psi particle.
53# First we will create the full spectrum histogram from the invariant mass
54# column, using a different histogram model than before.
55fullSpectrum = filteredEvents.Histo1D(("Spectrum", "Subset of CMS Run 2010B;#mu#mu mass [GeV];Events", 1024, 2, 110), "m")
56
57# Next we will create the histogram for the J/psi particle, applying first
58# the corresponding cut.
59jpsiLow = 2.95
60jpsiHigh = 3.25
61jpsiCut = 'm < %s && m > %s' % (jpsiHigh, jpsiLow)
62jpsi = filteredEvents.Filter(jpsiCut) \
63 .Histo1D(("jpsi", "Subset of CMS Run 2010B: J/#psi window;#mu#mu mass [GeV];Events", 128, jpsiLow, jpsiHigh), "m")
64
65# Finally we draw the two histograms side by side.
66dualCanvas = ROOT.TCanvas("DualCanvas", "DualCanvas", 800, 512)
68leftPad = dualCanvas.cd(1)
74jpsi.Draw("HistP")
75dualCanvas.SaveAs("df014_jpsi.png")
76
77print("Saved figures to df014_*.png")
ROOT::Detail::TRangeCast< T, true > TRangeDynCast
TRangeDynCast is an adapter class that allows the typed iteration through a TCollection.