This function provides an example of how one might massage a CSV data file so that it can be read into a ROOT TTree via TTree::ReadStream.
This could be useful if the data read out from some DAQ program doesn't quite match the formatting expected by ROOT (e.g. comma-separated, tab-separated with white-space strings, headers not matching the expected format, etc.).
This example is shipped with a data file that looks like:
Date/Time Synchro Capacity Temp.Cold Head Temp. Electrode HV Supply Voltage Electrode 1 Electrode 2 Electrode 3 Electrode 4
# Example data to read out. Some data have oddities that might need to
# dealt with, including the 'NaN' in Electrode 4 and the empty string in Date/Time (last row)
08112010.160622 7 5.719000E-10 8.790500 24.237700 -0.008332 0 0 0 0
8112010.160626 7 5.710000E-10 8.828400 24.237500 -0.008818 0 0 0 0
08112010.160626 7 5.719000E-10 8.828400 24.237500 -0.008818 0 0 0 0
08112010.160627 7 5.719000E-10 9.014300 24.237400 -0.028564 0 0 0 NaN
08112010.160627 7 5.711000E-10 8.786000 24.237400 -0.008818 0 0 0 0
08112010.160628 7 5.702000E-10 8.786000 24.237400 -0.009141 0 0 0 0
08112010.160633 7 5.710000E-10 9.016200 24.237200 -0.008818 0 0 0 0
7 5.710000E-10 8.903400 24.237200 -0.008818 0 0 0 0
These data require some massaging, including:
- Date/Time has a blank ('') entry that must be handled
- The headers are not in the branch descriptor format that ROOT expects (e.g. FloatData/F:StringData/C:IntData/I)
- Tab-separated entries with additional white space
- NaN entries
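The clean-up steps listed above can be tried in isolation, without ROOT. The following is a minimal sketch rather than the tutorial's own code: `clean_lines` is a hypothetical helper name, the sample rows are shortened to four columns for illustration, and the 'empty' placeholder and the 0.0 stand-in for NaN simply mirror the choices made in the script on this page.

```python
def clean_lines(lines, placeholder="empty", nan_value="0.0"):
    """Return cleaned, tab-separated data rows (hypothetical helper)."""
    cleaned = []
    for line in lines[1:]:            # skip the header row
        if line.startswith("#"):      # skip comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        # A blank field, or one containing a space, would be mis-split by a
        # reader that treats all white space alike, so tag it explicitly.
        fields = [f if (f != "" and " " not in f) else placeholder
                  for f in fields]
        # Replace NaN markers with a numeric stand-in.
        fields = [nan_value if f == "NaN" else f for f in fields]
        cleaned.append("\t".join(fields))
    return cleaned


# Shortened sample: header, comment, a row with NaN, a row with a blank date.
sample = [
    "Date/Time\tSynchro\tCapacity\tElectrode 4\n",
    "# a comment line\n",
    "08112010.160627\t7\t5.719000E-10\tNaN\n",
    "\t7\t5.710000E-10\t0\n",
]
rows = clean_lines(sample)
for row in rows:
    print(repr(row))
```

Whether 0.0 is an acceptable substitute for NaN depends on the analysis; a sentinel value outside the physical range may be safer if zeros are meaningful in the data.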
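import ROOT
import sys
import os


def parse_CSV_file_with_TTree_ReadStream(tree_name, afile):
    ROOT.gROOT.SetBatch()

    # Map each header name in the data file to a (branch name, Python type) pair.
    header_mapping_dictionary = {
        'Date/Time'         : ('Datetime'       , str) ,
        'Synchro'           : ('Synchro'        , int) ,
        'Capacity'          : ('Capacitance'    , float) ,
        'Temp.Cold Head'    : ('TempColdHead'   , float) ,
        'Temp. Electrode'   : ('TempElectrode'  , float) ,
        'HV Supply Voltage' : ('HVSupplyVoltage', float) ,
        'Electrode 1'       : ('Electrode1'     , int) ,
        'Electrode 2'       : ('Electrode2'     , int) ,
        'Electrode 3'       : ('Electrode3'     , int) ,
        'Electrode 4'       : ('Electrode4'     , int) ,
    }

    # Map each Python type to the corresponding ROOT leaf type code.
    type_mapping_dictionary = {
        str   : 'C',
        int   : 'I',
        float : 'F'
    }

    # Read the file once; the first line is the tab-separated header row.
    with open(afile) as input_file:
        file_lines = input_file.readlines()
    header_row = file_lines[0].strip().split('\t')

    # Build the branch descriptor, e.g. 'Datetime/C:Synchro/I:...'.
    branch_descriptor = ':'.join([header_mapping_dictionary[row][0]+'/'+
                           type_mapping_dictionary[header_mapping_dictionary[row][1]]
                           for row in header_row])

    # Use the base name of the input file for the ROOT output file.
    output_ROOT_file_name = os.path.splitext(afile)[0] + '.root'
    output_file = ROOT.TFile(output_ROOT_file_name, 'recreate')
    print("Outputting %s -> %s" % (afile, output_ROOT_file_name))

    output_tree = ROOT.TTree(tree_name, tree_name)

    # Drop the header row and any line beginning with '#'. Tag blank fields
    # and fields containing spaces as 'empty', since ROOT does not distinguish
    # between different kinds of white space.
    file_lines = ['\t'.join([val if (val.find(' ') == -1 and val != '')
                             else 'empty' for val in line.split('\t')])
                  for line in file_lines[1:] if line[0] != '#']

    # Join the rows into one string and replace NaN entries with 0.0.
    file_as_string = ('\n'.join(file_lines)).replace('NaN', str(0.0))

    # Wrap the string in an istringstream and let ReadStream fill the tree.
    istring = ROOT.istringstream(file_as_string)
    output_tree.ReadStream(istring, branch_descriptor)

    output_file.cd()
    output_tree.Write()


if __name__ == '__main__':
    if len(sys.argv) < 2:
        print("Usage: %s file_to_parse.dat" % sys.argv[0])
        sys.exit(1)
    # The tree name is arbitrary; any valid name can be passed here.
    parse_CSV_file_with_TTree_ReadStream("example_tree", sys.argv[1])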
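The branch descriptor that the script above derives from the header row can be checked without ROOT. This is a minimal sketch under stated assumptions: `header_mapping`, `type_codes` and `header_line` are illustrative names, and `header_line` assumes the header of the shipped data file is tab-separated exactly as shown on this page.

```python
# Header name -> (branch name, Python type), copied from the script above.
header_mapping = {
    'Date/Time':         ('Datetime',        str),
    'Synchro':           ('Synchro',         int),
    'Capacity':          ('Capacitance',     float),
    'Temp.Cold Head':    ('TempColdHead',    float),
    'Temp. Electrode':   ('TempElectrode',   float),
    'HV Supply Voltage': ('HVSupplyVoltage', float),
    'Electrode 1':       ('Electrode1',      int),
    'Electrode 2':       ('Electrode2',      int),
    'Electrode 3':       ('Electrode3',      int),
    'Electrode 4':       ('Electrode4',      int),
}

# Python type -> ROOT leaf type code, copied from the script above.
type_codes = {str: 'C', int: 'I', float: 'F'}

# Assumed header line of the data file (tab-separated).
header_line = ("Date/Time\tSynchro\tCapacity\tTemp.Cold Head\tTemp. Electrode\t"
               "HV Supply Voltage\tElectrode 1\tElectrode 2\tElectrode 3\t"
               "Electrode 4\n")

header_row = header_line.strip().split('\t')

# Join 'BranchName/TypeCode' items with ':' to form the descriptor.
branch_descriptor = ':'.join(
    '%s/%s' % (header_mapping[name][0], type_codes[header_mapping[name][1]])
    for name in header_row
)
print(branch_descriptor)
```

A header name that is missing from the mapping raises a KeyError here, which is a convenient early warning that the data file layout has changed.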
- Author: Michael Marino
Definition in file parse_CSV_file_with_TTree_ReadStream.py.