Re: pyroot objects

From: Sebastien Binet <binet_at_cern.ch>
Date: Thu, 11 Aug 2011 10:25:09 +0200


On Wed, 10 Aug 2011 11:07:30 +0200, Yngve Inntjore Levinsen <yngve.inntjore.levinsen_at_cern.ch> wrote:
> Dear developers,
>
> I have been using ROOT in python for a while and I have a couple of
> requests/suggestions. This might very well be a result of my ignorance
> for not knowing python well enough, and/or not taking into account all
> the different scenarios ROOT objects have to account for. I also still
> do not understand the ROOT data structure very well.
>
> Say I have a TFile object, "myfile", which contains a tree "mytree",
> which contains "mybranch1" and "mybranch2". In my opinion based on how
> I believe most python objects are structured, this should then work:
>
> for tree in myfile:
>     print "Tree name:", tree.GetName()
>     for branch in tree:
>         print "Branch name:", branch.GetName()
>
> Which should then output:
> Tree name: mytree
> Branch name: mybranch1
> Branch name: mybranch2
>
> This can be done by defining the __iter__() function for the python
> objects (right?).
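
For the iteration part, here is a minimal sketch of how the loop above can be written as plain helper functions instead of __iter__ overrides. The names iter_trees, iter_branches and describe are hypothetical, not part of PyROOT; the sketch only assumes that GetListOfKeys() and GetListOfBranches() return collections that can be iterated from Python. Note that "for x in tree" in PyROOT already walks over the entries (events) of the tree, so giving it a second meaning (branches) would clash.

```python
def iter_trees(tfile):
    """Yield every TTree stored at the top level of an open TFile."""
    for key in tfile.GetListOfKeys():
        obj = key.ReadObj()  # load the object the key points to
        if obj.InheritsFrom("TTree"):
            yield obj


def iter_branches(tree):
    """Yield the top-level TBranch objects of a TTree."""
    for branch in tree.GetListOfBranches():
        yield branch


def describe(tfile):
    """Return a list of (tree name, [branch names]) pairs for a file."""
    return [(tree.GetName(), [b.GetName() for b in iter_branches(tree)])
            for tree in iter_trees(tfile)]


# Hypothetical usage with a real file (requires ROOT):
#   import ROOT
#   myfile = ROOT.TFile.Open("myfile.root")
#   for tree_name, branch_names in describe(myfile):
#       print("Tree name: %s" % tree_name)
#       for name in branch_names:
#           print("Branch name: %s" % name)
```

If a tree was written in several cycles, the key list can contain it more than once, so a real version may want to skip duplicate names.
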
>
> A second thing I would have loved to see is a built-in function for
> the branches which extracts a numpy array of the data in the branch. I
> often prefer to get the numpy arrays because they behave more like
> python objects which I know how to manipulate (slicing,
> averages...). On the other hand, since the ROOT objects are so focused
> on events, you might not want to promote this kind of usage
> at all? I really don't understand the structure of the data types well
> enough to have a clear idea in this matter. Comments are most welcome!
>
> I attach a small example script which sort of exemplifies my
> request. The iteration part I show with extended classes for TFile and
> TTree, which I consider to be a very easy extension. The
> branch2numpy() function works, but it is dead slow for large amounts
> of data. I could probably improve it by using e.g. Cython, but it would
> be better if a person who knows both ROOT and Python much better than
> I do would write some similar functionality built into
> PyROOT.

Note that for this part (branch2numpy) there is also still the open problem with ndim>1 arrays:
https://savannah.cern.ch/bugs/?62600

As for a more efficient way to get at a column of data over all entries, one can use TTree::Scan and/or TTree::Draw (with the "goff" option) and then access the data through TTree::GetVal. The next step would then be to implement a memoryview (a la PEP 3118) so that this is performed lazily, and perhaps federated so that writing tree.branch1, tree.branch2 crawls through the whole dataset only once instead of twice.
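
A rough sketch of the Draw-with-"goff" approach, under a few assumptions: the function name branch_to_array is hypothetical, the expression yields one value per entry (so it does not address the ndim>1 case above), and the first drawn column is read back with TTree::GetV1. The values always come back as doubles, whatever the branch type is.

```python
from array import array


def branch_to_array(tree, expression):
    """Evaluate `expression` for every entry and return the values as array('d')."""
    # Draw only keeps as many values as the current estimate
    # (1000000 by default), so raise the limit to cover the whole tree.
    tree.SetEstimate(tree.GetEntries() + 1)
    # "goff" turns graphics off; Draw returns the number of selected rows,
    # or a negative number on error.
    n = tree.Draw(expression, "", "goff")
    if n < 0:
        raise ValueError("TTree::Draw failed for %r" % expression)
    buf = tree.GetV1()  # buffer of doubles owned by the tree
    # Copy the values out: the buffer is overwritten by the next Draw call.
    return array("d", (buf[i] for i in range(n)))


# Hypothetical usage (requires ROOT and an existing file):
#   import ROOT
#   myfile = ROOT.TFile.Open("myfile.root")
#   values = branch_to_array(myfile.Get("mytree"), "mybranch1")
```

Since array('d') exposes the buffer interface, numpy.frombuffer(values) gives a numpy view on the result without a second copy. The element-by-element copy in the last line is still a python-level loop, so this is only a starting point, not the lazy memoryview described above.
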

-s

-- 
#########################################
# Dr. Sebastien Binet
# Laboratoire de l'Accelerateur Lineaire
# Universite Paris-Sud XI
# Batiment 200
# 91898 Orsay
#########################################

Received on Thu Aug 11 2011 - 10:25:16 CEST
