[proposal] TTree access

I’ve never understood why accessing a TTree is so intricate.

Now for reading a tree:

int one;
float two;
TTree t;
t.SetBranchAddress("Branch1", &one);
t.SetBranchAddress("Branch2", &two);
for(i=0;..............)
    t.GetEntry(i);

I don’t like it. Better would be:

for(i=0;...............) {
   t.Get("Branch1")[i];
   t.Get("Branch2")[i];
}

And why not use iterators? Or iterators that can filter data (like generators in Python)?

TCut cut = "Branch1 > 10 && Branch2 < 100";
TTree t;
i = t.GetIterator(cut);
for (i=t.begin();i!=t.end();++i)
{
  cout << i.Get("Branch1"); // printed only when the cut condition is true
  cout << i.Get("Branch2");
}

Or maybe you could use a function object instead of a TCut, like:

struct myFilter{
 bool operator()(all branches in a struct) { return condition; }
};

And (improbably) why not implement a syntax like SQL? Or why not implement a function t.toSQL() that converts a TTree to a more standard SQL database?

[quote=“wiso”]I’ve never understood why accessing a TTree is so intricate.

I don’t like it. Better would be: for(i=0;...............) { t.Get("Branch1")[i]; t.Get("Branch2")[i]; } [/quote]

You can do things like this if you really want to, but there is a reason it’s not done that way. When you are looping over millions of events, the cost of calling a function like ‘Get’ and the [i] operator for every access would be enough to slow the process down significantly. It is much more efficient to tell the tree where to put each value, then within the loop do a single ‘GetEntry’, which fetches all of the values into memory in the most efficient way possible. After that, you just use the variable wherever you need it: no function calls or lookups each time you want to do something with the value in Branch1.

You can do things like what you want to do, anyway, but a) they are not the most efficient and b) they are not very elegant.

for ( Long64_t i... ) {
    t.GetEntry(i);
    t.GetLeaf("Branch1")->GetValue();
}

In short, there are multiple parts to it: reading the data from the tree into memory (which is done in GetEntry), then fetching the value from the memory address (GetValue()). Alternatively, you can just tell the tree beforehand where to put the value (with SetBranchAddress) and use GetEntry(). No messing around with fetching values from the relevant place, just physics logic.
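
To make the two steps concrete, here is a minimal sketch in plain Python 3. It is not ROOT code: `ToyTree`, `set_branch_address` and `get_entry` are made-up names that only mimic the idea of binding a buffer once and then refreshing it per entry, and the one-element `array` buffers stand in for the `&one` and `&two` addresses from the first post.

```python
from array import array


class ToyTree:
    """Hypothetical stand-in for a tree: columns of numbers keyed by branch name."""

    def __init__(self, columns):
        self._columns = columns   # dict: branch name -> list of values
        self._bound = []          # (column, buffer) pairs registered by the user

    def set_branch_address(self, name, buffer):
        # The lookup by name happens once, here, not once per entry.
        self._bound.append((self._columns[name], buffer))

    def get_entry(self, i):
        # Copy entry i of every bound branch into its buffer.
        for column, buffer in self._bound:
            buffer[0] = column[i]

    def __len__(self):
        return len(next(iter(self._columns.values())))


tree = ToyTree({"Branch1": [5.0, 20.0, 15.0], "Branch2": [1.0, 2.0, 3.0]})

one = array("d", [0.0])   # one-element buffers play the role of &one and &two
two = array("d", [0.0])
tree.set_branch_address("Branch1", one)
tree.set_branch_address("Branch2", two)

total = 0.0
for i in range(len(tree)):
    tree.get_entry(i)           # one call refreshes every bound buffer
    total += one[0] * two[0]    # plain buffer access, no lookup by branch name
```

The loop body never mentions a branch name, which is the point being made above: the string lookup is paid once at binding time rather than once per access.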

[quote]And why not use iterators? Or iterators that can filter data (like generators in Python)?

TCut cut = "Branch1 > 10 && Branch2 < 100";
TTree t;
i = t.GetIterator(cut);
for (i=t.begin();i!=t.end();++i)
{
  cout << i.Get("Branch1"); // printed only when the cut condition is true
  cout << i.Get("Branch2");
}

[/quote]

Again, speed and elegance. You can do what you want to do, though:

Consider this snippet, which just loops over the events that match your cut:

[code]
TTree t…;

Double_t Branch1;
t.SetBranchAddress( "Branch1", &Branch1 );

// Fill a TEventList named "MyEventList" with the entries that pass the cut
t.Draw( ">>MyEventList", "Branch1 > 10 && Branch2 < 100" );
TEventList *MyEventList;
gDirectory->GetObject( "MyEventList", MyEventList );

// Read only the selected entries
for ( Long64_t i = 0; i < MyEventList->GetN(); i++ )
{
    t.GetEntry( MyEventList->GetEntry( i ) );
    cout << Branch1 << endl;
}
[/code]
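
The same two-step pattern (first collect the numbers of the entries that pass the cut, then read only those) can be sketched with a Python generator, which is close to what the original question asked for. This is plain Python 3 over a list of dictionaries, not ROOT; `select_entries`, `iter_selected` and the sample rows are invented for illustration.

```python
def select_entries(rows, cut):
    """Step 1: collect the indices of the rows that pass the cut."""
    return [i for i, row in enumerate(rows) if cut(row)]


def iter_selected(rows, entry_list):
    """Step 2: lazily yield only the selected rows, one at a time."""
    for i in entry_list:
        yield rows[i]


# Invented sample data standing in for the entries of a tree.
rows = [
    {"Branch1": 5.0, "Branch2": 50.0},
    {"Branch1": 20.0, "Branch2": 99.0},
    {"Branch1": 30.0, "Branch2": 150.0},
    {"Branch1": 11.0, "Branch2": 10.0},
]


def cut(row):
    return row["Branch1"] > 10 and row["Branch2"] < 100


entry_list = select_entries(rows, cut)
selected = [row["Branch1"] for row in iter_selected(rows, entry_list)]
```

Keeping the entry list separate from the loop means the selection can be reused for several passes over the data without re-evaluating the cut.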

This is a matter of taste and just the way the framework evolved. You could probably find a way of doing things like that if you really wanted to, but you would be turning your project into a programming exercise rather than a physics (or whatever else you are doing) one. (That is, if it’s not already possible to do exactly this.)

Why would one want the complexities associated with a fully SQL-compatible database? You just need efficient access to numbers. That’s what a TTree is about. If you want a relational database and the associated detritus, then find some other way of doing what you want. More likely, on the rare occasion you need a feature of an SQL database, you would be able to implement it easily with what is available in ROOT anyway.

IANARD (I am not a ROOT developer). I am not speaking on behalf of them, or anything like that. But in my experience, things are the way they are for a good reason, especially in cases like this. ROOT can be a bit complex to get your head around sometimes, but if you want to do something efficiently, there is usually a good way to do it - and it will usually be faster for you to put up with the way it is done than to try to reinvent the wheel.

If you like Python, try PyROOT. It is easier to implement some constructs which are like what you are after - but you will always take a big speed hit when doing things like looping over events.

Thanks, Peter, for your nice reply. You saved me a big headache :)

Rene

Instead of SetBranchAddress and GetEntry, wouldn’t you prefer this:

TTree t;
t::Structure s = t.Get[i];

where t::Structure is a struct that contains all the variables in t, or it could be a std::map of pointers.

I don’t think SetBranchAddress is very elegant… or safe.

Iterators are very elegant.

OK, this is better, but

gDirectory->GetObject("MyEventList", MyEventList);

is not elegant. In my opinion, better would be:

TEventList MyEventList = t.Cut("Branch1 > 10 && Branch2 < 100");

I think you should use function objects instead of TCut because:

  1. TCuts are strings (TString, not std::string)! They are not safe and they are not checked at compile time: if you mistype a variable name, what happens? This is a common problem in ROOT; a lot of functions take string arguments, like TH1F::Draw("same"). What is "same"? Better would be TH1F::Draw(kdraw_same), where kdraw_same is a constant of some type, for example TDrawOption. Right now there is the Option_t type, but in fact it’s a string. As someone said: “A set of class enums or, better, a config object (or collection thereof) would be much safer.”
  2. Functional programming is very elegant (code less, do more).

Using function objects is a way to not reinvent the wheel! Do you know the C++ Algorithms library? If TTree and other classes were based on STL containers, you could use these algorithms on them to do a lot of things, first of all to select the events that pass a test (a function object) using cppreference.com/cppalgorith … py_if.html or similar. Sorry, but I think that now you are the ones reinventing the wheel, and if you continue this way, it’s the users who will have a big headache.
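
As a rough illustration of the function-object idea, here is a Python 3 sketch that pairs a callable class with the built-in `filter`, playing the role that a functor and a copy-if style algorithm would play in C++. The `Event` record and the `InWindow` and `Mistyped` classes are invented for this example. Note one difference from the compile-time checking argued for above: Python only catches the mistyped field when the predicate is first evaluated, but it does fail loudly instead of passing silently.

```python
from collections import namedtuple

# Invented record type standing in for "all branches in a struct".
Event = namedtuple("Event", ["branch1", "branch2"])


class InWindow:
    """Function object: configured once, then called once per event."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __call__(self, event):
        return event.branch1 > self.low and event.branch2 < self.high


events = [Event(5.0, 50.0), Event(20.0, 99.0), Event(30.0, 150.0)]

# A generic algorithm (filter) works with any callable predicate.
passed = list(filter(InWindow(10, 100), events))


class Mistyped:
    """Same idea, but with a typo in the field name."""

    def __call__(self, event):
        return event.brach1 > 10   # deliberate typo: 'brach1'


try:
    list(filter(Mistyped(), events))
    typo_detected = False
except AttributeError:
    typo_detected = True
```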

Let me start off by saying:

There are many, many ways to skin a cat.

It turns out you have selected a preference for doing things in a very particular way. One thing that you should be aware of is that you have formed your preferences with a lot of hindsight. ROOT has been in development since the mid-1990s. GCC was at version 2.*, and other compilers of the time weren’t so hot, either. There was little support for things like templates, for example. Python was still stuck at version 1.*, and didn’t have generators. The landscape was very different.

I agree, things could be better, if we were living in an ideal world. Unfortunately, we are not in an ideal world, and software evolves. ROOT has been evolving for a long time. More and more features have been added, and it is apparent that some of the design decisions weren’t the best in the end - but maybe they were at least in the beginning.

If you’re saying it should be changed now then I think you are neglecting the amount of effort it would take, not just for the ROOT developers to change ROOT, but also for anyone using ROOT, to revamp their programs to use enums instead of strings, as per your example. (not to mention the re-learning curve! no one would switch - you should see how unwilling people are to switch between different ROOT releases as it is…)

Another thing to consider is that not all people using ROOT are particularly proficient programmers, so the more concepts you can hide away the better. (for example: a string is easier to conceptualise than bitwise-or’ing together enums). Also using a string for options allows one to write software that will work with different versions of ROOT - if a version of ROOT doesn’t support some draw option, it gets ignored, for example. If you did it with an enum, it would not compile with different versions of ROOT without being changed.
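
The trade-off between the two option styles can be sketched in Python 3. Everything here is hypothetical: `DrawOption`, `KNOWN` and `parse_options` are invented names, space-separated tokens are a simplification, and nothing below reflects how ROOT actually parses its option strings. The sketch only shows the behaviour described above: an unknown string token is dropped quietly, while an unknown enum member is an immediate error.

```python
from enum import Flag, auto


class DrawOption(Flag):   # hypothetical option set, not a ROOT type
    SAME = auto()
    ERRORS = auto()


# Tokens this (imaginary) version of the library understands.
KNOWN = {"same": DrawOption.SAME, "e": DrawOption.ERRORS}


def parse_options(text):
    """String style: unknown tokens are silently dropped."""
    result = DrawOption(0)
    for token in text.lower().split():
        if token in KNOWN:
            result |= KNOWN[token]
    return result


# "colz" is not known to this imaginary version, so it is ignored.
from_string = parse_options("same colz")

# Enum style: options are combined with bitwise OR.
from_enum = DrawOption.SAME | DrawOption.ERRORS

# Enum style: asking for a member that does not exist fails immediately.
try:
    DrawOption.COLZ
    unknown_member_fails = False
except AttributeError:
    unknown_member_fails = True
```

Which behaviour is preferable depends on whether silent forward compatibility or early error detection matters more, which is exactly the disagreement in this thread.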

It’s unfortunate that the standard library has never been very portable. Also, I pointed out that I am not a ROOT developer, but I think they have done a good job considering. For the moment, this is the best we have to work with. If you think you can do better, be my guest. ROOT is even open source, so you could fork.

[quote=“pwaller”]Let me start off by saying:

software evolves. ROOT has been evolving for a long time. More and more features have been added
[/quote]
This is one problem. I think that ROOT has too many features; the developers should stop introducing new features and instead re-code some things and re-think how some things are done.

I think that you can change a lot of things without backward compatibility problems. You can introduce new things (for example function objects instead of TCut) and say: TCuts are deprecated, use function objects. Old programs will still work. You can introduce a namespace and say: TH1F is deprecated, use ROOT::TH1F, and define:

#define TH1F ROOT::TH1F

or something similar.

[quote]Another thing to consider is that not all people using ROOT are particularly proficient programmers[/quote]

And if you don’t show them what a good framework looks like, they don’t learn, and they produce bad programs: they use strings to pass arguments to their functions…

[quote=“wiso”]I think that you can change a lot of things without backward compatibility problems. You can introduce new things (for example function objects instead of TCut) and say: TCuts are deprecated, use function objects. Old programs will still work. You can introduce a namespace and say: TH1F is deprecated, use ROOT::TH1F, and define:

#define TH1F ROOT::TH1F

or something similar.[/quote]

Then you have more bloat. You would have double the number of interfaces to everything. Think of the poor confused programmers who would now have to navigate double the number of possible function calls and class names.

I really think if you want something as radically new as you are proposing, it would have to be a whole new framework, and a new community to use it.

One thing I’ve maybe not made clear: I agree on several points. I think things could be ‘nicer’ in many respects. But to reiterate: this is what we have. It works well.

I seriously recommend you take a look at PyROOT by the way. An example of some things you can do:

[code]from ROOT import TFile

def GetKeyNames( self ):
    return [key.GetName() for name in MyFile.GetListOfKeys()]
TFile.GetKeyNames = GetKeyNames

MyFile = TFile( "file.root" )
MyFile.MyHistogram.Draw()

print "Keys in file:", MyFile.GetKeyNames()

for event in MyFile.MyTuple:
    print event.Branch1[/code]

The GetKeyNames thing is a bit of a lame example. I have typically used it to add things like ‘MultiDraw’ capability to TTree, so that it can draw many histograms in one loop over events in a tree - but it’s always necessary to do the grunt work in C++. This is easy since with PyROOT you can import names defined in macros you have loaded.

import ROOT
ROOT.gROOT.ProcessLine(".L MyCxxFunction.cxx+")
from ROOT import MyCxxFunction
MyCxxFunction( "Hello, world" )

I have a radical idea in my head that some new interfaces could be implemented in Python at some point in the future. I think there are drawbacks with doing this though.

Bear in mind that whatever framework you use, it will never be perfect from your own perspective, since we all have different ideas of how things should work. Different ways of doing things have different pros and cons, as we have discussed with the string-to-pass-an-option example. At some point you just have to use what you have available and what works.

[quote=“pwaller”]
An example of some things you can do:

[code]from ROOT import TFile

def GetKeyNames( self ):
    return [key.GetName() for name in MyFile.GetListOfKeys()]
TFile.GetKeyNames = GetKeyNames

MyFile = TFile( "file.root" )
MyFile.MyHistogram.Draw()

print "Keys in file:", MyFile.GetKeyNames()

for event in MyFile.MyTuple:
    print event.Branch1[/code]

The GetKeyNames thing is a bit of a lame example. I have typically used it to add things like ‘MultiDraw’ capability to TTree, so that it can draw many histograms in one loop over events in a tree - but it’s always necessary to do the grunt work in C++. [/quote]

Your script can’t be correct. This is my version:



def GetKeyNames( self ):
    return [name.GetName() for name in self.GetListOfKeys()]

TFile.GetKeyNames = GetKeyNames
MyFile = TFile( "file.root" )


#MyFile.MyHistogram.Draw()

print "Keys in file:", MyFile.GetKeyNames()

for event in MyFile.GetKeyNames():
    print event.Branch1

but I don’t understand the last two lines…

What do you mean by ‘can’t be correct’?

I didn’t test that exact script, but I do things like the ones I showed you all the time.

for event in tuple: print event.pt

does work (it loops over all events in ‘tuple’, event.whatever is the value of a variable ‘whatever’ in ‘tuple’.),

as does TFileObject.ObjectName - it returns the object named ‘ObjectName’ in file.

Cheers,

  • Pete

Ah, and yes. You did correct a couple of mistakes I didn’t see with the MyFile/self thing.

I didn’t look closely enough.

Cheers,

  • Pete

there are three main problems:

MyFile.MyTuple doesn’t exist
event.Branch1 doesn’t exist
MyFile.GetKeyNames() is not iterable

[quote=“wiso”]there are three main problems:

MyFile.MyTuple doesn’t exist
event.Branch1 doesn’t exist
MyFile.GetKeyNames() is not iterable[/quote]

These were placeholder names. If you had a tree called ‘MyTuple’, with a branch called Branch1, then this would work. I don’t see why the last one wouldn’t work though. GetKeyNames returns a list, which is iterable.

def GetKeyNames( self ):
    return [name.GetName() for name in self.GetListOfKeys()]

self is a TFile; the function TFile::GetListOfKeys returns a TList*, and TList::GetName returns a const char*, which is not a list.

[quote]def GetKeyNames( self ):
    return [name.GetName() for name in self.GetListOfKeys()]

self is a TFile; the function TFile::GetListOfKeys returns a TList*, and TList::GetName returns a const char*, which is not a list.[/quote]

TList is iterable.

This is a list comprehension:

[quote]>>> print [a**2 for a in xrange( 10 )]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[/quote]
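
The point of disagreement can be reproduced without ROOT. In the Python 3 sketch below, `FakeKey`, `FakeList` and `FakeFile` are invented stand-ins for the key objects, TList and TFile: each `GetName()` call returns a single string, yet the list comprehension over the iterable container still produces a list. The last lines also show a method being attached to a class after its definition, as in the GetKeyNames example.

```python
class FakeKey:
    """Invented stand-in for one key object in a file."""

    def __init__(self, name):
        self._name = name

    def GetName(self):
        return self._name   # a single string, not a list


class FakeList:
    """Iterable container, playing the role TList plays in the thread."""

    def __init__(self, keys):
        self._keys = keys

    def __iter__(self):
        return iter(self._keys)


class FakeFile:
    """Invented stand-in for a file holding named objects."""

    def __init__(self, names):
        self._list = FakeList([FakeKey(n) for n in names])

    def GetListOfKeys(self):
        return self._list


def GetKeyNames(self):
    # One string per key, gathered into a new list by the comprehension.
    return [key.GetName() for key in self.GetListOfKeys()]


FakeFile.GetKeyNames = GetKeyNames   # attach the method after the class exists

names = FakeFile(["MyHistogram", "MyTuple"]).GetKeyNames()
```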