ROOT6 and Backward Compatibility

Hi everyone, dear Matt!

Matt Walker has posted an extensive review of ROOT and what he would hope the future of ROOT to be. Because I think many of his comments are good ones, and because I have heard some of them from several people in the past, I decided to give the answer to an audience that's a little bit wider, in a dedicated post.

Backward Compatibility with CINT

We will discontinue support for using '.' and '->' interchangeably. Note that references have the identical "performance issue" as pointer derefs and use '.' and that '.' often results in a memory load, too, so I don't take performance differences as an argument :-) But we want to encourage proper C++. We do provide most of the interpreter extensions, but they can be turned off.
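A minimal sketch of what "proper C++" means here, using a hypothetical `Counter` struct (an illustration, not a ROOT class): a reference is accessed with '.', a pointer with '->'. A standard C++ compiler rejects the mismatched spellings noted in the comments.

```cpp
#include <cassert>

// Hypothetical type for illustration only; not part of ROOT.
struct Counter {
  int entries;
};

// Access through a reference uses '.'.
inline int entriesByRef(const Counter& c) {
  return c.entries;   // "c->entries" would not compile here
}

// Access through a pointer uses '->'.
inline int entriesByPtr(const Counter* c) {
  assert(c != nullptr);
  return c->entries;  // "c.entries" would not compile here
}
```

Both functions read the same member; only the access operator differs with the type of the handle.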

Class Hierarchy

A pet peeve of many, where it's not even clear how much your average novice physicist really cares. None of the real - in contrast to "I'd rather want to do computing"-physicists like me :-) - physicists I talked to ever saw that as an issue. That said: there is a Jira task assigned to it; Lorenzo will investigate it. If you have suggestions don't hesitate to comment on the Jira ticket!

Templates

ROOT was limited in the use of templates due to the way CINT called functions. Now that we have a just-in-time compiler we will be able to instantiate templates at runtime, and call their functions as needed. That's e.g. the main reason why the TTreeReader only really works well in ROOT 6. We will migrate interfaces where it's useful, but due to the lack of documentability of possible template arguments (where concepts would help), novices usually find templates repelling compared to traditional classes.

C++11

There are several parts to it: ROOT needs to be able to parse C++11 code, and ROOT 6 will be able to do that out of the box when built with C++11 turned on. The next part is the accessibility of C++11 features through its reflection interfaces (e.g. TClass) - that will be done after ROOT 6. And the last part of C++11 support is adding features to ROOT's I/O where needed. Oh, and by the way: we'd love to use C++11 also in the ROOT interfaces (where it makes them more readable / compact / performant), but we have to wait until all experiments have migrated as you cannot mix and match C++ 2003 and 11. And yes, I agree that C++11 is a new language :-)

FFT

Please open Jira tickets to discuss design issues. I would have copied and pasted your comments into one, but you might want to reformat / restructure and explain some parts of your comments on that; at least I didn't want to preempt that.

Re-implementing ROOT

You won't believe how often we hit a situation where we wish we could make existing interfaces non-existent or unused. But as soon as we have users of code we try to not change the interfaces anymore, unless we have a very good reason for it. Otherwise the experiments' update to new ROOT versions would eat up lots of manpower - and their job is physics, not beauty of coding. Physics doesn't change a bit by adding templates. We don't want to have 20000 angry physicists (minus Matt) in our corridor, protesting because we broke all their code ("but your new code will be much nicer!"). That would be an epic failure in addressing customer needs.

So what we need is an alternative program, something that comes after ROOT. Something that challenges ROOT - not by a wish list, but by an alternative implementation that's powerful and draws users. Until then: please do keep your comments flowing; we hear them and try to address them!

Cheers,
Axel

Templates are not the solution

Dear all,

Since Axel keeps getting these responses that templates (as a "modern programming tool") will help solve the design issues of ROOT, I feel the need to support Axel's non-enthusiastic responses. Axel is more polite than I am, so he wrote only "novices usually find templates repelling compared to traditional classes." I worked as a professional programmer twenty years ago, so I may be old-fashioned, but I'm probably not a novice. Yet I too find templates repelling for exactly the reason Axel points out: the error messages are completely broken.

Error messages are crucial, not a "nice-to-have for programmers worse than I". Yes, with very, very careful reading and some experience physicists can decode most template errors. But when you have the physics problem you're trying to solve in the back of your mind, having to delve so deeply into coding is a major distraction. Concepts may help, but we're not there yet. As bad as the interface ROOT presents to the programmer is (my pet example is TChain and TTree), templates can make it so much worse.

None of this is new. The informative, somewhat outdated and terribly outspoken (to the point of alienating most prospective US readers) C++ FQA gives some interesting examples from the STL programmers and programs that exist to help "novices" deal with the STL templates. You can take a look at http://yosefk.com/c++fqa/templates.html#fqa-35.17. But be warned, the rants are particularly vitriolic when it comes to templates, and without the comprehensive review of the weaknesses of C++ (http://yosefk.com/c++fqa/defective.html) some of the rants won't make sense.

BTW: as for pyroot being an unfair introduction to Python, I sort of agree with that. But personally, I think Python is so useful that even this unfair introduction will convert most analyzers to Python, and then their natural growth will lead them to learn more of native Python. Of course, pyroot itself is simply brilliant. The complaint is about the ROOT packages, the unavoidable object ownership problems, etc.

Regards,
Amnon Harel

Of course one shouldn't use

Of course one shouldn't use templates for templates' sake. However, there are many problems which can be solved elegantly using templates. The argument about hard-to-read error messages concerns gcc. Since ROOT 6 is based on LLVM technology, this won't be an issue anyway.

Give us some credit

I find your dismissal of templates and a logical class hierarchy a bit concerning: in both cases you make reference to the needs of "novice" physicists, implying that a well designed framework isn't a concern for "real physicists". I can't overemphasize how out-of-touch this claim is with the daily toil of every particle physicist I know (at least those who write code). This isn't a question of "beauty of coding" or writing "nice" code, a badly designed framework means lost productivity.

As you say, many physicists prefer to focus more on physics and less on class hierarchy, but that misses the point: C++ was designed to solve problems, not to be beautiful. Templates and inheritance exist because they make coding easier and less buggy. They are absolutely central to C++, and the language continues to evolve under the assumption that they will be used.

Leaving such central pieces out of C++ is like leaving verbs out of the English language: it might be fun for a while, and it may even make the language easier for novices, but in the end it's just frustrating and impractical. We shouldn't be blundering around with a hobbled data analysis framework just because software isn't our end-goal, and while it's frustrating to hear physicists dismiss good coding practices, it's bewildering and slightly disheartening to hear that dismissal from a ROOT developer.

Furthermore, while I understand that incorporating contemporary software design into the framework is difficult, I think you give physicists far too little credit: we are a clever bunch, if an average software engineer can figure out how to use a template I think we can too. Give us a framework that has templates and a sensible class hierarchy and we'll use it. Unfortunately, you've given physicists a language that can't handle templates, and the framework overall looks like a straw-man argument against object oriented code*. It's no surprise that your average novice physicist doesn't care about these things, the examples they've seen are a disaster.

That being said, I'm quite intrigued by the last point: it sounds like a replacement for ROOT is on the horizon. Given that the younger generation is increasingly turning to non-ROOT libraries and languages (from simple things like stl, boost, and python, to new data formats like HDF5 and Protocol Buffers, to frameworks like scipy and matplotlib), I'd tend to agree. But if this is the case, shouldn't the ROOT team be focusing on breaking up the ROOT framework so that the parts can be salvaged? An enormous amount of work went into ROOT, and it doesn't seem logical to flush the entire framework just because it's grown too big to maintain.

*This is to say nothing of PyROOT, which gives an unfair introduction to Python. I'm constantly amazed by how many physicists assume that ROOT bugs and segfaults are a shortcoming of the python language. Outside PyROOT, a python module that causes a segmentation fault is universally considered defective: you can't make normal python segfault, and yet somehow PyROOT has managed to make infinite loops of segfaults common.

Re: PyROOT

Hi Code Monkey,

"you can't make normal python segfault" ... well, not to put too fine a point on it, but yes you can. You just have to push it around a little bit rougher than most users do. For example:

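  import sys
  # Raise the recursion guard far beyond what the C stack can hold.
  sys.setrecursionlimit(1 << 30)
  # A function that calls its argument on itself: unbounded recursion.
  f = lambda f:f(f)
  if __name__ == '__main__':
    f(f)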

And a problem like this exists by design: it will never be fixed and crashes python3 just as simply. Of course, most users will not run with a recursion limit that high, but it does prove that if you want to make the python interpreter crash, you can. :)

On a more serious note, PyROOT is not an extension module, it is a language binding: it exposes C++ and sometimes automatic choices are no choices and bring C++ features with all crashable details into Python. The proper comparison is with ctypes, which makes it even easier (much easier) to create crashes, and that module is in the standard library!

If you want users to start out with a cleaner, more pythonistic, Python introduction to ROOT, point them to rootpy.org, as the goal of that project is precisely that.

Cheers,
Wim 

Re: Credit

Hi Monkey

Thanks for your comment!

I am not against templates (in general), and I am not against a well designed class hierarchy. Neither would make any sense. I am against code breaking changes for the sake of beauty.

I have one issue with templates: it's currently (i.e. without concepts) impossible to document what types are appropriate as template parameters. In ROOT 5, templates actually mean an inflation of object code: all functions must be instantiated because they might be called through the interpreter. On the other hand a wider use of templates

  • improves code re-use and thus reduces bugs
  • can increase code readability (because of generic / meta programming)
  • reduces ambiguities in object ownership

Especially the last one is super crucial, especially for ROOT with its excessive use of raw pointers, and especially for novices. As a side note, in the C++ committee there are efforts to encapsulate all raw pointers in templates that describe the concept of the pointer use: raw memory address? something that can be iterated on? an array? an element of an array? a string? something that someone else owns? a resource I own? an optional value? an error code? - to name a few :-)
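The ownership point can be sketched with standard C++11 smart pointers. `Histogram`, `makeHistogram`, and `fill` are hypothetical names chosen for illustration, not ROOT interfaces. The return type states that the caller owns the object; the reference parameter states that the function only borrows it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a histogram class; not a ROOT type.
struct Histogram {
  std::vector<double> bins;
  explicit Histogram(std::size_t n) : bins(n, 0.0) {}
};

// Returning std::unique_ptr states that the caller owns the result.
inline std::unique_ptr<Histogram> makeHistogram(std::size_t nBins) {
  return std::unique_ptr<Histogram>(new Histogram(nBins));
}

// Taking a reference states that the function only borrows the object.
inline void fill(Histogram& h, std::size_t bin, double weight) {
  assert(bin < h.bins.size());
  h.bins[bin] += weight;
}
```

A caller would write `auto h = makeHistogram(4); fill(*h, 2, 1.5);` and the histogram is released automatically when `h` goes out of scope, with no question about who deletes it.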

Did that convince you that I like templates? :-) But we have existing code, written by people who want to get physics results. What should we do? We always have to hide the templates if we use them in existing interfaces - which is not a satisfying solution for anyone.

Regarding the serialization libraries you mention: none of them can compare with ROOT's. (And most of them are anything but new!) They offer a reduced feature set at higher throughput, but you can usually do the same with ROOT (turning off features and gaining speed). I/O is a really tricky business: it's easy to make bold statements and amazingly difficult to get it right, all the way, over decades, on petabyte levels.

Once ROOT 6 is out and we can use modern C++, and once the experiments have moved to C++11, I expect that we will polish ROOT's interfaces to bring them into current C++(14?) shape. But we will have to do that in a way that is backward compatible, at least to a large extent. And that's the tricky part.

Cheers, Axel

Template Concepts

Sorry to get in on this so late, but I wanted to point out that you can implement concepts quite easily in C++. Have a look at the boost concept_check library. These concepts use the compiler to check for failures on template instantiation of types which are meant to model the concept. Obviously, doing it with CINT/Cling would be more difficult, but certainly possible.
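For comparison, here is a minimal sketch of the same idea using only the C++11 standard library (`static_assert` plus `<type_traits>`), not Boost.ConceptCheck itself. The `mean` helper is a hypothetical example. The check both documents the requirement on the template parameter and replaces a long instantiation error with one readable message.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper: averages a vector of arithmetic values.
// The static_assert documents the requirement on T and reports a
// violation as a single diagnostic at the point of instantiation.
template <typename T>
double mean(const std::vector<T>& values) {
  static_assert(std::is_arithmetic<T>::value,
                "mean<T>: T must be an arithmetic type (int, double, ...)");
  assert(!values.empty());
  double sum = 0.0;
  for (const T& v : values) sum += static_cast<double>(v);
  return sum / static_cast<double>(values.size());
}
```

Calling `mean` on a `std::vector<std::string>` would fail to compile with the message given in the `static_assert`, instead of an error from deep inside the loop body.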

+1

Hi Axel,

I totally agree with you and I think the roadmap the ROOT team is on is sensible. I use ROOT not in the context of physics but in the context of complex systems simulation, e.g., neural networks. Before I found ROOT I tested numerous frameworks for data storage and analysis - and quite honestly, when it comes to handling gigabytes of simulation data, everything else besides ROOT was simply unusable. The API of ROOT may seem a bit awkward to software engineering "purists", but from a practical point of view I find it extremely efficient to use.

Keep up the good work,
Jochen

Re: +1

Hi Jochen,

Thank you for your nice comments! In principle we know that, but it really feels good to have that confirmed from time to time :-) Feel yourself signed up for the next ROOT workshop, by the way (Winter of 2015 in Saas Fee if all goes according to plan). I am really curious to hear more about your use of ROOT!

Cheers, Axel.