Sorry if this appears twice somewhere. It is a reply to a mail from Pasha Murat.

---------- Forwarded message ----------
Date: Wed, 1 Apr 1998 19:37:17 +0200
From: Christoph Borgmeier <borg@hera-b.desy.de>
Newsgroups: cern.root
Subject: Re: again ROOT db (long)

Hi Pasha,

thank you for the remarks. It still seems to me that I did not make myself completely clear, so I'll try to clarify some points. Some of these parts might look like arguing, but that is definitely not what I want. So first of all: I understand and agree with many of your points. If my further explanations and examples seem a bit odd or even sarcastic, it is just my struggle to be understood, and mainly the effect of my limited knowledge of the English language.

On 31 Mar 1998, Pasha Murat wrote:
[...]
> Christoph Borgmeier writes:
> > * all objects of one type must be stored in the same array. That might
> > lead to problems with temporary and semi-temporary objects, i.e. objects
> > which should not be stored or only be stored under certain circumstances.
> >
> Why isn't it possible to have 2 TCloneArray's or TObjArray's ?
> I'm presently working on comparing 2 different pattern recognition
> algorithms and 2 TObjArray's of tracks (with different names!)
> and 2 different arrays of track segments (again - with different
> names) coexist in the code just fine...

Oops, of course it is possible to have more than one array of a certain type. But you need, as you describe, completely disconnected sets of data.

[...]
> > * all `integer pointers' point into the same array. This forbids the use
> > of polymorphism, which is a major advantage of the ROOT system.
> >
> Why isn't it possible for a track object to have one integer data
> member being a number of the primary vertex and another integer
> being a number of the calorimeter tower pointed to by a track?
I think that was exactly my point: I have, for example, an array of primary vertices at the interaction point, and some other vertices, which have slightly different parameter sets and are linked with the tracks from which they are reconstructed. Of course, they share a major part of their functionality, while they are still different types and are stored in different places. This would be an example of polymorphism.

>
> > * the objects pointed to are not defined by language constructs, the
> > relations are not stored explicitly in the DB. So any code reading the DB
> > has to have already built in the additional information about the
> > relations.
> >
> In case of ROOT it is a Streamer function which writes/reads
> an object to/from ROOT file. So code writing the ROOT file already
> has to have built-in knowledge about the things it writes out.
> As the same Streamer function does both reading and writing
> there is nothing wrong with the same "knowledge" to be available
> on the read branch.

Apparently I did not express myself correctly. Let me try again, a little more verbosely: I might have the track class you mention above, with an integer meaning `calorimeter tower'. I write some data into a file and access the database later. Now what does the entry `5' in this field tell me? For example, our reconstruction program has stored clusters in an array called RCAL. Is it the fifth (sixth) entry of that array? Or does it point into our geometry array GEOXXX and denote a certain tower? Maybe one has another array for yet another calorimeter. What I am trying to explain: nothing - neither a part of the C++ language, nor the ROOT run-time type information - tells me that this `5' actually means I have to look up the fifth (sixth) element of the recoCal array, which happens to be a data member of a certain FooEvent class. That was just a pathological example.
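To make the `5' problem concrete, here is a minimal sketch in plain C++ (no ROOT involved). All names in it (Cluster, Tower, Track, IndexInto, TypedTrack and the two helper functions) are hypothetical and invented for illustration only. It shows that a bare int compiles equally well against two unrelated arrays, and one possible way to put the target type into the declaration of the index itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for reconstruction and geometry objects (not ROOT classes).
struct Cluster { double energy; };
struct Tower   { std::string name; };

// A track as discussed above: the calorimeter relation is a bare int.
struct Track {
    int calTower;  // index into ... which array? The type does not say.
};

// One possible alternative (an assumption, not an existing ROOT facility):
// an index that carries the element type it refers to.
template <typename T>
class IndexInto {
public:
    explicit IndexInto(std::size_t i) : fIndex(i) {}
    // Dereferencing needs an array of the matching type;
    // passing an array of another type does not compile.
    const T& In(const std::vector<T>& array) const { return array.at(fIndex); }
    std::size_t Value() const { return fIndex; }
private:
    std::size_t fIndex;
};

struct TypedTrack {
    IndexInto<Cluster> cluster;  // the relation is visible in the declaration
};

// The bare int is accepted by both arrays; nothing flags the wrong lookup.
inline bool AmbiguousLookupCompiles() {
    std::vector<Cluster> rcal(10, Cluster{1.5});
    std::vector<Tower> geo(10, Tower{"tower"});
    Track t{5};
    double e = rcal[t.calTower].energy;     // interpretation 1: a cluster
    std::string n = geo[t.calTower].name;   // interpretation 2: a geometry tower
    return e == 1.5 && n == "tower";
}

// The typed index can only be resolved against an array of clusters.
inline double TypedLookup() {
    std::vector<Cluster> rcal(10, Cluster{0.0});
    rcal[5].energy = 42.0;
    TypedTrack t{IndexInto<Cluster>(5)};
    return t.cluster.In(rcal).energy;  // t.cluster.In(geo) would not compile
}
```

Note the limits of this sketch: the typed index only records the element type, not which array instance of that type is meant, and it says nothing about how such a relation would be written to a file.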
A closely related point is that maybe I still cannot get used to the idea that the identity of an object is directly tied to its position in a certain (global?) array. Note that this is the very classical (ZEBRA) approach, which differs from other possible ways: in C/C++ the identity of an object is given by its location in memory, while an object in a database might be recovered by a key (maybe similar to TNamed). C/C++ pointers obviously have the advantage of being more dynamic: you can `new' and `delete' objects without creating visible holes (like in an array) and without touching the relations of other objects. The latter would become a major problem when `Compress'ing a TClonesArray object which has collected some holes.

[...]
> > ROOT provides part of the necessary power, e.g. the possibility to store
> > canvases with deep polymorphic substructures. But up to now, I failed to
> > store parts of my events in a similar way.
> >
> Here is an important comment: if we consider the requirements to run-time
> representation of the objects and to their persistent representation
> it is easy to see that they are very different. At run-time one needs
> the representation which would be as convenient and efficient as
> possible so, for example, it makes sense to store in track/particle
> object track px,py,pz, pt, mass, eta, phi and energy calculated once
> to avoid multiple calculations of square roots(again - root ...),
> sines, cosines etc.

An important point: I would not like to have more scalars on the surface than absolutely necessary. Here, too, I would want a substructure which provides some encapsulation: a FourVector, a function for calculating distances to (arbitrary) other objects, etc. That means I would like to have transparent functionality, with physics and geometry in the foreground, invisibly supported by the run-time representation as well as by the persistent representation.
On the lowest level, even the most beautiful structures become `0' and `1'. This point has of course little to do with the TClonesArray-index discussion, but in many discussions I have experienced a certain correlation with it.

You might get the impression that the things I desire are completely theoretical. So I'll try to give some vague ideas:

* Locality of serialized objects, `crosslinked branches': If one did not use ordinary pointers, but a class whose dereferencing operator works like that of a pointer, one could load additional objects on demand. This sounds similar to virtual memory management, where additional pages are `ordered' by a page fault. Maybe one could even adapt this (platform dependent) mechanism and avoid replacing the standard pointers.

* `missing information' in TClonesArray-indices: One should define the relation between certain arrays _systematically_ and _persistently_ (uh, even our ZEBRA/FORTRAN package does that). That means the class members should not be ints but `ArrayPointers' which can be dereferenced normally. Internally, such things could be ints. This is another example of what I mean when writing about `transparency'.

ok, these are just vague first ideas, but maybe someone has further thoughts on them?

[...]
> But again - all this is mostly a matter of taste.

yes, and I think it's always helpful and maybe even pleasant to discuss it.

Cheers, Christoph

[...]