Re: Tree Questions

From: Christoph Borgmeier (borg@mail.desy.de)
Date: Fri Jan 15 1999 - 17:03:40 MET


On 12 Jan 1999, Rene Brun wrote:

[...]
> I agree that it would be nice to have a symmetric behaviour if you work
> in split mode or not. We are currently a few doing brainstorming
> on new requirements for TTrees. This point is being discussed.
> I take this opportunity to ask users to send now specific requirements
> regarding TTrees. It is the right time.
[...]

Hello all,

I think certain things could be done to achieve consistent behaviour of
split and non-split Trees, TClonesArrays, and polymorphic tree entries.

As I understand it, split TClonesArrays are used to store huge amounts of
flat rectangular data, similar to the FORTRAN solutions. Graphical objects
can be stored in a polymorphic way (non-TClonesArray) and are not split.
This seems to be an object-oriented database. I wonder if it is possible
to combine these two approaches smoothly. For reconstructed objects like
different types of particle patterns and vertices, the polymorphic storage
seems very attractive.

I found one problem on each side up to now:

* The polymorphic version stores objects that are referenced by others
and recreates them when the others are read from the tree. A hashing
mechanism makes sure that they are loaded only once per branch. The
problem is that they are not deleted when the next event is loaded. This
should be simple to fix, since a list of all specially created objects
must exist. Up to now, I build my own hash table by looking up the
TDataMembers. That really takes some time. This small extension to ROOT
would help a lot.

* The idea of the TClonesArray entries is to store mainly scalars and to
reference entries of other arrays by their indices. This is similar to the
FORTRAN world. The disadvantage is that it is not self-describing, and
different arrays can easily be confused, e.g. what would

  Int_t fcluster;

refer to if there are several TClonesArrays of reconstructed and
Monte Carlo calorimeter information? Another restriction is that all
instances of certain classes have to be members of exactly one
TClonesArray. (The latter might be unavoidable.)
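
To make the first point concrete, here is a minimal sketch of the kind of
per-event bookkeeping meant there. It is plain C++ without ROOT; the names
Referenced, EventObjectCache, GetOrCreate and NextEvent are hypothetical and
are not part of any ROOT interface. It only shows the idea: create each
referenced object once, and delete all of them when the next event is loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>

// Stand-in for an object that other tree entries refer to (hypothetical).
struct Referenced {
  explicit Referenced(int v) : value(v) {}
  int value;
};

// Hypothetical per-event cache: creates each referenced object once and
// owns it until the next event is loaded.
class EventObjectCache {
public:
  // Return the object for 'id', creating it on the first request only.
  Referenced* GetOrCreate(unsigned id, int value) {
    auto it = fObjects.find(id);
    if (it == fObjects.end())
      it = fObjects.emplace(id, std::make_unique<Referenced>(value)).first;
    return it->second.get();
  }

  // Call when the next event is loaded: deletes everything created so far.
  void NextEvent() { fObjects.clear(); }

  std::size_t Size() const { return fObjects.size(); }

private:
  std::unordered_map<unsigned, std::unique_ptr<Referenced>> fObjects;
};
```

Because the cache owns the objects, clearing it at the event boundary is
enough to release them; user code does not need a second hash table.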

My proposal would be to create a class around these integers to provide
the referencing and to guide the compiler and the user. This could be
done inline for compiled and CINT-interpreted code. Something like this
(very rough):

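template<class T>
class TClonesPointer
{
public:
  ...
  // TClonesArray::At() returns a TObject*, so cast to the element type
  inline T& operator*() const { return *static_cast<T*>( fArray->At( fIdx ) ); }
  inline T* operator->() const { return static_cast<T*>( fArray->At( fIdx ) ); }
  ...
  static TClonesArray* fArray;  // public, so the array can be assigned from outside
private:
  UInt_t fIdx;                  // index of the entry in fArray
};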

Note that these objects can have the same memory layout as an unsigned int
(if they are not derived from anything and don't have virtual functions).
For each template instantiation, one can indicate which array is referred
to. Maybe like this:

  TClonesArray myEllipses("TEllipse");
  TClonesPointer<TEllipse>::fArray = &myEllipses; // maybe done by some
                                                  // clever automatism
  class GoesIntoTheClonesArray
  {
    ...
    TClonesPointer<TEllipse> elli;
  } x;

  x.elli = ... ;

  x.elli->SetX1(.2);
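
The same idea can be tried without ROOT. The following is a minimal sketch
under stated assumptions: std::vector<T> stands in for the TClonesArray, and
the names Ellipse and IndexPointer are hypothetical, chosen only for this
illustration. It shows an index that behaves like a typed pointer and has
the size of an unsigned int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for an element stored in the array (hypothetical).
struct Ellipse {
  double x1 = 0.0;
  void SetX1(double x) { x1 = x; }
};

// Index handle that behaves like a pointer into one container per type T.
// std::vector<T> replaces TClonesArray here; this is an assumption.
template <class T>
class IndexPointer {
public:
  IndexPointer() = default;
  explicit IndexPointer(unsigned idx) : fIdx(idx) {}

  T& operator*() const { return (*fArray)[fIdx]; }
  T* operator->() const { return &(*fArray)[fIdx]; }
  unsigned Index() const { return fIdx; }

  static std::vector<T>* fArray;  // the one container this type refers to

private:
  unsigned fIdx = 0;
};

// One static container pointer per template instantiation.
template <class T>
std::vector<T>* IndexPointer<T>::fArray = nullptr;

// No base class and no virtual functions, so only the index is stored.
static_assert(sizeof(IndexPointer<Ellipse>) == sizeof(unsigned),
              "handle has the size of an unsigned int");
```

Usage then looks like ordinary pointer code: after setting
IndexPointer<Ellipse>::fArray to the address of a std::vector<Ellipse>, a
handle p constructed from index 1 allows p->SetX1(.2) to modify element 1
of that vector. Two handles for different element types cannot be mixed up,
because the compiler sees different types.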

What do you think?

Kind regards
Christoph


-- 
 Christoph Borgmeier    Mail:  DESY F15/HERA-B, Geb. 61/117
                               Notkestr. 85, 22607 Hamburg
 Humboldt Univ Berlin   Phone: +49 40 8998 4850
                        Email: Christoph.Borgmeier@desy.de


