Re: [ROOT] TTree modification

From: Rene Brun (Rene.Brun@cern.ch)
Date: Mon Feb 26 2001 - 10:05:11 MET


Hi Anton,

We are well aware of all your points 1 to 6 and the associated theory.
I respect your experience with DOS/CPM. I have also spent some time
in the past, maybe too much, developing systems based on fixed-size
binary records, and there are many reasons why we abandoned
such systems.
 I/O is of crucial importance for our framework and I hope you understand
that we are targeting primarily the most urgent requests from the
experiments we are working with.
 Root currently supports two kinds of storage techniques (both are
 sketched below):
 - via the MyClass::Write function, if MyClass derives from TObject.
   This is based on a simple serialisation algorithm. You can delete,
   replace, or overwrite objects with new versions. This technique is OK
   for a small number of objects (say < one million). The main limitation
   is the size of the table of keys (on average 60 bytes per key) and not
   so much the time to find an object (hashtable). Use this technique for
   objects such as histograms, graphs, canvases, geometries, calibrations.
   Note that the serialisation algorithm in version 3.00 has nothing to do
   with the Borland algorithms, contrary to your claim.
 - via TTree. Trees are designed to support large collections of events
   in a write-once/read-many style. Trees can be grouped into TChains.
   This is optimal for sequential processing, or for sequential/skip/direct
   mode when you loop on a subset of a large collection.
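
 A minimal sketch of both techniques (the file, class and variable names
 are illustrative, not from a real application):

   #include "TFile.h"
   #include "TH1F.h"
   #include "TTree.h"

   void demo() {
      TFile f("demo.root", "RECREATE");

      // Technique 1: key-based storage of a TObject-derived object.
      TH1F h("h", "example histogram", 100, 0., 10.);
      h.Write();         // creates key "h;1"
      h.Write();         // a second write creates cycle "h;2"
      f.Delete("h;1");   // the old version can be deleted

      // Technique 2: a TTree, written once and read many times.
      Float_t energy = 0;
      TTree t("t", "example tree");
      t.Branch("energy", &energy, "energy/F");
      for (Int_t i = 0; i < 1000; i++) { energy = 0.01*i; t.Fill(); }
      t.Write();

      f.Close();
   }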
   
Some clarification on a "replace mode" for TTrees.
A replace option could be considered only in the special case of branches
containing only basic types and with no compression. Replacing
variable-length structures or compressed branches would completely destroy
the simple and efficient addressing scheme used internally.
Replacing a complete branch could also be considered in all modes.
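
To illustrate the point (a generic sketch, not ROOT code): with fixed-size,
uncompressed records of basic types, the address of record i is pure
arithmetic, so an in-place replace is trivial; compression or variable
length breaks this arithmetic.

   #include <cstdio>

   // Illustrative fixed-size record of basic types only.
   struct Record { int id; float x, y, z; };

   bool ReplaceRecord(const char* path, long i, const Record& r) {
      FILE* f = std::fopen(path, "r+b");
      if (!f) return false;
      // Fixed stride, no compression: record i is at a computable
      // offset. This is exactly what compressed or variable-length
      // branches would break.
      std::fseek(f, i * (long)sizeof(Record), SEEK_SET);
      std::fwrite(&r, sizeof(Record), 1, f);
      std::fclose(f);
      return true;
   }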

I want to stress here that our priority is the consolidation of the
I/O subsystem. Version 3.00 includes an important improvement in this area.
Support for automatic schema evolution was a must. It is now operational,
and it is a big step forward compared to the automatic or hand-coded
Streamers with user-written code to support class evolution.
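
For illustration, with automatic schema evolution adding a data member only
requires bumping the class version in the ClassDef macro; the class below
is invented for the example:

   #include "TObject.h"

   class MyEvent : public TObject {
   public:
      Float_t fEnergy;    // present since version 1
      Int_t   fNtracks;   // added later
      ClassDef(MyEvent,2) // version bumped from 1 to 2; no hand-written
                          // Streamer is needed to read files written
                          // with version 1
   };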

I/O performance is another priority. It was a main consideration
in the new I/O scheme, and is one reason it took so much time to implement.

You are also raising some non-technical points. Concerning the support
for Root and the size of the support team: the CERN Computing Review
recently recommended in its final report that the system now be
officially supported and additional manpower provided.

Concerning the main Root web page, we will update it in due time,
as soon as we announce the new production version.
Talking about web pages, I noticed that your page
http://www.smartquant.com is nearly unreadable.

Rene Brun

Anton Fokin wrote:
> 
> Hi,
> 
> > I'd like to point out a few things. First of all, there is a principal
> > difference between the I/O methods used by "DB-like" and "non-DB-like"
> > applications, and one of the consequences of this difference is that
> > the latter can achieve much higher I/O bandwidth per process.
> 
> In the good old days of DOS/CPM I was involved in low-level database
> design. From that experience I would say that the highest I/O performance
> can be achieved if you:
> 
> 1. Write/read fixed-size binary records.
> 2. Do not provide insert functionality, but do provide replace
>    functionality instead.
> 3. Delete records by setting a "deleted" flag, and clean up (rewrite) the
>    database during idle (night) time.
> 4. Format C: before installing a database, so as not to jump between
>    cylinders on the HD.
> 5. Provide smart buffering/caching which adapts the (system) I/O buffer
>    to the record size and user queries.
> 6. Provide smart indexing (hashing for string fields).
> 
> This lets you read/write data at (nearly) your system's I/O speed, which
> can be much higher than 25-30 MB/sec on modern SCSI devices.
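> 
> A rough sketch of points 1-3 (the record layout is just an example):
> 
>    #include <cstdio>
> 
>    // Point 1: a fixed 56-byte binary record.
>    struct Rec { char deleted; char pad[3]; int key; double val[6]; };
> 
>    // Point 3: "delete" by setting the flag in place (a replace, point 2,
>    // works the same way); compact the file during the night.
>    void MarkDeleted(FILE* f, long i) {
>       Rec r;
>       std::fseek(f, i * (long)sizeof(Rec), SEEK_SET);
>       std::fread(&r, sizeof(Rec), 1, f);
>       r.deleted = 1;
>       std::fseek(f, i * (long)sizeof(Rec), SEEK_SET);
>       std::fwrite(&r, sizeof(Rec), 1, f);
>    }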
> 
> > The I/O performance currently is one of the key issues for HEP
> > experiments. For example, CDF already had to modify the ROOT
> > object-oriented I/O mechanism when writing the TTrees out of the DAQ
> > to be able to achieve a rate of about 25-30 MB/sec per process (the
> > default I/O doesn't provide such a rate), and this is what defines the
> > overall data logging rate for us now.
> 
> I think that ROOT Trees are much heavier than the scheme described in
> points 1-6 above. That is the reason for your modification.
> 
> > ROOT I/O allows one to write objects into a file and to modify/delete
> > them after they have been written. TTree is just one of many objects
> > ROOT can write out. Let people correct me if I'm wrong, but as far as
> > I can tell, TTree is a very specialized container, designed to optimize
> > the I/O performance for the objects stored in it, and the assumption
> > that the TTree object is not going to be modified, only appended, seems
> > to be quite important for this optimization. Therefore, I'd be
> > extremely cautious about making any changes to the design of TTree
> > which could have an impact on performance.
> 
> The object serialization mechanism which we use in ROOT was initially
> developed by Borland for Turbo Vision and Turbo Pascal 5.5-6.0 somewhere
> around 1985-90. I do not think anybody considers this mechanism for real
> databasing.
> 
> Unfortunately TTree is the only database-like container in ROOT. ROOT
> doesn't have a hierarchy of data storage classes which provide different
> functionality at different levels. For example, if I write only fixed-size
> records with several binary fields, I do not need 80% of TTree's
> functionality. Thus I would guess I could gain xx% in I/O performance by
> providing a class for this specific case. At the same time I would like
> to keep the TTree-like query/drawing, so I would like the same (virtual)
> user interface for all databasing classes, as sketched below.
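> 
> A rough sketch of the kind of common interface I have in mind (the class
> and method names are invented):
> 
>    #include "Rtypes.h"
> 
>    // Hypothetical base class for all storage containers.
>    class TVirtualStore {
>    public:
>       virtual ~TVirtualStore() { }
>       virtual Int_t GetEntries() const = 0;
>       virtual Int_t GetEntry(Int_t entry) = 0;
>       virtual Int_t Draw(const char* varexp, const char* selection) = 0;
>    };
> 
>    // TTree and a lightweight fixed-record class could both implement
>    // this, so query/drawing code would not depend on the storage.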
> 
> > Definitely, having additional DB-oriented capabilities in ROOT would
> > be nice. However, the question of whether these capabilities should be
> > provided by modifying the TTree or by implementing a different kind of
> > data container is an open one.
> 
> This is exactly what I have asked. If nobody needs these features in
> TTree, I would like to write my own storage class for my project. I have
> also noticed that ROOT doesn't work well with small events of a few tens
> of bytes. Thus I think it should be stated quite clearly in what field
> ROOT is supposed to be used. Operating with hundreds of HEP 10-100 MB
> events is quite different from operating with millions of 100-byte
> spectroscopy events.
> 
> > I'd also like to comment on another issue. I know that there are a lot
> > of requests to the ROOT team coming from the HEP experiments, whose
> > implementation requires significant resources. For example, the PROOF
> > server is a long-awaited project. The implementation of the specialized
> > ROOT client-server utility to minimize the traffic over the net when
> > running ROOT on a remote node gives another example. Full integration
> > of the "TBuffer-exchange" (fast I/O) mode into ROOT is yet another one.
> > CDF has requested this mode and is using it for data-taking, and I
> > believe that the next generation of experiments will depend on this
> > mode even more strongly. Taking into consideration the actual resources
> > of the ROOT team, I think that we need to have well-specified
> > priorities.
> 
> I do not want to start any kind of flame war, but could you tell me why
> the ROOT team consists of only two people if it serves experiments with
> billion-dollar annual budgets? My long research experience tells me that
> scientific organizations have very inefficient management. Is it a kind
> of game? Just for fun, look at the "future plans" on the ROOT site (last
> updated in '95 or so) and compare them with the present ROOT status. If
> ROOT took on one or two more people with permanent positions, all these
> plans could come true.
> 
> Regards,
> Anton


