Re: [ROOT] TTree modification

From: Pasha Murat (630)840-8237 FNAL (630)859-3463 home (murat@murat.fnal.gov)
Date: Sun Feb 25 2001 - 19:35:17 MET


I'd like to point out a few things. First of all, there is a fundamental difference
between the I/O methods used by "DB-like" and "non-DB-like" applications, and one
of the consequences of this difference is that the latter can achieve much higher
I/O bandwidth per process. I/O performance is currently one of
the key issues for HEP experiments. For example, CDF already had to modify
the ROOT object-oriented I/O mechanism when writing TTrees out of the DAQ
to be able to achieve a rate of about 25-30 MB/sec per process (the default I/O
doesn't provide such a rate), and this is what defines the overall data logging
rate for us now.

ROOT I/O allows one to write objects into a file and to modify/delete them after they
have been written. TTree is just one of many objects ROOT can write out.
Let people correct me if I'm wrong, but as far as I can tell, TTree is a very
specialized container, designed to optimize the I/O performance for the objects
stored in it, and the assumption that the TTree object is not going to be modified,
only appended to, seems to be quite important for this optimization. Therefore,
I'd be extremely cautious about making any changes to the design of TTree which
could have an impact on the performance.

Definitely, having additional DB-oriented capabilities in ROOT would be nice.
However, the question of whether these capabilities should be provided
by modifying the TTree or by implementing a different kind of data container
is an open one.
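
To make the second option a bit more concrete, here is a minimal sketch of what
a separate container could look like: the bulk data stays append-only, and
corrections and deletions live in a small side table that is consulted at read
time. This is plain Python for illustration only. It does not use any ROOT API,
and the class and method names (CorrectableLog, correct, mark_deleted, raw) are
hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Set, Tuple


@dataclass
class CorrectableLog:
    """Hypothetical append-only store with a side table for later fixes."""

    _records: List[Any] = field(default_factory=list)          # written once, never rewritten
    _corrections: Dict[int, Any] = field(default_factory=dict)  # entry number -> replacement
    _deleted: Set[int] = field(default_factory=set)            # entry numbers hidden from readers

    def append(self, record: Any) -> int:
        """Append a record and return its entry number."""
        self._records.append(record)
        return len(self._records) - 1

    def _check(self, entry: int) -> None:
        if not 0 <= entry < len(self._records):
            raise IndexError(f"no such entry: {entry}")

    def correct(self, entry: int, record: Any) -> None:
        """Register a replacement without touching the original data."""
        self._check(entry)
        self._corrections[entry] = record

    def mark_deleted(self, entry: int) -> None:
        """Hide an entry (for example an outlier) from normal reads."""
        self._check(entry)
        self._deleted.add(entry)

    def raw(self, entry: int) -> Any:
        """Return the record exactly as it was originally appended."""
        self._check(entry)
        return self._records[entry]

    def get(self, entry: int) -> Any:
        """Return the current view of an entry, applying the side table."""
        self._check(entry)
        if entry in self._deleted:
            raise KeyError(f"entry {entry} is marked deleted")
        if entry in self._corrections:
            return self._corrections[entry]
        return self._records[entry]

    def __iter__(self) -> Iterator[Tuple[int, Any]]:
        """Iterate over live entries in their original order."""
        for entry in range(len(self._records)):
            if entry not in self._deleted:
                yield entry, self.get(entry)
```

The point of the sketch is only that the append path is never touched by a
correction, so whatever optimizations depend on append-only writing stay
intact. The cost is an extra lookup on every read.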

I'd also like to comment on another issue. I know that there are a lot of
requests to the ROOT team coming from the HEP experiments whose implementation
requires significant resources. For example, the PROOF server is a long-awaited
project. The implementation of a specialized ROOT client-server utility to
minimize the traffic over the net when running ROOT on a remote node is another
example. Full integration of the "TBuffer-exchange" (fast I/O) mode into ROOT
is yet another one. CDF has requested this mode and is using it for data-taking,
and I believe that the next generation of experiments will depend on this mode
even more strongly. Taking into consideration the actual resources of the ROOT team,
I think that we need to have well-specified priorities.

							best, Pasha

Anton Fokin wrote:
> 
> Hi Rene and rooters,
> 
> I saw another message on roottalk about TTree modification. Rene, is the
> "one write - many read" policy your final answer? I still do not understand
> why you do not want to provide an (optional) insert/delete/modify entry
> mechanism if you once said it is simple from a technical point of view.
> 
> Here is a simple example of an application where ROOT fits data warehousing
> needs but cannot be used efficiently without TTree modification. Trade and
> quote real-time data from Reuters/Bloomberg and other providers are quite
> similar to experimental on-line data. Every day you get a plain stream of
> thousands of records ordered in time. Thus you can write them into a ROOT tree
> and beat any RDBMS on speed when you analyse this data
> warehouse. One small detail breaks this idyllic picture: after some
> time you can get a signal saying that record #xxx was not correct and that the
> next record is the corrected #xxx.
> 
> The same happens when you want to mark certain events in the warehouse as
> outliers. Etc. etc.
> 
> I would say that in general this limitation prevents ROOT from spreading to
> communities other than HEP. I have mentioned one such application
> above.
> 
> Should we discuss this tree modification topic within the ROOT community and
> formulate a final decision?
> 
> Regards,
> Anton
> 
> http://www.smartquant.com


