Hi Pasha,

right, in our experiments we also never overwrite the primary dataset, which contains everything, including lots of garbage. We did it with tapes/PAW and we do it now with HDD/ROOT. That's OK. Let's assume that an "average customer" would also keep primary data in a safe place.

On the other hand, it is hard to understand why a secondary dataset should be overwritten completely if I want to see how results change after changing one field in one record. Also, it is a bit hard to argue "why any SQL database has INSERT and DELETE commands". I personally prefer to have both possibilities (overwrite or not), even if I would not use overwrite in 99.9% of my data manipulation.

On the other hand, I am not requesting anything that would affect ROOT I/O performance. I asked this question just because Rene mentioned several times that overwriting is quite easy and has been left out for political reasons. I am also a bit curious why ROOT has only one storage class, but that's another issue.

Regards,
Anton
http://www.smartquant.com

----- Original Message -----
From: Pasha Murat (630)840-8237@169G <murat@ncdf41.fnal.gov>
To: Anton Fokin <anton.fokin@smartquant.com>
Cc: roottalk <roottalk@pcroot.cern.ch>; Eddy Offermann <eddy@rentec.com>
Sent: Tuesday, February 27, 2001 2:46 AM
Subject: Re: [ROOT] Re: TTree modification

> Anton Fokin wrote:
> ... snip...
> > But it is quite hard to
> > explain "one write many reads" issue to an average customer of the
> > framework, which is supposed to be general enough to serve people doing
> > different things (including university researchers in finance for example).
>
> hi Anton - didn't you try to explain to an "average customer" something along
> the following lines:
>
> the data which are coming in in real time and can't be
> reproduced later SHOULD NEVER be overwritten/corrected. This is what is called
> primary datasets in HEP. This is why "one write many reads" is a very natural
> concept for the experimental physics.
> We also have a notion of a secondary dataset
> which is a derivative (selection/result of reprocessing) from the primary dataset.
> The secondary datasets, unlike the primary ones, can be recreated as many times as
> it is needed, so correcting the "raw" data never becomes an issue.
>
> best, Pasha
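
The primary/secondary distinction discussed in this thread can be illustrated with a minimal sketch in plain Python. It does not use ROOT or the TTree API; the `Record` type, its fields, and the functions `derive_secondary` and `fix_event_3` are hypothetical names chosen only for illustration. The sketch assumes the primary dataset is read-only and that a correction to one field of one record is applied while the secondary dataset is regenerated.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical event record; the fields are illustrative only."""
    event_id: int
    energy: float


def derive_secondary(
    primary: Iterable[Record],
    keep: Callable[[Record], bool],
    correct: Callable[[Record], Record],
) -> List[Record]:
    """Build a secondary dataset without modifying the primary one."""
    # Selection first, then a correction that returns a new record.
    return [correct(r) for r in primary if keep(r)]


# Primary dataset: written once, held as an immutable tuple of frozen records.
PRIMARY: Tuple[Record, ...] = (
    Record(1, 10.0),
    Record(2, -5.0),  # "garbage" entry that the selection drops
    Record(3, 7.5),
)


def fix_event_3(r: Record) -> Record:
    """Change one field in one record; all others pass through unchanged."""
    return replace(r, energy=8.0) if r.event_id == 3 else r


# First pass: selection only. Second pass: same selection plus the fix.
secondary_v1 = derive_secondary(PRIMARY, lambda r: r.energy > 0, lambda r: r)
secondary_v2 = derive_secondary(PRIMARY, lambda r: r.energy > 0, fix_event_3)
```

In this sketch the one-field change costs a full pass over the primary data and a complete rewrite of the secondary dataset, which is the overhead Anton's question is about; an in-place update of a single record would avoid that pass but would give up the write-once guarantee.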