Re: TTree::AutoSave

From: Rene Brun (Rene.Brun@cern.ch)
Date: Wed Nov 03 1999 - 09:10:13 MET


Hi Sue, Hi Volker,

To use the AutoSave facility, you have two options:
 1- Call TTree::SetAutoSave(Int_t nbytes). When filling the tree, Root
    will automatically force a save of the current memory buffers once
    the amount of information written to disk since the previous
    autosave is more than nbytes.
 2- Invoke TTree::AutoSave() yourself at your selected frequency.

In case of a job crash, Root will be able to recover all the data up to
the last AutoSave.
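
To make the trigger rule of option 1 concrete, here is a small toy model
in plain C++. It is only a sketch: ToyBranch, ToyTree and
fillsUntilFirstAutoSave are hypothetical names, not Root classes, and the
model assumes fixed-size entries and that a basket is dumped as soon as
it is full. It counts autosaves but does not model what they write.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model only: none of these names exist in Root.
struct ToyBranch {
    std::size_t basketSize;    // bytes a basket holds before it is dumped
    std::size_t bytesPerEntry; // bytes added to the basket by each Fill
    std::size_t inMemory;      // bytes currently buffered in memory

    ToyBranch(std::size_t basket, std::size_t perEntry)
        : basketSize(basket), bytesPerEntry(perEntry), inMemory(0) {
        assert(basket > 0 && perEntry > 0);
    }
};

struct ToyTree {
    std::vector<ToyBranch> branches;
    std::size_t autoSaveBytes;    // threshold, in the role of SetAutoSave(nbytes)
    std::size_t writtenSinceSave; // bytes dumped to disk since the last autosave
    std::size_t autoSaveCount;    // number of autosaves triggered so far

    ToyTree(std::vector<ToyBranch> b, std::size_t nbytes)
        : branches(std::move(b)), autoSaveBytes(nbytes),
          writtenSinceSave(0), autoSaveCount(0) {}

    // Add one entry to every branch. Returns true if this Fill
    // triggered an autosave.
    bool Fill() {
        for (ToyBranch& b : branches) {
            b.inMemory += b.bytesPerEntry;
            if (b.inMemory >= b.basketSize) { // basket full: dump it to disk
                writtenSinceSave += b.inMemory;
                b.inMemory = 0;
            }
        }
        // Only bytes already dumped to disk count towards the threshold.
        if (writtenSinceSave > autoSaveBytes) {
            writtenSinceSave = 0;
            ++autoSaveCount;
            return true;
        }
        return false;
    }
};

// Number of Fill calls until the first autosave, or 0 if none
// happens within maxFills.
std::size_t fillsUntilFirstAutoSave(ToyTree tree, std::size_t maxFills) {
    for (std::size_t n = 1; n <= maxFills; ++n) {
        if (tree.Fill()) return n;
    }
    return 0;
}
```

With one branch of 100 bytes per entry and a threshold of 2500 bytes, a
1000-byte basket gives the first autosave on the 30th Fill in this model,
and a 4000-byte basket on the 40th, although 2500 bytes have been filled
after 25 entries in both cases.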
Now, you have to make a compromise between the following parameters:
 - basket buffer size
 - frequency to call AutoSave
 - time spent in too many AutoSaves
 - read performance

   If the basket buffer size is too large, basket buffers will by
   definition be kept in memory longer, so you may lose more data.
   However, large basket sizes are much better for the read performance.
   If your basket size is too small, you essentially do not need to call
   AutoSave, but you will see degraded read performance.

   It is important to tune the relative size of the branch buffers such
   that ideally baskets for all branches are saved at the same frequency.
   When you do a TTree::Print, you can see the number of baskets per
   branch. Making this number about the same for all branches is good
   for AutoSave, but not necessarily for the performance.
   For example, you may have a branch holding only one class data
   member. If you specify a large basket size, you may be lucky to have
   all the events for this member in one or a few baskets. Read
   performance will be optimal because you minimize the number of seeks
   on disk.
   However, in the extreme situation where all the events for one branch
   can be kept in memory, you may lose the info for all events if you
   have not called AutoSave.
   AutoSave guarantees data consistency for all branches.
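
The trade-off for a single branch can be put into numbers with a short
sketch. The two helpers below are hypothetical (they are not Root
functions) and assume fixed-size entries, a basket dumped as soon as it
holds a whole number of entries, and no AutoSave call at all.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Root.
// Number of baskets the branch ends up with (ceiling division);
// fewer baskets means fewer seeks when reading.
std::size_t basketCount(std::size_t entries,
                        std::size_t bytesPerEntry,
                        std::size_t basketSize) {
    assert(basketSize > 0);
    std::size_t total = entries * bytesPerEntry;
    return (total + basketSize - 1) / basketSize;
}

// Hypothetical helper, not part of Root.
// Entries still sitting in the memory buffer after `entries` fills,
// i.e. what a crash would lose if AutoSave was never called.
std::size_t entriesAtRisk(std::size_t entries,
                          std::size_t bytesPerEntry,
                          std::size_t basketSize) {
    assert(bytesPerEntry > 0 && basketSize >= bytesPerEntry);
    std::size_t perBasket = basketSize / bytesPerEntry; // whole entries per basket
    return entries % perBasket;
}
```

For 10500 entries of 4 bytes each, a 4000-byte basket gives 11 baskets
with 500 entries at risk, while a 64000-byte basket gives a single
basket with all 10500 entries at risk.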

So, it is up to you to tune the basket sizes to balance the time
overhead induced by too many AutoSaves against the read performance.
I would appreciate getting more feedback on this important subject.

One could, for example, imagine implementing the following scenario:
Root could automatically select a better basket size when enough data
has been collected to compute the mean/rms of each entry per branch.
One could also imagine a function TTree::Hints suggesting the best
numbers.
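
As a rough illustration of such a scenario, the sketch below computes
the mean and rms (taken here as the standard deviation) of the entry
sizes seen so far for one branch and derives a basket size from them.
Nothing like this exists in Root: SizeStats, entrySizeStats and
suggestBasketSize are hypothetical names, and the rule "mean plus one
rms, times the number of entries wanted per basket" is only an assumed
heuristic. Using the same entries-per-basket target for every branch
would make all branches dump at about the same frequency.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types and functions, not part of Root.
struct SizeStats {
    double mean; // mean size of one entry, in bytes
    double rms;  // standard deviation of the entry size, in bytes
};

// Mean and rms of the per-entry sizes observed so far for one branch.
SizeStats entrySizeStats(const std::vector<double>& bytesPerEntry) {
    assert(!bytesPerEntry.empty());
    double sum = 0.0;
    double sumSq = 0.0;
    for (double b : bytesPerEntry) {
        sum += b;
        sumSq += b * b;
    }
    const double n = static_cast<double>(bytesPerEntry.size());
    const double mean = sum / n;
    const double variance = sumSq / n - mean * mean;
    // Guard against a tiny negative variance caused by rounding.
    return {mean, std::sqrt(variance > 0.0 ? variance : 0.0)};
}

// Assumed heuristic: size the basket for entriesPerBasket entries,
// with one rms of headroom per entry.
std::size_t suggestBasketSize(const SizeStats& stats,
                              std::size_t entriesPerBasket) {
    const double bytes =
        (stats.mean + stats.rms) * static_cast<double>(entriesPerBasket);
    return static_cast<std::size_t>(std::ceil(bytes));
}
```

For a branch whose entries are always 100 bytes, a target of 1000
entries per basket gives a suggested basket size of 100000 bytes.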


Rene Brun



SCHUBERT@mnhep.hep.umn.edu wrote:
> 
> Hi Volker,
> 
> We are also playing with the use of TTree::AutoSave within the context
> of a data model in which we hope to be able to read event data from
> a ROOT file before the file has been closed by the writing process.
>   In my experiments with TTree::AutoSave (in root v2.22) so far, I have found
> that:
> 
> 1)Using a gDirectory->Purge() after the AutoSave will get rid of previously
>   AutoSave'd trees and keep the output file size small.  The sequence is:
> 
>    tree -> AutoSave();    //Call AutoSave to save the tree to disk
>    gDirectory -> Purge(); //Purge old trees
> 
>   Rene's modification to TTree::AutoSave in version 2.23/08 of ROOT
>   however makes the Purge unnecessary.
> 
> 2)As Rene suggests, tuning helps in the determination of when and how
>   often to AutoSave the tree.  Something that was not clear to me at
>   first was that using the TTree::SetAutoSave mechanism, e.g.
> 
>   tree -> SetAutoSave(1000000);  //AutoSave after every 1 Mbyte written to disk
> 
>   will invoke AutoSave after the specified number of bytes (1 Mbyte in
>   this example) are written to disk by basket dumps, and NOT after
>   1 Mbyte of data has been filled in the tree.
>   Thus the frequency of AutoSaves using the SetAutoSave mechanism is
>   directly tied to the basket size set in the tree branch definition.
>   TTree::AutoSave will not rewrite information which has already been
>   dumped to disk by basket dumps.   Tuning the SetAutoSave and basket
>   size parameters so that the largest baskets are dumped to disk just
>   before AutoSave is invoked saves a lot of processing time.
> 
> Good luck with your work,
>  Sue Kasahara
> 
> p.s. If anybody else has experimented with AutoSave and has tips on how
> to optimize its use, I'd like to hear about it.
> 
> Rene Brun wrote:
> 
> > Hi Volker,
> > In the coming 2.23/08 I have modified TTree::AutoSave to
> > automatically delete the previously saved tree header once the new
> > one has been written.  I have also removed the print statement.
> >
> > Note that calling TTree::AutoSave too frequently (or similarly
> > calling TTree::SetAutoSave with a small value) is an expensive
> > operation.  You should make tests for your own application to find a
> > compromise between speed and the quantity of information you may
> > lose in case of a job crash.
> 
> > Rene Brun
> 
> > Volker Boerchers wrote:
> >
> 
> > > > The TTree::AutoSave function saves the Tree header in memory
> > > > including:
> > > >   - the branches/leaves data structure
> > > >   - the branch buffers in memory
> > > >
> > > > Depending on the number of branches in your tree and the branch
> > > > buffer size, the space occupied by the Tree header may be
> > > > non-negligible. It must be bigger than the size of one single event.
> > > > It does not make sense to call this function after each Fill.
> > > > The intended use is to save at regular intervals depending on:
> > > >   - how much data has been written so far on disk
> > > >   - how many cpu/rt seconds used since the last autosave
> > >
> 
> > > in our case the fill happens only once after each event (which
> > > takes a lot of cpu time). So this is actually a good occasion for a
> > > save (perhaps only after each n'th event).
> >
> 
> > > > Currently AutoSave does not delete the Tree headers on disk
> > > > generated by previous AutoSaves. If you have too many AutoSaves,
> > > > you will get (what you found) many Tree headers and the space can
> > > > be much bigger than the useful data.
> > > > I could implement an automatic delete of the previous AutoSave
> > > > once the current AutoSave has successfully completed.
> > >
> 
> > > This would be a great improvement IMHO because older AutoSave data
> > > is of no use, right?  In the meanwhile, can I do the deletion `by
> > > hand'? (I can't, I suppose...)
> > >
> 
> > > Regards,
> > >  Volker
