Hi Sue, Hi Volker,

To use the AutoSave facility, you have two options:

 1- Call TTree::SetAutoSave(Int_t nbytes). When filling the tree, Root
    will automatically force a save of the current memory buffers once
    the amount of information written to disk since the previous
    autosave is more than nbytes.
 2- Invoke TTree::AutoSave() yourself at your selected frequency.

In case of a job crash, Root will be able to recover all the data up to
the last AutoSave.

Now, you have to make a compromise between the following parameters:
 - basket buffer size
 - frequency to call AutoSave
 - time spent in too many AutoSaves
 - read performance

If the basket buffer size is too large, basket buffers will by
definition be kept in memory for longer, so you may lose more data.
However, large basket sizes are much better for the read performance.
If your basket size is too small, you essentially do not need to call
AutoSave, but you will see a degraded read performance.

It is important to tune the relative size of the branch buffers such
that ideally baskets for all branches are saved at the same frequency.
When you do a TTree::Print, you can see the number of baskets per
branch. Making this number about the same for all branches is good for
AutoSave, but not necessarily for the performance.

For example, you may have a branch holding only one class data member.
If you specify a large basket size, you may be lucky enough to have all
the events for this member in one or a few baskets. Read performance
will be optimum because you minimize the number of seeks on disk.
However, in the extreme situation where all the events for one branch
can be kept in memory, you may lose the info for all events if you have
not called AutoSave.

AutoSave guarantees data consistency for all branches. So it is up to
you to tune the basket sizes to balance the time overhead induced by
too many AutoSaves against the read performance.

I would appreciate getting more feedback on this important subject.
One could, for example, imagine implementing the following scenario:
Root could automatically select a better basket size when enough data
has been collected to compute the mean/rms of each entry per branch.
One could also imagine a function TTree::Hints suggesting the best
numbers.

Rene Brun

SCHUBERT@mnhep.hep.umn.edu wrote:
>
> Hi Volker,
>
> We are also playing with the use of TTree::AutoSave within the context
> of a data model in which we hope to be able to read event data from
> a ROOT file before the file has been closed by the writing process.
> In my experiments with TTree::AutoSave (in root v2.22) so far, I have
> found that:
>
> 1) Using a gDirectory->Purge() after the AutoSave will get rid of
> previously AutoSave'd trees and keep the output file size small.
> The sequence is:
>
>    tree->AutoSave();     // call AutoSave to save the tree to disk
>    gDirectory->Purge();  // purge old trees
>
> Rene's modification to TTree::AutoSave in version 2.23/08 of ROOT
> however makes the Purge unnecessary.
>
> 2) As Rene suggests, tuning helps in the determination of when and how
> often to AutoSave the tree. Something that was not clear to me at
> first was that using the TTree::SetAutoSave mechanism, e.g.
>
>    tree->SetAutoSave(1000000);  // AutoSave after every 1 Mbyte written to disk
>
> will invoke AutoSave after the specified number of bytes (1 Mbyte in
> this example) are written to disk by basket dumps, and NOT after
> 1 Mbyte of data has been filled in the tree.
> Thus the frequency of AutoSaves using the SetAutoSave mechanism is
> directly tied to the basket size set in the tree branch definition.
> TTree::AutoSave will not rewrite information which has already been
> dumped to disk by basket dumps. Tuning the SetAutoSave and basket
> size parameters so that the largest baskets are dumped to disk just
> before AutoSave is invoked saves a lot of processing time.
>
> Good luck with your work,
> Sue Kasahara
>
> p.s.
> If anybody else has experimented with AutoSave and has tips on how
> to optimize its use, I'd like to hear about it.
>
> Rene Brun wrote:
>
> > Hi Volker,
> > In the coming 2.23/08 I have modified TTree::AutoSave to
> > automatically delete the previously saved tree header once the new
> > one has been written. I have also removed the print statement.
> >
> > Note that calling TTree::AutoSave too frequently (or similarly
> > calling TTree::SetAutoSave with a small value) is an expensive
> > operation. You should make tests for your own application to find a
> > compromise between speed and the quantity of information you may
> > lose in case of a job crash.
> >
> > Rene Brun
> >
> > Volker Boerchers wrote:
> > >
> > > > The TTree::AutoSave function saves the Tree header in memory
> > > > including:
> > > >  - the branches/leaves data structure
> > > >  - the branch buffers in memory
> > > >
> > > > Depending on the number of branches in your tree and the branch
> > > > buffer size the space occupied by the Tree header may be
> > > > non-negligible. It must be bigger than the size of one single
> > > > event. It does not make sense to call this function after each
> > > > Fill. The intended use is to save at regular intervals
> > > > depending on:
> > > >  - how much data has been written so far on disk
> > > >  - how many cpu/rt seconds used since the last autosave
> > > >
> > > in our case the fill happens only once after each event (which
> > > takes a lot of cpu time). So this is actually a good occasion for a
> > > save (perhaps only after each n'th event).
> > >
> > > > Currently AutoSave does not delete the Tree headers on disk
> > > > generated by previous AutoSaves. If you have too many AutoSaves,
> > > > you will get (what you found) many Tree headers and the space can
> > > > be much bigger than the useful data.
> > > > I could implement an automatic delete of the previous AutoSave
> > > > once the current AutoSave has successfully completed.
> > > >
> > > This would be a great improvement IMHO because older AutoSave data
> > > is of no use, right? In the meantime, can I do the deletion `by
> > > hand'? (I can't, I suppose...)
> > >
> > > Regards,
> > > Volker
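
Sue's observation in the thread above, that SetAutoSave counts bytes written to disk by basket dumps rather than bytes filled into the tree, can be illustrated with a small toy model. This is only a sketch under stated assumptions, not ROOT code: ToyBranch, ToyResult and simulate() are hypothetical names, compression and the tree header are ignored, and a "save" fires after a fill once more than autoSaveBytes have been dumped since the previous save.

```cpp
#include <cstddef>
#include <vector>

// Toy model only. ToyBranch, ToyResult and simulate() are hypothetical
// names for illustration; none of them is part of ROOT.
struct ToyBranch {
    std::size_t basketSize;     // buffer capacity in bytes before a basket dump
    std::size_t bytesPerEntry;  // bytes added to the buffer by each fill
    std::size_t buffered;       // bytes currently held in memory
};

struct ToyResult {
    std::size_t autoSaves;         // number of simulated AutoSave calls
    std::size_t maxEntriesAtRisk;  // most entries filled since the last save
};

ToyResult simulate(std::vector<ToyBranch> branches,
                   std::size_t autoSaveBytes,
                   std::size_t nEntries) {
    ToyResult result{0, 0};
    std::size_t written = 0;            // bytes dumped to "disk" so far
    std::size_t writtenAtLastSave = 0;  // value of written at the last save
    std::size_t entriesSinceSave = 0;   // entries filled since the last save

    for (std::size_t i = 0; i < nEntries; ++i) {
        for (ToyBranch& b : branches) {
            b.buffered += b.bytesPerEntry;
            if (b.buffered >= b.basketSize) {  // basket full: dump it
                written += b.buffered;
                b.buffered = 0;
            }
        }
        ++entriesSinceSave;
        if (entriesSinceSave > result.maxEntriesAtRisk) {
            result.maxEntriesAtRisk = entriesSinceSave;
        }
        // Save only once MORE than autoSaveBytes have been dumped
        // since the previous save (assumption taken from the thread).
        if (written - writtenAtLastSave > autoSaveBytes) {
            ++result.autoSaves;
            writtenAtLastSave = written;
            entriesSinceSave = 0;
        }
    }
    return result;
}
```

In this model, one branch at 100 bytes per entry, 1000 fills and a 1000-byte threshold gives 66 saves with at most 15 entries at risk when the basket holds 500 bytes, but only 5 saves with up to 200 entries at risk when the basket holds 20000 bytes. The threshold is the same in both runs; only the basket size changes how often the save fires.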
This archive was generated by hypermail 2b29 : Tue Jan 04 2000 - 00:43:42 MET