Re: [ROOT] 2.25.03: anomalous TKey cycles in TFile for large TTrees

From: Matthew D. Langston (langston@SLAC.stanford.edu)
Date: Tue Nov 14 2000 - 18:30:40 MET


Hi Rene,

Thank you for your help, and for setting me straight about the AutoSave
feature.

Wouldn't it make more sense for the AutoSave feature to treat the last
TTree::Write in the same way as previous calls to TTree::Write?  In other
words, if the last TTree::Write is successful, why keep around the previous
cycle produced by the AutoSave mechanism?  That previous cycle is useless,
and just takes up precious space if the last TTree::Write is successful.

Also, is there a convenient "purge" facility (à la VMS) to get rid of old
cycles in a TFile?  Or does one just have to iterate through the TKeys of
the TFile by hand and delete the corresponding objects?  If the latter, is
it necessary to "defragment" the TFile after deleting old cycles, or does
ROOT do this for us?

Regards, Matt

----- Original Message -----
From: "Rene Brun" <Rene.Brun@cern.ch>
To: "Matthew D. Langston" <langston@SLAC.Stanford.EDU>
Cc: "roottalk" <roottalk@pcroot.cern.ch>
Sent: Monday, November 13, 2000 11:43 PM
Subject: Re: [ROOT] 2.25.03: anomalous TKey cycles in TFile for large TTrees


> Hi Matt,
> The behaviour you describe is normal. See URL:
>     http://root.cern.ch/root/html225/TTree.html#TTree:AutoSave
>
> The default value for AutoSave is to save a copy of the Tree header
> as soon as you have written 100 Mbytes (uncompressed) to the file.
> When AutoSave is called, a new key is created with the current header.
> If the creation of the key is successful, the previous key is deleted.
> At the end of the job, when you do tree.Write, a new key is
> always created, keeping the previous cycle of the same key, if any.
> You can use the option kOverwrite when calling the final tree.Write
> if you want to keep one single key.
>
> The AutoSave facility was introduced as protection against job crashes.
> You can always recover data up to the last AutoSave.
>
> If you loop over the list of TKeys in a directory yourself,
> you have to take into account the fact that one key may have more than
> one cycle.
>
> Rene Brun
>
> Matthew D. Langston wrote:
> >
> > The problem concerns ROOT 2.25.03 under RedHat Linux 6.1 Intel, and is
> > reproducible.
> >
> > I have come across an anomaly with TFiles containing large TTrees (where
> > "large" means each TTree is anywhere from a few hundred MB to 1 GB
> > uncompressed, and the total size of the TFile is about 1 GB compressed).
> > The problem is that the ROOT I/O seems to be creating erroneous TKey
> > cycles in the TFile.  Are my TFiles being created corrupt?  Or are these
> > extra TKey cycles just an artifact of the TTree creation and filling
> > process?  If so, do I need to "purge" (à la VMS) these erroneous cycles?
> > If so, how?
> >
> > I have attached to this e-mail (in the file called TTree_Print.txt) the
> > output of calling TTree::Print just after I write each of six TTrees to a
> > TFile, where each TTree is written to a separate TDirectory within the
> > TFile.  As you can see by looking at this attached file, there are six
> > sections of output from TTree::Print, one for each of the six TTrees that
> > I created and filled.
> >
> > However, when I later look at this TFile with TBrowser, or by simply
> > iterating through each TKey in the TFile and printing out the name and
> > cycle number of the TTrees that I find, I see several cycles for each
> > TTree.  I have attached this output in the second file included with this
> > email (this file is called TFile_contents.txt).
> >
> > For example, the first TTree that I create has 72019 entries according to
> > TTree::Print just after filling and writing the TTree.  However, when I
> > later reopen the TFile and iterate over it, I find two TTrees with cycle
> > numbers 1 and 2, with 67509 entries and 72019 entries, respectively.
> >
> > Finally, I have attached a table (in the file TKey_cycle_anomaly.txt) of
> > the output for the TFile that shows the cycle numbers, the number of
> > entries in each TTree, and the difference in the number of entries between
> > the "good" cycle and the "erroneous" cycle.  Note that one of the TTrees,
> > the one in the TDirectory named "year_1995", didn't have one of the
> > erroneous TKey cycles created (i.e. this particular TTree is fine).  It
> > would appear that this "erroneous TKey cycle effect" is triggered after a
> > certain number of bytes is written to the TFile.
> >
> > So, based on this data, I would claim that ROOT created these additional
> > TTree cycles when it shouldn't have:
> >
> > year_1993/SelectedWABEvents;1
> >
> > year_1994/SelectedWABEvents;1
> >
> > year_1996/SelectedWABEvents;2
> >
> > year_1997/SelectedWABEvents;5
> >
> > year_1998/SelectedWABEvents;10
> >
> > Would anyone be able to comment on this?  This is such a dramatic effect
> > that I would assume that other collaborations which create large TTrees
> > and TFiles would have seen this too.  Is there a workaround for this?
> >
> > Regards, Matt
> >
> > --
> > Matthew D. Langston
> > SLD, Stanford Linear Accelerator Center
> > langston@SLAC.Stanford.EDU
> >
> > Attachments (each Plain Text, text/plain, 7BIT encoding):
> >    TTree_Print.txt
> >    TFile_contents.txt
> >    TKey_cycle_anomaly.txt
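
The cycle bookkeeping Rene describes above can be replayed with a small toy
model.  This is only a sketch in plain C++ with no ROOT calls: KeyCycle,
ToyKeyList, replay and purge are hypothetical names invented for
illustration, and the overwrite flag merely stands in for the kOverwrite
option mentioned in the thread.  The cycle number the toy assigns after an
overwrite is an assumption of the model, not a statement about what ROOT
does.

```cpp
#include <vector>

// One saved copy of a tree header. Toy type, not a ROOT class.
struct KeyCycle {
    int  cycle;    // the number after ';' in "SelectedWABEvents;2"
    long entries;  // entries recorded in that copy of the header
};

// Toy model of the key cycles that one tree leaves in one directory.
class ToyKeyList {
public:
    // AutoSave as described in the thread: create a new key first and,
    // once that has succeeded, delete the previous one.
    void autoSave(long entries) {
        const bool hadPrevious = !cycles_.empty();
        cycles_.push_back({++lastCycle_, entries});
        if (hadPrevious) cycles_.erase(cycles_.end() - 2);
    }

    // Final Write: always creates a new key. Without overwrite the
    // previous cycle stays behind; with it, only the new key is kept.
    // (Assumption: the toy keeps counting cycles upward in both cases.)
    void write(long entries, bool overwrite = false) {
        if (overwrite) cycles_.clear();
        cycles_.push_back({++lastCycle_, entries});
    }

    // Hypothetical "purge": keep only the highest cycle.
    void purge() {
        if (cycles_.size() > 1) {
            cycles_.erase(cycles_.begin(), cycles_.end() - 1);
        }
    }

    const std::vector<KeyCycle>& cycles() const { return cycles_; }

private:
    std::vector<KeyCycle> cycles_;
    int lastCycle_ = 0;
};

// Replay a job: one autoSave per element of autoSaveEntries,
// followed by the final write.
inline ToyKeyList replay(const std::vector<long>& autoSaveEntries,
                         long finalEntries, bool overwrite = false) {
    ToyKeyList keys;
    for (long n : autoSaveEntries) keys.autoSave(n);
    keys.write(finalEntries, overwrite);
    return keys;
}
```

With the numbers from the report, replay({67509}, 72019) leaves cycles ;1
and ;2 with 67509 and 72019 entries, which is what was seen for the first
tree.  A tree that never reaches an AutoSave, replay({}, n), ends up with a
single cycle ;1, as for year_1995, and five AutoSaves before the final write
leave ;5 and ;6, consistent with the leftover ;5 listed for year_1997.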


