Re: [ROOT] reduction

From: K. Hauschild (karlhaus@hep.saclay.cea.fr)
Date: Thu Jan 17 2002 - 16:25:10 MET


Hi Rene, Christian,

Here are the results of Rene's fix.

OLD METHOD
==========

******************************************************************************
*Tree    :45MeV     : 8pi 232Th(7Li,xxx) : Run101 (UpStrm)                   *
*Entries :   772929 : Total =        57402209 bytes  File  Size =   15700940 *
*        :          : Tree compression factor =   3.66                       *
******************************************************************************
*Branch  :event                                                              *
*Entries :   772929 : BranchElement (see below)                              *
*............................................................................*
*Br    2 :fGes      : fGes_                                                  *
*Entries :   772929 : Total  Size=    6145920 bytes  File Size  =    1688064 *
*Baskets :       72 : Basket Size=     128000 bytes  Compression=   3.64     *
*............................................................................*
*Br    8 :fSaphir.fChan : fChan[fSaphir_]                                    *
*Entries :   772929 : Total  Size=    3074444 bytes  File Size  =       9376 *
*Baskets :       48 : Basket Size=     128000 bytes  Compression= 327.91     *
*............................................................................*
*Br   10 :fCsIs     : fCsIs_                                                 *
*Entries :   772929 : Total  Size=    6145992 bytes  File Size  =    1791389 *
*Baskets :       72 : Basket Size=     128000 bytes  Compression=   3.43     *
*............................................................................*

Compare with:

NEW METHOD
==========
             
******************************************************************************
*Tree    :45MeV     : 8pi 232Th(7Li,xxx) : Run101 (UpStrm)                   *
*Entries :   772929 : Total =        48179969 bytes  File  Size =   10812971 *
*        :          : Tree compression factor =   4.46                       *
******************************************************************************
*Branch  :event                                                              *
*Entries :   772929 : BranchElement (see below)                              *
*............................................................................*
*Br    2 :fGes      : fGes_                                                  *
*Entries :   772929 : Total  Size=    3071904 bytes  File Size  =      56798 *
*Baskets :       24 : Basket Size=     128000 bytes  Compression=  54.08     *
*............................................................................*
*Br    7 :fSaphir   : fSaphir_                                               *
*Entries :   772929 : Total  Size=    3071976 bytes  File Size  =       6036 *
*Baskets :       24 : Basket Size=     128000 bytes  Compression= 508.94     *
*............................................................................*
*Br   10 :fCsIs     : fCsIs_                                                 *
*Entries :   772929 : Total  Size=    3071928 bytes  File Size  =     136837 *
*Baskets :       24 : Basket Size=     128000 bytes  Compression=  22.45     *
*............................................................................*

So with the fix, the total file size in this example is about 2/3 of
the size of the file written with the old method.

I processed 2000 blocks of data, so the raw data would be 32M. So,
relative to the raw data, the real compression factors are ~2 (old
method) and ~3 (new method).
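
As a sanity check on the figures quoted above, here is a minimal C++
sketch of the arithmetic. It is only an illustration: the constant and
function names are hypothetical, the byte counts are copied from the
two printouts, and reading "32M" as 32e6 bytes is an assumption (the
exact block size is not stated).

```cpp
#include <cassert>
#include <cstdint>

// Numbers copied from the OLD METHOD and NEW METHOD printouts.
const std::uint64_t kEntries         = 772929;    // entries in the Tree
const std::uint64_t kOldFileBytes    = 15700940;  // old total file size
const std::uint64_t kNewFileBytes    = 10812971;  // new total file size
const std::uint64_t kOldCounterBytes = 1688064;   // fGes_ file size, old
const std::uint64_t kNewCounterBytes = 56798;     // fGes_ file size, new

// ASSUMPTION: "32M" of raw data is taken to mean 32e6 bytes.
const std::uint64_t kRawBytes = 32000000;

// Ratio of two byte counts; the denominator must be non-zero.
double SizeRatio(std::uint64_t numerator, std::uint64_t denominator) {
  assert(denominator != 0);
  return static_cast<double>(numerator) / static_cast<double>(denominator);
}

// New file size as a fraction of the old file size.
double NewOverOld() { return SizeRatio(kNewFileBytes, kOldFileBytes); }

// Compression factor of a file relative to the raw data.
double RealCompression(std::uint64_t fileBytes) {
  return SizeRatio(kRawBytes, fileBytes);
}

// Average on-disk size of a branch per Tree entry, in bytes.
double BytesPerEntry(std::uint64_t branchFileBytes) {
  return SizeRatio(branchFileBytes, kEntries);
}
```

Under these assumptions NewOverOld() comes out at roughly 0.69,
RealCompression() at about 2.0 for the old file and 3.0 for the new
one, and BytesPerEntry() for the fGes_ counter drops from about 2.2
bytes to well under one byte per entry.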

I will now check that I can read the Tree back from the file correctly,
because the savings in size are better than I expected!


Thanks Rene,

Karl



>
> Hi Christian,
> 
> What Karl is doing is perfectly correct and a very efficient solution.
> As indicated in a previous mail, I have fixed a problem in the
> TBranchElement constructor where an unnecessary buffer was created
> for the counter branch of the TClonesArray. With this fix, the
> size of the counter should be below one byte on average.
> I am just waiting for confirmation from Karl.
> 
> Rene Brun
> 
> On Wed, 16 Jan 2002, Christian Holm Christensen wrote:
> 
> > Hi Karl, 
> > 
> > On Wed, 16 Jan 2002 18:07:57 +0100 (MET)
> > "K. Hauschild" <karlhaus@hep.saclay.cea.fr> wrote
> > concerning "[ROOT] reduction":
> > > Hi All,
> > > 
> > 
> > > I want to make my ROOT data files smaller: here is what I do, but I
> > > would like to know if there is a more space-efficient way of storing
> > > the data.  I have the following class to store/access data for a
> > > particular detector type.
> > >
> > > //Class for CsI Detectors
> > > class DetectorCsI : public TObject {
> > >   
> > >  private:
> > >   UShort_t    fId;       //Detector Id
> > >   UShort_t    fPId;      //Detector Particle Id
> > >   UShort_t    fChan;     //Detector chan 
> > >   UShort_t    fTac;      //Detector TAC
> > >   
> > >  public:
> > >   DetectorCsI()  {;}
> > >   DetectorCsI(UShort_t id, UShort_t pid, UShort_t chan, UShort_t tac);
> > >   virtual ~DetectorCsI() {;}
> > >   
> > >   //setters and getters
> > >   inline UShort_t GetId()    {return fId;  }
> > >   inline UShort_t GetPId()   {return fPId; }
> > >   inline UShort_t GetChan()  {return fChan;}
> > >   inline UShort_t GetTac()   {return fTac; }
> > >   
> > >   ClassDef (DetectorCsI,1)  //CsI Detector class
> > > };
> > > 
> > 
> > > So, in this case "fCsIs_" is a large overhead. I presume this is
> > > where the number of CsI elements hit is stored per event entry. From
> > > the numbers above it would seem this is 8 bytes per event entry. In
> > > my particular case this need only be 8 bits, since I have fewer than
> > > 255 CsI detectors.  Why is the overhead needed by ROOT so large?
> > > Is there any way to define the size of "fCsIs_"?
> > >
> > > Or, am I barking up the wrong tree and there is a more space
> > > efficient way of handling this?
> > > 
> > 
> > The branch fCsIs_ is from the TClonesArray, and I guess it can't be
> > changed, and probably shouldn't. 
> > 
> > However, if you only have 255 detector elements, I guess that
> > DetectorCsI::fId need only take values from 0 to 254, so you can
> > make DetectorCsI::fId of type Byte_t (1 byte = 8 bits, giving a
> > range of 0 - 255):
> > 
> >   class DetectorCsI : public TObject {
> >   private:
> >     Byte_t      fId;       //Detector Id
> >     UShort_t    fPId;      //Detector Particle Id
> >     UShort_t    fChan;     //Detector chan 
> >     UShort_t    fTac;      //Detector TAC
> >    public:
> >     DetectorCsI()  {}
> >     DetectorCsI(UShort_t id, UShort_t pid, UShort_t chan, UShort_t tac) {
> >       // Clamp id to the Byte_t range, then copy the other fields.
> >       fId = Byte_t(id > 255 ? 255 : id); fPId=pid; fChan=chan; fTac=tac; }
> >     virtual ~DetectorCsI() {}
> >     inline UShort_t GetId()    { return (UShort_t)fId;  }
> >     inline UShort_t GetPId()   { return fPId; }
> >     inline UShort_t GetChan()  { return fChan;}
> >     inline UShort_t GetTac()   { return fTac; }    
> >     ClassDef (DetectorCsI,1)  //CsI Detector class
> >   };
> > 
> > That saves you 1 byte per entry in the TClonesArray, which comes to
> > 255 bytes if all detectors fire, much more than the possible 7 bytes
> > from making fCsIs_ a Byte_t instead of a Double_t.  And if possible,
> > you can apply this to fPId, fChan, and fTac.  
> > 
> > If you only store one DetectorCsI object per tree fill, don't put it
> > in a TClonesArray (doesn't make sense really) but put it on a separate
> > branch.  
> > 
> > Also, if you don't use TRef or TRefArray, you can use
> > TObject::fUniqueID to hold fPId, fTac, or fChan, giving you another 2
> > bytes.  
> > 
> > Finally, ROOT does compress integer types by default, so you're
> > probably already as low as you can get.  Check out the manual for more
> > on how to set different compression schemes. 
> > 
> > Hope that helps. 
> > 
> > Yours, 
> > 
> > Christian Holm Christensen -------------------------------------------
> > Address: Sankt Hansgade 23, 1. th.           Phone:  (+45) 35 35 96 91 
> >          DK-2200 Copenhagen N                Cell:   (+45) 28 82 16 23
> >          Denmark                             Office: (+45) 353  25 305 
> > Email:   cholm@nbi.dk                        Web:    www.nbi.dk/~cholm
> > 
> 

==========================================================================

CEA Saclay, DAPNIA/SPhN                Phone  : (33) 01 69 08 7553
Bat 703 - l'Orme des Merisiers         Fax    : (33) 01 69 08 7584
F-91191 Gif-sur-Yvette                 E-mail :  khauschild@cea.fr
France                                           karl_hauschild@yahoo.co.uk
                                       WWW: http://www-dapnia.cea.fr/Sphn


