overhead of ROOT I/O?

From: Pasha Murat (murat@cdfsga.fnal.gov)
Date: Sun Oct 26 1997 - 06:20:18 MET


	Hello,

after experimenting a little with ROOT I/O I ran into a problem which
might be interesting for everybody on this list. The full source code of
an example that reproduces it (including the makefile) is enclosed below.

What this example is doing:
--------------------------
I'm writing out 10000 events; each event (EVENT_AB) is a very simple object.
It contains 2 pointers to dynamically allocated objects (A and B);
each of A and B consists of 2 integers plus a TObject (see header files below).
The event holds A and B by pointer so that the events can be written
out in split mode.

I intentionally use a small buffer (2000 bytes), which is nevertheless much
larger than the event size, so the buffer size shouldn't be a factor in this
exercise.

Next, I'm trying to estimate the expected size of the output file.
The size of each A and B is 12 bytes (2*4+sizeof(TObject)=12);
the size of the event header (EVENT_AB) is also 12 bytes:
4+4+sizeof(TObject) = 12 bytes (the last number may be even smaller).

With 10000 events written to it, the output file should be about
10000*(12+12+12) = 360000 bytes in size. ROOT, however, creates a file
of 712959 bytes, i.e. almost 2 times larger!
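
The estimate above can be written out as a small stand-alone C++ sketch
(no ROOT needed). The 12-byte figures are the assumptions stated in the
text, not measured values, and the names expectedFileSize and
overheadFactor are made up for this illustration.

```cpp
#include <cassert>

// Assumed per-object sizes in bytes, taken from the estimate in the text
// (these are assumptions, not values measured with sizeof).
constexpr long kBytesA      = 12;  // A: 2 integers + TObject part
constexpr long kBytesB      = 12;  // B: same layout as A
constexpr long kBytesHeader = 12;  // EVENT_AB itself

// Expected payload size for nEvents events, ignoring any I/O overhead.
constexpr long expectedFileSize(long nEvents) {
  return nEvents * (kBytesA + kBytesB + kBytesHeader);
}

// Ratio of the size actually observed on disk to the expected size.
double overheadFactor(long actualBytes, long expectedBytes) {
  assert(expectedBytes > 0);
  return static_cast<double>(actualBytes) / static_cast<double>(expectedBytes);
}
```

For 10000 events expectedFileSize(10000) gives 360000, and
overheadFactor(712959, 360000) comes out at about 1.98.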

Here is what the output says:

/cdf/upgrade/tracking/murat/test_root>ab.exe
TFile Writing Name=ab.root Title=ab test
************************************************************************************
*Tree    :AB        : AB tree                                                      *
*Entries :    10000 : Total  Size =    712644 bytes  File  Size =     712644 bytes *
*        :          : Tree compression factor =   1.00                             *
************************************************************************************
*Branch  :Event     : Event                                                        *
*Entries :    10000 : BranchObject (see below)                                     *
*..................................................................................*
*Branch  :fA        : fA                                                           *
*Entries :    10000 : Total  Size =    290002 bytes  File Size  =     290002 bytes *
*Baskets :      166 : Basket Size =      2000 bytes  EvOffsetLen=       1000       *
*        :          : Branch compression factor =   1.00                           *
*..................................................................................*
*Branch  :fB        : fB                                                           *
*Entries :    10000 : Total  Size =    290002 bytes  File Size  =     290002 bytes *
*Baskets :      166 : Basket Size =      2000 bytes  EvOffsetLen=       1000       *
*        :          : Branch compression factor =   1.00                           *
*..................................................................................*
*Branch  :fUniqueID : fUniqueID                                                    *
*Entries :    10000 : Total  Size =     39960 bytes  File Size  =      39960 bytes *
*Baskets :       20 : Basket Size =      2000 bytes  EvOffsetLen=          0       *
*        :          : Branch compression factor =   1.00                           *
*..................................................................................*
*Branch  :fBits     : fBits                                                        *
*Entries :    10000 : Total  Size =     39960 bytes  File Size  =      39960 bytes *
*Baskets :       20 : Basket Size =      2000 bytes  EvOffsetLen=          0       *
*        :          : Branch compression factor =   1.00                           *
*..................................................................................*
/cdf/upgrade/tracking/murat/test_root>dir -l *.root
-rw-r--r--    1 murat    cdfupg    456479 Aug 29 11:19 Event.root
-rw-r--r--    1 murat    cdfupg    712959 Oct 25 23:36 ab.root

*********************

-	my 1st observation is that the size of each of the fA and fB branches
	(290002) is about 2.4 times larger than the 12*10000 = 120000 bytes
	one would expect

-	the size of the fBits and fUniqueID branches (constituents of TObject)
	is about right: 4*10000 = 40000 and the branch size is 39960

-	then there is some missing component which accounts for
	712644-2*(290002+39960) = 52720 bytes (about 5 bytes per event)
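
The bookkeeping in the three observations above can be reproduced with a
short stand-alone sketch. The numbers are copied from the Tree->Print()
output; the struct and function names are hypothetical and exist only for
this illustration.

```cpp
#include <cassert>

// Sizes in bytes as reported by Tree->Print() in the output above.
struct TreeSizes {
  long total;      // "Total Size" of the whole tree
  long fA;         // branch fA
  long fB;         // branch fB
  long fUniqueID;  // branch fUniqueID
  long fBits;      // branch fBits
};

constexpr TreeSizes kReported = {712644, 290002, 290002, 39960, 39960};

// Sum used in the text: 2*(290002+39960).
constexpr long accountedBytes(const TreeSizes& s) {
  return s.fA + s.fB + s.fUniqueID + s.fBits;
}

// What is left of the tree total after subtracting that sum.
constexpr long unaccountedBytes(const TreeSizes& s) {
  return s.total - accountedBytes(s);
}

// Average number of bytes one entry occupies in a given branch.
double bytesPerEntry(long branchBytes, long nEntries) {
  assert(nEntries > 0);
  return static_cast<double>(branchBytes) / static_cast<double>(nEntries);
}
```

With these inputs unaccountedBytes(kReported) is 52720, and
bytesPerEntry(kReported.fA, 10000) is about 29 bytes per entry, against
the 12 bytes assumed in the estimate.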


The resulting question is: where is the overhead (a factor of 1.98) coming
from? It seems to be unacceptably large...

	I'd appreciate any comments and hope that I did something wrong,

					thanks, Pasha.




****************************************************** 
source code and makefile of the example which produced the output above
*****************************************************



------------------------------ A.hh
#ifndef __A_HH__
#define __A_HH__
#include "TObject.h"

class A: public TObject {
public:
  Int_t Word[2];

  A();
  virtual ~A();
  ClassDef(A,1)
};

#endif
----------------------------- B.hh
#ifndef __B_HH__
#define __B_HH__
#include "TObject.h"

class B: public TObject {
public:
  Int_t Word[2];

  B();
  virtual ~B();
  ClassDef(B,1)
};

#endif
----------------------------- EVENT_AB.hh
#ifndef __EVENT_AB_HH__
#define __EVENT_AB_HH__
#include "A.hh"
#include "B.hh"


class EVENT_AB : public TObject {
  A*     fA;
  B*     fB;
public:
  EVENT_AB();
  virtual ~EVENT_AB();
  void    init();
  ClassDef(EVENT_AB,1)
};
#endif
------------------------------- A.cc
#include "A.hh"

ClassImp(A)

A::A() {
  Word[0] = 101;
  Word[1] = 202;
}

A::~A() {
}
-------------------------------- B.cc
#include "B.hh"

ClassImp(B)


B::B() {
  Word[0] = 101;
  Word[1] = 202;
}


B::~B() {
}
---------------------------------- EVENT_AB.cc
#include "EVENT_AB.hh"

ClassImp(EVENT_AB)

EVENT_AB::EVENT_AB() {
  fA = 0;
  fB = 0;
}


void EVENT_AB::init() {
  if (fA) delete fA;
  fA = new A();
  if (fB) delete fB;
  fB = new B();
}

EVENT_AB::~EVENT_AB() {
  if (fA) delete fA;
  if (fB) delete fB;
}
---------------------------------- test_write1.cc
// -*- Mode: C++ -*-
//------------------------------------------------------------------------------
//  Oct 03 1997 P.Murat
//
//  revision history :
//  ------------------ 
// *0000 Oct 14 1997 P.Murat: creation date
//------------------------------------------------------------------------------
#ifdef __GNUG__
#pragma implementation
#endif


#include <stdlib.h>

#include "TROOT.h"
#include "TFile.h"
#include "TRandom.h"
#include "TTree.h"
#include "TBranch.h"
#include "TClonesArray.h"
#include "TStopwatch.h"

#include "EVENT_AB.hh"

EVENT_AB*      Event;
TROOT*         Root;
TFile*         File;
TTree*         Tree;
TBranch*       BranchAB;


int main() {

  int split, bufsize;
  int comp = 0;

  Root = ::new TROOT("root","root");

  File = new TFile("ab.root","RECREATE","ab test");

  File->SetCompressionLevel(comp);
  Tree = new TTree("AB","AB tree");

					// autosave when 1 Mbyte written
  Tree->SetAutoSave(1000000);
  bufsize = 2000;
  split   = 1;
  Event     = new EVENT_AB();
  BranchAB  = Tree->Branch("Event","EVENT_AB",&Event,bufsize,split);

  for (int i=0; i<10000; i++) {
    Event->init();
    Tree->Fill();
  }

  File->Write();
  Tree->Print();
					// tree should be deleted BEFORE
					// the file is closed
  delete Tree;
  delete Event;
  File->Close();
  delete File;
  delete Root;
}
------------------------------------------- Makefile
ROOTLIBS      = -L$(ROOTSYS)/lib -lBase -lRint -lCint -lClib -lCont -lFunc -lGraf \
                -lGraf3d -lHist -lHtml -lMeta -lMinuit -lNet -lPostscript \
                -lProof -lTree -lUnix -lZip
ROOTGLIBS     = -lGpad -lGX11 -lMotif -lWidgets -lX3d

# SGI with GCC
CXX           = gcc
CXXFLAGS      = -fsigned-char -fPIC -w -g -I$(ROOTSYS)/include
LD            = gcc
LDFLAGS       = -Wl,-u,__builtin_new -Wl,-u,__builtin_delete -Wl,-u,__nw__FUiPv
SOFLAGS       = -Wl,-soname,libEvent.so -shared
LIBS          = $(ROOTLIBS) -lg++ -lm -ldl
#GLIBS         = $(ROOTLIBS) $(ROOTGLIBS) -lXm -lXt -lX11 -lg++ -lm -lPW -ldl

#  don't need -ldl on IRIX 6.2

GLIBS         = $(ROOTLIBS) $(ROOTGLIBS) -lXm -lXt -lX11 -lg++ -lm -lPW 

w1:

	rootcint -f ab_cint.cc -c A.hh B.hh EVENT_AB.hh
	gcc -o ab.exe $(CPPFLAGS) $(CXXFLAGS) A.cc B.cc EVENT_AB.cc \
		ab_cint.cc test_write1.cc  $(LDFLAGS) $(GLIBS)
********************************************************************************
********************************************************************************
/cdf/upgrade/tracking/murat/test_root>make w1
rootcint -f ab_cint.cc -c A.hh B.hh EVENT_AB.hh
Note: operator new() masked c
gcc -o ab.exe  -fsigned-char -fPIC -w -g -I/cdf/upgrade/root/v1_03/include A.cc B.cc EVENT_AB.cc \
        ab_cint.cc test_write1.cc  -Wl,-u,__builtin_new -Wl,-u,__builtin_delete -Wl,-u,__nw__FUiPv -L/cdf/upgrade/root/v1_03/lib -lBase -lRint -lCint -lClib -lCont -lFunc -lGraf -lGraf3d -lHist -lHtml -lMeta -lMinuit -lNet -lPostscript -lProof -lTree -lUnix -lZip -lGpad -lGX11 -lMotif -lWidgets -lX3d -lXm -lXt -lX11 -lg++ -lm -lPW 
10.251u 1.871s 0:18.02 67.2% 0+0k 344+36io 96pf+0w


