Re: [ROOT] Optimizing performance in Windows

From: Valeri Fine (fine@bnl.gov)
Date: Mon Feb 25 2002 - 19:48:27 MET


> You should obtain a performance twice as good as on my small laptop.
> If you run in optimized mode, it would be interesting to find out
> if the bad performance is due to Windows.

  He used the non-optimized version. The optimized version  gives the 
  expected factor 2 (on my laptop)

              Cheers, Valeri

> 
> Rene Brun
> 
> 
> Ed Oltman wrote:
> > 
> > Hi Rene,
> > 
> > The root file I am processing is 133,440 KB.  Here's what the ntuple looks
> > like:
> > 
> > root [1] nt.Print()
> > ****************************************************************************
> > **
> > *Tree    :nt        : Raw
> > *
> > *Entries : 10009775 : Total =       361019458 bytes  File  Size =  137536093
> > *
> > *        :          : Tree compression factor =   2.63
> > *
> > ****************************************************************************
> > **
> > *Br    0 :x         :
> > *
> > *Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   20655625
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.94
> > *
> > *...........................................................................
> > .*
> > *Br    1 :y         :
> > *
> > *Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   20856422
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.92
> > *
> > *...........................................................................
> > .*
> > *Br    2 :tof       :
> > *
> > *Entries : 10009775 : Total  Size=   40090988 bytes  File Size  =   24044984
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.67
> > *
> > *...........................................................................
> > .*
> > *Br    3 :z         :
> > *
> > *Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   12689049
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression=   3.16
> > *
> > *...........................................................................
> > .*
> > *Br    4 :v         :
> > *
> > *Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =     157586
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression= 254.42
> > *
> > *...........................................................................
> > .*
> > *Br    5 :delta     :
> > *
> > *Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =    9622126
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression=   4.17
> > *
> > *...........................................................................
> > .*
> > *Br    6 :pulse     :
> > *
> > *Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   14472837
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression=   2.77
> > *
> > *...........................................................................
> > .*
> > *Br    7 :vap       :
> > *
> > *Entries : 10009775 : Total  Size=   40090988 bytes  File Size  =     151613
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression= 264.43
> > *
> > *...........................................................................
> > .*
> > *Br    8 :erate     :
> > *
> > *Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   34702827
> > *
> > *Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.16
> > *
> > *...........................................................................
> > .*
> > root [2]
> > 
> > To spin through the ntuple once it takes about 1 minute of real time (54 sec
> > CPU)
> > 
> > When I create an output ntuple, it takes about 3 minutes real time (2:30
> > CPU).  Root
> > page faults about 400-450 times/sec and the memory increases by about 1.6
> > MB/sec. Once the
> > output ntuple was made the process size was 320 MB (originally 40 MB)  The
> > output ntuple
> > looks like:
> > 
> > root [9] ntCorr->Print()
> > ****************************************************************************
> > **
> > *Tree    :ntCorr    : 1008CC16
> > *
> > *Entries :  8595694 : Total =       275456000 bytes  File  Size =          0
> > *
> > *        :          : Tree compression factor =   1.00
> > *
> > ****************************************************************************
> > **
> > *Br    0 :x         :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > *Br    1 :y         :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > *Br    2 :z         :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > *Br    3 :tofc      :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > *Br    4 :tofr      :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > *Br    5 :v         :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > *Br    6 :erate     :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > *Br    7 :bv        :
> > *
> > *Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
> > *
> > *Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > 
> > When I would spin through this (memory resident) ntuple (15% fewer entries
> > than input)
> > it only took 11 seconds (CPU and real)..
> > 
> > Here is the output from ROOTSYS\test\event.exe
> > [I have a Dell Precision 330: P4 1.5 GHz w/1GB RAM]
> > 
> > ==> Step 1
> > C:\ROOT\test>Event 2000 0 0 1
> > event:0, rtime=0.032000 s
> > event:100, rtime=0.609001 s
> > event:200, rtime=0.608999 s
> > event:300, rtime=0.610001 s
> > event:400, rtime=0.608999 s
> > event:500, rtime=0.610001 s
> > event:600, rtime=0.608999 s
> > event:700, rtime=0.608999 s
> > event:800, rtime=0.610001 s
> > event:900, rtime=0.593000 s
> > event:1000, rtime=0.610001 s
> > event:1100, rtime=0.608999 s
> > event:1200, rtime=0.610001 s
> > event:1300, rtime=0.608999 s
> > event:1400, rtime=0.594000 s
> > event:1500, rtime=0.609001 s
> > event:1600, rtime=0.608999 s
> > event:1700, rtime=0.610001 s
> > event:1800, rtime=0.608999 s
> > event:1900, rtime=0.610001 s
> > ****************************************************************************
> > **
> > *Tree    :T         : An example of a ROOT tree
> > *
> > *Entries :     2000 : Total =       113596283 bytes  File  Size =  113596283
> > *
> > *        :          : Tree compression factor =   1.00
> > *
> > ****************************************************************************
> > **
> > *Br    0 :event     :
> > *
> > *Entries :     2000 : Total  Size=  113560552 bytes  File Size  =  113560552
> >  *
> > *Baskets :     2000 : Basket Size=      64000 bytes  Compression=   1.00
> > *
> > *...........................................................................
> > .*
> > 
> > 2000 events and 113414552 bytes processed.
> > RealTime=12.282000 seconds, CpuTime=12.000000 seconds
> > compression level=0, split=0, arg4=1
> > You write 9.234209 Mbytes/Realtime seconds
> > You write 9.451213 Mbytes/Cputime seconds
> > 
> > ==>Step 2
> > C:\ROOT\test>Event 2000 0 0 20
> > event:0, rtime=0.063000 s
> > event:100, rtime=0.202999 s
> > event:200, rtime=0.203001 s
> > event:300, rtime=0.202999 s
> > event:400, rtime=0.187000 s
> > event:500, rtime=0.204000 s
> > event:600, rtime=0.202999 s
> > event:700, rtime=0.187000 s
> > event:800, rtime=0.202999 s
> > event:900, rtime=0.188002 s
> > event:1000, rtime=0.202999 s
> > event:1100, rtime=0.203001 s
> > event:1200, rtime=0.188000 s
> > event:1300, rtime=0.202999 s
> > event:1400, rtime=0.203001 s
> > event:1500, rtime=0.186998 s
> > event:1600, rtime=0.203001 s
> > event:1700, rtime=0.204000 s
> > event:1800, rtime=0.187000 s
> > event:1900, rtime=0.202999 s
> > 
> > 2000 events and 110950706 bytes processed.
> > RealTime=4.030998 seconds, CpuTime=4.000000 seconds
> > You read 27.524375 Mbytes/Realtime seconds
> > You read 27.737677 Mbytes/Cputime seconds
> > 
> > ==>Step 3
> > C:\ROOT\test>Event 2000 0 0 20
> > event:0, rtime=0.063002 s
> > event:100, rtime=0.202999 s
> > event:200, rtime=0.203001 s
> > event:300, rtime=0.202999 s
> > event:400, rtime=0.188000 s
> > event:500, rtime=0.203001 s
> > event:600, rtime=0.186998 s
> > event:700, rtime=0.203001 s
> > event:800, rtime=0.188000 s
> > event:900, rtime=0.203001 s
> > event:1000, rtime=0.202999 s
> > event:1100, rtime=0.188000 s
> > event:1200, rtime=0.203001 s
> > event:1300, rtime=0.202999 s
> > event:1400, rtime=0.187000 s
> > event:1500, rtime=0.204000 s
> > event:1600, rtime=0.187000 s
> > event:1700, rtime=0.202999 s
> > event:1800, rtime=0.188000 s
> > event:1900, rtime=0.203001 s
> > 
> > 2000 events and 110950706 bytes processed.
> > RealTime=4.016001 seconds, CpuTime=4.000000 seconds
> > You read 27.627163 Mbytes/Realtime seconds
> > You read 27.737677 Mbytes/Cputime seconds
> > 
> > I hope this information is helpful..
> > 
> > Ed
> > 
> > > -----Original Message-----
> > > From: Rene Brun [mailto:brun@pcbrun.cern.ch]
> > > Sent: Saturday, February 23, 2002 2:48 AM
> > > To: Ed Oltman
> > > Cc: Roottalk@Pcroot. Cern. Ch
> > > Subject: Re: [ROOT] Optimizing performance in Windows
> > >
> > >
> > > Hi Ed,
> > >
> > > You do not mention the size of your file nor the time it takes
> > > to process the file (Real and Cpu time). To help me to understand
> > > your problem, could you run the standard ROOT test program
> > > in $ROOTSYS/test?
> > > cd to this directory. Build Event (make or nmake)
> > > then run the following sequence
> > >   Event 2000 0 0 1
> > >   Event 2000 0 0 20
> > >   Event 2000 0 0 20
> > >
> > > Send me the output of the 3 steps.
> > > The first step should produce a 113 Mbytes file
> > > The following steps read the file.
> > > To give you an indication, here is the output on my small laptop
> > > (DELL P III 600 Mhz with 256 Mbytes RAM) running Linux.
> > >
> > > ==>step 1
> > > (pcnotebrun) #[182] Event 2000 0 0 1
> > > event:0, rtime=0.640000 s
> > > event:100, rtime=0.720000 s
> > > event:200, rtime=0.710000 s
> > > event:300, rtime=0.730000 s
> > > event:400, rtime=0.710000 s
> > > event:500, rtime=0.720000 s
> > > event:600, rtime=0.730000 s
> > > event:700, rtime=0.730000 s
> > > event:800, rtime=0.730000 s
> > > event:900, rtime=0.740000 s
> > > event:1000, rtime=0.720000 s
> > > event:1100, rtime=0.740000 s
> > > event:1200, rtime=0.740000 s
> > > event:1300, rtime=0.740000 s
> > > event:1400, rtime=0.740000 s
> > > event:1500, rtime=0.750000 s
> > > event:1600, rtime=0.740000 s
> > > event:1700, rtime=0.740000 s
> > > event:1800, rtime=0.720000 s
> > > event:1900, rtime=0.740000 s
> > > ******************************************************************
> > > ************
> > > *Tree    :T         : An example of a ROOT tree
> > > *
> > > *Entries :     2000 : Total =       113630291 bytes  File  Size =
> > > 113630291 *
> > > *        :          : Tree compression factor =   1.00
> > > *
> > > ******************************************************************
> > > ************
> > > *Br    0 :event     :
> > > *
> > > *Entries :     2000 : Total  Size=  113594552 bytes  File Size  =
> > > 113594552 *
> > > *Baskets :     2000 : Basket Size=      64000 bytes  Compression=   1.00
> > > *
> > > *.................................................................
> > > ...........*
> > >
> > > 2000 events and 113448552 bytes processed.
> > > RealTime=18.320000 seconds, CpuTime=14.470000 seconds
> > > compression level=0, split=0, arg4=1
> > > You write 6.192607 Mbytes/Realtime seconds
> > > You write 7.840260 Mbytes/Cputime seconds
> > >
> > > ==step 2
> > > (pcnotebrun) #[183] Event 2000 0 0 20
> > > event:0, rtime=0.070000 s
> > > event:100, rtime=0.250000 s
> > > event:200, rtime=0.250000 s
> > > event:300, rtime=0.220000 s
> > > event:400, rtime=0.230000 s
> > > event:500, rtime=0.230000 s
> > > event:600, rtime=0.500000 s
> > > event:700, rtime=1.150000 s
> > > event:800, rtime=0.240000 s
> > > event:900, rtime=0.250000 s
> > > event:1000, rtime=0.440000 s
> > > event:1100, rtime=0.460000 s
> > > event:1200, rtime=0.310000 s
> > > event:1300, rtime=0.260000 s
> > > event:1400, rtime=0.450000 s
> > > event:1500, rtime=0.440000 s
> > > event:1600, rtime=0.240000 s
> > > event:1700, rtime=0.520000 s
> > > event:1800, rtime=0.460000 s
> > > event:1900, rtime=0.510000 s
> > >
> > > 2000 events and 113448552 bytes processed.
> > > RealTime=8.510000 seconds, CpuTime=4.390000 seconds
> > > You read 13.331205 Mbytes/Realtime seconds
> > > You read 25.842495 Mbytes/Cputime seconds
> > >
> > >
> > > ==>step 3
> > > (pcnotebrun) #[185] Event 2000 0 0 20
> > > event:0, rtime=0.070000 s
> > > event:100, rtime=0.210000 s
> > > event:200, rtime=0.190000 s
> > > event:300, rtime=0.180000 s
> > > event:400, rtime=0.190000 s
> > > event:500, rtime=0.190000 s
> > > event:600, rtime=0.190000 s
> > > event:700, rtime=0.190000 s
> > > event:800, rtime=0.180000 s
> > > event:900, rtime=0.190000 s
> > > event:1000, rtime=0.190000 s
> > > event:1100, rtime=0.190000 s
> > > event:1200, rtime=0.180000 s
> > > event:1300, rtime=0.190000 s
> > > event:1400, rtime=0.190000 s
> > > event:1500, rtime=0.180000 s
> > > event:1600, rtime=0.200000 s
> > > event:1700, rtime=0.180000 s
> > > event:1800, rtime=0.190000 s
> > > event:1900, rtime=0.190000 s
> > >
> > > 2000 events and 113448552 bytes processed.
> > > RealTime=3.850000 seconds, CpuTime=3.750000 seconds
> > > You read 29.467157 Mbytes/Realtime seconds
> > > You read 30.252948 Mbytes/Cputime seconds
> > >
> > > Could you also send me the result of TTree::Print on your ntuple?
> > >
> > > Rene Brun
> > >
> > > On Fri, 22 Feb 2002, Ed Oltman wrote:
> > >
> > > > Hello,
> > > >
> > > >   I would like to know if anybody has any advice on performance
> > > optimizing
> > > > on Win2K - I am using qtRoot "version 3.02/06 25 January 2002".  My
> > > > application analyzed a root file which consists of a TNtuple.
> > > The analysis
> > > > requires spinning though the TNtuple multiple times.  The
> > > TNtuple has 8-10
> > > > branches and can contain up to 10's of millions of entries.
> > > Note: I have a
> > > > gigabyte of physical memory.  I have 2 questions:
> > > >
> > > > 1. When I first spin through the ntuple, it takes a long time -
> > > I can see
> > > > that my root process is doing lots of I/O - fine.  when I do a
> > > .ls I see the
> > > > input ntuple "in memory" that is, it shows up on a line
> > > starting w/ OBJ:.
> > > > However my process memory is about the same size as it was
> > > before I read in
> > > > the TNtuple.  When I spin through the TNtuple subsequent times, it takes
> > > > just as long - I my process once again logs lots of I/O.  How
> > > can I read in
> > > > the ntuple in such a way that it lives in memory?
> > > >
> > > > 2.  At some point, I generate an output TNtuple.  When I do
> > > this, my root
> > > > process starts page faulting like crazy but eventually the
> > > process memory
> > > > size creeps up to the required size.  [Note: when I spin through the
> > > > TNtuple, its quite a bit faster!]  How can I configure root to
> > > use start out
> > > > with a larger allocated memory so that I don't need to page fault.
> > > >
> > > > Thanks..
> > > >
> > > > Ed Oltman
> > > >
> > > >
> > >
> > >
> > >
> > >
> 



This archive was generated by hypermail 2b29 : Sat Jan 04 2003 - 23:50:42 MET