Hi,
Please find attached a new Makefile.linuxpgcc for Linux i386 with glibc
and PGCC ( PGCC i586/i686 RPMs can be found at http://www.foyer.se/PGCC.html ).
I'm not sure about some of the OPT flags, mainly -funroll-all-loops. I think
someone reported some time ago that it gave him 2 or 3 times better
performance, but I cannot find that mail in the roottalk archives ( maybe it
was about another utility :-) ). I'd like to ask everyone who has
experience with different "optimization" flags to share it with me.
I ran some tests and benchmarks, which seem to run fine ( please find
attached benchmarks.C.out, which was produced on a dual PII/450 machine with
256MB RAM, RH5.2, 2.0.36 kernel, PGCC 1.1.1 compiler ). If you have a
similar machine, please compare these results with your own ( I mean mainly
the speed ) and tell me whether they are better or worse ( the authors of
PGCC claim it should produce code that is up to 30% faster ).
I could not find any rules to compile/link g2root and h2root.
Could someone please provide them?
Also, make did NOT (re)create makedepend and rmkdepend, which are in the
$ROOTSYS/bin subdirectory ( I think the sources for these two utilities are
also not included ).
( The case of rootd is known to me: it comes as a separate tar.gz source. )
Thanks in advance,
Best regards,
Jacek.
# Makefile of ROOT for Linux ( RH5.2 i386 ) with glibc and PGCC
# ( PGCC i586/i686 RPMs can be found at http://www.foyer.se/PGCC.html )
PGCC = /opt/pgcc
PLATFORM = linux
CXX = g++
CC = pgcc
CXXFLAGS = -Wall -fPIC -DR__GLIBC -fno-rtti -fno-exceptions
CFLAGS = -Wall -fPIC -DR__GLIBC
CINTCXXFLAGS = -Wall -fPIC -fno-rtti -fno-exceptions -DG__REGEXP \
-DG__UNIX -DG__SHAREDLIB -DG__OSFDLL -DG__ROOT -DG__REDIRECTIO
CINTCFLAGS = -Wall -fPIC -DG__REGEXP -DG__UNIX -DG__SHAREDLIB \
-DG__OSFDLL -DG__ROOT -DG__REDIRECTIO
#OPT = -g
OPT = -O6 -mpentiumpro -mstack-align-double -funroll-all-loops
NOOPT =
LD = g++
LDFLAGS = $(OPT) -Wl,-rpath,$(PGCC)/lib:$(ROOTSYS)/lib
SOFLAGS = -shared -Wl,-soname,
SOEXT = so
SYSLIBS = -lm -ldl -rdynamic
SYSXLIBS = -L/usr/X11R6/lib -lX11 -lm -ldl -rdynamic
XLIBS = $(ROOTSYS)/lib/libXpm.a -L/usr/X11R6/lib -lX11
CILIBS = -lm -ltermcap -lbsd -ldl -rdynamic
##### MACROS and TARGETS #####
include Make-macros
##### DEPENDENCIES #####
include Make-depend
[holeczek@p2a02 tutorials]$ root
*******************************************
*                                         *
*        W E L C O M E  to  R O O T       *
*                                         *
*   Version   2.21/07      8 March 1999   *
*                                         *
*  You are welcome to visit our Web site  *
*           http://root.cern.ch           *
*                                         *
*******************************************
CINT/ROOT C/C++ Interpreter version 5.13.90, Feb 18 1999
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
Welcome to the ROOT tutorials
Type ".x demos.C" to get a toolbar from which to execute the demos
Type ".x demoshelp.C" to see the help window
root [0] .x benchmarks.C
hsimple : Real Time = 2.35 seconds Cpu Time = 1.73 seconds
hsum : Real Time = 1.34 seconds Cpu Time = 1.06 seconds
Object statistics
class                    cnt   on heap    size   total size    heap size
============================================================================
TObject                   39        39      12          468          468
TAxis                      3         0     100          300            0
THashList                  2         2      44           88           88
THashTable                 2         2      36           72           72
TUnixSystem                1         1     252          252          252
TOrdCollection             4         4      40          160          160
TEnv                       1         1      24           24           24
TROOT                      1         0     236          236            0
TClassTable                1         1      12           12           12
TObjectTable               1         1      24           24           24
----------------------------------------------------------------------------
Total:                    55        51     780         1636         1100
============================================================================
fillrandom: Real Time = 0.25 seconds Cpu Time = 0.07 seconds
TFile** fillrandom.root
TFile* fillrandom.root
KEY: TFormula form1;1 abs(sin(x)/x)
KEY: TF1 sqroot;1 x*gaus(0) + [3]*form1
KEY: TH1F h1f;1 Test random numbers
sqroot : x*gaus(0) + [3]*form1 Ndim= 1, Npar= 4, Noper= 11
fExpr[0] = x fOper = 110000
fExpr[1] = G fOper = 2001
fExpr[2] = * fOper = 3
fExpr[3] = [3] fOper = 104
fExpr[4] = x fOper = 110000
fExpr[5] = sin fOper = 11
fExpr[6] = x fOper = 110000
fExpr[7] = / fOper = 4
fExpr[8] = abs fOper = 41
fExpr[9] = * fOper = 3
fExpr[10] = + fOper = 1
Par 0 p0 = 10
Par 1 p1 = 4
Par 2 p2 = 1
Par 3 p3 = 20
TH1.Print Name= Func, Total sum= 1514.36
FCN=191.938 FROM MIGRAD STATUS=CONVERGED 141 CALLS 142 TOTAL
EDM=0.000261941 STRATEGY= 1 ERROR MATRIX ACCURATE
 EXT PARAMETER                                      STEP         FIRST
 NO.    NAME           VALUE         ERROR          SIZE    DERIVATIVE
   1    p0       3.29817e+01   6.12182e-01   2.92273e-03   5.48776e-03
   2    p1       3.98912e+00   2.73706e-02   9.44425e-05   9.98335e-01
   3    p2       1.00017e+00   2.18738e-02   6.01098e-05   3.01152e-01
   4    p3       6.29884e+01   1.40686e+00   8.60789e-03  -1.65665e-03
fit1 : Real Time = 0.27 seconds Cpu Time = 0.16 seconds
TFile** hsimple.root
TFile* hsimple.root
KEY: TH1F hpx;1 This is the px distribution
KEY: TH2F hpxpy;1 py vs px
KEY: TProfile hprof;1 Profile of pz versus px
KEY: TNtuple ntuple;1 Demo ntuple
i 0 0.000000 1.986693
i 1 0.100000 2.955202
i 2 0.200000 3.894183
i 3 0.300000 4.794255
i 4 0.400000 5.646425
i 5 0.500000 6.442177
i 6 0.600000 7.173561
i 7 0.700000 7.833269
i 8 0.800000 8.414710
i 9 0.900000 8.912073
i 10 1.000000 9.320391
i 11 1.100000 9.635582
i 12 1.200000 9.854497
i 13 1.300000 9.974950
i 14 1.400000 9.995736
i 15 1.500000 9.916648
i 16 1.600000 9.738476
i 17 1.700000 9.463000
i 18 1.800000 9.092975
i 19 1.900000 8.632093
tornado : Real Time = 0.20 seconds Cpu Time = 0.14 seconds
TNode count=1000 name=TF1N1
TNode count=2000 name=ST2B122
TNode count=3000 name=PR3_3
TNode count=4000 name=TF3N6
TNode count=5000 name=TF3T1
TNode count=6000 name=TF3T1
TNode count=7000 name=TF3T1
TNode count=8000 name=ST4B135
TNode count=9000 name=TF4P4
TNode count=10000 name=TF4P4
TNode count=11000 name=TF4P4
TNode count=12000 name=WF281
TNode count=13000 name=SZ3811
na49 : Real Time = 7.74 seconds Cpu Time = 7.62 seconds
geometry : Real Time = 1.05 seconds Cpu Time = 1.01 seconds
na49view : Real Time = 0.21 seconds Cpu Time = 0.10 seconds
na49view : Real Time = 0.83 seconds Cpu Time = 0.30 seconds
FCN=13.6566 FROM MIGRAD STATUS=CONVERGED 64 CALLS 65 TOTAL
EDM=4.34281e-15 STRATEGY= 1 ERROR MATRIX ACCURATE
 EXT PARAMETER                                      STEP         FIRST
 NO.    NAME           VALUE         ERROR          SIZE    DERIVATIVE
   1    p0       1.29313e+00   1.76826e-01   2.61580e-05  -4.80065e-06
   2    p1       1.60130e-03   1.77265e-02   7.08820e-06  -1.24991e-05
   3    p2       9.06693e-01   1.52986e-02   1.91372e-06  -6.30101e-05
ntuple1 : Real Time = 1.15 seconds Cpu Time = 0.96 seconds
---------------ROOT 2.21/07 benchmarks summary--------------------
hsimple : Real Time = 2.35 seconds Cpu Time = 1.73 seconds
hsum : Real Time = 1.34 seconds Cpu Time = 1.06 seconds
fillrandom: Real Time = 0.25 seconds Cpu Time = 0.07 seconds
fit1 : Real Time = 0.27 seconds Cpu Time = 0.16 seconds
tornado : Real Time = 0.20 seconds Cpu Time = 0.14 seconds
na49 : Real Time = 7.74 seconds Cpu Time = 7.62 seconds
geometry : Real Time = 1.05 seconds Cpu Time = 1.01 seconds
na49view : Real Time = 0.83 seconds Cpu Time = 0.30 seconds
ntuple1 : Real Time = 1.15 seconds Cpu Time = 0.96 seconds
TOTAL : Real Time = 15.18 seconds Cpu Time = 13.05 seconds
---------------ROOT 2.21/07 benchmarks summary (in ROOTMARKS)-----
For comparison, an HP735/99 is benchmarked at 27 ROOTMARKS
hsimple = 122.02 RealMARKS, = 127.82 CpuMARKS
hsum = 122.71 RealMARKS, = 107.24 CpuMARKS
fillrandom = 99.36 RealMARKS, = 111.86 CpuMARKS
fit1 = 142.00 RealMARKS, = 128.25 CpuMARKS
tornado = 140.40 RealMARKS, = 169.71 CpuMARKS
na49 = 108.42 RealMARKS, = 108.57 CpuMARKS
na49view = 91.73 RealMARKS, = 133.20 CpuMARKS
ntuple1 = 194.17 RealMARKS, = 203.62 CpuMARKS
geometry = 183.86 RealMARKS, = 164.14 CpuMARKS
MEAN = 133.85 RealMARKS, = 139.38 CpuMARKS
****************************************************
* Your machine is estimated at 136.62 ROOTMARKS *
****************************************************
root [1] .q
This is the end of ROOT -- Goodbye
[holeczek@p2a02 tutorials]$
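For anyone who wants to compare their own run against the figures above, here is a minimal Python sketch that parses the two summary blocks and recomputes the TOTAL, MEAN and estimated ROOTMARKS lines. The helper names (`parse`, `column_sums`) and the regular expressions are illustrative only. The assumption that the final estimate is the average of the RealMARKS and CpuMARKS means is inferred from the numbers printed in this run, not taken from the benchmarks.C source.

```python
import re

# Summary lines copied from the benchmark output above.
TIMINGS = """\
hsimple : Real Time = 2.35 seconds Cpu Time = 1.73 seconds
hsum : Real Time = 1.34 seconds Cpu Time = 1.06 seconds
fillrandom: Real Time = 0.25 seconds Cpu Time = 0.07 seconds
fit1 : Real Time = 0.27 seconds Cpu Time = 0.16 seconds
tornado : Real Time = 0.20 seconds Cpu Time = 0.14 seconds
na49 : Real Time = 7.74 seconds Cpu Time = 7.62 seconds
geometry : Real Time = 1.05 seconds Cpu Time = 1.01 seconds
na49view : Real Time = 0.83 seconds Cpu Time = 0.30 seconds
ntuple1 : Real Time = 1.15 seconds Cpu Time = 0.96 seconds
"""

MARKS = """\
hsimple = 122.02 RealMARKS, = 127.82 CpuMARKS
hsum = 122.71 RealMARKS, = 107.24 CpuMARKS
fillrandom = 99.36 RealMARKS, = 111.86 CpuMARKS
fit1 = 142.00 RealMARKS, = 128.25 CpuMARKS
tornado = 140.40 RealMARKS, = 169.71 CpuMARKS
na49 = 108.42 RealMARKS, = 108.57 CpuMARKS
na49view = 91.73 RealMARKS, = 133.20 CpuMARKS
ntuple1 = 194.17 RealMARKS, = 203.62 CpuMARKS
geometry = 183.86 RealMARKS, = 164.14 CpuMARKS
"""

# Hypothetical patterns written for the line shapes shown above.
TIME_RE = re.compile(
    r"^(\w+)\s*:\s*Real Time =\s*([\d.]+) seconds Cpu Time =\s*([\d.]+) seconds$"
)
MARK_RE = re.compile(r"^(\w+)\s*=\s*([\d.]+) RealMARKS, =\s*([\d.]+) CpuMARKS$")


def parse(text, pattern):
    """Return {benchmark name: (real, cpu)} for every line matching pattern."""
    rows = {}
    for line in text.splitlines():
        match = pattern.match(line.strip())
        if match:
            rows[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return rows


def column_sums(rows):
    """Sum the real and cpu columns separately."""
    real = sum(r for r, _ in rows.values())
    cpu = sum(c for _, c in rows.values())
    return real, cpu


timings = parse(TIMINGS, TIME_RE)
marks = parse(MARKS, MARK_RE)

total_real, total_cpu = column_sums(timings)
sum_real_marks, sum_cpu_marks = column_sums(marks)
mean_real = sum_real_marks / len(marks)
mean_cpu = sum_cpu_marks / len(marks)

# Assumption (inferred from this run): the estimate averages the two means.
rootmarks = (mean_real + mean_cpu) / 2

print(f"TOTAL: Real Time = {total_real:.2f} s, Cpu Time = {total_cpu:.2f} s")
print(f"MEAN : {mean_real:.2f} RealMARKS, {mean_cpu:.2f} CpuMARKS")
print(f"Estimated ROOTMARKS: {rootmarks:.2f}")
```

With the figures from this run the script reproduces the printed values: 15.18 and 13.05 seconds for the totals, 133.85 and 139.38 for the means, and 136.62 ROOTMARKS. To compare another machine or another set of OPT flags, paste that run's summary lines into `TIMINGS` and `MARKS`.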