Hi,

Please find attached a new Makefile.linuxpgcc for Linux i386 with glibc and PGCC ( PGCC i586/i686 RPMs can be found at http://www.foyer.se/PGCC.html ).

I'm not sure about some of the OPT flags, mainly -funroll-all-loops. I think someone reported some time ago that it gave him 2 or 3 times better performance, but I cannot find that mail in the roottalk archives ( maybe it was about another utility :-) ). I'd like to ask everyone who has experience with different "optimization" flags to share it with me.

I ran some tests and benchmarks, which seem to run fine ( please find attached benchmarks.C.out, which was run on a dual PII/450 machine with 256MB RAM, RH5.2, 2.0.36 kernel, PGCC 1.1.1 compiler ). If you have a similar machine, please compare these results with your own ( I mean mainly the speed ) and tell me whether they are better or worse ( the authors of PGCC claim it should produce code up to 30% faster ).

I could not find any rules to compile/link g2root and h2root. Could someone please provide them? Also, make did NOT (re)create makedepend and rmkdepend, which are in the $ROOTSYS/bin subdirectory ( I think the sources for these two utilities are also not included ). ( The case of rootd is known to me: it is a separate tar.gz source. )

Thanks in advance,
Best regards,
Jacek.

# Makefile of ROOT for Linux ( RH5.2 i386 ) with glibc and PGCC
# ( PGCC i586/i686 RPMs can be found at http://www.foyer.se/PGCC.html )

PGCC          = /opt/pgcc

PLATFORM      = linux

CXX           = g++
CC            = pgcc
CXXFLAGS      = -Wall -fPIC -DR__GLIBC -fno-rtti -fno-exceptions
CFLAGS        = -Wall -fPIC -DR__GLIBC
CINTCXXFLAGS  = -Wall -fPIC -fno-rtti -fno-exceptions -DG__REGEXP \
                -DG__UNIX -DG__SHAREDLIB -DG__OSFDLL -DG__ROOT -DG__REDIRECTIO
CINTCFLAGS    = -Wall -fPIC -DG__REGEXP -DG__UNIX -DG__SHAREDLIB \
                -DG__OSFDLL -DG__ROOT -DG__REDIRECTIO
#OPT           = -g
OPT           = -O6 -mpentiumpro -mstack-align-double -funroll-all-loops
NOOPT         =
LD            = g++
LDFLAGS       = $(OPT) -Wl,-rpath,$(PGCC)/lib:$(ROOTSYS)/lib
SOFLAGS       = -shared -Wl,-soname,
SOEXT         = so
SYSLIBS       = -lm -ldl -rdynamic
SYSXLIBS      = -L/usr/X11R6/lib -lX11 -lm -ldl -rdynamic
XLIBS         = $(ROOTSYS)/lib/libXpm.a -L/usr/X11R6/lib -lX11
CILIBS        = -lm -ltermcap -lbsd -ldl -rdynamic

##### MACROS and TARGETS #####

include Make-macros

##### DEPENDENCIES #####

include Make-depend


[holeczek@p2a02 tutorials]$ root
  *******************************************
  *                                         *
  *        W E L C O M E  to  R O O T       *
  *                                         *
  *   Version   2.21/07      8 March 1999   *
  *                                         *
  *  You are welcome to visit our Web site  *
  *          http://root.cern.ch            *
  *                                         *
  *******************************************

CINT/ROOT C/C++ Interpreter version 5.13.90, Feb 18 1999
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.

Welcome to the ROOT tutorials

Type ".x demos.C" to get a toolbar from which to execute the demos

Type ".x demoshelp.C" to see the help window

root [0] .x benchmarks.C
hsimple   : Real Time =   2.35 seconds Cpu Time =   1.73 seconds
hsum      : Real Time =   1.34 seconds Cpu Time =   1.06 seconds

Object statistics
class                 cnt    on heap     size    total size    heap size
============================================================================
TObject                39         39       12           468          468
TAxis                   3          0      100           300            0
THashList               2          2       44            88           88
THashTable              2          2       36            72           72
TUnixSystem             1          1      252           252          252
TOrdCollection          4          4       40           160          160
TEnv                    1          1       24            24           24
TROOT                   1          0      236           236            0
TClassTable             1          1       12            12           12
TObjectTable            1          1       24            24           24
----------------------------------------------------------------------------
Total:                 55         51      780          1636         1100
============================================================================

fillrandom: Real Time =   0.25 seconds Cpu Time =   0.07 seconds
TFile**         fillrandom.root
 TFile*         fillrandom.root
  KEY: TFormula form1;1   abs(sin(x)/x)
  KEY: TF1      sqroot;1  x*gaus(0) + [3]*form1
  KEY: TH1F     h1f;1     Test random numbers
 sqroot : x*gaus(0) + [3]*form1 Ndim= 1, Npar= 4, Noper= 11
 fExpr[0] = x      fOper = 110000
 fExpr[1] = G      fOper = 2001
 fExpr[2] = *      fOper = 3
 fExpr[3] = [3]    fOper = 104
 fExpr[4] = x      fOper = 110000
 fExpr[5] = sin    fOper = 11
 fExpr[6] = x      fOper = 110000
 fExpr[7] = /      fOper = 4
 fExpr[8] = abs    fOper = 41
 fExpr[9] = *      fOper = 3
 fExpr[10] = +     fOper = 1
 Par  0  p0 = 10
 Par  1  p1 = 4
 Par  2  p2 = 1
 Par  3  p3 = 20
TH1.Print Name= Func, Total sum= 1514.36
 FCN=191.938 FROM MIGRAD    STATUS=CONVERGED     141 CALLS         142 TOTAL
                     EDM=0.000261941    STRATEGY= 1      ERROR MATRIX ACCURATE
  EXT PARAMETER                                   STEP         FIRST
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
   1  p0           3.29817e+01   6.12182e-01   2.92273e-03   5.48776e-03
   2  p1           3.98912e+00   2.73706e-02   9.44425e-05   9.98335e-01
   3  p2           1.00017e+00   2.18738e-02   6.01098e-05   3.01152e-01
   4  p3           6.29884e+01   1.40686e+00   8.60789e-03  -1.65665e-03
fit1      : Real Time =   0.27 seconds Cpu Time =   0.16 seconds
TFile**         hsimple.root
 TFile*         hsimple.root
  KEY: TH1F     hpx;1     This is the px distribution
  KEY: TH2F     hpxpy;1   py vs px
  KEY: TProfile hprof;1   Profile of pz versus px
  KEY: TNtuple  ntuple;1  Demo ntuple
i 0 0.000000 1.986693
i 1 0.100000 2.955202
i 2 0.200000 3.894183
i 3 0.300000 4.794255
i 4 0.400000 5.646425
i 5 0.500000 6.442177
i 6 0.600000 7.173561
i 7 0.700000 7.833269
i 8 0.800000 8.414710
i 9 0.900000 8.912073
i 10 1.000000 9.320391
i 11 1.100000 9.635582
i 12 1.200000 9.854497
i 13 1.300000 9.974950
i 14 1.400000 9.995736
i 15 1.500000 9.916648
i 16 1.600000 9.738476
i 17 1.700000 9.463000
i 18 1.800000 9.092975
i 19 1.900000 8.632093
tornado   : Real Time =   0.20 seconds Cpu Time =   0.14 seconds
TNode count=1000 name=TF1N1
TNode count=2000 name=ST2B122
TNode count=3000 name=PR3_3
TNode count=4000 name=TF3N6
TNode count=5000 name=TF3T1
TNode count=6000 name=TF3T1
TNode count=7000 name=TF3T1
TNode count=8000 name=ST4B135
TNode count=9000 name=TF4P4
TNode count=10000 name=TF4P4
TNode count=11000 name=TF4P4
TNode count=12000 name=WF281
TNode count=13000 name=SZ3811
na49      : Real Time =   7.74 seconds Cpu Time =   7.62 seconds
geometry  : Real Time =   1.05 seconds Cpu Time =   1.01 seconds
na49view  : Real Time =   0.21 seconds Cpu Time =   0.10 seconds
na49view  : Real Time =   0.83 seconds Cpu Time =   0.30 seconds
 FCN=13.6566 FROM MIGRAD    STATUS=CONVERGED      64 CALLS          65 TOTAL
                     EDM=4.34281e-15    STRATEGY= 1      ERROR MATRIX ACCURATE
  EXT PARAMETER                                   STEP         FIRST
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
   1  p0           1.29313e+00   1.76826e-01   2.61580e-05  -4.80065e-06
   2  p1           1.60130e-03   1.77265e-02   7.08820e-06  -1.24991e-05
   3  p2           9.06693e-01   1.52986e-02   1.91372e-06  -6.30101e-05
ntuple1   : Real Time =   1.15 seconds Cpu Time =   0.96 seconds

---------------ROOT 2.21/07 benchmarks summary--------------------
hsimple   : Real Time =   2.35 seconds Cpu Time =   1.73 seconds
hsum      : Real Time =   1.34 seconds Cpu Time =   1.06 seconds
fillrandom: Real Time =   0.25 seconds Cpu Time =   0.07 seconds
fit1      : Real Time =   0.27 seconds Cpu Time =   0.16 seconds
tornado   : Real Time =   0.20 seconds Cpu Time =   0.14 seconds
na49      : Real Time =   7.74 seconds Cpu Time =   7.62 seconds
geometry  : Real Time =   1.05 seconds Cpu Time =   1.01 seconds
na49view  : Real Time =   0.83 seconds Cpu Time =   0.30 seconds
ntuple1   : Real Time =   1.15 seconds Cpu Time =   0.96 seconds
TOTAL     : Real Time =  15.18 seconds Cpu Time =  13.05 seconds

---------------ROOT 2.21/07 benchmarks summary (in ROOTMARKS)-----
   For comparison, an HP735/99 is benchmarked at 27 ROOTMARKS
hsimple     = 122.02 RealMARKS,  = 127.82 CpuMARKS
hsum        = 122.71 RealMARKS,  = 107.24 CpuMARKS
fillrandom  =  99.36 RealMARKS,  = 111.86 CpuMARKS
fit1        = 142.00 RealMARKS,  = 128.25 CpuMARKS
tornado     = 140.40 RealMARKS,  = 169.71 CpuMARKS
na49        = 108.42 RealMARKS,  = 108.57 CpuMARKS
na49view    =  91.73 RealMARKS,  = 133.20 CpuMARKS
ntuple1     = 194.17 RealMARKS,  = 203.62 CpuMARKS
geometry    = 183.86 RealMARKS,  = 164.14 CpuMARKS
MEAN        = 133.85 RealMARKS,  = 139.38 CpuMARKS

****************************************************
* Your machine is estimated at 136.62 ROOTMARKS    *
****************************************************
root [1] .q

This is the end of ROOT -- Goodbye

[holeczek@p2a02 tutorials]$
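
For anyone comparing their own benchmarks.C.out against the one above, here is a minimal sketch in Python (it is not part of the original attachments) that parses the summary lines and recomputes the TOTAL and MEAN rows. The numbers are copied from the output above. The names parse_timings, SUMMARY and MARKS are made up for this illustration. The last step assumes the final ROOTMARKS figure is the average of the mean RealMARKS and mean CpuMARKS. That is an inference from the printed numbers, not something taken from benchmarks.C itself.

```python
import re

# Summary lines copied from the benchmarks.C output above.
SUMMARY = """\
hsimple   : Real Time =   2.35 seconds Cpu Time =   1.73 seconds
hsum      : Real Time =   1.34 seconds Cpu Time =   1.06 seconds
fillrandom: Real Time =   0.25 seconds Cpu Time =   0.07 seconds
fit1      : Real Time =   0.27 seconds Cpu Time =   0.16 seconds
tornado   : Real Time =   0.20 seconds Cpu Time =   0.14 seconds
na49      : Real Time =   7.74 seconds Cpu Time =   7.62 seconds
geometry  : Real Time =   1.05 seconds Cpu Time =   1.01 seconds
na49view  : Real Time =   0.83 seconds Cpu Time =   0.30 seconds
ntuple1   : Real Time =   1.15 seconds Cpu Time =   0.96 seconds
"""

# Per-benchmark (RealMARKS, CpuMARKS) pairs, also copied from the output above.
MARKS = {
    "hsimple": (122.02, 127.82),
    "hsum": (122.71, 107.24),
    "fillrandom": (99.36, 111.86),
    "fit1": (142.00, 128.25),
    "tornado": (140.40, 169.71),
    "na49": (108.42, 108.57),
    "na49view": (91.73, 133.20),
    "ntuple1": (194.17, 203.62),
    "geometry": (183.86, 164.14),
}

# Matches lines such as "hsum : Real Time = 1.34 seconds Cpu Time = 1.06 seconds".
LINE = re.compile(
    r"^(\w+)\s*:\s*Real Time =\s*([\d.]+) seconds Cpu Time =\s*([\d.]+) seconds"
)


def parse_timings(text):
    """Return {benchmark name: (real seconds, cpu seconds)} for matching lines."""
    timings = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            timings[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return timings


timings = parse_timings(SUMMARY)

# Recompute the TOTAL row from the individual timings.
total_real = sum(real for real, _ in timings.values())
total_cpu = sum(cpu for _, cpu in timings.values())

# Recompute the MEAN row from the individual marks.
mean_real = sum(real for real, _ in MARKS.values()) / len(MARKS)
mean_cpu = sum(cpu for _, cpu in MARKS.values()) / len(MARKS)

# Assumption: the machine estimate is the average of the two means.
rootmarks = (mean_real + mean_cpu) / 2

print(f"TOTAL: real {total_real:.2f} s, cpu {total_cpu:.2f} s")
print(f"MEAN : {mean_real:.2f} RealMARKS, {mean_cpu:.2f} CpuMARKS")
print(f"Estimated ROOTMARKS: {rootmarks:.2f}")
```

To compare two machines or two sets of OPT flags, paste the summary lines of each run into its own string, call parse_timings on both, and look at the per-benchmark ratios instead of only the single ROOTMARKS figure. In the run above, na49 alone accounts for about half of the total time.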