In past EMME/2 News issues, benchmarks have been published each time EMME/2 was ported to a new platform. These benchmarks reported the CPU time for auto and transit assignment with the standard Winnipeg demonstration data base. To give a general picture of relative machine performance and its evolution over time, we have gathered here all the benchmark results accumulated over the years. We will also compare some of these results with those obtained by running well-known CPU and FPU benchmarks.
The following table gives all the EMME/2 benchmark results sorted according to auto assignment performance. The lines marked with "*" give results for the 80386 using native 32-bit protected mode.
Computer Model | Processor (CPU/FPU) | Speed (MHz) | Date | Auto Assignment, 1 it. (sec) | Auto Assignment, Total (sec) | Transit Ass. (sec) |
IBM 3090 | NEC NAS AS/EX90 | n/a | 09/89 | 8.4 | 92.7 | 19.1 |
HP 9000-835 | HP RISC | 30 | 09/89 | 10.3 | 113.6 | 23.4 |
SUN SPARCserver 330 | SPARC | 25 | 08/89 | 11.1 | n/a | n/a |
SUN SPARCstation1 | SPARC | 20 | 09/89 | 14.9 | 163.2 | 28.9 |
HP 9000-825 | HP RISC | 25 | 09/89 | 18.6 | 202.5 | 56.5 |
VAX 8600 | DEC | n/a | 03/87 | 23.1 | n/a | n/a |
Interpro 340 | Clipper | 25 | 03/89 | 29.8 | 325.9 | 92.2 |
Definicon PM-030 | 68030/882 | 33 | 09/89 | 34.1 | 382.1 | 80.5 |
Interpro 120 | Clipper | 20 | 03/89 | 36.3 | 412.8 | 116.8 |
COMPAQ 20e (*) | 80386/387 | 20 | 09/89 | 43.7 | 491.3 | 101.7 |
Definicon DSI-780 | 68020/881 | 20 | 09/89 | 51.6 | 580.6 | 123.5 |
COMPAQ 20 (*) | 80386/387 | 20 | 09/89 | 58.3 | 653.4 | 135.0 |
SUN 3/60 | 68020/881 | 20 | 09/89 | 63.4 | 702.9 | 132.3 |
IBM PS2/70 (*) | 80386/387 | 20 | 09/89 | 64.8 | 721.8 | 147.6 |
COMPAQ 20e | 80386/387 | 20 | 09/89 | 66.2 | 737.1 | 214.4 |
Definicon DSI-780 | 68020/881 | 17 | 03/87 | 67.5 | n/a | 152.5 |
Masscomp 5520 | 68020/lightning | n/a | 09/89 | 73.2 | 806.0 | 172.3 |
COMPAQ 20 (*) | 80386/387 | 16 | 07/88 | 75.2 | n/a | 183.6 |
Masscomp 5400 | 68020/881 | n/a | 11/86 | 86.0 | 937.9 | 190.2 |
COMPAQ 20 | 80386/387 | 20 | 09/89 | 86.1 | 957.0 | 267.7 |
IBM PS2/70 | 80386/387 | 20 | 09/89 | 91.8 | 1018.2 | 297.5 |
VAX 11/780 | DEC | n/a | 03/87 | 119.2 | n/a | 300.5 |
microVAX II | DEC | n/a | 03/87 | 151.2 | n/a | 450.3 |
VAX station 2000 | DEC | n/a | 09/89 | 167.9 | 1842.4 | 392.2 |
CLUB 286 | 80286/287 | 10 | 09/89 | 177.2 | 1947.9 | 477.5 |
HP 9000-500 | HP | n/a | 08/85 | 186.5 | 2028.8 | 292.0 |
AT clone (Taiwan) | 80286/287 | 12 | 09/89 | 199.8 | 2200.7 | 548.6 |
SUN 3/50 | 68020/881 | 15 | 03/87 | 202.0 | n/a | 247.7 |
Definicon DSI-32 | 32032/081 | 10 | 11/85 | 223.4 | 2455.9 | 534.8 |
IBM AT | 80286/287 | 8 | 02/88 | 262.5 | n/a | 1193.4 |
Masscomp 500 | 68010 | n/a | 06/85 | 291.3 | 3166.0 | 499.6 |
Symmetric 375 | 32016/081 | 10 | 12/86 | 417.3 | 4526.9 | 862.0 |
AT&T Unix-PC 7300 | 68010 | 10 | 01/84 | 463.5 | 5147.6 | n/a |
Burroughs XT550 | 68010 | 10 | 10/85 | 522.4 | 5695.7 | 1503.5 |
Pixel 100/AP | 68000 | 10 | 06/85 | 549.5 | 5959.0 | 715.8 |
SUN 2/50 | 68010 | 10 | 06/85 | 584.6 | 6391.1 | 730.5 |
Not surprisingly, the above table is also nearly in reverse chronological order. In the five-year span covered by the benchmarks, tremendous progress has been made at the hardware level. Consider the 68000 microprocessor family, for example: a factor of about 16 has been gained from the 10 MHz 68000, with an iteration time of 550 sec, to the 33 MHz 68030, with an iteration time of 34.1 sec! Similarly, the 20 MHz 80386 is 6 times faster than its 8 MHz 80286 predecessor. In this last case, the difference would be even greater if we could have included results from 25 and 33 MHz 80386 machines or from the new generation of 80486 machines which are now available. Then, at the top of the table, we have RISC microprocessors nearing the IBM 3090 mainframe results. At these speeds, we can almost speak of interactive assignments!
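The speed-up factors quoted above follow directly from the one-iteration auto assignment times in the table. Here is a minimal sketch of that arithmetic. The `speedup` helper and the dictionary keys are illustrative names only, and we assume that the 80386 figure refers to the fastest 20 MHz entry (COMPAQ 20e in native protected mode), since the text does not say which row was used.

```python
# One-iteration auto assignment times (seconds), copied from the table above.
one_iteration = {
    "Pixel 100/AP (68000, 10 MHz)": 549.5,
    "Definicon PM-030 (68030, 33 MHz)": 34.1,
    "IBM AT (80286, 8 MHz)": 262.5,
    # Assumption: the fastest 20 MHz 80386 row is the one meant in the text.
    "COMPAQ 20e (*) (80386, 20 MHz)": 43.7,
}


def speedup(slow, fast):
    """Return how many times faster the `fast` machine is than the `slow` one."""
    return one_iteration[slow] / one_iteration[fast]


m68k = speedup("Pixel 100/AP (68000, 10 MHz)", "Definicon PM-030 (68030, 33 MHz)")
intel = speedup("IBM AT (80286, 8 MHz)", "COMPAQ 20e (*) (80386, 20 MHz)")
print(f"68000 -> 68030: {m68k:.1f}x")
print(f"80286 -> 80386: {intel:.1f}x")
```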
For all the 80386 machines tested, the improvement obtained when using the native protected mode (lines marked with "*" in the above table) is about 30% for the auto assignment and 50% for the transit assignment.
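These percentages can be checked against the table. The sketch below assumes that "improvement" means the relative reduction in CPU time between the unmarked and the "*" line of the same machine, and it uses the 09/89 results at 20 MHz only. The names `runs` and `reduction` are ours.

```python
# (auto total, transit) CPU times in seconds from the table above:
# unmarked line first, native protected mode ("*") line second.
runs = {
    "COMPAQ 20e": ((737.1, 214.4), (491.3, 101.7)),
    "COMPAQ 20": ((957.0, 267.7), (653.4, 135.0)),
    "IBM PS2/70": ((1018.2, 297.5), (721.8, 147.6)),
}


def reduction(before, after):
    """Percentage of CPU time saved when going from `before` to `after`."""
    return 100.0 * (before - after) / before


improvements = {}
for machine, (unmarked, protected) in runs.items():
    auto = reduction(unmarked[0], protected[0])
    transit = reduction(unmarked[1], protected[1])
    improvements[machine] = (auto, transit)
    print(f"{machine:12s} auto {auto:4.1f}%  transit {transit:4.1f}%")
```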
Different computer models are usually compared using small standard benchmarks that try to isolate one system component (e.g. CPU, FPU, disk I/O, etc.). An interesting question is: if machine X is 10 times faster than machine Y according to benchmark Z, will EMME/2 also run 10 times faster? In other words, can we predict the EMME/2 performance on a machine based solely on a few standard benchmark results? In search of an answer to this last question, we have conducted a small study with three benchmarks testing the performance of the CPU, the FPU and a mix of both. To be more complete, the study could also have included disk I/O and memory access benchmarks, but we decided to concentrate on CPU and FPU benchmarks, since EMME/2 is known to be processor bound. The following benchmarks were used:
      real*4 a,b
      a=0.
      b=1.
   10 a=a+1./b-1./(b+1.)
      b=b+2.
      if(b.lt.1000000.)goto 10
      print *,a
      end
Two versions were used: a single precision sfloat and a double precision dfloat.
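For readers without a Fortran compiler at hand, the loop can be sketched in Python as follows. This is only an illustrative port, not the program used for the measurements. Python floats are double precision, so it corresponds more closely to the dfloat variant, and its timings are not comparable with the Fortran results reported here. The function name `float_benchmark` and the `limit` parameter are our own.

```python
import math
import time


def float_benchmark(limit=1_000_000.0):
    """Python port of the Fortran loop: sums 1/b - 1/(b+1) for b = 1, 3, 5, ..."""
    a = 0.0
    b = 1.0
    while True:
        a = a + 1.0 / b - 1.0 / (b + 1.0)
        b = b + 2.0
        # Same exit test as `if(b.lt.1000000.)goto 10` in the original.
        if not b < limit:
            break
    return a


start = time.perf_counter()
result = float_benchmark()
elapsed = time.perf_counter() - start
print(f"a = {result:.6f} (ln 2 = {math.log(2):.6f}), {elapsed:.3f} s")
```

The sum is a partial sum of the alternating harmonic series, so the printed value should be close to ln 2, which gives a quick check that the loop was not optimized away or mistranslated.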
The following table gives the benchmark results along with some EMME/2 results taken from the previous table. All the tests were done with the basic hardware configuration, without software accelerators such as high-level memory caching (memcache) or virtual disk (vdisk). For multi-user operating systems, the programs were run at night to avoid interference from other users as much as possible (although it would be interesting in itself to observe the degradation under normal and heavy workloads). Except for swhet and dwhet, which are expressed in whetstones per second, all results are expressed in seconds.
Computer | swhet | dwhet | sfloat | dfloat | ssieve | lsieve | Auto | Transit | Buildwpg |
SPARCstation1 | 6303 | 4249 | 3.0 | 4.8 | 1.75 | 1.27 | 163.2 | 28.8 | 382 |
HP 9000-825 | 3491 | 2631 | 5.3 | 6.0 | 1.83 | 1.49 | 202.5 | 56.5 | 450 |
PM-030 | 2345 | 2129 | 8.1 | 9.3 | 3.32 | 2.89 | 382.1 | 80.5 | 870 |
SUN 3/60 | 1186 | 1171 | 17.7 | 18.3 | 4.68 | 4.93 | 702.9 | 132.3 | 1147 |
DSI-780 | 1220 | 1120 | 15.3 | 17.9 | 5.24 | 4.60 | 580.6 | 123.5 | 1146 |
Masscomp 5520 | 2500 | 1697 | 6.1 | 10.4 | 5.45 | 5.88 | 806.0 | 172.3 | 1124 |
COMPAQ 20e (*) | 1388 | 1303 | 15.0 | 16.8 | 5.33 | 3.73 | 491.3 | 101.6 | 1044 |
COMPAQ 20e | 1036 | 924 | 14.2 | 14.9 | 5.89 | 10.22 | 737.1 | 214.4 | 1359 |
COMPAQ 20 (*) | 1282 | 1175 | 15.0 | 17.4 | 6.53 | 4.15 | 653.4 | 135.0 | 1294 |
COMPAQ 20 | 950 | 842 | 14.4 | 15.4 | 8.03 | 13.84 | 957.0 | 267.7 | 1804 |
IBM PS2/70 (*) | 1158 | 1064 | 18.1 | 20.8 | 6.70 | 4.16 | 721.8 | 147.6 | 1453 |
IBM PS2/70 | 883 | 782 | 16.3 | 17.7 | 8.08 | 13.74 | 1018.2 | 297.5 | 1818 |
CLUB 286 | 281 | 254 | 68.9 | 74.0 | 15.16 | 25.35 | 1947.9 | 477.5 | 2981 |
AT clone | 285 | 257 | 64.8 | 70.0 | 17.78 | 28.74 | 2200.7 | 548.6 | 3399 |
DSI-32 | 312 | 277 | 34.4 | 42.3 | 14.60 | 12.60 | 2415.0 | 513.0 | 3826 |
Symmetric 375 | 196 | 196 | 61.7 | 54.6 | 36.39 | 26.38 | 4175.2 | 817.0 | 6253 |
VAX st. 2000 | n/a | n/a | 13.3 | 19.3 | 19.30 | 13.30 | 1842.4 | 392.2 | 2462 |
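The question of whether a benchmark ratio carries over to EMME/2 can be illustrated with one pair of machines from the table above. The sketch below compares the SPARCstation1 with the CLUB 286. The choice of this pair and the helper name `ratio` are ours, and a single pair is of course only an illustration.

```python
# Times in seconds from the table above: (sfloat, ssieve, Auto).
sparcstation1 = {"sfloat": 3.0, "ssieve": 1.75, "Auto": 163.2}
club_286 = {"sfloat": 68.9, "ssieve": 15.16, "Auto": 1947.9}


def ratio(column):
    """How many times faster the SPARCstation1 is than the CLUB 286 on `column`."""
    return club_286[column] / sparcstation1[column]


ratios = {column: ratio(column) for column in ("sfloat", "ssieve", "Auto")}
for column, value in ratios.items():
    print(f"{column:7s} {value:5.1f}x")
```

For this pair, the auto assignment ratio falls between the ratio given by the CPU benchmark (ssieve) and the one given by the FPU benchmark (sfloat), so neither benchmark alone predicts it exactly.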
The following table gives the sample correlation matrix between the benchmark times (note that, for this purpose, the whetstone/sec results of swhet and dwhet were first inverted to obtain the corresponding time values):
Benchmark | swhet | dwhet | sfloat | dfloat | ssieve | lsieve | Auto | Transit |
dwhet | .99 | | | | | | | |
sfloat | .94 | .95 | | | | | | |
dfloat | .91 | .93 | .99 | | | | | |
ssieve | .95 | .92 | .76 | .71 | | | | |
lsieve | .90 | .91 | .91 | .89 | .82 | | | |
Auto | .97 | .96 | .80 | .77 | .98 | .84 | | |
Transit | .98 | .97 | .85 | .82 | .95 | .91 | .98 | |
Buildwpg | .97 | .95 | .82 | .78 | .96 | .84 | .99 | .98 |
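As an illustration of how such coefficients are obtained, the sketch below computes the sample (Pearson) correlation of the Auto column with ssieve and with the inverted swhet column, using the values of the benchmark table. How the machine without Whetstone results (VAX st. 2000) was treated in the matrix is not stated; here we simply assume it is left out of the swhet comparison. The function name `pearson` is ours.

```python
import math

# Columns copied from the benchmark table, in table order; the last machine
# (VAX st. 2000) has no Whetstone figure, so swhet has one entry fewer.
swhet = [6303, 3491, 2345, 1186, 1220, 2500, 1388, 1036, 1282, 950,
         1158, 883, 281, 285, 312, 196]
ssieve = [1.75, 1.83, 3.32, 4.68, 5.24, 5.45, 5.33, 5.89, 6.53, 8.03,
          6.70, 8.08, 15.16, 17.78, 14.60, 36.39, 19.30]
auto = [163.2, 202.5, 382.1, 702.9, 580.6, 806.0, 491.3, 737.1, 653.4,
        957.0, 721.8, 1018.2, 1947.9, 2200.7, 2415.0, 4175.2, 1842.4]


def pearson(x, y):
    """Sample correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Whetstones per second are rates, so invert them to get time-like values.
swhet_time = [1.0 / rate for rate in swhet]

# Assumption: drop the machine without Whetstone results for this pair only.
r_swhet_auto = pearson(swhet_time, auto[:len(swhet_time)])
r_ssieve_auto = pearson(ssieve, auto)
print(f"swhet (inverted) vs Auto: {r_swhet_auto:.2f}")
print(f"ssieve vs Auto:           {r_ssieve_auto:.2f}")
```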
Four points are worth noting in this last table:
Keeping in mind the small sample size (17 observations), this study suggests that, although EMME/2 does depend on FPU throughput (high correlation with sfloat), it seems more CPU bound than FPU bound, even for the auto assignment.
In a future article, we will compare the speed of the various graphic displays and also run benchmarks on the GPL and GPR utilities, using various plotters and printers.