This benchmark was performed on a 128-CPU system based on the dual-core
Itanium 9000 (Montecito). Since this is a single-CPU benchmark, only one CPU
(in fact, only one core) was used, so the theoretical peak performance was
6.4 Gflop/s. The frontside bus had a bandwidth of 8.53 GB/s.
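
As a rough check on these two figures, the sketch below reproduces the peak
rate and derives how many floating-point operations the core can perform per
8-byte word delivered by the bus. The clock frequency (1.6 GHz) and the number
of fused multiply-add units per core (2) are assumptions that are not stated
in the text; they are chosen because they yield the quoted 6.4 Gflop/s. The
function names are hypothetical.

```python
# Assumed values, NOT given in the text: 1.6 GHz clock, 2 FMA units per core.
ASSUMED_CLOCK_HZ = 1.6e9
ASSUMED_FMA_UNITS = 2
FLOPS_PER_FMA = 2  # one multiply plus one add


def peak_gflops(clock_hz=ASSUMED_CLOCK_HZ, fma_units=ASSUMED_FMA_UNITS):
    """Theoretical peak of one core in Gflop/s."""
    return clock_hz * fma_units * FLOPS_PER_FMA / 1e9


def flops_per_word(peak_gflops_value, bus_gb_per_s, word_bytes=8):
    """Peak flops available per word that the bus can deliver."""
    gwords_per_s = bus_gb_per_s / word_bytes
    return peak_gflops_value / gwords_per_s


if __name__ == "__main__":
    peak = peak_gflops()
    balance = flops_per_word(peak, bus_gb_per_s=8.53)
    print(f"peak: {peak:.1f} Gflop/s, {balance:.1f} flops per 8-byte word")
```

Under these assumptions the core can issue roughly six floating-point
operations for every 64-bit operand fetched over the bus, so kernels with
little data reuse should be expected to run well below peak.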

All programs ran without problems. Only the results for program mod3a, the
large out-of-core matrix-vector multiplication, were missing, for unknown
reasons.
