This is the twelfth edition of a report in which we attempt to give an overview of parallel and vector systems that are commercially available or are expected to become available within a short time frame (typically a few months to half a year). We choose the expression "attempt" deliberately because the market for parallel and vector machines is highly volatile: the rate at which systems are introduced, and disappear again, is very high, and therefore the information will probably be only approximately valid. Nevertheless, we think that such an overview is useful for those who want to obtain a general idea of the various means by which these systems strive for high performance, especially when it is updated on a regular basis.
We will try to be as up-to-date and compact as possible, and on these grounds we think there is a place for this report. At this moment the numbers of systems appearing on and disappearing from the market are approximately in balance. One of the reasons for this seems to be the ASCI program [2], which has given a big impulse to the HPC industry, at least in the USA. Furthermore, there is the more or less natural wave motion of older systems that are withdrawn and replaced by newer models. Generally, one could say that the trend of the past few years, in which more systems disappeared than new ones were introduced, does not seem to continue. Only time can tell whether this stabilisation is permanent.
A trend that seems to emerge is that most new systems look like minor variations on the same theme: clusters of RISC-based Symmetric Multi-Processing (SMP) nodes which in turn are connected by a fast network. Culler et al. [5] consider this a natural architectural evolution. However, it may also be argued that the requirements formulated in the ASCI program have steered these systems in this direction.
The supercomputer market is a very dynamic one and this is especially
true for the Beowulf clusters that have emerged at a tremendous rate in
the last few years. The number of vendors that sell pre-configured
clusters has boomed accordingly and, at least for this issue, we have
decided not to include such configurations in this report: the
speed with which cluster companies and systems appear and disappear
makes this almost impossible. We will, however, briefly comment on cluster
characteristics and their position relative to other supercomputers in
section Clusters.
For the tightly-coupled or "integrated" parallel systems, however,
we can, by updating this report, at least follow the main trends in
popular and emerging architectures. The details of the systems to be
reported do not allow the report to be shorter than in former years:
between 40 and 50 pages.
As of the 11th issue we decided to introduce a section that describes the dominant processors in some detail. This seems fitting as the processors are the heart of the systems. We do that in section Processors.
The rule for including systems is as follows: they should be either available commercially at the time of appearance of this report, or within 6 months thereafter. This excludes interesting research systems like the ASCI systems at the Sandia, Los Alamos, and Lawrence Livermore National Laboratories in the USA (all with a measured performance of more than 1.5 Tflop/s) because they are not marketed and are available only at the institutes mentioned and are, therefore, not of much benefit to the supercomputer community at large.
The rule that systems should be available within a time-span of 6
months is to avoid confusion by describing systems that are announced
much too early, just for marketing reasons and that will not be
available to general users within a reasonable time. We also have to
refrain from including all generations of a system that are still in
use. Therefore, for instance, we no longer include the IBM SP1 or the Cray
T90 series, although these systems are still in use. Generally
speaking, we include machines that are presently marketed or will be
marketed within 6 months. To add to the information given in this
report, we quote the Web addresses of the vendors because the
information found there may be more recent than what can be provided
here. On the other hand, such pages should be read with care because it
will not always be clear what the status is of the products described
there.
Some vendors offer systems that are identical in all respects except in
the clock cycle of the nodes (examples are the SGI Origin3000 series
and the Fujitsu AP3000). In these cases we mention only the
models with the fastest clock as it will always be possible to get the
slower systems and we presume that the reader is primarily interested
in the highest possible speeds that can be reached with these systems.
Until the eighth issue of this report we ordered the systems by their architectural classes as explained in section architecture. However, this distinction became more and more artificial, as is explained in the same section. Therefore all systems described are simply listed alphabetically. In the header of each system description the machine type is provided, and the architectural class is referred to as far as it is relevant. We omit price information, which in most cases is next to useless. If available, we will give some information about the performance of systems based on user experiences instead of giving only theoretical peak performances. Here we have adhered to the following policy: we try to quote the best measured performances, if available, thus providing a more realistic upper bound than the theoretical peak performance. We hardly have to say that the speed range of supercomputers is enormous, so the best measured performance will not always reflect the performance of the reader's favourite application. When we give performance information, it is not always possible to quote all sources, and in any case, if this information seems (or is) biased, this is entirely the responsibility of the author of this report. He is quite willing to be corrected or to receive additional information from anyone who is in a position to provide it.
Although for the average user new systems rapidly come to look
more and more alike, it is still useful to dwell a little on
the architectural classes that underlie this appearance. It gives some
insight into the various ways that high performance is achieved and a
feeling for why machines perform as they do. This is done in the section
on architecture, which will be referred
to repeatedly in the sections that describe the various systems.
Up to the tenth issue we included a section
"Systems disappeared from the list" in which systems
are listed that disappeared from the market. Since then we have reduced
that section in the printed and PostScript versions because it tends to
take up an unreasonable part of the total text. Still, because this
information is of interest to a fair number of readers and it gives
insight into the historical development of supercomputing
over the last 12 years, this information will still be available in
full at
http://www.phys.uu.nl/~steen/gone.html.
In section Systems under development we
present some systems that are under development and have a fair chance
to appear on the market. Because the section on
processors introduces many technical terms, a
glossary is also included.
The overview given in this report concentrates on the computational capabilities of the systems discussed. To do full justice to all assets of present-day high-performance computers one should list their I/O performance and their connectivity possibilities as well. However, the number of possible configurations, even for one model of a certain system, is often so large that listing them would multiply the volume of this report, which we have tried to limit for greater clarity. So, not all features of the systems discussed will be present. Still we think (and certainly hope) that the impressions obtained from the entries of the individual machines may be useful to many. We also omitted some systems that may be characterised as "high-performance" in the fields of database management, real-time computing, or visualisation. As we try to give an overview for the area of general scientific and technical computing, systems that are primarily meant for database retrieval, like the AT&T GIS systems, or that concentrate exclusively on the real-time user community, like Concurrent Computing Systems, are not discussed in this report. Furthermore, we have set a threshold of about 10 Gflop/s for systems to appear in this report as, at least with regard to theoretical peak performance, single CPUs often exceed 1 Gflop/s, although their actual performance may be an entirely different matter.
Although most terms will be familiar to many readers, we still think it is worthwhile to give some of the definitions in the architecture section because some authors tend to give them a meaning that may differ slightly from the idea the reader has already acquired.
Lastly, we should point out that the WWW version of this report is available at
various places. The URLs are:
USA:
www.netlib.org/utk/papers/advanced-computers/
Europe:
www.nwo.nl/ncf/overview-src.
Europe:
www.phys.uu.nl/~steen/overview/overview02.html.
Europe:
http://www.euroben.nl/reports/overview02.html.