Although we mainly want to discuss real, marketable systems and not experimental, special-purpose, or even speculative machines, we want to include a section on systems that are in an advanced stage of development and have a fair chance of reaching the market. For inclusion in section 3 we set the rule that the system described there should be on the market within a period of 6 months from announcement. The systems described in this section will in all probability appear within one year from the publication of this report. However, there are vendors who do not want to disclose any specific data on their new machines until they are actually beginning to ship them. We respect the wishes of such vendors (it is generally wise not to stretch the expectations of potential customers for too long) and will not disclose such information.
Below we discuss systems that may lead to commercial systems to be introduced on the market between somewhat more than half a year and a year from now. The commercial systems that result from them will sometimes deviate significantly from the original research models, depending on the way the development is done (the approaches in Japan and the USA differ considerably in this respect) and on the user group that is targeted.
A development that may be of significance in the near future is the introduction of Intel's IA-64 Itanium processor family. The first chip in the family has recently been succeeded by the second generation, the Itanium 2, initially with a clock frequency of 1 GHz. It is highly probable that a majority of vendors will adopt IA-64 chips in place of their proprietary RISC processors within a time span of 1--2 years. Understandable as this may be from an economic point of view, it is also slightly disturbing, as the processor landscape may become rather barren in this way.
Compaq has now become a part of Hewlett-Packard and it is hard to say whether the presently successful AlphaServer SC line will be continued in some way. It seems fairly sure that an EV7-based system (see the section on the EV7) will be marketed in the near future. At the same time it is almost as sure that no system with the projected dual-core Alpha EV8 will be built. Already before Compaq joined with HP it committed itself to using IA-64 processors in future systems, and this will undoubtedly be pursued as a part of HP, as this company was one of the main developers of the latter processor line. As for the macro structure of the future systems, nothing is certain, although the present interconnect technology is very successful and for that reason may be maintained for at least another generation.
In the beginning of 2003 the next generation vector processor from Cray Inc., the SV-2, should be ready to ship. It builds on the technology found in the present Cray SV-1s, but the speed per processor should be appreciably higher: 12.8 Gflop/s. Up to 4 processors will be fitted in a node that also harbours a maximum of 32 GB of memory per CPU. As many as 16 nodes can be put into one frame. Up to 64 frames can be connected with a Single System Image. The inter-node communication speed is projected to be 100 GB/s. If these design targets can be achieved, the SV-2 would be a formidable system and also a testimony that vector processing is not a dead end in computer technology, even though all vendors except Cray and NEC seem to have abandoned the concept. (In the Hitachi SR8000 series pseudo-vector processing is implemented. However, the bandwidth to and from memory is markedly lower than what is normally expected in vector processors.)
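The design targets quoted above imply an aggregate peak performance for a maximal SV-2 configuration. The following is a minimal sketch of that arithmetic in Python. It uses only the figures given in the text; the function and constant names are ours and purely illustrative, and the result is a theoretical peak, not a measured or vendor-stated number.

```python
# Design targets for the Cray SV-2 as quoted in the text.
GFLOPS_PER_CPU = 12.8   # peak speed per processor
CPUS_PER_NODE = 4       # "up to 4 processors" per node
NODES_PER_FRAME = 16    # "as many as 16 nodes" per frame
MAX_FRAMES = 64         # "up to 64 frames" in one system


def sv2_peak(frames=MAX_FRAMES):
    """Return (number of CPUs, theoretical peak in Tflop/s) for a
    system with the given number of fully populated frames."""
    cpus = CPUS_PER_NODE * NODES_PER_FRAME * frames
    peak_tflops = cpus * GFLOPS_PER_CPU / 1000.0  # Gflop/s -> Tflop/s
    return cpus, peak_tflops


if __name__ == "__main__":
    for frames in (1, MAX_FRAMES):
        cpus, peak = sv2_peak(frames)
        print(f"{frames:2d} frame(s): {cpus:4d} CPUs, {peak:.1f} Tflop/s peak")
```

Under these assumptions a single frame holds 64 processors and a full 64-frame system holds 4096 processors, for a theoretical peak of roughly 52 Tflop/s.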
Because of the merger with Compaq it is not clear what the future course of HP at the high end will be. The present SuperDome does not belong to the very high-end systems and HP had no clear plans for making such systems in the near future. A logical decision would be to build upon the AlphaServer SC for this line of systems, but as no strategy on this point is presently known, very little can be assumed except that the generation after the next will be based on the Intel/HP IA-64 processor.
Since the end of 2001/beginning of 2002 IBM has had systems available based on the POWER4 processor (see the eServer p690 description). At present, the largest system that can be built with 32-processor Turbo nodes would attain a Theoretical Peak Performance slightly in excess of 8 Tflop/s. There is still some way to go to the goal of building a 100 Tflop/s system, which IBM has undertaken to do in a follow-on contract in the ASCI program.
One can expect that the integration level of the nodes will increase further and that the number of nodes may be increased beyond the 512 that can now be offered on special delivery. Assuming a doubling of both the clock frequency and the number of CPUs per node, and extending the number of nodes that can be coupled by a factor of 4, the 100 Tflop/s boundary could be passed in about 2 years. With the increase of processor speed a matching increase of interconnect speed is in order. The High-Performance Switch that is presently used for node interconnection has a bandwidth of 500 MB/s. This will be upgraded to a speed of about 1 GB/s within the next two years and should be increased even more to be useful in a 100 Tflop/s system for general applications.
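The scaling argument above can be checked with a few lines of Python. This is a sketch under stated assumptions: the baseline is rounded down to 8 Tflop/s (the text says "slightly in excess of"), the three growth factors are the ones assumed in the text, the switch bandwidth is treated as a per-node figure, and 1 GB/s is taken as 1000 MB/s. All names are illustrative.

```python
# Figures from the text; the baseline is rounded down to 8 Tflop/s.
BASELINE_PEAK_TFLOPS = 8.0
CLOCK_FACTOR = 2          # doubling of the clock frequency
CPUS_PER_NODE_FACTOR = 2  # doubling of the number of CPUs per node
NODE_COUNT_FACTOR = 4     # four times as many nodes coupled

SWITCH_NOW_MB_S = 500.0     # present switch bandwidth
SWITCH_NEXT_MB_S = 1000.0   # "about 1 GB/s", assuming 1 GB = 1000 MB


def projected_peak(baseline=BASELINE_PEAK_TFLOPS):
    """Peak performance after applying all three growth factors."""
    return baseline * CLOCK_FACTOR * CPUS_PER_NODE_FACTOR * NODE_COUNT_FACTOR


def bandwidth_per_flop_ratio():
    """Relative change in switch bandwidth per unit of node peak speed.

    Node peak grows by clock factor times CPUs-per-node factor, while
    the switch bandwidth grows only from 500 MB/s to about 1 GB/s.
    """
    node_peak_growth = CLOCK_FACTOR * CPUS_PER_NODE_FACTOR
    bandwidth_growth = SWITCH_NEXT_MB_S / SWITCH_NOW_MB_S
    return bandwidth_growth / node_peak_growth


if __name__ == "__main__":
    print(f"projected peak: {projected_peak():.0f} Tflop/s")
    print(f"bandwidth per flop relative to today: {bandwidth_per_flop_ratio():.2f}")
```

With these inputs the projected peak is 128 Tflop/s, which passes the 100 Tflop/s boundary, while the switch bandwidth available per unit of node peak speed would halve. The latter illustrates why a 1 GB/s switch alone would not keep pace with the faster nodes.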
In the SGI Origin3000 systems a provision has already been made in the C-bricks to put in Intel's IA-64 processors instead of the MIPS R14000 (see \ref{origin}). SGI projects, however, that at least one more MIPS chip generation, the R16000, will (or can) be used in Origin systems. At the moment SGI seems directed to making systems with a high "compute density", i.e., to integrating as many processors as possible in the smallest possible volume. In this respect the MIPS processors have a very good track record. The R14000A, the processor employed at this moment, dissipates a factor of 4 to 5 less energy than the Alpha EV68 or the IBM POWER4. This should allow for building quite dense systems without running into cooling problems. Whether this is a sufficient argument in view of the relatively low clock frequency of the MIPS processor remains to be seen. The very modular structure of the Origin systems is an architectural asset that makes it probable that the system structure will not change significantly in the near future.
Still, SGI has publicly committed itself to making systems using IA-64 processors. Whether such systems will be of interest will depend on the availability and price of the Itanium 2 or its successor, and also on their much larger energy requirements. If IA-64 based machines are marketed, they will use Linux as the operating system, as SGI discontinued porting its native IRIX OS to the IA-64 platform a few years ago because of cost considerations.
The SRC company represents a trend that is gathering remarkable speed at the moment. It consists of complementing general-purpose processors with (a collection of) FPGAs, Field Programmable Gate Arrays (see also the glossary). This makes it possible to configure such a machine for special user-defined tasks, which would make it, at least in principle, significantly faster than general-purpose processors for the same tasks. SRC proposes a system, the SRC-6, with 256 dual-processor boards containing standard IA-32 processors, each of which is connected to a unit called MAP, for Multi-Adaptive Processor, that consists of an FPGA, private memory, a MAP controller, and logic to reconfigure the unit when needed. The MAPs are interconnected by a ring network, and the standard processor boards, which also have their own local memory, are connected through the MAPs to a global memory by a read and a write crossbar.