The IBM eServer p575

Machine type: RISC-based distributed-memory multi-processor.
Models: IBM eServer p575.
Operating system: AIX (IBM's Unix variant), Linux (SuSE SLES 10).
Connection structure: Variable (see remarks).
Compilers: XL Fortran (Fortran 90), (HPF), XL C, C++.
Vendor's information web page: http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=AN&subtype=CA&htmlfid=897/ENUS107-675&appname=USN
Year of introduction: 2008 (32-core POWER6 SMP).

System parameters:

Model: eServer p575
Clock cycle: 4.7 GHz
Theor. peak performance:
  Per processor (2 cores): 37.6 Gflop/s
  Per node (32 cores): 601.6 Gflop/s
  Per 14-node frame: 8.42 Tflop/s
  Maximal (512-node system): ≥ 100 Tflop/s
Main memory:
  Memory/node: ≤ 256 GB
  Memory/maximal: —
Communication bandwidth:
  Node-to-node: (see remarks)
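
The peak figures above follow directly from the clock rate and the core count at each packaging level. A minimal sketch of the arithmetic, assuming 4 floating-point operations per cycle per core (two FMA-capable floating-point units), reproduces the table values:

```c
#include <stdio.h>

int main(void)
{
    /* Assumption: each POWER6 core retires 4 flops per cycle
       (two floating-point units, each doing a fused multiply-add). */
    const double clock_ghz       = 4.7;  /* GHz                   */
    const double flops_per_cycle = 4.0;  /* per core (assumption) */

    double per_core  = clock_ghz * flops_per_cycle;  /* Gflop/s             */
    double per_proc  = per_core * 2.0;               /* dual-core processor */
    double per_node  = per_proc * 16.0;              /* 4 MCMs x 4 chips    */
    double per_frame = per_node * 14.0 / 1000.0;     /* Tflop/s             */

    printf("per core : %6.1f Gflop/s\n", per_core);   /*  18.8  */
    printf("per proc : %6.1f Gflop/s\n", per_proc);   /*  37.6  */
    printf("per node : %6.1f Gflop/s\n", per_node);   /* 601.6  */
    printf("per frame: %6.2f Tflop/s\n", per_frame);  /*   8.42 */
    return 0;
}
```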

Remarks:

There is a multitude of high-end servers in the eServer p-series. However, IBM singles out the POWER6-based p575 model specifically for HPC. The eServer p575 is the successor of the earlier POWER5+-based systems and retains much of their macro structure: multi-CPU nodes are connected within a frame either by a dedicated switch or by other means, such as switched Ethernet. The structure of the nodes, however, has changed considerably (see the section on the POWER6 processor). Four dual-core POWER6 processors are housed in a Multi-Chip Module (MCM), and four of these MCMs constitute a p575 node, so 32 cores make up a node. The four MCMs are all directly connected to each other at a bandwidth of 80 GB/s. These inter-MCM links are used to reach the memory modules that are not local to a core but reside elsewhere within the node. Therefore, all memory in a node is shared by the processor cores, although memory access is no longer uniform as it was in earlier p575 models. As yet no NUMA factor has been published but, given the node structure, it should be moderate. Obviously, within a node shared-memory parallel programming, as with OpenMP, can be employed.
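
Since all 32 cores of a node share memory, loop-level parallelism can be expressed directly with OpenMP. A minimal, generic sketch (standard OpenMP, nothing IBM-specific assumed) of a parallel loop that would occupy the cores of one node:

```c
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N], b[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }

    /* Threads are spread over the cores of the node, e.g. by
       setting OMP_NUM_THREADS=32 before running. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i] * b[i];

    printf("dot product = %f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```

With IBM's XL compilers such a program would be built with the thread-safe compiler invocation and the compiler's OpenMP/SMP option; the exact flags should be checked against the XL compiler documentation.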

In contrast to its earlier p575 clusters, IBM no longer provides its proprietary Federation switch for inter-node communication. Instead, one can configure a network from any vendor. In practice this will be InfiniBand in most cases, but switched Gigabit Ethernet, Myrinet, or a Quadrics network is entirely possible. For this reason no inter-node bandwidth values can be given, as the network is to be chosen by the user.

At this moment the maximum configuration of a POWER6-based p575 system is nowhere to be found. The online information at present is not definitive and consistently speaks of planned characteristics of the POWER6-based systems, although several machines have already been installed and are in operation: a 156 Tflop/s system at ECMWF, UK, a 71 Tflop/s system at NCAR, USA, and a 60 Tflop/s system at SARA in The Netherlands. Because of this lack of information, we cannot give details about maximum performance, memory, etc.

The p575 is accessed through a front-end control workstation that also monitors system failures. Failing nodes can be taken off line and exchanged without interrupting service. Because of the very dense packaging, the units that house the POWER6 processors are water cooled.

Applications can be run using PVM or MPI. IBM used to support High Performance Fortran, both in a proprietary version and with a compiler from the Portland Group; it is not clear whether this is still the case. IBM uses its own PVM version from which the data-format converter XDR has been stripped. This results in lower overhead at the cost of generality. The MPI implementation, MPI-F, is likewise optimised for the p575-based systems. As the nodes are in effect shared-memory SMP systems, OpenMP can be employed within the nodes for shared-memory parallelism, and it can be freely mixed with MPI if needed (see the sketch below). In addition to its own AIX operating system, IBM also supports some Linux distributions: the professional version of SuSE Linux is available for the p575 series.
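
A minimal sketch of such a hybrid program (generic MPI and OpenMP calls only, no IBM-specific extensions assumed), with one MPI rank per node and OpenMP threads within it:

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* Request an MPI library that tolerates threaded callers;
       MPI calls here are made only outside parallel regions. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double local = 0.0, global = 0.0;

    /* Shared-memory parallelism within the node (e.g. 32 threads). */
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < 1000000; i++)
        local += 1.0;

    /* Message passing between the nodes. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %.0f over %d ranks\n", global, nranks);

    MPI_Finalize();
    return 0;
}
```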

Measured Performances:
In [54] a performance of 80.3 Tflop/s is reported for an 8,320-core system at ECMWF, Reading, UK, solving a dense linear system of unspecified order, with an efficiency of 51%.
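
This efficiency can be checked against the theoretical peak: 8,320 cores at 18.8 Gflop/s per core give 156.4 Tflop/s, and 80.3/156.4 ≈ 0.51. A small sketch of that check, again assuming 4 flops per cycle per core:

```c
#include <stdio.h>

int main(void)
{
    const double peak_per_core_gf = 18.8;  /* 4.7 GHz x 4 flops/cycle (assumption) */
    const double measured_tf      = 80.3;  /* reported HPL result                  */
    const int    cores            = 8320;

    double peak_tf = cores * peak_per_core_gf / 1000.0;               /* 156.4 Tflop/s */
    printf("efficiency = %.0f%%\n", 100.0 * measured_tf / peak_tf);   /* ~51%          */
    return 0;
}
```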