The Fujitsu/Siemens M9000


Machine type:            RISC-based shared-memory multi-processor
Models:                  M9000-32, M9000-64
Operating system:        Solaris (Sun's Unix variant)
Connection structure:    Crossbar
Compilers:               Parallel Fortran 90, OpenMP, C, C++
Vendor's information web page: http://www.fujitsu-siemens.com/products/unix_servers/sparc_enterprise/sparcent_enterprise.html
Year of introduction:    2007

System parameters:

Model                        M9000-32       M9000-64
Clock cycle                  2.4 GHz        2.4 GHz
Theor. peak performance
  Per core (64-bits)         9.6 Gflop/s    9.6 Gflop/s
  Maximal                    614 Gflop/s    1229 Gflop/s
Main memory
  Memory/node                ≤ 128 GB       ≤ 128 GB
  Memory/maximal             ≤ 1 TB         ≤ 2 TB
No. of processor cores       8-64           8-128
Communication bandwidth
  Point-to-point             ---            ---
  Aggregate                  367.5 GB/s     737 GB/s

Remarks

We discuss only the M9000-32 and M9000-64 here, as the smaller models such as the M8000s have the same structure but fewer processors. The same models are also available with a somewhat slower processor clocked at 2.28 GHz. The M9000 systems now represent the high-end servers of both Fujitsu/Siemens and Sun, and as such they replace the Fujitsu/Siemens PRIMEPOWER series as well as Sun's E25K server (see Systems disappeared from the list).

The dual-core SPARC64 VI processors (see the section on the SPARC processors) have a theoretical peak speed of 9.6 Gflop/s per core and are packaged in four-processor Multi Chip Units (MCUs). Besides the four processors, an MCU also houses part of the total memory, up to 128 GB per MCU. The MCUs are in turn connected by a crossbar that links 8 of them in the M9000-32 and 16 of them in the M9000-64.
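
The maximal figures in the system parameters table follow directly from this packaging. The short sketch below reproduces the arithmetic; the function and constant names are ours, and the value of 4 floating-point operations per cycle is an assumption inferred from dividing 9.6 Gflop/s by the 2.4 GHz clock, not a figure taken from the data sheets.

```python
# Illustrative arithmetic only; constants come from the text and table above.
CLOCK_GHZ = 2.4
FLOPS_PER_CYCLE = 4          # assumption: 9.6 Gflop/s per core / 2.4 GHz
CORES_PER_PROCESSOR = 2      # dual-core SPARC64 VI
PROCESSORS_PER_MCU = 4       # four-processor Multi Chip Unit
MEMORY_PER_MCU_GB = 128      # upper limit per MCU

# Number of MCUs connected by the crossbar in each model.
MCUS_PER_MODEL = {"M9000-32": 8, "M9000-64": 16}


def system_totals(mcus):
    """Return (cores, peak in Gflop/s, maximal memory in GB) for a full system."""
    cores = mcus * PROCESSORS_PER_MCU * CORES_PER_PROCESSOR
    peak_gflops = cores * CLOCK_GHZ * FLOPS_PER_CYCLE
    memory_gb = mcus * MEMORY_PER_MCU_GB
    return cores, peak_gflops, memory_gb


if __name__ == "__main__":
    for model, mcus in MCUS_PER_MODEL.items():
        cores, peak, memory = system_totals(mcus)
        print(f"{model}: {cores} cores, {peak:.1f} Gflop/s peak, {memory} GB memory")
```

Under these assumptions a fully configured M9000-32 has 64 cores and an M9000-64 has 128 cores, and the computed peaks round to the 614 and 1229 Gflop/s quoted in the table.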

Unfortunately, very little technical information is available beyond the data sheets provided on Fujitsu-Siemens' web site. These data sheets omit all information about the bandwidth of the interconnect, be it point-to-point, bisectional, or aggregate. From other sources it can be gathered that, compared with the earlier PRIMEPOWER series, the crossbar is duplicated, and that when one of the two crossbars fails communication proceeds at half of the total bandwidth. The aggregate bandwidth is impressive: 737 GB/s for the M9000-64. No data are available on the point-to-point bandwidth or on the latency.

Fujitsu/Siemens positions the Mx000 servers for the commercial market and does not seem interested in marketing them for HPC-related work, although the specifications look quite good. On the other hand, the systems are fitted with extensive RAS (reliability, availability, serviceability) features that will be much appreciated in commercial environments but that make the systems relatively costly.

Measured Performances:
No performance results in the technical/scientific area are known to us to date. This is due not only to the newness of the system but also to the vendor's apparent lack of interest in the scientific HPC realm.