The Fujitsu FX10

    Machine type: RISC-based distributed-memory multi-processor
    Models: PRIMEHPC FX10
    Operating system: Solaris (Sun's Unix variant)
    Connection structure: 6-D torus
    Compilers: Parallel Fortran 90, OpenMP, C, C++
    Vendor's information web page: http://www.fujitsu.com/global/services/solutions/tc/hpc/
    Year of introduction: 2011

    System parameters:

    Model                        FX10
    Clock cycle                  1.848 GHz
    Theor. peak performance
      Per core (64-bit)          14.8 Gflop/s
      Per processor (64-bit)     236.5 Gflop/s
      Maximal                    23.2 Pflop/s
    Main memory
      Memory/node                ≤ 64 GB
      Memory/maximal             6 PB
    Communication bandwidth
      Point-to-point             5 GB/s/direction
      Aggregate

    Remarks

    A few months after the introduction of the K-computer, a commercial version was brought to market. The structure of the system is the same as that of the K-computer, but the processor is slightly improved (see the section on the SPARC processors). The 16-core processor runs at 1.848 GHz; because every core delivers 8 floating-point results per cycle, this amounts to a peak performance of 236.5 Gflop/s per processor.
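The peak figures follow directly from the clock rate, the results per cycle, and the core count quoted above. A minimal sketch of that arithmetic (the function and constant names are illustrative and not taken from any Fujitsu tool):

```python
# Figures quoted in the text; the names below are illustrative only.
CLOCK_GHZ = 1.848            # processor clock frequency
FLOPS_PER_CYCLE = 8          # 64-bit floating-point results per core per cycle
CORES_PER_PROCESSOR = 16


def peak_gflops(clock_ghz: float, flops_per_cycle: int, cores: int = 1) -> float:
    """Theoretical peak in Gflop/s: clock (GHz) x results/cycle x cores."""
    return clock_ghz * flops_per_cycle * cores


per_core = peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE)
per_processor = peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE, CORES_PER_PROCESSOR)

print(f"Per core:      {per_core:.1f} Gflop/s")
print(f"Per processor: {per_processor:.1f} Gflop/s")
```

Rounded to one decimal, the two values agree with the per-core and per-processor entries in the system parameters.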

    The FX10 has the same interconnect as the K-computer: a 6-D torus. Apart from providing a very high bandwidth of 5 GB/s in each direction, the extra dimensions (the user sees only 3 of them) give high resilience against failures and an easy way of rerouting traffic in case of contention.
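To make the torus idea concrete, the sketch below computes the neighbours of a node and the hop distance between two nodes in a generic N-dimensional torus with wrap-around links on every axis. This is a simplified model: the axis sizes in `DIMS` are hypothetical values chosen for illustration, not the actual FX10 dimensions, and the real interconnect need not be a pure torus on every axis.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]

# Hypothetical axis sizes for a 6-D torus; NOT the real FX10 configuration.
DIMS: Coord = (4, 4, 4, 3, 3, 3)


def torus_neighbors(coord: Coord, dims: Coord) -> List[Coord]:
    """Return the nodes one hop away from coord, wrapping around on each axis."""
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % size   # wrap-around link
            neighbors.append(tuple(n))
    return neighbors


def torus_distance(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimal hop count between a and b, taking the shorter way round each axis."""
    return sum(min(abs(x - y), size - abs(x - y))
               for x, y, size in zip(a, b, dims))


origin = (0, 0, 0, 0, 0, 0)
neighbors = torus_neighbors(origin, DIMS)
far_corner = (3, 0, 0, 0, 0, 0)
print(len(neighbors), torus_distance(origin, far_corner, DIMS))
```

With every axis of size 3 or more, each node has two distinct neighbours per dimension, so a message blocked on one link has several alternative first hops. That is the intuition behind the rerouting and resilience remarks above.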

    The system can be made very large: a maximum of 98,304 nodes can be configured, for a peak performance of 23.2 Pflop/s. As in the Cray and IBM BlueGene systems, the compute nodes are not burdened with system tasks; these are diverted to dedicated I/O nodes. This greatly reduces the OS jitter in the system, which ultimately improves application scalability on large numbers of nodes.
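The maximal figures in the system parameters can be checked from the node count. The sketch below assumes one processor per node (implied by the quoted numbers, but not stated explicitly) and reads the 6 PB memory figure in binary units; both are assumptions, and the variable names are illustrative.

```python
# Values quoted in the text.
MAX_NODES = 98_304
PEAK_PER_NODE_GFLOPS = 236.5   # assumes one 16-core processor per node
MEM_PER_NODE_GB = 64           # upper limit per node

# Peak: Gflop/s -> Pflop/s is a factor of 10**6.
system_peak_pflops = MAX_NODES * PEAK_PER_NODE_GFLOPS / 1e6

# Memory: assuming binary units, GB -> PB is a factor of 1024**2.
system_memory_pb = MAX_NODES * MEM_PER_NODE_GB / 1024**2

print(f"Peak:   {system_peak_pflops:.1f} Pflop/s")
print(f"Memory: {system_memory_pb:.0f} PB")
```

In decimal units the same memory total would be about 6.3 PB, so the binary reading is the one that reproduces the quoted value exactly.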

    Where most HPC vendors (except IBM) offer Lustre as their HPC file system, Fujitsu has its own brand: FEFS. Fujitsu also has its own suite of compilers, a scientific library, MPI, OpenMP, and XPFortran, a proprietary Fortran extension that, like OpenMP, is directive-based. In addition, Fujitsu has its own tuner/debugger to help in the development of large-scale applications.

    Measured Performances
    If one regards the K-computer as an instantiation of an FX10 system, then in [39] it is listed with a speed of 10.51 Pflop/s in solving a linear system of unknown size, at the very high efficiency of 93.1%.
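Efficiency here is the ratio of measured speed to theoretical peak, so the two quoted numbers imply the peak of the machine that was benchmarked. A minimal sketch of that relation, using only the figures above (the variable names are illustrative, and the implied peak is a derived estimate, not a value quoted from [39]):

```python
MEASURED_PFLOPS = 10.51   # measured speed quoted from [39]
EFFICIENCY = 0.931        # measured / theoretical peak


def implied_peak(measured: float, efficiency: float) -> float:
    """Theoretical peak implied by a measured speed and its efficiency."""
    return measured / efficiency


peak_pflops = implied_peak(MEASURED_PFLOPS, EFFICIENCY)
print(f"Implied theoretical peak: {peak_pflops:.2f} Pflop/s")
```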