HPC Architecture
  1. Shared-memory SIMD machines
  2. Distributed-memory SIMD machines
  3. Shared-memory MIMD machines
  4. Distributed-memory MIMD machines
  5. ccNUMA machines
  6. Clusters
  7. Processors
    1. AMD Opteron
    2. IBM POWER7
    3. IBM BlueGene/Q processor
    4. Intel Xeon
    5. The SPARC processors
  8. Accelerators
    1. GPU accelerators
      1. ATI/AMD
      2. nVIDIA
    2. General computational accelerators
      1. Intel Xeon Phi
    3. FPGA accelerators
      1. Convey
      2. Kuberre
      3. SRC
  9. Interconnects
    1. Infiniband
Available systems
  • The Bull bullx system
  • The Cray XC30
  • The Cray XE6
  • The Cray XK7
  • The Eurotech Aurora
  • The Fujitsu FX10
  • The Hitachi SR16000
  • The IBM BlueGene/Q
  • The IBM eServer p775
  • The NEC SX-9
  • The SGI Altix UV series
  • Systems disappeared from the list
    Systems under development

    1.   R. Alverson, D. Roweth, L. Kaplan, The Gemini Interconnect,
      18th IEEE Symposium on High Performance Interconnects, August 2010, 83–87.

    2.   C. Amza, A.L. Cox, S. Dwarkadas, P. Keleher, R. Rajamony, H. Lu, W. Yu, and W. Zwaenepoel, TreadMarks: Shared memory computing on networks of workstations, to appear in IEEE Computer,
      (draft copy):

    3.   The ASCI program:

    4.   The home page for Co-array Fortran can be found at:

    5.   R. Chandra, L. Dagum, D. Kohr, D. Maydan, J. McDonald, R. Menon, Parallel Programming in OpenMP, Morgan Kaufmann Publishers Inc., January 2001.

    6.   B. Chapman, G. Jost, R. van der Pas, Using OpenMP, MIT Press, Boston, 2007.

    7.   D.E. Culler, J.P. Singh, A. Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann Publishers Inc., August 1998.

    8.   D.W. Doerfler, An analysis of the PathScale Inc. Infiniband Host Channel Adapter, InfiniPath, Sandia Report SAND2005-5199, August 2005.

    9.   Bibliography page of Distributed Shared Memory systems:

    10.   Directory with EuroBen results:

    11.   M.J. Flynn, Some computer organizations and their effectiveness, IEEE Transactions on Computers, C-21, (1972) 948–960.

    12.   A. Geist, A. Beguelin, J. Dongarra, R. Manchek, W. Jiang, and V. Sunderam, PVM: A Users' Guide and Tutorial for Networked Parallel Computing, MIT Press, Boston, 1994.

    13.   D.A. Bader, J. Berry, S. Kahan, R. Murphy, E.J. Riedy, J. Willcock, The Graph 500 list, edition June 2011. The Graph 500 site has URL:

    14.   D.B. Gustavson, Q. Li, Local-Area MultiProcessor: the Scalable Coherent Interface, SCIzzL Report, Santa Clara University, Dept. of Computer Engineering, 1995. Available through:

    15.   R.W. Hockney, C. Jesshope, Parallel Computers II: Architecture, Programming and Algorithms, Adam Hilger, Ltd., Bristol, United Kingdom, 1988.

    16.   Most material on Exascale computing can be found at

    17.   T. Horie, H. Ishihata, T. Shimizu, S. Kato, S. Inano, and M. Ikesaka, AP1000 architecture and performance of LU decomposition, In: Proceedings of the 1991 International Conference on Parallel Processing, Vol. I, Architecture, CRC Press, Boca Raton, FL, August 1991, I-634–I-635.

    18.   HPC Challenge Benchmark,

    19.   High Performance Fortran Forum, High Performance Fortran language specification, Scientific Programming, 2, 13, (1993) 1–170.

    20.   D.V. James, A.T. Laundrie, S. Gjessing, and G.S. Sohi, Scalable Coherent Interface, IEEE Computer, 23, 6, (1990), 74–77.

    21.  Julie Langou, Julien Langou, P. Luszczek, J. Kurzak, J.J. Dongarra, Exploiting the Performance of 32-Bit Floating Point Arithmetic in Obtaining 64-Bit Accuracy, Proceedings of SC06, Tampa, Nov. 2006.

    22.  J. Kim, W.J. Dally, S. Scott, D. Abts, Technology-Driven, Highly-Scalable Dragonfly Topology, IEEE Intl. Symposium on Computer Architecture, (2008), 77–88.

    23.  T. Maruyama, T. Yoshida, R. Kan, I. Yamazaki, S. Yamamura, N. Takahashi, M. Hondou, H. Okano, SPARC64 VIIIfx: A New-generation Octocore Processor for Petascale Computing, IEEE Micro, 30, 2, (2010), 30–40.

    24.   M. Snir, S. Otto, S. Huss-Lederman, D. Walker, J. Dongarra, MPI: The Complete Reference Vol. 1, The MPI Core, MIT Press, Boston, 1998.

    25.   W. Gropp, S. Huss-Lederman, A. Lumsdaine, E. Lusk, B. Nitzberg, W. Saphir, M. Snir, MPI: The Complete Reference, Vol. 2, The MPI Extensions, MIT Press, Boston, 1998.


    27.   N.J. Boden, D. Cohen, R.E. Felderman, A.E. Kulawik, C.L. Seitz, J.N. Seizovic, Wen-King Su, Myrinet: A Gigabit-per-second Local Area Network, IEEE Micro 15, No. 1, Jan. 1995, 29–36.

    28.   Web page for the NAS Parallel benchmarks NPB2:

    29.   OpenMP Forum, OpenMP Application Program Interface, version 2.5, May 2005, Web page:

    30.   C. Schow, F. Doany, J. Kash, Get on the Optical Bus, IEEE Spectrum, September 2010, 31–35.

    31.   T. Shanley, Infiniband Network Architecture, Addison-Wesley, November 2002.

    32.   D.H.M. Spector, Building Unix Clusters, O'Reilly, Sebastopol, CA, USA, July 2000.

    33.   A.J. van der Steen, Exploring VLIW: Benchmark tests on a Multiflow TRACE 14/300, Academic Computing Centre Utrecht, Technical Report TR-31, April 1990.

    34.   A.J. van der Steen, Aspects of Computational Science, NCF, The Hague, 1995.

    35.   A.J. van der Steen, An evaluation of some Beowulf clusters, Technical Report WFI-00-07, Utrecht University, Dept. of Computational Physics, December 2000. (Also available through, directory reports/.)

    36.   A.J. van der Steen, Overview of recent supercomputers, June 2005,, directory reports/.

    37.   T.L. Sterling, J. Salmon, D.J. Becker, D.F. Savarese, How to Build a Beowulf, The MIT Press, Boston, 1999.

    38. The STREAM benchmark

    39.   H.W. Meuer, E. Strohmaier, J.J. Dongarra, H.D. Simon, Top500 Supercomputer Sites, 33rd Edition, June 2010,
      The report can be downloaded from:

    40.   Task Force on Cluster Computing home page:

    41.   The home page of UPC can be found at: