Introduction

HPC Architecture

Shared-memory SIMD machines

Distributed-memory SIMD machines

Shared-memory MIMD machines

Distributed-memory MIMD machines

ccNUMA machines

Clusters

Processors

AMD Opteron

IBM POWER7

IBM BlueGene/Q processor

Intel Xeon

The SPARC processors

Accelerators

GPU accelerators

ATI/AMD

nVIDIA

General computational accelerators

Intel Xeon Phi

FPGA accelerators

Convey

Kuberre

SRC

Interconnects

Infiniband

Available systems

The Bull bullx system

The Cray XC30

The Cray XE6

The Cray XK7

The Eurotech Aurora

The Fujitsu FX10

The Hitachi SR16000

The IBM BlueGene/Q

The IBM eServer p775

The NEC SX-9

The SGI Altix UV series

Systems disappeared from the list

Systems under development

Glossary

Acknowledgments

References

R. Alverson, D. Roweth, L. Kaplan, The Gemini Interconnect, 18th IEEE Symposium on High Performance Interconnects, August 2010, 83–87.

C. Amza, A.L. Cox, S. Dwarkadas, P. Keleher, R. Rajamony, H. Lu, W. Yu, W. Zwaenepoel, TreadMarks: Shared memory computing on networks of workstations, to appear in IEEE Computer (draft copy): www.cs.rice.edu/willy/TreadMarks/papers.html.

The ASCI program: http://www.sandia.gov/ASC/.

The home page for Co-array Fortran can be found at: http://www.co-array.org/

R. Chandra, L. Dagum, D. Kohr, D. Maydan, J. McDonald, R. Menon, Parallel Programming in OpenMP, Morgan Kaufmann Publishers Inc., January 2001.

B. Chapman, G. Jost, R. van der Pas, Using OpenMP, MIT Press, Boston, 2007.

D.E. Culler, J.P. Singh, A. Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann Publishers Inc., August 1998.

D.W. Doerfler, An analysis of the PathScale Inc. Infiniband Host Channel Adapter, InfiniPath, Sandia Report SAND2005-5199, August 2005.

Bibliography page of Distributed Shared Memory systems: http://www.cs.umd.edu/~keleher/bib/dsmbiblio/dsmbiblio.html.

Directory with EuroBen results: http://www.hpcresearch.nl/euroben/results.

M.J. Flynn, Some computer organizations and their effectiveness, IEEE Transactions on Computers, C-21, (1972), 948–960.

A. Geist, A. Beguelin, J. Dongarra, R. Manchek, W. Jiang, V. Sunderam, PVM: A Users' Guide and Tutorial for Networked Parallel Computing, MIT Press, Boston, 1994.

D.A. Bader, J. Berry, S. Kahan, R. Murphy, E.J. Riedy, J. Willcock, The Graph 500 list, edition June 2011. The Graph 500 site has URL: www.graph500.org.

D.B. Gustavson, Q. Li, Local-Area MultiProcessor: the Scalable Coherent Interface, SCIzzL Report, Santa Clara University, Dept. of Computer Engineering, 1995. Available through: www.scizzle.com.

R.W. Hockney, C. Jesshope, Parallel Computers II: Architecture, Programming and Algorithms, Adam Hilger, Ltd., Bristol, United Kingdom, 1988.

Most material on Exascale computing can be found at www.exascale.org.

T. Horie, H. Ishihata, T. Shimizu, S. Kato, S. Inano, and M. Ikesaka, AP1000 architecture and performance of LU decomposition, In: Proceedings of the 1991 International Conference on Parallel Processing, Vol. I, Architecture, CRC Press, Boca Raton, FL, August 1991, I-634–I-635.

HPC Challenge Benchmark, http://icl.cs.utk.edu/hpcc/.

High Performance Fortran Forum, High Performance Fortran language specification, Scientific Programming, 2, 13, (1993), 1–170.

D.V. James, A.T. Laundrie, S. Gjessing, G.S. Sohi, Scalable Coherent Interface, IEEE Computer, 23, 6, (1990), 74–77.

Julie Langou, Julien Langou, P. Luszczek, J. Kurzak, J.J. Dongarra, Exploiting the Performance of 32-Bit Floating Point Arithmetic in Obtaining 64-Bit Accuracy, Proceedings of SC06, Tampa, Nov. 2006.

J. Kim, W.J. Dally, S. Scott, D. Abts, Technology-Driven, Highly-Scalable Dragonfly Topology, IEEE Intl. Symposium on Computer Architecture, (2008), 77–88.

T. Maruyama, T. Yoshida, R. Kan, I. Yamazaki, S. Yamamura, N. Takahashi, M. Hondou, H. Okano, SPARC64 VIIIfx: A New-generation Octocore Processor for Petascale Computing, IEEE Micro, 30, 2, (2010), 30–40.

M. Snir, S. Otto, S. Huss-Lederman, D. Walker, J. Dongarra, MPI: The Complete Reference Vol. 1, The MPI Core, MIT Press, Boston, 1998.

W. Gropp, S. Huss-Lederman, A. Lumsdaine, E. Lusk, B. Nitzberg, W. Saphir, M. Snir, MPI: The Complete Reference, Vol. 2, The MPI Extensions, MIT Press, Boston, 1998.

http://www.myricom.com

N.J. Boden, D. Cohen, R.E. Felderman, A.E. Kulawik, C.L. Seitz, J.N. Seizovic, Wen-King Su, Myrinet: A Gigabit-per-second Local Area Network, IEEE Micro, 15, 1, (1995), 29–36.

Web page for the NAS Parallel benchmarks NPB2: http://www.nas.nasa.gov/Software/NPB/

OpenMP Forum, OpenMP Application Program Interface, version 2.5, Web page: www.openmp.org/, May 2005.

C. Schow, F. Doany, J. Kash, Get on the Optical Bus, IEEE Spectrum, September 2010, 31–35.

T. Shanley, Infiniband Network Architecture, Addison-Wesley, November 2002.

D.H.M. Spector, Building Unix Clusters, O'Reilly, Sebastopol, CA, USA, July 2000.

A.J. van der Steen, Exploring VLIW: Benchmark tests on a Multiflow TRACE 14/300, Academic Computing Centre Utrecht, Technical Report TR-31, April 1990.

A.J. van der Steen, Aspects of Computational Science, NCF, The Hague, 1995.

A.J. van der Steen, An evaluation of some Beowulf clusters, Technical Report WFI-00-07, Utrecht University, Dept. of Computational Physics, December 2000. (Also available through www.hpcresearch.nl/euroben, directory reports/.)

A.J. van der Steen, Overview of recent supercomputers, June 2005, www.hpcresearch.nl/euroben, directory reports/.

T.L. Sterling, J. Salmon, D.J. Becker, D.F. Savarese, How to Build a Beowulf, The MIT Press, Boston, 1999.

The STREAM benchmark: www.cs.virginia.edu/stream/

H.W. Meuer, E. Strohmaier, J.J. Dongarra, H.D. Simon, Top500 Supercomputer Sites, 33rd Edition, June 2010. The report can be downloaded from: http://www.top500.org/.

Task Force on Cluster Computing home page: http://www.cloudbus.org/~raj/tfcc/.

The home page of UPC can be found at: http://upc.gwu.edu.