HPC Architecture
  1. Shared-memory SIMD machines
  2. Distributed-memory SIMD machines
  3. Shared-memory MIMD machines
  4. Distributed-memory MIMD machines
  5. ccNUMA machines
  6. Clusters
  7. Processors
    1. AMD Opteron
    2. IBM POWER7
    3. IBM BlueGene/Q processor
    4. Intel Xeon
    5. The SPARC processors
  8. Accelerators
    1. GPU accelerators
      1. ATI/AMD
      2. nVIDIA
    2. General computational accelerators
      1. Intel Xeon Phi
    3. FPGA accelerators
      1. Convey
      2. Kuberre
      3. SRC
  9. Interconnects
    1. Infiniband
Available systems
  • The Bull bullx system
  • The Cray XC30
  • The Cray XE6
  • The Cray XK7
  • The Eurotech Aurora
  • The Fujitsu FX10
  • The Hitachi SR16000
  • The IBM BlueGene/Q
  • The IBM eServer p775
  • The NEC SX-9
  • The SGI Altix UV series
  • Systems disappeared from the list
  • Systems under development

    Since May 2009, Kuberre has marketed its FPGA-based HANSA system, although the information provided is extremely scant. The company has traditionally been involved in financial computing, and with the rising need for HPC in this sector Kuberre has built a system that houses 1--16 boards, each with 4 Altera Stratix II FPGAs and 16 GB of memory, in addition to one dual-core x86-based board that acts as a front-end. The host board runs the Linux or Windows OS and the compilers.
    For programming, a C/C++ or Java API is available. Although Kuberre is oriented almost exclusively to the financial-analytics market, the little material that is accessible shows that libraries such as ScaLAPACK, Monte-Carlo algorithms, FFTs, and wavelet transforms are available. For the Life Sciences, standard applications like BLAST and Smith-Waterman are present. The standard GNU C libraries can also be linked seamlessly.
    The processors are organised in a grid fashion and use a 256 GB distributed shared cache to combat data-access latency. The system comes configured either with 768 RISC CPUs for what are called "generic C/C++ programs" or with 1536 double-precision cores for heavy numerical work. It is possible to split the system to run up to 16 different "contexts" (reminiscent of Convey's personalities, see The Convey HC-2). For instance, part of the machine may be dedicated to a Life Science application while other parts work on encryption and numerical applications.
    As with the Convey HC-2, it is hardly possible to give performance figures, but a fully configured machine with 16 boards should be able to obtain 250 Gflop/s on the Linpack benchmark, which cannot really be regarded as ''High Performance'' these days. However, it may do very well on specialised workloads.
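    The headline numbers can be cross-checked with some simple arithmetic. The sketch below derives per-board and per-core figures purely from the configuration stated above (16 boards, 4 FPGAs per board, 1536 double-precision cores, 250 Gflop/s on Linpack); the breakdown is our own back-of-the-envelope calculation, not a figure published by Kuberre:

    ```python
    # Derive per-component figures from the stated full HANSA configuration.
    # Inputs are taken from the text; the breakdown is illustrative arithmetic.
    boards = 16
    fpgas_per_board = 4
    dp_cores_total = 1536      # double-precision cores in the numerical configuration
    linpack_gflops = 250.0     # claimed Linpack result for the full machine

    fpgas = boards * fpgas_per_board                   # 64 FPGAs in total
    dp_cores_per_fpga = dp_cores_total // fpgas        # 24 DP cores per FPGA
    gflops_per_board = linpack_gflops / boards         # ~15.6 Gflop/s per board
    gflops_per_core = linpack_gflops / dp_cores_total  # ~0.16 Gflop/s per DP core

    print(fpgas, dp_cores_per_fpga, gflops_per_board, gflops_per_core)
    ```

    The roughly 0.16 Gflop/s per core underlines the point made above: the aggregate Linpack number is modest, and the attraction of the system lies in its specialised, reconfigurable workloads rather than raw dense linear algebra.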
    The publicly available material does not permit drawing a reliable block diagram, but this may change once the system is installed at sites that want to evaluate it.