
Big Data Cluster in “Shared Nothing” Architecture in Leipzig - Galaxy Hardware


Common Hardware Recommendations

A number of somewhat older sources with hardware recommendations for big data clusters are publicly available.

Most of these guides go into more detail, e.g. distinguishing between compute-intensive and storage/IO-intensive workloads, but here is a very brief summary:

  • master nodes need higher reliability, slave/worker nodes less
  • 1-2 sockets with 4-8 cores each and medium clock speed
  • 32-512 gigabytes of ECC RAM
  • 4-24 hard disks with 2-4 terabytes each and a SATA or SAS interface
  • 1-10 gigabit/s Ethernet

The Actual Hardware of Galaxy

For our big data cluster Galaxy we decided, as mentioned before, on a shared nothing architecture. Special focus was put on a large number of nodes and on high flexibility, to accommodate the diverse needs of our researchers. A Europe-wide tender procedure provided us with good cluster hardware:

90 nodes with the following homogeneous hardware specification:

  • 2 sockets, each equipped with a 6-core CPU (Intel Xeon E5-2620 v3, 2.4 GHz, with Hyperthreading support)
  • 128 GB RAM (DDR4, ECC)
  • 6 SATA hard disks with 4 terabytes each,
  • attached via a controller supporting either JBOD (recommended for Apache HDFS) or a fault-tolerant RAID 6 configuration
  • 10 gigabit/s Ethernet interface
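
Taken together, these per-node figures add up to considerable aggregate capacity. The following back-of-envelope Python sketch is our own illustration, not part of the procurement documents; the replication factor of 3 is simply Hadoop's default dfs.replication value, not necessarily the setting used on Galaxy:

```python
# Back-of-envelope aggregation of the Galaxy cluster resources.
# Per-node figures are taken from the specification above; the
# HDFS replication factor of 3 is Hadoop's default (dfs.replication),
# assumed here for illustration only.

NODES = 90
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 6          # Intel Xeon E5-2620 v3
RAM_GB_PER_NODE = 128
DISKS_PER_NODE = 6
TB_PER_DISK = 4
HDFS_REPLICATION = 3          # assumed Hadoop default

physical_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
hw_threads = physical_cores * 2                 # Hyperthreading: 2 threads/core
total_ram_tb = NODES * RAM_GB_PER_NODE / 1024
raw_storage_tb = NODES * DISKS_PER_NODE * TB_PER_DISK
usable_hdfs_tb = raw_storage_tb / HDFS_REPLICATION

print(f"Physical cores:     {physical_cores}")          # 1080
print(f"Hardware threads:   {hw_threads}")              # 2160
print(f"Total RAM:          {total_ram_tb:.2f} TB")     # 11.25 TB
print(f"Raw disk capacity:  {raw_storage_tb} TB")       # 2160 TB
print(f"Usable HDFS (r=3):  {usable_hdfs_tb:.0f} TB")   # 720 TB
```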

In addition, we operate a dedicated virtualization infrastructure, on which management nodes and master nodes can be organized in as many virtual machines as needed. These virtual machines benefit from better protection against hardware failures, but can only use limited resources compared with a dedicated server.

Network Infrastructure

The big data cluster spans both ScaDS partner locations, Dresden and Leipzig. 30 of the 90 nodes are located at TU Dresden; the other 60 nodes and the management infrastructure are located at Leipzig University. The nodes of both locations are organized in a common private network, connected transparently via a VPN tunnel. To achieve optimal performance on this tunnel, the VPN endpoints on both sides are specialized routers with hardware support for the VPN encapsulation. The Ethernet bandwidth within each location is 10 gigabit/s via non-blocking switches. For security reasons the cluster's private network is accessible only from the network of Leipzig University; scientists from other institutions with a project on the big data infrastructure are currently provided with a VPN login when needed.
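
To get a feeling for what a 10 gigabit/s link means for moving data between the two sites, here is a small, purely illustrative Python sketch of ours. It ignores VPN and protocol overhead as well as concurrent traffic, so real transfers will be slower:

```python
# Rough estimate of cross-site transfer time over the 10 Gbit/s link.
# Ignores VPN/protocol overhead and contention; illustrative only.

LINK_GBIT_S = 10                      # nominal Ethernet bandwidth

def transfer_time_seconds(data_tb: float, link_gbit_s: float = LINK_GBIT_S) -> float:
    """Time to push `data_tb` terabytes through a `link_gbit_s` link."""
    data_gbit = data_tb * 1000 * 8    # TB -> GB -> gigabits (decimal units)
    return data_gbit / link_gbit_s

for tb in (0.1, 1, 10):
    secs = transfer_time_seconds(tb)
    print(f"{tb:>5} TB -> {secs:>6.0f} s (~{secs / 3600:.2f} h)")
# At a nominal 10 Gbit/s, 1 TB takes about 800 s, i.e. roughly 13 minutes.
```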

In summary, the Galaxy cluster consists of 90 worker nodes, a virtualization infrastructure for management and master nodes, and the network infrastructure that transparently interconnects the two locations.