CENTER FOR
SCALABLE DATA ANALYTICS
AND ARTIFICIAL INTELLIGENCE

Big Data Cluster in “Shared Nothing” Architecture in Leipzig - Usage and Outlook

Usage and Outlook

The Galaxy cluster at Leipzig University is optimized for big data research on a shared nothing architecture. Its flexible partitioning approach supports a wide variety of uses, including efficient Hadoop job scheduling and dedicated temporary partitions configured to the user's needs.

Access to Galaxy

The Galaxy cluster is available for scientists at universities in Saxony and the near vicinity. Cluster usage is organized as follows:

  • contact us and we will discuss your requirements
  • create an account at register.sc.uni-leipzig.de
  • get your reservation for cluster resources
  • access the reserved resources during the reservation interval via SSH to our gateway
  • do your research, e.g. with Apache Hadoop jobs or on an exclusive cluster partition
  • clean up the resources after use

You can contact us by e-mail or directly:

  • Dr. Stefan Kühne (URZ Leipzig, Research/Development)
  • Vadim Bulst (URZ Leipzig, Infrastructure)
  • Lars-Peter Meyer (ScaDS Dresden Leipzig)

More Options

Although shared nothing is the cluster architecture mentioned most often in big data research, and Galaxy is quite an impressive example of it, we at ScaDS are happy to have more choices for deployment and testing:

Single Server / Virtual Machine:
Sometimes a dedicated server or virtual machine is the best option.

Temporary Clusters:
Short-term student projects in particular build temporary shared nothing clusters out of the small number of office PCs available.

IfI Big Data Research Cluster:
This shared nothing cluster with 18 nodes is located at the computer science department (IfI) of Leipzig University. It is dedicated to flexible research and offers solid state disks alongside conventional hard disks as an option for comparisons.

Galaxy Cluster:
As described in this article, the Galaxy cluster is equipped with about 2 petabytes of storage capacity, about 2000 computing cores and about 11.5 terabytes of RAM in total, organized in a shared nothing architecture. Although it leans toward production use, its partitioning concept offers quite a lot of flexibility.

Venus System:
Venus is a shared memory computer located at the ZIH at TU Dresden. It has 8 terabytes of RAM for 512 CPU cores and is managed with the Linux batch job scheduler Slurm.

Sirius Cluster:
The new Sirius cluster with 2 shared memory computers is optimized for in-memory databases. Each node is equipped with 6 terabytes of RAM and 256 cores. The cluster is located at the Leipzig University Computing Center.

Taurus Cluster:
The HPC cluster Taurus is located at the ZIH at TU Dresden and is ranked number 107 in the November 2016 Top500 list of the world's supercomputers.
Its more than 10,000 cores are managed with the Linux batch job scheduler Slurm. Deploying Apache Hadoop on Slurm is possible.

At ScaDS we have a broad range of systems at our disposal: from temporary test setups on single machines or temporary clusters, through shared nothing systems and large shared memory systems, to the HPC cluster Taurus, which is listed in the Top500 list of supercomputers.
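
To make the differences between these systems more tangible, the quoted RAM and core counts can be turned into a rough memory-per-core figure. The following is only a sketch based on the approximate numbers given above; the conversion of 1 terabyte to 1000 gigabytes and the names `SYSTEMS`, `ram_per_core_gb` and `ranking` are assumptions made for this illustration. Taurus and the IfI cluster are left out because the text gives no RAM figures for them.

```python
# Approximate figures quoted in the text above.
# Assumption for this sketch: 1 terabyte is counted as 1000 gigabytes.
SYSTEMS = {
    "Galaxy": {"ram_tb": 11.5, "cores": 2000},
    "Venus": {"ram_tb": 8.0, "cores": 512},
    "Sirius (per node)": {"ram_tb": 6.0, "cores": 256},
}


def ram_per_core_gb(ram_tb: float, cores: int) -> float:
    """Return the average RAM per core in gigabytes."""
    return ram_tb * 1000.0 / cores


def ranking():
    """Return (system, GB per core) pairs, highest memory per core first."""
    rows = [(name, ram_per_core_gb(**spec)) for name, spec in SYSTEMS.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, gb in ranking():
        print(f"{name:<18} {gb:6.2f} GB RAM per core")
```

With these numbers, the shared memory systems Sirius and Venus provide several times more memory per core than Galaxy, which matches their role for in-memory workloads, while Galaxy spreads its capacity across many shared nothing nodes.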