Usage and Outlook
The Galaxy cluster at Leipzig University is optimized for big data research on a shared-nothing architecture. Its flexible partitioning approach supports a broad range of usage, including efficient Hadoop job scheduling and dedicated temporary partitions configured according to the user's needs.
Access to Galaxy
The Galaxy cluster is available to scientists at universities in Saxony and the nearby region. Cluster usage is organized as follows:
- contact us to discuss your requirements
- create an account at register.sc.uni-leipzig.de
- get your reservation for cluster resources
- access the reserved resources via SSH to our gateway during the reservation interval
- do your research, e.g. with Apache Hadoop jobs or on an exclusive cluster partition
- clean up the resources after use
Contact:
- Dr. Stefan Kühne (URZ Leipzig, Research/Development)
- Vadim Bulst (URZ Leipzig, Infrastructure)
- Lars-Peter Meyer (ScaDS Dresden Leipzig)
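The access steps listed above tie the SSH login to the reservation interval. A minimal Python sketch of that check follows. The user name, the gateway host name and the reservation dates are hypothetical placeholders, since the article does not name them, and the helper functions are illustrative only and not part of any Galaxy tooling.

```python
from datetime import datetime
from typing import List


def reservation_active(start: datetime, end: datetime, now: datetime) -> bool:
    """Return True if `now` lies inside the reservation interval [start, end)."""
    return start <= now < end


def ssh_command(user: str, gateway: str) -> List[str]:
    """Build the argument list for an SSH login to the gateway."""
    return ["ssh", f"{user}@{gateway}"]


# Hypothetical values for illustration only; use your own account,
# the gateway address you were given, and your reservation dates.
start = datetime(2017, 3, 1, 8, 0)
end = datetime(2017, 3, 3, 18, 0)
now = datetime(2017, 3, 2, 12, 0)

if reservation_active(start, end, now):
    # Only print the command here; running it is left to the user.
    print(" ".join(ssh_command("alice", "gateway.example.org")))
else:
    print("No active reservation; the resources are not accessible.")
```

The interval is treated as half-open, so a login attempted exactly at the end time counts as outside the reservation. That is an assumption of this sketch, not a documented rule of the cluster.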
Although shared nothing is the cluster architecture most often mentioned in big data research, and Galaxy is an impressive example of it, we at ScaDS are happy to have further options for deploying and testing:
Single Server / Virtual Machine:
Sometimes a dedicated server or virtual machine is the best option.
In particular, temporary student projects build temporary shared-nothing clusters out of the small number of office PCs available.
IfI Big Data Research Cluster:
This shared-nothing cluster with 18 nodes is located at the computer science department (IfI) of Leipzig University. It is dedicated to flexible research and offers solid-state drives alongside conventional hard disks, so that the two can be compared.
As described in this article, the Galaxy cluster is equipped with about 2 petabytes of storage capacity, about 2000 computing cores and about 11.5 terabytes of RAM in total, organized in a shared-nothing architecture. Although it tends toward the production-oriented side, its partitioning concept offers considerable flexibility.
Venus is a shared-memory computer located at the ZIH at TU Dresden. It has 8 terabytes of RAM for 512 CPU cores and is managed with the Linux batch job scheduler Slurm.
The new Sirius cluster consists of 2 shared-memory computers and is optimized for in-memory databases. Each node is equipped with 6 terabytes of RAM and 256 cores. The cluster is located at the Leipzig University Computing Center.
The HPC cluster Taurus is located at the ZIH at TU Dresden and is currently listed as number 107 in the Top500 list of the world's supercomputers from November 2016. Its more than 10,000 cores are managed with the Linux batch job scheduler Slurm. Apache Hadoop can be deployed on Slurm.
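For the systems above whose RAM and core counts are both stated (Galaxy, Venus and Sirius), the memory available per core makes the difference between the shared-nothing and shared-memory machines easier to see. The following Python sketch does that arithmetic. It assumes decimal units (1 TB = 1000 GB) and takes Galaxy's "about" figures at face value, so the results are rough estimates and not official specifications.

```python
# RAM in terabytes and core counts as quoted in the text.
# Galaxy's values are approximate; Sirius is given per node.
SYSTEMS = {
    "Galaxy (whole cluster)": (11.5, 2000),
    "Venus": (8.0, 512),
    "Sirius (per node)": (6.0, 256),
}


def ram_per_core_gb(ram_tb: float, cores: int) -> float:
    """Return RAM per core in GB, assuming decimal units (1 TB = 1000 GB)."""
    return ram_tb * 1000 / cores


for name, (ram_tb, cores) in SYSTEMS.items():
    print(f"{name}: {ram_per_core_gb(ram_tb, cores):.1f} GB per core")
```

Under these assumptions Galaxy comes out at roughly 6 GB per core, Venus at roughly 16 GB and a Sirius node at roughly 23 GB, which fits the description of Sirius as optimized for in-memory databases.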