

TEC-3: Hadoop usage


Service Owner



Jan Frenzel


Developers, high-level programmers (Java), and data scientists who need distributed computing but do not want to implement parallelization aspects, such as communication or data distribution, themselves.




You can evaluate and use our Hadoop cluster for your data analysis. High-performance computers are not the only way to distribute data storage and processing and thereby reduce runtime: commodity hardware can serve the same purpose when organized as a so-called shared-nothing cluster. Its architecture and concepts differ from those common in HPC, and it supports other programming paradigms, such as MapReduce. Our Hadoop cluster lets you test this architecture and analyze your data. You can use it as a stand-alone cluster or share it with other users. If you plan to employ other software based on Hadoop, you can ask us for support.
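
To give a feeling for the MapReduce paradigm mentioned above, the following is a minimal sketch of a word count in plain Java. It deliberately does not use the Hadoop API and runs in a single process; the class name WordCountSketch and the methods map, shuffle, reduce, and wordCount are hypothetical names chosen for illustration. On a real Hadoop cluster the framework would distribute the map and reduce calls across nodes and perform the grouping step itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative, single-process sketch of the MapReduce paradigm.
// Assumption: this is NOT Hadoop API code; all names are hypothetical.
public class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    // A framework such as Hadoop would do this across the network.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum all counts that belong to one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map on every line, shuffle, then reduce per key.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or not to be", "to do");
        // prints {be=2, do=1, not=1, or=1, to=3}
        System.out.println(wordCount(lines));
    }
}
```

The point of the paradigm is that map and reduce contain no communication or data distribution logic; in this sketch the driver loop stands in for the work a framework would otherwise take over.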


  •  login to our Hadoop cluster

  •  a dedicated Hadoop cluster environment with a configuration specialized to the user's needs

  •  hosting of Hadoop applications

  •  quickstart guides for writing Hadoop jobs


  •  Collect information about your use case. Prepare for the following questions:

    •  Do you only want to evaluate Apache Hadoop?

    •  Do you already have a (serial/parallel) program?

    •  What challenge are you addressing with your program? This can include:

      •  Analyze data streams (in parallel)

      •  Distributed batch processing

    •  How long does your (serial/parallel) program take to run?

    •  How many resources do you need? (type of computing resources and amount of required computing time)

    •  Who is responsible for your project?

  •  Contact us (via e-mail or phone)

  •  We send you an application form. The form lets us review your use case and its specific requirements, which helps us provide any additional software you might need. We also need it to request computing resources.

  •  Fill out the form and send it back to us.

  •  We contact you once your login has been granted and you can access our cluster. This might take some time.

  •  We send you information about how to use our cluster. This includes material on how to log on, submit jobs to the cluster, write programs, and avoid potential bottlenecks.