

Successful ScaDS Big Data School in Leipzig - a Report - Wednesday & Thursday



After this sporty outing, we started the next day with presentations on “Distributed Data Processing”. First, Prof. Dr. Kai-Uwe Sattler of TU Ilmenau spoke about Big Data stream processing. He surveyed recent processing engines and discussed their different architectures, execution models and programming interfaces. The next speaker, Tilmann Rabl, a research director at the Database Systems and Information Management (DIMA) group and technical coordinator of the Berlin Big Data Center (BBDC), introduced the open source system Apache Flink in his talk “Distributed Data Processing and Streaming in Flink”. Flink enables fast and efficient analysis of both batch and streaming data. A research assistant from the Center for Information Services and High Performance Computing at TU Dresden followed with a different approach in his talk “Introduction to Big Data Analytics on HPC clusters”.

That day we continued our practical courses with Apache Flink and ended with a dinner at the “Bayerischer Bahnhof”, which offered a wide range of international specialties and the locally brewed beer “Gose”.


On Thursday we welcomed Prof. Dennis Shasha of New York University, who introduced the day’s topic, “Graph Analytics”, with his talk “Fast Methods for Finding Colored Motifs in Graphs”. He focused on the problem of finding subgraphs of a network. Next, Vasia Kalavri of KTH, Stockholm, introduced the Gelly framework in the talk “Graph processing on Apache Flink with the Gelly framework” and showed how graph analysis tasks can be expressed using Flink operators and different graph processing models. Another approach to graph analytics was presented by Martin Junghanns, a researcher at the University of Leipzig. In his talk “Graph Analytics with Gradoop” he explained the functionality of Gradoop. The last speaker of the day, Prof. Sören Auer of the University of Bonn, spoke about “(Big) Knowledge Graphs”. He introduced the concept of knowledge graphs based on RDF and the Linked Data paradigm, and discussed recent and future Big Knowledge Graph applications as well as strategies for combining Linked Data paradigms with Big Data.

For the last practical sessions of this summer school, we introduced the previously mentioned Gelly framework for Apache Flink.