Tilmann Rabl: Big Data Stream Processing
Topic: Data analysis has become ever more scalable in recent years, and many systems have been introduced that can process huge amounts of data efficiently. However, much of this data is most valuable when it is processed incrementally, soon after it is produced. To this end, scalable stream processing systems have been developed. In contrast to traditional stream processing systems, these can scale to tens or hundreds of nodes, like their batch processing counterparts. In this talk, we will give an overview of big data stream processing. We will give details on the use of big data streaming systems, using Apache Flink as an example, and explain the differences between current open-source systems. Apache Flink is an open-source system for expressive, declarative, fast, and efficient data analysis on both batch and streaming data. Flink combines the scalability and programming flexibility of distributed MapReduce-like platforms with the efficiency, out-of-core execution, and query optimization capabilities found in parallel databases.

Short Bio: Tilmann Rabl is a Visiting Professor at the Database Systems and Information Management (DIMA) group at the Technische Universität Berlin. At DIMA, he is research director and technical coordinator of the Berlin Big Data Center (BBDC).