ScaDS Logo

COMPETENCE CENTER
FOR SCALABLE DATA SERVICES
AND SOLUTIONS

Using Apache Flink CEP for real-time logistics monitoring

Beitragsseiten

With the increasing distribution of smart devices and sensor systems it is now possible to get data and context information of any element of the real-world. The Internet of Things (IoT) is synonym of this trend, which also becomes a social meaning because it affects all areas of everyday life. But of course this trend provides massive possibilities to improve our life, as well as our companies. So for example real-time insights into business processes are getting more important in recent times. With the right information and only short delays between certain incidents decision makers on operational level are able to quickly react and adapt changes to the processes. Thus, companies with in-depth knowledge of their processes have options to optimize their business as well as to offer new service levels for customers and increase earnings.

[1]

One domain where this necessity was already seen a few years ago, is the logistics sector. Since some years, most logistics providers monitor the location status of the goods. Logistics customers want to know where the goods are, but also what happened with them. For example, fragile or sensitive goods can be very expansive for manufacturers, if they are damaged and production stands still because they aren’t in stock anymore. For this reasons logistics providers have to collect more information about the goods on a more detailed level. To collect the GPS coordinates of trucks, airplanes or ships isn’t sufficient for this. Its needed to gather data on a packet level, so information about environmental conditions or movements for each shipment good independently.

This mass of information first must be handled. One option is to use data warehouse and business intelligence systems for saving and processing data. But this approach lacks in speed of action. Depending on the used technologies it can take hours until incidents and variances in processes are detected. Another technology for analysing streams of real-time data is Complex Event Processing (CEP). Having its roots on ECA (event-condition-action) rules of database systems, firstly they were developed to detect incidents in stock exchange trading. Due to emerging new data sources they were also used to handle also RFID or other sensor data. But traditional CEP engines like Esper, Drools or Tibco are not efficiently scalable to handle millions of events per second and simultaneously detect complex event patterns in this data. Hence with the rise of scalable stream-processing frameworks like Apache Storm, Apache Spark Streaming and Apache Flink the foundations for scalable complex event processing was laid. While Spark Streaming-based CEP engines retain the same performance limits due to the micro-batch processing of Apache Spark (e.g. Stratio Decision [2]), Storm-based frameworks need to involve a whole system stack (e.g. WSO2 CEP [3]). Now the Apache Flink framework provides a CEP library because “[…] its true streaming nature and its capabilities for low latency as well as high throughput stream processing is a natural fit for CEP workloads.”[4]

Leipzig University and some business partners explored this research topic in the LogiLEIT [5] project funded by the german federal ministry of education and research. In cooperation with the ScaDS competence center a scalable approach based on big data technologies was developed to enhance the current capabilities.