
CENTER FOR
SCALABLE DATA ANALYTICS
AND ARTIFICIAL INTELLIGENCE

News

On Wednesday, 29 June 2016, we hosted a presentation by Prof. Dr. Peter Christen from the Australian National University, Canberra.

 

Recent developments and research challenges in data linkage 

Abstract: 

Techniques for linking and integrating data from different sources are becoming increasingly important in many application areas, including health, census, taxation, immigration, social welfare, crime and fraud detection, the assembly of national security intelligence, business, bibliometrics, and the social sciences.

In today's Big Data era, data linkage (also known as entity resolution, duplicate detection, and data matching) faces not only computational challenges, due to the increasing size and complexity of data collections, but also operational challenges: many applications are moving from static environments to real-time processing and analysis of potentially very large and dynamically changing data streams, where records must be linked in real time. Additionally, with growing public concern about the use of sensitive data, privacy and confidentiality often need to be considered when personal information is linked and shared between organisations.

 

Short bio:

Peter Christen is a Professor at the Research School of Computer Science at the Australian National University in Canberra. He received his Diploma in Computer Science Engineering from ETH Zurich in 1995 and his PhD in Computer Science from the University of Basel in 1999. His research interests are in data mining and data matching (record linkage). He has published over 140 articles in these areas, including the 2012 book 'Data Matching', published by Springer. In 2015 he co-edited the book 'Population Reconstruction', also published by Springer. He is the principal developer of Febrl (Freely Extensible Biomedical Record Linkage), an open source system for data cleaning, deduplication, and record linkage.