Bachelor or Master Thesis (Leipzig): Efficient storage and retrieval of temporal graph data
Graphs are ubiquitous, and many of them are large and change substantially over time. Typical examples of large time-evolving graphs are information networks on the web and social networks with billions of vertices and edges. Analysing such graphs, which represent interrelated and evolving information, is needed for numerous applications, e.g., in social networks to analyse user communities, in bioinformatics to analyse protein-protein interactions, in e-commerce to analyse website usage and customer purchases, or in criminology to analyse the behaviour of suspects together with all their relevant actions.
In this work we would like to investigate how to extend an existing property graph model with temporal aspects, and how to efficiently store and query such temporal graph information in a distributed graph store. If possible, the work shall be based on the graph analytics framework GRADOOP and should investigate possible extensions of the underlying graph storage mechanism, which is currently implemented on Apache HBase.
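To make the idea of a temporally extended property graph concrete, the following minimal sketch attaches a validity interval to each edge and retrieves the graph as it existed at a given point in time. It is only an illustration under stated assumptions: the names `TemporalEdge`, `valid_from`, `valid_to` and `snapshot` are hypothetical and not part of GRADOOP, and the half-open interval convention is one possible design choice among several.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TemporalEdge:
    """A property graph edge with a validity interval (hypothetical model)."""
    source: str
    target: str
    label: str
    valid_from: int                  # inclusive start, e.g. epoch milliseconds
    valid_to: Optional[int] = None   # exclusive end; None means "still valid"
    properties: Dict[str, str] = field(default_factory=dict)

    def valid_at(self, t: int) -> bool:
        # Half-open interval [valid_from, valid_to)
        return self.valid_from <= t and (self.valid_to is None or t < self.valid_to)


def snapshot(edges: List[TemporalEdge], t: int) -> List[TemporalEdge]:
    """Return all edges that were valid at time t."""
    return [e for e in edges if e.valid_at(t)]


edges = [
    TemporalEdge("alice", "bob", "follows", valid_from=10, valid_to=20),
    TemporalEdge("bob", "carol", "follows", valid_from=15),
]
print([(e.source, e.target) for e in snapshot(edges, 12)])
```

A linear scan like this is fine for illustration; storing and indexing the intervals so that such snapshot queries stay efficient on very large graphs is exactly the question the thesis is meant to address.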
The thesis consists of the following subtasks:
- Getting an overview of related work on (distributed) temporal graph storage
- Getting an overview of the Apache technology stack and its support for temporal data (e.g., HBase, Accumulo)
- Analysis of a temporal analytics use case of a ScaDS use-case partner
- Design of a conceptual storage schema for temporal graph data
- Prototypical implementation and evaluation of the newly developed temporal graph store, and prototypical integration into GRADOOP
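As a starting point for thinking about the storage schema, the sketch below shows one possible row-key layout for a key-value store that keeps its keys in lexicographic byte order, as HBase does. The key combines the vertex ID with a fixed-width big-endian timestamp, so that all versions of a vertex are adjacent and sorted chronologically, and a time-range query becomes a single range scan. Everything here is an assumption for illustration: `row_key`, `SortedStore` and `versions_between` are hypothetical names, the store is a small in-memory stand-in rather than an HBase client, and the layout is not the schema GRADOOP uses.

```python
import bisect
import struct
from typing import List, Tuple


def row_key(vertex_id: str, timestamp: int) -> bytes:
    # Assumes vertex IDs contain no NUL byte, so b"\x00" is a safe separator.
    # ">Q" packs an unsigned 64-bit big-endian integer, so byte order
    # matches numeric order for non-negative timestamps.
    return vertex_id.encode("utf-8") + b"\x00" + struct.pack(">Q", timestamp)


class SortedStore:
    """In-memory stand-in for a store with lexicographically sorted keys."""

    def __init__(self) -> None:
        self._keys: List[bytes] = []
        self._values = {}

    def put(self, key: bytes, value: str) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start: bytes, stop: bytes) -> List[Tuple[bytes, str]]:
        # Range scan over [start, stop)
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


def versions_between(store: SortedStore, vertex_id: str,
                     t_start: int, t_end: int) -> List[str]:
    """All stored versions of a vertex with t_start <= timestamp < t_end."""
    rows = store.scan(row_key(vertex_id, t_start), row_key(vertex_id, t_end))
    return [value for _, value in rows]


store = SortedStore()
store.put(row_key("v1", 100), "name=Alice")
store.put(row_key("v1", 300), "name=Alice C.")
store.put(row_key("v1", 200), "name=Alice B.")
print(versions_between(store, "v1", 100, 300))
```

Whether such a composite key, the store's built-in cell versioning, or a different layout is the better fit depends on the query workload, which is part of what the related-work and use-case analysis should clarify.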