Bachelor Thesis (Leipzig): Dockerizing Big Data Software
Big data and containers are both important topics in the IT world, but their combination needs more research. We are looking for someone to develop and test a concept for running big data cluster software such as Apache Hadoop or Apache Kafka in UNIX containers for use by researchers. Possible container technologies include Docker, rkt, or others; networking and orchestration could be handled with, for example, Mesos, Kubernetes, Docker Swarm, or custom scripting.
The work includes the following subtasks:
- collection of requirements for big data cluster software
- overview of container technologies
- development of a showcase for big data software in a container cluster
- evaluation of the concept on a big data cluster