
COMPETENCE CENTER
FOR SCALABLE DATA SERVICES
AND SOLUTIONS

 

Management and Processing of Mass Spectrometry Data

 

The research project “Management and Processing of Mass Spectrometry Data” is a collaboration between the big data center ScaDS Dresden/Leipzig and the Department of Analytical Chemistry at the Helmholtz Centre for Environmental Research (UFZ).

Background

The Helmholtz Centre for Environmental Research (UFZ) investigates the interaction between the environment and humans. One focus is the fate of chemicals that are released into the environment with unknown effects on humans and the ecosystem. Outside the laboratory, however, chemical substances never occur in isolation but mix with a background of naturally occurring molecules. This natural organic matter is one of the most complex mixtures of chemical substances and is found almost everywhere on the planet.

At the UFZ, state-of-the-art analytical tools (e.g. ultra-high resolution mass spectrometry) are currently used to characterize naturally occurring molecules with high accuracy and precision. Tens of thousands of molecules can be detected in each sample, and their chemical formulas can be calculated from precise mass measurements and the known masses of atoms. As a result, the data sets produced by these instruments are large, structurally complex, and intrinsically connected by chemical rules and measurement parameters.
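The calculation of chemical formulas from precise masses can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the restriction to the elements C, H, N and O, the element limits, and the assumption that the input is already a neutral monoisotopic mass (rather than the m/z of a charged ion) are all chosen for illustration only. The only chemical rule applied is the upper bound of 2C + N + 2 hydrogen atoms.

```python
# Monoisotopic masses of the most abundant isotopes, in unified atomic mass units (u)
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def formula_mass(counts):
    """Sum the monoisotopic masses for a mapping of element symbol to atom count."""
    return sum(MONOISOTOPIC_MASS[element] * n for element, n in counts.items())


def assign_formulas(neutral_mass, tolerance_ppm=1.0, max_c=50, max_n=4, max_o=30):
    """Enumerate CHNO formulas whose mass lies within tolerance_ppm of neutral_mass.

    Returns a list of (counts, error_ppm) tuples sorted by absolute error.
    """
    tolerance = neutral_mass * tolerance_ppm * 1e-6
    hits = []
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = formula_mass({"C": c, "N": n, "O": o})
                rest = neutral_mass - heavy
                if rest < -tolerance:
                    break  # adding more oxygen only increases the mass further
                # The remaining mass must be made up of hydrogen atoms
                h = round(rest / MONOISOTOPIC_MASS["H"])
                if h > 2 * c + n + 2:
                    continue  # more hydrogen than a saturated molecule can carry
                counts = {"C": c, "H": h, "N": n, "O": o}
                error = formula_mass(counts) - neutral_mass
                if abs(error) <= tolerance:
                    hits.append((counts, error / neutral_mass * 1e6))
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # Example input: a mass close to that of glucose (C6H12O6)
    for counts, error_ppm in assign_formulas(180.06339):
        print(counts, f"{error_ppm:+.3f} ppm")
```

With tens of thousands of peaks per sample, each peak needs its own search of this kind, which is one reason the resulting data sets grow large and why the search limits and tolerance have to be chosen carefully.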

Objectives

In this joint project of ScaDS and the UFZ Department of Analytical Chemistry, we aim to develop a big data integration and analysis pipeline that facilitates the handling of these extensive data sets. Manual steps in such complex workflows are prone to error, impair reproducibility and limit scalability. We are therefore building a flexible end-to-end analytics platform to process, manage and analyze mass spectrometry data of complex mixtures of molecules. The platform is based on the data analytics, reporting and integration platform KNIME, an Oracle database and the statistical programming language R. We use these tools to implement workflows covering efficient data evaluation algorithms as well as novel visualizations of mass spectrometry data.

 

 

Project Members

ScaDS Dresden/Leipzig

  • Dr. Anika Groß
  • Dr. Eric Peukert
  • Kevin Jakob (Student)

Helmholtz-Zentrum für Umweltforschung UFZ

  • Dr. Oliver Lechtenfeld

 

Preliminary Results

  • Initial automation of manual steps with KNIME
  • Data management for mass spectrometry data
  • Parallelization of sum formula computation
  • Automatic extraction of metadata
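Sum formula computation parallelizes well because each measured peak can be assigned independently of all others. The following Python sketch shows the general idea only. It does not reflect how the project parallelizes the computation in KNIME, R or the database, and all names, the restriction to C, H and O, and the element limits are hypothetical.

```python
from concurrent.futures import Executor, ProcessPoolExecutor

# Monoisotopic masses in unified atomic mass units (u)
MASS_C, MASS_H, MASS_O = 12.0, 1.00782503, 15.99491462


def assign_cho(neutral_mass, tolerance_ppm=1.0, max_c=50, max_o=30):
    """Return all (C, H, O) atom counts whose mass matches within the tolerance."""
    tolerance = neutral_mass * tolerance_ppm * 1e-6
    hits = []
    for c in range(1, max_c + 1):
        for o in range(max_o + 1):
            rest = neutral_mass - c * MASS_C - o * MASS_O
            if rest < -tolerance:
                break  # more oxygen only makes the formula heavier
            h = round(rest / MASS_H)  # remaining mass must be hydrogen
            if h <= 2 * c + 2 and abs(h * MASS_H - rest) <= tolerance:
                hits.append((c, h, o))
    return hits


def assign_batch(masses, executor: Executor, chunksize=256):
    """Assign every mass using the given executor; peaks do not depend on each other."""
    results = executor.map(assign_cho, masses, chunksize=chunksize)
    return dict(zip(masses, results))


def run_parallel(masses):
    """Distribute the peaks over one worker process per available CPU core."""
    with ProcessPoolExecutor() as pool:
        return assign_batch(masses, pool)
```

`run_parallel` should be called from a script's `if __name__ == "__main__":` block so that worker processes can import the module safely. Passing the executor into `assign_batch` keeps the assignment logic independent of how the work is distributed, so the same function can also be run with a thread pool or a single worker for testing.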

 

 

Outlook

 

  • Automated, fast validation of measurements
  • Creation of a service offering at the UFZ for a large number of mass spectrometry users