Management and Processing of Mass Spectrometry Data
The research project “Management and Processing of Mass Spectrometry Data” is a collaboration between the big data center ScaDS Dresden/Leipzig and the Department of Analytical Chemistry at the Helmholtz Centre for Environmental Research (UFZ).
The Helmholtz Centre for Environmental Research (UFZ) investigates the interaction between the environment and humans. One focus is the fate of chemicals that are released into the environment with unknown effects on humans and the ecosystem. Outside the laboratory, however, chemical substances never occur in isolation but mix with a background of naturally occurring molecules. This natural organic matter is one of the most complex mixtures of chemical substances and is found almost everywhere on the planet.
At the UFZ, state-of-the-art analytical tools (e.g. ultra-high resolution mass spectrometry) are currently used to characterize naturally occurring molecules with high accuracy and precision. Tens of thousands of molecules can be detected in each sample, and their chemical formulas can be calculated from precise mass measurements and the known masses of atoms. As a result, the data sets produced by these instruments are large, structurally complex, and intrinsically connected by chemical rules and measurement parameters.
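The idea of calculating chemical formulas from a precise mass and the known masses of atoms can be illustrated with a minimal sketch. It is not the project's actual algorithm: it assumes a neutral monoisotopic mass (the ion charge is already corrected for), considers only the elements C, H, N and O, and enumerates all combinations by brute force. The function names, the element limits and the 1 ppm tolerance are hypothetical choices made for illustration, and no chemical plausibility rules are applied.

```python
from itertools import product

# Monoisotopic masses of the most abundant isotopes, in unified atomic mass units (u).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}


def formula_mass(counts):
    """Monoisotopic mass of a formula given as {element: atom count}."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in counts.items())


def assign_formulas(neutral_mass, tolerance_ppm=1.0, max_counts=None):
    """Return (formula, error in ppm) pairs whose mass matches within the tolerance.

    Brute-force enumeration; the default upper bounds per element are
    illustrative assumptions, not values taken from the project.
    """
    if max_counts is None:
        max_counts = {"C": 50, "H": 100, "N": 4, "O": 30}
    elements = list(max_counts)
    window = neutral_mass * tolerance_ppm * 1e-6  # absolute mass window in u
    hits = []
    for combo in product(*(range(max_counts[el] + 1) for el in elements)):
        counts = dict(zip(elements, combo))
        error = formula_mass(counts) - neutral_mass
        if abs(error) <= window:
            hits.append((counts, error / neutral_mass * 1e6))
    # Best match (smallest absolute error) first.
    return sorted(hits, key=lambda hit: abs(hit[1]))
```

For example, `assign_formulas(180.06339)` should list C6H12O6 among its candidates. With tens of thousands of peaks per sample and more elements, the search space grows quickly, which is why efficient evaluation algorithms matter.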
In this joint project of ScaDS and the UFZ Department of Analytical Chemistry, we aim to develop a big data integration and analysis pipeline that facilitates the handling of these extensive data sets. Manual steps in such complex workflows are prone to error, impair reproducibility, and limit scalability. We are therefore building a flexible end-to-end analytics platform to process, manage, and analyze mass spectrometry data of complex mixtures of molecules. The platform is based on the data analytics, reporting, and integration platform KNIME, an Oracle database, and the statistical programming language R. We use these tools to implement workflows covering efficient data evaluation algorithms as well as novel visualizations of mass spectrometry data.
- Dr. Anika Groß
- Dr. Eric Peukert
- Kevin Jakob (Student)
Helmholtz-Zentrum für Umweltforschung UFZ
- Dr. Oliver Lechtenfeld
- Initial automation of manual steps with KNIME
- Data management for mass spectrometry data
- Parallelization of sum formula computation
- Automatic extraction of metadata
- Fast automated validation of measurements
- Creation of a service offering at the UFZ for many mass spectrometry users
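The parallelization goal listed above can be sketched as follows. Each measured mass can be processed independently, so the work can be spread over several worker processes. This is only an illustration in Python under stated assumptions, not the project's KNIME/R implementation: the names `candidates` and `assign_all`, the restriction to C, H and O, the element limits and the 1 ppm tolerance are all hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from itertools import product

# Monoisotopic masses in unified atomic mass units (u).
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
# Hypothetical upper bounds on atom counts, chosen only for this sketch.
MAX_COUNT = {"C": 40, "H": 80, "O": 20}


def candidates(neutral_mass, tolerance_ppm=1.0):
    """Return CHO formulas (as strings) matching one neutral mass within the tolerance."""
    window = neutral_mass * tolerance_ppm * 1e-6
    found = []
    for c, h, o in product(*(range(MAX_COUNT[el] + 1) for el in "CHO")):
        mass = c * MASS["C"] + h * MASS["H"] + o * MASS["O"]
        if abs(mass - neutral_mass) <= window:
            found.append(f"C{c}H{h}O{o}")
    return found


def assign_all(masses, executor_cls=ProcessPoolExecutor, workers=4):
    """Compute candidate formulas for many masses, one independent task per mass.

    ProcessPoolExecutor uses several CPU cores; ThreadPoolExecutor can be
    passed instead for quick checks where process start-up is not worth it.
    """
    masses = list(masses)
    with executor_cls(max_workers=workers) as pool:
        # map() returns the results in the same order as the input masses.
        results = pool.map(candidates, masses)
        return dict(zip(masses, results))
```

When the default process pool is used from a script, `assign_all` should be called under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because the tasks share no state, the same pattern carries over to distributing peaks across cluster nodes.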