ScaDS Logo

COMPETENCE CENTER
FOR SCALABLE DATA SERVICES
AND SOLUTIONS

Sierra Platinum - Sierra Platinum

Beitragsseiten

Sierra Platinum

Therefore, we developed a new approach that allows to handle replicated measurements leading to several pairs of background and experiment called replicate. In addition, we also wanted to go one step further. Most current peak-caller are batch programs: given the data sets and some parameters as input they produce a list of peaks as output. Important information about the data is thus lost. Our approach combines the computation of the final result—the list of peaks—with additional statistics about the data. One example is the determination of the moments of the underlying distribution. The distribution is a mathematical model of the data. A moment is for example the average value of the data. If the found distributions and their parameters match the theoretical assumptions, then it is valid to apply the proposed peak-calling approach. Otherwise, it should be considered not to use the data set.

Computing these statistical measurements is only one part. The second part is to visualize this statistical information using adapted visualizations. The visualizations allow the analyst to literally see whether or not one of the data sets has a problem. Further, they allow to see the correlation among the replicates and the contribution of each data set to the final result—the list of peaks. If a data set has low quality it can either be completely excluded from the determination of the peaks, or its influence on the result can be reduced.