Sierra Platinum - Multi-Replicate Peak-Calling


Multi-Replicate Peak-Calling

Quite a few peak-callers exist that can find peaks for single experiment-background pairs. Lately, however, the measurement protocol has become so cheap that replicates are affordable. The same cell sample is divided into several parts, and each part is sent to sequencing to perform the measurements for background and experiment. This leads to several replicates of the resulting background and experiment data sets.

Now, peak-calling becomes more involved. Most available peak-callers cannot handle this situation. Thus, two methods were created to handle replicates. The first combines all background data sets and all experiment data sets, and uses the combined data sets as input for a single-data-set peak-caller. The second calls peaks for each data set separately and then combines the results. Both methods are problematic. The first method assumes that the number of elements determining the signal is the same for both background and experiment. With the second method, peaks that are not reliable might be included in the final result. With both methods, it cannot easily be decided how well a data set matches the final results. Moreover, if the quality of a data set is bad, it ought to be excluded or contribute less to the overall result.
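The difference between the two workarounds can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function call_peaks is a hypothetical toy caller with a simple fold-change rule, the read counts per window are invented, and the per-replicate results are combined by a plain union. It does not reproduce the model of any real peak-caller.

```python
from typing import List, Set

def call_peaks(experiment: List[int], background: List[int], fold: float = 2.0) -> Set[int]:
    """Hypothetical toy caller for one experiment-background pair.

    Window i is called a peak if its experiment count is at least
    `fold` times the background count plus a pseudocount of 1.
    """
    return {
        i
        for i, (e, b) in enumerate(zip(experiment, background))
        if e >= fold * (b + 1)
    }

def pooled_strategy(experiments: List[List[int]], backgrounds: List[List[int]]) -> Set[int]:
    """Method 1: sum all replicates per window, then call peaks once."""
    pooled_exp = [sum(column) for column in zip(*experiments)]
    pooled_bg = [sum(column) for column in zip(*backgrounds)]
    return call_peaks(pooled_exp, pooled_bg)

def per_replicate_strategy(experiments: List[List[int]], backgrounds: List[List[int]]) -> Set[int]:
    """Method 2: call peaks per replicate, then combine by union (assumed rule)."""
    peak_sets = [call_peaks(e, b) for e, b in zip(experiments, backgrounds)]
    return set().union(*peak_sets)

# Invented read counts: three replicates, four genomic windows each.
experiments = [[10, 2, 9, 1], [12, 1, 1, 2], [11, 2, 1, 1]]
backgrounds = [[2, 2, 2, 2], [3, 1, 2, 2], [2, 2, 2, 2]]

print(sorted(pooled_strategy(experiments, backgrounds)))         # [0]
print(sorted(per_replicate_strategy(experiments, backgrounds)))  # [0, 2]
```

In this toy data, window 2 is enriched in only one replicate. The union of per-replicate results keeps it, while pooling drops it, and neither result indicates which replicate disagrees with the others.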

Only one peak-caller, PePr, can handle replicates. However, its underlying mathematical model does not match the replication assumptions, and the results from tests were not convincing.