2 votes
You've just looked at the k-mer frequency distribution in a set of Illumina sequence data from a bacterium. You notice a peak in this distribution at around 25-fold k-mer coverage when you use k=29. What do you predict will happen to this peak if you use k=31? Explain your answer.

User Pwuertz
by
7.7k points

1 Answer

3 votes

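Answer: If you increase the k-mer size from k=29 to k=31 in Illumina sequence data from a bacterium, the peak should shift slightly to the left, to a little below 25-fold k-mer coverage. A read of length L contains L - k + 1 k-mers, so a larger k yields fewer k-mers per read, while the number of distinct k-mer positions in the genome stays essentially the same. K-mer coverage is therefore related to base coverage C by C_k = C × (L - k + 1) / L, and it falls as k grows. For example, with 100 bp reads the peak would move from 25-fold to roughly 24.3-fold. Two smaller effects act in the same direction. A single sequencing error corrupts every k-mer that overlaps it (up to k of them), so at k=31 slightly more k-mers leave the main peak and land in the low-coverage error tail. Longer k-mers are also more likely to be unique in the genome, so slightly fewer repeat-derived k-mers appear at multiples of the main peak. Because k changes by only 2, the overall shape of the distribution stays nearly the same and the shift is small.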
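To make the size of the shift concrete, here is a minimal Python sketch of the standard relation between k-mer coverage and k. The question does not give the read length, so the values of `read_len` used below are assumptions, and the function names `kmers_per_read` and `shifted_peak` are hypothetical helpers introduced only for illustration. The sketch ignores sequencing errors and repeats, so treat its result as a rough estimate.

```python
def kmers_per_read(read_len: int, k: int) -> int:
    """Number of k-mers contained in one read of length read_len."""
    if k < 1 or k > read_len:
        raise ValueError("k must be between 1 and the read length")
    return read_len - k + 1


def shifted_peak(peak: float, k_old: int, k_new: int, read_len: int) -> float:
    """Estimate where a k-mer coverage peak moves when k changes.

    Uses C_k = C * (L - k + 1) / L, so the base coverage C and the
    factor 1/L cancel when taking the ratio of new to old k-mer coverage.
    """
    return peak * kmers_per_read(read_len, k_new) / kmers_per_read(read_len, k_old)


if __name__ == "__main__":
    # Read lengths are assumed; the original question does not state one.
    for read_len in (100, 150, 250):
        new_peak = shifted_peak(25.0, k_old=29, k_new=31, read_len=read_len)
        print(f"read length {read_len}: peak moves from 25.0 to {new_peak:.2f}")
```

The longer the reads, the smaller the shift, because removing two k-mers per read matters less when each read contributes many k-mers.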


User Damienfrancois
by
7.7k points