188k views
4 votes
A single strand of a DNA molecule is a sequence of nucleotides. There are four possible nucleotides in each position (step), one of which is cytosine (C). In a particular long strand, it has been observed that C appears in 34.1% of the positions. Also, in 36.8% of the cases where C appears in one position along the strand, it also appears in the next position.

(a) What is the probability that a randomly chosen pair of adjacent nucleotides is CC (that is, has cytosine in both locations)?

(b) If a position along the strand is not C, then what is the probability that the next position is C?

(c) If a position n along the strand is C, what is the probability that position n + 2 is also C? How about position n + 4?

(d) Answer parts (a)-(c) if C appeared independently in any one position with probability 0.341.

1 Answer

6 votes

Answer:

The probability of C appearing in two adjacent positions is the product of the probability of C appearing in the first position and the probability of C appearing in the second position given that C appeared in the first position:

P(CC) = P(C in first position) * P(C in second position | C in first position)

Using the given information, we know that P(C in first position) = 0.341 and P(C in second position | C in first position) = 0.368. Therefore:

P(CC) = 0.341 * 0.368 = 0.125488

So the probability of a randomly chosen pair of adjacent nucleotides being CC is approximately 0.125.
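The multiplication rule used here can be checked in a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the original problem.

```python
# Figures given in the problem statement
p_c = 0.341          # P(C) at any position
p_c_given_c = 0.368  # P(C at next position | C at current position)

# Multiplication rule: P(CC) = P(C) * P(C | C)
p_cc_pair = p_c * p_c_given_c
print(round(p_cc_pair, 6))  # 0.125488
```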

The probability that the next position is C given that the current position is not C is not given directly, but it follows from the law of total probability. Since C appears in 34.1% of all positions, the next position is C with overall probability 0.341, and this probability splits according to whether the current position is C:

P(C in next position) = P(C in current position) * P(C in next position | C in current position) + P(not C in current position) * P(C in next position | not C in current position)

The probability of not C in the current position is equal to 1 minus the probability of C in the current position:

P(not C in current position) = 1 - 0.341 = 0.659

Substituting the known values:

0.341 = 0.341 * 0.368 + 0.659 * P(C in next position | not C in current position)

Solving for the unknown:

P(C in next position | not C in current position) = (0.341 - 0.125488) / 0.659 = 0.215512 / 0.659 ≈ 0.327

So the probability that the next position is C given that the current position is not C is approximately 0.327.
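As a cross-check, P(C in next position | not C in current position) can be computed from the two given figures alone by rearranging the law of total probability. The sketch below assumes every position has the same overall probability of C (0.341); the variable names are illustrative.

```python
p_c = 0.341          # P(C) at any position
p_c_given_c = 0.368  # P(C at next position | C at current position)

# Law of total probability:
#   P(C) = P(C) * P(C | C) + P(not C) * P(C | not C)
# Rearranged to isolate P(C | not C):
p_c_given_not_c = (p_c - p_c * p_c_given_c) / (1 - p_c)
print(round(p_c_given_not_c, 3))  # 0.327
```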

If a position n along the strand is C, position n + 2 can be C in two ways: position n + 1 is C, or position n + 1 is not C. The calculation below assumes that each position depends on the rest of the strand only through the position immediately before it (a two-state Markov chain with states C and not C), which the problem suggests but does not state outright. Conditioning on position n + 1:

P(C in n + 2 position | C in n position) = P(C | C) * P(C | C) + P(not C | C) * P(C | not C)

Here P(C | C) denotes the probability that the next position is C given that the current one is C, and similarly for the other terms. The values are:

P(C | C) = 0.368

P(not C | C) = 1 - 0.368 = 0.632

P(C | not C) = (0.341 - 0.341 * 0.368) / 0.659 ≈ 0.327, by the law of total probability

Therefore:

P(C in n + 2 position | C in n position) = 0.368 * 0.368 + 0.632 * 0.327 ≈ 0.135 + 0.207 = 0.342

So the probability that position n + 2 is also C given that position n is C is approximately 0.342.
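A short sketch of the two-step calculation, conditioning on whether position n + 1 is C or not. It assumes the strand behaves as a two-state Markov chain (C / not C), which is an assumption rather than something the problem states; the variable names are illustrative.

```python
p_c = 0.341
p_cc = 0.368                           # P(C | C)
p_nc = (p_c - p_c * p_cc) / (1 - p_c)  # P(C | not C), from total probability

# Two paths from C at n to C at n + 2:
#   C -> C -> C      with probability p_cc * p_cc
#   C -> not C -> C  with probability (1 - p_cc) * p_nc
p_two_steps = p_cc * p_cc + (1 - p_cc) * p_nc
print(round(p_two_steps, 4))  # 0.3421
```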

Similarly, the probability that position n + 4 is also C given that position n is C is found by applying the two-step argument twice, again assuming that each position depends only on the one immediately before it. Positions n + 1, n + 2 and n + 3 may each be C or not C, so every such path must be counted, not only a single one.

First, the two-step probabilities, using P(C | C) = 0.368, P(not C | C) = 0.632, P(C | not C) = (0.341 - 0.341 * 0.368) / 0.659 ≈ 0.327 and P(not C | not C) ≈ 0.673:

P(C two positions later | C) = 0.368 * 0.368 + 0.632 * 0.327 ≈ 0.3421

P(C two positions later | not C) = 0.327 * 0.368 + 0.673 * 0.327 ≈ 0.3404

Then, conditioning on position n + 2:

P(C in n + 4 position | C in n position) = P(C two positions later | C) * P(C two positions later | C) + P(not C two positions later | C) * P(C two positions later | not C)

Therefore:

P(C in n + 4 position | C in n position) = 0.3421 * 0.3421 + 0.6579 * 0.3404 ≈ 0.117 + 0.224 = 0.341

So the probability that position n + 4 is also C given that position n is C is approximately 0.341. This is essentially the overall frequency of C: the influence of position n has almost completely faded after four steps.
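The multi-step probabilities can also be computed by stepping forward one position at a time. The sketch below assumes a two-state Markov chain (C / not C); `prob_c_after` is a hypothetical helper name, not something from the original problem.

```python
def prob_c_after(k, p_c=0.341, p_cc=0.368):
    """Return P(C at position n + k | C at position n) for a two-state chain."""
    p_nc = (p_c - p_c * p_cc) / (1 - p_c)  # P(C | not C), from total probability
    prob = 1.0                             # position n is C with certainty
    for _ in range(k):
        # One step forward: arrive at C from C, or arrive at C from not C
        prob = prob * p_cc + (1 - prob) * p_nc
    return prob

for k in (1, 2, 4):
    print(k, round(prob_c_after(k), 4))
```

For k = 1, 2 and 4 this gives roughly 0.368, 0.342 and 0.341, so the conditional probability settles toward the overall frequency of C within a few positions.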

If C appeared independently in any one position with probability 0.341, then the probabilities calculated in parts (a)-(c) would be different. In particular:

(a) The probability of C in two adjacent positions would be the product of the probability of C in each position, since the positions are independent:

P(CC) = 0.341 * 0.341 = 0.116281

So the probability of a randomly chosen pair of adjacent nucleotides being CC is approximately 0.116.

(b) The probability that the next position is C given that the current position is not C would be the same as the overall probability of C, since the positions are independent:

P(C in next position | not C in current position) = P(C) = 0.341

(c) The probability that position n + 2 is also C given that position n is C would be the same as the overall probability of C, since the positions are independent:

P(C in n + 2 position | C in n position) = P(C) = 0.341

Similarly, the probability that position n + 4 is also C given that position n is C would also be the same as the overall probability of C:

P(C in n + 4 position | C in n position) = P(C) = 0.341
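The independent case can be sanity-checked with a small simulation. This is a rough sketch with an arbitrary strand length and a fixed random seed, both chosen here for illustration; the estimates will be close to, but not exactly, 0.116 and 0.341.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
p_c = 0.341
n = 200_000             # illustrative strand length

# True means "C", False means "not C"; each position is drawn independently
strand = [rng.random() < p_c for _ in range(n)]

# (a) fraction of adjacent pairs that are CC
pairs = list(zip(strand, strand[1:]))
est_cc = sum(a and b for a, b in pairs) / len(pairs)

# (b) fraction of positions following a non-C that are C
after_not_c = [b for a, b in pairs if not a]
est_c_given_not_c = sum(after_not_c) / len(after_not_c)

# (c) fraction of positions two steps after a C that are C
two_ahead = [b for a, b in zip(strand, strand[2:]) if a]
est_c_two_ahead = sum(two_ahead) / len(two_ahead)

print(est_cc, est_c_given_not_c, est_c_two_ahead)
```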

by User Truncheon (8.3k points)