210k views
5 votes
This might be naive since I am very new to the field, but I wonder about the difference between sequence identity and coverage in multiple sequence alignment of proteins. I imagine the calculation would be simple, but I couldn't find strict definitions of these two terms and how they are calculated. Can somebody possibly provide an example showing how these two terms are calculated (preferably sequences with gaps and the terms not being 100%)?

User Kartika
by
7.9k points

1 Answer

5 votes

Final answer:

Sequence identity and coverage are two measures commonly reported for protein sequence alignments, including each pair of sequences within a multiple sequence alignment. Sequence identity is the percentage of compared positions at which two aligned sequences have the same amino acid, while coverage is the percentage of the query sequence that is included in the alignment with the subject sequence.

Step-by-step explanation:

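Sequence identity is calculated by dividing the number of identical aligned positions by the number of positions compared, and then multiplying by 100. For example, if 20 of 30 aligned positions hold the same amino acid, the sequence identity is 20/30 * 100 = 66.7%. The choice of denominator matters once the alignment contains gaps: some tools count every alignment column, including columns with a gap (BLAST reports identity this way), while others count only columns where both sequences have a residue, or divide by the length of the shorter sequence. These conventions give different values for the same alignment, so check which one your tool uses.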
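To make the effect of gaps concrete, here is a minimal Python sketch that computes identity for one pair of aligned sequences under two of the common denominators. The function name `percent_identity` and the two toy sequences are made up for illustration and do not come from any particular tool.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> dict:
    """Percent identity of two already-aligned sequences (illustrative helper).

    Returns the identity under two denominators:
      - "alignment_length": every column, gap columns included
      - "ungapped_columns": only columns where both sequences have a residue
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    columns = list(zip(aligned_a, aligned_b))
    # A match is a column with the same residue in both sequences (never a gap).
    matches = sum(1 for a, b in columns if a == b and a != gap)
    # Columns where both rows are gaps can occur in rows taken from an MSA; skip them.
    alignment_length = sum(1 for a, b in columns if not (a == gap and b == gap))
    ungapped_columns = sum(1 for a, b in columns if a != gap and b != gap)

    return {
        "matches": matches,
        "alignment_length": 100 * matches / alignment_length,
        "ungapped_columns": 100 * matches / ungapped_columns,
    }


# Toy alignment: 10 columns, 2 of them contain a gap, 1 is a mismatch (K vs R).
query   = "MKT-AYIAKQ"
subject = "MKTLAY-ARQ"
result = percent_identity(query, subject)
print(result["matches"])            # 7 identical positions
print(result["alignment_length"])   # 7 / 10 columns = 70.0
print(result["ungapped_columns"])   # 7 / 8 ungapped columns = 87.5
```

The same alignment gives 70% or 87.5% depending on the convention, which is why identity values from different programs are not always directly comparable.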

Coverage, on the other hand, is calculated by dividing the number of query residues that fall within the aligned region by the total length of the query sequence, and then multiplying by 100. Gap characters inserted into the query row are not residues, so they are typically not counted. For example, if the aligned region spans 50 residues of a 100-residue query, the coverage is 50/100 * 100 = 50%.
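A short Python sketch of the coverage calculation, assuming a single local alignment and the convention that gaps in the query row are not counted. The function name `query_coverage`, the toy query row, and the 12-residue query length are illustrative assumptions.

```python
def query_coverage(aligned_query: str, query_length: int, gap: str = "-") -> float:
    """Percent of the full query covered by one aligned region (illustrative helper).

    aligned_query: the query row of the alignment, possibly containing gaps
    query_length:  length of the complete, unaligned query sequence
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    # Count only real residues; gap characters do not cover any part of the query.
    covered = sum(1 for ch in aligned_query if ch != gap)
    if covered > query_length:
        raise ValueError("aligned region has more residues than the query")
    return 100 * covered / query_length


# The aligned query row has 10 columns but only 9 residues (one gap).
# Assume the full query is 12 residues long, so 3 residues lie outside the alignment.
print(query_coverage("MKT-AYIAKQ", 12))   # 9 / 12 = 75.0
```

Here coverage is below 100% because part of the query was left out of the local alignment, while identity is computed only over the part that was aligned, so the two numbers answer different questions.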

User Crypth
by
8.6k points