This question focuses on the scoring matrices used in protein sequence alignment. These matrices are essential because they assign a score to every possible substitution of one amino acid for another. Each score is a log-odds value: it compares how often the two residues are observed aligned in related proteins with how often they would be paired by chance. PAM and BLOSUM are the two most widely used families of substitution matrices, but they are derived using different methodologies.
Understanding the Question
The question asks for the fundamental distinction in the way PAM and BLOSUM matrices are constructed.
Key Concepts and Approach
The core concepts are the derivation methods for PAM and BLOSUM matrices. The approach is to compare the source of alignment data used to calculate the substitution frequencies for each matrix type.
Detailed Solution
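PAM (Point Accepted Mutation) Matrices: These matrices are based on an explicit evolutionary model. They were constructed by counting amino acid substitutions in global alignments of very closely related proteins (at least 85% identical). Those counts define the PAM1 matrix, which corresponds to an average of one accepted point mutation per 100 residues. Matrices for more distant relationships (e.g., PAM250) are not observed directly; they are extrapolated by raising the PAM1 probability matrix to the corresponding power.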
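The extrapolation step can be illustrated with a minimal sketch. It assumes a made-up 3-letter alphabet and a row-stochastic convention (each row sums to 1); the names TOY_PAM1, mat_mul, and mat_pow are hypothetical, and the numbers are not Dayhoff's published values.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by squaring."""
    n = len(m)
    # Start from the identity matrix.
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in m]
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Hypothetical PAM1-like matrix over a toy 3-letter alphabet.
# Row i holds the probabilities that residue i is kept or replaced
# after one evolutionary step. Values are invented for illustration.
TOY_PAM1 = [
    [0.990, 0.007, 0.003],
    [0.006, 0.990, 0.004],
    [0.002, 0.008, 0.990],
]

# Extrapolate to a distance of 250 steps, analogous to PAM250.
toy_pam250 = mat_pow(TOY_PAM1, 250)

for row in toy_pam250:
    print([round(x, 3) for x in row])
```

In this toy case the diagonal entries shrink as the power grows, which mirrors how a high-numbered PAM matrix models sequences that have accumulated many substitutions. A real PAM scoring matrix is then obtained by converting these probabilities to log-odds scores against background residue frequencies, a step not shown here.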
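BLOSUM (BLOcks SUbstitution Matrix) Matrices: These matrices are derived empirically by counting substitutions in ungapped, conserved regions (local alignments or "blocks") of more distantly related proteins. The number in the name is a clustering threshold: for BLOSUM62, sequences within a block that are 62% or more identical are grouped and counted as a single sequence, so the substitution counts effectively come from comparisons between sequences that are less than 62% identical. Lower-numbered matrices (e.g., BLOSUM45) are therefore suited to more divergent sequences, and higher-numbered ones (e.g., BLOSUM80) to more closely related sequences.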
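The counting and scoring steps can be sketched as follows. This is a simplified illustration, assuming a single invented block (TOY_BLOCK) and omitting the identity-based clustering and weighting step; the function names are hypothetical. Scores are rounded log-odds values in half-bit units, the scaling used for the published BLOSUM matrices.

```python
import math
from collections import Counter
from itertools import combinations


def pair_counts(block):
    """Count unordered residue pairs in every column of an ungapped block."""
    counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds_scores(counts):
    """Convert pair counts into rounded half-bit log-odds scores."""
    total = sum(counts.values())
    q = {pair: c / total for pair, c in counts.items()}
    # Background frequency of each residue implied by the observed pairs.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2
    scores = {}
    for (a, b), freq in q.items():
        # Expected pair frequency if residues were paired by chance.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


# Hypothetical ungapped block: four aligned segments of equal length.
TOY_BLOCK = ["AAL", "AAL", "AVL", "SVI"]

toy_counts = pair_counts(TOY_BLOCK)
toy_scores = log_odds_scores(toy_counts)

for pair, score in sorted(toy_scores.items()):
    print(pair, score)
```

A block this small gives scores that are not biologically meaningful; the point is only that every score comes from directly observed pair frequencies, with no extrapolation to other evolutionary distances.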
The Core Difference: The foundational difference is the alignment data used. PAM relies on an evolutionary model built from global alignments of highly similar sequences, while BLOSUM is based on direct observation of substitutions within conserved local alignment blocks from a broader range of sequences.