Appendix B: Structural Similarity per Residue
Here we define another metric, called $Q_{res}$, that is derived from Q and measures the structural conservation of the environment of each
residue in the alignment. $Q_{res}$ is a measure of the similarity of the
Cα-Cα distances between a particular residue and all other aligned residues, excluding nearest neighbors, in a set of aligned proteins.
The result is a value between 0 and 1 that describes the similarity of the
structural environment of a residue in a
particular protein to the environment of that same residue in all other proteins in the set. Lower scores represent low similarity and higher scores high
similarity.
If the set of proteins represents an evolutionarily balanced set, then structural similarity corresponds to structural conservation. Formally,
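$Q_{res}$ is defined as follows:

$$Q_{res}^{(i,n)} = \aleph \sum_{m \neq n}^{\text{proteins}} \; \sum_{j \neq i-1,\, i,\, i+1}^{\text{residues}} \exp\left[ -\frac{\left( r_{ij}^{(n)} - r_{i'j'}^{(m)} \right)^{2}}{2\sigma_{ij}^{2}} \right] \tag{1}$$

where $Q_{res}^{(i,n)}$ is the structural similarity of the $i$th residue in the $n$th protein, $r_{ij}^{(n)}$ is the Cα-Cα
distance between residues $i$ and $j$ in protein $n$, and $r_{i'j'}^{(m)}$ is the Cα-Cα distance between the residues in
protein $m$ that correspond to residues $i$ and $j$ in protein $n$. The variance $\sigma_{ij}^{2}$ is related to the sequence separation between residues $i$ and $j$,

$$\sigma_{ij}^{2} = |i - j|^{0.15} \tag{2}$$

and the normalization $\aleph$ is given by

$$\aleph = \frac{1}{(N_{seq} - 1)(N_{res} - k)} \tag{3}$$

where $N_{seq}$ is the number of proteins in the set, $N_{res}$ is the number of residues in protein $n$, and $k$ is 2 when residue $i$ is the N- or
C-terminus and 3 otherwise.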
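The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the MultiSeq implementation: the function name `q_res`, the `coords` and `columns` data layout, and the handling of gaps (a pair that falls on a gap in the other protein simply contributes nothing to the sum) are assumptions made for the example. The variance is assumed to be the sequence separation raised to the power 0.15, and residue indices are 0-based.

```python
import math

def q_res(coords, columns, n, i, exponent=0.15):
    """Sketch of the per-residue similarity for residue i of protein n.

    coords[m]  : list of (x, y, z) C-alpha coordinates for protein m (hypothetical layout)
    columns[m] : for each alignment column, the residue index of protein m
                 in that column, or None for a gap (hypothetical layout)
    Assumes at least two proteins and more than three residues in protein n.
    """
    n_seq = len(coords)
    n_res = len(coords[n])
    # Map each residue of protein n to its alignment column.
    col_of = {res: c for c, res in enumerate(columns[n]) if res is not None}
    # k is 2 for a terminal residue (one neighbor), otherwise 3 (two neighbors).
    k = 2 if i in (0, n_res - 1) else 3
    total = 0.0
    for m in range(n_seq):
        if m == n:
            continue
        i_m = columns[m][col_of[i]]
        if i_m is None:
            continue  # residue i has no counterpart in protein m
        for j in range(n_res):
            if abs(i - j) <= 1:
                continue  # skip the residue itself and its nearest neighbors
            j_m = columns[m][col_of[j]]
            if j_m is None:
                continue  # residue j has no counterpart in protein m
            r_n = math.dist(coords[n][i], coords[n][j])
            r_m = math.dist(coords[m][i_m], coords[m][j_m])
            sigma2 = abs(i - j) ** exponent
            total += math.exp(-((r_n - r_m) ** 2) / (2.0 * sigma2))
    return total / ((n_seq - 1) * (n_res - k))
```

With identical structures and a gap-free alignment, every exponential term equals 1 and the sum contains exactly as many terms as the normalization counts, so the score is 1 for every residue.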
To determine which residues correspond to each other across the set of proteins, $Q_{res}$ requires a multiple sequence alignment (MSA) of the
protein sequences. Typically, the MSA is generated by a structural alignment program.
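As an illustration of how an MSA supplies the residue correspondence, the sketch below converts gapped sequence strings into per-column residue indices. The function name `alignment_columns`, the use of `-` as the gap character, and the 0-based indexing are assumptions made for this example; they do not describe the file formats or internal structures of any particular alignment program.

```python
def alignment_columns(aligned_seqs, gap="-"):
    """For each aligned sequence, list the residue index found in every
    alignment column, with None marking a gap."""
    columns = []
    for seq in aligned_seqs:
        row = []
        next_index = 0  # index of the next ungapped residue in this sequence
        for ch in seq:
            if ch == gap:
                row.append(None)
            else:
                row.append(next_index)
                next_index += 1
        columns.append(row)
    return columns

# Residues that share a column correspond to each other.
cols = alignment_columns(["AC-D", "A-GD"])
print(cols)
```

Two residues from different proteins are treated as corresponding when they occupy the same column; a residue aligned to a gap has no counterpart in that protein.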
multiseq@scs.uiuc.edu