In real data, frequencies usually vary among the different kinds of amino acids. In this case, the correction based on the equal input model gives a better estimate of the number of amino acid substitutions than the Poisson correction distance. Note that this model assumes equal substitution rates among sites and homogeneous substitution patterns between lineages.
MEGA provides facilities to compute the following quantities:
Quantity | Description
--- | ---
d: distance | Number of amino acid substitutions per site.
L: No of valid common sites | Number of sites compared.
The formulas used are:
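Distance
$$d = -b \ln\left(1 - \frac{p}{b}\right)$$
where p is the proportion of different amino acid sites, g_i is the frequency of amino acid i, and
$$b = 1 - \sum_{i=1}^{20} g_i^2$$
Variance
$$V(d) = \frac{p(1 - p)}{\left(1 - \frac{p}{b}\right)^2 L}$$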
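
As a rough illustration of how these quantities fit together, the sketch below computes d and its variance in Python. It is not MEGA's implementation. It assumes the standard equal input form d = -b ln(1 - p/b) with b = 1 - Σ g_i², and the variance p(1 - p) / [(1 - p/b)² L]. The function names, the set of characters treated as missing data, and the choice to estimate g_i by pooling the sequences supplied are all assumptions made for this example.

```python
import math
from collections import Counter

# Assumption: these symbols mark gaps or missing data and are skipped.
GAP_CHARS = set("-?X")


def p_distance(seq_a, seq_b):
    """Return (p, L) for two aligned sequences of equal length.

    L counts sites where both sequences have a valid residue;
    p is the proportion of those sites that differ.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b)
             if a not in GAP_CHARS and b not in GAP_CHARS]
    n_sites = len(pairs)
    if n_sites == 0:
        raise ValueError("no valid common sites")
    n_diff = sum(a != b for a, b in pairs)
    return n_diff / n_sites, n_sites


def pooled_frequencies(seqs):
    """Estimate amino acid frequencies g_i by pooling all valid residues."""
    counts = Counter(c for s in seqs for c in s if c not in GAP_CHARS)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def equal_input_distance(p, freqs, n_sites):
    """Return (d, variance) under the assumed equal input formulas."""
    b = 1.0 - sum(g * g for g in freqs.values())
    ratio = 1.0 - p / b
    if ratio <= 0.0:
        # The logarithm is undefined when p >= b.
        raise ValueError("p >= b: distance is not defined")
    d = -b * math.log(ratio)
    var = p * (1.0 - p) / (ratio ** 2 * n_sites)
    return d, var
```

With equal frequencies of 1/20 for every amino acid, b is 0.95, which is the value at which this correction is closest to the Poisson correction; unequal frequencies give a smaller b and therefore a larger correction for the same p.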