In real data, nucleotide frequencies often deviate substantially from 0.25. In this case the Tajima-Nei distance (Tajima and Nei 1984) gives a better estimate of the number of nucleotide substitutions than the Jukes-Cantor distance. Note that this distance assumes equal substitution rates among sites and between transitional and transversional substitutions. When the nucleotide frequencies differ between the sequences, the modified formula (Tamura and Kumar 2002) relaxes the assumption of substitution pattern homogeneity.
The Felsenstein-Tajima-Nei model
MEGA provides facilities for computing the following quantities for this method:
d: Transitions + Transversions: Number of nucleotide substitutions per site.
L: No of valid common sites: Number of sites compared.
Formulas for computing these quantities are as follows:
Distance
where p is the proportion of sites with different nucleotides and
where xij is the relative frequency of the nucleotide pair i and j, and the gi are the nucleotide frequencies.
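For reference, the original Tajima-Nei (1984) estimator, on which this method is based, is usually written in the form below. It is given here only as a sketch of the homogeneous-pattern case, with the gi taken as frequencies averaged over the two sequences; the modification of Tamura and Kumar (2002) for sequences with different nucleotide frequencies is not reproduced.

```latex
\[
d = -b \,\ln\!\left(1 - \frac{p}{b}\right),
\qquad
b = \frac{1}{2}\left[\,1 - \sum_{i=1}^{4} g_i^{2} + \frac{p^{2}}{c}\right],
\qquad
c = \sum_{i=1}^{3}\sum_{j=i+1}^{4} \frac{x_{ij}^{2}}{2\, g_i\, g_j}
\]
```

When all gi equal 0.25 and the six xij are equal, b reduces to 3/4 and the expression becomes the Jukes-Cantor distance.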
Variance can be estimated by the bootstrap method.
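The following Python sketch illustrates how d, L, and a bootstrap variance might be computed for a pair of aligned sequences. It is not MEGA's implementation. It assumes the original Tajima-Nei (1984) formula with frequencies averaged over both sequences, not the Tamura and Kumar (2002) modification, and the function names `tajima_nei` and `bootstrap_variance` are hypothetical.

```python
import math
import random
from itertools import combinations

BASES = "ACGT"


def tajima_nei(seq1, seq2):
    """Return (d, L) using the original Tajima-Nei (1984) formula (assumed form)."""
    # Keep only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in BASES and b in BASES]
    L = len(pairs)
    if L == 0:
        raise ValueError("no valid common sites")
    # p: proportion of sites with different nucleotides.
    p = sum(a != b for a, b in pairs) / L
    if p == 0:
        return 0.0, L
    # g[n]: nucleotide frequencies averaged over the two sequences.
    g = {n: sum((a == n) + (b == n) for a, b in pairs) / (2 * L)
         for n in BASES}
    # c: sum over unordered pairs i < j of x_ij^2 / (2 g_i g_j).
    c = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = sum({a, b} == {i, j} for a, b in pairs) / L
        if x_ij > 0:
            c += x_ij ** 2 / (2 * g[i] * g[j])
    b = 0.5 * (1 - sum(f * f for f in g.values()) + p * p / c)
    if 1 - p / b <= 0:
        raise ValueError("distance is undefined because p/b >= 1")
    return -b * math.log(1 - p / b), L


def bootstrap_variance(seq1, seq2, replicates=500, seed=0):
    """Estimate Var(d) by resampling alignment columns with replacement."""
    rng = random.Random(seed)
    sites = list(zip(seq1, seq2))
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(sites, k=len(sites))
        s1 = "".join(col[0] for col in sample)
        s2 = "".join(col[1] for col in sample)
        try:
            estimates.append(tajima_nei(s1, s2)[0])
        except ValueError:
            continue  # skip replicates where the distance is undefined
    if len(estimates) < 2:
        raise ValueError("too few usable bootstrap replicates")
    mean = sum(estimates) / len(estimates)
    return sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
```

Calling `tajima_nei(seq1, seq2)` returns the distance together with the number of sites compared, and `bootstrap_variance(seq1, seq2)` returns the sample variance of d over the usable replicates. Replicates in which the distance is undefined are skipped, which is a simplifying choice made for this sketch.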