Jukes-Cantor Gamma distance

 

In the Jukes and Cantor (1969) model, the rate of nucleotide substitution is the same for all pairs of the four nucleotides A, T, C, and G.  The multiple-hit correction equation for this model, which is given below, produces a maximum likelihood estimate of the number of nucleotide substitutions between two sequences, while relaxing the assumption that all sites evolve at the same rate.  However, it assumes equal nucleotide frequencies and does not correct for the higher rate of transitional substitutions as compared to transversional substitutions.  Because the rate variation among sites is modeled using the Gamma distribution, you will need to provide a gamma parameter (a) for computing this distance.

The Jukes-Cantor model

 

MEGA provides facilities for computing the following quantities:

 

d: Transitions + Transversions  : Number of nucleotide substitutions per site.

L: No. of valid common sites: Number of sites compared.
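
As a rough illustration of how L and the proportion of differing sites might be obtained from two aligned sequences, the Python sketch below counts only sites where both sequences carry an unambiguous nucleotide. The function name count_sites and the rule for skipping gaps and ambiguity codes are assumptions made for this example; MEGA's own treatment of gaps and missing data depends on the options selected and may differ.

```python
VALID = set("ACGT")  # assumption: only unambiguous nucleotides count as valid


def count_sites(seq1, seq2):
    """Return (L, n_diff) for two aligned sequences.

    L is the number of sites where both sequences carry A, C, G, or T;
    n_diff is the number of those sites at which the nucleotides differ.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    L = 0
    n_diff = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x in VALID and y in VALID:
            L += 1
            if x != y:
                n_diff += 1
    return L, n_diff


# Hypothetical toy alignment: the gap and the N are skipped.
L, n_diff = count_sites("ACGTACGT-A", "ACGTTCGAAN")
p = n_diff / L  # proportion of sites with different nucleotides
print(L, n_diff, p)
```

Here p is the proportion of differing sites among the L valid common sites, which is the quantity the distance correction takes as input.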

 

The formulas for computing these quantities are as follows:

Distance

$$ d = \frac{3}{4}\, a \left[ \left( 1 - \frac{4}{3}\, p \right)^{-1/a} - 1 \right] $$

where p is the proportion of sites with different nucleotides and a is the gamma parameter.

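Variance

$$ V(d) = \frac{p\,(1 - p)}{L} \left( 1 - \frac{4}{3}\, p \right)^{-2\left(1/a + 1\right)} $$

where L is the number of sites compared.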
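
The following Python sketch shows one way to evaluate the distance and its variance by hand. It assumes the commonly cited forms d = (3/4) a [(1 - 4p/3)^(-1/a) - 1] and V(d) = [p(1 - p)/L] (1 - 4p/3)^(-2(1/a + 1)), the latter obtained by the delta method from the binomial variance of p. The function names are hypothetical and this is not MEGA's implementation.

```python
import math


def jc_gamma_distance(p, a):
    """Jukes-Cantor gamma distance (assumed form, see lead-in).

    p: proportion of sites with different nucleotides (0 <= p < 0.75)
    a: gamma shape parameter (a > 0)
    """
    if a <= 0:
        raise ValueError("gamma parameter a must be positive")
    base = 1.0 - 4.0 * p / 3.0
    if base <= 0:
        raise ValueError("distance is undefined for p >= 0.75")
    return 0.75 * a * (base ** (-1.0 / a) - 1.0)


def jc_gamma_variance(p, a, L):
    """Large-sample variance of the distance, for L compared sites."""
    if a <= 0 or L <= 0:
        raise ValueError("a and L must be positive")
    base = 1.0 - 4.0 * p / 3.0
    if base <= 0:
        raise ValueError("variance is undefined for p >= 0.75")
    return p * (1.0 - p) / L * base ** (-2.0 * (1.0 / a + 1.0))


# Illustrative values only: p = 0.25, a = 1, L = 100.
d = jc_gamma_distance(0.25, 1.0)
se = math.sqrt(jc_gamma_variance(0.25, 1.0, 100))
print(d, se)
```

As a sanity check on the assumed form, letting a grow very large makes the result approach the ordinary Jukes-Cantor distance, -(3/4) ln(1 - 4p/3), which is the expected behavior when rates become uniform across sites.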

See also Nei and Kumar (2000), page 36, and the topic on estimating the gamma parameter.