A new method for characterizing replacement rate variation in molecular sequences. Application of the Fourier and wavelet models to Drosophila and mammalian proteins.
Animal, Confidence-Intervals, Drosophila, Fourier-Analysis, Genes-Immunoglobulin, Genetic-Vectors, Human, Likelihood-Functions, Mammals, Models-Genetic, SUPPORT-U-S-GOVT-NON-P-H-S, SUPPORT-U-S-GOVT-P-H-S, Variation-(Genetics)
Genetics 2000 Jan; 154(1):381-95.
We propose models for describing replacement rate variation in genes and proteins, in which the profile of relative replacement rates along the length of a given sequence is defined as a function of the site number. We consider here two types of functions, one derived from the cosine Fourier series, and the other from discrete wavelet transforms. The number of parameters used for characterizing the substitution rates along the sequences can be flexibly changed and in their most parameter-rich versions, both Fourier and wavelet models become equivalent to the unrestricted-rates model, in which each site of a sequence alignment evolves at a unique rate. When applied to a few real data sets, the new models appeared to fit data better than the discrete gamma model when compared with the Akaike information criterion and the likelihood-ratio test, although the parametric bootstrap version of the Cox test performed for one of the data sets indicated that the difference in likelihoods between the two models is not significant. The new models are applicable to testing biological hypotheses such as the statistical identity of rate variation profiles among homologous protein families. These models are also useful for determining regions in genes and proteins that evolve significantly faster or slower than the sequence average. We illustrate the application of the new method by analyzing human immunoglobulin and Drosophilid alcohol dehydrogenase sequences.
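As a rough illustration of the idea in the abstract, a relative-rate profile can be written as a function of site number built from a truncated cosine series. The sketch below is not the authors' parameterization: the function name `cosine_rate_profile`, the exponential link that keeps rates positive, the midpoint mapping of sites onto [0, 1], and the rescaling to a mean rate of 1 are all assumptions made for illustration.

```python
import math


def cosine_rate_profile(coeffs, n_sites):
    """Return relative rates for sites 1..n_sites from a truncated cosine series.

    coeffs[k] is the (hypothetical) coefficient of cos(pi * k * x), where x is
    the position of a site rescaled to the interval [0, 1].
    """
    raw = []
    for i in range(n_sites):
        # Assumption: each site is mapped to the midpoint of its cell on [0, 1].
        x = (i + 0.5) / n_sites
        series = sum(a * math.cos(math.pi * k * x) for k, a in enumerate(coeffs))
        # Assumption: an exponential link keeps every rate strictly positive.
        raw.append(math.exp(series))
    # Rescale so the rates are relative to the sequence average (mean of 1).
    mean = sum(raw) / n_sites
    return [r / mean for r in raw]


# Two coefficients give a smooth profile that is fast at the start of the
# sequence and slow at the end; adding coefficients allows finer variation.
rates = cosine_rate_profile([0.0, 0.8], 10)
```

In this sketch the number of coefficients plays the role of the adjustable parameter count. With as many coefficients as sites, the cosine basis can reproduce any positive profile on the sampled positions, which mirrors the abstract's statement that the most parameter-rich version is equivalent to the unrestricted-rates model.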
Morozov, P; Sitnikova, T; Churchill, G; Ayala, F J.; and Rzhetsky, A, "A new method for characterizing replacement rate variation in molecular sequences. Application of the Fourier and wavelet models to Drosophila and mammalian proteins." (2000). Faculty Research 2000 - 2009. 21.