Coalescent processes provide probabilistic frameworks for identifying and quantifying the impact of evolutionary forces using genetic variation in samples of DNA sequences. In my talk I will present two coalescent-based inference methods for population genetic data at a neutral non-recombining locus. The first method is joint work with Prof. John Wakeley. In this project we describe a forward-time haploid reproduction model with a constant population size that includes life history characteristics common to many marine organisms. We develop coalescent approximations for sample gene genealogies under this model and use these to predict patterns of genetic variation. Depending on the behavior of the underlying parameters of the model, the approximations are either coalescent processes with simultaneous multiple mergers or Kingman’s coalescent. Using simulations, we apply our model to data from the Pacific oyster and show that it predicts the observed data very well. We also show that a fact which holds for Kingman’s coalescent and for general coalescent trees (that the most frequent allele at a biallelic locus is likely to be the ancestral allele) is not true for our model. Our work suggests that the power to detect a “sweepstakes effect” in a sample of DNA sequences from marine organisms depends on the sample size.

In the second project I developed a computationally tractable inference method for full polymorphism data in samples of DNA sequences at a neutral non-recombining locus, where the underlying probabilistic framework is the general coalescent tree framework with the infinite-sites model. The general coalescent tree framework is a family of models for determining ancestries among random samples of DNA sequences at a non-recombining locus. The ancestral models included in this framework can be derived under various evolutionary scenarios. First, an exact sampling scheme is developed to determine the topologies of conditional ancestral trees. However, this scheme has some computational limitations, and to overcome them a second scheme based on importance sampling is provided. Next, these schemes are combined with Monte Carlo integration to estimate the likelihood of full polymorphism data, the ages of mutations in the sample, and the time of the most recent common ancestor. I applied this method to estimate the likelihood of neutral polymorphisms in a sample of DNA sequences completely linked to a mutant allele, which is either neutral or under selection.

Host: Thomas Leitner