The role of sexual recombination and off-season survival in temporal maintenance of Puccinia


MATERIALS AND METHODS

Selection of isolates. A set of 409 isolates was selected to represent 11 geographical regions on six continents (Africa, Asia, Australia, Europe, North America and South America) from a collection of more than 4,000 isolates available at the Institut National de la Recherche Agronomique (INRA), France, and Aarhus University, Denmark. The selection was made to maximize the representation of each population, as partially assessed previously by AFLP, microsatellites and virulence profiles (Ali et al., 2010; Bahri et al., 2009; Bahri et al., 2011; de Vallavieille-Pope et al., 2012; Enjalbert et al., 2005; Hovmøller et al., 2008; Mboup et al., 2009), such that isolates from different genotypic groups were present in any given geographical region. Isolates representative of aggressive strains were selected from the two recently emerged aggressive strains, PstS1 (associated with the post-2000 epidemics in the USA and Australia) and the European strain PstS2, as well as from a set of aggressive isolates frequently reported in Europe, PstS3, which were less aggressive than PstS1 and PstS2 (Milus et al., 2009). Details regarding the number of isolates are shown in Table 1.
Molecular genotyping. For most isolates, DNA was already available, having been previously extracted through modified CTAB protocols (Enjalbert et al., 2002; Justesen et al., 2002). For isolates received from Pakistan and Nepal in 2008 and China in 2005, DNA was extracted from 5 mg of spores following Ali et al. (2011; ANNEX II of thesis). All isolates were multiplied from single pustule lesions to avoid a mixture of genotypes. Molecular genotyping was carried out using a set of 20 microsatellite loci in three multiplex reactions, with subsequent separation of the PCR products using a Beckman Coulter CEQ-8000 DNA Analyzer. Electropherograms were processed using the CEQ-8000 Genetic Analysis System Software (Beckman Coulter) (ANNEX II of thesis; Ali et al., 2011).
Analyses of population subdivision. The level of population subdivision among different geographical regions was assessed using both model-based Bayesian and non-parametric, multivariate clustering approaches. We used the model-based Bayesian method implemented in STRUCTURE 2.2 (Pritchard et al., 2000). The rationale of this method is to assign multilocus genotypes to different clusters while minimizing the Hardy-Weinberg disequilibrium and the gametic phase disequilibrium between loci within clusters (where the number of clusters may be unknown). The Markov chain Monte Carlo (MCMC) sampling scheme was run for 200,000 iterations with a 100,000 burn-in period, with K ranging from 1 to 10 and 20 independent replications for each K. The STRUCTURE outputs were processed with CLUMPP (Jakobsson and Rosenberg, 2007); a G’-statistic greater than 80% was used to assign groups of runs to a common clustering pattern. Because STRUCTURE can overestimate the number of clusters when there is relatedness among some genotypes (e.g., due to asexual reproduction; Gao et al., 2007), we also analyzed the level of population subdivision using a non-parametric approach that does not rely on a particular population model. We used discriminant analysis of principal components (DAPC), implemented in the ADEGENET package in the R environment (Jombart et al., 2010). The number of clusters was identified based on the Bayesian Information Criterion (BIC), as suggested by Jombart et al. (2010). The relatedness among populations was plotted using a neighbor-joining population tree based on the genetic distance DA (Nei et al., 1983), as implemented in the POPULATIONS program (Langella, 2008). Significance was assessed using 1,000 bootstraps. The level of population differentiation was assessed using pairwise FST statistics between populations, computed with GENETIX 4.05.2 (Belkhir et al., 2004).
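As an illustration of the DA distance used for the population tree, the following minimal sketch computes DA = 1 - (1/r) Σ_loci Σ_alleles sqrt(x · y) from per-locus allele frequencies, where r is the number of loci. It is not the implementation used by the tree-building software; the function name `nei_da` and the toy allele frequencies are hypothetical.

```python
from math import sqrt


def nei_da(freqs_x, freqs_y):
    """DA distance (Nei et al., 1983) between two populations.

    freqs_x, freqs_y: one dict per locus mapping allele -> frequency,
    with loci listed in the same order for both populations.
    """
    if len(freqs_x) != len(freqs_y):
        raise ValueError("both populations need the same loci")
    total = 0.0
    for fx, fy in zip(freqs_x, freqs_y):
        # Only alleles present in both populations contribute to the sum.
        shared = set(fx) & set(fy)
        total += sum(sqrt(fx[a] * fy[a]) for a in shared)
    return 1.0 - total / len(freqs_x)


# Hypothetical allele frequencies at two loci (toy data, not from the study).
pop_a = [{"150": 0.5, "152": 0.5}, {"200": 1.0}]
pop_b = [{"150": 0.5, "152": 0.5}, {"202": 1.0}]

print(nei_da(pop_a, pop_a))  # identical populations
print(nei_da(pop_a, pop_b))  # one locus shared, one locus fixed for different alleles
```

In this toy case the two populations are identical at the first locus and share no allele at the second, so the distance is half of its maximum value of 1.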
Analyses for genetic variability and recombination. The quality of the set of markers for inferring population structure was tested by assessing the ability of the microsatellite loci to detect multilocus genotypes (MLGs) under panmixia, using GENCLONE (Arnaud-Haond and Belkhir, 2007). Redundancy among loci was tested by estimating the linkage disequilibrium between loci, with 1,000 random permutations generated in GENETIX 4.05.2 (Belkhir et al., 2004). Within-population variability was assessed using allele richness and gene diversity, calculated with FSTAT 2.9.3 (Goudet, 2001). Private allelic richness was estimated using a rarefaction approach, implemented in ADZE (Szpiech et al., 2008). Observed (Ho) and unbiased expected heterozygosity (He) were computed using GENETIX 4.05.2 (Belkhir et al., 2004). The null hypothesis of Hardy-Weinberg equilibrium within each population was tested using the exact test implemented in GENEPOP 4.0 (Raymond and Rousset, 1995). Calculations were performed both on the whole dataset and on the clone-corrected data (i.e., a dataset in which only one representative of each repeated MLG was kept). Where the two datasets yielded different results, only the clone-corrected data are reported, because sampling during epidemics over-represents certain clones as a consequence of epidemic clonal structure (Maynard-Smith et al., 1993).
Ancestral relationship and migration patterns among populations. Competing scenarios for the ancestral relationships among populations were tested using Approximate Bayesian Computation (ABC) analyses implemented in DIYABC (Cornuet et al., 2008; Cornuet et al., 2010). The method has been reported to be appropriate for complex population genetic models (Cornille et al., 2012; Dilmaghani et al., 2012): rather than computing exact likelihoods, it estimates the posterior probabilities of the scenarios and the posterior distributions of demographic parameters by comparing observed and simulated datasets.
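The clone correction and the heterozygosity estimates described above can be sketched as follows. This is a minimal illustration that assumes two alleles per locus for each isolate and uses the unbiased estimator He = (2n / (2n - 1)) (1 - Σ p²) for a sample of n isolates; it is not the implementation of the population genetics packages cited above. The function names and the toy genotypes are hypothetical.

```python
from collections import Counter


def clone_correct(genotypes):
    """Keep one representative of each repeated multilocus genotype (MLG)."""
    seen = set()
    kept = []
    for genotype in genotypes:
        # Sort alleles within each locus so (150, 152) and (152, 150) match.
        key = tuple(tuple(sorted(locus)) for locus in genotype)
        if key not in seen:
            seen.add(key)
            kept.append(genotype)
    return kept


def heterozygosity(genotypes, locus):
    """Return (Ho, unbiased He) at one locus for two-allele genotypes."""
    n = len(genotypes)
    ho = sum(1 for g in genotypes if g[locus][0] != g[locus][1]) / n
    counts = Counter(allele for g in genotypes for allele in g[locus])
    sum_p2 = sum((c / (2 * n)) ** 2 for c in counts.values())
    he = (2 * n / (2 * n - 1)) * (1 - sum_p2)
    return ho, he


# Hypothetical isolates: each genotype is a tuple of loci, each locus a pair of alleles.
isolates = [
    ((150, 152), (200, 200)),
    ((150, 152), (200, 200)),  # repeated MLG (clone of the first isolate)
    ((150, 150), (200, 204)),
    ((152, 152), (200, 200)),
]

corrected = clone_correct(isolates)
ho, he = heterozygosity(corrected, locus=0)
print(len(isolates), len(corrected), ho, he)
```

Running the same statistics on `isolates` and on `corrected` mirrors the comparison between the whole dataset and the clone-corrected dataset.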
We used a “hierarchical” strategy for comparing different scenarios, based on our understanding of the population structure in different regions. In the first step, we compared three populations at a time, termed “triplets”. We started with the three recombinant populations in the centre of diversity (Pakistan, Nepal and China), and then compared the remaining populations with each other and with these recombinant populations using the same triplet strategy (Supporting information_on_ABC_Analyses). In the second step, we explored the relationship of Middle-Eastern, Central Asian and Mediterranean populations with those of the centre of diversity. In the third step, we included the NW European population to explore the ancestral relationship among the overall world populations. For each dataset, parameters were estimated for the most appropriate scenarios. The results shown in the thesis chapter are based only on the “triplet” results.
A total of 10⁶ simulated datasets was generated for each scenario under the generalized stepwise mutation model, with two parameters: the mean mutation rate (μ) and the mean parameter (P) of the geometric distribution used to model the length of mutation events (in number of repeats). Because empirical estimates of the mutation rate for microsatellites in PST are lacking, the mean mutation rate was drawn from a uniform distribution of 10⁻⁴ to 10⁻³, while the mutation rate at each locus was drawn from a gamma distribution (mean = μ, shape = 2). The parameter P was kept in the range of 0.1 to 0.3. A range of 40 contiguous allelic states was kept for each locus, characterized by the individual value of the mutation rate (μL) and the parameter of the geometric distribution (PL), which were obtained from gamma distributions (mean = μ, range 5 × 10⁻⁵ to 5 × 10⁻² for μL; mean = P, shape = 2, range 0.01-0.09 for PL).
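To make the mutation model concrete, the sketch below draws a per-locus mutation rate from a gamma distribution with a given mean and shape 2, and applies single mutation events under a generalized stepwise model in which the number of repeats gained or lost follows a geometric distribution with parameter P. It is only an illustration under stated assumptions and not the simulator used by the ABC software: the function names are hypothetical, out-of-range rates are simply redrawn, the step length is assumed to have probability (1 - P) P^(k-1) for k repeats, and alleles are clamped to the 40 allelic states.

```python
import random


def draw_locus_rate(mean_rate, rng, shape=2.0, low=5e-5, high=5e-2):
    """Draw a per-locus mutation rate from a gamma distribution.

    The gamma scale is mean / shape, so the distribution has the requested
    mean. Draws outside [low, high] are redrawn (an assumption of this sketch).
    """
    while True:
        rate = rng.gammavariate(shape, mean_rate / shape)
        if low <= rate <= high:
            return rate


def gsm_step(allele, p, rng, min_allele=1, max_allele=40):
    """Apply one mutation event to an allele coded as a number of repeats."""
    repeats = 1
    # Geometric step length: each extra repeat is added with probability p.
    while rng.random() < p:
        repeats += 1
    direction = rng.choice((-1, 1))  # gain or loss of repeats, equally likely
    mutated = allele + direction * repeats
    # Clamp to the range of contiguous allelic states (assumed boundary rule).
    return min(max(mutated, min_allele), max_allele)


rng = random.Random(1)
mean_rate = rng.uniform(1e-4, 1e-3)  # prior range of the mean mutation rate
locus_rate = draw_locus_rate(mean_rate, rng)

alleles = [20]
for _ in range(10):  # ten successive mutation events at one locus
    alleles.append(gsm_step(alleles[-1], 0.2, rng))
print(locus_rate, alleles)
```

With P = 0, every mutation changes the allele by exactly one repeat, which corresponds to the strict stepwise mutation model.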
Mean number of alleles per locus, mean genetic diversity (Nei, 1978), mean variance in allele size, genetic differentiation between pairwise groups, FST (Weir and Cockerham, 1984), and the genetic distance (δμ)² (Goldstein et al., 1995) were used as summary statistics. A polychotomous logistic regression procedure (Fagundes et al., 2007) was used to estimate the relative posterior probabilities of the different scenarios using the 1% of simulated datasets closest to the observed data. The limiting distribution of the maximum likelihood estimators was used to compute the confidence interval of the posterior probabilities. The posterior distributions of parameters were estimated for the most likely scenario using local linear regression (Beaumont et al., 2002; Cornuet et al., 2008) on the 1% of simulated datasets closest to the observed data.
Confidence in model choice was assessed using a leave-one-out method (Csilléry et al., 2011). For each model we drew 500 of the 1,000,000 simulated datasets used for model selection and treated them as observed datasets (i.e., pseudo-observed datasets). Posterior probabilities of competing models were evaluated for each pseudo-observed dataset, using all remaining simulated datasets and the same methodology as described for the observed dataset. Confidence in model choice was then estimated as the number of pseudo-observed datasets that gave the highest posterior probability to the model with which they had been simulated.
In tests of goodness-of-fit (i.e., model checking), we simulated datasets with numbers of markers similar to those of the observed datasets and calculated for each dataset the average across loci of several test quantities. The set of test quantities included the summary statistics used in analyses of the observed dataset. Because using the same statistics in parameter inference and model checking can overestimate the quality of the fit (Cornuet et al., 2010), we selected additional summary statistics that had not been used in parameter inferences: mean allele size variance across loci, mean index of classification and mean gene diversity across loci. Test statistics computed from observed data were then ranked against the distributions obtained from simulated datasets (Cornuet et al., 2010). (The results of model checking and confidence in scenario choice are not shown in the thesis chapter and were only discussed as future perspectives during the oral PhD defense. This will be addressed in the version of this chapter submitted to an international journal.)
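The principle of scenario choice by ABC can be illustrated with a deliberately simplified sketch. It uses the direct rejection estimate, in which the posterior probability of a scenario is its share among the simulated datasets closest to the observed data; this is a simpler stand-in for the polychotomous logistic regression named above, not a reimplementation of it. The function name, the two toy scenarios "A" and "B" and their Gaussian summary statistics are all hypothetical.

```python
import random


def abc_model_choice(observed, simulations, fraction=0.01):
    """Direct ABC estimate of scenario posterior probabilities.

    observed: tuple of summary statistics computed on the real data.
    simulations: list of (scenario_label, summary_statistics_tuple).
    """
    # Scale each statistic by its standard deviation across all simulations.
    sds = []
    for j in range(len(observed)):
        values = [stats[j] for _, stats in simulations]
        centre = sum(values) / len(values)
        variance = sum((v - centre) ** 2 for v in values) / (len(values) - 1)
        sds.append(variance ** 0.5 or 1.0)  # fall back to 1.0 if constant

    def distance(stats):
        # Euclidean distance between scaled simulated and observed statistics.
        return sum(((s - o) / sd) ** 2
                   for s, o, sd in zip(stats, observed, sds)) ** 0.5

    ranked = sorted(simulations, key=lambda sim: distance(sim[1]))
    n_keep = max(1, int(len(ranked) * fraction))
    accepted = [label for label, _ in ranked[:n_keep]]
    labels = sorted({label for label, _ in simulations})
    return {label: accepted.count(label) / n_keep for label in labels}


# Toy simulations: the scenarios differ in the first statistic only.
rng = random.Random(42)
sims = [("A", (rng.gauss(0.0, 1.0), rng.gauss(2.0, 1.0))) for _ in range(5000)]
sims += [("B", (rng.gauss(5.0, 1.0), rng.gauss(2.0, 1.0))) for _ in range(5000)]

posterior = abc_model_choice((0.2, 2.1), sims)
print(posterior)
```

The leave-one-out check follows the same pattern: a simulated dataset is removed from `sims`, passed as `observed`, and the scenario with the highest estimated probability is compared with the scenario that generated it.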


RESULTS

Summary of genetic variation

We performed multilocus genotyping of 409 PST isolates, representative of a worldwide collection, using a set of 20 microsatellite markers. Plotting the number of multilocus genotypes detected against the number of loci re-sampled showed that the full set of SSRs was sufficient for discriminating clonal lineages (supplementary files; Fig. S1). No significant linkage disequilibrium was found among SSR loci (data not shown), suggesting a lack of redundancy among markers. Some loci were monomorphic in certain geographical areas, whereas China had no fixed loci and Pakistan had only one monomorphic locus (RJN-12; supplementary files; Table S1).
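The resampling behind such a curve can be sketched in a few lines: for each number of loci k, random subsets of k loci are drawn and the number of distinct multilocus genotypes is counted. This is a minimal illustration and not the procedure of the software used in the study; the function name `mlg_accumulation` and the four toy isolates are hypothetical.

```python
import random


def mlg_accumulation(genotypes, n_resamples=100, seed=0):
    """Mean number of distinct MLGs detected with k randomly chosen loci.

    genotypes: list of tuples with one genotype string per locus.
    Returns a dict {k: mean number of distinct MLGs}.
    """
    rng = random.Random(seed)
    n_loci = len(genotypes[0])
    curve = {}
    for k in range(1, n_loci + 1):
        total = 0
        for _ in range(n_resamples):
            loci = rng.sample(range(n_loci), k)  # k loci without replacement
            distinct = {tuple(g[i] for i in loci) for g in genotypes}
            total += len(distinct)
        curve[k] = total / n_resamples
    return curve


# Toy data: four isolates typed at three loci; the last is a clone of the first.
toy = [
    ("150/152", "200/200", "99/99"),
    ("150/152", "200/204", "99/99"),
    ("150/150", "200/204", "99/101"),
    ("150/152", "200/200", "99/99"),
]

curve = mlg_accumulation(toy)
print(curve)
```

A curve that reaches a plateau before all loci are used indicates that the marker set is sufficient to discriminate the genotypes present in the sample.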

Population subdivision

Genotypes clearly clustered according to their geographical origin in the analyses with the model-based clustering method implemented in STRUCTURE, with an optimal number of clusters (K) equal to 6, based on the rate of change in the log probability of data across successive K values (Evanno et al., 2005). At K = 2, Middle Eastern, Mediterranean and Central Asian populations were assigned to one group; the Chinese population was assigned to the other group; and Nepalese, Pakistani and NW European populations had a mixed assignment of the two groups (Fig. S2). Increasing K to 3 individualized a Pakistan-specific group, while increasing K to 4 split the cluster of Middle East, Central Asia and Mediterranean region into two groups, one specific to the Middle East and East Africa and the other specific to the Central Asia and Mediterranean region, with substantial admixture from the Middle East. The Middle Eastern and East African populations were not differentiated from each other and are hereafter referred to as the Middle East-Red Sea Area. At K = 5, the NW European populations were separated from the Chinese population, and at K = 6, the Nepalese group individualized (Fig. S2). Increasing K above 6 did not reveal any further subdivisions. The presence of some clonal populations did not appear to cause a strong deviation in the STRUCTURE results, as the existence of six genetic groups was further supported by the non-parametric DAPC analysis (Fig. 1 and Fig. 2). The BIC curve in the DAPC analyses also supported K = 6, with a clear discrimination of genotypes from China, Pakistan, Nepal, Middle East-Red Sea Area, NW Europe and Central Asia-Mediterranean region (Fig. 2).
Population differentiation among the different groups was estimated by means of pairwise FST. Populations showed a high differentiation, with strong and significant FST values for all pairs except for PST from the Middle Eastern, Central Asian and Mediterranean regions (Table 2), suggesting a relatively recent shared ancestry or significant gene flow among these populations. Chinese, Pakistani and Nepalese populations were differentiated from one another and from the Middle Eastern and Mediterranean populations. These two latter populations were not highly differentiated from one another (Fig. S2; Table 2). The NW European population showed a strong differentiation from Mediterranean and Middle Eastern populations but was closer to the Chinese population (Fig. S2 and Fig. S3).
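The rate-of-change criterion of Evanno et al. (2005) used to select K can be written as ΔK = |L(K+1) - 2 L(K) + L(K-1)| / s[L(K)], where L(K) is the mean log probability of the data over replicate runs and s[L(K)] its standard deviation. The sketch below applies this formula, taking the mean over runs before the second-order difference (an assumption of this sketch); the function name and the log-probability values are hypothetical and are not results from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """Delta K for every K that has both neighbouring K values.

    lnp: dict {K: [ln P(D) of each replicate run]}.
    """
    delta = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # the second-order difference needs K-1 and K+1
        second_diff = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        delta[k] = second_diff / stdev(lnp[k])
    return delta


# Hypothetical log probabilities from three replicate runs per K.
lnp = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-700, -701, -699],
    4: [-690, -695, -685],
    5: [-688, -700, -676],
}

delta = evanno_delta_k(lnp)
best_k = max(delta, key=delta.get)
print(delta, best_k)
```

Because the criterion needs both neighbours, ΔK is undefined for the smallest and largest K tested, so the range of K explored should extend beyond the expected number of clusters.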

Table of contents:

GENERAL INTRODUCTION
CHAPTER I. Origin, migration routes and genetic structure of worldwide populations of the wheat yellow rust pathogen, Puccinia striiformis f.sp. tritici
Introduction
Materials and Methods
Results
Discussion
References
Supporting information
CHAPTER II. Reduction in the sex ability of worldwide clonal populations of Puccinia striiformis f.sp. tritici
Introduction
Materials and Methods
Results
Discussion
References
Supporting information
CHAPTER III. The role of sexual recombination and off-season survival in temporal maintenance of Puccinia striiformis f.sp. tritici at its centre of diversity; the Himalayan region of Pakistan
Introduction
Materials and Methods
Results
Discussion
References
Supporting information
CHAPTER IV. Recapturing clones to estimate the sexuality and population size of pathogens
Introduction
Theoretical Model
Performance in simulations
Application to a fungal pathogen
Discussion
Materials and Methods
References
Appendices
Supporting information
GENERAL DISCUSSION