
Variation in Genetic Diversity of Phytophthora infestans Populations in Mexico from the Center of Origin Outwards

    Authors and Affiliations
    • Shankar K. Shakya, Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331
    • Meredith M. Larsen, Horticultural Crops Research Unit, USDA-ARS, Corvallis, OR 97330
    • Mercedes María Cuenca-Condoy
    • Héctor Lozoya-Saldaña, Departamento de Fitotecnia, Universidad Autónoma de Chapingo, Méx. 56230, México
    • Niklaus J. Grünwald, Horticultural Crops Research Unit, USDA-ARS, Corvallis, OR 97330


      The Toluca valley, located in central Mexico, is thought to be the center of origin of the potato late blight pathogen Phytophthora infestans. We characterized over 500 individuals of P. infestans sampled from populations spanning a geographical distance of more than 400 km in six regions adjacent to the Toluca valley in three states (Michoacán, Mexico, and Tlaxcala). Sampling followed a predominantly east to west gradient, and populations showed significant genetic differentiation. The westernmost sampling location, in Michoacán, was the most differentiated from the other populations. Populations from San Gerónimo, Juchitepec, and Tlaxcala clustered together and appeared to be in linkage equilibrium. This work provides a finer understanding of gradients of genetic diversity in populations of P. infestans at the center of origin.

      The Toluca valley, located in central Mexico, is considered to be the center of origin of the potato late blight pathogen Phytophthora infestans (Goodwin et al. 1992; Grünwald et al. 2001). This hypothesis is based on several lines of evidence. Populations of P. infestans in the Toluca valley are genetically diverse (Flier et al. 2003; Goss et al. 2014; Grünwald et al. 2001), the A1 and A2 mating types exist in a 1:1 ratio (Grünwald et al. 2001), oospores can readily be observed in various host plant tissues and are ubiquitous (Fernández-Pavía et al. 2004; Flier et al. 2001; Gallegly and Galindo 1958), and populations reproduce sexually based on the observed linkage equilibrium among genetic markers (Goss et al. 2014; Grünwald et al. 2001). This evidence is further supported by the fact that the closest relatives of P. infestans, namely P. ipomoeae and P. mirabilis, are only observed in Mexico (Flier et al. 2002; Galindo and Hohl 1985). The only other known close relative, P. andina, found in South America, is a hybrid species with two ancestral parents (Goss et al. 2011; Oliva et al. 2010). Finally, Mexico has been described as a center of diversity for tuber-bearing Solanum species, and P. infestans has been shown to infect these native species and is thought to have coevolved with the native R genes described in various hosts such as S. demissum and S. edinense (Flier et al. 2003; Grünwald et al. 2001; Hijmans and Spooner 2001; Rivera-Peña 1990; Rivera-Peña and Molina-Galan 1989). Knowledge and understanding of a pathogen’s center of origin provide insights into its evolutionary potential, genetic diversity, mode of reproduction, and gene flow. This information in turn is critical to implementing adaptive disease management strategies and to discovering novel R genes.

      Until the 1970s, the clonal lineages US-1 and HERB-1 (both of mating type A1) of P. infestans were dominant globally (Fry 2008; Yoshida et al. 2013), whereas the P. infestans population in central Mexico was shown to have both mating types (Goodwin et al. 1992; Grünwald and Flier 2005). Even though both the A1 and A2 mating types are present in the United States, evidence of sexual recombination is still lacking (Danies et al. 2013; Fry et al. 2013) and populations are described as clonal lineages. Recombinant isolates are occasionally detected in the U.S. but exist for only a short period of time before going extinct (Danies et al. 2014). In the U.S., we currently distinguish up to 24 clonal lineages (Fry et al. 2013). A recent phylogenomic analysis showed that none of the U.S. clonal lineages are descendants of preexisting lineages (Knaus et al. 2016; Wang et al. 2017). Hence, populations in Mexico are thought to be the main source of novel migrant lineages that establish themselves in the U.S.

      Several studies have investigated diversity of populations in different regions of Mexico. Goodwin et al. (1992) compared the diversity of P. infestans between northern and central Mexico based on allozyme loci and DNA fingerprinting and suggested high diversity in Chapingo. A recent study by Wang et al. (2017) provides novel insights into population structure of P. infestans in Mexico. Their results show high genetic diversity and sexual reproduction in the Michoacán, Toluca, and Tlaxcala regions based on 12 SSR loci. Here, we expand the range of populations sampled around the Toluca valley to determine the range of sexual populations and whether there is isolation by distance as we sample away from the Toluca valley.

      In this study, we analyzed 517 P. infestans isolates sampled from six locations in three states on a predominantly east to west gradient (Fig. 1). We collected 355 new isolates and combined these with populations described previously (Wang et al. 2017). The goal of this study was to provide a better understanding of the population structure and diversity of P. infestans in central Mexico and adjacent states. We specifically addressed several questions, including: 1) Is there genetic differentiation among populations of P. infestans in central Mexico? 2) Are populations isolated by geographic distance? 3) Are populations that are geographically distant from central Mexico, such as Tlaxcala and Michoacán, increasingly clonal? Our work provides novel evidence for spatial correlation across this gradient and further refines our understanding of the biology of this pathogen at the center of origin.

      Fig. 1. Map of Mexico (top) and enlarged map of the Mexican states where populations of P. infestans were sampled. The colored dots represent the areas where samples were collected.

      Materials and Methods

      Sampling and isolation of P. infestans.

      In this study, we characterized populations of P. infestans sampled at six locations between 1997 and 2016 (Table 1; Fig. 1). We collected 355 new isolates in 2015 and 2016. Briefly, isolates were obtained using the protocol described previously (Grünwald et al. 2001) with the slight modification of adding oxytetracycline (20 mg/liter) to the selective media (see Wang et al. 2017). We also included populations sampled previously by Wang et al. (2017) and deposited in Dryad (Table 1). The Wang et al. (2017) strains were genotyped using the same protocol as described below. Isolates from Toluca in 2015 and Chapingo in 2015 and 2016 were sampled from experimental plots subjected to varying fungicide treatments. They are therefore not considered natural populations, and results for them should be interpreted accordingly. Most of the isolates from Tlaxcala state were sampled near Villareal and are referred to as the Tlaxcala population hereafter. Isolates originating from the same geographic area, regardless of sampling year, were combined because they were not significantly different based on FST analysis (Supplementary Table S1).

      Table 1. Samples of P. infestans collected and genotyped from Mexico using 11 polymorphic microsatellite loci. Our analysis combined newly sampled populations with those analyzed recently by Wang et al. (2017).

      DNA extraction and SSR genotyping.

      Isolates were grown on pea broth agar for a week at 18°C and actively growing agar plugs were transferred to 10 ml pea broth for 7 to 14 days for DNA extraction. DNA was extracted following the manufacturer’s protocol with the FastDNA SPIN Kit (MP Biomedicals, United States) and stored at –20°C until used. DNA concentration was measured using a NanoDrop 2000C (Thermo Scientific, United States) and adjusted to 10 ng/µl. Multiplex PCR was performed on 12 polymorphic microsatellite loci as described by Li et al. (2013). A panel of reference strains (5303, 5304, 5306, 5307, and 5308) with known SSR alleles was included in all PCR and fragment analysis runs as described previously (Grünwald et al. 2011). Allele sizing was done using capillary electrophoresis (ABI 3730, Applied Biosystems, Foster City, CA) at the Center for Genome Research and Biocomputing at Oregon State University and fragments were sized using the GENEMAPPER software (Applied Biosystems) (Wang et al. 2017).

      Data preparation.

      Isolates originating from the same geographical region were pooled into a population regardless of sampling year and study. We also added samples from Toluca and Tlaxcala from Wang et al. (2017) to our recent collection sampled in 2015–16. Populations were checked for varying ploidy levels and missing values. Only strictly diploid genotypes were considered for downstream analysis. The locus D13 had missing values ranging from 5 to 70% and was therefore removed from the analyses, resulting in a final data set with 11 SSR loci (Supplementary Fig. S1).
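
      The filtering steps described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the genotype layout, the locus names other than D13, the toy allele sizes, and the 25% missing-data cutoff are all assumptions chosen for illustration (the text reports the missing-data range for D13 but no explicit cutoff).

```python
from typing import Dict, List, Optional, Tuple

# Assumed layout: locus name -> tuple of allele sizes; None marks a missing call.
Genotype = Dict[str, Optional[Tuple[int, ...]]]


def drop_sparse_loci(samples: List[Genotype], max_missing: float) -> List[str]:
    """Return the loci whose fraction of missing calls is at most max_missing."""
    loci = sorted({locus for g in samples for locus in g})
    kept = []
    for locus in loci:
        missing = sum(1 for g in samples if g.get(locus) is None)
        if missing / len(samples) <= max_missing:
            kept.append(locus)
    return kept


def is_strictly_diploid(genotype: Genotype, loci: List[str]) -> bool:
    """True if every scored locus carries exactly two allele calls."""
    calls = [genotype.get(locus) for locus in loci]
    return all(call is None or len(call) == 2 for call in calls)


# Toy data (hypothetical allele sizes; "locusA" and "locusB" are placeholders).
samples: List[Genotype] = [
    {"locusA": (150, 152), "locusB": (200, 200), "D13": None},
    {"locusA": (150, 150), "locusB": (200, 204), "D13": (110, 112)},
    {"locusA": (150, 152, 154), "locusB": (200, 204), "D13": None},  # three alleles
    {"locusA": (152, 154), "locusB": (204, 204), "D13": (110, 110)},
]

kept_loci = drop_sparse_loci(samples, max_missing=0.25)  # cutoff is an assumption
diploids = [g for g in samples if is_strictly_diploid(g, kept_loci)]
print(kept_loci, len(diploids))
```

      In this sketch the sparse locus is dropped first and ploidy is then checked on the remaining loci; the order in which the two filters were applied in the study is not stated.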

      Genetic diversity and mode of reproduction.

      Analysis was conducted on 431 diploid individuals based on 11 SSR loci (some of the 517 strains sampled were excluded because of varying ploidy levels; Table 1). Multilocus genotypes (MLG), i.e., the unique combination of alleles at all loci per individual, and expected multilocus genotypes (eMLG) based on rarefaction were calculated using the R package poppr V.2.3.0 (Grünwald et al. 2017; Kamvar et al. 2014; R Core Team 2016). Stoddart and Taylor’s diversity index (G) was calculated (Stoddart and Taylor 1988). Nei’s unbiased gene diversity (Nei 1978), also referred to as expected heterozygosity (Hexp), was calculated for all the diploid individuals per population.
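
      For readers unfamiliar with these indices, the following Python sketch shows one common way to compute them. It is an illustration under stated assumptions rather than the poppr implementation: G is taken as the inverse of the sum of squared MLG frequencies, eMLG as the rarefied expectation of distinct MLGs in a subsample, and the gene diversity correction uses the number of sampled allele copies at a locus. The toy inputs are hypothetical.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def stoddart_taylor_g(mlgs: Sequence[Hashable]) -> float:
    """G = 1 / sum(p_i^2), where p_i is the frequency of the i-th MLG."""
    n = len(mlgs)
    return 1.0 / sum((count / n) ** 2 for count in Counter(mlgs).values())


def expected_mlg(mlgs: Sequence[Hashable], subsample: int) -> float:
    """Rarefaction: expected number of distinct MLGs in a random subsample."""
    n = len(mlgs)
    total = comb(n, subsample)
    return sum(1 - comb(n - count, subsample) / total
               for count in Counter(mlgs).values())


def nei_gene_diversity(alleles: Sequence[int]) -> float:
    """Unbiased gene diversity at one locus: n/(n-1) * (1 - sum(p_i^2)).

    Assumption: n is the number of allele copies sampled at the locus.
    """
    n = len(alleles)
    sum_sq = sum((count / n) ** 2 for count in Counter(alleles).values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical population: four isolates, three distinct MLGs.
mlg_labels = ["A", "A", "B", "C"]
print(stoddart_taylor_g(mlg_labels))
print(expected_mlg(mlg_labels, subsample=2))
print(nei_gene_diversity([150, 150, 152, 152]))
```

      A population-level Hexp would then be the mean of the per-locus values across the loci retained.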

      P. infestans is known to reproduce clonally and sexually. The literature suggests that sexual recombination is predominant in the Toluca valley (Goodwin et al. 1992; Goss et al. 2014; Grünwald and Flier 2005; Grünwald et al. 2003). We assessed the mode of reproduction in P. infestans across all populations. The standardized index of association (rbarD), a multilocus estimate of the index of association/linkage disequilibrium, was calculated to investigate the mode of reproduction (Agapow and Burt 2001; Kamvar et al. 2014). rbarD was measured on clone corrected data to remove the bias of resampled MLGs. The expectation of rbarD for a randomly mating population is zero. Any significant deviation from the expectation of zero would suggest clonal reproduction. The significance was tested based on 999 permutations and conducted in the R package poppr (Kamvar et al. 2014).
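
      The logic of the standardized index of association and its permutation test can be sketched as follows. This is a simplified illustration, not the poppr code: it assumes complete diploid data, scores the distance between two individuals at a locus as the number of differing alleles, and shuffles genotypes independently within each locus to simulate random mating. All function names and the toy data are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt
from statistics import pvariance
from typing import List, Tuple

Genotype = Tuple[int, int]      # two alleles at one locus
Individual = List[Genotype]     # one genotype per locus


def locus_distance(a: Genotype, b: Genotype) -> int:
    """Number of allele differences (0, 1, or 2) between two diploid genotypes."""
    remaining = list(b)
    shared = 0
    for allele in a:
        if allele in remaining:
            remaining.remove(allele)
            shared += 1
    return 2 - shared


def rbar_d(pop: List[Individual]) -> float:
    """rbarD = (V_O - sum_j var_j) / (2 * sum_{j<k} sqrt(var_j * var_k))."""
    pairs = list(combinations(range(len(pop)), 2))
    per_locus = [[locus_distance(pop[i][j], pop[k][j]) for i, k in pairs]
                 for j in range(len(pop[0]))]
    totals = [sum(column) for column in zip(*per_locus)]
    var_obs = pvariance(totals)                 # variance of summed distances
    var_loci = [pvariance(d) for d in per_locus]
    denom = 2 * sum(sqrt(vj * vk) for vj, vk in combinations(var_loci, 2))
    return (var_obs - sum(var_loci)) / denom


def permutation_p(pop: List[Individual], n_perm: int = 999,
                  seed: int = 1) -> Tuple[float, float]:
    """One-sided P value from shuffling genotypes within each locus."""
    rng = random.Random(seed)
    observed = rbar_d(pop)
    hits = 0
    for _ in range(n_perm):
        columns = []
        for j in range(len(pop[0])):
            column = [ind[j] for ind in pop]
            rng.shuffle(column)
            columns.append(column)
        shuffled = [list(row) for row in zip(*columns)]
        if rbar_d(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy population in which all three loci are perfectly associated.
linked_pop = [[(1, 1)] * 3, [(1, 2)] * 3, [(2, 2)] * 3, [(1, 1)] * 3]
print(permutation_p(linked_pop, n_perm=99))
```

      The sketch fails with a division by zero if fewer than two loci are polymorphic; a real analysis would handle monomorphic loci and missing data explicitly.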

      One of the assumptions of Hardy-Weinberg equilibrium (HWE) is random mating. Deviation from HWE would suggest nonrandom mating or population subdivision. We tested the hypotheses of nonrandom mating and population subdivision by testing for HWE per locus per population. HWE tests were performed using the R package pegas V.0.9 and significance was assessed using 1,000 permutations (Paradis 2010).
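
      A permutation-based HWE test for a single multiallelic locus can be sketched as below. This is a generic Monte Carlo illustration and should not be read as the algorithm implemented in pegas: it assumes a chi-square statistic comparing observed and expected genotype counts, and it simulates random mating by shuffling the pooled allele copies and re-pairing them. Names and data are hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Genotype = Tuple[int, int]


def hwe_chi_square(genotypes: List[Genotype]) -> float:
    """Chi-square distance between observed genotype counts and HWE expectations."""
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    alleles = sorted(freqs)
    stat = 0.0
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            # Expected count: n*p^2 for homozygotes, n*2pq for heterozygotes.
            prob = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
            expected = n * prob
            stat += (observed.get((a, b), 0) - expected) ** 2 / expected
    return stat


def hwe_permutation_p(genotypes: List[Genotype], n_perm: int = 1000,
                      seed: int = 1) -> float:
    """Share of re-paired allele pools at least as extreme as the observed data."""
    rng = random.Random(seed)
    observed = hwe_chi_square(genotypes)
    pool = [a for g in genotypes for a in g]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        shuffled = [(pool[i], pool[i + 1]) for i in range(0, len(pool), 2)]
        if hwe_chi_square(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy locus with only heterozygotes, a strong departure from HWE.
all_hets = [(150, 152)] * 20
print(hwe_chi_square(all_hets), hwe_permutation_p(all_hets))
```

      Applied per locus and per population, a test of this kind yields the count of loci consistent with HWE that is reported alongside rbarD.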

      Population differentiation and structure.

      Wright’s FST is a measure of population subdivision for diploid individuals. It is based on differences in allele frequencies between sampled populations (Wright 1949). We calculated pairwise FST values with the clone corrected P. infestans data to test the hypothesis of population subdivision using the R package strataG V.1.0.5 (Archer et al. 2017). Statistical significance was calculated with 10,000 permutation events. Analysis of molecular variance (AMOVA) was performed to test for genetic structure implemented in the R package ade4 V.1.7-5 (Dray and Dufour 2007; Excoffier et al. 1992). AMOVA was calculated on clone corrected data using Bruvo’s distance (Bruvo et al. 2004) to estimate the variance explained by populations or individuals within populations. Statistical significance was tested using 999 permutation events. The neighbor-joining algorithm based on Bruvo’s distance was run to visualize the clustering of P. infestans isolates using poppr with 1,000 bootstrap replicates and the resulting tree was edited using the R package ggtree V.1.4.20 (Yu et al. 2017).
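
      Clone correction and a basic pairwise FST calculation can be sketched as follows. The estimator shown is the simple heterozygosity-based form FST = (HT - HS)/HT with equal population weights, which is an assumption made for illustration; the estimator used by strataG may differ, and no permutation test is included. Function names and genotypes are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]
Individual = Tuple[Genotype, ...]   # hashable multilocus genotype


def clone_correct(pop: List[Individual]) -> List[Individual]:
    """Keep one representative per multilocus genotype, preserving order."""
    # Sort alleles within each locus so (1, 2) and (2, 1) count as one genotype.
    normalised = [tuple(tuple(sorted(g)) for g in ind) for ind in pop]
    return list(dict.fromkeys(normalised))


def allele_freqs(pop: List[Individual], locus: int) -> Dict[int, float]:
    counts = Counter(a for ind in pop for a in ind[locus])
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def pairwise_fst(pop1: List[Individual], pop2: List[Individual]) -> float:
    """FST = (HT - HS) / HT, with HT and HS summed over loci before the ratio."""
    hs_sum = 0.0
    ht_sum = 0.0
    for locus in range(len(pop1[0])):
        f1 = allele_freqs(pop1, locus)
        f2 = allele_freqs(pop2, locus)
        h1 = 1 - sum(p * p for p in f1.values())
        h2 = 1 - sum(p * p for p in f2.values())
        pooled = {a: (f1.get(a, 0.0) + f2.get(a, 0.0)) / 2
                  for a in set(f1) | set(f2)}
        ht_sum += 1 - sum(p * p for p in pooled.values())
        hs_sum += (h1 + h2) / 2
    return (ht_sum - hs_sum) / ht_sum


# Toy single-locus populations fixed for different alleles.
west = clone_correct([((1, 1),), ((1, 1),)])
east = clone_correct([((2, 2),), ((2, 2),)])
print(pairwise_fst(west, east))
```

      The ratio is undefined when both populations are fixed for the same alleles at every locus (HT = 0), a case this sketch does not guard against.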

      The Bayesian model-based clustering algorithm STRUCTURE V.2.3 (Pritchard et al. 2000) was used to infer population structure. The software clusters individuals into K populations using a Markov Chain Monte Carlo (MCMC) approach. Fifteen independent runs were performed with 100,000 iterations/MCMC run after a burn-in period of 20,000 for each value of K ranging from 1 to 6. An admixture model was selected as it assumes mixed ancestry of an individual. The optimum value of K was determined using the Evanno et al. (2005) method. CLUMPP V.1.2 was run to aggregate multiple STRUCTURE runs (Jakobsson and Rosenberg 2007) based on a greedy algorithm and a pairwise similarity statistic parameter with 10,000 repeats. The R package strataG (Archer et al. 2017), which provides a wrapper script to run and visualize STRUCTURE data and implements the Evanno method and CLUMPP analyses, was used.
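
      The ΔK statistic of the Evanno method can be computed from the replicate log-likelihoods reported by STRUCTURE. The sketch below follows one common reading of the method, in which the second-order difference is taken on the mean log-likelihood per K and divided by the standard deviation across runs at that K; this reading, the function name, and the log-likelihood values are assumptions for illustration, not output from this study.

```python
from statistics import mean, stdev
from typing import Dict, List


def evanno_delta_k(lnp: Dict[int, List[float]]) -> Dict[int, float]:
    """Delta K = |L(K+1) - 2*L(K) + L(K-1)| / sd(L(K)), with L the mean lnP(D)."""
    delta = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # undefined at the smallest and largest K tested
        second_diff = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        delta[k] = second_diff / stdev(lnp[k])
    return delta


# Made-up lnP(D) values for three replicate runs at each K.
toy_lnp = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4502.0, -4498.0],
    3: [-4450.0, -4460.0, -4440.0],
    4: [-4430.0, -4450.0, -4410.0],
}
delta_k = evanno_delta_k(toy_lnp)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

      Because ΔK is undefined at the ends of the tested range, K = 1 can never be selected by this criterion, and more than one peak in ΔK can occur.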

      Discriminant analysis of principal components (DAPC) is a model-free approach to infer clusters of populations, relative population membership, and assign individuals to populations (Jombart et al. 2010). Proportional assignment of an individual to populations is provided by the retained discriminant functions. DAPC was conducted using the R package adegenet V.2.0.1 (Jombart 2008).

      Long-distance dispersal of P. infestans sporangia is limited due to desiccation and susceptibility to radiation (Mizubuti et al. 2000). Thus, isolation by distance might be observed as individuals cannot migrate freely over long distances (e.g., >1 to 10 km). We tested for correlation between pairwise FST and pairwise geographic distance among P. infestans populations in Mexico. We also performed Mantel’s test implemented in the R package ade4 with 1,000 permutations (Dray and Dufour 2007; Mantel 1967).
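
      The principle of Mantel’s test is shown in the Python sketch below. It is a generic illustration, not the ade4 implementation: it assumes the Pearson correlation between the upper triangles of the two distance matrices as the statistic and a one-sided test in which the rows and columns of the genetic matrix are permuted together. The four populations and their distances are hypothetical.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def upper(matrix: Matrix, order: Sequence[int]) -> List[float]:
    """Upper-triangle entries after relabelling populations by 'order'."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(genetic: Matrix, geographic: Matrix, n_perm: int = 1000,
           seed: int = 1) -> Tuple[float, float]:
    """Return the observed correlation and a one-sided permutation P value."""
    rng = random.Random(seed)
    index = list(range(len(genetic)))
    geo_vec = upper(geographic, index)
    observed = pearson(upper(genetic, index), geo_vec)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(index)  # permute population labels of the genetic matrix
        if pearson(upper(genetic, index), geo_vec) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical populations on a line; genetic distance proportional to distance.
positions = [0.0, 1.0, 2.0, 10.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo]
print(mantel(gen, geo, n_perm=200))
```

      With few populations the number of distinct relabellings is small (720 for six populations), which limits the resolution of the permutation P value.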

      All code and data used in this study are deposited on GitHub and OSF (Shakya and Grünwald 2017).

      Results


      Variation in genetic diversity and reproduction in P. infestans populations.

      A total of 431 diploid P. infestans isolates (after removing polyploid strains from a total of 517) sampled in six regions of Mexico were analyzed using 11 highly polymorphic microsatellite loci (Table 1). The majority of isolates from Chapingo, which were sampled in experimental plots rather than grower fields, were identified as clones. Expected heterozygosity ranged from 0.44 (Chapingo) to 0.57 (Tlaxcala) (Table 2). Stoddart and Taylor’s diversity, which is a measure of evenness and richness of MLGs, was lowest for Chapingo (5.5) and highest for Tlaxcala (65.2).

      Table 2. Population statistics for diploid genotypes for P. infestans populations by region in Mexico. Total number of samples (N), observed multilocus genotype (MLG), expected multilocus genotype (eMLG), Stoddart and Taylor’s diversity (G), and expected heterozygosity (Hexp).

      To infer the mode of reproduction, the standardized index of association (rbarD) was calculated for clone corrected data. Populations from Chapingo and Tlaxcala showed significant deviations in rbarD values from the null expectation supporting clonal reproduction (Table 3). rbarD values for all other regions were not significantly different from zero, indicating linkage equilibrium or presence of sexual reproduction. In each population, between five and 11 loci were considered to be in Hardy-Weinberg equilibrium (Table 3).

      Table 3. Linkage disequilibrium among loci based on the standardized index of association (rbarD) and number of loci in Hardy-Weinberg equilibrium after clone correction for P. infestans in different regions of Mexico. Total number of samples (N), observed multilocus genotype (MLG), the standardized index of association (rbarD), the P value for rbarD, and the number of loci in Hardy-Weinberg equilibrium.

      Genetic differentiation among populations of P. infestans in central Mexico.

      Pairwise FST values were calculated to estimate differentiation among populations. These values ranged from 0.01 to 0.11, indicating that populations show low to modest levels of differentiation (Table 4). The Michoacán population was most differentiated from the other populations in Mexico. The highest differentiation was between populations from Michoacán and San Gerónimo (central Mexico). Populations that were geographically close had small but significant FST values. San Gerónimo and Tlaxcala had the smallest pairwise FST value of 0.01. The results of population differentiation using FST are well supported by our AMOVA. AMOVA on clone corrected P. infestans populations based on Bruvo’s distance attributed 11% of the variation to differences between populations (Table 5, P = 0.001). This is similar to our highest pairwise FST value of 0.11 (∼11%). Approximately 90% of the genetic variation was attributable to variation within populations rather than between populations. Our data support the hypothesis of a gradient of increasing differentiation as we move west toward Michoacán from Toluca. However, at the eastern front of our sample areas, we do not yet see this effect and further sampling east of Tlaxcala is likely needed.

      Table 4. Pairwise FST values for clone corrected P. infestans populations from regions in Mexico. Values significant (alpha = 0.05) based on 10,000 permutations are marked with ‘*’.

      Table 5. Analysis of molecular variance (AMOVA) for clone corrected, diploid P. infestans populations based on Bruvo’s genetic distance

      Population structure and isolation by distance.

      We performed Bayesian clustering using STRUCTURE to determine population relationships. The STRUCTURE analysis identified two likely solutions for K based on Evanno’s method. The value of ΔK had two peaks at K = 2 and K = 4 (Fig. 2). For K = 2, Michoacán was identified as a first group and the rest of the populations as a second group. STRUCTURE results for K = 4 identified Michoacán, Chapingo, and Juchitepec as distinct clusters and Toluca, Tlaxcala, and San Gerónimo as one cluster (Fig. 3). We also conducted neighbor-joining analysis based on Bruvo’s distance (Supplementary Fig. S2). In this tree, isolates from Michoacán formed a cluster, whereas isolates from other regions were more or less randomly distributed, confirming the STRUCTURE analysis.

      Fig. 2. Delta K plot for populations of P. infestans using Evanno’s method (Evanno et al. 2005).

      Fig. 3. Bayesian clustering of diploid P. infestans genotypes based on the admixture ancestry model using STRUCTURE. The optimum value of K inferred with Evanno’s method was 2 (Evanno et al. 2005). Shown are STRUCTURE plots for K = 2 to 5 from top to bottom. MICHO = Michoacán; CHA = Chapingo; TOLU = Toluca; SG = San Gerónimo; JFH = Juchitepec; and TLAX = Tlaxcala.

      To resolve the discrepancy in the number of clusters based on STRUCTURE, we performed a DAPC, which is a model-free approach to identify clusters. Clustering of isolates using DAPC was consistent with Evanno’s result of K = 4 (Fig. 4). Michoacán, Juchitepec, and Chapingo each formed an independent cluster in the DAPC analysis whereas the Toluca, Tlaxcala, and San Gerónimo populations formed a fourth cluster.

      Fig. 4. Discriminant analysis of principal components (DAPC) of P. infestans populations from six regions in Mexico. A, All populations; B, excluding the Michoacán population.

      Geographic and genetic distance were significantly and positively correlated (Fig. 5A) (P < 0.05). However, this correlation may be confounded with population structure. Mantel’s test indicated no significant isolation by distance (Fig. 5B). Two clusters of points were observed in the Mantel plot (Fig. 5B), corresponding to the K = 2 clusters identified above: Michoacán versus all other populations. When the Michoacán population was removed, no spatial correlation was observed between populations of central Mexico and Tlaxcala. Our analysis does not demonstrate significant isolation by distance, but suggests that the Michoacán population is distinct from the other populations, in line with a gradient of gene flow at the western border of our sampling range.

      Fig. 5. A, Plot of pairwise genetic distance (FST values) versus geographic distance between P. infestans populations. Based on the adjusted R2, 34% of the variance in FST is explained by geographical distance (P = 0.0123). B, Test for isolation by distance (Mantel’s test) with 1,000 permutations (P = 0.11).

      Discussion


      In this study, we investigated the population structure of P. infestans beyond the well-characterized Toluca valley region, considered to be the center of origin. Our work built on the previous work by Wang et al. (2017) by adding samples from regions in central Mexico other than the Toluca valley. Our work suggests that populations in central Mexico show high genotypic diversity beyond the center of origin previously described to be located in the Toluca valley. Similar results were recently reported by Wang et al. (2017). Our analyses suggest presence of both sexual and asexual reproduction in central Mexico. We found that, in the range sampled, populations on the western edge in Michoacán are the most differentiated, yet still genetically diverse and sexual in nature. Removing the Michoacán population resulted in three main clusters: Chapingo, Juchitepec, and a third cluster containing San Gerónimo, Tlaxcala, and Toluca.

      The populations we sampled in Chapingo (2015 and 2016) showed low genotypic diversity and evidence of clonality. STRUCTURE and DAPC analyses suggest that the Chapingo population is a group by itself. We do not currently know why Chapingo populations show a different pattern; these plots were not inoculated and received various fungicide treatments.

      Tlaxcala is a state east of central Mexico. The P. infestans population in this state has been reported to reproduce sexually based on 12 microsatellite loci (Wang et al. 2017). However, our analyses of the Tlaxcala populations based on 11 microsatellite loci suggest clonal reproduction. We did not include locus D13 in our analyses because of the high percentage of missing data. We checked for a putative effect of missing data in Wang et al. (2017) and found that inclusion of locus D13 (high percentage of missing data) actually suggests sexual reproduction (Supplementary Fig. S3).

      We detected population differentiation based on pairwise FST values. Small FST values can be observed due to recent common ancestry or because of ongoing migration. Different genetic markers will result in different estimates of FST. A high FST value between Michoacán and the other populations suggests limited or no gene flow between these regions. One-way migration has been reported from Michoacán to Toluca valley in Wang et al. (2017).

      We expected to see isolation by distance for the populations sampled on a predominant east to west gradient. However, with the exception of the Michoacán population, no isolation by distance was apparent. FST values did not show a correlation with distance except when Michoacán was included. This indicates that more sampling is necessary between Michoacán and other regions to test for spatial correlation.

      Our understanding of variation in P. infestans ploidy is limited. However, a recent study by Li et al. (2017) suggested that sexually reproducing P. infestans might be diploid while clonal lineages might be triploid. In this study, we only scored P. infestans isolates as being diploid for SSR markers. However, in genotyping of these isolates, we did notice some apparently trisomic or even tetrasomic loci for strains that were excluded from our analysis. Eighty percent of isolates genotyped in 2015–16 from Mexico were scored as strictly diploid. A similar pattern of diploidy and trisomy was also reported by Wang et al. (2017). Because the pattern of inheritance of SSR markers is complex, this poses a distinct challenge in performing genetic analyses for populations that are of varying ploidy levels. We chose to eliminate nondiploid strains from our analysis, but other paths can be considered (Grünwald et al. 2017).

      We studied the population structure of P. infestans across a gradient from east to west within central Mexico. The population structure observed for P. infestans in central Mexico suggests gene flow and sexual reproduction beyond the Toluca valley. We observed high differentiation at the western edge of the populations sampled, while on the eastern edge, further sampling is indicated. This work provides a finer understanding of gradients of genetic diversity in populations of P. infestans at the center of origin.

      Acknowledgments


      We thank Zhian Kamvar for helping with R, Karan Fairchild for maintaining the cultures, and Val Fieland and Ciera Gray for helping with DNA extraction. We thank the Center for Genome Research and Biocomputing (CGRB) at Oregon State University for fragment analysis.

      Funding: USDA National Institute of Food and Agriculture (2011-68004-30154), USDA Agricultural Research Service (2072-2072-22000-041-00-D).

      Mention of trade names or commercial products in this manuscript is solely for the purpose of providing specific information and does not imply recommendation or endorsement.