Population Biology

Transcriptome-Derived Amplicon Sequencing Markers Elucidate the U.S. Podosphaera macularis Population Structure Across Feral and Commercial Plantings of Humulus lupulus

    Authors and Affiliations
    • William A. Weldon (1)
    • Brian J. Knaus (2)
    • Niklaus J. Grünwald (3)
    • Joshua S. Havill (4)
    • Mary H. Block (2)
    • David H. Gent (5)
    • Lance E. Cadle-Davidson (1, 6)
    • David M. Gadoury (1)
    1. Section of Plant Pathology and Plant-Microbe Biology, Cornell AgriTech, Cornell University, Geneva, NY 14456
    2. Department of Botany and Plant Pathology, Corvallis, OR 97331
    3. U.S. Department of Agriculture-Agricultural Research Service Horticultural Crops Research Unit, Corvallis, OR 97330
    4. Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108
    5. U.S. Department of Agriculture-Agricultural Research Service Forage Seed and Cereal Research Unit, Corvallis, OR 97331
    6. U.S. Department of Agriculture-Agricultural Research Service Grape Genetics Research Unit, Geneva, NY 14456

    Abstract

    Obligately biotrophic plant pathogens pose challenges in population genetic studies due to their genomic complexities and elaborate culturing requirements with limited biomass. Hop powdery mildew (Podosphaera macularis) is an obligately biotrophic ascomycete that threatens sustainable hop production. P. macularis populations of the Pacific Northwest (PNW) United States differ from those of the Midwest and Northeastern United States, lacking one of two mating types needed for sexual recombination and harboring two strains that are differentially aggressive on the cultivar Cascade and able to overcome the Humulus lupulus R-gene R6 (V6), respectively. To develop a high-throughput marker platform for tracking the flow of genotypes across the United States and internationally, we used an existing transcriptome of diverse P. macularis isolates to design a multiplex of 54 amplicon sequencing markers, validated across a panel of 391 U.S. samples and 123 international samples. The results suggest that P. macularis from U.S. commercial hop yards form one population closely related to P. macularis of the United Kingdom, while P. macularis from U.S. feral hop locations grouped with P. macularis of Eastern Europe. Included in this multiplex was a marker that successfully tracked V6-virulence in 65 of 66 samples with a confirmed V6-phenotype. A new qPCR assay for high-throughput genotyping of P. macularis mating type generated the highest resolution distribution map of P. macularis mating type to date. Together, these genotyping strategies enable the high-throughput and inexpensive tracking of pathogen spread among geographical regions from single-colony samples and provide a roadmap to develop markers for other obligate biotrophs.

    Pathogen population structure can inform decisions on disease management, such as the timing and efficacy of fungicides (Mengistu et al. 2020; Villani et al. 2016) or the deployment of resistant host varieties (Jones et al. 2014; Moreira et al. 2019; Parada-Rojas and Quesada-Ocampo 2019). With understanding and monitoring, actions are possible to mitigate the expansion of a pathogen group (Ali et al. 2014), prolong the longevity of fungicides and host resistance (Gent et al. 2017; Lichtner et al. 2020; Wolfenbarger et al. 2016), or select targets for future breeding programs (Teh et al. 2017). Surveys dependent on molecular tools have quantified the dynamics of Phytophthora infestans clonal lineages worldwide, which has enabled the local, high-resolution tracking of isolate populations in each growing season (Dey et al. 2018; Forbes et al. 1998; Fry et al. 2015; Hansen et al. 2016a, b; Knaus et al. 2016). The advent of molecular tools characterizing fine-scale population structure also facilitated the quarantine efforts enacted toward Pyricularia oryzae in Bangladesh and surrounding Middle Eastern countries in response to the pathogen’s expansion of geographic range in 2016 (Ceresini et al. 2019; Islam et al. 2016).

    Powdery mildew fungi are among the most widespread and problematic plant pathogens in modern agriculture (Dean et al. 2012). Comprising hundreds of distinct, often but not always host-specific, species (Gadoury and Pearson 1991; Weldon et al. 2020), they are collectively capable of infecting thousands of monocot and dicot plants (Yarwood 1957). Powdery mildew fungi cost growers billions of dollars in yield losses each year (Braun 2002; Sambucci et al. 2014). These polycyclic fungi reproduce rapidly, are wind-dispersed, and can germinate in the absence of water, and are thus not limited by sparse rainfall. Indeed, they reach their greatest potential within controlled environments (Janisiewicz et al. 2016; Rossi et al. 2020). They have both the potential to spread rapidly across geographic regions and a propensity to adapt to fungicides and host resistance genes in many crops (Gadoury et al. 2012; Gent et al. 2019a; Heyden and Lefebvre 2014; Peetz et al. 2009).

    Podosphaera macularis, the causal agent of powdery mildew of hop (Fig. 1), is no different. The pathogen has been endemic to England since at least the 1700s (Neve 1991; Royle 1978) and in eastern North America since the early 1800s (Blodgett 1915). The Pacific Northwest (PNW) region of North America, where over 98% of U.S. hop production occurs, remained free of the pathogen until 1996 (Ocamb et al. 1999). Within 2 years of the first reports of the pathogen’s arrival, the disease had spread throughout all hop producing areas of the PNW region (Gent 2008). The pathogen population in the PNW has been differentiated into three distinct virulence groups: (i) one that was prevalent prior to 2012, (ii) one differentially aggressive on cultivar Cascade (Cascade-adapted), and (iii) a group capable of overcoming host resistance conditioned by the R-gene R6 (V6), as found in cultivar Nugget (Gent et al. 2017, 2020; Wolfenbarger et al. 2016).


    Fig. 1. Humulus lupulus ‘Symphony’ stem parasitized by Podosphaera macularis, which has produced an extensive number of vegetative hyphae that run parallel to the host cuticle, and erected hundreds of conidiophores that bear chains of infectious conidia.


    Throughout the Midwest United States and Northeastern United States, hop production has reemerged to complement a rapidly expanding craft brewing industry and a consequent demand for locally sourced hops (Hop Growers of America 2018). Recent findings indicate that P. macularis populations found within Midwest and Northeastern U.S. commercial hop yards may be genetically distinct from those on feral hop in the same regions (Gent et al. 2020). Isolates possessing V6-virulence have not been confirmed in the United States outside of the PNW (Wolfenbarger et al. 2016). Only the MAT1-1 mating-type idiomorph has been reported from within any U.S. commercial hop plantings in the western United States, and the ascigerous state has not been observed west of the Rocky Mountains in North America. Both mating-type idiomorphs have been detected in an approximate 1:1 ratio on feral hop sampled from a small number of locations in the Midwest and Northeastern United States (Wolfenbarger et al. 2015). The distribution of virulence groups and partitioning and isolation of sexual reproduction are both relevant to sustainable U.S. hop production and also represent a perhaps unique spatial diversity of pathogen and host population characteristics (e.g., asexual versus sexual, commercial versus feral, and virulence versus avirulence) to which molecular methods could be deployed in the study and management of population diversity.

    The ability to monitor a pathogen’s population structure relies upon two related, but distinct processes: (i) the identification of genetic variation within a population that correlates with phenotypes of interest; and (ii) the conversion of this observed genetic variance into a molecular marker system that can genotype future collections of the pathogen population in a timely, cost-effective, and high-throughput manner. Historically, developing comprehensive molecular marker libraries for obligately biotrophic pathogen groups such as the powdery mildew fungi has presented certain limitations. The highly expanded, repetitive genomes and obligately biotrophic growth behavior of powdery mildews make the assembly of even a single, high-confidence genome complicated. As such, genomes are currently unavailable for most powdery mildew species (Jones et al. 2014; Wicker et al. 2013), particularly given the amount of time, effort, and plant material needed to culture and extract a sufficiently large quantity of DNA from these obligate biotrophs. Because of this bottleneck, transcriptomic approaches have been used as an alternative sequencing methodology that reduces genome complexity, identifies sources of genetic variation exclusively within expressed gene sequences, and reduces the amount of biological material required for nucleic acid extraction (Gent et al. 2020; Rahman et al. 2019; Tollenaere et al. 2012; Vela-Corcía et al. 2016). It is important to note that due to the obligately biotrophic nature of their growth, all sequencing approaches for obligate biotrophs come with the inherent disadvantage that in addition to the target pathogen, sequence data will be returned for the host and any other epiphytic microorganisms present in the culture, which must be carefully filtered out prior to any alignment. While a transcriptomic approach reasonably resolves the hurdle of identifying genetic variation within obligate biotrophs, the need to convert this variation into a robust molecular marker genotyping platform remains.

    Amplicon sequencing (AmpSeq) is a recently developed genotyping methodology that satisfies the cost, limited biomass, and high-throughput-capacity needs of an effective molecular marker platform (Yang et al. 2015, 2016; Zou et al. 2020). The genotyping pipeline was originally optimized to function within a plant-breeding framework, but more recently has been demonstrated to show promise in genotyping of obligately biotrophic plant pathogenic fungal populations with low quantities of DNA (Kisselstein et al. 2017). In short, AmpSeq is a low-cost marker system for genotyping up to 2,000 loci per sample in an initial multiplexed PCR-1 (Yang et al. 2016). This reaction is then followed by a second PCR on the amplicons of PCR-1 in order to add linker sequences that uniquely barcode samples by their well location within dozens of 96-well plates. The barcoding enables all targeted loci of thousands of samples to be pooled and sequenced, often within a single Illumina sequencing lane, depending on read depth needs, typically reducing the sequencing costs to near $1 per sample. When longer amplicons are sequenced (typically up to 280 bp for Illumina 2 × 150 bp sequencing), variance data are returned not only at the specific nucleotide position for which the given marker was originally designed, but also for all other nucleotides within the amplicon, resulting in a marker with higher information content than a single SNP (Zou et al. 2020). Thus, the AmpSeq platform may identify new, potentially phenotypically relevant variants when genotyping a population.

    Application of an AmpSeq-based marker system would be extremely useful for obligately biotrophic organisms such as powdery mildew fungi because of the high costs of maintaining live cultures and phenotyping hundreds of isolates each year. P. macularis is subject to these constraints, as well as the additional current limitation that an assembled, whole genome sequence is not available. A recent study by Gent et al. (2020) characterized variation within the transcriptomic assemblies of 104 diverse P. macularis isolates in order to discern a likely European origin for the clonal P. macularis population of the PNW United States, which first arrived in the region in 1996 (Ocamb et al. 1999). We saw the existence of this published dataset as an opportunity to repurpose transcriptomic sequence data for the creation of AmpSeq markers capable of monitoring changes in P. macularis population structure over time. Due to a very likely expanded and highly repetitive genome structure, P. macularis serves as a highly complex model that, if successful, opens the door for AmpSeq methodology across many organisms, which would be especially useful in obligately biotrophic pathosystems. As such, our stated objective was to employ a pre-existing P. macularis transcriptome dataset to develop a library of AmpSeq markers that is capable of genotyping the U.S. P. macularis population across a range of relevant phenotypic parameters including geographic origin, mating type, and virulence toward the widely deployed H. lupulus R-gene R6.

    MATERIALS AND METHODS

    Generating AmpSeq local haplotype markers that characterize P. macularis population structure.

    RNA extraction, sequencing, and reference assembly. For this study, a previously assembled transcriptome and resequencing data library (Gent et al. 2020) were used to design SNP markers. Briefly, 104 P. macularis isolates were collected from the Pacific Northwest (PNW) (pre-2012, V6-virulent, and Cascade-adapted strains), U.S. regions east of the Rocky Mountains, the United Kingdom, and continental Europe. P. macularis arrived in the PNW U.S. region in 1996, and two uniquely virulent strains of the pathogen have emerged within this population since 2012: one strain that has a measurably enhanced ability to infect and grow on the hop cultivar Cascade (Gent et al. 2017), and another that overcomes the R6-based host resistance in widely planted hop cultivars including ‘Nugget’, ‘Mt. Hood’, and ‘TriplePearl’ (Wolfenbarger et al. 2016). In addition to diversity in geographic origin and virulence profile of samples, this sampling scheme captured the diversity in P. macularis samples originating from both commercially cultivated and feral hop plants. Paired-end 2 × 150 bp sequencing was performed on an Illumina HiSeq 3000 instrument at the Oregon State University Center for Genome Research and Biocomputing. Sequence reads for 103 P. macularis isolates were independently quality filtered and mapped to a de novo reference transcriptome based on the pre-2012 PNW P. macularis isolate HPM-663 (Gent et al. 2020).

    Population level SNP variant calling and generation of AmpSeq primers.

    As detailed in Gent et al. (2020), reads of the 103 nonreference P. macularis isolates were mapped to the HPM-663 reference transcriptome assembly using bwa 0.7.10 (Li and Durbin 2009) and duplicate reads were marked and filtered out using the Picard toolkit 2.5.0 (Broad Institute and GitHub Repository 2019). The ensuing AmpSeq marker design pipeline is summarized in Supplementary Figure S1. Using the GATK 3.5 HaplotypeCaller (Poplin et al. 2018), variants including both single nucleotide polymorphisms (SNPs) and insertion-deletion mutations (INDELS) were called from resulting bam files that had been converted to a genomic variant call format. The resulting library was filtered manually for variants that fell between the 1st and 8th deciles in read depth. Of these remaining variants, a second filtering step returned only those that were SNPs for which the alternate allele was present in at least two of the 104 P. macularis assemblies. These remaining variants were then converted to a FASTA sequence file with requirements that the sequence encompassing each variant must be at least 240 base pairs (bp) in length, ideally 260 bp, and no more than 500 bp; and the SNP variant must be at least 30 bp from the end of the sequence. Sequences housing multiple SNPs were permitted when these SNPs adhered to the aforementioned specifications. This FASTA file of P. macularis SNP variant sequences was then input into BatchPrimer3 (Untergasser et al. 2012) to obtain desirable polymerase chain reaction (PCR) primers for each sequence. Within BatchPrimer3, we again required that forward and reverse primers be located at least 30 bp away from the SNP variant position and added requirements that amplicons be between 220 and 260 bp long and that returned primers be 18 to 25 bp long with annealing temperatures of 57 to 63°C. All other parameters were kept as default. 
In addition to the primers designed around these polymorphisms, one additional locus was added as a genotyping target that included the SNP correlating with presence/absence of the V6-virulence phenotype (Block et al. 2020). Sequences for which forward and reverse primers were successfully designed by BatchPrimer3 were then modified for use in an AmpSeq platform, as described by Yang et al. (2016). An AmpSeq universal linker sequence was added to the 5′ end of each BatchPrimer3-generated, locus-specific primer sequence: 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3′ for each forward primer, or 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3′ for each reverse primer.
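    The template rules and the linker-tailing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the 0-based SNP coordinate convention are hypothetical, and the sketch does not reproduce BatchPrimer3 or the published AmpSeq scripts. The two linker sequences are the ones quoted in the text.

```python
# Universal AmpSeq linker sequences quoted in the text (5'->3').
FWD_LINKER = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REV_LINKER = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

MIN_FLANK = 30               # SNP must sit at least 30 bp from either end
MIN_LEN, MAX_LEN = 240, 500  # allowed template length around each variant


def template_is_usable(seq_len, snp_positions):
    """Check the length and SNP-flank rules for one candidate template.

    snp_positions are 0-based offsets into the template (an assumed
    convention, not stated in the source).
    """
    if not (MIN_LEN <= seq_len <= MAX_LEN):
        return False
    # pos bases precede the SNP and (seq_len - 1 - pos) bases follow it.
    return all(MIN_FLANK <= pos <= seq_len - 1 - MIN_FLANK
               for pos in snp_positions)


def add_linkers(forward_primer, reverse_primer):
    """Prepend the universal linkers to the 5' end of each locus primer."""
    return (FWD_LINKER + forward_primer.upper(),
            REV_LINKER + reverse_primer.upper())


# Example with made-up primer sequences (not from the marker library).
fwd, rev = add_linkers("acgtacgtacgtacgtac", "TTGGCCAATTGGCCAATT")
```

    Templates carrying several SNPs pass the check only when every SNP satisfies the flank rule, which mirrors the statement that multi-SNP sequences were permitted when all SNPs met the specifications.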

    Collection of a diverse P. macularis isolate DNA library.

    In order to assess the AmpSeq markers’ ability to differentiate P. macularis samples based on geographic origin, hop cultivation type, or V6-virulence phenotype, we assembled an extensive, novel collection of 514 P. macularis DNA samples for genotyping (Supplementary Table S1). These samples were similarly diverse in their geographic origin, hop cultivation type, and V6-virulence phenotype to the original P. macularis used for AmpSeq marker design. A subset (n = 61) of the collection had also been manually phenotyped for V6-virulence on differential ‘Nugget’. A hierarchical sampling approach was taken, in which a minimum of five distinct P. macularis samples were collected from each hop sampling location and a minimum of three hop plantings were sampled from each geographic location. All samples were collected using 1-cm Tough-Spot labeling stickers (Diversified BioTech, Dedham, MA) as an adhesive to peel mycelium and conidia of P. macularis colonies from the hop leaf. To reduce the likelihood of there being multiple genotypes within a given sample, a single colony with distinct margins was treated as the experimental sampling unit, which is a common approach across many pathosystems for which pathogen spore production breaches the host cuticle (Dey et al. 2018; Gobbin et al. 2006; Justesen et al. 2002; Kisselstein et al. 2018; Lees et al. 2006; Wallace et al. 2020; Wolfenbarger et al. 2015). The sticker samples were then placed into 2-ml microcentrifuge tubes for storage and subsequent DNA extraction. Collaborators shipped the collected P. macularis samples in padded envelopes via express mail at ambient temperature. Upon arrival, samples were stored at −20°C until processing for DNA extraction. 
Sample tubes were placed on a metal platform over liquid nitrogen to maintain freezing conditions and supplemented with two stainless steel beads (SPEX sample prep item number 2150, Metuchen, NJ) and approximately 30 0.4-mm glass beads (Scientific Industries, Inc., Bohemia, NY), which were then flash frozen in liquid nitrogen and ground using a GenoGrinder (SPEX SamplePrep 2000 Geno/Grinder, Metuchen, NJ) for three cycles of 30 s at 300 strikes/min. DNA was extracted via a modified CTAB protocol, using 24:1 chloroform/isoamyl alcohol as the solution to separate organic soluble molecules from the DNA-containing liquid phase (Healey et al. 2014).

    AmpSeq and haplotype calling.

    A total of 575 samples, including no-template controls, technical replicates, and biological replicates, were submitted for AmpSeq at the Cornell University Biotechnology Resource Center as previously described by Yang et al. (2016), but with the following modifications. During locus-specific amplification with universal adapters in PCR-1, a touchdown PCR program was adopted which consisted of a 10-min denaturation step at 95°C, followed by 10 cycles of denaturation at 94°C for 30 s, primer annealing starting at 62°C for 30 s for the first cycle and decreasing by 1°C in each ensuing cycle, and extension at 72°C for 1 min. The 10 touchdown PCR cycles were then followed by 24 cycles of 94°C for 30 s, 56°C for 30 s, and 72°C for 1 min. The reaction was completed with a postextension incubation at 72°C for 7 min. This touchdown PCR modification was adopted because an initial run using the default AmpSeq PCR-1 annealing temperature of 62°C for every cycle yielded poor amplification across the marker library. After PCR-2, the indexed PCR products were pooled, cleaned with Agencourt AMPure beads, quantified, and sequenced on an Illumina NextSeq 500 (2 × 150 bp) sequencer (Illumina, San Diego, CA). Raw sequence reads were then processed through a previously described, custom AmpSeq haplotyping pipeline (Fresnedo-Ramírez et al. 2017). Parameters were set to minimize sequencing errors by requiring a minimum of five samples per haplotype, a maximum of 10 unique haplotypes per sample in the first pass, and a maximum read count ratio of five between the two reported alleles for a given sample. All other input parameters were set to default. The raw sequencing reads have been deposited in the National Center for Biotechnology Information Sequence Read Archive (SRA) and are accessible through BioProject ID PRJNA638926.
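    As an illustration of the PCR-1 program, the annealing temperature for each amplification cycle can be listed with a short Python sketch. It assumes the first touchdown cycle anneals at 62°C and each of the following nine cycles drops by 1°C; the function name and defaults are hypothetical and only restate the values given above.

```python
def touchdown_annealing_schedule(start_temp=62.0, step=1.0,
                                 touchdown_cycles=10,
                                 plateau_temp=56.0, plateau_cycles=24):
    """Return the annealing temperature (deg C) for every PCR-1 cycle."""
    # Touchdown phase: start_temp in cycle 1, lowered by `step` each cycle.
    touchdown = [start_temp - step * i for i in range(touchdown_cycles)]
    # Plateau phase: constant annealing temperature.
    plateau = [plateau_temp] * plateau_cycles
    return touchdown + plateau


schedule = touchdown_annealing_schedule()
print(len(schedule))    # total number of amplification cycles
print(schedule[:10])    # annealing temperatures of the touchdown phase
```

    Under this reading the touchdown phase ends below the 56°C plateau temperature, which is what the stated program implies.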

    Quality control and filtering of sequencing haplotype outputs.

    The haplotype file output from the AmpSeq haplotype analysis pipeline was subjected to four steps of additional quality control and filtering prior to any downstream analyses. Step 1 filtered P. macularis samples for those with at least 60% of loci having returned sequence data. Step 2 filtered markers with missing data, to remove loci that returned sequence data in fewer than 80% of P. macularis DNA samples. Step 3 removed inconsistent markers by requiring at least five out of the eight biological and technical replicate P. macularis DNA samples to have returned matching haplotype data. Only the technical/biological replicates for which both P. macularis DNA samples survived filtering steps 1 and 2 were considered for use in filtering step 3. Step 4 removed heterozygous markers that were likely either the result of admixed genotypes in the P. macularis colony sampled or were paralogous sequences within the P. macularis transcriptome by requiring markers to have returned heterozygous haplotypes in 10% or fewer of the remaining P. macularis DNA samples.
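    Filter steps 1, 2, and 4 can be sketched as follows. The data layout (a dictionary of per-sample calls in which a two-element tuple marks a heterozygous call) is an assumption made for illustration and is not the actual output format of the haplotyping pipeline. Step 3 is omitted because it depends on replicate-pairing metadata not described here.

```python
def filter_haplotype_calls(calls, min_sample_rate=0.60,
                           min_locus_rate=0.80, max_het_rate=0.10):
    """Apply filter steps 1, 2, and 4 to a table of haplotype calls.

    calls maps sample -> {locus: call}; a call is None when no data were
    returned, otherwise a tuple of haplotype IDs (two IDs = heterozygous).
    This layout is hypothetical, not the pipeline's real file format.
    """
    loci = sorted({locus for row in calls.values() for locus in row})
    if not loci:
        return [], []

    def has_data(sample, locus):
        return calls[sample].get(locus) is not None

    # Step 1: keep samples with data at >= 60% of loci.
    samples = [s for s in calls
               if sum(has_data(s, l) for l in loci) / len(loci)
               >= min_sample_rate]
    if not samples:
        return [], []

    # Step 2: keep loci with data in >= 80% of the remaining samples.
    loci = [l for l in loci
            if sum(has_data(s, l) for s in samples) / len(samples)
            >= min_locus_rate]

    # Step 4: keep loci heterozygous in <= 10% of the remaining samples.
    def het_rate(locus):
        het = sum(len(calls[s].get(locus) or ()) > 1 for s in samples)
        return het / len(samples)

    loci = [l for l in loci if het_rate(l) <= max_het_rate]
    return samples, loci
```

    The order of the steps matters: locus call rates and heterozygosity are computed only over samples that survived step 1, as in the description above.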

    Global structure of hop powdery mildew populations.

    In order to assess the performance of the SNP marker library in accurately genotyping the global population structure of P. macularis, a classical multidimensional scaling (MDS) approach, also referred to as a principal coordinate analysis (PCoA), was utilized. The VCF datafile was read into TASSEL5 (Bradbury et al. 2007) and converted into a square distance matrix based on the cumulative haplotype profile of all samples across all loci, calculated as 1 – identity by state (IBS) similarity, with IBS defined as the probability that alleles drawn at random from two individuals at the same locus are the same. This distance matrix file was then read into R, where vegdist() (Oksanen et al. 2019) was used to convert the dataset into a Euclidean dissimilarity index format, cmdscale() (R Core Team 2017) to run a classical multidimensional scaling of the Euclidean dissimilarity index, and ggplot2 (Wickham 2016) to visualize scatter plots of the first and second principal coordinate values. The PCoA calculations were paired with a permutational multivariate analysis of variance (PERMANOVA) using distance matrices (n = 1,000 permutations) to discern significant differences between the grouping of sample phenotype metadata groups, including geographic origin, hop planting type, and sample V6-virulence. With the TASSEL5-derived distance matrix as the input, the R function adonis2() was used to run a PERMANOVA calculation and the function pairwise.adonis() to calculate pairwise comparisons and assign levels of statistical significance (Oksanen et al. 2019).
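    The two core calculations, a 1 – IBS distance and a permutation test on a distance matrix, can be sketched in plain Python. This is a simplified illustration and not a reimplementation of TASSEL5 or of the vegan functions: it assumes one haplotype call per sample per locus (haploid data), skips loci with missing data in either sample, and uses the standard PERMANOVA pseudo-F statistic. All function names are hypothetical.

```python
import random
from itertools import combinations


def ibs_distance(a, b):
    """1 - IBS for two haploid haplotype profiles (None = missing)."""
    shared = [(x, y) for x, y in zip(a, b)
              if x is not None and y is not None]
    if not shared:
        raise ValueError("no loci with data in both samples")
    return 1.0 - sum(x == y for x, y in shared) / len(shared)


def distance_matrix(profiles):
    """Square, symmetric matrix of pairwise 1 - IBS distances."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = ibs_distance(profiles[i], profiles[j])
    return d


def pseudo_f(d, groups):
    """PERMANOVA pseudo-F from squared distances and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(d[i][j] ** 2
                   for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        members = [i for i, lab in enumerate(groups) if lab == g]
        ss_within += sum(d[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    if ss_within == 0:
        return float("inf")
    a = len(labels)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(d, groups, permutations=1000, seed=0):
    """Return (observed pseudo-F, permutation P value)."""
    rng = random.Random(seed)
    observed = pseudo_f(d, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # permute labels, keep distances fixed
        if pseudo_f(d, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

    Permuting the group labels while holding the distance matrix fixed is what makes the test distribution-free; the P value is the fraction of permutations whose pseudo-F is at least as large as the observed one.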

    Continental distribution of P. macularis mating-type idiomorphs.

    Design and validation of real-time quantitative PCR (qPCR) HPM mating assay. AmpSeq markers targeting the MAT1-1 and MAT1-2 loci were included in preliminary runs of the genotyping pipeline but failed to return acceptable haplotype and sequence data (data not shown). As such, the decision was made to exclude the mating-type locus markers from the AmpSeq library and instead design qPCR primers for genotyping of mating type. The biological objective was to provide a thorough description of the distribution of P. macularis mating-type idiomorphs across the United States, as the MAT1-2 idiomorph has yet to be reported in the PNW and would have major implications on disease management should it arrive. The technical objective was to redesign agarose gel PCR markers around the mating-type loci reported in Wolfenbarger et al. (2015) as a presence/absence qPCR assay in which both mating-type idiomorphs could be determined in multiplex, be robust across P. macularis samples of varied geographic origin, and be more efficient when scaled up to processing hundreds of P. macularis samples. The qPCR primer sets (PCR primers and fluorescent probe primer sequences) were designed and ordered from Integrated DNA Technologies (Coralville, IA) using their PrimerQuest primer design tool for a standard qPCR assay. Mating-type locus sequences for primer design were downloaded from P. macularis isolate HPM-175 sequence (GenBank KJ922755.1) for the MAT1-1-1 locus and from P. macularis isolate HPM-200 sequence (GenBank KJ741396.1) for the MAT1-2-1 locus. The qPCR primer set sequences are summarized in Supplementary Table S2. Reactions were conducted in a Bio-Rad CFX96 Touch Real-time PCR Detection System (Bio-Rad, Hercules, CA), using a 25 μl reaction volume with PrimeTime Gene Expression Master Mix (2×) (Integrated DNA Technologies) and 2 μl of sample template DNA. 
Cycling conditions were set at the default recommendation of a single cycle of polymerase activation for 3 min at 95°C, followed by 40 amplification cycles containing a denaturation step at 95°C for 15 s and an annealing/extension step at 60°C for 1 min. A sample was considered to have returned a positive result for a given mating-type locus if the quantitation cycle (Cq) value (for which the fluorescence threshold was surpassed) was within seven cycles of the positive control sample Cq value during that specific reaction run, excluding samples that failed to surpass the CFX Maestro analysis software (Bio-Rad) minimum fluorescence threshold. Specificity of the assay was determined using a validation subset of 16 independent P. macularis samples of diverse geographic origin (Bustin et al. 2009). The mating type of these samples was first determined using the published P. macularis mating-type primers from Wolfenbarger et al. (2015). These mating-type designations were then compared with those returned in the qPCR assay as a means of marker validation.
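    The presence/absence scoring rule can be written as a small Python sketch. It assumes that "within seven cycles" means the sample Cq is no more than seven cycles later than the positive control Cq from the same run, and that a sample which never crossed the fluorescence threshold is represented by None. Both choices, and the function names, are assumptions made for illustration.

```python
def locus_detected(sample_cq, control_cq, max_delta=7.0):
    """Score one mating-type locus as present (True) or absent (False).

    sample_cq is None when fluorescence never crossed the threshold.
    Assumption: a sample is positive when its Cq is at most max_delta
    cycles later than the positive control Cq of the same run.
    """
    if sample_cq is None:
        return False
    return sample_cq - control_cq <= max_delta


def call_mating_type(mat1_1_cq, mat1_2_cq, control_1_1_cq, control_1_2_cq):
    """Combine both loci of the multiplex assay into a single call."""
    has_1 = locus_detected(mat1_1_cq, control_1_1_cq)
    has_2 = locus_detected(mat1_2_cq, control_1_2_cq)
    if has_1 and has_2:
        return "MAT1-1 and MAT1-2"  # possible mixed colony sample
    if has_1:
        return "MAT1-1"
    if has_2:
        return "MAT1-2"
    return "no call"
```

    Scoring each locus against the control from the same run keeps the rule independent of run-to-run shifts in absolute Cq values.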

    Screening the continental populations of P. macularis for mating-type idiomorphs.

    The same library of diverse P. macularis samples genotyped with AmpSeq was surveyed for mating type using the novel qPCR mating-type idiomorph assay described above. However, the assay was run on only the 320 P. macularis samples that survived all postsequencing filtering steps, which indicated DNA of sufficient quantity and quality. These 320 samples were supplemented with 177 additional P. macularis samples that were either collected in years prior to 2018 or were collected at a location where fewer than five distinct P. macularis samples were submitted. This disqualified them from being included in the AmpSeq marker project, but these samples were still valuable for the purpose of describing mating-type distribution across hop producing regions. A one proportion Z test with continuity correction using the R function prop.test() (R Core Team 2017) was conducted on the observed P. macularis mating-type ratios to test for adherence to a 1:1 distribution of the MAT1-1 and MAT1-2 mating-type idiomorphs within a geographic region.
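    For readers without R at hand, the one-proportion test with continuity correction can be approximated in Python using only the standard library. The sketch follows the chi-square form with a Yates correction that prop.test() uses for a single proportion; the function name is hypothetical and the counts in the example are invented, not taken from the study.

```python
import math


def one_proportion_test(successes, n, p0=0.5):
    """Continuity-corrected test of an observed proportion against p0.

    Returns (chi-square statistic with 1 df, two-sided P value).
    """
    if n <= 0 or not 0 < p0 < 1:
        raise ValueError("need n > 0 and 0 < p0 < 1")
    deviation = abs(successes - n * p0)
    correction = min(0.5, deviation)  # Yates continuity correction
    statistic = (deviation - correction) ** 2 / (n * p0 * (1 - p0))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Invented example: 60 MAT1-1 calls among 100 samples, tested against 1:1.
stat, p = one_proportion_test(60, 100)
```

    With a 1:1 null hypothesis (p0 = 0.5), a large P value indicates that the observed idiomorph counts are consistent with equal frequencies of MAT1-1 and MAT1-2.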

    RESULTS

    Generating AmpSeq haplotype markers that characterize P. macularis population structure.

    Population level SNP variant calling and generation of AmpSeq primers.

    When transcriptome sequence reads for all 103 P. macularis samples were aligned to the de novo reference transcriptome of P. macularis isolate HPM-663, 142,623 variants were returned, which included both single nucleotide polymorphisms (SNPs) and insertion/deletion (INDEL) mutations (Supplementary Fig. S1). When these variants were filtered for only those that fell between the second and eighth read depth deciles, 21,407 variants remained. After filtering to keep only SNP variants with the alternate allele present in at least two of the 104 P. macularis isolate transcriptomes, 330 variants across 196 loci remained. Of the 196 loci, 143 contained a single SNP variant, 24 housed two SNPs within the locus, and the final 29 loci contained three or more SNPs. We targeted SNPs present in at least two samples to eliminate sequencing errors and increase likely relevance to different collections of P. macularis. BatchPrimer3 returned PCR primer sets for 168 of the 196 loci. Two of the 168 PCR primer sets were removed during a percent identity matrix analysis in an attempt to limit the likelihood of any primer-primer interactions, resulting in a final 166 PCR primer sets for addition of the AmpSeq linker sequences.
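    The two variant filters reported above can be sketched in Python. The record layout (dictionaries with depth, is_snp, and alt_samples keys) is hypothetical, and the decile bounds are left as parameters because the exact cut points used in the manual filtering are an interpretation here. Decile cut points are computed with the standard library's default (exclusive) quantile method, which may differ slightly from the method the authors used.

```python
import statistics


def filter_variants(variants, low_decile=2, high_decile=8,
                    min_alt_samples=2):
    """Keep SNPs of moderate read depth seen in enough isolates.

    variants: list of dicts with keys 'depth' (read depth), 'is_snp'
    (False for INDELs), and 'alt_samples' (number of isolates carrying
    the alternate allele). This layout is assumed for illustration.
    """
    depths = [v["depth"] for v in variants]
    cuts = statistics.quantiles(depths, n=10)  # nine decile cut points
    low, high = cuts[low_decile - 1], cuts[high_decile - 1]

    # Filter 1: read depth between the chosen deciles.
    in_range = [v for v in variants if low <= v["depth"] <= high]

    # Filter 2: SNPs only, alternate allele in >= min_alt_samples isolates.
    return [v for v in in_range
            if v["is_snp"] and v["alt_samples"] >= min_alt_samples]
```

    Trimming both tails of the depth distribution removes poorly supported variants as well as very deep ones, and the alternate-allele requirement guards against singletons caused by sequencing error.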

    Quality control and filtering of sequencing haplotype outputs.

    The post-Illumina AmpSeq filtering steps are summarized in Supplementary Figure S2. The original sequencing submission contained 575 P. macularis DNA samples (including biological replicates, technical replicates, and water controls). The amplicon.py AmpSeq data processing and haplotype calling pipeline returned haplotype data for all 575 P. macularis samples, with 151 of the 166 SNP markers having successfully returned some data. Quality filtering resulted in a final data set of 320 P. macularis samples (56% of the original, Table 1) and 54 AmpSeq local haplotype markers. Filter step 4, which removed marker loci that exceeded a 10% heterozygosity level across the 320 P. macularis samples, removed the greatest number of markers of all filtering steps (Supplementary Fig. S3). In this case, locus heterozygosity was defined as the percentage of the 320 P. macularis samples that returned a heterozygous haplotype for the given marker. The output haplotype matrix file, sample and primer key file, the forward and reverse primer sequences, and a file containing the major allele sequence for each of the 54 AmpSeq loci are uploaded to the GitHub repository project “Podosphaera macularis AmpSeq marker project 2020”.

    TABLE 1. Summary of phenotypic metadata for the 320 Podosphaera macularis samples passing all quality filtering parameters

    Global structure of hop powdery mildew populations.

    The first two principal coordinate eigenvalues accounted for 95.2% of the variation explained within the AmpSeq haplotype dataset (Fig. 2). The first phenotypic metadata category that we used to survey for clustering patterns within the principal coordinate analyses was geographic sampling origin (Fig. 2A). Geographic origin alone was not capable of explaining the clustering patterns observed within the U.S.-derived samples, specifically those from the Northeastern United States and Midwestern United States. The confidence ellipses encompassing the Midwest U.S.- and Northeastern U.S.-derived samples each clearly span two distinct clusters of samples. When grouped by the type of hop planting from which the sample was collected, two distinct clusters emerged (Fig. 2B). One ellipse encompassed samples primarily originating from cultivated hop yards, while the other encompassed samples derived largely from feral plantings of hop. When these two metadata phenotypes were merged into a single cumulative passport profile of each sample (Fig. 2C), the discrepancy observed in Figure 2A, in which a single confidence ellipse spanned two clear groupings for both the Midwest U.S. and Northeastern U.S. samples, was resolved. In this case, the ellipses belonging to Eastern U.S. and Midwestern U.S. samples collected from feral plantings of hop occupied an overlapping but distinct space in the upper left of the PCoA plot, which overlapped with P. macularis isolates from Eastern Europe, notably Slovenia and the Czech Republic. In contrast, the ellipses of Midwest U.S. and Eastern U.S. P. macularis samples collected exclusively from cultivated hop plants occupied a second overlapping, but distinct space in the upper right portion of the PCoA plot. These two “cultivated” U.S. groupings also overlapped in their ordination with the cultivated-hop derived samples of the Pacific Northwest United States, as well as samples derived from the United Kingdom (Fig. 2C).
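
    As an illustration of how the share of variation carried by each principal coordinate is obtained, the following is a minimal sketch of classical PCoA on Euclidean distances. It is not the pipeline used in this study: the function names, the numeric coding of haplotype profiles, and the toy data are assumptions made for the example, and power iteration stands in for a full eigendecomposition so that only the Python standard library is needed.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pcoa_explained(profiles, n_axes=2, iters=500, seed=0):
    """Return (eigenvalues, explained fractions) for the leading PCoA axes."""
    n = len(profiles)
    # Gower double-centering of -0.5 * squared pairwise distances.
    a = [[-0.5 * euclidean(p, q) ** 2 for q in profiles] for p in profiles]
    row_mean = [sum(r) / n for r in a]
    grand = sum(row_mean) / n
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
         for i in range(n)]
    total = sum(b[i][i] for i in range(n))  # trace equals the eigenvalue sum
    rng = random.Random(seed)
    eigenvalues = []
    for _ in range(n_axes):
        v = [rng.random() - 0.5 for _ in range(n)]
        lam = 0.0
        for _ in range(iters):  # power iteration for the dominant eigenpair
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variation left to extract
                lam = 0.0
                break
            v = [x / norm for x in w]
            # Rayleigh quotient of the normalized vector.
            lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                      for i in range(n))
        eigenvalues.append(lam)
        # Deflate so the next pass finds the next-largest eigenvalue.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    explained = [lam / total if total > 0 else 0.0 for lam in eigenvalues]
    return eigenvalues, explained


# Hypothetical toy profiles: four samples scored at two loci.
toy_profiles = [[0, 0], [2, 0], [0, 1], [2, 1]]
_, toy_explained = pcoa_explained(toy_profiles)
print([round(x, 3) for x in toy_explained])
```

    With real data, each profile would hold one numeric entry per haplotype call, and the summed fractions for the first two axes correspond to the kind of figure reported above.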


    Fig. 2. Principal coordinate analyses (PCoA) of Podosphaera macularis samples based upon Euclidean distances of the returned amplicon sequencing haplotype profiles. Across all three plots, confidence ellipses are only included for samples originating from the United States. A, P. macularis samples are grouped by geographic origin alone. Both Midwest and Eastern U.S. sample groups have two distinct clusters, indicating geography alone does not describe the observed ordination pattern. B, P. macularis samples grouped by the type of hop planting from which the samples were collected. Samples collected from commercial plantings and from feral hop plantings roughly group into two distinct clusters. C, P. macularis samples grouped by both geographic origin and hop planting type. Samples from Midwest and Eastern U.S. feral hop plants are distinct from samples originating from commercial plants, independent of region of origin.


    Pairwise comparisons of the P. macularis cumulative phenotype groupings, based on the PERMANOVA output, provided an additional statistical test for significant differences between sampling groups and followed the same pattern observed in the PCoA plots (Supplementary Tables S3, S4, and S5). In this case, the most interesting results are the cumulative phenotype pairings that are not significantly different from one another, which are shaded in light gray in Table 2. These results further support the distinct clustering patterns observed in the PCoA plots, where the P. macularis samples derived from U.S. cultivated hop yards grouped together and those derived from feral hop plantings throughout the United States clustered in a distinctly separate space (Supplementary Table S5).

    TABLE 2. Pairwise comparisons (adjusted P value) between Podosphaera macularis samples grouped by geographic origin and the type of hop planting from which the sample was derived, based on permutational multivariate analysis of variance

    Detecting the V6-virulence locus.

    The AmpSeq marker targeting the V6-virulence locus (Pm_2407) passed all sequence quality filtering steps described above. The major alternate allele returned in the haplotype dataset matched the expected SNP mutation associated with a V6-virulence phenotype, which was confirmed via a Clustal Omega alignment of the sequence in comparison with the wildtype allele (Supplementary Fig. S4). The V6-virulence primer data agreed with 24 out of 25 P. macularis samples that had been manually phenotyped and confirmed as possessing the V6-virulence phenotype (Table 3). Additionally, the V6 SNP locus returned no false positive genotypes, correctly assigning 41 out of 41 samples that had been manually phenotyped as non-V6-virulent.
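
    The agreement figures above can be restated as sensitivity, specificity, and overall concordance. The sketch below uses only the counts given in the text; the function name is hypothetical, and treating the single discordant V6-virulent sample as a false negative (rather than, for example, a missing call) is an assumption.

```python
def marker_concordance(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, overall agreement) for a marker."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    total = true_pos + false_neg + true_neg + false_pos
    overall = (true_pos + true_neg) / total
    return sensitivity, specificity, overall


# Counts from the text: 24 of 25 V6-virulent samples matched,
# and 41 of 41 non-V6-virulent samples matched.
sens, spec, overall = marker_concordance(24, 1, 41, 0)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} overall={overall:.3f}")
# sensitivity=0.960 specificity=1.000 overall=0.985
```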

    TABLE 3. Distribution of Podosphaera macularis V6-virulence genotypes returned for marker Pm2407

    Continental distribution of P. macularis mating-type idiomorphs.

    Design and validation of real-time qPCR HPM mating-type assay.

    Real-time quantitative PCR primer sets were successfully designed for both the MAT1-1 and MAT1-2 P. macularis mating-type idiomorphs (Supplementary Table S2). These primer sets were confirmed to function properly together in a multiplexed qPCR assay, as the qPCR-assigned mating types matched those returned in the traditional PCR-based mating-type assay (Supplementary Fig. S5). The qPCR assay was also able to genotype one additional P. macularis sample within the validation sample set that failed to amplify via traditional PCR, suggesting a possible increase in assay sensitivity.

    Screening the continental populations of P. macularis for mating-type idiomorphs.

    The mating-type idiomorphs of 497 P. macularis samples were determined using the HPM qPCR mating-type assay. Although we requested that collaborators aim to only collect ToughSpot sticker peels of a single P. macularis colony with clear, distinct margins around the entire colony, there was no guarantee that all isolates consisted of a single pure genotype. As such, we designated sample mating-type idiomorph profiles as MAT1-1, MAT1-2, or mixed based on the observed qPCR amplification curves. The observed P. macularis population mating-type distribution is summarized in Table 4 and visualized across geography in Figure 3. A one-sample test of equal proportions indicated that the ratio of P. macularis mating-type idiomorphs observed in Northeastern and Midwestern U.S. feral hop plantings did not significantly differ from a 1:1 ratio (P = 0.397), while the observed P. macularis mating-type ratio in commercial hop plantings was significantly different from 1:1 in the Northeastern/Midwestern United States and the PNW United States, but not in Europe (P = 2.2 × 10⁻¹⁶, P = 2.2 × 10⁻¹⁶, and P = 0.515, respectively).
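
    To make the 1:1 comparison concrete, the following sketch tests a mating-type count against equal proportions. It substitutes an exact two-sided binomial test for the one-sample test of equal proportions named above, so its P values will not reproduce the reported ones exactly. The function name is hypothetical, the 20-of-45 count is invented for illustration, and samples scored as mixed are assumed to be excluded before counting.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided binomial P value for k of n against a 1:1 ratio."""
    # Under p = 0.5 the distribution is symmetric, so the two-sided
    # P value is twice the smaller tail, capped at 1.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical balanced sample: 20 MAT1-1 calls out of 45 (invented counts).
print(binomial_two_sided_p(20, 45))

# Strongly skewed sample: 165 MAT1-1 calls out of 166, the commercial-yard
# count quoted in the Discussion; treating the single exception as MAT1-2
# is an assumption.
print(binomial_two_sided_p(165, 166))
```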

    TABLE 4. Mating-type idiomorph profiles of Podosphaera macularis samples collected throughout the Eastern United States, Midwest United States, Pacific Northwest United States, and Europe, as determined by a multiplexed qPCR targeting the MAT1-1-1 and MAT1-2-1 loci


    Fig. 3. Distribution of Podosphaera macularis mating-type idiomorphs sampled within A, the continental United States and B, the United Kingdom and continental Europe. For a given location, the mating type of each individual sample was determined (range of samples genotyped within a given location, n = 2 to 20) and then a cumulative mating-type profile for the location was assigned.


    DISCUSSION

    Using P. macularis as a model, we have outlined a novel methodology for repurposing transcriptomic sequence datasets to build a highly multiplexed AmpSeq variant library, which is amenable to SNP marker genotyping without the need for pathogen culturing. P. macularis belongs to a class of obligately biotrophic fungi that are difficult to culture en masse and often possess large genomes with high transposable element activity (Jones et al. 2014; Wicker et al. 2013). As such, this approach serves as a template for the generation of low-input, high-throughput AmpSeq genotyping libraries for other difficult-to-culture or obligately biotrophic fungi and oomycetes, including other powdery mildews, downy mildews, and rusts. AmpSeq reactions, which may comprise up to 2,000 different markers (or likely more) as a PCR multiplex (Zou et al. 2020), provide greater population-wide resolution than other common low-input population genotyping strategies such as the sequencing of a small collection of microsatellites (SSRs), AFLP markers, SNPs, or differentiation via more recent technologies such as high-resolution melt curve analysis (Forcelini et al. 2018; Forbes et al. 1998; Hansen et al. 2016b; Lees et al. 2006; Markell and Milus 2008; Ordóñez et al. 2019; Rafiei et al. 2018). In fact, a wide range of existing markers including SSRs, SNPs, and INDELs reported within any number of studies could be pooled into a single AmpSeq library and genotyped together as long as primers are compatible and amplicon sizes are similar. As stated previously, a transcriptomic sequence dataset was used as the source of genetic variation from which the AmpSeq SNP markers were designed. However, any existing dataset containing sequence variance information across a representative sample population could likely be repurposed in a similar manner. This approach has been shown to work equally well in the repurposing of existing GBS (Yang et al. 2016) and whole genome sequencing (WGS) (Zou et al. 2020) data to create AmpSeq molecular marker sets.

    The molecular tools presented here greatly enhance our ability to monitor P. macularis individuals for genetic diversity, putative geographic origin, mating type, and virulence toward a widely deployed R-gene. The AmpSeq markers differentiated P. macularis population structure across the same phenotypic parameters described by Gent et al. (2020). This is perhaps unsurprising, but nonetheless the desired outcome. Interestingly, the Northeastern U.S. and Midwestern U.S. P. macularis samples derived from feral hop locations occupied a unique space within the PCoA plot (encapsulated by overlapping 95% confidence ellipses), overlapping slightly with Eastern European samples from the Czech Republic and Slovenia. Conversely, the commercially derived Northeastern U.S. and Midwestern U.S. P. macularis clustered in a space that overlapped with the commercial PNW U.S. P. macularis samples, with U.K.-derived samples grouping nearby. This same phenomenon was reported by Gent et al. (2020), albeit with a much smaller sample size, suggesting the 1996 P. macularis introduction event within the PNW United States was likely of a U.K. origin, while the P. macularis that has been endemic within the Northeastern U.S. region since the early 1800s, and is currently subsisting on noncultivated wild and feral hop, more likely derived from Eastern Europe. The subsequent PERMANOVA and pairwise comparisons of cumulative passport phenotype designations returned these same clustering patterns, providing strong evidence for the AmpSeq markers generating local haplotypes with biological relevance (Table 2).

    The AmpSeq SNP marker set was also able to differentiate P. macularis samples based on V6-virulence phenotype. The subset of P. macularis samples that had a previously confirmed V6-phenotype based on a compatible infection result on the differential cultivar ‘Nugget’ clustered very tightly with P. macularis samples from the PNW United States. This is to be expected since V6-virulent isolates have not been reported outside of the region (Wolfenbarger et al. 2016). Multiple transcriptome variants are associated with V6-virulence in the PNW population of P. macularis (Gent et al. 2020). However, not all of these variants reliably differentiate isolates that lack or possess V6-virulence (Block et al. 2020). We used a single AmpSeq SNP locus (Pm2407) that is well correlated with V6-virulence. The AmpSeq haplotype data for this specific locus (Table 3) demonstrate the good performance of this marker in properly differentiating wild-type and V6-virulent isolates of P. macularis. As other markers associated with virulence are identified, these can easily be incorporated into the AmpSeq framework, providing a more encompassing profile of P. macularis population structure and phenotypic information important for management. Specifically, it would be of great benefit to identify a locus or set of loci that associate with Cascade-adapted isolates of P. macularis (Gent et al. 2017) to allow for complete genotyping of all known uniquely virulent P. macularis races in the United States in a single, high-throughput assay.

    One interesting aspect of the AmpSeq haplotype data output from this P. macularis population was the high number of loci for which samples returned a heterozygous haplotype. Like all other powdery mildew fungi, P. macularis is taxonomically categorized as an ascomycete fungus, a phylum of which the vast majority are thought to be haploid organisms (Braun 2002; Braun and Cook 2012). Four ways that a marker locus could be returned as heterozygous in a sample from a haploid fungus (after the dataset has been filtered for acceptable sequence read quality) are as follows: (i) that the P. macularis colony sampled was actually an admixture of multiple genotypes overlapping in their growth; (ii) that one or multiple gene duplication events have happened within the P. macularis genome, resulting in returned sequence data that actually corresponds to reads from multiple paralogous genes that have diverged slightly (Jones et al. 2014); (iii) that the SNP marker primers were inadvertently targeting multiple, distinct gene loci; or (iv) that a variant caller originally designed to call diploid genotypes introduced errors. In other powdery mildew fungi, namely Erysiphe necator and Blumeria graminis, aligned genome sequences indicate that upwards of 90% of the genome may be composed of gene duplications and transposable element activity (Jones et al. 2014; Wicker et al. 2013). However, because a sequenced P. macularis genome is not available, it was not possible to confidently map our sequence reads to a known genomic location, and therefore not possible to differentiate whether a locus that returned an appreciable level of heterozygosity (>10%) was due to genetic admixture, gene duplication, or nonspecific amplification. As such, in the interest of being as conservative as possible and presenting a finalized AmpSeq SNP marker library that is the most likely to return informative SNP data into the future, we decided to filter out all SNP loci that exceeded a 10% heterozygosity level across the 320 P. macularis samples that survived quality filtering steps. It is worth noting that the clustering patterns output from the PCoA plots and the PERMANOVA were the same when this filtering step was not applied (Supplementary Fig. S6), indicating that the most relevant variants are housed within the final core set of 54 SNP loci. This step would not be necessary in cases where a fully sequenced, aligned genome is available. But in the current case of P. macularis, and for other fungi where a sequenced genome is unavailable, we propose this approach as a reasonable, conservative framework for selecting SNP loci.
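
    The heterozygosity filter described above can be sketched as follows. This is an illustrative reconstruction rather than the study's code: the function name, the representation of each genotype call as a tuple of alleles (with None for a missing call), and the locus names are all hypothetical. Only the 10% threshold comes from the text.

```python
def filter_loci_by_heterozygosity(calls, max_het=0.10):
    """Keep loci whose fraction of heterozygous calls does not exceed max_het.

    calls maps a locus name to a list of genotype calls, one per sample.
    Each call is a tuple of observed alleles, or None if the call is missing.
    Returns a dict of retained locus -> observed heterozygous fraction.
    """
    kept = {}
    for locus, genotypes in calls.items():
        scored = [g for g in genotypes if g is not None]
        if not scored:
            continue  # no usable calls, so the locus cannot be evaluated
        n_het = sum(1 for g in scored if len(set(g)) > 1)
        het_fraction = n_het / len(scored)
        if het_fraction <= max_het:
            kept[locus] = het_fraction
    return kept


# Hypothetical toy calls for ten samples at three loci.
toy_calls = {
    "locus_A": [("A",)] * 9 + [("A", "G")],       # 10% heterozygous: kept
    "locus_B": [("C",)] * 7 + [("C", "T")] * 3,   # 30% heterozygous: dropped
    "locus_C": [("G",)] * 8 + [None, None],       # no heterozygous calls: kept
}
print(sorted(filter_loci_by_heterozygosity(toy_calls)))
```

    Because the text describes removing loci that exceeded 10%, a locus at exactly 10% is retained in this sketch.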

    To provide an updated, high-resolution classification of the P. macularis mating-type idiomorph distribution across the United States and Europe, we surveyed the AmpSeq SNP marker validation sample set for mating type, as well as some additional pre-existing P. macularis DNA samples in our collection. In the process, we updated the existing P. macularis mating-type markers for use in a multiplexed real-time quantitative PCR format. Wolfenbarger et al. (2015) assayed 56 samples collected from geographies with sexually reproducing P. macularis populations. Here we expanded that sampling depth to 358 P. macularis samples from geographies with sexually reproducing populations, as well as an additional 139 P. macularis samples from the PNW U.S. region, to provide the highest resolution understanding of P. macularis mating-type distribution to date. Given that there were 139 P. macularis samples collected from within the PNW and no positive MAT1-2 detections, a one-sided 95% confidence interval yields a possible MAT1-2 idiomorph frequency of between 0 and 2.6% of the population. As such, the PNW U.S. hop growing region still likely harbors an exclusively MAT1-1 P. macularis population, indicating that the quarantine in place to prevent the import of foreign hop plant material into the region has largely been successful. In further support of the idea that there are distinct P. macularis populations within commercial versus feral hop plantings in the Northeastern United States and Midwestern United States, all but one of the 166 P. macularis samples collected from commercial hop yards returned a MAT1-1 mating-type profile, while both mating types were identified in approximately a 1:1 ratio within the populations derived from feral hops (Table 4).
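
    An upper bound on the frequency of an idiomorph that was never detected can be illustrated with the exact binomial bound for zero events in n samples, which is the largest frequency p satisfying (1 - p)^n >= alpha. The sketch below is an assumption about one reasonable way to obtain such a bound; the function name is hypothetical and the method the study used is not stated in the text. With n = 139, this formula gives roughly 2.6% at alpha = 0.025 and roughly 2.1% at alpha = 0.05.

```python
def upper_bound_zero_detections(n_samples, alpha):
    """Exact binomial upper bound on frequency when 0 of n_samples are positive.

    Solves (1 - p) ** n_samples == alpha for p.
    """
    return 1.0 - alpha ** (1.0 / n_samples)


# 139 PNW samples with no MAT1-2 detections (count taken from the text).
for alpha in (0.05, 0.025):
    bound = upper_bound_zero_detections(139, alpha)
    print(f"alpha={alpha}: upper bound = {bound:.4f}")
```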

    These data, in combination with the ordination and PERMANOVA data from the AmpSeq SNP marker analysis, cumulatively support a relatively recent introduction of P. macularis from the PNW United States into commercial hop yards of the Northeastern and Midwestern United States. While these population structure analyses cannot discern which of the commercially derived P. macularis clusters (Northeastern United States, Midwestern United States, and PNW United States) is the founder, the historical timing of the pathogen’s emergence across geographies suggests only one likely scenario. The Northeastern and Midwestern U.S. hop industries have only reemerged within the past decade, while P. macularis arrived in the PNW United States in 1996. As such, and as suggested previously (Gent et al. 2020), the most plausible scenario is a dissemination of the fungus on infected hop planting material that was originally sourced out of the PNW region, distributed to a handful of hop propagation facilities east of the Rockies, and ultimately distributed to a large portion of the new hop yards of the Northeastern United States and Midwest United States as initial planting material. By the time the 21st Amendment to the U.S. Constitution was passed in 1933, thereby ending Prohibition, hop production throughout the Northeastern and Midwestern United States had become largely nonexistent. However, it is feasible that this original, sexually reproducing, Northeastern U.S. P. macularis population of the early 1900s has survived on wild and feral hop throughout the region, existing in parallel with the PNW U.S.-derived P. macularis lineage causing disease within commercial hop yards. It is also entirely possible, and to some extent expected, that as the density of hop yards established throughout the Northeastern and Midwestern United States increases, effectively shrinking the “natural bridge” that the pathogen must travel from one hop planting to another, P. macularis introduction events will occur via pathogen spread from feral hop plantings into commercial yards, or between yards via long-distance dispersal (Gent et al. 2019a). If this were to occur, we hypothesize that future analyses utilizing this AmpSeq marker set would reflect such a transition in population structure. We also expect that a drastic shift in mating-type ratios would occur within commercial hop yards, resulting in a 1:1 ratio similar to that currently observed in feral hop plantings. Presently, our data reemphasize the pressing need for hop growers to thoroughly inspect hop planting material, as well as the first shoots that emerge in spring, in order to limit introduction of the pathogen into new plantings in the near and long term.

    Moving forward, the AmpSeq SNP marker library and analysis pipeline, as well as the updated qPCR mating-type markers, are tools available to track some of the most pressing threats to sustainable hop production due to P. macularis. This study has validated the baseline structure of the U.S. P. macularis population against which future AmpSeq runs can be compared in order to discern shifts in structure. The arrival of MAT1-2 P. macularis individuals into the PNW United States would likely mean that the population would no longer overwinter solely through asexual means (Gent et al. 2018, 2019b). The escape of V6-virulent P. macularis strains out of the PNW United States and into other U.S. hop production systems would threaten the viability of some select hop cultivars, especially ‘Nugget’, ‘TriplePearl’, and others that have powdery mildew resistance based on R6. In future years, any new clustering patterns returned by the AmpSeq SNP markers that were previously unique to a specific sampling geography or cultivation type could suggest a new mode of pathogen spread or a local introduction event that could be targeted for control. The more we transition to forms of proactive monitoring of P. macularis population structure, the better we will be able to manage the pathogen as a whole.

    AmpSeq has been demonstrated to be functional in genotyping other types of sequence variation (INDELs, SSRs, etc.) in breeding applications (Fresnedo-Ramírez et al. 2017; Yang et al. 2016). Here, AmpSeq successfully genotyped SNP variants of the obligately biotrophic plant pathogen P. macularis. In synthesizing these two findings, we expect that additional variant types would perform equally well in genotyping plant pathogenic fungi via AmpSeq. For example, one could likely create a single AmpSeq marker library that combines a set of SSR markers differentiating a pathogen population by clonal lineages, reported in one study, with a set of SNP markers corresponding to pathogen QoI and SDHI fungicide resistance, reported in another study, into a single genotyping assay. We also expect other sequencing technologies where trait-marker data exist (GBS and WGS) to function as the source dataset for calling genetic variants to create AmpSeq marker libraries. The pathosystems that may benefit most from an approach such as this are those that lack extensive genomic resources and for which culturing and nucleic acid extraction are labor-intensive and difficult to scale. This approach may therefore be especially useful in genotyping populations of obligately biotrophic pathogens.

    ACKNOWLEDGMENTS

    We thank Mary Jean Welser for her technical support and Elisabeth Seigner, Sebastjan Radišek, Peter Glendinning, and Josef Patzak for their collaboration in the collection of Podosphaera macularis samples.

    The author(s) declare no conflict of interest.

    LITERATURE CITED


    Funding: Support was provided by U.S. Department of Agriculture National Institute of Food and Agriculture Pre-Doctoral Fellowship Project number 2019-67011-29734 and U.S. Department of Agriculture SCRI Project number 2014-51181-22381.