
Illumina Sequencing of 18S/16S rRNA Reveals Microbial Community Composition, Diversity, and Potential Pathogens in 17 Turfgrass Seeds

    Authors and Affiliations
    • Li-Ping Ban1
    • Jin-Dong Li1
    • Min Yan2 3
    • Yu-Hao Gao4
    • Jin-Jin Zhang2
    • Timothy W. Moural5
    • Fang Zhu5
    • Xue-Min Wang2
    1. College of Grassland Science and Technology, China Agricultural University, Beijing 100193, China
    2. Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
    3. National Animal Husbandry Station, Ministry of Agriculture, Beijing 100125, China
    4. The Affiliated High School of Peking University, Beijing 100190, China
    5. Department of Entomology, Pennsylvania State University, University Park, PA 16802, U.S.A.


    The increasing need for turfgrass seeds is coupled with the high risk of dangerous microbial pathogens being transmitted through the domestic and international trade of seeds. Concerns continue to be raised about seed safety and quality. Here, we show that next-generation sequencing (NGS) of DNA represents an effective and reliable approach to monitor the microbial communities within turfgrass seeds. A comparison of DNA sequence data with reference databases revealed the presence of 26 different fungal orders. Among them, serious plant disease pathogens such as Bipolaris sorokiniana, Boeremia exigua, Claviceps purpurea, and Rhizoctonia zeae were detected. Seedborne bacteria, including Erwinia persicina and Acidovorax avenae, were identified from different bacterial orders. Our study indicated that the traditional culturing method and the NGS approach for pathogen identification complement each other. The reliability of culturing and NGS methods was further validated by PCR with specific primers. The combination of these different techniques ensures maximum sensitivity and specificity for turfgrass seed pathogen testing assays.

    Turfgrass plays an important role in the horticultural settings of modern urban and suburban environments by preventing soil erosion, increasing air quality through carbon dioxide removal and oxygen release, and reducing ambient city noise, glare, and visual pollution (Beard and Green 1994; Qian et al. 2010). Large-scale usage of turfgrass seeds occurs in numerous locations such as parks, lawns, gardens, school grounds, football grounds, and golf courses (Chang and Lee 2016). As a result, the maintenance and management of turfgrasses is a multimillion-dollar industry worldwide (Chang and Lee 2016). A critical aspect of turfgrass maintenance is the identification of turfgrass pathogens that can cause significant declines in lawn quality.

    Various fungi, bacteria, and viruses are seedborne pathogens, and they can infect seeds in fields or during storage and can be passively transmitted over long distances (Elmer 2001; Pochon et al. 2012). Seedborne pathogens are responsible for the dissemination of plant diseases locally and abroad, which presents a serious threat for global agriculture and economics, especially when the pathogens are quarantine microorganisms (Gitaitis and Walcott 2007; Pellegrino et al. 2010).

    In China, turfgrass seeds are predominantly imported from other countries with developed systems for turfgrass breeding, which increases the risk of seedborne pathogen introduction and spread. For example, Fusarium spp., Aspergillus spp., Phoma spp., and Alternaria spp. were identified in seven turfgrass seed samples imported from five countries by using potato dextrose agar (PDA) cultures (Gong and Li 2004). To prevent the invasive spread of fungal turfgrass pathogens and phytopathogens that detrimentally affect seed germination and seedling growth, it is necessary to efficiently identify the presence of these causative agents of major turfgrass diseases (Lei et al. 2016).

    Traditionally, pathogens that infect seeds have been identified based on isolation/culture methods and morphological characteristics (Ellias et al. 2012; Milosevic et al. 2007), which have been standardized by organizations such as the International Seed Testing Association (2009), the International Seed Health Initiative (2006), and the U.S. National Seed Health System (2020). These phenotypic approaches have been widely used as a result of their relatively low cost and because the isolated pathogens can be used for further characterization. However, they are time-consuming and not sensitive enough to detect pathogens with low titer levels in the host seed, and many fungi cannot be cultured (Bindslev et al. 2002; Tavanti et al. 2005). Moreover, the accuracy and reliability of results largely depend on the experience, knowledge, and skills of the technician or operator (Mancini et al. 2016; McCartney et al. 2003).

    To date, more effective surveillance of seedborne pathogens has been achieved by molecular methods, including PCR (Lievens and Thomma 2005; Pethybridge et al. 2006), reverse transcription PCR (Narayanasamy 2008), quantitative PCR techniques (Gitaitis and Walcott 2007), and DNA macroarrays using oligonucleotides (Lievens and Thomma 2005; Tambong et al. 2006). For instance, Acidovorax avenae subsp. citrulli was detected in watermelon seeds by combining immunomagnetic separation with PCR (Walcott and Gitaitis 2000). Another example was the integration of sorbitol neutral red agar with bio-PCR to detect A. avenae subsp. avenae, a pathogen causing leaf streak in corn, in corn seeds (Anan et al. 2013). Such techniques have proven to be reliable and efficient methods for the identification of plant pathogens, which is especially important in the case of quarantine phytopathogens (Li et al. 2008; McCartney et al. 2003; van den Boogert et al. 2005; Xiong et al. 2013). However, most of these methods screen only for target pathogens within seed samples and do not simultaneously determine the diversity of microbial communities within a seed sample.

    Next-generation sequencing (NGS) has been applied to study microbial communities and their diversity, generating data sets of unparalleled size that have further revealed additional information on various microorganisms at a relatively low monetary cost (Akinsanya et al. 2015). This technique not only has the advantage of screening a large spectrum of organisms in a given sample, but it also allows the detection of low-abundance species that may remain undetected by other more conventional detection methods (Fierer et al. 2007; Miller et al. 2016; Prigigallo et al. 2016). NGS has been successfully used to determine the microbial communities in plant samples, and in turn has proved especially useful for the detection of pathogenic fungal varieties (Malacrinò et al. 2017; Prigigallo et al. 2016; Tremblay et al. 2018), making NGS a valuable technique for rapid detection of pathogens that are present in seed samples (Mancini et al. 2016). However, the NGS method is not always reliable. Some possible false-positive and false-negative interpretations may lead to lack of specificity (Peker et al. 2017). Sensitivity and specificity are the two most important factors in seed pathogen identification when choosing a proper test method for assessing seed health (Munkvold 2009). To achieve the maximum sensitivity and specificity, a combination of multiple approaches including traditional methods and modern technologies is recommended for use in seed health testing (Gitaitis and Walcott 2007).

    Thus, this study aimed to characterize, in a comprehensive manner, the profile of endophytic pathogens that are present in turfgrass seed samples by using the Illumina MiSeq NGS platform. To ensure the maximum sensitivity and specificity of the turfgrass seed testing assay, both conventional fungal pathogen cultures and PCR methods were used to validate NGS results.

    Materials and Methods

    Sample collection and seed sterilization.

    Seventeen turfgrass seed samples were obtained from Clover LLC (Beijing, China). They were originally imported from Oregon, except for the Cynodon dactylon sample, which was imported from California. Detailed sample information is listed in Supplementary Table S1. Fungi from seed samples were isolated from surface-sterilized seeds as follows. The turfgrass seeds were surface sterilized by successive immersion in a 70% ethanol solution for 1 min, followed by a 5% sodium hypochlorite solution for 30 s, and finally rinsed in sterile distilled water for 1 min. Seeds were then dried on sterilized paper towels and used for the following experiments.

    Extraction of 16S and 18S genomic DNA.

    The surface-sterilized turfgrass seed samples were ground with a disrupter tube and mixer mill (MM 200; Retsch, Clifton, NJ), and genomic DNA (gDNA) was extracted using the E.Z.N.A. Soil DNA Kit (Omega Bio-tek, Norcross, GA) according to the manufacturer’s protocol.

    PCR amplification of 16S and 18S ribosomal RNA genes.

    Hypervariable regions V3–V4 of the bacterial 16S ribosomal RNA (rRNA) were amplified by PCR using the ABI GeneAmp 9700 system (Thermo Fisher Scientific Inc., Waltham, MA) with forward and reverse primers 338F and 806R (Dennis et al. 2013), and the fungal 18S rRNA was amplified by PCR with forward and reverse primers 817F and 1196R (Borneman and Hartin 2000), which contained an eight-base barcode sequence that is unique to each sample (Table 1). PCR of the bacterial 16S rRNA gene was performed using the following thermal cycles: 95°C for 3 min, followed by 27 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 45 s, followed by a final extension at 72°C for 10 min. The protocol for the fungal 18S rRNA PCR amplification was as follows: 95°C for 3 min, followed by 35 cycles at 95°C for 30 s, 58°C for 30 s, and 72°C for 45 s, followed by a final extension at 72°C for 10 min. PCR reactions were performed in a final 20-μl reaction volume, containing 4 μl of 5× FastPfu Buffer, 2 μl of 2.5 mM dNTPs, 0.8 μl of each primer (5 μM), 0.4 μl of FastPfu Polymerase (Transgen, Beijing, China), and 10 ng of template gDNA. Experiments were performed with three technical replicates for each sample.
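    The final reagent concentrations implied by the 20-μl reaction described above can be checked with a short dilution calculation. This is only an arithmetic sketch: the function name `final_conc` is illustrative, and the volume left over for template and water is inferred from the listed components rather than stated in the text.

```python
# Final concentration after dilution into the reaction:
#   C_final = C_stock * V_added / V_total
def final_conc(stock_conc, vol_added_ul, total_vol_ul):
    """Return the concentration of a reagent after dilution (same unit as stock)."""
    return stock_conc * vol_added_ul / total_vol_ul

TOTAL_UL = 20.0  # final reaction volume stated in the text

dntp_mM = final_conc(2.5, 2.0, TOTAL_UL)    # 2 ul of 2.5 mM dNTPs
primer_uM = final_conc(5.0, 0.8, TOTAL_UL)  # 0.8 ul of each 5 uM primer
buffer_x = final_conc(5.0, 4.0, TOTAL_UL)   # 4 ul of 5x FastPfu Buffer

# Assumption: whatever volume is not listed is template gDNA plus water.
listed_ul = 4.0 + 2.0 + 2 * 0.8 + 0.4       # buffer + dNTPs + two primers + polymerase
remaining_ul = TOTAL_UL - listed_ul

print(f"dNTPs: {dntp_mM:.2f} mM; primers: {primer_uM:.2f} uM each; buffer: {buffer_x:.0f}x")
print(f"volume left for template and water: {remaining_ul:.1f} ul")
```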

    Table 1. Primer information for amplicon sequencing and PCR validation used in this study

    Sequencing of partial 16S and 18S rRNA genes.

    The PCR products were electrophoresed on 2% agarose gel and visualized under ultraviolet light. Gel pieces containing the amplicons (300 to 500 bp) were all cut out with a knife. The cut gel pieces weighed approximately 300 mg and were purified using the AxyPrep DNA Gel Extraction Kit (Axygen Biosciences, Union City, CA) according to the manufacturer’s instructions in order to remove nonspecific amplification and impurities. Then PCR products were quantified using the QuantiFluor-ST system (Promega, Madison, WI). Purified amplicons were pooled at the equimolar concentration of 13 pM for each sample and paired-end sequenced (2 × 300 bp) on an Illumina MiSeq platform (Illumina, San Diego, CA) according to the standard protocols of Caporaso et al. (2012). The raw reads were deposited into the NCBI Sequence Read Archive under accession number SRP149691.

    Sequence processing and statistical analysis.

    Raw data containing adapters or low-quality reads were filtered by using FASTP according to the following rules: (i) reads containing adapters were removed; (ii) reads containing >10% of unknown nucleotides were removed; and (iii) reads containing >40% of low quality (Q-value ≤20) bases were removed. Then clean sequences were merged as paired reads by using FLASH (version 1.2.11) with a minimum overlap of 10 bp and mismatch error rates of 2% (parameters: flash -m 10 -M 300 -x 0.2 -p 33) (Magoč and Salzberg 2011). With the UCHIME algorithm, clean reads were searched against the reference database to perform reference-based chimera checking (Caporaso et al. 2010).
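    The three filtering rules listed above can be sketched as a single predicate over one read. This is a minimal illustration of the rules as worded, not the FASTP implementation: the function names, the plain substring test for adapters, and the adapter string used in the example are assumptions, and quality strings are assumed to be Phred+33 encoded (consistent with the `-p 33` FLASH setting).

```python
def phred_scores(qual_string, offset=33):
    """Decode an ASCII quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in qual_string]

def keep_read(seq, qual_string, adapter,
              max_n_frac=0.10, max_lowq_frac=0.40, q_cutoff=20):
    """Return True if a read passes rules (i) to (iii) from the text."""
    if not seq or len(seq) != len(qual_string):
        return False
    # (i) discard reads that still contain the adapter (simple substring test)
    if adapter and adapter in seq:
        return False
    # (ii) discard reads with more than 10% unknown nucleotides
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # (iii) discard reads with more than 40% low-quality bases (Q <= 20)
    low = sum(1 for q in phred_scores(qual_string) if q <= q_cutoff)
    if low / len(seq) > max_lowq_frac:
        return False
    return True

# Placeholder adapter sequence, used only for this illustration.
ADAPTER = "AGATCGGAAGAGC"

print(keep_read("ACGTACGTAC", "IIIIIIIIII", ADAPTER))  # high quality, no N
print(keep_read("ACGTNNACGT", "IIIIIIIIII", ADAPTER))  # 20% unknown bases
print(keep_read("ACGTACGTAC", "#####IIIII", ADAPTER))  # 50% low-quality bases
```

    The thresholds are exposed as keyword arguments so that the same predicate can be reused if different cutoffs are chosen.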

    The clean reads were then clustered into operational taxonomic units (OTUs) with a 97% similarity cutoff by using UPARSE-OTU (version 7.1) with a greedy algorithm (Edgar 2013). The taxonomy of each 16S and 18S rRNA gene sequence was analyzed by the RDP Classifier algorithm against the Silva (release 123) and Unite (release 7.0) Small Subunit rRNA databases, respectively, by using a confidence threshold of 70% (Amato et al. 2013; Leng et al. 2016; Shao et al. 2016). After removing all chimeric reads, the remaining effective reads were used for further analysis. Rarefaction curves were generated for all 17 samples by Mothur version 1.21 with the functions summary.single and rarefaction.single (Schloss et al. 2009); the data were normalized to the minimum number of sequences, and alpha diversity was assessed by Shannon diversity indices. Beta diversity was analyzed by nonmetric multidimensional scaling (NMDS) based on Bray-Curtis dissimilarities, using the Vegan community ecology package version 2.0 in R (Oksanen et al. 2013). Heat-map figures were generated using the R pheatmap package.
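    Normalizing samples to the minimum number of sequences, as described above, is commonly done by randomly subsampling each sample without replacement down to the smallest library size. The sketch below illustrates that idea under stated assumptions: the function name `rarefy`, the toy OTU tables, and the fixed random seed are all hypothetical, and this is not the Mothur implementation.

```python
import random
from collections import Counter

def rarefy(otu_counts, depth, seed=0):
    """Randomly subsample `depth` reads (without replacement) from an OTU count table."""
    # Expand the table into one label per read, then draw reads at random.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))

# Toy OTU tables (hypothetical counts, for illustration only).
samples = {
    "sample_A": {"OTU1": 50, "OTU2": 30, "OTU3": 20},
    "sample_B": {"OTU1": 10, "OTU2": 40},
}

# Every sample is reduced to the size of the smallest library.
min_depth = min(sum(counts.values()) for counts in samples.values())
normalized = {name: rarefy(counts, min_depth) for name, counts in samples.items()}

for name, counts in normalized.items():
    print(name, sum(counts.values()), dict(counts))
```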

    Incubation and isolation of fungi from turfgrass seeds.

    The surface-sterilized seeds were cut into small pieces and blotted dry on sterile paper towels and plated on quarter-strength PDA containing ampicillin (50 μg/ml) and streptomycin (50 μg/ml) (Cai et al. 2009; Su and Cai 2012). The cultures were incubated at room temperature (24 ± 3°C) in the dark for 7 days. Pure cultures were obtained by single spore isolation as described by Choi et al. (1999) and single spores were inoculated onto new PDA dishes every 2 days. The colony morphology was examined 14 days after incubation. The experiments were performed with three biological replicates using 20 seeds per replicate.

    Fungal mycelial mat gDNA extraction, PCR amplification, and sequencing.

    Mycelia were collected from PDA cultures 7 days after incubation and homogenized using the MP FastPrep-24 sample preparation system (MP Biomedicals, Santa Ana, CA). gDNA extraction was performed following the protocol described by Cubero et al. (1999). In brief, samples were frozen in liquid nitrogen and then were ground into powder with glass beads. After adding cetyl-trimethyl ammonium bromide (CTAB) buffer (containing 1% wt/vol CTAB, 1 M of NaCl, 100 mM of Tris, 20 mM of EDTA, and 1% wt/vol polyvinyl polypyrrolidone), the mixture was incubated at 70°C for 30 min before adding one volume of chloroform/isoamyl alcohol (24:1 vol/vol). The mixture was then centrifuged for 5 min at 10,000 × g at room temperature. The upper aqueous phase was collected, and precipitation buffer (1% wt/vol CTAB, 50 mM of Tris-HCl, 10 mM of EDTA, and 40 mM of NaCl) was added to the supernatant. The pellet was resuspended, and the upper phase was precipitated with isopropanol. The pellet was collected by centrifugation and washed with 70% ethanol prior to resuspension in ddH2O. Internal transcribed spacer (ITS) sequences were amplified using the primer pair ITS1/4 (White et al. 1990) listed in Table 1. The amplification reactions were performed in a 25-μl reaction volume containing 2.5 μl of 10× Easy Taq Buffer (TransGen Biotech, Beijing, China), 50 μM of dNTPs, 0.1 μM of each primer, 0.75 U of Taq DNA polymerase (TianGen Biotech Co., Beijing, China), and 1 to 10 ng of gDNA. The PCR cycle parameters were initial denaturation at 95°C for 5 min, followed by 35 cycles of 10 min at 95°C for polymerase activation, and 40 cycles of 95°C for 30 s, 48°C for 30 s, and 72°C for 80 s, followed by a final extension step at 72°C for 10 min. Sanger sequencing was performed by Omega Genetics Company (Beijing, China) with the BigDye Terminator kit using the ABI 3730 DNA Analyzer (Thermo Fisher Scientific Inc.).

    PCR validation on turfgrass seed samples.

    Fifteen microorganisms from the NGS and fungal culture analyses were randomly chosen for PCR analysis by using species- or genus-specific primers for validation. The microorganisms selected were Acidovorax avenae, Alternaria spp., Bipolaris sorokiniana, Boeremia exigua, Claviceps purpurea, Erwinia persicina, Pantoea stewartii, Pseudomonas spp., Rathayibacter toxicus, Rathayibacter tritici, Rhizoctonia zeae, Setosphaeria rostrata, Stemphylium vesicarium, Taphrina spp., and Xanthomonas spp. The primers designed for target genes (size range of 150 to 1,500 bp) are listed in Table 1. Three biological replicates were performed. The gDNAs of the 17 seed samples were used as the template. Amplifications were performed in a 50-μl reaction volume containing 25 μl of 2× Taq PCR Master Mix (TianGen Biotech Co.), 2 μl of forward primer (10 μmol/liter), 2 μl of reverse primer (10 μmol/liter), 250 ng (∼1 μl) of gDNA, and 20 μl ddH2O. The PCR conditions were as follows: initial denaturation at 94°C for 1 min, followed by 30 cycles of denaturation at 94°C for 30 s, annealing at 48 to 65°C (according to different primer sets) for 30 s and extension at 72°C for 60 s, and a final extension step at 72°C for 5 min. Amplified DNA was detected by electrophoresis in a 2% agarose gel in 0.5× Tris/Borate/EDTA buffer, and the results were visualized with the GelDoc XR Gel Documentation and Analysis System (Bio-Rad, Hercules, CA). Distilled deionized water was used as a negative control (data not shown). The bands on the gel were purified using the AxyPrep DNA Gel Extraction Kit (Axygen Biosciences) and submitted for sequencing validation (Omega Genetics Company).


    Results

    Sequencing results and quality control.

    A total of 1,335,945 paired-end reads for 16S rRNA and 1,352,158 paired-end reads for 18S rRNA were obtained from 17 turfgrass seed samples. After raw fastq files were demultiplexed and quality filtered, we obtained 1,118,365 clean reads from the 16S rRNA gene amplicon sequencing with an average of 62,346 ± 6,625 reads per sample. We also obtained 908,061 clean reads from 18S rRNA gene amplicon sequencing, with an average of 53,415 ± 2,875 reads per sample. The rarefaction curves of the 17 samples tended to approach the saturation plateau, suggesting that the sequencing depth and the volume of sequenced reads were sufficient (Supplementary Fig. S1) (Ban et al. 2017). These rarefaction curves indicated that a large variation in the total number of OTUs existed among different samples. The samples from Agrostis matsumurae (Am_2 to Am_3) had the highest OTU count in the fungal community (Supplementary Fig. S1A), whereas the highest OTU density was from the Trifolium repens sample (Tr_1) in the bacterial community (Supplementary Fig. S1B).


    The coverage, number of OTUs, and statistical estimates of species richness for each group at a genetic distance of 3% (Shao et al. 2016) are presented in Supplementary Table S2. 18S rRNA of the Poa pratensis Pp_1 sample had the highest Shannon diversity index (2.13), whereas the Trifolium repens Tr_1 sample had the lowest Shannon diversity index (0.59). For bacterial 16S rRNA, the Poa pratensis Pp_3 sample had the highest Shannon diversity index (2.67), whereas the Cynodon dactylon Cd_1 sample had the lowest index (0.59). The Shannon diversity levels of Poa pratensis Pp_1 to Pp_6 samples varied from 2.14 to 2.67 (with an average of 2.41), which were higher than all other species tested.
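    The Shannon diversity index reported above is computed from the relative abundance p_i of each OTU as H = −Σ p_i ln p_i. The following sketch shows the calculation on toy counts; the function name and the example values are illustrative assumptions, and the natural logarithm is assumed.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    h = 0.0
    for n in counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h

# Toy examples: an even community is more diverse than a dominated one.
even = [25, 25, 25, 25]
dominated = [97, 1, 1, 1]
print(f"even community:      H = {shannon_index(even):.3f}")
print(f"dominated community: H = {shannon_index(dominated):.3f}")
```

    For a perfectly even community of S OTUs the index equals ln S, which gives a quick sanity check on any implementation.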

    Taxonomic classification of turfgrass seeds.

    Fungal community composition.

    Among the fungal communities of the 17 turfgrass seed samples, we obtained a total of 908,061 clean reads that could be clustered into 72 OTUs (Supplementary Table S3). Ascomycota and Basidiomycota were the top two most abundant groups, which contained 99.92% of the total clean reads (covering 69.38 and 30.54%, respectively) (Fig. 1).


    Fig. 1. Relative abundance of Eukaryota communities classified at the phylum level from the turfgrass seeds. Fa = Festuca arundinacea, Pp = Poa pratensis, Am = Agrostis matsumurae, Cd = Cynodon dactylon, and Tr = Trifolium repens.


    Most of the pathogenic fungi belonged to the phyla of Ascomycota and Basidiomycota, which were sorted into 14 classes, including eight classes from Ascomycota and the other six classes from Basidiomycota (Fig. 2; Supplementary Table S3).


    Fig. 2. Taxonomic distribution of fungi from 17 turfgrass seed samples. The relative proportion of the Ascomycota population at A, the class level and B, the order level. The relative proportion of the Basidiomycota population (without class of Tremellomycetes, because Tremellomycetes scantly contained plant pathogens) at C, the class level and D, the order level. Incertae sedis indicates the taxonomic group where its broader relationships are unknown or undefined, and unclassified indicates the reads identified from the Silva database but below the confidence threshold of 70%. Fa = Festuca arundinacea, Pp = Poa pratensis, Am = Agrostis matsumurae, Cd = Cynodon dactylon, and Tr = Trifolium repens.


    In the phylum of Ascomycota, the Dothideomycetes, Sordariomycetes, and Eurotiomycetes were the three major classes identified from tested turfgrass seed samples (Fig. 2A). Fungal genera within these classes include potentially important phytopathogenic species. For instance, three important plant pathogen species, Bipolaris sorokiniana, Boeremia exigua, and Claviceps purpurea, were found to be extensively distributed among tested samples (Table 2). Bipolaris sorokiniana and Boeremia exigua belong to the order of Pleosporales (class Dothideomycetes), and Claviceps purpurea is a member of the order Hypocreales (class Sordariomycetes) (Fig. 2B). Both Bipolaris sorokiniana (Pleosporaceae) and Boeremia exigua (Didymellaceae) are important plant pathogens causing leaf and stem spots and root rot on potatoes (Joshi et al. 2004; Li et al. 2012). Claviceps purpurea is also an important plant pathogen that can cause ergot, a severe disease of cereals and grasses (Irzykowska et al. 2015).

    Table 2. Twelve microbe plant pathogens identified by next-generation sequencing that cause economically important plant diseases

    In the phylum of Basidiomycota, six classes (Tremellomycetes, Microbotryomycetes, Agaricomycetes, Agaricostilbomycetes, Pucciniomycetes, and Exobasidiomycetes) were identified from 17 samples (Supplementary Fig. S2). Among these classes, Tremellomycetes was excluded from further analysis since plant pathogens belonging to this class were scantly documented (Supplementary Fig. S2; Supplementary Table S3). Many of the pathogenic fungi detected belonged to the classes Agaricomycetes and Pucciniomycetes (Fig. 2C). In the turfgrass seed samples tested, seven major orders were identified from the phylum of Basidiomycota (Fig. 2D). In the class of Agaricomycetes, the potentially plant pathogenic species, Rhizoctonia zeae (order of Cantharellales) was identified in five of our samples (Table 2). R. zeae, one of the most notorious pathogenic fungi infecting the Gramineae plants, was identified in seed samples Am_4 and Pp_2 to Pp_4 (Table 2). Additionally, rust-causing fungi from class Pucciniomycetes and the order Pucciniales were found in all five Festuca arundinacea samples and in low abundance in two Poa pratensis samples (Pp_2 and Pp_5) (Supplementary Fig. S2).

    Bacterial community composition.

    From the bacterial community analyses of turfgrass seed samples, a total of 1,059,890 classifiable sequences were obtained and then clustered into 405 OTUs (Supplementary Table S4). Bacterial sequences were found that belonged to 14 different phyla, with most sequences grouping into the five phyla: Actinobacteria, Bacteroidetes, Cyanobacteria, Firmicutes, and Proteobacteria, which covered 99.84% of total OTUs (Fig. 3A).


    Fig. 3. Taxonomic distribution of bacteria from 17 turfgrass seed samples. A, Relative abundance of bacterial reads from the turfgrass seeds classified at the phylum level. B, Taxonomic distribution of five classes in the phylum of Proteobacteria, which comprised nearly 35.5 to 67.5% of the total clean reads in all samples. C, Relative abundance of the bacteria from the phylum of Proteobacteria that was subdivided into 32 families. Incertae sedis indicates the taxonomic group where its broader relationships are unknown or undefined, unclassified indicates the reads identified from the Silva database but below the confidence threshold of 70%, and uncultured indicates that the microbe could not be raised (or incubated) on the artificial medium in this study. Fa = Festuca arundinacea, Pp = Poa pratensis, Am = Agrostis matsumurae, Cd = Cynodon dactylon, and Tr = Trifolium repens.


    Cyanobacteria was the most predominant phylum across the different turfgrass seed samples tested, contributing from 27.1% of total reads in the Poa pratensis Pp_6 sample up to 95.5% in the Agrostis matsumurae Am_2 sample (Fig. 3A). Proteobacteria was the second most predominant phylum and contributed from approximately 0.27% of total reads in the Cynodon dactylon Cd_1 sample up to 64.9% in the Poa pratensis Pp_6 sample (Fig. 3A). Because the phylum of Proteobacteria contains many pathogenic bacteria, we further analyzed Proteobacteria at the class level (Fig. 3B) and family level (Fig. 3C). Proteobacteria were divided into five classes and subdivided into 32 identified families (Fig. 3B and C). One potential plant pathogenic species, E. persicina (Zhang and Nan 2014), which belongs to the family of Enterobacteriaceae, was identified in the phylum Proteobacteria (Table 2). This potential plant pathogenic bacterial species was extensively distributed in all seed samples tested (Table 2). Other genera detected by NGS that may contain important plant pathogen species include Acidovorax, Rathayibacter, Pantoea, Pseudomonas, and Xanthomonas (Table 2).

    Comparisons of microbial community structure among samples.

    Multivariate statistical analyses were used to identify the relationships of microbes among samples. The NMDS of the fungi among samples was analyzed and showed that the different species were separated from each other (Supplementary Fig. S3). These results indicated that seeds from the same species share more similar fungal communities than the fungal communities among different seed species (Supplementary Fig. S3A). Similar community distribution was shown to exist for the bacterial communities (Supplementary Fig. S3B).
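    The NMDS ordinations referred to above are built from pairwise Bray-Curtis dissimilarities between samples. As a minimal sketch of that distance, assuming OTU tables stored as dictionaries of counts (the function name and the toy counts are hypothetical, and this is not the Vegan implementation), the dissimilarity between two samples is the sum of absolute count differences divided by the total count of both samples.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two OTU count tables (dicts of counts)."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.values()) + sum(b.values())
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator

# Toy OTU tables (hypothetical counts, for illustration only).
s1 = {"OTU1": 6, "OTU2": 4}
s2 = {"OTU1": 2, "OTU2": 8}
s3 = {"OTU3": 10}

print(bray_curtis(s1, s1))  # identical samples
print(bray_curtis(s1, s2))  # partially overlapping composition
print(bray_curtis(s1, s3))  # no shared OTUs
```

    Values range from 0 for identical communities to 1 for communities with no shared OTUs, so samples from the same seed species would be expected to show smaller pairwise values than samples from different species.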

    Isolation and identification of fungal colonies from turfgrass seeds.

    Six types of single fungal spores were isolated from PDA cultures. Most isolated fungal cultures were from Poa pratensis (Pp_1 to Pp_6), Cynodon dactylon (Cd_1), and Trifolium repens (Tr_1) seed samples (Table 3). Two isolated fungi, Setosphaeria rostrata and Stemphylium vesicarium, were identified at the species level, and the other four fungi were classified at the genus level, including genera Alternaria, Cladosporium, Drechslera, and Bipolaris (Table 3; Supplementary Fig. S4).

    Table 3. Fungi species isolated and identified from turfgrass seed samples with culturing

    PCR validation study.

    PCR analysis was carried out to validate the presence of plant pathogens identified by NGS and/or fungal pathogen culturing methods. Most of the amplification products exhibited similar presence or absence patterns in PCR as in the NGS method (Figs. 4 and 5; Table 4). The exception was the genus Taphrina, which was not identified by PCR; however, it was detected by the NGS method in 13 of 17 turfgrass seed samples (Tables 2 and 4).


    Fig. 4. PCR analysis for pathogen species identified by next-generation sequencing and/or cultivation. PCR analysis of A, Alternaria spp., B, Bipolaris sorokiniana, C, Boeremia exigua, D, Claviceps purpurea, E, Rhizoctonia zeae, and F, Erwinia persicina. The GenStar D2000 plus DNA ladder (GenStar Biotech Co., Beijing, China) was used as the marker. Lanes 1 to 17 stand for 17 turfgrass seed samples: Trifolium repens (Tr_1), Cynodon dactylon (Cd_1), Festuca arundinacea (Fa_1, Fa_2, Fa_3, Fa_4, Fa_5), Agrostis matsumurae Hack. ex Honda (Am_1, Am_2, Am_3, Am_4), and P. pratensis (Pp_1, Pp_2, Pp_3, Pp_4, Pp_5, and Pp_6).


    Fig. 5. PCR analysis of bacteria species based on the candidate genus identified by next-generation sequencing. A, Acidovorax avenae. B, Pantoea stewartii subsp. stewartii. C, Pseudomonas spp. D, Rathayibacter toxicus. E, Rathayibacter tritici. The GenStar D2000 Plus DNA ladder was used as the marker. Lanes 1 to 17 stand for 17 turfgrass seed samples: Trifolium repens (Tr_1), Cynodon dactylon (Cd_1), Festuca arundinacea (Fa_1, Fa_2, Fa_3, Fa_4, and Fa_5), Agrostis matsumurae Hack. ex Honda (Am_1, Am_2, Am_3, and Am_4), and Poa pratensis (Pp_1, Pp_2, Pp_3, Pp_4, Pp_5, and Pp_6).


    Table 4. Sixteen microbe plant pathogens identified with three different approaches

    The distribution of Alternaria spp. showed a similar presence or absence pattern in PCR and pathogen culturing method (Tables 3 and 4). The distribution of the genus Cladosporium was also checked by cultured hyphae. Fungi belonging to Cladosporium spp. were successfully isolated from Pp_1 and Pp_6 samples (Table 3). The pathogen culture method also identified the fungal pathogens Stemphylium vesicarium from the Trifolium repens (Tr_1) sample and Setosphaeria rostrata from the Cynodon dactylon (Cd_1) sample, but neither was detected by the PCR method (data not shown).

    Using the NGS approach, important plant pathogenic bacteria Acidovorax, Rathayibacter, Pantoea, Pseudomonas, and Xanthomonas were identified in seed samples at the genus level. Therefore, we also performed PCR to validate the presence of these pathogenic bacteria. Four species (Acidovorax avenae, Rathayibacter tritici, Rathayibacter toxicus, and Pantoea stewartii subsp. stewartii) were detected in the majority of 17 samples with the PCR method (Fig. 5). Pseudomonas was detected in all five Festuca arundinacea seed samples, whereas Xanthomonas oryzae could not be identified by PCR (data not shown). Damage caused by microbial pathogens identified in this work is summarized in Supplementary Table S5.


    Discussion

    To our knowledge, this is the first study to apply a high-throughput NGS approach to reveal the composition of pathogen species in imported turfgrass seeds. The results demonstrated that the NGS method is an efficient tool for assessing the pathogen species diversity present in turfgrass seeds. Additionally, using a method such as NGS further enables the identification of a larger number of microbial species. Our study also indicated that the NGS method has limitations, which may lead to possible false-positive and false-negative interpretations. There is no single technology providing a comprehensive solution for seed pathogen identification. The combination of various techniques (e.g., NGS, traditional culturing method, and PCR) ensures accurate identification of microbial diversity in turfgrass seeds, which will be of assistance in keeping seeds safe for use and maintaining high-quality seed products.

    There is a growing risk of turfgrass pathogen transmission as the global seed trade grows and the use of turfgrass for landscaping in urban/suburban settings increases. However, management strategies for the containment of bacterial/fungal turfgrass diseases remain insufficient to stop the impending spread of turfgrass diseases. Currently, chemical control practices for seedborne diseases, seedborne pathogen detection, and elimination of contaminated seeds prior to planting are inadequate (Gitaitis and Walcott 2007). Investigative approaches based on high-throughput sequencing offer a possibility for large-scale turfgrass disease control through “diagnosis.” This study screened the profiles of microbial communities in 17 turfgrass seed samples using the NGS method. In total, 477 OTUs (72 OTUs for eukaryotes and 405 OTUs for bacteria) were generated and classified at the phylum, class, order, and family levels (Supplementary Tables S3 and S4). Twenty-six different fungal orders and 44 different bacteria orders were detected through a comparison of DNA sequence data with the Silva and Unite databases (for bacteria and fungi, respectively).

    The goal of a rarefaction curve is to determine whether sufficient sequencing has been performed to reflect the species diversity in a given sample and to indirectly reflect the abundance of species in that sample. A sharp rise of the curve indicates that the sequencing amount is insufficient and the number of sequences needs to be increased; in contrast, a curve that plateaus indicates that the sample sequences are sufficient and the data can be used in further analysis. In this research, as the number of sequences increased, the rarefaction curves for both fungi and bacteria became asymptotic, passing beyond the logarithmic phase (Supplementary Fig. S1). This result suggested that a large number of species were found in the community, whereas the slow rise of the flattened curve indicated that the number of species detected in this environment does not increase significantly with further increases in the sequencing number.
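
    The reasoning behind a rarefaction curve can be illustrated with a minimal sketch. The OTU counts below are hypothetical and are not data from this study, and the function name rarefaction_curve is an illustrative assumption rather than part of the pipeline used here. The code subsamples reads without replacement at increasing depths and records the mean number of OTUs observed at each depth.

```python
import random


def rarefaction_curve(otu_counts, depths, n_iter=50, seed=0):
    """Return the mean number of distinct OTUs observed at each depth.

    otu_counts: dict mapping OTU label -> read count (hypothetical input).
    depths: iterable of subsampling depths (number of reads drawn).
    """
    rng = random.Random(seed)
    # Expand the count table into one label per read.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            raise ValueError("depth exceeds the total number of reads")
        # Draw reads without replacement and count the OTUs that appear.
        richness = [len(set(rng.sample(reads, depth))) for _ in range(n_iter)]
        curve.append(sum(richness) / n_iter)
    return curve


# Hypothetical sample with 1,000 reads spread over five OTUs.
counts = {"OTU1": 500, "OTU2": 300, "OTU3": 150, "OTU4": 40, "OTU5": 10}
depths = [10, 50, 100, 500, 1000]
curve = rarefaction_curve(counts, depths)
for depth, mean_otus in zip(depths, curve):
    print(depth, round(mean_otus, 2))
```

    In such a sketch, a curve that is still climbing steeply at the largest depth would suggest undersampling, whereas a flattened curve suggests that additional reads would add few new OTUs.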

    The relationships of microbes among seed samples showed that seeds from the same species share more similar fungal and bacterial communities than seeds from different species. For example, bacterial communities from sample Cd_1 were quite different from the rest of the samples, based on the scale of the NMDS (Supplementary Fig. S3). Sample Cd_1 was the only one from California, whereas the other seed samples tested in this study were from Oregon (Supplementary Table S1). Factors associated with geographic positions such as soil properties and climate could cause the differences in microbial communities. For instance, changes in soil microbial communities across space are often strongly correlated with differences in soil chemistry (Jenkins et al. 2009; Lauber et al. 2008; Nilsson et al. 2007). In particular, it has been shown that soil pH closely defines the relative abundance and diversity of bacterial communities (Rousk et al. 2010).
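
    As an illustration of how community differences between samples can be quantified before ordination, the following sketch computes a pairwise dissimilarity. It assumes Bray-Curtis dissimilarity, a common input for NMDS; the text above does not state which metric was used, and the sample names and counts are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two OTU count tables (dicts).

    Returns 0 for identical communities and 1 when no OTU is shared.
    """
    otus = set(a) | set(b)
    # Sum of the smaller count of each OTU across the two samples.
    shared = sum(min(a.get(o, 0), b.get(o, 0)) for o in otus)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1 - 2 * shared / total


# Hypothetical OTU counts for three seed samples.
sample_a = {"OTU1": 60, "OTU2": 30, "OTU3": 10}
sample_b = {"OTU1": 50, "OTU2": 40, "OTU3": 10}
sample_c = {"OTU3": 20, "OTU4": 80}

d_ab = bray_curtis(sample_a, sample_b)
d_ac = bray_curtis(sample_a, sample_c)
print(round(d_ab, 3), round(d_ac, 3))
```

    Under these assumptions, samples a and b would plot close together in an ordination, whereas sample c would be placed far from both, analogous to an outlying sample in an NMDS plot.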

    No dangerous quarantine microbes were detected by sequencing partial 16S/18S rRNA genes; these gene regions are not variable enough to identify some of the detected pathogens at the species level. Nevertheless, the NGS data did provide evidence of the potential presence of dangerous quarantine microbes and notorious crop pathogens that cause serious plant production losses (Nicolaisen et al. 2014).

    In this study, the fungal communities were clustered into 72 OTUs. Compared with other reports for soil and ocean plant seeds (Ma et al. 2017), the fungal community diversity in turfgrass seed is lower, indicating that the community structure of endophytic fungi in turfgrass seed is simpler. Ascomycota and Basidiomycota together form the large subkingdom of Dikarya (“higher fungi”). About 75% of known fungi are considered Ascomycetes (Liers et al. 2011). The phyla Ascomycota and Basidiomycota accounted for 99.92% of the total clean reads in this research, and the ratio of Ascomycota to Basidiomycota was 2.3:1, indicating that Ascomycetes were the dominant fungal community in tested seed samples. Dothideomycetes is by far the largest class in Ascomycota, the largest fungal phylum, and its members are commonly associated with plants. Almost all wild plants examined have been found to be infected by various species of Dothideomycetes (Goodwin 2014). Several pathogens known to be major impediments to plant growth and production were detected and belonged to the Dothideomycetes class. This demonstrated that the class Dothideomycetes is the major pathogenic fungal source for the four species of turfgrass seeds examined in this study.
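
    The two figures reported above (a combined share of 99.92% and a ratio of 2.3:1) imply approximate shares for each phylum. The short calculation below is a worked example only; it assumes that the ratio refers to read counts, and because 2.3 is a rounded value the results are approximate. The function name split_by_ratio is hypothetical.

```python
def split_by_ratio(combined_pct, ratio_a, ratio_b):
    """Split a combined percentage into two parts in the ratio a:b."""
    total = ratio_a + ratio_b
    return combined_pct * ratio_a / total, combined_pct * ratio_b / total


# Values taken from the text: 99.92% combined, Ascomycota:Basidiomycota = 2.3:1.
asco_pct, basidio_pct = split_by_ratio(99.92, 2.3, 1.0)
print(f"Ascomycota ~{asco_pct:.1f}%, Basidiomycota ~{basidio_pct:.1f}%")
```

    Under this assumption, Ascomycota would account for roughly 70% of the clean reads and Basidiomycota for roughly 30%.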

    Most plant pathogens detected by NGS in this study, including Rathayibacter toxicus, were also identified by PCR (Figs. 4 and 5; Tables 2 and 4). R. toxicus is responsible for ryegrass toxicity (Agarkova et al. 2006) and is on the list of U.S. Department of Agriculture Plant Protection and Quarantine Select Agents and Toxins (Mancini et al. 2016; White et al. 1990). Erwinia persicina has been found in legumes including alfalfa, soybean, common bean, and pea and has also been isolated from healthy tomato, cucumber, banana, apple, and pear. The infectious E. persicina, responsible for necrosis and wilting of turfgrass or legume plants, can be transmitted by seeds, water, and soil (Zhang and Nan 2014).

    Several important fungal plant pathogens, such as Bipolaris sorokiniana, Boeremia exigua, and Claviceps purpurea, that can cause plant production losses were identified by NGS and PCR methods (Aggarwal et al. 2010; Irzykowska et al. 2015; Li et al. 2012; Zhang and Nan 2012). These pathogens were present in many of the tested turfgrass seed samples, raising the possibility that they could be transmitted by seed, air, water, or soil, and thereafter contribute to the dissemination of plant diseases locally, or introduce them to new areas (Barak and Schroeder 2012; Butler et al. 2001; Gai et al. 2016; Joshi et al. 2004; Zhang and Nan 2014). The fungi Bipolaris sorokiniana, Boeremia exigua, and Claviceps purpurea were detected in all turfgrass seed samples tested with the NGS technique along with the PCR method. Therefore, NGS can be used to detect potential pathogens, followed by PCR confirmation for a robust validation. Acidovorax avenae, a seedborne bacterium, causes diseases in a wide range of economically important plants, including corn, rice, wheat, sugarcane, watermelon, and cantaloupe (Anan et al. 2013; Girard et al. 2014; Walcott and Gitaitis 2000) (Supplementary Table S5). In this study, Acidovorax species were detected by both NGS analysis and PCR methods.

    Bipolaris spp. was isolated and cultured via the fungal pathogen culturing approach (Table 4). In this genus, Bipolaris sorokiniana is a seedborne and soilborne cereal fungus of global concern (Kumar et al. 2002) and is a causal agent of many diseases, including common root rot, leaf spot disease, seedling blight, head blight, and black point of wheat, barley, and other small cereal grains and grasses (Aggarwal et al. 2010; Joshi et al. 2004; Kumar et al. 2002) (Supplementary Table S5). Pathogenic species within the genus Taphrina cause infections in different plants, resulting in galls, leaf curl, and witch’s broom (Selbmann et al. 2014). NGS testing identified Taphrina spp. in 13 of 17 samples in this study (Table 2); however, the genus Taphrina could not be validated by either PCR or culturing methods (Table 4). Taphrina spp. was detected in both Festuca arundinacea and Poa pratensis seed samples by NGS but with lower OTU abundance (Supplementary Table S2). High-throughput sequencing technology can detect rare taxa and has added sensitivity compared with traditional cloning and PCR amplification (Prigigallo et al. 2016). This may be the reason that the PCR and culturing methods did not confirm the NGS detections of the genus Taphrina.

    The contaminated seeds could be distinguished from healthy ones according to traditional morphological characteristics and/or biochemical means (Gutiérrez et al. 2010; Peng et al. 2014). However, it is still inadequate to identify the whole pathogen profile, since the pathogens could be endogenous or present at low titers in the seeds. The pathogens cultivated directly from seed samples expanded the candidate pathogen profiles discovered by the NGS method, indicating that the traditional technique of isolating pathogen colonies is an essential complement to the NGS approach (Table 4). Among the identified pathogens, Alternaria spp. was isolated from the Cd_1 and Pp_1 to Pp_6 samples (Table 3; Supplementary Fig. S4). Alternaria alternata is a fungus causing leaf spot, rot, and blight on >380 host plant species (Pegg et al. 2014). Setosphaeria rostrata is a common plant pathogen causing leaf spot disease on many grass types and is associated with diseases on economically important plants, including corn, rice, sugarcane, sorghum, tomato, and wheat (Kusai et al. 2016). In this study, Setosphaeria rostrata was only isolated from the Cd_1 sample via the culturing method (Table 3; Supplementary Fig. S4). In other words, the NGS and PCR methods failed to detect this pathogen in any samples (Table 4). Another pathogen, Stemphylium vesicarium, was only isolated from one turfgrass seed sample (Tr_1) with the culturing method (Table 3; Supplementary Fig. S4) but was not detected by the other two methods (Table 4). Stemphylium vesicarium is a very common saprophyte on grasses (Ellis 1971) and has been identified as a causal agent of leaf spot in alfalfa and other plants (Basallote-Ureba et al. 1999; Frayssinet 2002; Rossi and Pattori 2009). High-throughput sequencing methods can identify microorganisms in a shorter amount of time than culturing methods. In addition, microbial communities can be inferred by the analysis of group data. 
However, these methods have some weaknesses that can substantially affect the results. The method used to extract DNA, which affects DNA quality, and the primers used for 16S/18S rRNA amplification can produce discrepant results. Furthermore, different bioinformatics analysis methods can yield different results (Lagier et al. 2018). The databases used in data analysis also affect the data annotation and final classification. All of the above suggest that PCR and pathogen culturing could serve as complementary methods to avoid possible spurious identifications by NGS.
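
    The complementarity of the three methods can be summarized by cross-tabulating which methods support each taxon. The sketch below is illustrative only: the sets are a simplified, genus-level reading of a few statements in the text above, not a reproduction of Table 4, and the function names are hypothetical.

```python
# Simplified genus-level detections drawn from statements in the text
# (illustrative subset only; see Table 4 for the actual results).
detections = {
    "NGS": {"Bipolaris", "Boeremia", "Claviceps", "Taphrina"},
    "PCR": {"Bipolaris", "Boeremia", "Claviceps"},
    "culturing": {"Bipolaris", "Setosphaeria", "Stemphylium"},
}


def support_by_taxon(detections):
    """Map each taxon to the set of methods that detected it."""
    support = {}
    for method, taxa in detections.items():
        for taxon in taxa:
            support.setdefault(taxon, set()).add(method)
    return support


def single_method_only(detections):
    """Return taxa supported by exactly one method, with that method."""
    return {
        taxon: next(iter(methods))
        for taxon, methods in support_by_taxon(detections).items()
        if len(methods) == 1
    }


support = support_by_taxon(detections)
unconfirmed = single_method_only(detections)
for taxon in sorted(unconfirmed):
    print(taxon, "detected only by", unconfirmed[taxon])
```

    Taxa supported by a single method are the candidates for possible false positives or false negatives, which is where a second or third technique adds the most value.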

    We also isolated three genera (Bipolaris, Drechslera, and Cladosporium) from four turfgrass seed samples by using the culturing method (Table 3). However, it is difficult to identify these fungi at the species level according to morphological characteristics because of the high degree of phenotypic similarity among species. Since gene-specific primers for Cladosporium spp. and Drechslera spp. could not be obtained, PCR analysis for these two genera was not performed.


    We thank Prof. Xingzhong Liu and Dr. Qian Chen (Institute of Microbiology, Chinese Academy of Sciences) for colony identification.

    The author(s) declare no conflict of interest.

    L.-P. Ban and J.-D. Li contributed equally to this work.

    Funding: This work was supported by grants from the Beijing Food Crops Innovation Consortium (BAIC09-2020), National Natural Science Foundation of China (31872410 and 31971759), and Agricultural Science and Technology Innovation Program (ASTIP-I AS10) of China.

    Current address of Y.-H. Gao: Oxford College, Oxford, GA 30054, U.S.A.