RESOURCE ANNOUNCEMENT

A Hybrid Genome Assembly Resource for Podosphaera xanthii, the Main Causal Agent of Powdery Mildew Disease in Cucurbits

    Authors and Affiliations
    • Álvaro Polonio (1, 2)
    • Luis Díaz-Martínez (1, 2)
    • Dolores Fernández-Ortuño (1, 2)
    • Antonio de Vicente (1, 2)
    • Diego Romero (1, 2)
    • Francisco J. López-Ruiz (3)
    • Alejandro Pérez-García (1, 2)
    1. Departamento de Microbiología, Facultad de Ciencias, Universidad de Málaga, 29071 Málaga, Spain
    2. Instituto de Hortofruticultura Subtropical y Mediterránea ‘La Mayora’, Universidad de Málaga, Consejo Superior de Investigaciones Científicas (IHSM-UMA-CSIC), 29071 Málaga, Spain
    3. Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, WA 6102, Australia


    Podosphaera xanthii is the main causal agent of powdery mildew in cucurbits and, arguably, the most important fungal pathogen of cucurbit crops. Here, we present the first reference genome assembly for P. xanthii. We performed a hybrid genome assembly using reads from Illumina NextSeq 550 and PacBio Sequel S3. The short and long reads were assembled into 1,727 scaffolds with an N50 size of 163,173 bp, resulting in a 142-Mb genome size. Combining homology-based and ab initio predictions yielded 14,911 complete genes. Repetitive sequences comprised 76.2% of the genome. Our P. xanthii genome assembly considerably improves the molecular resources for research on P. xanthii-cucurbit interactions and provides new opportunities for further genomics, transcriptomics, and evolutionary studies in powdery mildew fungi.

    Copyright © 2021 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.

    Genome Announcement

    Powdery mildew fungi (Erysiphales, Ascomycota) are pathogens of increasing concern worldwide. Major crops, including cereals, grapevine, and a number of economically important vegetables and ornamentals, are among their hosts (Panstruga and Schulze-Lefert 2002; Seifi et al. 2014; Takamatsu 2013). Powdery mildew fungi are obligate biotrophic parasites that depend on living host cells to survive (Vogel et al. 2004). Obligate biotrophy in fungi depends on the development of specialized parasitic structures called haustoria, whose main functions are the uptake of nutrients from the host and the release of effectors into plant cells (Martínez-Cruz et al. 2014). Because they are obligate parasites, powdery mildew fungi cannot be grown on artificial culture media, a fact that has significantly hampered research compared with other filamentous plant pathogens (Fernández-Ortuño et al. 2007). Despite significant progress made in recent decades, powdery mildew diseases continue to be among the most important plant pathological problems worldwide.

    Cucurbits are among those economically important vegetables affected by powdery mildews. The disease in cucurbits can be caused by either Podosphaera xanthii or Golovinomyces cichoracearum, species that induce identical symptoms but can be easily distinguished by light microscopy (del Pino et al. 2002). Nevertheless, it is widely accepted that P. xanthii is the main causal agent of powdery mildew in cucurbits and one of the most important limiting factors for cucurbit production worldwide (Bellón-Gómez et al. 2015; Fernández-Ortuño et al. 2006). In recent years, considerable effort has been invested in deciphering the molecular bases of P. xanthii pathogenesis. This has produced fundamental resources such as the epiphytic and the haustorial transcriptomes (De Miccolis Angelini et al. 2019; Polonio et al. 2019; Vela-Corcía et al. 2016), and specific tools for the functional analysis of P. xanthii genes such as transformation and RNAi silencing protocols (Martínez-Cruz et al. 2017, 2018a, b). Despite this substantial progress, further research on P. xanthii is still needed to uncover novel genes and pathways that will allow us to understand the singularities of this important pathogen of cucurbits. In this study, we present the first draft genome of P. xanthii, which was obtained by a hybrid genome assembly approach.

    The P. xanthii isolate 2086 was cultured on previously disinfected zucchini cotyledons (Cucurbita pepo L. ‘Negro Belleza’) (Semillas Fitó, Barcelona, Spain) and maintained in vitro in 8-cm Petri dishes with Bertrand medium under a cycle of 16 h of light and 8 h of darkness at 22°C for 10 days (Fernández-Ortuño et al. 2006). The isolate was reidentified as P. xanthii, first by light microscopy (del Pino et al. 2002) before the isolation of genomic DNA and, subsequently, by BLAST analysis of the internal transcribed spacer sequences detected in the genome assembly (100.00% identity with MT242593.1 and other sequences). Genomic DNA was isolated from P. xanthii epiphytic mycelia and conidia, which were collected from powdery mildew-infected zucchini cotyledons and ground with liquid nitrogen in a previously chilled mortar, using the MasterPure Yeast DNA Purification Kit (Lucigen, Middleton, WI, U.S.A.) according to the manufacturer’s recommendations. Quality control of DNA samples was carried out using a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, U.S.A.), 0.6% agarose gels, the Qubit double-stranded DNA High Sensitivity (HS) Assay Kit in a Qubit fluorometer, and expert HS DNA Assays in an Agilent 2100 Bioanalyzer electrophoresis system (Agilent, Santa Clara, CA, U.S.A.).

    The extracted DNA was sequenced to obtain both short and long reads using two sequencing platforms, Illumina NextSeq 550 (Illumina, San Diego, CA, U.S.A.) and PacBio Sequel S3 (Pacific Biosciences of California, Inc., Menlo Park, CA, U.S.A.). For Illumina, a ready-to-sequence library was constructed using the Nextera XT DNA Library Preparation Kit (Illumina) and sequenced in a mid-output run with 300 cycles (2 × 150 paired ends) to obtain approximately 300× coverage, yielding a total of 137,231,749 paired reads. For PacBio, a 10-kb insert library was constructed using the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences) and sequenced using 1 SMRT cell to obtain a total of 1,787,490 subreads (mean read length of 8,256 bp). This strategy led to approximately 15× coverage of circular consensus sequences (ccs), that is, highly accurate consensus sequences ready for assembly after error correction, corresponding to 185,839 ccs (mean read length of 11,031.56 bp and longest read of 57,662 bp).
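
    The coverage figures above can be roughly cross-checked with the usual relation coverage = (number of reads × mean read length) / genome size. The sketch below is illustrative only and rests on two assumptions not stated in the text: that "137,231,749 paired reads" counts read pairs (two 150-bp reads each), and that the final assembly size of 142.11 Mb is used as the genome size. The function name is hypothetical.

```python
def fold_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Average depth of coverage: total sequenced bases / genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


# Assumption: the final assembly size stands in for the true genome size.
GENOME_SIZE_BP = 142.11e6

# Assumption: each of the 137,231,749 pairs contributes two 150-bp reads.
illumina_cov = fold_coverage(137_231_749 * 2, 150, GENOME_SIZE_BP)

# PacBio ccs: 185,839 reads with a mean length of 11,031.56 bp.
ccs_cov = fold_coverage(185_839, 11_031.56, GENOME_SIZE_BP)

print(f"Illumina: {illumina_cov:.1f}x, PacBio ccs: {ccs_cov:.1f}x")
```

    Under these assumptions the values come out close to the approximate 300× and 15× reported; using the smaller k-mer-based size estimate instead would raise both figures.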

    Genome size was estimated from the Illumina paired-end reads. Before the analysis, the Illumina raw reads were trimmed using SeqtrimNext v2.1.3 (Falgueras et al. 2010) to remove low-quality and low-complexity sequences, adapters, and contaminating sequences such as bacterial or host sequences, using the zucchini reference genome (Montero-Pau et al. 2018) to identify the latter. The k-mer histogram method, based on a k-mer distribution with a k value of 21, was run with KmerGenie v1.7051 (Chikhi and Medvedev 2014) on the preprocessed reads. The main peak obtained corresponded to a k value of 121, which was used to estimate the genome size, giving a predicted assembly size of 125.58 Mb.
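
    The general idea behind k-mer histogram methods can be sketched as follows. This is a simplified, generic peak-based heuristic and does not reproduce KmerGenie's statistical model: k-mers are counted, the low-abundance k-mers left of the first valley are treated as likely sequencing errors, and the genome size is approximated as the remaining k-mer total divided by the depth of the main peak. Function names and the toy histogram are hypothetical.

```python
from collections import Counter


def kmer_abundance_histogram(reads, k):
    """Map abundance -> number of distinct k-mers seen that many times."""
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return Counter(counts.values())


def estimate_genome_size(histogram):
    """Approximate genome size as (genomic k-mer total) / (main peak depth)."""
    depths = sorted(histogram)
    # Walk right from the lowest abundance until counts start rising again;
    # that valley separates likely error k-mers from genomic k-mers.
    valley = depths[0]
    for prev, cur in zip(depths, depths[1:]):
        if histogram[cur] > histogram[prev]:
            valley = prev
            break
    genomic = {d: n for d, n in histogram.items() if d >= valley}
    peak_depth = max(genomic, key=genomic.get)
    total_kmers = sum(d * n for d, n in genomic.items())
    return total_kmers / peak_depth


# Toy histogram: an error tail at abundances 1-3 and a genomic peak at 20.
toy = {1: 5000, 2: 600, 3: 50, 18: 100, 19: 200, 20: 400, 21: 200, 22: 100}
print(estimate_genome_size(toy))
```

    On real data the histogram would come from a dedicated k-mer counter rather than a Python dictionary, and heterozygosity or repeats add further peaks that this sketch ignores.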

    The pipeline used to assemble the P. xanthii genome and predict its genes is shown in Figure 1. The genome was assembled from the preprocessed Illumina paired-end reads combined with the PacBio ccs. Several available tools were tested for this hybrid assembly. The best results were obtained with MaSuRCA v3.3.4 (Zimin et al. 2013), probably because this software performs well with low-coverage long reads. Illumina paired-end reads were extended into superreads and mapped to PacBio reads, resulting in megareads. These megareads were converted into contigs and, finally, the paired-end reads were reused for scaffolding and gap repair using the Flye module included in MaSuRCA. An additional polishing step was then performed with one round of Pilon v1.23 (Walker et al. 2014) to fill scaffold gaps, using preprocessed reads mapped to the assembly with BWA v0.7.17 (Li and Durbin 2009). Assembly statistics were computed using QUAST v5.0.2 (Gurevich et al. 2013): the assembly comprised 1,727 scaffolds with a final total size of 142.11 Mb, somewhat larger than the KmerGenie size estimate (125.58 Mb). The N50 contig length was 163,173 bp and the largest contig obtained was 947,834 bp (Table 1).
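
    For readers unfamiliar with the N50 statistic reported above, a minimal sketch of its usual definition follows: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and toy lengths are illustrative; this is not QUAST's own code.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Accumulate lengths from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")


# Toy example: total length 32, so the two 8-bp sequences already cover half.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # prints 8
```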


    Fig. 1. Pipelines of the Podosphaera xanthii draft genome assembly and gene prediction.


    Table 1. Core metrics of the draft genome sequence of Podosphaera xanthii

    The assembled genome of P. xanthii was analyzed for repetitive elements prior to gene prediction. To do this, a combination of de novo and homology-based approaches was applied using RepeatModeler v1.0.11 (Flynn et al. 2020) and RepeatMasker v4.0.7 (Tarailo-Graovac and Chen 2009). A large number of repeats were identified, representing up to 76.16% of the genome (Table 1). Gene prediction from the repeat-masked genome was performed on the GenSAS v6.0 platform (Humann et al. 2019), combining ab initio and homology-based gene predictions. For homology-based prediction, BLAST+ v2.7.1 (Camacho et al. 2009) and DIAMOND v0.9.22 (Buchfink et al. 2015) were used. For ab initio gene prediction, raw reads and cDNA sequences from the epiphytic and haustorial transcriptomes of P. xanthii (De Miccolis Angelini et al. 2019; Polonio et al. 2019) and a complete set of protein sequences from Blumeria graminis (Spanu et al. 2010) were used to train Augustus v3.3.1 (Stanke et al. 2008) to predict P. xanthii gene models. Augustus identified a total of 22,072 hypothetical genes. Then, SNAP (Korf 2004) and BRAKER v2.1.1 (Hoff et al. 2019) were used to compare and refine the results obtained by Augustus, resulting in a total of 28,014 and 19,590 hypothetical genes, respectively. Finally, EvidenceModeler v1.1.1 (Haas et al. 2008) was used to combine the homology-based predictions and the ab initio predictions of Augustus, SNAP, and BRAKER into weighted consensus gene structures, yielding 16,030 predicted genes. Among them, 14,911 (93.02%) were complete genes with standard start and stop codons (Table 1).
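
    As an illustration of what "complete genes with standard start and stop codons" might mean in practice, the sketch below checks a coding sequence for an ATG start, a TAA/TAG/TGA stop, and a length that is a multiple of three, then recomputes the reported percentage. The exact criteria applied by the annotation pipeline are not given in the text, so these rules and the function names are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def is_complete_cds(cds):
    """True if the CDS is in frame, starts with ATG and ends with a stop codon."""
    cds = cds.upper()
    return (
        len(cds) >= 6
        and len(cds) % 3 == 0
        and cds.startswith("ATG")
        and cds[-3:] in STOP_CODONS
    )


def percent_complete(n_complete, n_total):
    """Share of complete gene models, as a percentage rounded to two decimals."""
    return round(100 * n_complete / n_total, 2)


# Counts reported in the text: 14,911 complete genes out of 16,030 predicted.
print(percent_complete(14_911, 16_030))  # prints 93.02
```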

    The number of genes predicted in the P. xanthii genome assembly is higher than previously described for the genomes of other monocot- and dicot-infecting powdery mildews, which range from 5,854 to 8,470 genes (Barsoum et al. 2019). However, the recently published genome of the sweet pepper powdery mildew Leveillula taurica has a predicted gene set of 19,751 (Kusch et al. 2020), similar to that of P. xanthii and of other obligate biotrophs such as rust fungi, whose genomes have predicted gene numbers ranging from 13,364 to 28,801 (Barsoum et al. 2019). All of this suggests that, in some dicot powdery mildews, the number of species-specific genes encoded in the genome appears to be higher than in other powdery mildew fungi.

    The completeness of the P. xanthii genome was also assessed as a first evaluation of the quality of the genome assembly. For this purpose, BUSCO v3.0.2 (Simão et al. 2015) was run against the Ascomycota lineage dataset (ascomycota_odb9), which contains a total of 1,315 ortholog groups, with Aspergillus nidulans as the default model species for Augustus gene predictions. The results showed that a high percentage of Ascomycota core genes were present in the P. xanthii genome (94.5% as complete and single copy and 1.2% as fragmented), an indicator of the good quality of the genome assembly (Table 1).

    To conclude, we generated the first draft genome assembly and ab initio and homology-based gene models for P. xanthii. To date, the best-studied powdery mildew genomes are those of the barley and wheat powdery mildew pathogens (Frantzeskakis et al. 2018; Müller et al. 2019; Spanu et al. 2010; Wicker et al. 2013; Wu et al. 2018). To our knowledge, this is the first dicot powdery mildew genome produced using a hybrid assembly approach. This genome considerably improves the resources available for research on P. xanthii and will likely provide the necessary framework for further investigations on the genomics, transcriptomics, and evolutionary biology of powdery mildew fungi.

    Data availability.

    This genome project is indexed at the GenBank database of NCBI under the Bioproject accession number PRJNA646981. All sequencing data were deposited at the Sequence Read Archive database of NCBI under the accession numbers SRR12260826 (PacBio ccs long reads) and SRR12260825 (Illumina NextSeq550 raw short reads), and the draft genome assembly of P. xanthii under the accession number JACSEY000000000.


    This study would not have been possible without the computer resources and the technical support provided by the Plataforma Andaluza de Bioinformática of the University of Malaga and, especially, R. Bautista. We thank I. Linares (University of Malaga, Spain) for her technical assistance and J. Gómez from Centro de Bioinnovación (University of Malaga) for the excellent technical support provided in the Illumina sequencing process.

    Author-Recommended Internet Resources

    The Genome Sequence Annotation Server:

    SeqtrimNext v2.1.3:

    The author(s) declare no conflict of interest.


    A. Polonio and L. Díaz-Martínez contributed equally to this work.


    Funding: This study was supported by a grant from the Agencia Estatal de Investigación (AEI) (AGL2016-76216-C2-1-R), co-financed by FEDER funds (European Union).