ONT-Based Draft Genome Assembly and Annotation of Alternaria atra
- Bhawna Bonthala1
- Corinn S. Small1
- Maximilian A. Lutz1
- Alexander Graf2
- Stefan Krebs2
- German Sepúlveda3
- Remco Stam1 †
- 1Chair of Phytopathology, School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- 2Gene Centre Munich, Laboratory for Functional Genome Analysis, Ludwig-Maximilians-University, Munich, Germany
- 3Departmento de recursos Ambientales, Facultad de Ciencias Agronómicas, Universidad de Tarapacá, Arica, Chile
Species of Alternaria (phylum Ascomycota, family Pleosporaceae) are known as serious plant pathogens, causing major losses on a wide range of crops. Alternaria atra (previously known as Ulocladium atrum) can grow as a saprophyte on many hosts and causes Ulocladium blight on potato. It has been reported that it can also be used as a biocontrol agent against Botrytis cinerea. Here, we present a scaffold-level reference genome assembly for A. atra. The assembly contains 43 scaffolds with a total length of 39.62 Mbp, with scaffold N50 of 3,893,166 bp, L50 of 4, and the longest 10 scaffolds containing 89.9% of the assembled data. RNA-sequencing-guided gene prediction using BRAKER resulted in 12,173 protein-coding genes with their functional annotation. This first high-quality reference genome assembly and annotation for A. atra can be used as a resource for studying evolution in the highly complicated Alternaria genus and might help in understanding the mechanisms defining its role as pathogen or biocontrol agent.
Copyright © 2021 The Author(s). This is an open access article distributed under the CC BY 4.0 International license.
The fungal genus Alternaria includes endophytic, pathogenic, and saprophytic species that are ubiquitous in nature. They can cause a wide variety of diseases on both fruit and vegetables in the field and at the postharvest stage (Scott 2001; Thomma 2003). The phylogeny of Alternaria spp. is particularly complex, and its resolution is hampered by the highly similar morphology of closely related species and the inability to resolve monophyletic trees with simple barcodes (Simmons 2007; Woudenberg et al. 2013, 2014, 2015). Alternaria atra (Preuss) Woudenb. & Crous (previously known as Ulocladium atrum) (Woudenberg et al. 2013) is one of those globally occurring Alternaria spp. It causes Ulocladium blight on potato in large parts of the world (Esfahani 2018), yet it has been reported to have biocontrol potential as a saprophyte on different crops against Botrytis cinerea and Sclerotinia sclerotiorum (Boff et al. 2001; Elad et al. 1994; Li et al. 2003; Ronseaux et al. 2013). Moreover, the species can be found on many wild plant species with various degrees of symptoms. Genomic resources for the Alternaria genus are limited, and no high-quality reference genome exists for A. atra. Here, we present a scaffold-level reference assembly and gene annotation for A. atra isolate CS162. These data will help future studies on the phylogeny, genetic diversity, and biology of this intriguing fungal genus.
A. atra isolate CS162 was collected from the wild tomato species Solanum chilense in northern Chile near a canyon riverbed, close to the Bolivian border. We confirmed the identity of the A. atra isolate by analyzing the sequences of the conserved genes Alt1a and RPB2 (Woudenberg et al. 2015) using BLASTn. The isolate was purified and grown on potato dextrose agar medium, and whole genomic DNA was isolated from mycelia of 7-day-old cultures using phenol/chloroform-based extraction. Purified high-molecular-weight genomic DNA (2 μg) was sent for Oxford Nanopore custom sequencing. The sequencing was run on a MinION R9 flow cell; the run produced 2,105,684 reads amounting to 4.06 Gb and corresponding to a coverage of 104×. The de novo assembly using wtdbg2 (Ruan and Li 2020) generated a total of 87 contigs, with the largest contig containing 6,982,780 bases.
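The relationship between read yield, genome size, and coverage can be checked with a short calculation. The sketch below is only an illustration, assuming that the 4.06 figure denotes sequenced bases and that the final assembly size of 39.62 Mb is used as the genome size; the function and variable names are hypothetical. Under these assumptions the estimate is roughly 102×, slightly below the reported 104×. The text does not state which genome size was used for the reported value, so a different denominator may explain the gap.

```python
def mean_coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: sequenced bases divided by genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Values taken from the text; treating 4.06 as gigabases is an assumption.
READ_COUNT = 2_105_684
TOTAL_BASES = 4.06e9
# Assumption: the final assembly length stands in for the true genome size.
ASSEMBLY_SIZE = 39.62e6

coverage = mean_coverage(TOTAL_BASES, ASSEMBLY_SIZE)
mean_read_length = TOTAL_BASES / READ_COUNT

print(f"Estimated coverage: {coverage:.1f}x")
print(f"Mean read length: {mean_read_length:.0f} bp")
```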
Contigs were scaffolded into 43 scaffolds using SLR (Luo et al. 2019) and polished using Pilon (Walker et al. 2014) with 4.2 million trimmed paired-end (PE) Illumina reads obtained from the same genomic DNA. The final A. atra genome assembly had a size of 39.62 Mb. The quality of the assembled data was assessed using QUAST (Gurevich et al. 2013) (Table 1). The assembled sequence had an N50 of 3.89 Mb and an L50 of 4, with an average GC content of 50.87%. The genome comprises 8.02% transposable elements, as identified using the RepeatModeler pipeline (Flynn et al. 2020). The completeness of the assembled genome was estimated using benchmarking universal single-copy ortholog (BUSCO) software v5.0.0 (Seppey et al. 2019) and is 98.8% (Table 1). Of the 758 BUSCO groups searched, the assembly contained 749 complete orthologs (705 single-copy and 44 duplicated) and 9 missing orthologs, and no fragmented BUSCO was observed. Similar relatively high numbers of duplicated BUSCO groups were also reported in other Alternaria genomes (Bihon et al. 2016; Feng et al. 2021).
BRAKER v2.1.5 (Brůna et al. 2020), a combination of the GeneMark-ET (Lomsadze et al. 2014) and AUGUSTUS (Stanke et al. 2006, 2008) annotation tools, was used for gene prediction. AUGUSTUS used 51.86 million high-quality paired-end RNA-seq reads as extrinsic evidence for gene prediction and identified 12,173 genes encoding 12,228 proteins, with a BUSCO score of 98.8%.
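The N50, L50, and BUSCO figures reported above follow from simple definitions, sketched here for illustration. This is not the QUAST or BUSCO implementation. The function names are hypothetical, and the scaffold lengths in the first example are toy values, not the A. atra scaffolds. The BUSCO example sums the reported categories (705 single-copy, 44 duplicated, 0 fragmented, 9 missing), which gives 758 groups in total and 749 complete orthologs, consistent with the stated completeness of 98.8%.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the assembly;
    L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: cumulative sum always reaches half")


def busco_summary(single: int, duplicated: int, fragmented: int,
                  missing: int) -> Tuple[int, int, float]:
    """Return (total groups, complete groups, percent complete)."""
    total = single + duplicated + fragmented + missing
    complete = single + duplicated
    return total, complete, round(100 * complete / total, 1)


# Toy scaffold lengths (hypothetical), chosen so the result is easy to verify:
# total = 30, half = 15, cumulative sums are 10, 18, so N50 = 8 and L50 = 2.
toy_n50, toy_l50 = n50_l50([2, 10, 4, 8, 6])

# BUSCO categories as reported in the text.
busco = busco_summary(single=705, duplicated=44, fragmented=0, missing=9)

print(toy_n50, toy_l50)
print(busco)
```

Applied to the real scaffold lengths, for example read from a FASTA index, the same function should reproduce the reported N50 of 3,893,166 bp and L50 of 4.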
Gene ontology terms were associated with 1,872 transcriptional genes and 1,193 transmembrane proteins (Supplementary File S1). Protein domains were searched for in the Pfam database (Mistry et al. 2007); 241 predicted proteins containing domains for fungal-specific transcription genes and 41 Laminin proteins used for adhesion of fungal conidia to the host were identified. The A. atra proteome contained 1,479 putative secreted proteins, 1,330 signal-peptide-containing proteins, and 1,148 potential effector proteins (Supplementary File S1), as predicted by TargetP-2.0 (Emanuelsson et al. 2007), SignalP-5.0 (Petersen et al. 2011), and EffectorP-2.0 (Sperschneider et al. 2016), respectively. Furthermore, 128 unique CAZy families were identified using conserved unique peptide patterns (CUPP) (Barrett and Lange 2019), which assigned 635 CUPP groups with 180 unique Enzyme Commission (EC) numbers specifying enzymatic activity. Using BLAST high-scoring segment pairs (HSP), 23 gene clusters responsible for the production of the pathogenicity factor Destruxin-B, an important secondary metabolite produced by pathogenic Alternaria spp. (Rajarammohan et al. 2019), were also identified (Supplementary File S1). The Pfam domain output showed that 17 TBC-associated domains and 100 fungus-specific HET domains were present. Proteins with these domains are used by fungi to inactivate specific membrane-trafficking processes and can ultimately lead to the death of host cells (Gabernet-Castello et al. 2013; Paoletti and Clavé 2007). Six toxin gene families (HicA toxin, HigB-like toxin, ParE-like toxin, Toxin YhaV, YafO toxin, and YdaT toxin) were identified by Pfam. The domains of some other toxins (CDtoxinA, Chi-conotoxin, Conotoxin, Endotoxin_N, Fst_toxin, MazE_antitoxin, ParD_antitoxin, VapB_antitoxin, and Zeta_toxin) were also present in the A. atra proteome (Supplementary File S1). Identification of these toxins will facilitate genome comparisons within the species and enhance our understanding of the molecular mechanisms underlying the pathogenicity and host specificity of this fungal pathogen.
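Domain tallies such as those given above can be derived from a table of per-protein Pfam hits. The sketch below is a hypothetical illustration and not the authors' pipeline: the protein identifiers and hits are invented, and the input is assumed to be already parsed into (protein, domain) pairs. It reports both the number of domain hits and the number of distinct proteins carrying each domain, because the two can differ when a protein contains repeated domains.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical (protein_id, pfam_domain_name) pairs; not real A. atra results.
hits: List[Tuple[str, str]] = [
    ("g1.t1", "HET"),
    ("g1.t1", "HET"),          # the same protein can carry a domain twice
    ("g2.t1", "RabGAP-TBC"),
    ("g3.t1", "Zeta_toxin"),
    ("g4.t1", "HET"),
    ("g5.t1", "Pkinase"),      # not a domain of interest, so it is ignored
]


def tally_domains(pairs: Iterable[Tuple[str, str]],
                  domains_of_interest: Iterable[str]
                  ) -> Dict[str, Tuple[int, int]]:
    """Map each domain of interest to (hit count, distinct protein count)."""
    wanted = list(domains_of_interest)
    hit_counts: Counter = Counter()
    proteins = {domain: set() for domain in wanted}
    for protein_id, domain in pairs:
        if domain in proteins:
            hit_counts[domain] += 1
            proteins[domain].add(protein_id)
    return {d: (hit_counts[d], len(proteins[d])) for d in wanted}


summary = tally_domains(hits, ["HET", "RabGAP-TBC", "Zeta_toxin"])
for domain, (n_hits, n_proteins) in summary.items():
    print(f"{domain}: {n_hits} hits in {n_proteins} proteins")
```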
We generated the first scaffold-level genome of the ubiquitous plant-associated fungus A. atra using Oxford Nanopore long-read and Illumina short-read data, with RNA-sequencing-driven gene prediction and functional annotation of factors with high relevance for pathogenicity. This draft genome report will provide useful information for phylogenetic studies and functional genome comparisons among the most important plant pathogens, endophytes, and saprophytes belonging to the Alternaria genus.
The sequencing data sets produced for this study are deposited at the EBI European Nucleotide Archive (ENA) under the project reference PRJEB42493. FASTA files for the genome, coding sequences, and protein sequences and the GTF files, as well as all result files from the functional annotation, are also available at Zenodo.
We thank R. Salvatierra Martínez for support during pathogen sampling in Chile, F. Diaz and other members of the Rodriguez Guittierez lab (Ponteficia Universidad Catolica, Santiago) for helping us with essential sampling preparations in their labs, R. Dittebrandt for help with fungal propagation for DNA and RNA extractions, and T. Schmey for help with RNA extraction. Bioinformatics analyses were performed on the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) (031A537B, 031A533A, 031A538A, 031A533B, 031A535A, 031A537C, 031A534A, and 031A532B).
The author(s) declare no conflict of interest.
- 2019. Peptide-based functional annotation of carbohydrate-active enzymes by conserved unique peptide patterns (CUPP). Biotechnol. Biofuels 12:102-123. https://doi.org/10.1186/s13068-019-1436-5
- 2016. Draft genome sequence of Alternaria alternata isolated from onion leaves in South Africa. Genome Announce. 4:e01022-16. https://doi.org/10.1128/genomeA.01022-16
- 2001. Conidial persistence and competitive ability of the antagonist Ulocladium atrum on strawberry leaves. Biocontrol Sci. Technol. 11:623-636. https://doi.org/10.1080/09583150120076175
- 2020. BRAKER2: Automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. bioRxiv.
- 1994. Control of infection and sporulation of Botrytis cinerea on bean and tomato by saprophytic bacteria and fungi. Eur. J. Plant Pathol. 100:315-336. https://doi.org/10.1007/BF01876443
- 2007. Locating proteins in the cell using TargetP, SignalP and related tools. Nat. Protoc. 2:953-971. https://doi.org/10.1038/nprot.2007.131
- 2018. Identification of Ulocladium atrum causing potato leaf blight in Iran. Phytopathol. Mediterr. 57:112-114.
- 2021. Draft genome sequence of cumin blight pathogen Alternaria burnsii. Plant Dis. 105:1165-1167. https://doi.org/10.1094/PDIS-02-20-0224-A
- 2020. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl. Acad. Sci. U.S.A. 117:9451-9457. https://doi.org/10.1073/pnas.1921046117
- 2013. Evolution of Tre-2/Bub2/Cdc16 (TBC) Rab GTPase-activating proteins. Mol. Biol. Cell 24:1574-1583. https://doi.org/10.1091/mbc.e12-07-0557
- 2013. QUAST: Quality assessment tool for genome assemblies. Bioinformatics 29:1072-1075. https://doi.org/10.1093/bioinformatics/btt086
- 2003. Antagonism and biocontrol potential of Ulocladium atrum on Sclerotinia sclerotiorum. Biol. Control 28:11-18. https://doi.org/10.1016/S1049-9644(03)00050-1
- 2014. Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Res. 42:e119. https://doi.org/10.1093/nar/gku557
- 2019. SLR: A scaffolding algorithm based on long reads and contig classification. BMC Bioinf. 20:539. https://doi.org/10.1186/s12859-019-3114-9
- 2007. Predicting active site residue annotations in the Pfam database. BMC Bioinf. 8:298. https://doi.org/10.1186/1471-2105-8-298
- 2007. The fungus-specific HET domain mediates programmed cell death in Podospora anserina. Eukaryot. Cell 6:2001-2008. https://doi.org/10.1128/EC.00129-07
- 2011. SignalP 4.0: Discriminating signal peptides from transmembrane regions. Nat. Methods 8:785-786. https://doi.org/10.1038/nmeth.1701
- 2019. Comparative genomics of Alternaria species provides insights into the pathogenic lifestyle of Alternaria brassicae—A pathogen of the Brassicaceae family. BMC Genomics 20:1036. https://doi.org/10.1186/s12864-019-6414-6
- 2020. Fast and accurate long-read assembly with wtdbg2. Nat. Methods 17:155-158. https://doi.org/10.1038/s41592-019-0669-3
- 2001. Analysis of agricultural commodities and foods for Alternaria mycotoxins. J. AOAC Int. 84:1809-1817. https://doi.org/10.1093/jaoac/84.6.1809
- 2013. Interaction of Ulocladium atrum, a potential biological control agent, with Botrytis cinerea and grapevine plantlets. Agronomy (Basel) 3:632-647. https://doi.org/10.3390/agronomy3040632
- 2019. BUSCO: Assessing genome assembly and annotation completeness. Methods Mol. Biol. 1962:227-245. https://doi.org/10.1007/978-1-4939-9173-0_14
- 2007. Alternaria. An identification Manual. CBS Biodiversity Series 6. CBS Fungal Biodiversity Centre, Utrecht, The Netherlands.
- 2016. EffectorP: Predicting fungal effector proteins from secretomes using machine learning. New Phytol. 210:743-761. https://doi.org/10.1111/nph.13794
- 2008. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24:637-644. https://doi.org/10.1093/bioinformatics/btn013
- 2006. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinf. 7:62. https://doi.org/10.1186/1471-2105-7-62
- 2003. Alternaria spp.: From general saprophyte to specific parasite. Mol. Plant Pathol. 4:225-236. https://doi.org/10.1046/j.1364-3703.2003.00173.x
- 2014. Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. https://doi.org/10.1371/journal.pone.0112963
- 2013. Alternaria redefined. Stud. Mycol. 75:171-212. https://doi.org/10.3114/sim0015
- 2015. Alternaria section Alternaria: Species, formae speciales or pathotypes? Stud. Mycol. 82:1-21. https://doi.org/10.1016/j.simyco.2015.07.001
- 2014. Large-spored Alternaria pathogens in section Porri disentangled. Stud. Mycol. 79:1-47. https://doi.org/10.1016/j.simyco.2014.07.003
Funding: This project was funded by the German Science Foundation (Deutsche Forschungsgemeinschaft, grant 403835372) and the Technische Universität München Seed Fund.