Resource Announcement

Genome Sequence of the Chestnut Blight Fungus Cryphonectria parasitica EP155: A Fundamental Resource for an Archetypical Invasive Plant Pathogen

    Authors and Affiliations
    • Jo Anne Crouch1
    • Angus Dawe2
    • Andrea Aerts3
    • Kerrie Barry3
    • Alice C. L. Churchill4
    • Jane Grimwood5
    • Bradley I. Hillman6
    • Michael G. Milgroom4
    • Jasmyn Pangilinan3
    • Myron Smith7
    • Asaf Salamov3
    • Jeremy Schmutz3 5
    • Jagjit S. Yadav8
    • Igor V. Grigoriev3 9
    • Donald L. Nuss10 11
    1. Mycology and Nematology Genetic Diversity and Biology Laboratory, United States Department of Agriculture–Agricultural Research Service, 10300 Baltimore Avenue, Building 010A, Beltsville, MD, U.S.A.
    2. Department of Biological Sciences, Mississippi State University, 295 Lee Boulevard, Mississippi State, MS, U.S.A.
    3. United States Department of Energy Joint Genome Institute, Walnut Creek, CA, U.S.A.
    4. School of Integrative Plant Science, Plant Pathology and Plant-Microbe Biology Section, Cornell University, Ithaca, NY, U.S.A.
    5. HudsonAlpha Institute for Biotechnology, Huntsville, AL, U.S.A.
    6. Department of Plant Biology, Rutgers University, 59 Dudley Road, New Brunswick, NJ, U.S.A.
    7. Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada.
    8. Environmental Genetics and Molecular Toxicology Division, Department of Environmental Health, University of Cincinnati College of Medicine, Cincinnati, OH, U.S.A.
    9. Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, U.S.A.
    10. Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, U.S.A.
    11. Division of Plant and Soil Sciences, West Virginia University, Morgantown, WV, U.S.A.


    Cryphonectria parasitica is the causal agent of chestnut blight, a fungal disease that almost entirely eliminated mature American chestnut from North America over a 50-year period. Here, we formally report the genome of C. parasitica EP155 using a Sanger shotgun sequencing approach. After finishing and integration with simple-sequence repeat markers, the assembly was 43.8 Mb in 26 scaffolds (L50 = 5; N50 = 4.0 Mb). Eight chromosomes are predicted: five scaffolds have two telomeres and six scaffolds have one telomere sequence. In total, 11,609 gene models were predicted, of which 85% show similarities to other proteins. This genome resource has already increased the utility of a fundamental plant pathogen experimental system through new understanding of the fungal vegetative incompatibility system, with significant implications for enhancing mycovirus-based biological control.

    Plant disease epidemics have had profound ecological and economic consequences and have significantly influenced human history. One of the most momentous disease epidemics was chestnut blight, which completely changed the ecological landscape of the hardwood forests of the eastern United States during the 20th century (Anagnostakis 1988). The causal agent, the ascomycete fungus Cryphonectria parasitica, found a very susceptible host in the American chestnut tree, Castanea dentata, following its introduction into North America on nursery stock of resistant Asian chestnut tree species. The resulting disease epidemic, first identified in 1904 in the Bronx Zoo, spread rapidly, resulting in the destruction of an estimated four billion mature American chestnut trees in the following 50 years (Anagnostakis 1988). Although the root systems of infected trees often survive and resprout, the new sprouts remain susceptible to endemic C. parasitica, so the once dominant tree now survives throughout its former natural range primarily as an understory shrub (Dalgleish et al. 2015).

    The mechanisms underlying the ability of C. parasitica to effectively penetrate defense barriers and rapidly expand in the cambium tissues of susceptible hosts remain ill defined, with no identified role for toxins, specific secondary metabolites (SM), or hydrolytic enzymes. In this article, we report the C. parasitica strain EP155 genome sequence as an important resource for elucidating the genomic basis for the selective pathogenicity, niche-associated evolution, and virus-based biological control of this classic forest pathogen. C. parasitica EP155 (ATCC 38755) is an orange-pigmented, virulent, hypovirus-free strain (vc type EU-5, MAT1-2) originally isolated in 1977 from a canker on Castanea dentata in Bethany, CT, U.S.A. (Anagnostakis and Day 1979). The fungus was grown on potato dextrose agar overlaid with cellophane at room temperature on the laboratory bench for approximately 7 days; mycelium and conidia were scraped from the cellophane using a sterile razor blade, allowed to air dry under a laminar flow hood, then pulverized using liquid nitrogen in a mortar and pestle. DNA was extracted as described (Choi et al. 2012).

    Genome Sequencing and Assembly

    All sequencing reads were collected with standard Sanger sequencing protocols on ABI 3730XL capillary sequencing machines (Thermo Fisher Scientific, Waltham, MA, U.S.A.) at the United States Department of Energy Joint Genome Institute (JGI) in Walnut Creek, CA. The Cryphonectria parasitica EP155 genome was sequenced using a Sanger whole-genome shotgun approach from paired-end sequencing reads of plasmid and fosmid libraries at a coverage of approximately 8.54×. Three different-sized libraries were used as templates for plasmid and fosmid subclone sequencing, with both ends sequenced as follows: 332,747 reads from a 2.3-kb-sized plasmid library, 265,247 reads from a 6.8-kb-sized plasmid library, and 107,327 reads from a 39.3-kb-sized fosmid library. Reads were assembled using a modified version of Arachne v.20071016 (Jaffe et al. 2003) with parameters maxcliq1 = 100, correct1_passes = 0, and BINGE_AND_PURGE = True. After trimming for vector and quality, the genome was assembled into 39 main scaffolds spanning 43.9 Mb (version 1), with the 11 largest scaffolds containing 90% of the genome sequence.
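
    As a rough consistency check on the figures above, sequencing depth can be related to read count, mean read length, and genome size. The sketch below is illustrative only: the mean read length is not reported in the text, so it is back-calculated from the stated read counts, the approximately 8.54× coverage, and the 43.9-Mb assembly size. The function names are hypothetical and are not part of the JGI workflow.

```python
# Paired-end read counts per library, as reported in the text.
READS = {
    "plasmid_2.3kb": 332_747,
    "plasmid_6.8kb": 265_247,
    "fosmid_39.3kb": 107_327,
}


def coverage(total_reads: int, mean_read_len: float, genome_size: float) -> float:
    """Average depth = total sequenced bases / genome size."""
    return total_reads * mean_read_len / genome_size


def implied_read_length(total_reads: int, depth: float, genome_size: float) -> float:
    """Mean read length needed to reach a given depth (inverse of coverage())."""
    return depth * genome_size / total_reads


total_reads = sum(READS.values())  # 705,321 reads across the three libraries

# Assumption: 8.54x depth over the 43.9-Mb version 1 assembly.
mean_len = implied_read_length(total_reads, 8.54, 43.9e6)
```

    Under these assumptions, the implied mean usable read length is a little over 500 bases, which is plausible for quality-trimmed Sanger reads; the value is a derived estimate, not a figure reported in the source.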

    Genome Finishing and Map Integration

    The initial whole-genome shotgun assembly was broken down into scaffolds and each scaffold piece was reassembled with Phrap, then manually improved and finished using Consed (Gordon 2003). All low-quality regions and gaps were targeted with computationally selected Sanger sequencing reactions completed with 4:1 BigDye terminator: dGTP chemistry (Thermo Fisher Scientific). These automated rounds included walking on 2.3- and 6.8-kb plasmid subclones using 4,526 custom primers. Following the automated rounds, a trained finisher manually inspected each assembly. Remaining gaps and hairpin structures were resolved by generating small insert shatter libraries of 6.8-kb-spanning clones (Grigoriev et al. 2014). Five fosmid clones were shotgun sequenced using Sanger long-read technology and finished to fill large gaps and resolve larger repeats. Each assembly was validated by an independent quality assessment, which included visual examinations of subclone paired ends, high-quality discrepancies, and any low-quality areas. The assembly was further refined with the aid of the C. parasitica genetic linkage map constructed from a cross of Japanese C. parasitica isolate JA17 and Italian isolate P17-8 (Kubisiak and Milgroom 2006) that was upgraded by the addition of 141 simple-sequence repeat (SSR) markers mined from the EP155 genome sequence using Primer3 (Rozen and Skaletsky 2000). Allele data for 96 ascospore progeny of the JA17 × P17-8 cross were collected for 60 polymorphic EP155-derived SSR markers located at the terminal ends of the EP155 scaffolds. Finished segments (33 scaffolds with 34 contigs) were localized and ordered into pseudomolecules using a 30-marker map with seven joins to form the final assembly. The version 1 assembly was condensed using the recombinational linkage data, yielding the final version 2 assembly contained in 26 scaffolds with 33 contigs (L50 = 5; N50 = 4.0 Mb; 43.9 Mb) with an estimated error rate of less than 1 error in 100,000 bp. The allele data also allowed placement of the C. parasitica mating-type locus (MAT1) on scaffold 2 and vegetative incompatibility loci vic1, vic2, vic4, vic6, and vic7 on scaffolds 5, 7, 4, 3, and 6, respectively. Summary statistics were generated using the JGI Annotation Pipeline, and genome completeness was assessed using BUSCO v3.02 (Simão et al. 2015). A brief overview of the genome assembly is provided in Table 1; complete summary statistics can be accessed at the JGI website, and supplementary information is available at AgData Commons (Crouch et al. 2020). An overview of the genome assembly and a summary of some key groups of genes are detailed in the following paragraphs.
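
    For readers unfamiliar with the contiguity statistics quoted above, the following minimal sketch shows how N50 and L50 are conventionally computed from a list of scaffold lengths. It is not the JGI Annotation Pipeline code, and the toy lengths are hypothetical values chosen for illustration, not the EP155 scaffold sizes.

```python
from typing import Sequence, Tuple


def n50_l50(lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of
    largest scaffolds that together cover at least half of the assembly;
    L50 is the number of scaffolds in that set.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise AssertionError("unreachable: cumulative sum always reaches half")


# Hypothetical toy assembly (lengths in Mb, for illustration only).
toy_lengths = [7, 6, 5, 5, 4, 3, 2, 1, 1]
toy_n50, toy_l50 = n50_l50(toy_lengths)
```

    In the toy example, the three largest scaffolds (7 + 6 + 5 = 18) are the first to reach half of the 34-unit total, so N50 is 5 and L50 is 3.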

    Table 1. Summary features of the Cryphonectria parasitica EP155 genome assembly

    Overall, there was close correspondence between EP155 karyotypes (Eusebio-Cope et al. 2009) and the 26 scaffolds of the EP155 genome assembly. Estimated C. parasitica chromosome sizes based on pulsed-field gel electrophoresis ranged from 3.3 to 9.7 Mb with no evidence of mini-chromosomes or accessory chromosomes that are associated with some plant pathogenic fungi (Bertazzoni et al. 2018; Eusebio-Cope et al. 2009). The 16 telomeric sequences—identified in the EP155 assembly through manual inspection of scaffold ends for the telomeric repeat sequence (T/C)TAGGG—indicated a minimum of eight chromosomes, in good agreement with cytological and electrophoretic karyotyping datasets that predicted chromosome counts of either seven or nine (Eusebio-Cope et al. 2009; Milgroom et al. 1992). Five of the EP155 scaffolds were within the estimated chromosome size range and were complete from telomere to telomere (scaffolds 1 to 4 and 8). Six scaffolds had a telomere on one end (scaffolds 5 to 7 and 9 to 11). The remaining scaffolds were smaller and lacked telomeres.
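
    The telomeres described above were identified by manual inspection. As a hedged illustration of how a comparable screen could be automated, the sketch below searches both ends of each scaffold for tandem copies of the (T/C)TAGGG repeat or its reverse complement, then derives the minimum chromosome number from the telomere count. The 200-bp window, the three-copy threshold, and all function names are assumptions made for this example, not parameters from the study.

```python
import math
import re
from typing import Dict, Tuple

# Forward repeat (T/C)TAGGG and its reverse complement CCCTA(A/G).
# Requiring >= 3 tandem copies is an arbitrary, illustrative threshold.
FORWARD = re.compile(r"(?:[TC]TAGGG){3,}")
REVERSE = re.compile(r"(?:CCCTA[AG]){3,}")


def _has_repeat(fragment: str) -> bool:
    return bool(FORWARD.search(fragment) or REVERSE.search(fragment))


def telomeric_ends(seq: str, window: int = 200) -> Tuple[bool, bool]:
    """Report whether the first and last `window` bases carry the repeat."""
    seq = seq.upper()
    return _has_repeat(seq[:window]), _has_repeat(seq[-window:])


def count_telomeres(scaffolds: Dict[str, str], window: int = 200) -> int:
    """Total number of scaffold ends with a telomeric repeat."""
    return sum(sum(telomeric_ends(s, window)) for s in scaffolds.values())


def min_chromosomes(n_telomeres: int) -> int:
    """Each linear chromosome has two telomeres."""
    return math.ceil(n_telomeres / 2)
```

    With the counts reported in the text (five scaffolds with two telomeres and six with one), `min_chromosomes(5 * 2 + 6)` gives 8, matching the stated minimum of eight chromosomes.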

    The 11 largest scaffolds ranged in size from 1.0 to 7.4 Mb and, together, comprised 99.2% of the genome assembly. Roughly half of the genome was contained in four scaffolds of at least 5.1 Mb in length. Scaffold 8 was closest in size to that predicted for chromosome 9 (3.2 versus 3.3 Mb) and contained the β-tubulin gene that was previously identified as residing on chromosome 9 by Southern analysis (Eusebio-Cope et al. 2009). The close correspondence between karyotyping results and the draft genome sequence analysis provides a promising platform for further refinement of the sequence assembly to the chromosome level.

    Gene Models and Functional Predictions

    EP155 gene models were predicted and annotated using the JGI Annotation Pipeline, which combined homology-based, ab initio, and transcriptome-based gene predictors (Dawe et al. 2003; Grigoriev et al. 2014; Kuo et al. 2014; Shang et al. 2008) using the JGI expressed sequence tag (EST) pipeline. Putative protein domains were identified by querying against a local InterProScan database. After filtering for EST support, completeness, and homology support, 11,609 genes were structurally and functionally annotated from the EP155 version 2 assembly (Table 1). This number is similar to that predicted for related fungi such as Neurospora crassa (10,620), Magnaporthe oryzae (12,841), and Fusarium graminearum (11,640).

    Functional predictions were performed using the JGI Annotation Pipeline unless otherwise specified. Over 85% of the predicted proteins showed similarities to other proteins from the NCBI nonredundant protein database. Over 66% of the predicted proteins contained Pfam domains, with the most highly represented domains including major facilitator superfamily MFS-1 (PF07690; n = 244), fungal Zn(2)-Cys(6) transcriptional regulatory protein (PF00172; n = 166), short-chain dehydrogenase/reductase SDR (PF00106; n = 132), cytochrome monooxygenase P450 (P450) (PF00067; n = 116), and serine/threonine protein kinases (PF00069; n = 116). As calculated using TRIBE-MCL (Enright et al. 2002), relative to the predicted proteomes of seven other members of the order Diaporthales curated by the JGI Mycoportal (Diaporthales MCL.2920), 85.7% of the predicted C. parasitica proteins were members of multigene family clusters.

    SM Genes and Gene Clusters

    In fungi, genes involved in complex coordinated functions such as pathogenicity and SM production can occur as coregulated gene clusters, where genes are physically organized together (Keller 2019; Keller et al. 2005). Altogether, 59 predicted SM genes or gene clusters were identified from scaffolds 1 to 11 of the C. parasitica EP155 genome assembly. All but six of these regions contained either polyketide synthase (PKS; annotated as PKS1 to PKS31) or nonribosomal peptide synthase (NPS) genes (annotated as NPS1 to 10, NPS12 and NPS13, PKS/NPS1 to PKS/NPS5, ACS1, LYS1, FASA, and OAS1) or both. The predicted SM clusters ranged in size from just a single gene (44.8%) to clusters containing up to 39 genes, and included genes with functions typically associated with SM clusters (e.g., transporters and transcription factors). The two largest clusters, PKS16 and PKS26 (39 and 32 genes, respectively), each spanned over 100 kb of scaffolds 10 and 1, respectively.

    Cytochrome P450s

    Cytochrome P450s perform a wide variety of reactions such as hydroxylation, epoxidation, dealkylation, sulfoxidation, deamination, desulfuration, dehalogenation, and nitric oxide reduction (Sono et al. 1996). Fungi in general possess extraordinarily large numbers of P450 genes, second only to plants. Putative cytochrome P450s were identified from the C. parasitica EP155 assembly through BLAST searches for the two conserved P450 signature domains; namely, the oxygen-binding and heme-binding motifs. P450s that showed both domains were considered authentic P450s. Identified P450s were then classified into families and subfamilies using the criteria recommended by the International P450 Superfamily Nomenclature Committee: >40% amino acid similarity for assigning a family and >55% for a subfamily. P450s that could not be assigned to any known clan based on the existing classification scheme were assigned to an appropriate clan or clans based on their relative position in the phylogenetic tree.
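
    The family and subfamily thresholds described above can be expressed as a simple decision rule. The sketch below is only a schematic of the >40% and >55% cutoffs; it assumes a single percent-similarity value against the best-matching named P450, and the function name and returned labels are hypothetical. Actual CYP naming also involves curation and phylogenetic placement, which this example does not attempt.

```python
def assign_level(similarity_pct: float) -> str:
    """Map percent similarity to the best-matching named P450 onto a
    nomenclature level using the >40% (family) and >55% (subfamily) cutoffs.
    """
    if not 0.0 <= similarity_pct <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_pct > 55.0:
        return "same subfamily"
    if similarity_pct > 40.0:
        return "same family"
    # At or below 40%: no existing family applies under these cutoffs.
    return "new family"
```

    Because both cutoffs are strict inequalities, a value of exactly 55% falls in the same family but not the same subfamily, and exactly 40% does not qualify for an existing family.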

    The C. parasitica EP155 genome contained 122 P450s (P450ome) classified into 76 CYP families and 101 subfamilies. The majority of the P450s in the C. parasitica P450ome were orphans, with no known function. Fifteen novel subfamilies were identified (one under each of the CYP families CYP503, CYP526, CYP548, CYP567, CYP584, CYP614, CYP638, CYP639, CYP660, CYP5091, CYP5093, CYP5111, CYP5129, CYP5168, and CYP5227). In comparison with other euascomycetes, the C. parasitica genome had a moderately sized P450ome, comparable in size with the P450omes of Aspergillus terreus, F. verticillioides, and M. oryzae (123 to 126 P450s). Interestingly, unlike the majority of the euascomycetes, which do not contain basidiomycete-like P450s, the C. parasitica genome contained five such basidiomycete P450 homologs; namely, CYP5053, CYP5227, CYP5090, CYP5093, and CYP5201. The abundance of basidiomycete P450 homologs in C. parasitica implies that these P450s may play a key role in the oxidation of wood-derived compounds and tree pathogenesis (Suzuki et al. 2012; Syed and Yadav 2012).

    Mitochondrial Genome

    The mitochondria of C. parasitica have been extensively studied for their association with virulence attenuation (i.e., hypovirulence), derived from viral infection, mtDNA mutations, or mitochondrial plasmids (Baidyaroy et al. 2000; Monteiro-Vitorello et al. 1995; Polashock and Hillman 1994). Some C. parasitica strains harbor mitochondrial plasmids that elicit hypovirulence (Monteiro-Vitorello et al. 2000). Similarly, some strains of C. parasitica are subject to mitochondrial hypovirulence, a cytoplasmically transmissible form of hypovirulence associated with defects in the mitochondria (Baidyaroy et al. 2000; Monteiro-Vitorello et al. 1995). Although distinct from viral-induced hypovirulence, there are remarkable parallels between the nonviral forms of hypovirulence, including virulence attenuation and the shared alteration of transcript accumulation of over 70 genes (Allen and Nuss 2004; Monteiro-Vitorello et al. 1995, 2000). Mitochondria of some C. parasitica strains have also been shown to harbor small RNA viruses that can reduce fungal virulence (Polashock and Hillman 1994) and can be transmitted to several other fungal species (Shahi et al. 2019). Given that C. parasitica EP155 is a virus-free, virulent strain of the fungus, it was not surprising that the mtDNA assembly did not share significant similarity with any of the known indicators of mitochondrial hypovirulence; for example, the assembly did not contain sequences with similarity to small subunit ribosomal DNA InC9 (AF218209), Cryphonectria parasitica mitovirus 1-NB631 (NC004046), or pCRY1 (AF031368). The ratio of UGA (= cytoplasmic terminator) to UGG codons predicted to encode Trp in the C. parasitica mitochondrial genome is high compared with other fungi, approximately 95%, and this correlates positively with the relatively high number of UGA codons predicted to encode Trp in Cryphonectria mitochondrial viruses compared with other related mitochondrial viruses (Nibert 2017).

    The C. parasitica EP155 mitochondrial genome was annotated using MITOS v.2 (Bernt et al. 2013). Genome sequencing recovered the mitochondrial genome within a single 158,902-bp scaffold, consistent with the published physical map (Bell et al. 1996). Overall, the mtDNA genome contained a full complement of protein-coding genes (atp6, atp8, atp9, cob, cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad4L, nad5, and nad6), ribosomal RNA and ribosomal proteins (rrnS, rrnL, and rps3), and 29 tRNAs. Similar to the mitochondrial genomes of most filamentous ascomycetes, the majority of the tRNAs are clustered together, with 10 tRNA genes located side by side on each side of the rrnL ribosomal gene. Endonuclease open reading frames (ORFs) were abundant, with 36 LAGLIDADG homing endonucleases and 27 GIY-YIG endonucleases predicted to occupy 29% of the mtDNA assembly. Numerically, C. parasitica EP155 had a total of 63 predicted mitochondrial endonucleases, exhibiting one of the largest overall cohorts of such enzymes identified to date (Sclerotinia borealis = 61, Rhizoctonia solani Rhs1AP = 43, Agaricus bisporus = 46) (Mardanov et al. 2014).

    Vegetative Incompatibility

    Attempts to introduce hypovirulent strains of C. parasitica into North American forest ecosystems to control chestnut blight have been hindered by the difficulty of introducing hypoviruses (family Hypoviridae) into the resident fungal populations (Milgroom and Cortesi 2004). The transfer of hypoviruses from hypovirulent strains to virus-free natural populations of C. parasitica is dependent on hyphal anastomosis and the subsequent transfer of cytoplasmic hypoviruses from a donor strain to the virus-free recipient strain (Dawe and Nuss 2001; Hillman and Suzuki 2004; Nuss 1992, 2005; Smith et al. 2000; Van Alfen et al. 1975). This process is regulated by a diverse self-nonself vegetative incompatibility (vic) system (Milgroom and Cortesi 2004). In C. parasitica, six di-allelic loci controlling vegetative incompatibility in European C. parasitica populations have been identified by classical genetics (Cortesi and Milgroom 1998), five of which function in preventing heterokaryon formation (Choi et al. 2012; Smith et al. 2006; Zhang et al. 2014). At least two more vic loci are thought to function in natural populations of C. parasitica (Liu and Milgroom 2007; Robin et al. 2000). The recent identification, genetic characterization, and systematic disruption of incompatibility genes at six of these vic loci has provided a promising new opportunity for overcoming this major barrier to hypovirus dissemination across C. parasitica populations (Choi et al. 2012; Stauder et al. 2019; Zhang and Nuss 2016; Zhang et al. 2014).

    Although C. parasitica vegetative incompatibility loci have historically been referred to as vic loci (Anagnostakis 1988), they are assumed to share characteristics with heterokaryon incompatibility (het) loci in other fungi (Glass and Dementhon 2006; Paoletti and Saupe 2009). Putative het genes in the C. parasitica EP155 genome were identified through a combination of BLASTp searches and syntenic comparisons with the N. crassa genome (Galagan et al. 2003). The following het proteins were used to query the EP155 assembly: generic HET domains (PF06985.2), het proteins from N. crassa (het-c, het-6, pin-c, tol, and un24), and het proteins from Podospora anserina (het-c, het-D/het-E, and het-s). Several motifs are also associated with some het genes and may function in programmed cell death; therefore, we also searched for NACHT (PF05729.3) and WD40 repeat (PF00400.22) domains, both of which are found associated with HET domains in P. anserina het-D/het-E (Paoletti and Clavé 2007). Because of the large number of these motifs associated with other functions, we only searched for them by comparing motifs in regions adjacent to ORFs with HET domains. We also looked for synteny in C. parasitica to the genomic regions of N. crassa that contain het genes. This was done simply using the VISTA tracks in the JGI genome browser by zooming out and looking for similar genes in the two genomes. We especially looked for the pairs of het genes that function together in N. crassa: het-6/un24 and het-c/pin-c.

    BLASTp searches of the C. parasitica EP155 genome sequence with other ascomycete het genes identified 94 homologous proteins. C. parasitica protein 88866 (annotated hch1, for het-c homolog) is homologous to N. crassa het-c (PF07217.2); in C. parasitica, the region containing hch1 is syntenic to the region of the N. crassa genome containing het-c, but we found no C. parasitica homolog of pin-c, the linked interacting partner of het-c in N. crassa. Several C. parasitica-encoded proteins were found with high levels of similarity to het genes from P. anserina. A homolog of the P. anserina het-D/E genes, C. parasitica protein 84049, clearly contains conserved NACHT, WD-repeat, and HET domains. Protein 355540 is highly similar to P. anserina het-C. None of these het gene homologs in C. parasitica map to regions associated with known vegetative incompatibility function. Therefore, the genomic locations and makeup of allorecognition factors appear to be distinct among species, as is the distribution of HET domain genes.

    Overall, our analysis showed that there are 124 genes annotated in C. parasitica that contain the HET domain. This number is among the highest found yet in any ascomycete genome. P. anserina was previously described as containing the most recorded HET domains with 120 (Paoletti and Clavé 2007), N. crassa has 55, and A. oryzae has 38 (Fedorova et al. 2005). We found several ORFs that putatively encode HET domains with high similarity to pin-c, tol, or het-6 from N. crassa but none was found in regions syntenic with the N. crassa homologs and, therefore, we did not name these specifically as homologs. Considerably more ORFs that have HET domains occur in filamentous ascomycete genomes than known functional het genes (Fedorova et al. 2005). Similarly, homologs of known het genes are not necessarily functional het genes in other species. The overall lack of synteny among HET domain genes between C. parasitica and other ascomycete species, and the dispersed repetitive distribution both intra- and intergenomically, supports the view that the HET domain represents a component of a mobile genetic element (Micali and Smith 2006; Paoletti and Saupe 2009). We speculate that, within species lineages, independent movement of HET domain elements into distinct genomic locations led to the evolution of unique allorecognition loci in different filamentous fungi.

    Transposable Elements

    Transposable elements (TEs) were identified using REPET v2.5 (Flutre et al. 2011) as described (Rivera et al. 2018). TEs representing both class I transposons (retrotransposons) and class II transposons (transposons with DNA transposition intermediates) were present in the genome of C. parasitica EP155 (Table 1). The TE load, approximately 14% of the total genome sequence, was largely due to the presence of 2,716 class I retroelements, comprising almost 5.0 Mb in total. Class I elements in the family Metaviridae (Gypsy/Ty3 elements) were the most abundant group of retroelements, with 2,040 elements composing over 4 Mb of the genome and making up over 75% of all TEs. No copy of a Metaviridae retrotransposon containing an intact coding sequence was identified. Metaviridae elements were commonly located in TE-rich clusters. Of note was the region surrounding the MAT1 locus on scaffold 5, where there are numerous retrotransposon fragments on either side of the MAT1-2 gene. Overall, the 4-Mb scaffold where MAT1 resides (scaffold 5) was approximately 17% transposon-derived. The presence of TEs surrounding the MAT1 locus was consistent with mapping studies performed by Kubisiak and Milgroom (2006), where significant recombination suppression and high levels of heterogeneity in the region surrounding the MAT1 locus were documented.

    Seventeen full-length TEs with intact coding sequences were identified in the EP155 genome. Of the 17 intact transposons, nine were copies of the hAT-family class II transposon Crypt1, the only C. parasitica element that has been shown experimentally to be active (Linder-Basso et al. 2001).

    Repeat-induced point mutation (RIP) is a fungal genome defense mechanism that may mutate repeated sequences such as TEs, most commonly leading to sequences with a reduced GC content and C→T transition mutations. RIP is well defined and extremely efficient in N. crassa (Cambareri et al. 1998) and has been documented at a much less efficient level in several other filamentous fungi (Clutterbuck 2011; Galagan and Selker 2004). Duplicate contiguous sequences of greater than approximately 400 bases within a given genome are detected by an unknown mechanism and then disabled by methylation of cytosine bases in either copy of the duplicated sequence, followed by subsequent deamination of the methylated cytosines to thymine. C. parasitica EP155 TE families that contained ≥10 sequences (minimum of 1 sequence ≥300 bp) were assessed for RIP using RIPCAL v2 (Hane and Oliver 2008) with dinucleotide frequency and alignment-based algorithms. Evidence of RIP mutation was present if dinucleotide frequencies met the following criteria: (CpA + TpG)/(ApC + GpT) ≤ 1.03 and (TpA/ApT) ≥ 0.89, and RIPCAL alignments showed peaks for (CA←→TA) + (TG←→TA) mutations. The only gene known to be required for RIP encodes a DNA methyltransferase called RIP-defective (rid) (Freitag et al. 2002), and this gene is present in the C. parasitica EP155 genome. However, RIPCAL analyses and dinucleotide frequencies showed little evidence for RIP mutation across the C. parasitica EP155 genome. Our detections of RIP were limited to DIRS elements (n = 537), helitrons (n = 24), and an unidentified class II element (n = 23). Our analysis did not detect a signature of RIP mutation from Metaviridae elements, although using a de-RIP approach, Clutterbuck (2011) identified 10 Gypsy elements with dinucleotide ratios consistent with RIP mutation.
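
    The dinucleotide criteria above can be illustrated with a short calculation. The sketch below is not RIPCAL; it is a minimal stand-alone example that counts overlapping dinucleotides on a single strand and applies the two thresholds quoted in the text, assuming the conventional form of the composite index, (CpA + TpG)/(ApC + GpT). The function names and the handling of zero denominators are choices made for this example, and the alignment-based part of the analysis is not represented.

```python
from collections import Counter
from typing import Optional, Tuple


def dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides on the given strand."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def rip_indices(seq: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (TpA/ApT, (CpA + TpG)/(ApC + GpT)); None if a denominator is 0."""
    c = dinucleotide_counts(seq)
    product = c["TA"] / c["AT"] if c["AT"] else None
    denom = c["AC"] + c["GT"]
    substrate = (c["CA"] + c["TG"]) / denom if denom else None
    return product, substrate


def shows_rip_signature(seq: str) -> bool:
    """Apply the thresholds quoted in the text: TpA/ApT >= 0.89 and
    (CpA + TpG)/(ApC + GpT) <= 1.03. Undefined indices count as no evidence.
    """
    product, substrate = rip_indices(seq)
    if product is None or substrate is None:
        return False
    return product >= 0.89 and substrate <= 1.03
```

    For example, the toy sequence `TATAACGTTATA` has four TpA and two ApT dinucleotides and no CpA or TpG, so both criteria are met, whereas `CACACATGTG` is rich in CpA and TpG and fails both. Real TE families would be evaluated over much longer sequences.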

    Data Deposition

    The genome assembly and annotations are made available via the JGI fungal genome portal MycoCosm (Grigoriev et al. 2014). The data are also deposited at DNA Data Bank of Japan/European Nucleotide Archive/GenBank under accession WHUS00000000. The version described in this paper is version WHUS01000000. Supplementary tables and figures are available through the National Agricultural Library AgData Commons (Crouch et al. 2020).


    Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA. The USDA is an equal opportunity provider and employer.

    The author(s) declare no conflict of interest.

    Funding: This work was supported by the U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a DOE Office of Science User Facility, which is supported by the Office of Science of the DOE under contract number DE-AC02-05CH11231 awarded to D. L. Nuss, A. C. L. Churchill, and M. G. Milgroom, Lawrence Livermore National Laboratory under contract number DE-AC52-07NA27344, and Los Alamos National Laboratory under contract number DE-AC02-06NA25396. The project has been supported since its inception through the United States Department of Agriculture (USDA) National Institute of Food and Agriculture (NIFA) Hatch Multistate Research Program, initially through project number NE140, currently project number NE1833. J. A. Crouch is supported by USDA Agricultural Research Service project 8042-22000-298-00-D. A. Dawe and D. L. Nuss were supported by National Science Foundation collaborative awards MCB-1051453 and MCB-1051331, respectively. B. I. Hillman was supported by USDA-NIFA Hatch and McIntire-Stennis Research Programs and by the New Jersey Agricultural Experiment Station. M. Smith is supported by an NSERC Discovery Grant. The P450 work was supported by the University of Cincinnati funds to J. S. Yadav.