Short Communication (Open Access)

The Complete Genome Sequence of Xanthomonas theicola, the Causal Agent of Canker on Tea Plants, Reveals Novel Secretion Systems in Clade-1 Xanthomonads

    Authors and Affiliations
    • Ralf Koebnik (1)
    • Daiva Burokiene (2)
    • Claude Bragard (3)
    • Christine Chang (4)
    • Marion Fischer-Le Saux (5)
    • Roland Kölliker (6)
    • Jillian M. Lang (7)
    • Jan E. Leach (7)
    • Emily K. Luna (7)
    • Perrine Portier (5)
    • Angeliki Sagia (1, 8)
    • Janet Ziegle (4)
    • Stephen P. Cohen (9)
    • Jonathan M. Jacobs (9, 10)
    1. IRD, Cirad, Université Montpellier, IPME, Montpellier, France
    2. Nature Research Centre, Institute of Botany, Laboratory of Plant Pathology, Akademijos g. 2, Vilnius, Lithuania
    3. Earth & Life Institute, Université Catholique Louvain-la-Neuve, Louvain-la-Neuve, Belgium
    4. Pacific Biosciences, Menlo Park, CA 94025, U.S.A.
    5. IRHS-UMR 1345, Université d’Angers, INRAE, Institut Agro, SFR 4207 QuaSaV, 49071, CIRM-CFBP, Beaucouzé, France
    6. Molecular Plant Breeding, Institute of Agricultural Sciences, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
    7. Department of Agricultural Biology, Colorado State University, Fort Collins, CO 80523, U.S.A.
    8. Department of Biology, University of Crete, Heraklion, Greece
    9. Department of Plant Pathology, The Ohio State University, Columbus, OH 43210, U.S.A.
    10. Infectious Disease Institute, The Ohio State University, Columbus, OH 43210, U.S.A.


    Xanthomonas theicola is the causal agent of bacterial canker on tea plants. There is no complete genome sequence available for X. theicola, a close relative of the species X. translucens and X. hyacinthi, thus limiting basic research for this group of pathogens. Here, we release a high-quality complete genome sequence for the X. theicola type strain, CFBP 4691T. Single-molecule real-time sequencing with a mean coverage of 264× revealed two contigs of 4,744,641 bp (chromosome) and 40,955 bp (plasmid) in size. Genome mining revealed the presence of nonribosomal peptide synthases, two CRISPR systems, the Xps type 2 secretion system, and the Hrp type 3 secretion system. Surprisingly, this strain encodes an additional type 2 secretion system and a novel type 3 secretion system with enigmatic function, hitherto undescribed for xanthomonads. Four type 3 effector genes were found on complete or partial transposons, suggesting a role of transposons in effector gene evolution and spread. This genome sequence fills an important gap to better understand the biology and evolution of the early-branching xanthomonads, also known as clade-1 xanthomonads.

    Members of the genus Xanthomonas, including Xanthomonas theicola, cause diseases on hundreds of host plants, including economically important crops and ornamental plants. The List of Prokaryotic names with Standing in Nomenclature includes more than 30 different Xanthomonas spp., which can be classified into two major groups based on sequence comparisons (Ferreira-Tonin et al. 2012; Gonçalves and Rosato 2002; Hauben et al. 1997; Parkinson et al. 2007, 2009). Clade 1, also known as the clade of early-branching species (Parkinson et al. 2007), can be further divided into two subclades (Fig. 1). Subclade 1A contains X. hyacinthi, X. theicola, and X. translucens. Subclade 1B contains X. albilineans, X. sacchari, and the proposed species “X. pseudalbilineans” (Pieretti et al. 2015), which are all pathogenic on sugarcane, and the proposed, rice-associated species “X. sontii” (Bansal et al. 2019a,b; Mirghasempour et al. 2020). At least three more species fall into clade 1 but await formal description (Khenfous-Djebari et al. 2019; Parkinson et al. 2009).

    Fig. 1. Presence of selected gene clusters in clade-1 xanthomonads. Phylogenetic tree constructed from partial gyrB sequences, either extracted from genome sequences or taken from Parkinson et al. (2009), using default settings (Dereeper et al. 2008), and modified for better visualization (Letunic and Bork 2019). Circles indicate bootstrap values above 0.5 and their sizes are proportional to the bootstrap values. Presence of gene clusters was elucidated using BLAST, CrisprCasFinder, and SecReT6. Shaded squares indicate presence and white squares indicate absence of a gene cluster. “Slc” stands for “species-level clade” (Parkinson et al. 2009).

    Tea is an important commodity consumed by more than 3 billion people worldwide and has tremendous socioeconomic, cultural, and medical importance. The tea plant (Camellia sinensis (L.) O. Kuntze) is native to the Indian Subcontinent and to East and Southeast Asia and is cultivated in tropical and subtropical regions. The genomes of the two most widely used varieties, C. sinensis var. assamica and C. sinensis var. sinensis, were released in 2017 and 2018 (Wei et al. 2018; Xia et al. 2017). As holds true for virtually all domesticated plants, bacterial diseases threaten tea plants and production. Multiple bacterial species have been reported to cause infections of tea, such as Agrobacterium tumefaciens (Bradbury 1986), Mixta theicola (Kato Tanaka et al. 2015; Palmer et al. 2018), Pseudomonas avellanae (syn. P. syringae pv. theae) (Gardan et al. 1999), and X. theicola (Uehara et al. 1980).

    X. theicola Vauterin et al. 1995 (formerly X. campestris pv. theicola Uehara et Arai pv. nov.) (Uehara et al. 1980) was first described by Uehara and colleagues in 1980 as the causal agent of bacterial canker on tea plants (Uehara et al. 1980; Vauterin et al. 1995). Only six isolates, which were isolated in Chinan-cho, Japan, from C. sinensis in June 1974, have been deposited in international strain collections. To expand information on clade-1 xanthomonads, we determined the complete genome sequence of the type strain TC1 (ATCC 700184T, CFBP 4691T, DSM 18797T, ICMP 6774T, and LMG 8684T).


    From lyophilized stock available at the International Centre for Microbial Resources (CIRM) Plant-Associated Bacteria (CFBP), a single colony of strain CFBP 4691 was isolated and grown on rich nutrient agar media. Genomic DNA was extracted using the Genomic DNA buffer set and Genomic-tips following the manufacturer’s instructions (Qiagen, Valencia, CA, U.S.A.). DNA was sheared three times with a Covaris g-TUBE at 5,500 rpm for 2 min to 8- to 30-kb fragments and sequenced using long-read single-molecule real-time sequencing on a PacBio Sequel I. The genome was assembled with HGAP v4 (Pacific Biosciences, Menlo Park, CA, U.S.A.). QUAST 5.0.2 was used to assess genome quality (Gurevich et al. 2013). NCBI Prokaryotic Genome Annotation Pipeline version 4.9 was used for functional annotation of genes (Haft et al. 2018).

    The X. theicola strain CFBP 4691 genome project is available at NCBI as BioProject PRJNA606814 and the biological sample is described at NCBI as BioSample SAMN14114378. Mean coverage for the CFBP 4691 genome was 264×. The genome was assembled into two contigs of 4,744,641 and 40,955 bp in size (a chromosome and a plasmid, respectively) with a total G+C content of 66.0%. The N50 and L50 were 4,744,641 and 1, respectively. Of the predicted 4,492 genes, 3,766 encoded proteins, 61 encoded functional RNAs, and 665 were considered pseudogenes.
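
    The N50 and L50 values reported above follow directly from the two contig lengths. As a minimal sketch (the function name `n50_l50` is a hypothetical helper, not part of QUAST), they can be recomputed from the published lengths, and the gene counts can be checked for internal consistency:

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly;
    L50 is the number of contigs in that set.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no contigs given")


# Contig lengths as reported in the text: chromosome and plasmid.
contigs = [4_744_641, 40_955]
n50, l50 = n50_l50(contigs)

# Gene categories as reported in the text:
# protein-coding genes, functional RNAs, pseudogenes.
gene_total = 3_766 + 61 + 665

print(n50, l50, gene_total)
```

    Because the chromosome alone accounts for more than half of the assembly, N50 equals the chromosome length and L50 is 1.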

    Genome-wide comparisons with other clade-1 xanthomonads by calculating pairwise average nucleotide identities (Rodriguez-R and Konstantinidis 2016) confirmed the unique standing of strain CFBP 4691 within this clade (Supplementary Table S1). X. theicola is considered a sister species of X. hyacinthi and belongs together with X. translucens to subclade 1A, which is distinct from subclade 1B, which comprises X. albilineans, “X. pseudalbilineans”, X. sacchari, and “X. sontii” (Bansal et al. 2019a; Gonçalves and Rosato 2002; Hauben et al. 1997; Khenfous-Djebari et al. 2019; Mirghasempour et al. 2020; Parkinson et al. 2007, 2009). To position all of the clade-1 strains in the context of Parkinson’s comprehensive gyrB-based species evaluation, we also included three still-undefined species-level clades (Slc-5 to Slc-7) in the phylogenetic comparison for which genomic information is not available yet (Parkinson et al. 2009) (Fig. 1).


    Members of the two subclades differ remarkably in their genetic makeup. For instance, type 3 secretion systems (T3SS) have been described as key virulence factors for most strains of xanthomonads but, in clade 1, they are restricted to subclade 1A. Therefore, we scrutinized the genomes of sequenced strains, preferentially type and pathotype strains, for the presence of this and other gene clusters (Fig. 1).

    Like the other subclade-1A strains, strain CFBP 4691 has a noncanonical T3SS (Merda et al. 2017; Wichmann et al. 2013), including its alternative translocon component HpaT (Pesce et al. 2017). Type 3 effector (T3E) prediction using described T3Es from genus Xanthomonas (White et al. 2009), P. syringae sensu lato (Lindeberg et al. 2009), and the Ralstonia solanacearum species complex (Peeters et al. 2013) identified members of 12 Xanthomonas outer protein (Xop) classes (Table 1), most of which are widely conserved among subclade-1A strains. Similarly, strain CFBP 4691 shares the candidate T3E HgiB with most X. translucens strains (Pesce et al. 2017). Most subclade-1A strains, except X. translucens pv. graminis, encode transcription activator-like effectors (TALEs). Strain CFBP 4691 encodes two TALEs, which were assigned to novel classes by AnnoTALE and designated as TalHR1 and TalHS1 (Grau et al. 2016).

    TABLE 1. Type 3 effectors and avirulence proteins in Xanthomonas theicola and their conservation in type and pathotype strains of subclade 1A

    Beyond clade-2 xanthomonads’ T3Es, we identified one candidate T3E with similarity to a hypothetical effector from R. solanacearum (T3E_Hyp7) and another one with weak similarity to HopO, an ADP-ribosyltransferase from P. syringae. Bidirectional BLAST analyses support the similarity between HopO1 and G4Q83_14055, although the X. theicola sequence is much longer (968 amino acids [aa]) than the P. syringae sequences (between 283 and 301 aa). The homology is limited to the C-terminal region of G4Q83_14055 (29% sequence identity), which contains a predicted ADP-ribosyltransferase domain, as found in HopO1. Among all of the analyzed subclade-1A strains, three candidate T3Es appear to be restricted to the species X. theicola: XopD, T3E_Hyp7, and the HopO-related protein. In the absence of biochemical or genetic support that the T3E_Hyp7- and HopO-related proteins are indeed effectors, we prefer not to assign them a name according to the Xop nomenclature (White et al. 2009).

    Seven candidate T3E genes contain a canonical plant-induced promoter (PIP) box and a properly spaced −10 motif, which are the hallmarks of HrpX-induced genes (Koebnik et al. 2006), and are therefore likely to be coexpressed with the T3SS: xopB, xopE2, xopG, two paralogs of xopAD, xopAF, and T3E_Hyp7. In addition to these T3E genes and the T3SS genes, a gene coding for a secreted serine protease that is conserved in clade-2 xanthomonads and in Clavibacter michiganensis was found to possess a PIP box/−10 motif. A putative role of this gene in pathogenicity awaits experimental proof.
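
    A promoter scan of this kind can be sketched with a regular expression. The sketch below assumes the consensus commonly attributed to Koebnik et al. (2006), TTCGB-N15-TTCGB followed 30 to 32 bp downstream by a YANNRT −10 motif; this consensus is not spelled out in the text above, and the function name `find_pip_promoters` and the demo sequence are hypothetical. It is not necessarily the procedure used for the annotation reported here.

```python
import re

# Assumed consensus (not stated in the text): TTCGB-N15-TTCGB-N30-32-YANNRT
# IUPAC codes: B = C/G/T, Y = C/T, R = A/G, N = any base.
PIP_PROMOTER = re.compile(
    r"TTCG[CGT][ACGT]{15}TTCG[CGT]"  # PIP box: two half-sites 15 bp apart
    r"[ACGT]{30,32}"                 # spacer between PIP box and -10 motif
    r"[CT]A[ACGT]{2}[AG]T"           # -10 motif
)


def find_pip_promoters(upstream_seq):
    """Return 0-based start positions of non-overlapping PIP box/-10
    matches on the given strand only (the reverse strand is not searched)."""
    return [m.start() for m in PIP_PROMOTER.finditer(upstream_seq.upper())]


# Synthetic demo sequence, not taken from the X. theicola genome.
demo = "GG" + "TTCGC" + "A" * 15 + "TTCGG" + "C" * 31 + "TACGAT" + "GG"
hits = find_pip_promoters(demo)
print(hits)
```

    In practice, one would apply such a scan to the region upstream of each candidate gene on the coding strand and treat hits only as candidates for HrpX regulation.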


    While annotating T3E genes, we realized that strain CFBP 4691 has two identical copies of the xopG gene. We first considered this as an artifact produced during the genome assembly process. However, careful analysis revealed that the xopG gene is part of a 6,781-bp sequence that is duplicated in the chromosome and shows features of a transposon: 28-bp inverted repeats at the ends, flanked by identical 5-bp motifs resulting from target site duplication, and tnpA, tnpS, and tnpT genes with a resolution recombination site between the tnpS and tnpT genes (Fig. 2). A structurally similar transposon of the Tn3 family has been described for strains of X. citri and other clade-2 xanthomonads (Ferreira et al. 2015). Two related sequences, encompassing one copy of the inverted repeat and the full or partial tnpA gene, are present on the plasmid. Interestingly, both of them are linked to a T3E gene (xopAF and xopE2) (Fig. 2). This finding suggests that xopG, xopAF, and xopE2 were once acquired and that xopG may be duplicated further by transposition events.
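
    The two structural hallmarks named above, terminal inverted repeats and flanking target site duplications, can be checked mechanically. The following is a minimal sketch under stated assumptions: the function `check_transposon_ends`, its `max_mismatches` tolerance, and the demo sequence are all hypothetical, and the input is assumed to be a region that starts and ends exactly with the presumed target site duplication.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_transposon_ends(region, ir_len=28, tsd_len=5, max_mismatches=0):
    """Inspect a region laid out as TSD + element + TSD.

    Reports whether the two flanking target site duplications (TSDs) are
    identical and how many mismatches separate the left terminal repeat
    from the reverse complement of the right terminal repeat.
    Requires tsd_len > 0.
    """
    region = region.upper()
    left_tsd, right_tsd = region[:tsd_len], region[-tsd_len:]
    element = region[tsd_len:-tsd_len]
    left_ir = element[:ir_len]
    right_ir_rc = reverse_complement(element[-ir_len:])
    mismatches = sum(a != b for a, b in zip(left_ir, right_ir_rc))
    tsd_identical = left_tsd == right_tsd
    return {
        "tsd_identical": tsd_identical,
        "ir_mismatches": mismatches,
        "looks_like_transposon": tsd_identical and mismatches <= max_mismatches,
    }


# Synthetic demo element, not the X. theicola sequence.
tsd = "GATTC"                           # 5-bp target site duplication
left_ir = "ACGGTCA" * 4                 # 28-bp left inverted repeat
internal = "TTT" + "ATGC" * 10 + "TTT"  # stand-in for tnpA, tnpS, tnpT, cargo
region = tsd + left_ir + internal + reverse_complement(left_ir) + tsd

result = check_transposon_ends(region)
print(result)
```

    Terminal repeats of real transposons are often imperfect, which is why the mismatch tolerance is exposed as a parameter rather than fixed at zero.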

    Fig. 2. Genetic organization of complete and partial effector-encoding transposons in Xanthomonas theicola strain CFBP 4691. Genes are indicated by colored arrows. Transposition-related genes are shown in orange, effector genes in green, and other predicted genes in light blue. The terminal inverted repeats (IRL and IRR; sequences given in the red boxes) are shown as red triangles, and sequences of target site duplications are shown next to them. HrpX-regulated promoters, which consist of a plant-induced promoter box and a properly spaced −10 element (Koebnik et al. 2006), are indicated as yellow circles. Presumed resolution sites, as indicated by the Ω symbol, are located upstream of the tnpT genes, and their sequences are given at the bottom of the figure, with sequence variations highlighted in red. Resolution sites include two palindromes: IR1 and IR2 are probably part of the core site at which recombination occurs, recognized by the TnpS recombinase, whereas IRa and IRb are potential binding sites for TnpT (Ferreira et al. 2015).


    In addition to the Hrp T3SS, this strain encodes a second T3SS, hitherto undescribed for xanthomonads, the function of which remains enigmatic. The corresponding gene cluster contains 23 predicted genes (loci G4Q83_11460 to G4Q83_11570), including all of the conserved genes of such a secretion system (nine sct genes), genes encoding components of the needle and the translocon complex, and a gene encoding an AraC-type regulator. None of the genes contains an early stop or frameshift mutation, indicating that the system is most likely functional. Similar T3SSs are found in soil bacteria of the species Glaciimonas immobilis and Solimicrobium silvestre and in several strains of Herbaspirillum (all from the family Oxalobacteraceae in the order Burkholderiales) (GenBank accession numbers CP022736, AWRY01, JACHHQ01, VFNG01, VFNI01, and VIUO01) (Margesin et al. 2018; Reinhold-Hurek and Hurek 1998; Zhang et al. 2011).

    The universally conserved xps type 2 secretion system (T2SS) was present in all of the strains but none of them harbors the xcs T2SS, which is found in several clade-2 species such as X. campestris, X. citri, and X. euvesicatoria (Lu et al. 2008). Notably, the X. theicola strain CFBP 4691 encodes a third type of T2SS which, among all xanthomonads sequenced to date, is only present in the X. arboricola pv. populi strain CFBP 3122 (BioProject PRJNA339373). Interestingly, the latter is an atypical strain lacking a T3SS (Merda et al. 2017). Beyond Xanthomonas, similar T2SSs are found in environmental strains of Rhodanobacter and Pseudoxanthomonas (both from the order Xanthomonadales) (NCBI WGS accession numbers AJXW01, LMHR01, LVJS01, MEGI01, OCND01, PDWP01, and PDWU01) (Lee et al. 2007; Li et al. 2014; Weon et al. 2006).

    Type 6 secretion systems (T6SS) can play an important role in pathogenicity, microbial competition, and cell-to-cell communication (Chassaing and Cascales 2018; Coulthurst 2019). Genes encoding T6SS components were identified and classified using SecReT6 (Li et al. 2015). Interestingly, this secretion system is restricted to some pathovars of X. translucens (pvs. cerealis and translucens) but absent in other clade-1 species and pathovars (Fig. 1).


    Members of the genus Xanthomonas were found to encode two clustered regularly interspaced short palindromic repeats (CRISPR) systems, IC and IF, which are defense systems against alien DNA or RNA (Makarova et al. 2011). We used CrisprCasFinder to search for CRISPR systems in selected clade-1 xanthomonads (Couvin et al. 2018). Most strains of Xanthomonas encode, at most, one system but some strains from the species X. hyacinthi, “X. pseudalbilineans”, and X. albilineans encode both systems (Cohen et al. 2020; Pieretti et al. 2015; Zhang et al. 2020). Likewise, both CRISPR systems were found in strain CFBP 4691, which supports an evolutionary scenario in which an ancestor of Xanthomonas contained both systems.

    Within the genus Xanthomonas, nonribosomal peptide synthases (NRPS) were first described for X. albilineans, where they synthesize several small peptides, among them the antibiotic albicidin (Royer et al. 2013). A key step in nonribosomal peptide biosynthesis is the transfer of the 4′-phosphopantetheinyl (P-pant) moiety from coenzyme A to a serine residue, which is catalyzed by the 4′-phosphopantetheinyl transferase (PPTase). We used the presence of both a PPTase gene and NRPS genes as a marker for the ability of a strain to synthesize peptides nonribosomally (Fig. 1). This analysis revealed that NRPS are widespread among clade-1 strains, with the exception of X. hyacinthi, which did not encode any NRPS.


    The first complete genome sequence for the species X. theicola will stimulate further work on the underexplored clade-1 xanthomonads. Notably, this work revealed the presence of two protein secretion systems that have not yet been described for the genus Xanthomonas: a rare T2SS and a novel T3SS. It will be interesting to study their contribution to pathogenicity. Scrutiny of the predicted proteomes revealed some not-yet-investigated T3Es in Xanthomonas and the presence of effector-encoding transposons in clade-1 xanthomonads. Another interesting discovery was the presence of two new TALEs. The availability of genome sequences of the host plant will be instrumental in elucidating the contribution of the TALEs, and of other secreted proteins, to canker formation on tea plants.


    We thank J. Grau, Martin Luther University Halle (Saale), Germany, for assistance with predicting and classifying TALE genes.

    The author(s) declare no conflict of interest.



    Funding: S. P. Cohen is supported by a United States Department of Agriculture National Institute of Food and Agriculture Postdoctoral Fellowship (2018-08122). This article is based upon work from COST Action CA16107 EuroXanth, supported by European Cooperation in Science and Technology (COST).