APS Online Publications
Bacteriology

Taxonomic Refinement of Xanthomonas arboricola

    Authors and Affiliations
    • Sadegh Zarei (1, 2)
    • S. Mohsen Taghavi (1)
    • Touraj Rahimi (3)
    • Hamzeh Mafakheri (1, 2)
    • Neha Potnis (4)
    • Ralf Koebnik (5)
    • Marion Fischer-Le Saux (6)
    • Joël F. Pothier (7)
    • Ana Palacio Bielsa (8)
    • Jaime Cubero (9)
    • Perrine Portier (6)
    • Marie-Agnes Jacques (6)
    • Ebrahim Osdaghi (2)
    1. Department of Plant Protection, School of Agriculture, Shiraz University, Shiraz, Iran
    2. Department of Plant Protection, College of Agriculture, University of Tehran, Karaj, Iran
    3. Department of Agronomy and Plant Breeding, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
    4. Department of Entomology and Plant Pathology, Auburn University, Auburn, AL, U.S.A.
    5. Plant Health Institute of Montpellier, University of Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France
    6. Institut Agro, Université de Angers, INRAE, IRHS, SFR QUASAV, CIRM-CFBP, Angers, France
    7. Environmental Genomics and Systems Biology Research Group, Institute for Natural Resource Sciences, Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland
    8. Departamento de Protección Vegetal, Centro de Investigación y Tecnología Agroalimentaria de Aragón, Instituto Agroalimentario de Aragón-IA2 (CITA–Universidad de Zaragoza), Zaragoza, Spain
    9. Departamento de Protección Vegetal, Centro Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA/CSIC), Madrid, Spain

    Abstract

    Xanthomonas arboricola comprises a number of economically important fruit tree pathogens classified within different pathovars. Dozens of nonpathogenic and taxonomically unvalidated strains are also designated as X. arboricola, leading to a complicated taxonomic status in the species. In this study, we have evaluated the whole-genome resources of all available Xanthomonas spp. strains designated as X. arboricola in the public databases to refine the members of the species based on DNA similarity indexes and core genome-based phylogeny. Our results show that, of the nine validly described pathovars within X. arboricola, pathotype strains of seven pathovars are taxonomically genuine, belonging to the core clade of the species regardless of their pathogenicity on the host of isolation (thus the validity of pathovar status). However, strains of X. arboricola pv. guizotiae and X. arboricola pv. populi do not belong to X. arboricola because of the low DNA similarities between the type strain of the species and the pathotype strains of these two pathovars. Thus, we propose to elevate the two pathovars to the rank of a species as X. guizotiae sp. nov. with the type strain CFBP 7408T and X. populina sp. nov. with the type strain CFBP 3123T. In addition, other mislabeled strains of X. arboricola were scattered within Xanthomonas spp. that belong to previously described species or represent novel species that await formal description.

    Xanthomonas spp. encompasses a set of Gram-negative yellow-pigmented plant-pathogenic or plant-associated bacteria classified within 34 validly described species (Bull et al. 2010; Mafakheri et al. 2022a, b; Vauterin et al. 1995). Phytopathogenic xanthomonads initiate economically important diseases on annual crops, vegetables, and fruit trees (Dia et al. 2022; Khojasteh et al. 2020; Osdaghi et al. 2021). X. arboricola (arbor meaning tree; arboricola meaning “living in trees”) is a species comprising a number of economically important fruit tree pathogens (Garita-Cambronero et al. 2018; Kałużna et al. 2021). Plant-pathogenic members of the species are grouped within nine pathovars, and several nonpathogenic (i.e., with no pathovar status) strains are also designated as X. arboricola in the literature and public databases (Essakhi et al. 2015). According to the European and Mediterranean Plant Protection Organization (EPPO) global database, X. arboricola pv. corylina and X. arboricola pv. pruni, which cause bacterial blight of hazelnut and bacterial canker of stone fruits, respectively, have been included in the A2 list of EPPO since the 1970s, whereas X. arboricola pv. fragariae and X. arboricola pv. juglandis, the causal agents of bacterial leaf blight of strawberry and bacterial blight of walnut, respectively, have been included in the EPPO alert list and Annex IV of the European Union’s Regulated Non-Quarantine Pests list, respectively (EPPO 2022; OJEU 2019).

    Among the biotic constraints of stone fruits and nut trees, bacterial canker and spot on stone fruits, bacterial blight of walnut, and bacterial blight of hazelnut caused by X. arboricola pv. pruni, X. arboricola pv. juglandis, and X. arboricola pv. corylina, respectively, are considered the most important bacterial diseases all over the world (Lamichhane 2014). The diseases occur in many countries, with a particular importance in regions characterized by high precipitation and humid environmental conditions (CABI 2021a, b, c). Currently, the causal agents are widespread across six continents (EPPO 2022; Lamichhane 2014). International trade of plant materials, e.g., rootstocks, budwood, and grafted plants, as well as fresh fruits, transmits the pathogen over long distances into new areas with no history of the diseases. In severe cases of disease occurrence, 25 to 75% of peach fruits have been reported as unmarketable (EPPO 2006). Bacterial blight of walnut can lead to reductions in yield >70% because the pathogen attacks all parts of the plant (Moragrega et al. 2011; Mulrean and Schroth 1982). Yield reduction in bacterial blight of hazelnut is caused by dieback of fruit-bearing twigs and branches and premature fruit drop, which affects the vitality and further development of the trees. In severe cases, infections can cause 10 to 100% tree death, especially in young plantations and nurseries (Miller et al. 1949). Treatment with copper-based compounds, elimination of the infected plant parts, and optimization of fertilization and irrigation practices are among the most important interventions to reduce the risk of diseases caused by X. arboricola (Lamichhane 2014, 2018).

    Disease symptoms of bacterial canker and spot of stone fruits on leaves include small, pale-green to yellow circular or irregular areas with a light-tan center. These spots soon become evident on the upper surface as they enlarge, becoming angular and darkening to deep purple, brown, or black. The symptoms on fruit vary from small pit-like lesions to large, sunken black lesions. Symptoms of bacterial blight of walnut begin as translucent water-soaked spots on leaves that develop into brown to blackish greasy necrotic areas. Lesions, which are often surrounded by a yellow-green halo, are initially circular but often expand into angular spots (Miller and Bollen 1946; Teviotdale et al. 1985; Zarei et al. 2019). Apical necrosis and premature drop of walnut fruit have also been reported in the literature (Moragrega et al. 2011; Zarei et al. 2018). Bacterial blight of hazelnut includes a set of symptoms on aerial parts of the plants, i.e., death of buds and new shoots, cankers on branches and trunks, leaf spots, dark brown spots on nuts, and bacterial exudates on necrotic lesions (Lamichhane and Varvaro 2014).

    As a standalone species, X. arboricola was described for the first time in 1995 (Vauterin et al. 1995), but the history of its members dates back to the early 20th century, when a bacterial disease called walnut blight was observed on English walnut (Juglans regia) in California and the causal agent was designated Pseudomonas juglandis (Pierce 1901). In 1903, a bacterial disease of Japanese plums (Prunus salicina) was observed in central Michigan and the causal agent was named Pseudomonas pruni (Smith 1903). After several taxonomic revisions during the subsequent decades, the two pathogens were eventually named X. juglandis and X. pruni, respectively (Dowson 1939). In 1940, a blight-inducing pathogen on filbert (Corylus avellana and C. maxima) was isolated in the Pacific Northwestern United States and described as Phytomonas corylina (Miller et al. 1940), which was later renamed X. corylina (Starr and Burkholder 1942).

    The original taxonomy of the genus Xanthomonas was based on host specificity and the concept of “new host/new species,” leading to the creation of a complex taxon with >100 species (Dowson 1939; Dye 1978). During the 1970s, nearly all species within the genus were reduced to pathovar level, and the aforementioned fruit tree pathogens were reclassified in the X. campestris species as X. campestris pv. juglandis, X. campestris pv. pruni, and X. campestris pv. corylina, respectively (Dye 1978). In 1979, a xanthomonad pathogen was isolated from superficial bark necrosis on young poplars (Populus spp.) in Didam in The Netherlands, and the causal agent was named X. campestris pv. populi (De Kam 1984). Subsequently, Vauterin et al. (1995) reclassified all Xanthomonas spp. and proposed transferring the four aforementioned taxa into the new species X. arboricola, namely X. arboricola pv. corylina, X. arboricola pv. juglandis, X. arboricola pv. populi, and X. arboricola pv. pruni. Two additional pathovars, i.e., the banana (Musa spp.) pathogen X. arboricola pv. celebensis and the poinsettia (Euphorbia pulcherrima) pathogen X. arboricola pv. poinsettiicola, were also added to the X. arboricola species (Gäumann 1923; Vauterin et al. 1995). The latter pathovar included type C strains of X. campestris pv. poinsettiicola and was transferred to X. axonopodis pv. poinsettiicola in the subsequent year (Young et al. 1996). In 2001, a new disease of strawberry (Fragaria × ananassa) called bacterial leaf blight was described, and the causal agent was designated as a new pathovar of X. arboricola, X. arboricola pv. fragariae, distinct from the previously described species X. fragariae (Janse et al. 2001; Kennedy and King 1962).

    More recently, based on multilocus sequence analysis (MLSA), Fischer-Le Saux et al. (2015) proposed transferring three former pathovars of X. campestris into X. arboricola as X. arboricola pv. arracaciae isolated from arracacha (Arracacia xanthorrhiza), X. arboricola pv. guizotiae isolated from niger (Guizotia abyssinica), and X. arboricola pv. zantedeschiae isolated from arum lily (Zantedeschia aethiopica). Taking all these together, currently, X. arboricola includes nine pathovars each exhibiting characteristic disease symptoms and distinct host specificities (Hajri et al. 2012). Three of these pathovars, X. arboricola pv. corylina, X. arboricola pv. juglandis, and X. arboricola pv. pruni, are economically important pathogens around the globe (Garita-Cambronero et al. 2018; Kałużna et al. 2021), whereas there have been several debates on the pathogenic status of the other members of the species. For instance, Merda et al. (2016) considered X. arboricola pv. fragariae as a nonpathogenic taxon and defined the other members, X. arboricola pv. celebensis, X. arboricola pv. arracaciae, X. arboricola pv. zantedeschiae, X. arboricola pv. populi, and X. arboricola pv. guizotiae, as “unsuccessful” pathovars in terms of their pathogenicity on the host of isolation. On the contrary, during the past decade, several atypical Xanthomonas strains were reported to associate with walnut and stone fruit trees, and, in some cases, they were assigned to novel species. For instance, three strains isolated from symptomatic nectarine trees (Prunus persica var. nectarina) in Murcia, Spain, were named X. prunicola (López et al. 2018), and walnut strains isolated from 2014 to 2016 in Loures, Portugal, comprising pathogenic and nonpathogenic members were described as X. euroxanthea (Martins et al. 2020).

    In addition to these host-specific pathovars and species, many nonpathogenic strains were designated as X. arboricola in the literature (Merda et al. 2016). Essakhi et al. (2015) showed that X. arboricola includes nonpathogenic bacteria that cause no apparent disease symptoms on their hosts. An MLSA was performed on a collection of 100 X. arboricola strains, including 27 nonpathogenic strains isolated from walnut. Nonpathogenic strains grouped outside the clusters that were defined by pathovars and formed separate genetic lineages (Essakhi et al. 2015). Furthermore, MLSA showed that most X. arboricola pv. corylina, X. arboricola pv. juglandis, and X. arboricola pv. pruni strains are phylogenetically related and clustered in three distinct clonal complexes close to one another. In contrast, strains with no or uncertain pathogenicity were represented by numerous unrelated singletons scattered in the phylogenetic tree (Essakhi et al. 2015; Fischer-Le Saux et al. 2015). It has also been noticed that X. arboricola includes strains that are not classified as pathovars but have been shown to be pathogenic on jujube (Ziziphus jujuba), pepper (Capsicum annuum), and grapevine (Vitis vinifera). These strains have been described without pathovar description. Strains isolated from diverse host plants, i.e., onion (Allium cepa), chrysanthemum (Chrysanthemum morifolium), poinsettia, magnolia (Magnolia spp.), and clove (Syzygium aromaticum), without known pathogenicity and strains from walnut and plum (Prunus domestica) that are not pathogenic on their hosts of isolation were also included in the species (Garita-Cambronero et al. 2016a, b; Fischer-Le Saux et al. 2015). Hence, X. arboricola comprises a set of nonhomogeneous strains with a not-yet-clarified phylogenetic relationship to each other. The current taxonomy of X. arboricola in the public databases, e.g., NCBI GenBank, seems to be misleading because of the inclusion of dozens of mislabeled strains within the species (e.g., X. arboricola strain 3307 with GenBank accession number JACHOG000000000.1, which belongs to clade I of the genus). Taking all this evidence together, a comprehensive taxonomic overview using all available strains designated as X. arboricola is warranted to refine the classification of the members of the species. Thus, the main purpose of the present study was to clarify the taxonomic composition of X. arboricola species based on available whole-genome sequences.

    MATERIALS AND METHODS

    Genome resources.

    All the publicly available genome sequences designated as X. arboricola (up to March 2021) were retrieved from the NCBI GenBank database and subjected to phylogenetic analyses. Furthermore, type strains of all validly described Xanthomonas species (n = 31) were included in the analyses to determine the taxonomic position of X. arboricola strains that were misidentified in the literature. In total, 132 whole-genome sequences labeled as X. arboricola were retrieved from the GenBank database and subjected to further analyses.

    Phylogenetic analyses and comparative genomics.

    All genome sequences were transferred to the Galaxy Europe platform and reannotated using Prokka v.1.14.6, and the resulting GFF3 files were used to construct a pan-genome using Roary v.3.13.0 (Goecks et al. 2010; Page et al. 2015; Seemann 2014; Stein 2013). The core and pan-genomes were estimated using Roary with 90% minimum BLASTp identity (with a length of 544,759 bp) and the Markov Cluster Algorithm (MCL) (Page et al. 2015). Subsequently, to determine the phylogenetic position of the strains, a maximum-likelihood tree was constructed from the concatenated core-genome alignment produced by Roary (genes present in 99% of the strains were defined as the core genome) using IQ-TREE v.1.5.5, and ModelFinder selected GTR+R7 as the best DNA substitution model (Kalyaanamoorthy et al. 2017; Nguyen et al. 2015). Branch support was assessed by ultrafast bootstrap analysis and a Shimodaira-Hasegawa approximate likelihood ratio test using 1,000 replicates, with all other options left at their defaults (Minh et al. 2013). The resulting phylogenetic trees were visualized and manipulated using MEGA 7.0 software (Kumar et al. 2016). All trees were rooted using Stenotrophomonas maltophilia NCTC 10258 as an outgroup.
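
    The annotation, pan-genome, and tree-building steps described above were run on Galaxy Europe, so the exact invocations are not given in the text. As a rough sketch only, the Python helper below assembles equivalent command lines for local installations of Prokka, Roary, and IQ-TREE. The function name, directory layout, and input file names are hypothetical; the choice to pass GTR+R7 directly (rather than rerun ModelFinder) and to let Roary use its default aligner are assumptions. The helper only builds argument lists and does not execute anything.

```python
from pathlib import Path


def build_commands(genomes, outgroup, threads=8):
    """Return argument lists for the annotation, pan-genome and tree steps.

    genomes  : paths to genome FASTA files (hypothetical names)
    outgroup : sequence name of the outgroup in the core alignment
               (Roary names sequences after the GFF file basenames)
    """
    commands = []
    gff_files = []
    for fasta in genomes:
        prefix = Path(fasta).stem
        # Prokka writes <outdir>/<prefix>.gff along with its other outputs.
        commands.append([
            "prokka", "--outdir", f"prokka/{prefix}",
            "--prefix", prefix, "--cpus", str(threads), str(fasta),
        ])
        gff_files.append(f"prokka/{prefix}/{prefix}.gff")

    # Roary: -i = minimum BLASTp identity (90%), -cd = percentage of isolates
    # a gene must be in to count as core (99%), -e = write a core gene alignment.
    commands.append([
        "roary", "-e", "-i", "90", "-cd", "99",
        "-p", str(threads), "-f", "roary_out", *gff_files,
    ])

    # IQ-TREE: fixed GTR+R7 model, 1,000 ultrafast bootstrap and
    # 1,000 SH-aLRT replicates, rooted on the named outgroup.
    commands.append([
        "iqtree", "-s", "roary_out/core_gene_alignment.aln",
        "-m", "GTR+R7", "-bb", "1000", "-alrt", "1000", "-o", outgroup,
    ])
    return commands
```

    Each returned list can be passed to subprocess.run once the tools are installed; keeping the construction separate from execution makes the parameters easy to inspect before a long run.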

    The presence and absence scheme of genes in the accessory genome was used to create a binary tree using FastTree (Price et al. 2010), and a heat map depicting gene presence and absence was generated and visualized with Phandango interactive viewer (Hadfield et al. 2018). To infer the taxonomic relationships of the strains designated as X. arboricola, average nucleotide identity based on BLASTn (ANI) and digital DNA-DNA hybridization (dDDH) indexes were calculated considering all pairs of the strains as well as type strains of all Xanthomonas spp. as reference. ANI was estimated using the JSpeciesWS (Richter et al. 2016) and ANI calculator (Rodriguez-R and Konstantinidis 2016) online services, and Genome-to-Genome Distance Calculator (v.2.1) was used to calculate dDDH (Meier-Kolthoff et al. 2013; Richter et al. 2016). A combination of ANI and dDDH indices was used to assign a “standalone species” taxonomic status to a given taxon. When ANI and dDDH values were below the accepted threshold for prokaryotic species description, i.e., ≤95 and ≤70% for ANI and dDDH, respectively, the corresponding strain was considered a potential novel species (Kim et al. 2014).
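
    The species-delineation rule in the preceding paragraph can be written as a small decision function. This is a minimal sketch, not the authors' code: the function name and the third "discordant" outcome (one index above its cutoff, the other at or below it) are illustrative assumptions added to make the rule total. The thresholds are the ones stated in the text.

```python
ANI_THRESHOLD = 95.0   # percent; values at or below suggest distinct species
DDDH_THRESHOLD = 70.0  # percent; values at or below suggest distinct species


def classify_pair(ani, dddh):
    """Classify one genome pair from its ANI and dDDH values (both in %)."""
    if ani > ANI_THRESHOLD and dddh > DDDH_THRESHOLD:
        return "same species"
    if ani <= ANI_THRESHOLD and dddh <= DDDH_THRESHOLD:
        return "distinct species"
    # The two indexes disagree; the pair needs closer inspection.
    return "discordant"
```

    For example, a pair with 93% ANI and 52% dDDH falls under "distinct species", whereas a pair with 96% ANI and 68% dDDH is returned as "discordant" because only one index clears its cutoff.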

    RESULTS

    X. arboricola dataset in NCBI GenBank.

    By March 2021, 132 whole-genome sequences designated as X. arboricola were deposited in the NCBI GenBank database (Table 1, Supplementary Table S1). As for the hosts of isolation, 30 strains were isolated from walnut, 22 strains from common bean (Phaseolus vulgaris), 11 strains from Arabidopsis thaliana, and six strains from each of pepper, peach (Prunus persica), strawberry, and tomato (Solanum lycopersicum) plants. Three strains were isolated from filbert plants, and two strains were isolated from each of almond (Prunus amygdalus), banana (Musa acuminata), G. abyssinica, and Prunus salicina. One strain was also isolated from each of arracacha, barley (Hordeum vulgare), Magnolia sp., poplar (Populus × canadensis), St. Lucie cherry (Prunus mahaleb), Japanese arrowroot (Pueraria lobata), Prunus sp., white willow (Salix alba), soil, and Zantedeschia aethiopica, while 24 strains had an undetermined origin (Table 1, Supplementary Table S1).

    TABLE 1. Metadata and genome sequences of the bacterial strains used in this study

    Phylogenomics analyses.

    Core gene alignment, ANI, and dDDH analyses were used to shed light on the taxonomic status of the 132 strains designated as X. arboricola in the NCBI GenBank database. The core genome of the 132 strains consisted of 1,207 genes (4.00% of total genes), indicating high genetic diversity within this heterogeneous group of strains (Fig. 1). When all type strains of Xanthomonas spp. were added to the 132 X. arboricola data set, the number of genes in the core genome was significantly reduced to only 243 genes (0.38% of total genes), whereas the number of accessory genes (shell + cloud genes) was increased from 28,283 to 63,106, as detailed in Figure 1A and C. Core genome-based phylogeny using the 132 X. arboricola genomes along with the type strains of all validly described Xanthomonas species revealed significant taxonomic diversity among the strains designated as X. arboricola in the public databases. Phylogenetic trees constructed using the core genome alignment revealed 96 of 132 strains as genuine X. arboricola strains, while 36 strains were identified as mislabeled X. arboricola strains (Fig. 2). Based on the core genome phylogeny rooted by S. maltophilia NCTC 10258, a monophyletic clade consisting of 96 strains, including the type strain (indicated by the superscript “T”) of X. arboricola CFBP 2528T, was differentiated from the other Xanthomonas spp. with 100% bootstrap value (Supplementary Fig. S1). These 96 strains were considered taxonomically genuine X. arboricola strains, among which the ANI and dDDH values were consistently >95 and >70%, respectively (Supplementary Fig. S2). On the contrary, based on the core genome phylogeny, as well as ANI and dDDH indexes, 36 of 132 X. arboricola strains were identified as mislabeled strains that should not have been assigned to this species (Supplementary Fig. S1). DNA similarity between the latter 36 strains and the type strain of X. arboricola was consistently less than the accepted threshold (≤95 and ≤70% ANI and dDDH values, respectively) for definition of prokaryotic species (Supplementary Fig. S3). When the latter 36 strains were excluded from the X. arboricola dataset, the remaining 96 genuine X. arboricola strains had a considerably increased core genome size (n = 1,876 genes; Fig. 1A and D), comprising 11.66% of total genes. Core genome calculation using the 36 mislabeled X. arboricola strains along with the type strains of all Xanthomonas spp. showed that only 0.39% of all genes (n = 218) were core genome (Fig. 1A and E). For detection of possible recombination events within the X. arboricola members, the core genome alignment was obtained from Roary in FASTA format and used as input for Gubbins (Genealogies Unbiased by Recombinations in Nucleotide Sequences) analysis with default settings (Croucher et al. 2015). Results showed that the recombination rate within X. arboricola strains was low (i.e., <1) and bootstrap support was >90 across all nodes; therefore, the phylogenetic trees constructed from the core genome alignment were considered reliable (data not shown).
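
    The core-genome percentages reported above are the number of core gene families divided by the total number of gene families in the pan-genome. The sketch below illustrates that bookkeeping under the bins Roary uses in its summary statistics (core: present in at least 99% of genomes; soft core: 95% to <99%; shell: 15% to <95%; cloud: <15%). The function names and the toy presence counts in the usage note are hypothetical and are not taken from the study's data.

```python
def roary_category(n_present, n_genomes):
    """Assign one gene family to a Roary-style pan-genome category."""
    fraction = n_present / n_genomes
    if fraction >= 0.99:
        return "core"
    if fraction >= 0.95:
        return "soft core"
    if fraction >= 0.15:
        return "shell"
    return "cloud"


def summarize(presence_counts, n_genomes):
    """Count gene families per category and return the core percentage.

    presence_counts holds, for each gene family, the number of genomes
    in which that family was found.
    """
    counts = {"core": 0, "soft core": 0, "shell": 0, "cloud": 0}
    for n_present in presence_counts:
        counts[roary_category(n_present, n_genomes)] += 1
    total = sum(counts.values())
    core_percent = 100.0 * counts["core"] / total
    return counts, core_percent
```

    With 132 genomes, a family found in 131 genomes still counts as core, whereas one found in 130 genomes drops to soft core. This is why adding divergent genomes to a data set shrinks the core genome so quickly.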

    Fig. 1. A, Classification of orthologous gene families into core genome and accessory genome within different sets of Xanthomonas spp. using Roary (Park and Andam 2019). B, Pan-genome analyses for all 132 Xanthomonas arboricola sequences retrieved from NCBI GenBank. C, Pan-genome analyses for the same X. arboricola strains in comparison with the type strains of Xanthomonas spp. D, Pan-genome analyses for 96 genuine X. arboricola strains. E, Pan-genome analyses for the 36 mislabeled X. arboricola strains along with the type strains of all Xanthomonas species. The plots were generated using Roary based on the gene presence-absence matrix, showing the distribution of genes present in a genome, and visualized with Phandango. Shaded segments represent gene presence, and white segments represent gene absence. The pan-genome is displayed beginning with the core genome on the left of each plot and shifting into the accessory genome (shell and cloud genomes) with increasing gene sequence discrepancy.


    Fig. 2. A, Core genome-based phylogeny of 96 genuine Xanthomonas arboricola strains. B, Core genome-based phylogeny of 36 mislabeled X. arboricola strains along with the type strains of all Xanthomonas species rooted with Stenotrophomonas maltophilia NCTC 10258. The trees were generated using IQ-TREE (v.1.5.5) with the general time-reversible model and a FreeRate model of among-site rate heterogeneity with seven categories (GTR+R7). Branch support was assessed by ultrafast bootstrap analysis using 1,000 replicates. Circles indicate the type strains of validly described Xanthomonas species, whereas the strains labeled with various symbols indicate different mislabeled X. arboricola strains within the genus. PT, pathotype strain; T, type strain.

    Taxonomic analyses.

    In congruence with the MLSA results (Fischer-Le Saux et al. 2015), the core genome-based phylogeny confirmed that the three economically important pathovars, i.e., X. arboricola pv. pruni, X. arboricola pv. corylina, and X. arboricola pv. juglandis, are phylogenetically closely related, being clustered in a monophyletic clade. However, they still remain as three distinct groups based on the host range and symptoms (Fig. 2A). In the present study, eight X. arboricola pv. pruni strains, i.e., 15-088, CITA 9, CITA 99, IVIA 2626.1, MAFF 301420, MAFF 311562, MAFF 301427, and Xap 33, along with the pathotype strain CFBP 3894PT clustered in a monophyletic subclade differentiated by 100% bootstrap value from their closest neighbor subclades that consisted of X. arboricola pv. juglandis and X. arboricola pv. corylina strains (Fig. 2A). Three X. arboricola pv. corylina strains, i.e., CFBP 1159PT, CFBP 2565, and NCCB 100457, clustered in a monophyletic subclade phylogenetically more related to the X. arboricola pv. juglandis group than to X. arboricola pv. pruni (Fig. 2A). A sister group of X. arboricola pv. corylina strains consisted of a taxonomically diverse subclade including the pathotype strain of X. arboricola pv. juglandis CFBP 2528T,PT as well as nine X. arboricola pv. juglandis strains, i.e., 3, CFBP 7179, CFBP 8253, CPBF 1521, CPBF 427, DW3F3, J303, NCPPB 1447, and Xaj 417. Although the latter subclade was clearly differentiated by 100% bootstrap value from the X. arboricola pv. corylina group, several subclusters were observed for these strains (Fig. 2A). For instance, the four X. arboricola pv. juglandis strains, CFBP 2528T, CFBP 7179, NCPPB 1447, and DW3F3, all pathogenic on walnut, clustered in a unique group, whereas the remaining six strains designated as X. arboricola pv. juglandis, i.e., 3, CFBP 8253, CPBF 1521, CPBF 427, J303, and Xaj 417, along with several X. arboricola strains with no pathovar status, grouped together (Fig. 2A). All these latter strains, i.e., CFSAN033077, CFSAN033078, CFSAN033079, CFSAN033080, CFSAN033081, CFSAN033082, CFSAN033083, CFSAN033084, CFSAN033085, CFSAN033086, CFSAN033087, CFSAN033088, and CFSAN033089, were isolated from walnut and deposited in the NCBI database without pathovar status (Supplementary Table S1) (Higuera et al. 2015), but they are pathogenic on the host of isolation and therefore could belong to X. arboricola pv. juglandis.

    The pathotype strain of the arracacha pathogen, X. arboricola pv. arracaciae CFBP 7407PT, was the genetically closest strain to the three economically important pathogens’ group (pvs. pruni, corylina, and juglandis). The strains designated as X. arboricola pv. fragariae in GenBank were scattered through different clades, whereas the strains CFBP 6773 and LMG 19144 were phylogenetically closer to the pathotype strain LMG 19145PT than the strain LMG 19146 was (Fig. 2A). Furthermore, the latter strain clustered in the same subclade with the pathotype strain of the banana pathogen X. arboricola pv. celebensis NCPPB 1832PT. The other strain of X. arboricola pv. celebensis, NCPPB 1630, clustered in the same subclade with the pathotype strain of X. arboricola pv. zantedeschiae CFBP 7410PT (Fig. 2A). All these X. arboricola members, consisting of 96 strains (Fig. 2A), were further analyzed for their ANI and dDDH indexes. The resulting data (Supplementary Fig. S2) showed that the 96 X. arboricola strains, including the type strain of the species, as well as the pathotype strains of seven pathovars (among nine), had >96% ANI with one another and with the type strain of the species. Hence, these 96 strains were considered genuine strains of X. arboricola.

    On the contrary, strains of X. arboricola pv. guizotiae, i.e., CFBP 7408PT and CFBP 7409; X. arboricola pv. populi, i.e., CFBP 3123PT and CFBP 3122; as well as the strains CFBP 7645 and CFBP 8152 isolated from walnut and common bean, respectively, clustered as neighbor clades of X. arboricola but in separate phylogenetic clusters (Fig. 2B). Based on core genome phylogeny and ANI/dDDH indexes, the pathotype strains of neither X. arboricola pv. populi CFBP 3123PT nor X. arboricola pv. guizotiae CFBP 7408PT could be considered genuine members of X. arboricola (Fig. 2B, Table 2). Comparison of X. arboricola pv. populi CFBP 3123PT with the type strain of X. arboricola CFBP 2528T yielded an ANI value of 93% and a dDDH value of 52%, whereas comparison of X. arboricola pv. guizotiae CFBP 7408PT with X. arboricola CFBP 2528T showed an ANI value of 93% and a dDDH value of 56% (Table 2). Hence, these two pathovars should not be assigned to the species X. arboricola according to the criteria of species definition in prokaryotes (Kim et al. 2014). Furthermore, comparison of the strains CFBP 7645 and CFBP 8152 with X. arboricola CFBP 2528T showed an ANI value of 93% and a dDDH value of 55 to 57%. However, comparing these two strains with each other showed an ANI value of 97% and a dDDH value of 79% (Table 2). Therefore, these two strains could be considered a novel species within the genus Xanthomonas.
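
    The reasoning above, in which two strains are distant from the type strain yet close to each other, amounts to grouping genomes by pairwise similarity. The sketch below shows one way to do that with single-linkage grouping on ANI values above 95%. The function name and the single-linkage choice are illustrative assumptions rather than the procedure used in the study, which also relied on dDDH and the core genome phylogeny.

```python
def species_clusters(strains, ani, threshold=95.0):
    """Group strains whose pairwise ANI (in %) exceeds the threshold.

    ani maps a (strain_a, strain_b) tuple to the ANI between the two;
    grouping is single-linkage, implemented with a small union-find.
    """
    parent = {s: s for s in strains}

    def find(s):
        # Walk up to the representative, shortening the path on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), value in ani.items():
        if value > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(group) for group in groups.values())
```

    Fed with the three ANI values quoted above (93% for each of CFBP 7645 and CFBP 8152 against CFBP 2528T, and 97% between the two strains), the function returns two groups: CFBP 2528T alone, and CFBP 7645 together with CFBP 8152.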

    TABLE 2. Average nucleotide identity based on BLAST (ANIb; lower diagonal) and digital DNA–DNA hybridization (dDDH; upper diagonal) values generated from the DNA sequence similarity comparisons among different pathotype strains of Xanthomonas arboricola

    Eight strains, i.e., 2949, 2955, 2957, 2974, 3640, CFBP 7622, CFBP 7653, and F2, all designated as X. arboricola in the NCBI GenBank database, clustered with the type strain of X. euroxanthea CPBF 424T and two nonpathogenic strains of X. euroxanthea, CPBF 367 and CPBF 426 (Fig. 2B) (Fernandes et al. 2021). As detailed in Supplementary Table S2, ANI and dDDH indexes between the former strains and the latter type strain were more than 95 and 70%, respectively. Thus, these strains should be relabeled as X. euroxanthea in GenBank. Furthermore, strain NL_P126, isolated from A. thaliana in the United States, was phylogenetically closely related to, but still distinct from, the X. euroxanthea clade, showing 93% ANI and 52% dDDH similarity to the type strain of X. euroxanthea CPBF 424T (Fig. 2B, Supplementary Table S2). Strain MEU_M1 clustered with the type strain of X. hortorum CFBP 4925T, the pathotype strain of X. hortorum pv. cynarae CFBP 4188PT, and the type strain of the original species X. cynarae CFBP 4188T, which recently became a later heterotypic synonym of X. hortorum (Morinière et al. 2020) (Fig. 2B, Table 3). DNA similarity comparison of MEU_M1 with X. hortorum pv. hederae CFBP 4925T and X. hortorum pv. cynarae CFBP 4188PT yielded ANI values of 95 and 95% and dDDH values of 69 and 66%, respectively. The strain 3058 with an unknown host of isolation and geographic origin, designated as X. arboricola in GenBank, was distinct from all the validly described Xanthomonas species, being separated from its closest species, X. fragariae PD885T and X. populi CFBP 1817T, with consistently <95% ANI and <70% dDDH similarities (Fig. 2B, Table 3). Comparison of strain 3058 with X. fragariae PD885T and X. populi CFBP 1817T showed ANI values of 86 and 87% and dDDH values of 32 and 34%, respectively. A comprehensive ANI calculation, including strain 3058 and all the type strains of Xanthomonas spp., as well as all X. arboricola strains included in this study, showed <90% ANI between strain 3058 and all the other Xanthomonas strains, as shown in Supplementary Figures S3 and S4.

    TABLE 3. Average nucleotide identity based on BLAST (ANIb; lower diagonal) and digital DNA–DNA hybridization (dDDH; upper diagonal) values generated from the DNA sequence similarity comparisons among different sets of mislabeled Xanthomonas arboricola strains and the type/reference strains of Xanthomonas spp.

    Surprisingly, two additional phylogenetic clades with a considerable number of X. arboricola strains were observed outside the core members of the species. A monophyletic clade consisting of the strains FOR_F20, FOR_F21, FOR_F23, FOR_F26, PLY_3, PLY_4, and PLY_9, all isolated from A. thaliana in France, was phylogenetically closely related to the type strain of X. dyei CFBP 7245T. The seven X. arboricola strains, along with the type strains of X. dyei CFBP 7245T, X. vesicatoria ATCC 35937T, and X. pisi CFBP 4643T, all clustered in a monophyletic clade as shown in Figure 2B. ANI and dDDH calculations showed that sequence similarity among the seven strains exceeded 95 and 70%, respectively, indicating that they belong to the same species (Table 4, Supplementary Figs. S2 to S4). However, this atypical clade was too distinct from the three latter type strains (ANI and dDDH were consistently below the species definition threshold) to be considered members of the same species (Table 4). For instance, ANI values between the type strains of X. pisi, X. vesicatoria, and X. dyei and the seven X. arboricola strains were 89 to 93%. Hence, the seven mislabeled X. arboricola strains could be considered as a novel species within the genus Xanthomonas.

    A set of 11 strains, i.e., CFBP 8590, CFBP 8591, CFBP 8592, CFBP 8594, CFBP 8595, CFBP 8597, CFBP 8598, CFBP 8599, CFBP 8600, F16, and F17, with unknown source of isolation and geographic origin, clustered together in a monophyletic clade along with the type strain of the cannabis (Cannabis sativa) pathogen “X. cannabis” NCPPB 2877 (Fig. 2B). Although ANI values between these strains and the strain used to describe the proposed species X. cannabis were consistently above the accepted threshold for prokaryotic species definition (95 to 97%), in most cases, dDDH values between the two groups were on the border of the cutoff for species definition (68 to 69%), leading us to label the 11 X. arboricola strains in this clade as “X. cannabis”-like strains (Supplementary Table S3). The exception was strain CFBP 8594, which was unambiguously identified as X. cannabis, sharing 97% ANI and 79% dDDH with the type strain of the species.
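    The decision logic applied to these clades can be illustrated with a minimal Python sketch. It assumes cutoffs of 95% ANI and 70% dDDH, treats values equal to a cutoff as passing, and labels a pair as borderline when only one of the two metrics passes. The function name, the category labels, and the dDDH value in the last example are hypothetical and are not taken from the study.

```python
# Species-delineation cutoffs assumed for this sketch (percent).
ANI_CUTOFF = 95.0
DDDH_CUTOFF = 70.0


def classify_pair(ani: float, dddh: float) -> str:
    """Classify a strain pair from its ANI and dDDH values (both in percent)."""
    ani_ok = ani >= ANI_CUTOFF
    dddh_ok = dddh >= DDDH_CUTOFF
    if ani_ok and dddh_ok:
        return "same species"
    if ani_ok or dddh_ok:
        # The two metrics disagree, as for the "X. cannabis"-like strains.
        return "borderline"
    return "distinct species"


# Strain CFBP 8594 versus "X. cannabis" NCPPB 2877 (values reported in the text).
print(classify_pair(97.0, 79.0))
# A pair inside the reported ranges of 95 to 97% ANI and 68 to 69% dDDH.
print(classify_pair(96.0, 68.5))
# ANI inside the reported 89 to 93% range; the dDDH value is a placeholder.
print(classify_pair(91.0, 45.0))
```

    Keeping a separate borderline category, rather than forcing a binary call, mirrors the cautious "-like" label used for strains whose two metrics disagree.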

    TABLE 4. Average nucleotide identity based on BLAST (ANIb; lower diagonal) and digital DNA–DNA hybridization (dDDH; upper diagonal) values generated from the DNA sequence similarity comparisons among different sets of mislabeled Xanthomonas arboricola strains and the type/reference strains of Xanthomonas spp.

    Members of Xanthomonas spp. are phylogenetically divided into two distinct clades, clade I and clade II (Khojasteh et al. 2019; Shah et al. 2021). All the aforementioned mislabeled X. arboricola strains, while not belonging to this species, still clustered within clade II of Xanthomonas. However, the X. arboricola strain 3307, with an unknown host of isolation and geographic origin, clustered within clade I of the genus and was found to be phylogenetically closely related to the type strains of X. sacchari CFBP 4641T and X. sontii PPL1T (Fig. 2B). ANI and dDDH calculations showed that strain 3307 shared <93% ANI and <59% dDDH with the type strains of clade I xanthomonads, i.e., X. albilineans, X. hyacinthi, X. sacchari, X. sontii, X. theicola, and X. translucens, as detailed in Table 5. Hence, strain 3307 was too distinct from all the type strains of validly described Xanthomonas spp. to be considered a member of any of these species. Thus, it could represent a novel species within clade I of Xanthomonas.

    TABLE 5. Average nucleotide identity based on BLAST (ANIb; lower diagonal) and digital DNA–DNA hybridization (dDDH; upper diagonal) values generated from the DNA sequence similarity comparisons among different sets of mislabeled Xanthomonas arboricola strains and the type/reference strains of Xanthomonas spp.
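    Tables with this layout store two metrics in one square matrix: ANIb below the diagonal and dDDH above it. The following Python sketch shows one way to split such a combined matrix into per-pair values. The function name, the strain labels, and all numbers in the example matrix are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List, Tuple


def split_combined_matrix(
    labels: List[str], matrix: List[List[float]]
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Return {(strain_a, strain_b): (ani, dddh)} for every unordered pair.

    Assumes ANI sits in the lower triangle and dDDH in the upper triangle.
    """
    pairs = {}
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            dddh = matrix[i][j]  # upper triangle: row index < column index
            ani = matrix[j][i]   # lower triangle: row index > column index
            pairs[(labels[i], labels[j])] = (ani, dddh)
    return pairs


# Placeholder matrix for three hypothetical genomes (percent values).
labels = ["strain_A", "strain_B", "type_strain_X"]
matrix = [
    [100.0, 78.0, 45.0],
    [97.0, 100.0, 44.0],
    [91.0, 90.5, 100.0],
]

for pair, (ani, dddh) in split_combined_matrix(labels, matrix).items():
    print(pair, "ANI:", ani, "dDDH:", dddh)
```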

    DISCUSSION

    In this study, we used phylogenomics and comparative genomics on all available whole-genome resources of X. arboricola to refine the taxonomy of the species. Our results showed that, of the nine described pathovars within X. arboricola, pathotype strains of only seven pathovars are taxonomically genuine, belonging to the core clade of the species regardless of their pathogenicity status on the host of isolation (Supplementary Fig. S1). However, strains of X. arboricola pv. guizotiae and X. arboricola pv. populi do not belong to the species X. arboricola because of the low ANI and dDDH similarities between the type strain of the species and the pathotype strains of these two pathovars. To address this taxonomic issue, we propose to elevate the two pathovars X. arboricola pv. guizotiae and X. arboricola pv. populi to the rank of a species as X. guizotiae sp. nov. and X. populina sp. nov., respectively. In addition to these two taxa, mislabeled strains of X. arboricola were scattered within the genus, belonging to several hypothetical novel species that need formal descriptions.

    High genetic diversity within the members of X. arboricola has frequently been noted in terms of virulence features, phylogenetic relationships, and genomic repertoires. The original X. arboricola species became more complex when Fischer-Le Saux et al. (2015) added the members of X. campestris pv. arracaciae, X. campestris pv. guizotiae, and X. campestris pv. zantedeschiae to the species X. arboricola and named them X. arboricola pv. arracaciae, X. arboricola pv. guizotiae, and X. arboricola pv. zantedeschiae, respectively, based on MLSA of seven housekeeping genes. Our genomics-informed analyses, however, suggest that X. arboricola pv. guizotiae and X. arboricola pv. populi should be considered as standalone species. Pathogenicity-associated genomic features such as type III effectors (T3Es) also vary greatly between X. arboricola strains. Hajri et al. (2012) showed that the stone fruit and nut pathogens X. arboricola pv. pruni, X. arboricola pv. corylina, and X. arboricola pv. juglandis possess the largest T3E repertoires, whereas X. arboricola pv. celebensis and X. arboricola pv. fragariae harbor the smallest T3E repertoires. The three stone fruit and nut pathogens are phylogenetically closely related, forming a monophyletic clade based on MLSA (Fischer-Le Saux et al. 2015) and whole-genome analyses (Fig. 2A), thus constituting the core members of the species.

    The versatility of the X. arboricola members is reflected not only in their phylogenetic relationships, but also in their host of isolation, pathogenicity, and host range. Considering the pathogenicity features of the strains, Merda et al. (2016) divided the X. arboricola pathovars into two groups whereby the stone fruit and nut pathogens, i.e., X. arboricola pv. pruni, X. arboricola pv. corylina, and X. arboricola pv. juglandis, were called successful pathovars, which were distinct from unsuccessful pathovars, i.e., X. arboricola pv. zantedeschiae, X. arboricola pv. guizotiae, X. arboricola pv. arracaciae, X. arboricola pv. celebensis, and X. arboricola pv. populi, as well as several nonpathogenic strains isolated from various hosts. Virulence of the members of X. arboricola pv. fragariae has often been ambiguous because, in many cases, they failed to induce symptoms on strawberry upon artificial inoculation (Gétaz et al. 2020; Merda et al. 2016; Vandroemme et al. 2013). However, pathogenicity of the pathotype strain of X. arboricola pv. fragariae LMG 19145PT on strawberry cultivars Candonga, Sabrina, and Murano grown in a chamber with 95% humidity has recently been confirmed (Ferrante and Scortichini 2018). Furthermore, strains isolated from dysoxylum (Dysoxylum spectabile), hardenbergia (Hardenbergia sp.), liquidambar (Liquidambar styraciflua), magnolia, and mahonia (Mahonia lomariifolia) were also identified as members of X. arboricola (Young et al. 2010). Our results showed that, besides the well-defined hosts of seven X. arboricola pathovars, i.e., arracacha, arum lily, banana, filbert, strawberry, walnut, and stone fruits, genuine strains of the species were also isolated from A. thaliana (MEDV_A37 and MEDV_P39 strains), pepper, H. vulgare, magnolia, common bean, Pueraria lobata, soil, and tomato. In some cases, strains that were isolated from atypical host plants and identified as X. arboricola have not been evaluated for their pathogenicity and host range. The tomato strains isolated in Australia displayed pathogenic reactions when infiltrated but did not produce typical lesions when spray-inoculated on tomato plants. Hence, they were described as weakly pathogenic (Roach et al. 2018, 2019).

    In conclusion, our analyses of 132 genome sequences provide novel insight into the genetic diversity of X. arboricola and confirm the need for a taxonomic revision of the species. More specifically, phylogenetic analyses suggest that the pathovars X. arboricola pv. guizotiae and X. arboricola pv. populi should be considered as novel species, although the taxonomic heterogeneity of the species extends beyond these two clades. At least one additional novel species including a considerable number of mislabeled strains of X. arboricola isolated from A. thaliana in France could be described within Xanthomonas (Fig. 2B), whereas several individual strains seem to represent hypothetical novel species not only in clade II but also in clade I of the genus. Taxonomic refinement of X. arboricola led to a significant change in the core genome of the species, which increased from 1,207 genes in the 132 unrefined strains to 1,876 genes in the 96 genuine strains of the species (Fig. 1A, B, and D). These findings raise the question of whether the current taxonomy of stone fruit- and nut tree-associated Xanthomonas strains is technically applicable and, at the same time, emphasize the need for more detailed taxonomic investigations among the phylogenetically diverse X. arboricola strains. Such investigations would help plant pathology agencies and industry inspectors to specifically target the genuine pathogens and disregard the mislabeled lineages. Only a formal taxonomic study would address this issue, with delineation of appropriate epithets and species descriptions for the new taxa beyond the X. arboricola species complex.
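    The growth of the core genome after removal of mislabeled strains follows from its definition as the set of genes shared by every genome in the set: excluding divergent genomes can only keep or enlarge that intersection. The toy Python sketch below illustrates the effect. It assumes that genes have already been clustered into families, which real pan-genome pipelines must do first, and the genome names and family identifiers are hypothetical.

```python
from typing import Dict, Set


def core_genome(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the gene families present in every genome of the collection."""
    sets = list(gene_sets.values())
    if not sets:
        return set()
    return set(sets[0]).intersection(*sets[1:])


# Hypothetical gene-family content of three genomes.
genomes = {
    "genuine_1": {"fam01", "fam02", "fam03", "fam04", "fam05"},
    "genuine_2": {"fam01", "fam02", "fam03", "fam04", "fam06"},
    "mislabeled_1": {"fam01", "fam02", "fam07"},
}

# Core genome of the full, unrefined collection.
core_all = core_genome(genomes)

# Core genome after excluding the mislabeled genome.
genuine_only = {name: g for name, g in genomes.items() if name.startswith("genuine")}
core_genuine = core_genome(genuine_only)

print(len(core_all), sorted(core_all))
print(len(core_genuine), sorted(core_genuine))
```

    In this toy case the core grows from two to four families once the divergent genome is excluded, the same direction of change as the reported increase from 1,207 to 1,876 genes.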

    Description of X. guizotiae sp. nov.

    X. guizotiae (gui.zo.ti'ae. N.L. fem. gen. guizotiae, of Guizotia, the generic name of the plant from which the strains were isolated). Originally described as X. campestris pv. guizotiae (Yirgou 1964) Dye 1978, the pathogen was transferred to X. arboricola based on partial sequencing of atpD, dnaK, efp, fyuA, glnA, gyrB, and rpoD genes (Fischer-Le Saux et al. 2015). Description of the species is the same as the original pathovar (Yirgou 1964), as well as X. arboricola pv. guizotiae (Fischer-Le Saux et al. 2015). The type strain is CFBP 7408T = NCPPB 1932T = ICPB XG102T = ICMP 5734T = LMG 731T. ANI of the type strain of X. guizotiae with the type strains of all validly described Xanthomonas spp. is <94%. GenBank genome accession number of the type strain CFBP 7408T is MDSK00000000.

    Description of X. populina sp. nov.

    Description of the species is the same as the original description of the pathogen by De Kam (1984). The bacterium was originally described as X. campestris pv. populi to encompass pathogenic strains isolated from poplar (Populus spp.) and was then transferred to X. arboricola and named X. arboricola pv. populi by Vauterin et al. (1995). Colonies are yellow on nutrient agar medium and grow at 30°C, in contrast to the other well-known poplar pathogen X. populi, which has an optimum growth temperature of 18°C. The aggressiveness of the strains of X. populina is low, and special environmental conditions are necessary for symptom development. The type strain is CFBP 3123T = ICMP 8923T = LMG 12141T. ANI of the type strain of X. populina with the type strains of all validly described Xanthomonas spp. is <93%. GenBank genome accession number of the type strain CFBP 3123T is MDEB00000000.

    ACKNOWLEDGMENTS

    We benefited from interactions promoted by COST Action CA16107 EuroXanth (https://euroxanth.eu/).

    The author(s) declare no conflict of interest.


    Funding: This study was supported by the Iran National Science Foundation (project number 99017084) and Shiraz University (Iran). S. Zarei and H. Mafakheri benefited from a sabbatical grant provided by the Iranian Ministry of Science and Technology for a 6-month fellowship at the University of Tehran, Iran. The work of E. Osdaghi was funded by the College of Agriculture Natural Resources, University of Tehran (1400-223).
