Comparative genomics of Verticillium dahliae isolates reveals the in planta-secreted effector protein recognized in V2 tomato plants

Plant pathogens secrete effector molecules during host invasion to promote host colonization. However, some of these effectors become recognized by host receptors, encoded by resistance genes, to mount a defense response and establish immunity. Recently, a novel resistance was identified in tomato, mediated by the single dominant V2 locus, to control strains of the soil-borne vascular wilt fungus Verticillium dahliae that belong to race 2. We performed comparative genomics between race 2 strains and resistance-breaking race 3 strains to identify the avirulence effector that activates V2 resistance, termed Av2. We identified 277 kb of race 2-specific sequence comprising only two genes that encode predicted secreted proteins, both of which are expressed by V. dahliae during tomato colonization. Subsequent functional analysis based on genetic complementation into race 3 isolates confirmed that one of the two candidates encodes the avirulence effector Av2 that is recognized in V2 tomato plants. The identification of Av2 will be helpful not only to select tomato cultivars that are resistant to race 2 strains of V. dahliae, as the corresponding V2 resistance gene has not yet been mapped, but also to monitor adaptations in the V. dahliae population to deployment of V2-containing tomato cultivars in agriculture.


INTRODUCTION
In nature, plants are continuously threatened by potential plant pathogens. However, most plants are resistant to most potential plant pathogens due to an efficient immune system that becomes activated by invasion patterns of diverse nature (Cook et al., 2015; Dangl & Jones, 2001). Throughout time, different conceptual frameworks have been put forward to describe the molecular basis of plant-pathogen interactions and the mechanistic underpinning of plant immunity. Initially, Harold Flor introduced the gene-for-gene model in which a single dominant host gene, termed a resistance (R) gene, induces resistance in response to a pathogen expressing a single dominant avirulence (Avr) gene (Flor, 1942). Isolates of the pathogen that do not express the allele of the Avr gene that is recognized escape recognition and are assigned to a resistance-breaking race. In parallel to these race-specific Avrs, non-race-specific elicitors were described as conserved microbial molecules that are often recognized by multiple plant species (Darvill & Albersheim, 1984). The recognition by plants of Avrs and of non-race-specific elicitors, presently known as microbe-associated molecular patterns (MAMPs), was combined in the 'zig-zag' model (Jones & Dangl, 2006). In this model, MAMPs are perceived by cell surface-localized pattern recognition receptors (PRRs) to trigger MAMP-triggered immunity (MTI), while effectors are recognized by cytoplasmic receptors that are known as resistance (R) proteins to activate effector-triggered immunity (ETI) (Jones & Dangl, 2006). Importantly, the model recognizes that Avrs function to suppress host immune responses in the first place, implying that these molecules, besides being avirulence determinants, act as [...] designated Ve (Diwan et al., 1999), comprising two genes that encode cell surface receptors of which one, Ve1, acts as a genuine resistance gene (Fradin & Thomma, 2006).
Shortly after its deployment in the 1950s, resistance-breaking strains appeared that were assigned to race 2, whereas strains that are contained by Ve1 belong to race 1 (Alexander, 1962). Thus, Ve1 is characterized as a race-specific R gene, and resistance-breaking strains have become increasingly problematic over time (Alexander, 1962; Dobinson et al., 1996) [...] relative humidity between 50% and 85%. For V. dahliae inoculation, 10-day-old seedlings were root-dipped for 10 min as previously described (Fradin et al., 2009). Disease symptoms were scored at 21 days post inoculation (dpi) by measuring the canopy area to calculate stunting as follows:

To determine the genomic localization of XLOC_00170 and Evm_344, the V. dahliae strain JR2 assembly and annotation were used (Faino et al., 2015) together with coverage plots from reads of race 3 and race 2 strains as described in comparative genomics approach IV (Table 2) [...] Evm344-R for Evm_344 (Table 3).

[...] Fig. 1a). First, we aimed to confirm the race assignment of eight V. dahliae strains that were previously tested by Usami et al. (2017) (Table 1). Additionally, three strains that were previously assigned to race [...] Importantly, most of the strains that were used by Usami and colleagues (2017), and that were previously assigned to race 2, did not cause significant stunting on Aibou, whereas most of the strains that were assigned to race 3 caused clear symptoms of Verticillium wilt disease (Fig. 1a, d). More specifically, this concerned [...] and VT-2A that were previously assigned to race 2 and 3, respectively (Usami et al., 2017), but for which the phenotyping in our assays was ambiguous due to a relatively low degree of virulence on Moneymaker plants (Fig. 1b, d). The [...] as can be observed on Moneymaker plants in our assays (Fig. 1b). This observation, combined with the observation that stunting on Aibou plants by any race 3 strain is generally less than stunting on Moneymaker plants (Fig. 1b, d), [...] (Table 1).

In this study, we determined the genome sequences of four additional V. dahliae strains that belong to race 2, namely TO22, UD1-4-1, GF1207 and GFCA2, and four additional race 3 strains, namely GFCB5, GF1192, VT2A and HOMCF, with Oxford Nanopore Technologies (ONT) sequencing using a MinION device (Table 1). For each strain, ~2-3 Gb of sequence data was produced, representing 50-100x genome coverage based on the ~35 Mb gapless reference genome of V. dahliae strain JR2 (Faino et al., 2015). Subsequently, we performed self-correction of the reads, read trimming and genome assembly, leading to genome assemblies ranging from 18 contigs for strain UD1-4-1 to 69 for strain GF1207 (Table 1).
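The coverage range quoted above follows from dividing the sequenced bases by the reference genome length. A minimal Python sketch of that arithmetic is given here; the function name `fold_coverage` is hypothetical, and the inputs are the rounded figures from the text (2-3 Gb of reads, ~35 Mb genome), not per-strain values.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Return mean sequencing depth: total sequenced bases / genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Rounded figures from the text; actual per-strain yields are in Table 1.
GENOME_SIZE = 35e6  # ~35 Mb JR2 reference genome

low = fold_coverage(2e9, GENOME_SIZE)   # ~2 Gb of reads
high = fold_coverage(3e9, GENOME_SIZE)  # ~3 Gb of reads

print(f"{low:.0f}x to {high:.0f}x")  # prints "57x to 86x"
```

Both values fall inside the 50-100x range stated for the sequenced strains.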

To perform comparative genomics, for each of the approaches self-corrected reads from the selected V. dahliae race 3 strains were mapped against the assembly of V. dahliae strain TO22 (approaches I-III) or the telomere-to-telomere assembly of strain JR2 (approach IV) (Li, 2013), and regions that were not covered by race 3 reads were retained (Table 2). Next, self-corrected reads from the selected race 2 strains were mapped against the retained reference genome-specific regions that are absent from the race 3 strains. Sequences that were found in every race 2 strain were retained as candidate regions to encode the Avr molecule. Sequences that are shared by the V. dahliae strain TO22 reference assembly and the race 2 strains, and that are absent from the race 3 strains, were mapped against the V. dahliae strain JR2 genome assembly, and common genes were extracted. Sequences that did not map to the V. dahliae strain JR2 genome assembly were de novo annotated and signal peptides for secretion at the N-termini of the encoded proteins were predicted to identify potential effector genes.
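The filtering logic of this strategy can be illustrated with interval arithmetic: start from the reference, subtract everything covered by any race 3 strain, then keep only what every race 2 strain covers. The sketch below is a toy illustration under stated assumptions, not the pipeline used in the study. It assumes that read mappings have already been summarized as half-open covered intervals on a single contig, and all function names and coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one contig


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and fuse overlapping or adjacent ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(regions: List[Interval], covered: List[Interval]) -> List[Interval]:
    """Return the parts of `regions` that are not overlapped by `covered`."""
    covered = merge(covered)
    result: List[Interval] = []
    for start, end in merge(regions):
        cursor = start
        for c_start, c_end in covered:
            if c_end <= cursor:
                continue  # covered block lies entirely before the cursor
            if c_start >= end:
                break  # covered block lies entirely after this region
            if c_start > cursor:
                result.append((cursor, c_start))
            cursor = max(cursor, c_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions present in both interval lists."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def candidate_regions(contig_length: int,
                      race3_coverage: List[List[Interval]],
                      race2_coverage: List[List[Interval]]) -> List[Interval]:
    """Regions absent from all race 3 strains and present in all race 2 strains."""
    candidates = [(0, contig_length)]
    for cov in race3_coverage:
        candidates = subtract(candidates, cov)
    for cov in race2_coverage:
        candidates = intersect(candidates, cov)
    return candidates


# Hypothetical coverage on a 1,000 bp contig (illustrative values only).
race3 = [[(0, 400), (700, 1000)], [(0, 300), (600, 1000)]]
race2 = [[(350, 650)], [(450, 900)]]
result = candidate_regions(1000, race3, race2)
print(result)  # prints [(450, 600)]
```

Adding more race 3 strains can only shrink the retained regions, which is consistent with the target regions becoming smaller as strains are added across the approaches.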

Our strategy identified 670 kb of race 2-specific regions, containing 122 genes of which eight encode putative secreted proteins, for approach I (Table 2). For approach II, the addition of three strains reduced the target regions to 660 kb, containing 115 genes, of which six encode secreted proteins ( [...]

In planta expression of Av2 candidates
We anticipate that the genuine Av2 gene should not necessarily be expressed in [...] (Fig. 2b) [...] were sequenced in this study, and we were not able to identify any isolate that carries only one of the two effector genes (Fig. 3).

To assess the phylogenetic relationships between strains that carry the two Avr candidates and strains lacking these candidates, a phylogenetic tree was generated, showing that the strains can be grouped into three major clades, two of which carry strains that contain the two Avr candidate genes. However, within these clades closely related strains occur that lost the effector genes, suggesting the occurrence of multiple independent losses (Fig. 3). Overall, no obvious phylogenetic structure is apparent with respect to effector presence within the V. dahliae population.

The co-occurrence of both Avr effector candidate genes suggests that they [...] terminal repeat (LTR) retrotransposons (Fig. 4, Fig. 5). Typically, LS regions are characterized by a high abundance of PAV. As expected, the flanking genomic regions (100 kb) are highly variable between V. dahliae strains (Fig. 5). Even [...] well as of the Ve1-transgenic Moneymaker plants (Fig. 6a, b) [...] disease symptoms and could not induce stunting on these plants (Fig. 6a, b). As such, these complementation transformants of the race 3 strains GF-CB5 and HOMCF behaved essentially as the race 2 strain TO22 (Fig. 6a, b). Thus, these findings suggest that XLOC_00170 encodes Av2. All visual observations of stunting were supported by quantifications of fungal biomass by real-time PCR (Fig. 6c).

These measurements revealed that fungal biomass levels were only reduced on [...] (Fig. 7). Interestingly, strains carrying V73 are clustered in the same branch, suggesting that a single event caused the polymorphism (Fig. 3).