April 21, 2020

SyRI: identification of syntenic and rearranged regions from whole-genome assemblies

We present SyRI, an efficient tool for genome-wide identification of structural rearrangements (SR) from genome graphs, which are built up from pair-wise whole-genome alignments. Instead of searching for differences, SyRI starts by finding all co-linear regions between the genomes. As all remaining regions are SRs by definition, they can be classified as inversions, translocations, or duplications based on their positions in convoluted networks of repetitive alignments. Finally, SyRI reports local variations like SNPs and indels within syntenic and rearranged regions. We show SyRI's broad applicability to multiple species and genetically validate the presence of ~100 translocations identified in Arabidopsis.
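
The idea of starting from co-linear regions rather than from differences can be illustrated with a toy sketch. This is not SyRI's algorithm or interface. It assumes hypothetical alignment blocks given as (reference start, query start) pairs on a single chromosome, all in the same orientation, and selects the longest chain of blocks whose positions increase in both genomes. Blocks left outside that chain are the candidates for rearrangement.

```python
from bisect import bisect_left


def longest_collinear_chain(blocks):
    """Return the longest chain of (ref_start, query_start) blocks that is
    strictly increasing in both coordinates (a toy stand-in for synteny)."""
    # Sort by reference start; ties are ordered by descending query start so
    # that two blocks sharing a reference start can never be chained together.
    ordered = sorted(blocks, key=lambda b: (b[0], -b[1]))
    tails = []      # tails[k]: smallest query start ending a chain of length k + 1
    tails_idx = []  # index in `ordered` of the block stored in tails[k]
    prev = [None] * len(ordered)  # back-pointers used to rebuild the chain

    for i, (_, query_start) in enumerate(ordered):
        k = bisect_left(tails, query_start)
        if k == len(tails):
            tails.append(query_start)
            tails_idx.append(i)
        else:
            tails[k] = query_start
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else None

    # Walk the back-pointers from the end of the longest chain.
    chain = []
    i = tails_idx[-1] if tails_idx else None
    while i is not None:
        chain.append(ordered[i])
        i = prev[i]
    return chain[::-1]


# Hypothetical blocks: the second one maps far out of order in the query.
blocks = [(0, 0), (100, 500), (200, 100), (300, 200), (400, 300)]
syntenic = longest_collinear_chain(blocks)
rearranged = [b for b in blocks if b not in syntenic]
```

In this made-up input, four blocks form the co-linear path and the block at reference position 100 is left over as a candidate rearrangement. Classifying such leftovers as inversions, translocations, or duplications is outside the scope of the sketch.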

April 21, 2020

Draft Genome Assembly and Annotation of Red Raspberry Rubus idaeus

The red raspberry, Rubus idaeus, is widely distributed in all temperate regions of Europe, Asia, and North America and is a major commercial fruit valued for its taste and its high antioxidant and vitamin content. However, Rubus breeding is a long and slow process hampered by limited genomic and molecular resources. Genomic resources such as a complete genome sequence and transcriptome will be of exceptional value to improve research and breeding of this high-value crop. Using a hybrid sequence assembly approach including data from both long and short sequence reads, we present the first assembly of the Rubus idaeus genome (Joan J. variety). The de novo assembled genome consists of 2,145 scaffolds with a genome completeness of 95.3% and an N50 score of 638 kb. Leveraging a linkage map, we anchored 80.1% of the genome onto seven chromosomes. Using over 1 billion paired-end RNAseq reads, we annotated 35,566 protein coding genes with a transcriptome completeness score of 97.2%. The Rubus idaeus genome provides an important new resource for researchers and breeders.
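
For readers unfamiliar with the N50 statistic quoted in assembly abstracts such as this one, the following is a minimal sketch of the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name and the example lengths are illustrative assumptions and are not taken from the raspberry assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    total = sum(lengths)
    running = 0
    # Accumulate lengths from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no sequence lengths given")


# Made-up scaffold lengths (in kb) for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

Here the three longest scaffolds (10, 9, and 8) sum to 27, which is half of the total of 54, so the N50 is 8.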

April 21, 2020

The Ptr1 locus of Solanum lycopersicoides confers resistance to race 1 strains of Pseudomonas syringae pv. tomato and to Ralstonia pseudosolanacearum by recognizing the type III effectors AvrRpt2/RipBN.

Race 1 strains of Pseudomonas syringae pv. tomato, which cause bacterial speck disease of tomato, are becoming increasingly common and no simply-inherited genetic resistance to such strains is known. We discovered that a locus in Solanum lycopersicoides, termed Pseudomonas tomato race 1 (Ptr1), confers resistance to race 1 Pst strains by detecting the activity of type III effector AvrRpt2. In Arabidopsis, AvrRpt2 degrades the RIN4 protein thereby activating RPS2-mediated immunity. Using site-directed mutagenesis of AvrRpt2 we found that, like RPS2, activation of Ptr1 requires AvrRpt2 proteolytic activity. Ptr1 also detected the activity of AvrRpt2 homologs from diverse bacteria including one in Ralstonia pseudosolanacearum. The genome sequence of S. lycopersicoides revealed no RPS2 homolog in the Ptr1 region. Ptr1 could play an important role in controlling bacterial speck disease and its future cloning may shed light on an example of convergent evolution for recognition of a widespread type III effector.

April 21, 2020

Virus-host coexistence in phytoplankton through the genomic lens

Phytoplankton-virus interactions are major determinants of geochemical cycles in the oceans. Viruses are responsible for the redirection of carbon and nutrients away from larger organisms back towards microorganisms via the lysis of microalgae in a process coined the “viral shunt”. Virus-host interactions are generally expected to follow “boom and bust” dynamics, whereby a numerically dominant strain is lysed and replaced by a virus-resistant strain. Here, we isolated a microalga, Ostreococcus mediterraneus, and its infective nucleo-cytoplasmic large DNA virus (NCLDV) concomitantly from the environment in the surface NW Mediterranean Sea, and show continuous growth in culture of both the microalga and the virus. Evolution experiments through single cell bottlenecks demonstrate that, in the absence of the virus, susceptible cells evolve from one ancestral resistant single cell, and vice versa; that is, resistant cells evolve from one ancestral susceptible cell. This provides evidence that the observed sustained viral production is the consequence of a minority of virus-susceptible cells. The emergence of these cells is explained by low-level phase switching between virus-resistant and virus-susceptible phenotypes, akin to a bet hedging strategy. Whole genome sequencing and analysis of the ~14 Mb microalga and the ~200 kb virus points towards ancient speciation of the microalga within the Ostreococcus species complex and frequent gene exchanges between prasinoviruses infecting Ostreococcus species. Re-sequencing of one susceptible strain demonstrated that the phase switch involved a large 60 kb deletion of one chromosome. This chromosome is an outlier chromosome compared to the streamlined, gene dense, GC-rich standard chromosomes, as it contains many repeats and few orthologous genes. While this chromosome has been described in three different genera, its size increments have been previously associated with antiviral immunity and resistance in another species from the same genus. Mathematical modelling of this mechanism predicts microalga-virus population dynamics consistent with the observation of continuous growth of both virus and microalga. Altogether, our results suggest a previously overlooked strategy in phytoplankton-virus interactions.

April 21, 2020

Complete genome sequence of Bacillus velezensis JT3-1, a microbial germicide isolated from yak feces

Bacillus velezensis JT3-1 is a probiotic strain isolated from feces of the domestic yak (Bos grunniens) in the Gansu province of China. It has strong antagonistic activity against Listeria monocytogenes, Staphylococcus aureus, Escherichia coli, Salmonella Typhimurium, Mannheimia haemolytica, Staphylococcus hominis, Clostridium perfringens, and Mycoplasma bovis. These properties have made the JT3-1 strain the focus of commercial interest. In this study, we describe the complete genome sequence of JT3-1, with a genome size of 3,929,799 bp, 3,761 encoded genes and an average GC content of 46.50%. Whole genome sequencing of Bacillus velezensis JT3-1 will lay a good foundation for elucidation of the mechanisms of its antimicrobial activity, and for its future application.
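
As a brief illustration of how a summary figure such as average GC content is typically derived from a sequence, the sketch below counts G and C bases as a percentage of all unambiguous bases. The function name and the short test sequence are hypothetical, and ignoring ambiguous bases such as N is an assumption of this sketch rather than a detail reported for JT3-1.

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Hypothetical toy sequence: two of the four unambiguous bases are G or C.
toy_gc = gc_content("ATGCNN")
```

For a whole genome, the same calculation would be applied to the concatenated replicon sequences, for example after reading them from a FASTA file.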

April 21, 2020

A novel blaSIM-1-carrying megaplasmid pSIM-1-BJ01 isolated from clinical Klebsiella pneumoniae

A rare carbapenem-resistance gene, blaSIM-1, was found in a 316-kb megaplasmid designated pSIM-1-BJ01, isolated from the clinical strain Klebsiella pneumoniae 13624. The plasmid pSIM-1-BJ01 was fully sequenced and analyzed. Its length is 316,557 bp and it has 342 putative open reading frames, with two multidrug-resistance regions and a total of 19 resistance genes. Its backbone was highly homologous to the newly reported plasmid pRJA166a, which was isolated from a clinical third-generation cephalosporin-resistant hypervirulent strain, K. pneumoniae ST23. The plasmid pSIM-1-BJ01 was verified to be able to transfer to Escherichia coli. The emergence of the transferable blaSIM-1-carrying multidrug-resistance plasmid pSIM-1-BJ01 suggests that the spread of blaSIM among Enterobacteriaceae is possible. Therefore, the data presented herein provide insights into the genomic diversity and evolution of blaSIM-carrying plasmids, as well as the dissemination and epidemiology of blaSIM among Enterobacteriaceae in the public health system.

April 21, 2020

Centromere-mediated chromosome break drives karyotype evolution in closely related Malassezia species

Intra-chromosomal or inter-chromosomal genomic rearrangements often lead to speciation. Loss or gain of a centromere leads to alterations in chromosome number in closely related species. Thus, centromeres can enable tracing the path of evolution from the ancestral to a derived state. The Malassezia species complex of the phylum Basidiomycota shows remarkable diversity in chromosome number ranging between six and nine chromosomes. To understand these transitions, we experimentally identified all eight centromeres as binding sites of an evolutionarily conserved outer kinetochore protein Mis12/Mtw1 in M. sympodialis. The 3 to 5 kb centromere regions share an AT-rich, poorly transcribed core region enriched with a 12 bp consensus motif. We also mapped nine such AT-rich centromeres in M. globosa and the related species Malassezia restricta and Malassezia slooffiae. While eight predicted centromeres were found within conserved synteny blocks between these species and M. sympodialis, the remaining centromere in M. globosa (MgCEN2) or its orthologous centromere in M. slooffiae (MslCEN4) and M. restricta (MreCEN8) mapped to a synteny breakpoint compared with M. sympodialis. Taken together, we provide evidence that breakage and loss of a centromere (CEN2) in an ancestral Malassezia species possessing nine chromosomes resulted in fewer chromosomes in M. sympodialis. Strikingly, the predicted centromeres of all closely related Malassezia species map to an AT-rich core on each chromosome that also shows enrichment of the 12 bp sequence motif. We propose that centromeres are fragile AT-rich sites driving karyotype diversity through breakage and inactivation in these and other species.

April 21, 2020

Complete genome screening of clinical MRSA isolates identifies lineage diversity and provides full resolution of transmission and outbreak events

Whole-genome sequencing (WGS) of Staphylococcus aureus is increasingly used as part of infection prevention practices, but most applications are focused on conserved core genomic regions due to limitations of short-read technologies. In this study we established a long-read technology-based WGS screening program of all first-episode MRSA blood infections at a major urban hospital. A survey of 132 MRSA genomes assembled from long reads revealed widespread gain/loss of accessory mobile genetic elements among established hospital- and community-associated lineages impacting >10% of each genome, and frequent megabase-scale inversions between endogenous prophages. We also characterized an outbreak of a CC5/ST105/USA100 clone among 3 adults and 18 infants in a neonatal intensive care unit (NICU) lasting 7 months. The pattern of changes among complete outbreak genomes provided full spatiotemporal resolution of its origins and progression, which was characterized by multiple sub-transmissions and likely precipitated by equipment sharing. Compared to other hospital strains, the outbreak strain carried distinct mutations and accessory genetic elements that impacted genes with roles in metabolism, resistance and persistence. This included a DNA-recognition domain recombination in the hsdS gene of a Type-I restriction-modification system that altered DNA methylation. RNA-Seq profiling showed that the (epi)genetic changes in the outbreak clone attenuated agr gene expression and upregulated genes involved in stress response and biofilm formation. Overall our findings demonstrate that long-read sequencing substantially improves our ability to characterize accessory genomic elements that impact MRSA virulence and persistence, and provides valuable information for infection control efforts.

April 21, 2020

Genome rearrangements induce biofilm formation in Escherichia coli C, an old model organism with a new application in biofilm research

Escherichia coli C forms more robust biofilms than the other laboratory strains. Biofilm formation and cell aggregation under a high shear force depend on temperature and salt concentrations. It is the last of five E. coli strains (C, K12, B, W, Crooks) designated as safe for laboratory purposes whose genome has not been sequenced. Here we present the complete genomic sequence of this strain, in which we utilized both long-read PacBio-based sequencing and high resolution optical mapping to confirm a large inversion in comparison to the other laboratory strains. Notably, DNA sequence comparison revealed the absence of several genes thought to be involved in biofilm formation, including antigen 43, waaSBOJYZUL for LPS synthesis, and cpsB for curli synthesis. The first main difference we identified that likely affects biofilm formation is the presence of an IS3-like insertion sequence in front of the carbon storage regulator csrA gene. This insertion is located 86 bp upstream of the csrA start codon inside the -35 region of the P4 promoter and blocks transcription from the sigma32 and sigma70 promoters P1-P3 located further upstream. The second is the presence of an IS5/IS1182 in front of the csgD gene, which may drive its overexpression in biofilm. Finally, E. coli C encodes an additional sigma70 subunit overexpressed in biofilm and driven by the same IS3-like insertion sequence. Promoter analyses using GFP gene fusions and total expression profiles using RNA-seq analyses comparing planktonic and biofilm envirovars provided insights into understanding this regulatory pathway in E. coli.

April 21, 2020

The genomic architecture of introgression among sibling species of bacteria

Gene transfer between bacterial species is an important mechanism for adaptation. For example, sets of genes that confer the ability to form nitrogen-fixing root nodules on host plants have frequently moved between Rhizobium species. It is not clear, though, whether such transfer is exceptional, or if frequent inter-species introgression is typical. To address this, we sequenced the genomes of 196 isolates of the Rhizobium leguminosarum species complex obtained from root nodules of white clover (Trifolium repens). Core gene phylogeny placed the isolates into five distinct genospecies that show high intra-genospecies recombination rates and remarkably different demographic histories. Most gene phylogenies were largely concordant with the genospecies, indicating that recent gene transfer between genospecies was rare. In contrast, very similar symbiosis gene sequences were found in two or more genospecies, suggesting recent horizontal transfer. The replication and conjugative transfer genes of the plasmids carrying the symbiosis genes showed a similar pattern, implying that introgression occurred by conjugative plasmid transfer. The only other regions that showed strong phylogenetic discordance with the genospecies classification were two small chromosomal clusters, one neighbouring a conjugative transfer system. Phage-related sequences were observed in the genomes, but appeared to have very limited impact on introgression. Introgression among these closely-related species has been very limited, confined to the symbiosis plasmids and a few chromosomal islands. Both introgress through conjugative transfer, but have been subject to different types of selective forces.

April 21, 2020

Chromosome-level assembly of the common lizard (Zootoca vivipara) genome

Squamate reptiles exhibit high variation in their traits and geographical distribution and are therefore fascinating taxa for evolutionary and ecological research. However, high-quality genomic resources are very limited for this group of species, which inhibits some research efforts. To address this gap, we assembled a high-quality genome of the common lizard Zootoca vivipara (Lacertidae) using a combination of high coverage Illumina (shotgun and mate-pair) and PacBio sequence data, with RNAseq data and genetic linkage maps. The 1.46 Gbp genome assembly has a scaffold N50 of 11.52 Mbp with an N50 contig size of 220.4 Kbp and only 2.96% gaps. A BUSCO analysis indicates that 97.7% of the single-copy Tetrapoda orthologs were recovered in the assembly. In total 19,829 gene models were annotated in the genome using a combination of three ab initio and homology-based methods. To improve the chromosome-level assembly, we generated a high-density linkage map from wild-caught families and developed a novel analytical pipeline to accommodate multiple paternity and unknown father genotypes. We successfully anchored and oriented almost 90% of the genome on 19 linkage groups. This annotated and oriented chromosome-level reference genome represents a valuable resource to facilitate evolutionary studies in squamate reptiles.

April 21, 2020

Investigating the role of exudates in recruiting Streptomyces bacteria to the Arabidopsis thaliana root microbiome

Arabidopsis thaliana has a diverse but consistent root microbiome, recruited in part by the release of fixed carbon in root exudates. Here we focussed on the recruitment of Streptomyces bacteria, which are well established plant-growth-promoting rhizobacteria and which have been proposed to be recruited to A. thaliana roots by the release of salicylic acid. We generated high quality genome sequences for eight Streptomyces endophyte strains and showed that although some strains do enhance plant growth, they are not attracted to, and do not feed on, salicylic acid. We used 13CO2 DNA-stable isotope probing to determine which bacteria are fed by the plants in the rhizo- and endosphere and found that streptomycetes did not feed on root exudates in vivo, despite the fact that they can use exudate as sole carbon and nitrogen sources in vitro. We confirmed increased root colonisation by streptomycetes in plants that constitutively produce salicylic acid, but these plants exhibited a pleiotropic phenotype of early senescence and weak growth. We propose that streptomycetes are attracted to the rhizosphere by root exudates but can be outcompeted for this food source by more abundant proteobacteria and most likely feed off unlabelled complex organic matter.

April 21, 2020

A chromosome-level genome of black rockfish, Sebastes schlegelii, provides insights into the evolution of live birth.

Black rockfish (Sebastes schlegelii) is a teleost species where eggs are fertilized internally and retained in the maternal reproductive system, where they undergo development until live birth (termed viviparity). In the present study, we report a chromosome-level black rockfish genome assembly. High-throughput transcriptome analysis (RNA-seq and ATAC-seq), coupled with in situ hybridization (ISH) and immunofluorescence, identify several candidate genes for maternal preparation, sperm storage and release, and hatching. We propose that zona pellucida (ZP) proteins retain sperm at the oocyte envelope, while genes in two distinct astacin metalloproteinase subfamilies serve to release sperm from the ZP and free the embryo from chorion at pre-hatching stage. Finally, we present a model of black rockfish reproduction, and propose that the rockfish ovarian wall has a similar function to the uterus of mammals. Taken together, these genomic data reveal unprecedented insights into the evolution of an unusual teleost life history strategy, and provide a sound foundation for studying viviparity in non-mammalian vertebrates and an invaluable resource for rockfish ecological and evolutionary research.

April 21, 2020

Haplotype-phased genome assembly of virulent Phytophthora ramorum isolate ND886 facilitated by long-read sequencing reveals effector polymorphisms and copy number variation.

Phytophthora ramorum is a destructive pathogen that causes Sudden Oak Death. The genome sequence of P. ramorum isolate Pr102 was previously produced using Sanger reads, and contained 12 Mb of gaps. However, isolate Pr102 had shown reduced aggressiveness and genome abnormalities. In order to produce an improved genome assembly for P. ramorum, we performed long-read sequencing of the highly aggressive P. ramorum isolate CDFA1418886 (abbreviated as ND886). We generated a 60.5 Mb assembly of the ND886 genome using the Pacific Biosciences sequencing platform. The assembly includes 302 primary contigs (60.2 Mb) and 9 unplaced contigs (265 Kb). Additionally, we found a “Highly repetitive” component from the PacBio unassembled unmapped reads containing tandem repeats that are not part of the 60.5 Mb genome. The overall repeat content in the primary assembly was much higher than in the Pr102 Sanger version (48% vs. 29%), indicating that the long reads have captured repetitive regions effectively. The 302 primary contigs were phased into 345 haplotype blocks and 222,892 phased variants, of which the longest phased block was 1,513,201 bp with 7,265 phased variants. The improved phased assembly facilitated identification of 21 and 25 Crinkler effectors and 393 and 394 RXLR effector genes from two haplotypes. Of these, 24 and 25 RXLR effectors were newly predicted from Haplotype A and Haplotype B, respectively. In addition, 7 new paralogs of effector Avh207 were found in contig 54, not reported earlier. Comparison of the ND886 assembly with the Pr102 V1 assembly suggests that several repeat-rich smaller scaffolds within the Pr102 V1 assembly were possibly misassembled; these regions are fully encompassed now in ND886 contigs. Our analysis further reveals that Pr102 is a heterokaryon with multiple nuclear types in the sequences corresponding to contig 10 of the ND886 assembly.

April 21, 2020

The radish genome database (RadishGD): an integrated information resource for radish genomics.

Radish (Raphanus sativus L.) is an important root vegetable crop in the family Brassicaceae, which provides diverse nutrients for human health and is closely related to the Brassica crop species. Recently, we sequenced and assembled the radish genome into nine chromosome pseudomolecules. In addition, we developed diverse genomic resources, including genetic maps, molecular markers, transcriptome, genome-wide methylation and variome data. In this study, we describe the radish genome database (RadishGD), including details of data sets that we generated and the web interface that allows access to these data. RadishGD comprises six major units that enable researchers and general users to search, browse and analyze the radish genomic data in an integrated manner. The Search unit provides gene structures and sequences for gene models through keyword or BLAST searches. The Genome browser displays graphic representations of gene models, mRNAs, repetitive sequences, genome-wide methylation and variomes among various genotypes. The Functional annotation unit offers gene ontology, plant ontology, pathway and gene family information for gene models. The Genetic map unit provides information about markers and their genetic locations using two types of genetic maps. The Expression unit presents transcriptional characteristics and methylation levels for each gene in 18 tissues. All sequence data incorporated into RadishGD can be downloaded from the Data resources unit. RadishGD will be continually updated to serve as a community resource for radish genomics and breeding research.

