September 22, 2019

A protein-truncating HSD17B13 variant and protection from chronic liver disease.

Elucidation of the genetic factors underlying chronic liver disease may reveal new therapeutic targets. We used exome sequence data and electronic health records from 46,544 participants in the DiscovEHR human genetics study to identify genetic variants associated with serum levels of alanine aminotransferase (ALT) and aspartate aminotransferase (AST). Variants that were replicated in three additional cohorts (12,527 persons) were evaluated for association with clinical diagnoses of chronic liver disease in DiscovEHR study participants and two independent cohorts (total of 37,173 persons) and with histopathological severity of liver disease in 2391 human liver samples. A splice variant (rs72613567:TA) in HSD17B13, encoding the hepatic lipid droplet protein hydroxysteroid 17-beta dehydrogenase 13, was associated with reduced levels of ALT (P=4.2×10^-12) and AST (P=6.2×10^-10). Among DiscovEHR study participants, this variant was associated with a reduced risk of alcoholic liver disease (by 42% [95% confidence interval (CI), 20 to 58] among heterozygotes and by 53% [95% CI, 3 to 77] among homozygotes), nonalcoholic liver disease (by 17% [95% CI, 8 to 25] among heterozygotes and by 30% [95% CI, 13 to 43] among homozygotes), alcoholic cirrhosis (by 42% [95% CI, 14 to 61] among heterozygotes and by 73% [95% CI, 15 to 91] among homozygotes), and nonalcoholic cirrhosis (by 26% [95% CI, 7 to 40] among heterozygotes and by 49% [95% CI, 15 to 69] among homozygotes). Associations were confirmed in two independent cohorts. The rs72613567:TA variant was associated with a reduced risk of nonalcoholic steatohepatitis, but not steatosis, in human liver samples. The rs72613567:TA variant mitigated liver injury associated with the risk-increasing PNPLA3 p.I148M allele and resulted in an unstable and truncated protein with reduced enzymatic activity. A loss-of-function variant in HSD17B13 was associated with a reduced risk of chronic liver disease and of progression from steatosis to steatohepatitis. (Funded by Regeneron Pharmaceuticals and others.)


September 22, 2019

Comparative Annotation Toolkit (CAT): simultaneous clade and personal genome annotation.

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants, even in genomes as well studied as rat and the great apes, and how these annotations improve cross-species RNA expression experiments. © 2018 Fiddes et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Iso-Seq analysis of Nepenthes ampullaria, Nepenthes rafflesiana and Nepenthes × hookeriana for hybridisation study in pitcher plants.

Tropical pitcher plants in the species-rich Nepenthaceae family of carnivorous plants possess unique pitcher organs. Hybridisation, natural or artificial, is extensive in this family, resulting in pitchers with diverse features. The pitcher functions as a passive insect trap with digestive fluid for nutrient acquisition in nitrogen-poor habitats. This organ shows specialisation according to the dietary habit of different Nepenthes species. In this study, we performed the first single-molecule real-time isoform sequencing (Iso-Seq) analysis of full-length cDNA from Nepenthes ampullaria, which can feed on leaf litter, compared to carnivorous Nepenthes rafflesiana and their carnivorous hybrid Nepenthes × hookeriana. This allows the comparison of pitcher transcriptomes from the parents and the hybrid to understand how hybridisation could shape the evolution of dietary habit in Nepenthes. Raw reads have been deposited in the SRA database with the accession numbers SRX2692198 (N. ampullaria), SRX2692197 (N. rafflesiana), and SRX2692196 (N. × hookeriana).


September 22, 2019

Comprehensive profiling of rhizome-associated alternative splicing and alternative polyadenylation in moso bamboo (Phyllostachys edulis).

Moso bamboo (Phyllostachys edulis) represents one of the fastest-spreading plants in the world, due in part to its well-developed rhizome system. However, the post-transcriptional mechanism for the development of the rhizome system in bamboo has not been comprehensively studied. We therefore used a combination of single-molecule long-read sequencing technology and polyadenylation site sequencing (PAS-seq) to re-annotate the bamboo genome, and identify genome-wide alternative splicing (AS) and alternative polyadenylation (APA) in the rhizome system. In total, 145 522 mapped full-length non-chimeric (FLNC) reads were analyzed, resulting in the correction of 2241 mis-annotated genes and the identification of 8091 previously unannotated loci. Notably, more than 42 280 distinct splicing isoforms were derived from 128 667 intron-containing FLNC reads, including a large number of AS events associated with rhizome systems. In addition, we characterized 25 069 polyadenylation sites from 11 450 genes, 6311 of which have APA sites. Further analysis of intronic polyadenylation revealed that LTR/Gypsy and LTR/Copia were two major transposable elements within the intronic polyadenylation region. Furthermore, this study provided a quantitative atlas of poly(A) usage. Several hundred differential poly(A) sites in the rhizome-root system were identified. Taken together, these results suggest that post-transcriptional regulation may have a vital role in the underground rhizome-root system. © 2017 The Authors. The Plant Journal © 2017 John Wiley & Sons Ltd.


September 22, 2019

Full-length transcriptome sequencing and modular organization analysis of naringin/neoeriocitrin related gene expression pattern in Drynaria roosii.

Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as ‘GuSuiBu’. The effective components, naringin and neoeriocitrin, share a highly similar chemical structure and medicinal function. Our HPLC-tandem mass spectrometry (MS/MS) results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of naringin/neoeriocitrin-related genes involved in their regulatory pathways. Due to a lack of basic genetic information, we applied a combination of single molecule real-time (SMRT) sequencing and second-generation sequencing (SGS) to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, the differentially expressed gene (DEG)-based heat map analysis revealed that naringin/neoeriocitrin-related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as modules and naringin/neoeriocitrin contents. We found that naringin/neoeriocitrin-related DEGs were distributed in nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance linked to specific tissues or ages. Moreover, weighted gene co-expression network analysis (WGCNA) results further identified that PAL, 4CL and C4H, and C3H and HCT acted as the major hub genes involved in naringin and neoeriocitrin synthesis, respectively, and exhibited high co-expression with MYB- and basic helix-loop-helix (bHLH)-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue and time specificity of the gene expression pattern, as well as hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. Simultaneously, the comprehensive transcriptome data set provided important genetic information for further research on D. roosii.


September 22, 2019

Normalized long read RNA sequencing in chicken reveals transcriptome complexity similar to human.

Despite the significance of chicken as a model organism, our understanding of the chicken transcriptome is limited compared to human. This issue is common to all non-human vertebrate annotations due to the difficulty in transcript identification from short read RNAseq data. While previous studies have used single molecule long read sequencing for transcript discovery, they did not perform RNA normalization and 5′-cap selection, which may have resulted in lower transcriptome coverage and truncated transcript sequences. We sequenced normalised chicken brain and embryo RNA libraries with Pacific Biosciences Iso-Seq. 5′ cap selection was performed on the embryo library to provide methodological comparison. From these Iso-Seq sequencing projects, we have identified 60 k transcripts and 29 k genes within the chicken transcriptome. Of these, more than 20 k are novel lncRNA transcripts with ~3 k classified as sense exonic overlapping lncRNA, which is a class that is underrepresented in many vertebrate annotations. The relative proportion of alternative transcription events revealed striking similarities between the chicken and human transcriptomes while also providing explanations for previously observed genomic differences. Our results indicate that the chicken transcriptome is similar in complexity to that of human, and provide insights into other vertebrate biology. Our methodology demonstrates the potential of Iso-Seq sequencing to rapidly expand our knowledge of transcriptomics.


September 22, 2019

Somatic mosaicism of an intragenic FANCB duplication in both fibroblast and peripheral blood cells observed in a Fanconi anemia patient leads to milder phenotype.

Fanconi anemia (FA) is a rare disorder characterized by congenital malformations, progressive bone marrow failure, and predisposition to cancer. Patients harboring X-linked FANCB pathogenic variants usually present with severe congenital malformations resembling VACTERL syndrome with hydrocephalus. We employed the diepoxybutane (DEB) test for FA diagnosis, arrayCGH for detection of duplication, targeted capture and next-gen sequencing for defining the duplication breakpoint, PacBio sequencing of full-length FANCB aberrant transcript, FANCD2 ubiquitination and foci formation assays for the evaluation of FANCB protein function by viral transduction of FANCB-null cells with lentiviral FANCB WT and mutant expression constructs, and droplet digital PCR for quantitation of the duplication in the genomic DNA and cDNA. We describe here an FA-B patient with a mild phenotype. The DEB diagnostic test for FA revealed somatic mosaicism. We identified a 9154 bp intragenic duplication in FANCB, covering the first coding exon 3 and the flanking regions. A four bp homology (GTAG) present at both ends of the breakpoint is consistent with a microhomology-mediated duplication mechanism. The duplicated allele gives rise to an aberrant transcript containing exon 3 duplication, predicted to introduce a stop codon in FANCB protein (p.A319*). Duplication levels in the peripheral blood DNA declined from 93% to 7.9% in the span of eleven years. Moreover, the patient fibroblasts have shown 8% of wild-type (WT) allele and his carrier mother showed higher than expected levels of WT allele (79% vs. 50%) in peripheral blood, suggesting that the duplication was highly unstable. Unlike sequence point variants, intragenic duplications are difficult to precisely define and accurately quantify, and may be very unstable, which complicates proper diagnosis. The reversion of genomic duplication to the WT allele results in somatic mosaicism and may explain the relatively milder phenotype displayed by the FA-B patient described here. © 2017 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.


September 22, 2019

Use of a draft genome of coffee (Coffea arabica) to identify SNPs associated with caffeine content.

Arabica coffee (Coffea arabica) has a small gene pool limiting genetic improvement. Selection for caffeine content within this gene pool would be assisted by identification of the genes controlling this important trait. Sequencing of DNA bulks from 18 genotypes with extreme high- or low-caffeine content from a population of 232 genotypes was used to identify linked polymorphisms. To obtain a reference genome, a whole genome assembly of arabica coffee (variety K7) was achieved by sequencing using short-read (Illumina) and long-read (PacBio) technology. Assembly was performed using a range of assembly tools resulting in 76 409 scaffolds with a scaffold N50 of 54 544 bp and a total scaffold length of 1448 Mb. Validation of the genome assembly using different tools showed high completeness of the genome. More than 99% of transcriptome sequences mapped to the C. arabica draft genome, and 89% of BUSCOs were present. The assembled genome annotated using AUGUSTUS yielded 99 829 gene models. Using the draft arabica genome as reference in mapping and variant calling allowed the detection of 1444 nonsynonymous single nucleotide polymorphisms (SNPs) associated with caffeine content. Based on Kyoto Encyclopaedia of Genes and Genomes pathway-based analysis, 65 caffeine-associated SNPs were discovered, among which 11 SNPs were associated with genes encoding enzymes involved in the conversion of substrates, which participate in the caffeine biosynthesis pathways. This analysis demonstrated the complex genetic control of this key trait in coffee. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
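
For readers unfamiliar with the scaffold N50 statistic quoted in the abstract above, the following is a minimal sketch of how it is conventionally computed: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and chosen for illustration only; they are not the study's data or the tool the authors used.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from the study.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80: 100 + 80 = 180 reaches half of 300
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about correctness, which is why the abstract also reports transcript mapping rates and BUSCO completeness.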


September 22, 2019

Species groups distributed across elevational gradients reveal convergent and continuous genetic adaptation to high elevations.

Although many cases of genetic adaptations to high elevations have been reported, the processes driving these modifications and the pace of their evolution remain unclear. Many high-elevation adaptations (HEAs) are thought to have arisen in situ as populations rose with growing mountains. In contrast, most high-elevation lineages of the Qinghai-Tibetan Plateau appear to have colonized from low-elevation areas. These lineages provide an opportunity for studying recent HEAs and comparing them with ancestral low-elevation alternatives. Herein, we compare four frogs (three species of Nanorana and a close lowland relative) and four lizards (Phrynocephalus) that inhabit a range of elevations on or along the slopes of the Qinghai-Tibetan Plateau. The sequential cladogenesis of these species across an elevational gradient allows us to examine the gradual accumulation of HEA at increasing elevations. Many adaptations to high elevations appear to arise gradually and evolve continuously with increasing elevational distributions. Numerous related functions, especially DNA repair and energy metabolism pathways, exhibit rapid change and continuous positive selection with increasing elevations. Although the two studied genera are distantly related, they exhibit numerous convergent evolutionary changes, especially at the functional level. This functional convergence appears to be more extensive than convergence at the individual gene level, although we found 32 homologous genes undergoing positive selection for change in both high-elevation groups. We argue that species groups distributed along a broad elevational gradient provide a more powerful system for testing adaptations to high-elevation environments compared with studies that compare only pairs of high-elevation versus low-elevation species.


September 22, 2019

Avian transcriptomics: opportunities and challenges

Recent developments in next-generation sequencing technologies have greatly facilitated the study of whole transcriptomes in model and non-model species. Studying the transcriptome and how it changes across a variety of biological conditions has had major implications for our understanding of how the genome is regulated in different contexts, and how to interpret adaptations and the phenotype of an organism. The aim of this review is to highlight the potential of these new technologies for the study of avian transcriptomics, and to summarise how transcriptomics has been applied in ornithology. A total of 81 peer-reviewed scientific articles that used transcriptomics to answer questions within a broad range of study areas in birds are used as examples throughout the review. We further provide a quick guide to highlight the most important points that need to be taken into account when planning a transcriptomic study in birds, and discuss how researchers with little background in molecular biology can avoid potential pitfalls. Suggestions for further reading are supplied throughout. We also discuss possible future developments in the technology platforms used for ribonucleic acid sequencing. By summarising how these novel technologies can be used to answer questions that have long been asked by ornithologists, we hope to bridge the gap between traditional ornithology and genomics, and to stimulate more interdisciplinary research.


September 22, 2019

Multi-platform sequencing approach reveals a novel transcriptome profile in pseudorabies virus.

Third-generation sequencing is an emerging technology that is capable of solving several problems that earlier approaches were not able to, including the identification of transcript isoforms and overlapping transcripts. In this study, we used long-read sequencing for the analysis of pseudorabies virus (PRV) transcriptome, including Oxford Nanopore Technologies MinION, PacBio RS-II, and Illumina HiScanSQ platforms. We also used data from our previous short-read and long-read sequencing studies to compare the results and to confirm the obtained data. Our investigations identified 19 formerly unknown putative protein-coding genes, all of which are 5′ truncated forms of earlier annotated longer PRV genes. Additionally, we detected 19 non-coding RNAs, including 5′ and 3′ truncated transcripts without in-frame ORFs, antisense RNAs, as well as RNA molecules encoded by those parts of the viral genome where no transcription had been detected before. This study has also led to the identification of three complex transcripts and 50 distinct length isoforms, including transcription start and end variants. We also detected 121 novel transcript overlaps, and two transcripts that overlap the replication origins of PRV. Furthermore, in silico analysis revealed 145 upstream ORFs, many of which are located on the longer 5′ isoforms of the transcripts.


September 22, 2019

MCF-7 breast cancer cell line PacBio generated transcriptome has ~300 novel transcribed regions, un-annotated in both RefSeq and GENCODE, and absent in the liver, heart and brain transcriptomes

Illuminating the “dark” regions of the human genome remains an ongoing effort, a decade and a half after the human genome was sequenced – RefSeq and GENCODE being two of the major annotation databases. Pacific Biosciences (PacBio) has provided open access to the transcriptome of MCF-7, a breast cancer cell line that has provided significant therapeutic advancement in breast cancer research since the 1970s. PacBio sequencing generates much longer reads compared to second-generation sequencing technologies, with a trade-off of lower throughput, higher error rate and higher cost per base. Here, this transcriptome was analyzed using the YeATS pipeline, with additionally introduced kmer based algorithms, reducing computational times to a few hours on a simple workstation. Out of ~300 transcripts that have no match in either RefSeq or GENCODE, ~250 are absent in the transcriptomes of the heart, liver and brain, also provided by PacBio. Also, ~200 transcripts are absent in a recent catalogue of un-annotated long non-coding RNAs from 6,503 samples (~43 Terabases of sequence data) [1], and only two are shared with an experimental workflow, RACE-Seq, that reported 2,556 novel transcripts [2]. ~100 transcripts have >100 amino acid open reading frames, and have the potential of being protein coding genes. ORF based annotation also identified a few bacterial transcripts in the PacBio database mapped to the human genome, and one human transcript that has been annotated as bacterial in the NCBI database. The current work reiterates the under-utilization of transcriptomes for annotating genomes. It also provides new leads for investigating breast cancer by virtue of exclusively expressed transcripts not expressed in other tissues, which, subject to further investigation, are prospective breast cancer biomarkers.


September 22, 2019

Current progress in EBV-associated B-cell lymphomas.

Epstein-Barr virus (EBV) was the first human tumor virus, discovered more than 50 years ago. EBV-associated lymphomagenesis is still a significant viral-associated disease as it involves a diverse range of pathologies, especially B-cell lymphomas. Recent developments in high-throughput next-generation sequencing technologies and in vivo mouse models have significantly promoted our understanding of the fundamental molecular mechanisms which drive these cancers and allowed for the development of therapeutic intervention strategies. This review will highlight the current advances in EBV-associated B-cell lymphomas, focusing on transcriptional regulation, chromosome aberrations, in vivo studies of EBV-mediated lymphomagenesis, as well as the treatment strategies to target viral-associated lymphomas.


September 22, 2019

Full-length transcriptome sequences of ephemeral plant Arabidopsis pumila provides insight into gene expression dynamics during continuous salt stress.

Arabidopsis pumila is native to the desert region of northwest China and it is extraordinarily well adapted to the local semi-desert saline soil, thus providing a candidate plant system for environmental adaptation and salt-tolerance gene mining. However, understanding of the salt-adaptation mechanism of this species is limited because of the scarcity of genomic sequences. In the present study, the transcriptome profiles of A. pumila leaf tissues treated with 250 mM NaCl for 0, 0.5, 3, 6, 12, 24 and 48 h were analyzed using a combination of second-generation sequencing (SGS) and third-generation single-molecule real-time (SMRT) sequencing. Correction of SMRT long reads by SGS short reads resulted in 59,328 transcripts. We found 8075 differentially expressed genes (DEGs) between salt-stressed tissues and controls, of which 483 were transcription factors and 1157 were transport proteins. Most DEGs were activated within 6 h of salt stress and their expression stabilized after 48 h; the number of DEGs was greatest within 12 h of salt stress. Gene annotation and functional analyses revealed that expression of genes associated with the osmotic and ionic phases rapidly and coordinately changed during the continuous salt stress in this species, and salt stress-related categories were highly enriched among these DEGs, including oxidation-reduction, transmembrane transport, transcription factor activity and ion channel activity. Orphan, MYB, HB, bHLH, C3H, PHD, bZIP, ARF and NAC TFs were most enriched in DEGs; ABCB1, CLC-A, CPK30, KEA2, KUP9, NHX1, SOS1, VHA-A and VP1 TPs were extensively up-regulated in salt-stressed samples, suggesting that they play important roles in salt tolerance. Importantly, further experimental studies identified a mitogen-activated protein kinase (MAPK) gene MAPKKK18 as continuously up-regulated throughout salt stress, suggesting its crucial role in salt tolerance. The expression patterns of 24 salt-responsive genes obtained by quantitative real-time PCR were basically consistent with their transcript abundance changes identified by RNA-Seq. The full-length transcripts generated in this study provide a more accurate depiction of gene transcription of A. pumila. We identified potential genes involved in salt tolerance of A. pumila. These data present a genetic resource and facilitate better understanding of the salt-adaptation mechanism for ephemeral plants.


September 22, 2019

Comparative genome and transcriptome analysis reveals distinctive surface characteristics and unique physiological potentials of Pseudomonas aeruginosa ATCC 27853.

Pseudomonas aeruginosa ATCC 27853 was isolated from a hospital blood specimen in 1971 and has been widely used as a model strain to survey antibiotic susceptibilities, biofilm development, and metabolic activities of Pseudomonas spp. Although four draft genomes of P. aeruginosa ATCC 27853 have been sequenced, the complete genome of this strain is still lacking, hindering a comprehensive understanding of its physiology and functional genome. Here we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using the Pacific Biosciences SMRT (PacBio) technology and Illumina sequencing platform. We found that accessory genes of ATCC 27853 including prophages and genomic islands (GIs) mainly contribute to the difference between P. aeruginosa ATCC 27853 and other P. aeruginosa strains. Seven prophages were identified within the genome of P. aeruginosa ATCC 27853. Of the predicted 25 GIs, three contain genes that encode monooxygenases, dioxygenases and hydrolases that could be involved in the metabolism of aromatic compounds. Surveying virulence-related genes revealed that a series of genes that encode the B-band O-antigen of LPS are lacking in ATCC 27853. Distinctive SNPs in genes of cellular adhesion proteins such as type IV pili and flagella biosynthesis were also observed in this strain. Colony morphology analysis confirmed an enhanced biofilm formation capability of ATCC 27853 on solid agar surface compared to Pseudomonas aeruginosa PAO1. We then performed transcriptome analysis of ATCC 27853 and PAO1 using RNA-seq and compared the expression of orthologous genes to understand the functional genome and the genomic details underlying the distinctive colony morphogenesis. These analyses revealed an increased expression of genes involved in cellular adhesion and biofilm maturation such as type IV pili, exopolysaccharide and electron transport chain components in ATCC 27853 compared with PAO1. In addition, distinctive expression profiles of the virulence genes lecA, lasB, quorum sensing regulators LasI/R, and the type I, III and VI secretion systems were observed in the two strains. The complete genome sequence of P. aeruginosa ATCC 27853 reveals the comprehensive genetic background of the strain, and provides a genetic basis for several interesting findings about the functions of surface associated proteins, prophages, and genomic islands. Comparative transcriptome analysis of P. aeruginosa ATCC 27853 and PAO1 revealed several classes of differentially expressed genes in the two strains, underlying the genetic and molecular details of several known and yet to be explored morphological and physiological potentials of P. aeruginosa ATCC 27853.

