In an effort to improve precision medicine in Chinese populations, Novogene announced plans to build a database of structural variants in 1,000 Chinese individuals using PacBio SMRT Sequencing. Databases that catalog SNVs and small indels have proven invaluable for precision medicine, serving as population controls for rare disease research and providing a list of variants for genetic association studies. Yet most of the base pairs that differ between two human genomes lie in structural variants, which are not adequately represented in current databases. Furthermore, current databases do not represent the genetic background of all ethnic populations, particularly the Chinese, who comprise one-fifth of the world’s population.
Novogene will perform the sequencing using a fleet of up to 10 PacBio Sequel Systems, which can produce reads with an average length of 10,000 to 18,000 bp. Long reads are better able to map to repetitive regions of the genome and to fully span large variants. Previous studies have shown that long-read sequencing has five times higher sensitivity for discovery of structural variants than short-read sequencing approaches (Huddleston et al., 2016). Structural variants are already known to cause many human diseases, including Carney complex, Potocki-Lupski syndrome, ALS, and Smith-Magenis syndrome. Thus, complete measurement of structural variation is required for precision medicine.
Novogene’s structural variant database will include a variety of disease types across the population cohort. In a statement, Novogene Founder and CEO Ruiqiang Li said, “This more revealing and informative database should greatly improve our understanding of disease mechanisms and contribute to the development of novel diagnostic and therapeutic approaches.”
Reference:
- Huddleston, J. et al. (2016) Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Research (5), 677-685.