The new publicly available assembly (PacBio HG00733) has the fewest gaps of any human genome assembly, with more than half of the genome contained in gapless sequence at least 27 Mb long. The primary contig assembly is 2.89 Gb long and consists of 865 contigs that were assembled with PacBio data generated with the company’s Sequel® System. Using the FALCON-Unzip assembler, maternal and paternal haplotypes were resolved over more than 80% of the genome. Maternal and paternal haplotype blocks were then further phased using Hi-C technology and the FALCON-Phase method developed in collaboration with Phase Genomics. The genome was then de novo scaffolded using Phase Genomics’ Proximo Hi-C platform, resulting in the first chromosome-scale diploid assembly of a single individual accomplished with only two technologies. More specific details about the assembly are included on the PacBio blog.
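The contiguity claim above ("more than half of the genome contained in gapless sequence at least 27 Mb long") matches the statistic commonly called contig N50, although the release does not name it. A minimal sketch of how such a figure could be computed from a list of contig lengths follows; the function name `n50` and the toy lengths are illustrative assumptions, not taken from the HG00733 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downward until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, in bp): total is 28, so half is 14.
# The two longest contigs (10 + 8 = 18) are enough to cross that mark.
toy_contigs = [10, 8, 5, 3, 2]
toy_n50 = n50(toy_contigs)
```

Applied to the real assembly, the input would be the 865 primary contig lengths, for example as read from a FASTA index.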
“This level of human genome resolution was not possible until now and is uniquely enabled by PacBio sequencing technology,” said
The current version of the human reference genome assembly released by the
SMRT Sequencing has demonstrated that any individual diploid human genome contains more than 20,000 unique structural variants (defined as ≥50 bp in length) and another ~400,000 insertion or deletion variants (ranging in length from 1 bp to 49 bp). Importantly, more than 80% of these variants are not currently accessible using short-read whole genome sequencing methods due to coverage bias, ambiguity in read mapping and inability to span large variants. In contrast, sensitive detection of these larger variants in human genome studies has been widely demonstrated using PacBio long-read sequencing. More than 40 global initiatives are currently underway to apply these de novo assembly methods to individuals representing multiple ethnic populations, thereby extending the diversity of available human reference genomes.
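The size classes used above can be written as a simple rule: an insertion or deletion of 50 bp or more counts as a structural variant, and one of 1 bp to 49 bp counts as a small insertion or deletion. The sketch below only illustrates that threshold; the function name `classify_by_length` and the label strings are hypothetical and do not come from any PacBio tool.

```python
SV_MIN_LENGTH_BP = 50  # threshold stated in the text: structural variants are >= 50 bp


def classify_by_length(length_bp):
    """Classify an insertion or deletion by its length in base pairs."""
    if length_bp < 1:
        raise ValueError("variant length must be at least 1 bp")
    if length_bp >= SV_MIN_LENGTH_BP:
        return "structural_variant"
    return "indel"  # 1 bp to 49 bp


# Boundary cases around the 50 bp threshold.
labels = [classify_by_length(n) for n in (1, 49, 50, 12000)]
```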
“In order to enable precision medicine for all populations, it is crucial to achieve high-quality DNA sequencing and to better represent true ethnic diversity within genomic databases,” said
The data are available under the following NCBI accession IDs: BioProject PRJNA483067, assembly RBJD00000000 and sequence data SRP155659.
Additional Resources
- Interactive map showcasing global initiatives underway to generate reference-quality human genome assemblies for diverse populations
- BioReport Podcast on the value of ethnic-specific reference genomes
- Nature Reviews Genetics paper from NHGRI: Prioritizing diversity in human genomics research
- Article in The Journal of Precision Medicine: “Minority Report – Ethnic Diversity and the Real Promise for Precision Medicine”
- Article in Bio-IT World: “Genomic Data Standards Are a Necessity”
- NHGRI Project Award: High Quality Human and Non-Human Primate Genome Assemblies
More details are available on the PacBio website:
- Blog post: Data Release: Highest-Quality, Most Contiguous Individual Human Genome Assembly to Date
- Blog post: For Reference-Grade Human Genome Assemblies, SMRT Sequencing Yields Optimal Results
- Webinar: Assembling High-Quality Human Reference Genomes for Global Populations
- FALCON-Phase press release and article preprint
- PacBio research focus webpage about Human Population Genetics
About Pacific Biosciences
Forward-Looking Statements
All statements in this press release that are not historical are forward-looking statements, including, among other things, statements relating to future availability, uses, accuracy, quality or performance of, or benefits of using, products or technologies, the suitability or utility of methods, products or technologies for particular applications, studies or projects, the expected benefits of sequencing projects, and other future events. You should not place undue reliance on forward-looking statements because they involve known and unknown risks, uncertainties, changes in circumstances and other factors that are, in some cases, beyond Pacific Biosciences’ control and could cause actual results to differ materially from the information expressed or implied by forward-looking statements made in this press release. Factors that could materially affect actual results can be found in Pacific Biosciences’ most recent filings with the
Pacific Biosciences undertakes no obligation to revise or update information in this press release to reflect events or circumstances in the future, even if new information becomes available.
Contacts
Media:
nicole@bioscribe.com
Investors:
ir@pacificbiosciences.com
Source: Pacific Biosciences of California, Inc.