Genome in a Bottle consortium
The National Institute of Standards and Technology (NIST) held its latest Genome in a Bottle (GIAB) workshop last month in Gaithersburg, Md., and we were honored to attend. NIST has performed pivotal work to establish reference materials for the genomics community, starting with its RNA spike-in standards (ERCC spike-in controls) and continuing now with the GIAB consortium. These standards are essential for quality control, and we’re pleased to be working with NIST to help ensure the highest accuracy in human genome sequencing.
Last year, GIAB released its first reference standard, based on the well-studied NA12878 human genome (NIST RM 8398). At this year’s meeting, attendees from clinical sequencing groups shared their experiences with this reference material, which they are using to support clinical testing and validation. The reference is designed to help users make high-confidence variant calls across many types of variation. In addition, attendees from the FDA described how they anticipate using the reference material to assess device performance as part of the regulatory review process for diagnostics.
The workshop also featured a research track reporting on efforts surrounding the newest GIAB reference material, which will be based on an Ashkenazi Jewish trio from the Personal Genome Project. Through the GIAB consortium, NIST has characterized the genomes from this trio using measurements from 11 different technologies, including those from BioNano Genomics, Complete Genomics (paired-end and Long Fragment Read), Thermo Fisher Scientific (the Ion Proton system and SOLiD sequencing), Oxford Nanopore, Pacific Biosciences, 10X Genomics (GemCode Platform), and Illumina (paired-end, mate-pair, and synthetic long read sequencing). All of the data, including the data from PacBio® sequencing, has been made public through a paper that was recently posted on bioRxiv.
The GIAB data analysis communities have been actively working to analyze these public data sets and report on a variety of results. Speakers including Adam Phillippy, Ali Bashir, Shinichi Morishita, and Will Salerno presented early results from their efforts to fully characterize the trio genomes using the data, including de novo assembly, structural variation profiling, SNV calling, haplotype reconstruction, and methylation analysis of the epigenome. Given the sheer number of technologies being used to decode the genomes of this family, it seems these three individuals will soon have some of the most deeply analyzed genomes in the world!
Marc Salit, who leads the Genome Scale Measurement Group at NIST, said the institute plans to integrate these data into a consolidated set of high-quality measurements and make them publicly available through NCBI. The previously published reference material, Salit told attendees, has already been used to support 510(k) diagnostic device filings with the FDA and to demonstrate validation by clinical sequencing labs during CAP/CLIA inspections.
Meeting attendees were also invited to a session with the NIST GIAB steering committee, where stakeholders agreed that the top priority is to completely characterize the NA12878 and AJ trio genomes to ensure high-confidence calls across all categories of genetic variation spanning the whole genome.
The next GIAB meeting will take place January 28-29, 2016, at Stanford University, and we look forward to participating again and continuing our contributions to this community.