UPDATED
March 27, 2018
This paper is now available at BMC Genomics.
ORIGINAL POST
October 5, 2017
A new preprint from scientists at the University of Guelph in Canada and the University of Pennsylvania reports the evaluation of SMRT Sequencing with the Sequel System as a replacement for Sanger platforms for amplicon sequencing. They found that long-read PacBio sequencing was highly accurate, exceeded Sanger coverage metrics, and reduced costs 40-fold.
“A Sequel to Sanger: Amplicon Sequencing That Scales” comes from lead author Paul Hebert, senior author Evgeny Zakharov, and collaborators. The team embarked on this project in the hope of finding a suitable amplicon sequencing alternative to costly Sanger technology. Short-read sequencers have not succeeded for this application because “the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing,” they note. “While recent studies have established that Illumina and Ion Torrent platforms can analyze 1 kb amplicons with good accuracy, their need to concatenate short reads creates risks to data quality linked to the recovery of chimeras and pseudogenes.” In addition, these platforms reduce costs only three- to four-fold compared to Sanger.
They turned to the Sequel System and circular consensus sequencing (CCS) of amplicons that were indexed with all combinations of 100 distinct forward and 100 distinct reverse primer barcodes. CCS reads the same amplicon several times in a single read to ensure high accuracy; consensus calling is then performed across molecules that share the same barcode pair. For a rigorous evaluation, the scientists simultaneously analyzed, in a single SMRT Cell, barcoded amplicons of the mitochondrial cytochrome c oxidase I gene from 10,000 separate DNA extracts representing more than 5,000 Arthropoda species. The PacBio system was thus tested against a range of historically difficult sequencing challenges, from homopolymers to varied GC content to evenness of coverage across isolates. SMRT Sequencing results were compared to those from Sanger technology.
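To illustrate how pairing 100 forward with 100 reverse barcodes yields a unique index for each of 10,000 samples, here is a minimal Python sketch. It is not the authors' pipeline: the 4-base barcodes, the function names `build_sample_index` and `assign_sample`, and the exact-match lookup are all assumptions made for illustration. Real barcodes are longer, and a production demultiplexer would need to tolerate sequencing errors and handle read orientation.

```python
from itertools import product


def build_sample_index(forward_barcodes, reverse_barcodes):
    """Map every (forward, reverse) barcode pair to a distinct sample number."""
    pairs = product(forward_barcodes, reverse_barcodes)
    return {pair: sample_id for sample_id, pair in enumerate(pairs)}


def assign_sample(read, sample_index, barcode_len):
    """Return the sample number for a read, or None if its barcode pair is unknown.

    Assumes (hypothetically) that the forward barcode is the first
    `barcode_len` bases, the reverse barcode is the last `barcode_len`
    bases, and both match exactly.
    """
    pair = (read[:barcode_len], read[-barcode_len:])
    return sample_index.get(pair)


# Toy barcode sets: 200 distinct 4-mers split into two groups of 100.
# These are placeholders, not the barcodes used in the study.
all_kmers = ["".join(k) for k in product("ACGT", repeat=4)]
forward_barcodes = all_kmers[:100]
reverse_barcodes = all_kmers[100:200]

sample_index = build_sample_index(forward_barcodes, reverse_barcodes)
print(len(sample_index))  # 10000 distinct barcode pairs

# A mock read: forward barcode + amplicon insert + reverse barcode.
mock_read = forward_barcodes[3] + "ACGTACGTACGT" + reverse_barcodes[7]
print(assign_sample(mock_read, sample_index, barcode_len=4))
```

Once reads are grouped by sample in this way, a consensus can be called per group, which is the step the post describes as consensus calling of molecules with the same barcode pair.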
The study found that the Sequel System delivered excellent accuracy and that the technique was robust. “Across this range of templates, SMRT sequencing showed no points of failure,” the scientists report. “SMRT sequences also had a major advantage over their Sanger counterparts as they regularly provided complete coverage for the target amplicon.” Unidirectional Sanger reads, for example, were frequently truncated and bidirectional reads varied noticeably in length, generally reflecting homopolymer runs.
While this project focused on shorter amplicons, the team notes that Sanger technology has known limitations for templates longer than 1 kb because of the need to analyze overlapping amplicons. Even in CCS mode, SMRT Sequencing reads are long enough that multi-kilobase templates can easily be covered several times.
The team reports that sequencing capacity makes the Sequel System particularly attractive for this application. “Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL reduces costs 40-fold from Sanger analysis,” they write. “Exploitation of this capacity is aided by the fact that data processing is simple.” Unlike Sanger data, which calls for visual inspection of results, or short-read data, “SMRT sequences can be processed with an automated pipeline,” they add.