The constituents and intra-communal interactions of microbial populations have garnered increasing interest in areas such as water remediation, agriculture and human health. Amplification and sequencing of the evolutionarily conserved 16S rRNA gene is an efficient method of profiling communities. Currently, most targeted amplification focuses on short, hypervariable regions of the 16S sequence. Distinguishing information not spanned by the targeted region is lost, and species-level classification is often not possible. PacBio SMRT Sequencing easily spans the entire 1.5 kb 16S gene in a single read, producing highly accurate single-molecule sequences that can improve the identification of individual species in a metapopulation.
However, this process still relies upon PCR amplification from a mixture of similar sequences, which may result in chimeras, or recombinant molecules, at rates upwards of 20%. These PCR artifacts make it difficult to identify novel species and reduce the number of informative sequences. We investigated multiple factors that may contribute to chimera formation, such as template damage, denaturation time before and during thermocycling, polymerase extension time, and reaction volume. We found two related factors that contribute to chimera formation: the amount of template input into the PCR, and the number of PCR cycles.
A second problem that can confound analysis is sequence errors generated during amplification and sequencing. With the updated algorithm for circular consensus sequencing (CCS2), single-molecule reads can be filtered to 99.99% predicted accuracy. Substitution errors in these highly filtered reads may be dominated by mis-incorporations during amplification. Sequence differences in full-length 16S amplicons from several commercial high-fidelity PCR kits were compared.
We show results of our experiments and describe our optimized protocol for full-length 16S amplification for SMRT Sequencing. These optimizations have broader implications for other applications that use PCR amplification to phase variations across targeted regions and generate highly accurate reference sequences.
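To put the 99.99% predicted-accuracy filter in concrete terms, the short sketch below converts that figure to a Phred-scaled quality value and estimates how many sequencing errors would be expected in one full-length 16S read. It is illustrative arithmetic only, not part of the protocol: the function names are hypothetical, the 1,500 bp read length is the nominal gene size quoted above, and errors are assumed to be independent and uniformly distributed along the read.

```python
import math


def phred_from_accuracy(accuracy: float) -> float:
    """Convert a predicted per-base accuracy (0..1) to a Phred-scaled quality."""
    return -10.0 * math.log10(1.0 - accuracy)


def expected_errors(read_length: int, accuracy: float) -> float:
    """Expected number of erroneous bases in a read (assumes independent errors)."""
    return read_length * (1.0 - accuracy)


def prob_error_free(read_length: int, accuracy: float) -> float:
    """Probability that every base in the read is correct (same assumption)."""
    return accuracy ** read_length


# Assumed inputs: the 99.99% filter and a nominal 1.5 kb 16S amplicon.
ACCURACY = 0.9999
READ_LENGTH = 1500

q = phred_from_accuracy(ACCURACY)
errs = expected_errors(READ_LENGTH, ACCURACY)
p_clean = prob_error_free(READ_LENGTH, ACCURACY)

print(f"Phred quality: Q{q:.0f}")
print(f"Expected sequencing errors per read: {errs:.2f}")
print(f"Probability a read is free of sequencing errors: {p_clean:.2f}")
```

Under these assumptions the filter corresponds to about Q40 and roughly 0.15 expected sequencing errors per read, which is consistent with the suggestion that residual substitutions in filtered reads may come mainly from amplification.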
Organization: PacBio
Year: 2016