August 29, 2024  |  General

Powered by PacBio:
Selected publications from August 2024

 

Genomics experts consistently turn to PacBio HiFi sequencing technology as their preferred tool, empowering them to lead the way in discovery and explore new frontiers across various biological fields.

In this edition of our Powered by PacBio blog series, we highlight scientific papers from August 2024. They range from the first peer-reviewed evidence for SBB chemistry, to HiFi sequencing for allele-specific expression analysis, to a definitive baseline for future evolutionary studies of humans. Together, these papers demonstrate how PacBio technology is fueling the future of what’s possible in genomic research.

Jump to topic:

SBB chemistry on Onso  |  Allele-specific expression  |  Genetic inheritance  |  Evolution

 

SBB chemistry on Onso system


Sequencing by binding rivals SMOR error-corrected sequencing by synthesis technology for accurate detection and quantification of minor (< 0.1%) subpopulation variants

In the first-ever peer-reviewed paper published on Onso and SBB technology, researchers demonstrate the superior accuracy and precision of Onso when compared with Illumina SBS technology. They compared targeted SBB sequencing with error-corrected and non-error-corrected targeted SBS sequencing for identifying and quantifying rare drug-resistant subpopulations in Mycobacterium tuberculosis (Mtb) samples. The study used 30 contrived Mtb samples containing either katG g944c or gyrA a241g resistance mutations at 10%, 1%, 0.1%, 0.01%, and 0.001% frequency, all sequenced with SBB and SBS.

The findings:

  • SBB detected mutations at 0.01% to 0.001% frequency at 100,000x depth, or at 0.1% frequency and above at 20,000x depth, with no error correction methods. That is a 10x improvement compared to SBS and competitive with error-corrected SBS data. Quoted from the paper: “The 100,000x depth improved the ability to detect the ultra-low variant frequencies for katG g944c down to 0.01% for SBS-SMOR and 0.001% for SBB.”
  • Empirical error rate and false positive calls were also lower with PacBio SBB sequencing data. Quoted from the paper: “Overall, SBS-SMOR and SBB showed an 8.3X and 8.5X reduction in observed errors, respectively, compared to SBS alone.”
  • Overall, the authors propose SBB as a powerful alternative for detecting rare mutations across applications without requiring complex error correction: “SBB sequencing chemistry detected target SNPs down to 0.01% at 100,000x depth and 0.1% at 20,000x depth, without any error correction methods. Traditional SBS sequencing is unable to achieve this accuracy without the use of sophisticated error correction tools (e.g., SMOR, UMI, duplex, and others).”
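To put the depth and frequency figures above in perspective, the following is a minimal Python sketch of how many variant-supporting reads one would expect at a given depth. It is not from the paper. The function names are hypothetical, and it assumes an idealized model in which reads are sampled independently and sequencing errors are ignored.

```python
def expected_variant_reads(depth, variant_percent):
    """Expected number of reads carrying the variant at a given depth.

    depth: number of reads covering the position (e.g. 100_000 for 100,000x).
    variant_percent: variant frequency in percent (e.g. 0.001 for 0.001%).
    """
    return depth * variant_percent / 100


def prob_at_least_one_variant_read(depth, variant_percent):
    """Chance of sampling at least one variant read.

    Assumption (not from the paper): reads are independent draws, so the
    count of variant reads follows a binomial distribution.
    """
    fraction = variant_percent / 100
    return 1 - (1 - fraction) ** depth


if __name__ == "__main__":
    # Depth and frequency pairs taken from the findings quoted above.
    for depth, pct in [(100_000, 0.01), (100_000, 0.001), (20_000, 0.1)]:
        reads = expected_variant_reads(depth, pct)
        p_seen = prob_at_least_one_variant_read(depth, pct)
        print(f"{depth:,}x at {pct}%: about {reads:.0f} expected variant reads, "
              f"P(at least one) = {p_seen:.3f}")
```

At 100,000x depth, a 0.001% variant corresponds to roughly one expected supporting read, which illustrates why the raw error rate of the chemistry matters so much at these frequencies.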

Note: The results here were achieved with only single-end reads and pre-commercial sequencing chemistry. PacBio expects improvements on these results with current paired-end chemistry, which would additionally allow validation of the authors’ hypothesis that “integration of error correction methods with SBB sequencing… has the potential to significantly decrease the sequencing error rate even further.”

Conclusion

SBB offers a highly accurate method for detecting minor populations of drug-resistant Mtb at frequencies as low as 0.001% without needing complex error correction techniques like UMIs or SMOR. It also addresses the limitations of Illumina SBS by reducing error rates and simplifying the process, making it a critical tool for early detection of heteroresistance in Mtb, which is essential for tackling drug-resistant tuberculosis. Applied broadly, this capability offers advantages for detecting cancer cells in liquid biopsy samples with tumor-derived cell-free DNA and for other infectious disease applications such as wastewater detection.

 
 


 

Allele-specific expression


Experimental and computational methods for allelic imbalance analysis from single nucleus RNA-seq data

In a recent preprint, researchers from Aligning Science Across Parkinson’s MD, Broad, Yale, Harvard, Banner Sun AZ, and Genentech conducted a detailed analysis of protocols for “allele-specific expression (ASE) analysis to better understand how variation in the human genome affects RNA expression at the single-cell level”. The study used single-nucleus RNA-Seq (snRNA-Seq) because introns in pre-mRNA are enriched for heterozygous variants, which facilitates allelic assignment for ASE. It concluded that ASE analysis is better with HiFi than with Illumina short reads: more accurate, more cost-effective, and able to resolve isoforms.

Key findings:

  • Comparing Illumina, HiFi & ONT:
    • Accuracy: “PacBio and MAS-seq had a fairly low error rate, with more indels but fewer mismatches than Illumina [>4x lower mismatch error rate], and ONT-based methods had a higher error rate.” [despite the latter using R2C2]
    • Throughput: “we still obtained more phased UMIs overall with Illumina than with the long-read technologies except for MAS-seq with Revio, which had a comparable number. This indicates that MAS-seq and Illumina are both powerful methods for performing gene-level ASE analysis of snRNA-seq data”.
    • Read value/cost: “for every MAS-seq segmented read (S-read) one would need to sequence ~7 Illumina reads (with 130bp in read 2) to get the same number of phased UMIs though the exact number depends on the level of saturation, so that MAS-seq would be more cost effective than Illumina if the cost of one S-read in MAS-seq is less than seven times the cost of an Illumina read.”
  • Isoform resolution: “MAS-Seq has a major advantage over short-read data for isoform-level ASE analysis.” [this is a key result, as many studies (e.g., 1, 2, 3, 4, 5, 6, 7) have now shown that different isoforms of the same gene often have very different functions, hence isoform expression, not gene expression, determines biology and disease]
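The read value/cost finding above reduces to a simple break-even rule, sketched below in Python. This is an illustration only: the function names and the example prices are hypothetical placeholders in arbitrary units, and the default ratio of 7 is the approximate figure quoted from the preprint, which the authors note depends on the level of saturation.

```python
def break_even_s_read_cost(cost_per_illumina_read, illumina_reads_per_s_read=7.0):
    """Highest S-read cost at which MAS-seq still matches Illumina.

    illumina_reads_per_s_read: how many Illumina reads yield the same number
    of phased UMIs as one MAS-seq S-read (about 7 in the quoted finding).
    """
    return illumina_reads_per_s_read * cost_per_illumina_read


def masseq_is_more_cost_effective(cost_per_s_read, cost_per_illumina_read,
                                  illumina_reads_per_s_read=7.0):
    """True when one S-read costs less than the equivalent Illumina reads."""
    threshold = break_even_s_read_cost(cost_per_illumina_read,
                                       illumina_reads_per_s_read)
    return cost_per_s_read < threshold


if __name__ == "__main__":
    # Hypothetical prices in arbitrary units, NOT real list prices.
    illumina_read_cost = 1.0
    for s_read_cost in (5.0, 8.0):
        better = masseq_is_more_cost_effective(s_read_cost, illumina_read_cost)
        print(f"S-read cost {s_read_cost}: MAS-seq more cost effective? {better}")
```

Passing a different `illumina_reads_per_s_read` lets a lab substitute the ratio observed at its own sequencing saturation.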

Conclusion

HiFi sequencing delivers more detailed isoform-level ASE analysis, capturing complex transcriptomic variations that short-read sequencing misses. Through cost-effective and automated solutions like the Kinnex single-cell RNA workflow on the Revio system, it provides deeper insights into transcriptional regulation and improves understanding of how genomic variation impacts cellular function at a higher resolution.

 

Genetic inheritance


A familial, telomere-to-telomere reference for human de novo mutation and recombination from a four-generation pedigree

In a preprint led by UW with 18 additional institutions, researchers trace genetic inheritance across four generations using HiFi sequencing. This study marks “the most comprehensive, publicly available ‘truth set’ of all classes of genomic variants”.

 

Highlights include:

  • Near-telomere-to-telomere phased genome assemblies (multiple techs, mix of verkko and hifiasm) in a four-generation, 28-member family
  • Using HiFi sequencing to discover de novo SNVs and indels, as well as mutations in STRs and VNTRs using TRGT & TRGT-denovo (with other technologies being utilized for orthogonal support)
  • The finding that: “Leveraging long-read sequence data in the context of a pedigree provides access to an additional 244 Mbp of the human genome (2.76 Gbp high-confidence regions) when compared to Genome in a Bottle (GIAB) (2.51 Gbp) or Illumina WGS data (2.58 Gbp), including 194 Mbp not present in either study.”
  • Researchers were able to complete detailed analyses of variant callset, sequence-resolved recombination map, de novo SNVs and small indels, de novo tandem repeats and recurrent mutation, centromere familial transmission and de novo SVs therein, Y chromosome mutations, and genome-wide de novo SVs
  • “Most [previous] studies41–45 that establish DNM [de novo mutation] rates utilize short reads amongst large groups of trios and generally agree on 60-70 DNMs per generation; however, this largely excludes highly mutable regions of the genome, e.g., long TRs, SDs, and satellite sequence”. “In this multigenerational pedigree, we estimate 128-259 DNMs per generation” (i.e., roughly two to four times the number previously assumed).
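The comparison of DNM estimates in the last highlight can be checked with a few lines of Python. This is only a worked arithmetic sketch using the two ranges quoted above. The variable and function names are hypothetical, and comparing range endpoints this way is a simplification rather than the authors' analysis.

```python
def fold_change_range(new_range, old_range):
    """Smallest and largest ratio between two (low, high) ranges."""
    new_low, new_high = new_range
    old_low, old_high = old_range
    # Most conservative: lowest new estimate over highest old estimate.
    # Most generous: highest new estimate over lowest old estimate.
    return new_low / old_high, new_high / old_low


if __name__ == "__main__":
    short_read_dnms = (60, 70)    # DNMs per generation, prior short-read studies
    pedigree_dnms = (128, 259)    # DNMs per generation, four-generation pedigree

    low, high = fold_change_range(pedigree_dnms, short_read_dnms)
    print(f"Fold change: {low:.2f}x to {high:.2f}x")  # 1.83x to 4.32x
```

The result, about 1.8x to 4.3x, is consistent with describing the new estimate as roughly two to four times the earlier figure.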

Conclusion

HiFi sequencing offers a comprehensive and accurate method for studying de novo mutations (DNMs), particularly in complex genomic regions like centromeres and the Y chromosome that short-read methods often miss. Accurate and scalable WGS is available on the Revio system with the HP96 library prep kit and automation-friendly tools. The ability to resolve challenging areas makes HiFi sequencing the most reliable tool for detecting a broader spectrum of mutations, significantly advancing our understanding of genetic disease, phenotypic variation, and human evolution. Notably, while this landmark study utilized data from five different short- and long-read technologies, PacBio HiFi sequencing contributed the majority of the data.

 

Evolution


Complete sequencing of ape genomes 

Lastly, a preprint led by UW in combination with 59 other institutions offers a major leap in our understanding of what it means to be human. Through this study, researchers walk away with “a definitive baseline for all future evolutionary studies of humans and our closest living ape relatives”:

  • Through haplotype-resolved reference genomes [using T2T workflow and verkko assembler] and comparative analyses of six ape species … researchers achieved chromosome-level contiguity with exceptional sequence accuracy (<1 error in 500,000 base pairs), completely sequencing 215 gapless chromosomes telomere-to-telomere.
  • “Each genome assembly was annotated by NCBI [including use of 50 Gb of full-length cDNA generated from each sample with Iso-Seq] and has been adopted as the main reference in RefSeq, replacing the previous short- or long-read-based, less complete versions of the genomes and updating the sex chromosomes with the newly assembled and polished versions.”
  • Detailed analyses, comprising: sequence divergence (including refinement of the “oft-quoted statistic of ∼99% sequence identity between chimpanzee and human” by revealing 5- to 15-fold higher difference in the affected Mbp in rapidly evolving and structural variant regions of the genome), speciation time and incomplete lineage sorting, gene annotation, repeat annotation and mobile element insertion identification, selection and diversity, immunoglobulin and MHC loci, immunoglobulin and T-cell receptor loci, epigenetic features, evolutionary rearrangements and serial ape inversions, structurally divergent and accelerated regions of mutation, acrocentric chromosomes and nucleolar organizer regions, centromere satellite evolution, subterminal heterochromatin, and lineage-specific segmental duplications and gene families.
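The accuracy figure in the first bullet above (<1 error in 500,000 base pairs) can be expressed on the Phred quality scale, where Q = -10 × log10(error rate). The short Python sketch below does that conversion. The function name is hypothetical, and the conversion is the standard Phred definition rather than a calculation taken from the preprint.

```python
import math


def phred_from_error_rate(error_rate):
    """Convert a per-base error rate to a Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # One error per 500,000 bases, the bound quoted for the ape assemblies.
    errors_per_base = 1 / 500_000
    q = phred_from_error_rate(errors_per_base)
    print(f"1 error in 500,000 bp corresponds to about Q{q:.1f}")  # Q57.0
```

Because the assemblies are reported at fewer than one error per 500,000 bases, this value is a lower bound on their consensus quality.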

Conclusion

Only long-read sequencing can resolve the 10%-15% of highly divergent, previously inaccessible regions of ape genomes that are crucial for understanding complex traits. By providing a high-quality, haplotype-resolved reference genome through WGS on the Revio system using the HP96 workflow, researchers can enhance their comprehension of evolutionary and functional differences within ape species, offering valuable insights that can be applied to human health.

 

Ready to kickstart breakthroughs of your own?


These publications highlight the remarkable capabilities of PacBio sequencing. From the Onso system outperforming traditional methods to HiFi sequencing setting new standards for evolutionary research, PacBio technology is empowering researchers to make new groundbreaking discoveries every day.
Learn how to join in on the next era of discovery and see how you can incorporate PacBio data into your next project:

Connect with a PacBio scientist
