April 11, 2024  |  Human genetics research

Long-read sequencing myths: debunked.
Part 2 — human genomics

 


Hearing “the end of the beginning” or “the dawn of a new era” might sound like a grandiose way to describe a scientific milestone. Would it surprise you to learn these phrases were used in 2003 to celebrate the completion of the Human Genome Project? Despite the fanfare, only 92% of the human genome had been mapped then.

Now, let’s leap to 2022. The once elusive 8% has been brought to light, thanks to HiFi sequencing, culminating in the first-ever complete “telomere-to-telomere” assembly of a human genome. Such an accomplishment might seem beyond the reach of everyday research, reserved for colossal health programs with budgets to match. But pause for a second, because this is where we reset expectations. The truth is, long-read sequencing isn’t just for the giants of genomics; it’s also for your own research into the human genome.

So, let’s clear the air about long-read sequencing and discover how it fits into your research puzzle. This is part two of our six-part myth-busting series, where we dispel common myths and misconceptions about long-read sequencing in human genomics applications. (New to the series? Check out Part 1 – HiFi sequencing here.)

Keep reading to separate myth from fact and see what’s really possible in human genomics with the power of HiFi long reads!


Myth #1:

The high accuracy of HiFi long reads is unnecessary for clinical and translational research. Short-read data quality is good enough for variant detection and gene expression.

Fact:

This statement is shortsighted.


Researchers understand that accuracy is crucial for all fields of science. In a clinical or translational research setting, the stakes are even higher. Inaccurate results can mean missing underlying causes of disease that can potentially impact lives.

Long-read sequencing offers a measurably superior view of genomes and transcriptomes compared with short-read sequencing. The benefits are substantial: long reads capture a more complete scope of genetic information, including phasing, methylation patterns, and previously inaccessible dark regions of the genome.

For human genomes, long-read sequencing reveals more information about variation, identifying a greater number of small variants and substantially more structural variants than conventional short-read approaches. This includes detecting variations within tandem repeats and shedding light on the elusive “dark” 8% of the genome. Moreover, long reads can phase variants into haplotypes, providing a clearer picture of the genetic blueprint by connecting methylation patterns to genetic variation.

For transcriptomes, long-read sequencing offers an exceptional depth of analysis, revealing a complete view of the isoform repertoire. It enables accurate characterization of splice variants and fusion transcripts, giving a more holistic view of gene expression and of regulation mechanisms that short-read sequencing might miss. With long-read sequencing, researchers can explore how changes in isoform usage contribute to phenotypic differences between health and disease.

HiFi sequencing accuracy is paramount. It delivers high accuracy even in “dark” regions of the genome, such as GC-rich or repetitive regions. Unlike short-read sequencing technologies, which require additional steps to phase genomes, HiFi sequencing is able to call and phase small and large variants, giving investigators the haplotype information that is crucial for disease research.

Table showing short-read WGS vs HiFi WGS


Myth #2:

PacBio HiFi sequencing is too expensive for large projects.

Fact:

This statement is outdated.


The cost of HiFi sequencing is not what it was years ago. The workflow is now easier and more economical at every step, from sample prep to library prep and sequencing. With the launch of the Revio system, long-read sequencing is more affordable and accessible, at a cost that is comparable to (and often less than) short-read sequencing. HiFi sequencing on the Revio system can sequence whole human genomes at 30× coverage for less than $1,000 USD per genome*.

Because the high accuracy of HiFi reads can generate a fuller picture of the genome, HiFi sequencing allows researchers to understand the genetic causes behind disease faster, with a single assay. In contrast, short-read workflows require multiple tests to arrive at an answer, squandering precious time and resources.

Large-scale projects, such as the All of Us Research Program run by the National Institutes of Health, have recognized that HiFi sequencing provides additional information, and often surpasses the accuracy of short reads, even at lower coverage (10–15×). This approach provides the benefits of long reads at a further reduced cost.
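For readers who want to check the numbers, the per-genome figure can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the 90 Gb yield and $995 reagent price come from the footnote at the end of this post, while the 3 Gb genome size is an assumption (it is the size the footnote's "90 Gb equals 30×" implies), and the function names are made up for illustration. The lower-coverage costs further assume that reagent cost scales linearly with the share of a SMRT Cell each sample uses, which is an idealization rather than PacBio pricing.

```python
# Figures taken from the post's footnote (US list price, expected yield).
SMRT_CELL_YIELD_GB = 90.0     # expected HiFi yield of one Revio SMRT Cell
SMRT_CELL_PRICE_USD = 995.0   # sequencing reagents for one Revio SMRT Cell

# Assumption: a ~3 Gb human genome, the size implied by "90 Gb = 30x".
GENOME_SIZE_GB = 3.0


def coverage_from_yield(yield_gb, genome_gb=GENOME_SIZE_GB):
    """Mean depth of coverage obtained from a given sequencing yield."""
    return yield_gb / genome_gb


def reagent_cost_per_genome(target_coverage,
                            yield_gb=SMRT_CELL_YIELD_GB,
                            price_usd=SMRT_CELL_PRICE_USD,
                            genome_gb=GENOME_SIZE_GB):
    """Sequencing-reagent cost per genome.

    Assumes cost scales linearly with the fraction of one SMRT Cell's
    yield that a single sample consumes (an idealization).
    """
    cell_fraction = target_coverage * genome_gb / yield_gb
    return cell_fraction * price_usd


if __name__ == "__main__":
    depth = coverage_from_yield(SMRT_CELL_YIELD_GB)
    print(f"Coverage from one SMRT Cell: {depth:.0f}x")
    for target in (30, 15, 10):
        cost = reagent_cost_per_genome(target)
        print(f"{target}x genome: ${cost:,.2f} in sequencing reagents")
```

These figures cover sequencing reagents only. Library prep, labor, compute, and any multiplexing overhead are not included.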

In a study titled Utility of long-read sequencing for All of Us, the authors state that “HiFi reads produced the most accurate results for both small and large variants.”1

“Long-reads have widespread value for establishing the most complete and accurate variant calls for All of Us and potentially for many other projects.”
—Mahmoud et al, 2023


Myth #3:

PacBio HiFi sequencing can’t be scaled up to run large projects or meet the demands of busy core facilities or service providers.

Fact:

This statement is false.


For the Estonia National Biobank, the All of Us program, Children’s Mercy Hospital, and many other organizations, PacBio HiFi sequencing scales to deliver much-needed answers in disease research. The Estonia National Biobank has purchased three Revio systems, putting it on track to reach its target of 10,000 genomes in the next two and a half years.
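To put that target in perspective, here is a rough calculation of the sustained throughput it implies. It uses only the numbers quoted above (10,000 genomes, two and a half years, three instruments). The function name is hypothetical, and running every calendar day with no downtime is an assumption made to keep the arithmetic simple.

```python
def required_throughput(target_genomes, years, instruments, days_per_year=365):
    """Return (genomes per day overall, genomes per day per instrument).

    Assumes instruments run every calendar day with no downtime,
    which is an idealization.
    """
    days = years * days_per_year
    per_day_total = target_genomes / days
    return per_day_total, per_day_total / instruments


if __name__ == "__main__":
    total, per_instrument = required_throughput(10_000, 2.5, 3)
    print(f"Overall: {total:.1f} genomes/day")
    print(f"Per instrument: {per_instrument:.1f} genomes/day")
```

Whether an instrument sustains that rate depends on run time and the number of SMRT Cells per run, which this post does not specify.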

Verified automation partners, high-throughput library prep solutions, and PacBio-compatible protocols give you extra efficiency for large-scale operations. With flexible multiplexing options for many different types of applications, you can run multiple samples on a single SMRT Cell. This enables scale and a more cost-effective per-sample option for high-volume projects. It also offers new capabilities for multiomic research, with the flexibility to program up to four applications at once, meaning you can get genomic, transcriptomic, amplicon, and single-cell RNA data in one sequencing run.2


Myth #4:

The HiFi workflow is too slow and cumbersome for large-scale applications.

Fact:

This statement is false.


HiFi sequencing is now easier than ever on the Revio system. You can make your human genomics projects faster and more budget-friendly with the new HiFi Prep Kit 96. This kit reduces library prep time by 60% compared to standard SMRTbell prep. With automation, you can now prepare 96 libraries for long-read sequencing in just 13 hours.

And, it’s not just the upfront prep that scales. We also offer automated sample analysis, plus bioinformatics pipelines like the PacBio WGS Variant Pipeline, so that you can maximize your time before, during, and after sequencing.

This software pipeline enables researchers to resolve many different variant types, including single-nucleotide polymorphisms, insertions and deletions, structural variants, tandem repeats, segmental duplications, and copy number variants. It also provides methylation and phasing data, all in a single bioinformatic solution, making it the most complete human WGS secondary analysis pipeline available. The single computational workflow integrates PacBio and third-party tools, including TRGT, Paraphase, and Google DeepVariant, in an intuitive user interface.

PacBio analysis partners and additional community-supported analysis tools are also available to make it easier for those new to HiFi sequencing to get started.

 


Myth #5:

HiFi long-read sequencing is only for whole genomes.

Fact:

This statement is inaccurate.


While it’s undeniable that genomes offer a vast and powerful spectrum of information, they may not be the primary focus of every project. You might be considering augmenting your genomic data with detailed insights from transcriptomes; this is where Kinnex isoform detection comes into play, providing a sharper view of RNA sequences. Alternatively, your interests might lean toward targeted sequencing, especially if you’re aiming for deeper coverage of clinically relevant genes or wish to illuminate the more elusive ‘unmappable’ regions of the genome.

For researchers studying human disease, PacBio offers the PureTarget repeat expansion panel, a panel of clinically relevant genes that delivers amplification-free, unbiased targeted sequencing. PureTarget libraries consist of DNA in its native form, including epigenetic marks. Libraries prepared using this PCR-free method are free from errors and artifacts that can be introduced during PCR amplification.

Legacy workflows for capturing repeat expansions may include laborious Southern blots or multiple repeat-primed PCR assays, which may have high failure rates, requiring labs to re-run samples. PureTarget offers a more streamlined and robust solution for capturing key repeat expansions. With a convenient end-to-end workflow (including analysis) in less than 3 days, PureTarget enables comprehensive capture of clinically relevant repeats in a more efficient manner.

Whether you’re diving into comprehensive genome sequencing, detailed transcriptome analysis, or focused targeted sequencing, the flexibility is in your hands to orchestrate a project that aligns with your scientific pursuits.


Tomorrow’s genomic discoveries start today

PacBio highly accurate HiFi reads are clearly not the bygone long reads of yesterday (or even of 2003). HiFi whole genomes and multiomic data have the potential to replace traditional short-read sequencing for human disease research, leading the way into a brighter tomorrow.

Stay tuned for part three in our series, where we disprove common myths about long-read sequencing in cancer genomics applications. Let the myth-busting continue!


References


  1. Mahmoud M, Huang Y, Garimella K, et al. Utility of long-read sequencing for All of Us. Nat Commun 15, 837 (2024). https://doi.org/10.1038/s41467-024-44804-3
  2. Vollger MR, Korlach J, Eldred KC, et al. Synchronized long-read genome, methylome, epigenome, and transcriptome for resolving a Mendelian condition. bioRxiv 2023.09.26.559521 (2023). https://doi.org/10.1101/2023.09.26.559521

*US list price is $995 for sequencing reagents for one Revio SMRT Cell, which has an expected yield of 90 Gb, equivalent to a 30× human genome.
