Scientific publications

Publications featuring PacBio long-read + short-read sequencing data

Molecular Neurodegeneration | 2025

Entering the era of precision medicine to treat amyotrophic lateral sclerosis

Frances Theunissen, Loren Flynn, Alfredo Iacoangeli, Ahmad Al Khleifat, Ammar Al-Chalabi, James J. Giordano, Masha Strømme & P. Anthony Akkari

We address the advances in our understanding of the complex genetic architecture of ALS, including the varying models of genetic contribution to disease, and the importance of understanding population genetics and genetic testing when considering patient selection for clinical studies. Additionally, we discuss the advances in long-read whole-genome sequencing technology and how this method can improve streamlined genetic testing and our understanding of the genetic heterogeneity in ALS.

bioRxiv | 2025

A telomere-to-telomere map of somatic mutation burden and functional impact in cancer

Min-Hwan Sohn, Danilo Dubocanin, Mitchell R Vollger, Youngjun Kwon, Anna Minkina, Katherine M Munson, Samuel FM Hart, Jane E Ranchalis, Nancy L Parmalee, Adriana E Sedeño-Cortés, Jeffrey Ou, Natalie YT Au, Stephanie Bohaczuk, Brianne Carroll, Christian D Frazar, William T Harvey, Kendra Hoekzema, Meng-Fan Huang, Caitlin N Jacques, Dana M Jensen, J Thomas Kolar, Rosa Lee, Jiadong Lin, Kelsey Loy, Taralynn Mack, Yizi Mao, Meranda M Pham, Erica Ryke, Joshua D Smith, Lila Sutherlin, Elliott G Swanson, Jeffrey M Weiss, SMaHT Assembly WG, Claudia Carvalho, Tim HH Coorens, Kelley Harris, Chia-Lin Wei, Evan E Eichler, Nicolas Altemose, James T Bennett, Andrew B Stergachis

Oncogenesis involves widespread genetic and epigenetic alterations, yet the full spectrum of somatic variation genome-wide remains unresolved. These findings define the full landscape of a cancer’s somatic variation and their functional impact, establishing a blueprint for T2T studies of mosaicism.

bioRxiv | 2025

Long-read sequencing reveals extensive FMR1 somatic mosaicism in Fragile-X associated tremor/ataxia syndrome in human brain

Anna Dischler, Akshay Avvaru, Susana Lopez-Ignacio, Cristina Lau, Martin W. Breuss, Verónica Martínez Cerdeño, Harriet Dashnow, Caroline M. Dias

This work provides new insight into the extensive molecular variation underlying FXTAS in human brain and establishes a framework for studying repeat expansion disorders more broadly, highlighting the potential of long-read sequencing to advance our fundamental understanding of somatic mosaicism of these intractable regions of our genome.

medRxiv | 2025

Population-scale Long-read Sequencing in the All of Us Research Program

Kiran V Garimella, Qiuhui Li, Julie Wertz, Samuel K Lee, Fabio Cunial, Yongqing Huang, Yulia Mostovoy, Ryan Lorig-Roach, Adam English, Hang Su, Shawn Levy, Donna M Muzny, Chelsea Berngruber, Matt C Danzi, William T Harvey, Emily L LaPlante, Karynne Patterson, Allison N Rozanski, Sophie Schwartz, Beri Shifaw, Yuanyuan Wang, Isaac Wong, Isaac R. L. Xu, Shadi Zaheri, Stephan Zuchner, Xinchang Zheng, Shannon Dugan-Perez, Michal Izydorczyk, Heer Mehta, Richard A Gibbs, Lee Lichtenstein, Namrata Gupta, Niall Lennon, Stacey Gabriel, All of Us Research Program Long Read Working Group, Winston Timp, Kimberly F Doheny, Tara Dutka, Anjene Musick, Chia-Lin Wei, Fritz J Sedlazeck, Michael C Schatz, Michael E Talkowski, Evan E Eichler

The All of Us Research Program (AoU) is a national biobank seeking to enroll one million individuals in the United States to link genomic and biomedical data, including short- and long-read whole-genome sequencing (srWGS/LRS), with rich electronic health record (EHR) information. Here, we present the first large-scale analyses of long-read sequencing (LRS) in AoU and offer a new framework for deriving genomic insights into complex structural variation (SV) of relevance to human health and disease.

JAMA Pediatrics | 2025

Clinical long-read sequencing test for genetic disease diagnosis

Isabelle Thiffault, PhD¹; Emily Farrow, PhD¹; Cassandra Barrett, PhD²; et al. 1 Department of Pathology and Laboratory Medicine, Children’s Mercy Kansas City, Kansas City, Missouri 2 Division of Clinical Genetics, Department of Pediatrics, Children’s Mercy Kansas City, Kansas City, Missouri 3 Genomic Medicine Center, Department of Pediatrics, Children’s Mercy Kansas City, Kansas City, Missouri

This landmark study demonstrates how HiFi sequencing can transform pediatric disease discovery by delivering 10% higher success over all prior testing methods, helping to provide families with results in less than 1 month rather than 3, and reducing the need for multiple stressful and costly rounds of testing.

medRxiv | 2025

Expanded map of genomic imprinting reveals insight into human disease

Craig Smail, Warren A. Cheung, Boryana Koseva, Adam F. Johnson, Chengpeng Bi, Carl F. Schreck, Michael Lydic, Kristin Holoch, Elena Repnikova, John Herriges, Courtney Marsh, Isabelle Thiffault, Tomi Pastinen, Elin Grundberg

Allele-specific methylation is often underappreciated but plays a crucial role in understanding what drives development and disease. Short reads miss most of this signal (up to 60%), showing that methylation isn’t just extra data; it’s essential for discovery. With HiFi, you get it automatically with every genome.

bioRxiv | 2025

RNA splicing dynamics in CD8 T cells uncovers isoforms that impact T cell-mediated cancer immunotherapy

Shay Tzaban, Priyanga Appasamy, Elad Zisman, Shiri Klein, Reyut Lewis, Houlin Yu, Akanksha Khorgade, Marc A. Schwartz, Moshe Sade-Feldman, Thomas Eisenhaure, Oren Parnas, Aron Popovtzer, Cyrille Cohen, Eric Shifrut, Aziz M. Al’Khafaji, Rotem Karni, Galit Eisenberg, Nir Hacohen, Michal Lotem

This study shows the power of combining HiFi long-read sequencing with single-cell resolution to map isoform usage in human CD8⁺ T cells. By capturing dynamic splicing programs that short-read methods often miss, the team not only redefined T cell states, but also uncovered novel immune checkpoints with therapeutic potential. This work lays the foundation for isoform-selective immunotherapies that are anchored in discovery, validated in vivo, and guided by a generalizable single-cell framework.

bioRxiv | 2025

Intron retention regulates STAT2 function and predicts immunotherapy response in lung cancer

Ryan P. Englander, Mattia Brugiolo, Te-Chia Wu, Mitch Kostich, Nathan K. Leclair, SungHee Park, Jacques Banchereau, Peter Yu, Andrew Salner, Romain Banchereau, Karolina Palucka, Olga Anczuków

The Iso-Seq method doubled the number of isoforms detected compared to short-read RNA-seq, revealing splicing events with direct relevance to cancer immunotherapy. The authors note that long-read RNA sequencing “may hold the key to breakthroughs that lead to the next wave of therapeutic advances.” With current workflows using Kinnex kits and the Revio system, such insights can now be achieved at scale and at reduced cost: half a SMRT Cell now generates data equivalent to the 12 SMRT Cells used in the study.

Frontiers | 2025

The rare hemoglobin variants Hb O-Arab and Hb D-Punjab identified in population-based genetic screening throughout Guangxi, China

Chunrong Gui¹,²†, Zifeng Cheng³†, Yongsheng Chen⁴†, Yunting Ma³, Hongfei Chen³, Wei Wei¹,², Xianda Wei¹,², Juliang Liu¹,², Xu Zhou³, Qianqian Du⁵, Yinghui Lai⁴*, Baoheng Gui¹,²,³* 1 Center for Medical Genetics and Genomics, The Second Affiliated Hospital of Guangxi Medical University, Nanning, China 2 The Guangxi Health Commission Key Laboratory of Medical Genetics and Genomics, The Second Affiliated Hospital of Guangxi Medical University, Nanning, China 3 The Second School of Medicine, Guangxi Medical University, Nanning, China 4 Department of Hematology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, China 5 Berry Genomics Corporation, Beijing, China

Finding these rare variants matters. Many aren’t included on standard screening panels, which means families could miss crucial information about their health and risk of developing diseases like thalassemia. By uncovering what other technologies miss, PacBio sequencing can help empower answers for families in high-risk regions.

Springer Nature Link | 2025

First clinical diagnosis of FAME3 via commercial long-read sequencing reveals mosaic repeat expansion in MARCHF6 gene

B. Lakshitha A. Perera, Russell Stewart, Yutaka Furuta, Kimberly M. Ezell, Lynette Rives, Bethany Nunley, Ashley McMinn, Alyson Krokosky, Serena Neumann, Mary E. Koziura, the Undiagnosed Diseases Network, Rizwan Hamid, Joy D. Cogan, Thomas A. Cassini, Eric R. Gamazon, John A. Phillips III & Rory J. Tinker

HiFi sequencing revealed the complex repeat expansion structure driving disease, which was missed by conventional testing. The study highlights the value of long-read sequencing for repeat expansion disorders, as well as the ability of TRGT-instability to detect mosaicism at single-molecule resolution. On Revio and Vega, HiFi WGS combined with TRGT tools provides researchers and clinicians with a powerful framework for understanding difficult repeat-associated conditions.

PLOS One | 2025

A comparison of DNA methylation detection between HiFi sequencing and whole-genome bisulfite sequencing in monozygotic twins with Down syndrome

Kanyanee Promsawan, Chalurmpon Srichomthong, Monnat Pongpanich, Vorasuk Shotelersuk

Because HiFi sequencing interrogates the entire genome, it also delivers a more complete view of the epigenome, capturing millions of additional CpG sites and enabling de novo methylation analysis without chemical conversions or extra workflows. With 5-base sequencing available directly on Revio and Vega, researchers can integrate methylation profiling into every run as part of standard HiFi data.

medRxiv | 2025

Whole-genome variant detection in long-read sequencing data from ultra-low input patient samples

Katherine Wang, Hayan Lee, Cera J. Aex, Lucas Finot, Kevin Zhu, Julianna R. Chang, Aaron M. Horning, William J. Rowell, Philip Li, Sarah B. Kingan, Michael P. Snyder, Graham S. Erwin

This work demonstrates how ultralow-input HiFi sequencing expands access to clinically relevant samples, enabling comprehensive variant detection even when DNA is limited. The protocol has since been refined into the Ampli-Fi protocol, which reduces input requirements further to just 1 ng. Combined with flexible library prep on Revio and Vega systems, HiFi sequencing now offers exceptional versatility while maintaining its hallmark accuracy and completeness.

Nature Communications | 2025

A draft UAE-based Arab pangenome reference

Nassir, N., Almarri, M.A., Kumail, M. et al. A draft UAE-based Arab pangenome reference.

Pangenomes provide a robust and comprehensive portrayal of genetic diversity in humans, but Arab populations remain underrepresented. We present a preliminary UAE-based Arab Pangenome Reference (UPR) utilizing 53 individuals of diverse Arab ethnicities residing in the United Arab Emirates. We assembled nuclear and mitochondrial pangenomes using 35.27X high-fidelity long reads, 54.22X ultralong reads and 65.46X Hi-C reads. This approach yielded contiguous haplotype-phased de novo assemblies of exceptional quality, with an average N50 of 124.28 Mb. We discovered 111.96 million base pairs of previously uncharacterized euchromatic sequences absent from existing human pangenomes, the T2T-CHM13 and GRCh38 reference human genomes, and other public datasets. Moreover, we identified 8.94 million population-specific small variants and 235,195 structural variants within the Arab pangenome, not present in linear and pangenome references and public datasets. We detected 883 gene duplications, including the TATA-binding protein gene TAF11L5, which was uniquely duplicated across all Arab populations and that included 15.06% of genes associated with recessive diseases. By exploring the mitochondrial pangenome, we identified 1,436 bp of previously unreported sequences. Our study provides a valuable resource for future genetic research and genomic medicine initiatives in Arab populations and other populations with similar genetic backgrounds.

medRxiv | 2025

Long Read Genome Sequencing Elucidates Diverse Functional Consequences of Structural and Repeat Variation in Autism

Milad Mortazavi, James Guevara, Joshua Diaz, Stephen Tran, Helyaneh Ziaei Jam, Sergey Batalov, Matthew Bainbridge, Aaron D. Besterman, Melissa Gymrek, Abraham A. Palmer, Jonathan Sebat

Long-read whole genome sequencing (LR-WGS) technologies enhance the discovery of structural variants (SVs) and tandem repeats (TRs). Application of LR-WGS has potential to identify novel risk factors that contribute to autism spectrum disorder (ASD). We performed LR-WGS on 243 individuals from 63 ASD families and generated an integrated call set combining long- and short-read data. LR-WGS increased detection of gene-disrupting SVs and TRs by 29% and 38%, respectively, and enabled identification of novel exonic de novo germline and somatic SVs that were not detected previously with short read WGS. We observed complex SV patterns, including a previously undescribed class of nested duplication-deletion (DUP-DEL) events. Joint analysis of phased TRs and methylation data revealed that hypermethylation of expanded FMR1 alleles (≥35 CGG repeats) in females occurs independently of X chromosome inactivation. Rare SVs, TRs, and damaging SNVs together accounted for 6.2% (95% CI: 1.7–15%) of the heritability of ASD in this sample. These findings demonstrate how LR-WGS can resolve complex genetic variation and its functional consequences and regulatory effects in a single assay.

medRxiv | 2025

Pangenome discovery of missing autism variants

Yang Sui, Jiadong Lin, Michelle D. Noyes, Youngjun Kwon, Isaac Wong, Nidhi Koundinya, William T. Harvey, Mei Wu, Kendra Hoekzema, Katherine M. Munson, Gage H. Garcia, Jordan Knuth, Julie Wertz, Tianyun Wang, Kelsey Hennick, Druha Karunakaran, Rafael A. Polo Prieto, Rebecca Meyer-Schuman, Fisher Cherry, Davut Pehlivan, Bernhard Suter, Jonas A. Gustafson, Danny E. Miller, Human Pangenome Reference Consortium (HPRC), Hanna Berk-Rauch, Tomasz J. Nowakowski, Aravinda Chakravarti, Huda Y. Zoghbi, Evan E. Eichler

Autism spectrum disorders (ASDs) are genetically and phenotypically heterogeneous and the majority of cases still remain genetically unresolved. To better understand large-effect pathogenic variation, we generated long-read sequencing data to construct phased and near-complete genome assemblies (average contig N50=43 Mbp, QV=56) for 189 individuals from 51 families with unsolved cases of autism. We applied read- and assembly-based strategies to facilitate comprehensive characterization of de novo mutations (DNMs), structural variants (SVs), and DNA methylation profiles. Merging common SVs obtained from long-read pangenome controls, we efficiently filtered >97% of common SVs exclusive to 87 offspring. We find no evidence of increased autosomal SV burden for probands when compared to unaffected siblings yet note a trend for an increase of SV burden on the X chromosome among affected females. We establish a workflow to prioritize potential pathogenic variants by integrating autism risk genes and putative noncoding regulatory elements defined from ATAC-seq and CUT&Tag data from the developing cortex. In total, we identified three pathogenic variants in TBL1XR1, MECP2, and SYNGAP1, as well as nine candidate de novo and biparental homozygous SVs, most of which were missed by short-read sequencing. Our work highlights the potential of phased genomes to discover more complex pathogenic mutations and the power of the pangenome to restrict the focus on an increasingly smaller number of SVs for clinical evaluation.
