Scientific posters

AMP 2024  |  2024

Improved detection of low frequency mutations in ovarian and endometrial cancers by utilizing a highly accurate sequencing platform

Timothée Revil1, Nairi Pezeshkian2, Dan Nasko2, Lucy Gilbert1, Alexandra Sockell2, Jiannis Ragoussis1; 1) McGill University, Quebec, QC, Canada, 2) Pacific Biosciences, Menlo Park, CA

Combined, ovarian and endometrial cancers are the fourth-leading cause of cancer death among Canadian women. In 2020, over 3,000 women were diagnosed with ovarian cancer, 75% of them at a late stage. The goal of the DOvEEgene (Detecting Ovarian and Endometrial cancer Early using Genomics) project is to detect these cancers as early as the first stage through a low-cost, minimally invasive, and widely available test, similar to what the Pap test has done for cervical cancers. In this assay, an intra-uterine brush sample and a saliva sample are collected from each subject. Genomic DNA is extracted from both samples, captured with SureSelect XT HS using probes with a total size of 146.46 kb (see target design), sequenced at 20 million reads to a median DNA fragment depth of at least 80% at 1000x, and deduplicated using UMIs. In parallel, uncaptured libraries are used for low-pass whole genome sequencing (LP-WGS). Somatic and copy number variants are called, as well as germline variants for 10 genes, and microsatellite instability (MSI) status is determined for known microsatellite loci within the target region. Separately, clinical MSI testing is performed on each sample using a PCR-based assay. As the ability to detect early-stage cancers relies on high sensitivity and specificity, we were interested in testing the PacBio Onso sequencing by binding (SBB) technology, which promises much higher sequencing quality and better performance in homopolymer regions and should therefore potentially improve variant detection and MSI calling performance.
AMP 2024  |  2024

Improved liquid biopsy assay performance using sequencing by binding (SBB) on the PacBio Onso system

Dan Nasko1, Phillip Pham1, Stuti Joshi1, Kristi Kim1, Nairi Pezeshkian1, Young Kim1, Alexandra Sockell1, and Jonas Korlach1; 1) Pacific Biosciences, Menlo Park, CA

Liquid biopsy is revolutionizing the field of early cancer detection research through non-invasive detection of tumor DNA in the blood. However, existing liquid biopsy assays are limited in their sensitivity for ctDNA detection at low variant allele frequencies (VAFs). Here we describe the application of the PacBio Onso short-read sequencing system to help enable detecti
AMP 2024  |  2024

Targeted long-read sequencing of native DNA for comprehensive characterization of repeat expansions

Sarah B Kingan1, Guilherme De Sena Brandine1, Jocelyne Bruand1, Jeff Zhou1, Valeriya Gaysinskaya1, Janet Aiyedun1, Julian Rocha1, Duncan Kilburn1, Egor Dolzhenko1, Zoi Kontogeorgiou2, Anita Szabo3, Christina Zarouchlioti3, Robert Thaenert4, Pilar Alvarez Jerez5, Kimberley Billingsley5, Sonia Lameiras6, Sylvain Baulande6, Alice Davidson3, Georgios Koutsis7, Georgia Karadima2, Stéphanie Tomé8, Michael A Eberle1 1. Pacific Biosciences (PacBio), Menlo Park, United States, 2. National and Kapodistrian University of Athens, 1st Department of Neurology, Athens, Greece, 3. University College London, Institute of Ophthalmology, United Kingdom, 4. Quest Diagnostics, Marlborough, United States, 5. National Institutes of Health, Center for Alzheimer's and Related Dementias, National Institute on Aging, Bethesda, United States, 6. Institut Curie, PSL Research University, ICGex Next-Generation Sequencing Platform, Paris, France, 7. National and Kapodistrian University of Athens, Neurogenetics Unit, 1st Department of Neurology, Eginition Hospital, School of Medicine, Athens, Greece, 8. Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en myologie, Paris, France

Short tandem repeats (STRs) are DNA sequences composed of repetitions of 1 – 6 bp motifs. Expansions of STRs are the cause of over 60 monogenic diseases, including Huntington’s disease, fragile X syndrome, and amyotrophic lateral sclerosis1. In addition to their length, the pathogenicity of these STRs can be impacted by sequence composition, methylation status and mosaicism. One such example is the FMR1 repeat, whose CGG repeat expansions are typically hypermethylated and where AGG interruption sequences can stabilize the repeat. Detecting all the characteristics associated with pathogenic repeat expansions has traditionally required multiple assays; however, long-read sequencing of unamplified DNA holds the promise of resolving all these features in a single assay.
ASHG 2024  |  2024

Detection of repeat expansions with PureTarget

M. Eberle1, G. De Sena Brandine2, V. Gaysinskaya2, J. Aiyedun2, J. Rocha3, D. Kilburn2, S. Kingan4, E. Dolzhenko2, Z. Kontogeorgiou5, A. Szabo6, C. Zarouchlioti6, R. Thaenert7, P. Alvarez Jerez8, K. Billingsley8, S. Lameiras9, S. Baulande9, A. Davidson10, G. Koutsis5, G. Karadima5, S. Tome11; 1) PacBio, Oceanside, CA, 2) PacBio, Menlo Park, CA, 3) PacBio, Bel Air, MD, 4) PacBio, San Mateo, CA, 5) Natl. and Kapodistrian Univ. of Athens, Athens, Greece, 6) Univ. Coll. London, London, United Kingdom, 7) Quest Diagnostics, Marlborough, MA, 8) NIH, Bethesda, MD, 9) Inst. Curie, Paris, France, 10) UCL, London, United Kingdom, 11) INSERM, Paris, France

Abstract: Short tandem repeats (STRs) are DNA sequences composed of repetitions of 1-6 bp motifs. Expansions of STRs are the cause of over 60 monogenic diseases, including Huntington’s disease, fragile X syndrome, and amyotrophic lateral sclerosis. In addition to their length, the pathogenicity of these STRs is impacted by sequence composition, methylation status and mosaicism. One such example is a repeat in an intron of the RFC1 gene, whose reference sequence consists of a short stretch of AAAAGs, while expansions that span hundreds of AAGGGs cause cerebellar ataxia with neuropathy and vestibular areflexia syndrome. Another example is the FMR1 repeat, whose expansions are typically hypermethylated. Detecting all the characteristics associated with pathogenic repeat expansions has traditionally required multiple assays; however, long-read sequencing of unamplified DNA holds the promise of resolving all of the required features in a single assay.

We describe a robust amplification-free protocol to generate long-read HiFi sequencing libraries containing a panel of loci associated with 20 pathogenic STR expansions. The protocol can be multiplexed to sequence 48 samples at up to 1000x coverage per locus in one sequencing run. To assess the accuracy of this protocol, we sequenced 129 samples with validated pathogenic expansions at 20 loci including CNBP, DMPK, RFC1 and C9orf72.

Combined, we tested 2580 sample-expansion combinations, including technical replicates, for expansions between 66 bp and >10 kb. Our assay correctly categorized all (129/129) expansions, including the detection of hypermethylation in the FMR1 expansion and differentiating the pathogenic AAGGG motif in RFC1. We identified additional expansions in FXN, RFC1 and TCF4, consistent with these loci having carrier frequencies between 1:50 and 1:20. Excluding these three genes, we found no unexpected expansions (0/2064) in any sample-locus combination.
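The counts quoted above can be checked with a few lines of arithmetic. This is a minimal sketch: the variable names are illustrative, and the breakdown of the 2064 denominator is an assumption (one reading that reproduces the quoted figure), not something the abstract states.

```python
# Counts quoted in the abstract.
n_samples = 129
n_loci = 20

# Every sample is assessed at every locus in the panel.
total_combinations = n_samples * n_loci

# ASSUMPTION (not stated in the abstract): the 0/2064 denominator drops
# the three loci with carrier expansions (FXN, RFC1, TCF4) and also
# drops one validated (expected) expansion per sample.
excluded_loci = 3
remaining = n_samples * (n_loci - excluded_loci) - n_samples

print(total_combinations)  # 129 * 20
print(remaining)           # 129 * 17 - 129
```

Under this reading, the first value matches the 2580 combinations tested and the second matches the 2064 combinations in which no unexpected expansion was found.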

We will also present a detailed characterization of lengths, sequence composition, mosaicism, and methylation of normal and expanded alleles in 150 genomes. Most repeats we profiled exhibit high genetic or epigenetic polymorphism and also mosaicism at the expanded size ranges. Motivated by these results, we describe a novel computational approach that will capture all these modalities to robustly differentiate between normal and abnormal variation at known pathogenic or any other repeats in the human genome. In summary, we will present a protocol and a set of computational methods for accurately assessing tissue-level molecular landscapes of various pathogenic STRs, which can be further adapted to other loci in the human genome.

ASHG 2024  |  2024

Sawfish: Improving long-read structural variant discovery and genotyping with local haplotype modeling

Christopher T. Saunders, James M. Holt, Daniel N. Baker, Juniper A. Lake, Jonathan R. Belyeu, Zev Kronenberg, William J. Rowell, Michael A. Eberle

We describe sawfish, a structural variant (SV) caller for mapped high-quality long reads. This method emphasizes assembly of local SV haplotypes and their utilization in downstream sample merging and genotyping steps, improving accuracy compared to variant-focused approaches in both individual and joint-genotyping contexts.

Assessing sawfish against the GIAB draft SV benchmark based on the T2T-HG002-Q100 diploid assembly shows substantial accuracy gains compared to pbsv and Sniffles2 on HiFi WGS 33x input, with a sawfish F1 score of 0.971 compared to 0.930 and 0.935 for pbsv and Sniffles2, respectively. This accuracy gain persists at lower depth, for example at 10x depth the sawfish F1 score is 0.937, compared to 0.857 and 0.882 for pbsv and Sniffles2. For SVs in the GIAB Challenging Medically Relevant Genes benchmark, sawfish has a combined false positive and false negative count of 4, compared to 19 and 15 with pbsv and Sniffles2, respectively.
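For readers less familiar with the metric, F1 is the harmonic mean of precision and recall, which simplifies to 2·TP / (2·TP + FP + FN). The sketch below shows the calculation; the function name and the example counts are hypothetical and are not taken from the benchmark.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), or 0.0 when undefined."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts for illustration only (not benchmark data):
# 90 true positives, 10 false positives, 10 false negatives.
example = f1_score(tp=90, fp=10, fn=10)
print(f"{example:.3f}")
```

Because false positives and false negatives enter the denominator symmetrically, a caller improves its F1 score only by reducing the combined error count relative to true positives.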

Sawfish also has higher genotype concordance in the Platinum Pedigree (CEPH-1463). Joint-genotyping accuracy was assessed on 10 HiFi WGS samples comprising the 2nd and 3rd pedigree generations, where the known inheritance pattern enables genotype accuracy assessment. From high genotype-quality calls, sawfish yields 27,811 concordant and 4,414 discordant SV alleles (86.3% concordance), where concordant alleles respectively represent 7.8 Mb and 11.9 Mb of deleted and inserted sequence. This substantially improves concordant allele count, length and percent concordance compared to the next most concordant method, Sniffles2, with 20,519 concordant and 7,645 discordant alleles (72.9% concordance), where concordant alleles represent 4.2 Mb and 5.6 Mb of deleted and inserted sequence.
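The concordance percentages follow directly from the allele counts quoted above, taking concordance as concordant / (concordant + discordant). A minimal sketch, with an illustrative function name:

```python
def concordance_pct(concordant: int, discordant: int) -> float:
    """Percent of assessed SV alleles that match the pedigree inheritance."""
    return 100.0 * concordant / (concordant + discordant)


# Allele counts as quoted in the abstract.
sawfish = concordance_pct(27_811, 4_414)
sniffles2 = concordance_pct(20_519, 7_645)

print(round(sawfish, 1), round(sniffles2, 1))
```

Rounded to one decimal place, these reproduce the 86.3% and 72.9% figures reported for sawfish and Sniffles2.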

As additional improvements, our assembly-focused approach allows all calls to be made with single-base precision, enabling breakpoint insertion and homology annotation for all SV types. Sawfish also assesses depth of large deletions and duplications to evaluate their consistency with its own expected GC-corrected depth model, improving precision for these large SV types. Through the combination of high genotyping accuracy, detailed breakpoint modeling, and joint assessment of breakpoint evidence with read depth, sawfish offers improved options for WGS sample analysis with high-quality long reads.

ASHG 2024  |  2024

StarPhase: Comprehensive Phase-Aware Pharmacogenomic Diplotyper for Long-Read Sequencing Data

James M. Holt, John Harting, Xiao Chen, Daniel Baker, Nina Gonzaludo, Zev Kronenberg, Christopher T. Saunders, Michael A. Eberle

Introduction: Pharmacogenomics (PGx) is critically important to precision medicine, informing the use of medications at an individual level, improving both safety and efficacy. PGx diplotyping relies on the ability to both accurately detect genomic variation and phase that variation onto distinct haplotypes, commonly referred to as “star (*) alleles”. PacBio HiFi sequencing provides long reads with highly accurate base-calling, enabling variant calling and phasing for both targeted and whole-genome sequencing approaches.

Methods: We developed StarPhase, a phase-aware tool for generating comprehensive PGx diplotype calls from PacBio HiFi sequencing datasets. StarPhase accepts both phased and unphased variant calls from a HiFi sequencing pipeline (e.g., DeepVariant followed by HiPhase) as well as an aligned BAM file to produce PGx diplotypes for 21 genes, including the complex genes HLA-A, HLA-B, and CYP2D6. In contrast to existing tools, StarPhase correctly handles genes that are fully phased as well as ambiguity from unphased variants.

Results: We compared StarPhase diplotype calls to those from other PGx diplotyping tools (PharmCAT, HiFiHLA, Pangu, and Cyrius) and to known diplotypes from GeT-RM. For simple PGx genes, StarPhase has a 98.24% concordance with PharmCAT, and all discrepancies were explained via manual inspection as either differences in reporting or corrections that StarPhase made relative to PharmCAT. For HLA-A and HLA-B, all StarPhase results were 100% concordant with the assembly-based results of HiFiHLA. Finally, StarPhase calls for CYP2D6 were 100% concordant for whole genome sequencing datasets, and 96% concordant for targeted sequencing. All discrepancies in the targeted sequencing were explained through either ambiguity in GeT-RM, errors in the comparator tool, or low coverage of hybrid alleles in the raw data (likely due to reduced capture rate). We further demonstrated the utility of StarPhase by applying it to CEPH pedigree 1463, consisting of 27 whole genome sequencing PacBio HiFi datasets from four generations. In this analysis, all diplotype calls across all 21 genes were inherited in a manner consistent with the pedigree.

ASHG 2024  |  2024

Visualize complex structural variants in HiFi data with SVTopo

Jonathan R Belyeu, William J Rowell, Juniper Lake, James M Holt, Zev N Kronenberg, Christopher T Saunders, Michael A Eberle

Structural variants (SVs) are alleles that differ from the reference genome by at least 50 nucleotides. SVs are common in the human genome and play a major role in both phenotypic diversity and human health. Many are deletions or duplications of genomic material resulting from a single non-reference end-joining event. These are easily identified and visualized from high quality long reads by existing software tools. Other SVs, which we define as complex SVs, are less easily categorized and remain difficult to interpret.

Complex SVs often lead to convoluted signals in both coverage and break-ends. Complex SVs may combine multiple copy-number alterations in tandem with duplicated inversions, creating genomic rearrangements that may visually appear as several nearby changes in coverage. Inversion events, often appearing with inconsistent coverage and mapping abnormalities, can be difficult to visualize with popular genome browsers in the best of cases but are even more challenging when appearing in tandem with CNVs.

SVTopo addresses the challenge with a dedicated complex variant plotting approach. It uses haplotagged HiFi reads to identify genome alignment breakpoints relative to a reference genome, connects these into multi-locus SVs via shared chimeric alignments, and presents the supporting evidence in easily understood images.
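The idea of connecting breakpoints into multi-locus events through reads that span more than one of them can be illustrated with a small union-find sketch. This is not SVTopo's implementation: the function name, the input shape (a mapping from read name to the breakpoints that read supports), and the breakpoint labels are all assumptions made for illustration.

```python
from collections import defaultdict


def group_breakpoints(read_to_breakpoints):
    """Group breakpoints that are linked, directly or transitively,
    by at least one read supporting both (hypothetical input format)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Breakpoints seen on the same read belong to the same event.
    for breakpoints in read_to_breakpoints.values():
        if not breakpoints:
            continue
        first = breakpoints[0]
        find(first)  # register breakpoints seen on only one read
        for other in breakpoints[1:]:
            union(first, other)

    groups = defaultdict(set)
    for bp in list(parent):
        groups[find(bp)].add(bp)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical reads and breakpoint labels, for illustration only.
reads = {
    "read1": ["chr1:1000", "chr1:5000"],
    "read2": ["chr1:5000", "chr2:300"],
    "read3": ["chr7:42"],
}
print(group_breakpoints(reads))
```

Here read1 and read2 share the breakpoint at chr1:5000, so all three of their breakpoints fall into one group, while the breakpoint on read3 stays on its own.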

In an analysis of complex SVs within the Platinum Pedigree (CEPH-1463, a family of 28 samples), SVTopo characterized 469 distinct complex SVs. 34 were solitary inversion events, and 112 were triplications. An additional 87 inversions were found with a flanking deletion on one or both sides, and 112 inversions were found with other complexities such as duplications of nearby sequences, the inverted sequence, or both. SVTopo also found 124 other SVs, such as non-tandem duplications, deletions followed by insertion of a non-tandem duplication, paired deletion/duplication events, and re-ordering of multiple genomic blocks. In many of these cases, SV callers using high quality long reads are able to identify individual SV components, but variant interpretation with SVTopo clarifies multiple calls into a single complex rearrangement.

The prevalence and complexity of these variants in many samples without genetic disorders make them an important but challenging target of human genomic research, highlighting both the importance of long reads for their identification and the role of targeted software tools like SVTopo in better understanding the signals they produce.

ASM 2024  |  2024

An Integrated Approach for Pathogen Detection, AMR Monitoring, and Functional Analysis in Wastewater

X. Cheng1, J. Wilkinson2, K. Ngo2, P. Baybayan2, Y. Kim2, P. Pham2, E. Carrasco1, S. Tang1, J. Shen1, and K. Locken1, Zymo Research Corporation1, Pacific Biosciences of California, Inc.2

Wastewater surveillance has emerged as a valuable epidemiological instrument in public health. In this study, we introduce an innovative, integrated methodology for the concurrent detection of pathogens, monitoring of antimicrobial resistance (AMR), and functional analysis in wastewater. Leveraging the advanced sample preparation solutions provided by Zymo Research and PacBio Onso short-read sequencing, we aimed to enhance our understanding of microbial dynamics in wastewater and provide a report for actionable insights for public health and water treatment facilities. Real wastewater samples from local treatment facilities were processed using Zymo Research nucleic acid purification technologies. The PacBio Onso short-read sequencing system, which yields Q40+ accuracy, captured a comprehensive microbial profile. Metagenomic sequence data was downsampled to 8 million reads per sample to be comparable to data obtained using the Illumina NextSeq 2000 system at the same depth, then analyzed using the Zymo wastewater analysis pipeline. By comparing the number of taxa, functional groups, and AMR, this study shows the potential of the high accuracy short-read PacBio Onso system for wastewater surveillance. The outcomes of this study offer multifaceted benefits for public health departments and water treatment facilities. Accurate pathogen detection enables the prediction of potential disease outbreaks, empowering public health authorities to implement proactive measures. AMR monitoring provides crucial insights into resistance gene prevalence, informing strategies against the spread of antibiotic resistance. Functional analysis delves deeper into the intricacies of microbial communities within wastewater, specifically highlighting Nitrogen removal species, Phosphorus-accumulating organisms (PAOs), Methylotrophs, Filamentous bacteria, and pathogens. 
This nuanced understanding empowers water treatment facilities to customize effective strategies, optimizing processes for contaminant removal and water safety. This research marks a significant stride in proactive public health management and water treatment process optimization for enhanced environmental and public well-being.
ASM 2024  |  2024

Long-read metagenome assembly produces hundreds of high-quality MAGs from wetland soil

Daniel M. Portik1, Luis E. Valentin-Alvarado2,3, Jeremy E. Wilkinson1, Jillian F. Banfield2 1. PacBio, 1305 O’Brien Dr, Menlo Park, California 94025 USA 2. Innovative Genomics Institute, University of California, Berkeley, California 94720 USA 3. Department of Plant and Microbial Ecology, University of California, Berkeley, California USA

Long-read sequencing has revolutionized metagenome assembly, overcoming many historical challenges. New metagenome assembly algorithms have been designed specifically for PacBio HiFi reads, which take advantage of high read accuracy (>Q20). These methods, including hifiasm-meta and metaMDBG, routinely produce long, circular contigs that represent complete bacterial and archaeal genomes. Following assembly, long-read-specific binning workflows can be used to identify metagenome-assembled genomes (MAGs) from the assembled contigs. The combination of new HiFi assembly methods and long-read binning workflows dramatically increases the number and quality of MAGs obtained. We demonstrate the power of these approaches by performing HiFi sequencing for a wetland soil sample on the Revio system. Our analysis resulted in over 1,200 MAGs, including more than 500 high-quality MAGs. A majority of MAGs were assigned to Bacteria, predominantly Acidobacteria, but also included representatives from many understudied taxa. Our genome-centric analysis unveiled a diverse and abundant archaeal presence in the deeper layers of seasonally inundated wetland soil. We successfully reconstructed complete genomes for representative lineages of many archaeal groups, including Asgard Archaea, Bathyarchaeia, Thaumarchaeota, Hadearchaeales, Methanomethyliales, and Micrarchaeia, along with their integrated and coexisting extrachromosomal elements (ECEs). Our findings underscore the importance of complete genomes in metabolic analysis and the identification and characterization of novel ECEs that contribute to genomic diversity and facilitate horizontal gene transfer. These results enhance our understanding of soil archaeal evolution. Furthermore, they provide valuable insights for future studies aiming to exploit the potential of elusive mobile genetic elements for editing uncultivated species in microbial communities.
ASM 2024  |  2024

Microbiome species profiling at scale with the Kinnex kit for full-length 16S rRNA sequencing

Jeremy E Wilkinson1, Jocelyne Bruand1, Khi Pin Chua1, Heather Ferrao1, Davy Lee1, Kristopher Locken2, Shuiquan Tang2, Ethan Thai2, John Sherman2, Brett Farthing2, Elizabeth Tseng1 1. PacBio, 1305 O’Brien Drive, Menlo Park, CA, USA 94025 2. Zymo Research Corporation, 17062 Murphy Avenue, Irvine, CA, USA 92614

Targeted 16S sequencing is a cost-effective approach for assessing the bacterial composition of metagenomic communities. This is especially true for low bacterial biomass samples where amplicon sequencing is the best option. However, the high similarity between the 16S rRNA genes of related bacteria means that sequencing the entirety of the 16S gene (~1.5 kb) with high accuracy is essential for species- or strain-level characterization. Recent comparative studies have shown that PacBio full-length (FL) 16S sequencing outperforms other sequencing methods for taxonomic resolution and data accuracy. The Kinnex 16S rRNA kit takes amplified 16S amplicons as input and outputs a sequencing-ready library that results in an up to 12-fold throughput increase compared to standard FL 16S libraries. The Kinnex 16S kit is based on the multiplexed array sequencing (MAS-Seq) method (Al’Khafaji et al., 2023) applied to FL 16S amplicons. The result is significantly higher throughput and reduced sequencing needs for high accuracy, cost-effective FL 16S sequencing with the ability to multiplex up to 1,536 amplicon samples per SMRT Cell. We tested the Kinnex 16S rRNA kit on a diverse range of samples (13 types) including mock communities, feces, skin swab, plant, veterinary wound swab, soil, vaginal swab, rhizosphere, and wastewater sludge. We then analyzed the data using a user-friendly bioinformatics pipeline, HiFi-16S-workflow, that provides a FASTQ-to-report analysis solution for FL 16S HiFi reads. The results show that Kinnex 16S sequencing can yield >30k average reads per sample at a 1,536-plex on a single Revio SMRT Cell or at a 768-plex on a Sequel IIe SMRT Cell. Comparing Kinnex 16S to standard FL 16S datasets, we found a high correlation and no bias in community compositions and were able to assign up to ~99% of denoised reads to species. In addition, because of the higher number of reads per sample, Kinnex 16S allows for more recovery of lower abundance species. 
With the Kinnex 16S rRNA kit, researchers may now multiplex more samples to dramatically reduce cost per sample or to profile each sample deeper with more reads/sample. The additional reads/sample along with better taxonomic resolution is advantageous for numerous environmental sample types which are often highly diverse, containing many microbial species.
ASM 2024  |  2024

New long-read metagenome assembly methods increase the number of high-quality MAGs from host and environmental microbiomes

Daniel J. Nasko1, Jeremy E. Wilkinson1, Daniel M. Portik1 1. PacBio, 1305 O’Brien Dr, Menlo Park, California 94025 USA

There are many challenges involved with metagenome assembly, including the presence of multiple species, uneven species abundances, and conserved genomic regions that are shared across species. Many of these historical challenges can be overcome using long-read sequencing. In particular, highly accurate PacBio HiFi reads generated from the Sequel IIe and Revio systems can provide major advantages for metagenome assembly. New metagenome assembly algorithms have been designed specifically for HiFi reads, including hifiasm-meta and metaMDBG, which take advantage of this high read accuracy. These algorithms are capable of producing long, circular contigs that represent complete bacterial or archaeal genomes. Long-read-specific binning workflows, such as the HiFi-MAG-Pipeline, can be used to process the assemblies in order to extract metagenome-assembled genomes (MAGs). The combination of new HiFi assembly methods and long-read binning workflows improves the total number of MAGs obtained, while also improving the completeness and contiguity of individual MAGs. We demonstrate the power of using these approaches for a variety of microbiomes. We show that more complete, circular MAGs are routinely produced from HiFi metagenome assemblies. We find that metaMDBG results in up to a 200% increase in total MAGs relative to hifiasm-meta. It also produces more small (i.e., <200 Kbp) circular contigs that likely represent plasmids, viruses, and other mobile elements. Overall, we demonstrate that HiFi sequencing can be used to obtain many high-quality MAGs from a variety of microbiomes.
ESHG 2024  |  2024

Assessment of read depth requirements for gene and isoform discovery: a comparative study of long-read and short-read RNA sequencing data in human heart and brain

Nina Gonzaludo*1, Jocelyne Bruand1, Amy Klegarth1, Jason Underwood1, Elizabeth Tseng1, Birth Defects Research Laboratory2, Kimberly A. Aldinger3 1. PacBio, Menlo Park, CA, USA, 2. University of Washington, Seattle, WA, USA, 3. Seattle Children’s Research Institute, Seattle, WA, USA

The PacBio full-length Kinnex RNA kit provides complete transcript coverage for isoform and gene discovery in tissues of interest, enabling understanding of biology and disease. With Kinnex, fewer long reads are needed for gene discovery compared to short-read RNA-seq. The majority of known genes and isoforms can be discovered using full-length Kinnex kits at 10-20M reads per sample, suggesting multiplexing may be a cost-effective yet comprehensive option.
ESHG 2024  |  2024

PureTarget: An amplification-free workflow for genetic and epigenetic profiling of short tandem repeat expansions

Guilherme De Sena Brandine, Valeriya Gaysinskaya, Janet Aiyedun, Julian Rocha, Duncan Kilburn, Sarah Kingan, Egor Dolzhenko, Zoi Kontogeorgiou, Anita Szabo, Christina Zarouchlioti, Robert Thaenert, Fabio Fuligni, Aidan Hennigan, Chelsea Roselund, Alesia Piselli, Pilar Alvarez Jerez, Kimberley Billingsley, Sonia Lameiras, Sylvain Baulande, Petra Liskova, Alice Davidson, Georgios Koutsis, Georgia Karadima, Stéphanie Tomé, Michael Eberle

Background/Objectives: Short tandem repeats (STRs) are DNA sequences composed of multiple copies of 1-6 bp motifs. STRs are ubiquitous in the human genome, and some are prone to expansions that cause disease. Many pathogenic STR expansions are variable in length and sequence composition, both at the individual and population scale. Motif composition, sequence length and CpG methylation often dictate disease onset and severity. An ideal approach to identifying pathogenic STR expansions requires accurate assessments of all these properties in a single assay.

Methods: We describe a robust amplification-free protocol to generate long-read HiFi sequencing libraries containing a panel of loci associated with 20 pathogenic STR expansions. The protocol can be multiplexed to sequence up to 48 samples at 20 to 1000x coverage per locus in one PacBio sequencing run. We measured accuracy in 129 samples with validated pathogenic expansions at loci including CNBP, DMPK, RFC1 and C9orf72. We tested 1720 sample-expansion combinations, including technical replicates, for expansions between 66 bp and >10 kb.

Results: Our assay correctly categorized all (129/129) expansions, including the detection of hypermethylation in the FMR1 expansion and differentiating the pathogenic AAGGG motif in RFC1. We identified additional expansions in FXN, RFC1 and TCF4, consistent with these loci having carrier frequencies between 1:50 and 1:20. Excluding these three genes, we found no unexpected expansions (0/2193) in any sample-locus combination.

Conclusions: The protocol provides an accurate description of the tissue-level molecular landscape of various pathogenic STRs and is adaptable to other loci in the human genome.
AACR 2024  |  2024

Improved detection of low frequency mutations in ovarian and endometrial cancers by utilizing a highly accurate sequencing platform

T. Revil1, N. Pezeshkian2, L. Gilbert1, A. Sockell2, J. Ragoussis1; 1) McGill University, Quebec, QC, Canada, 2) PacBio, Menlo Park, CA

Ovarian and endometrial cancers rank within the top four for incident cancers as well as deaths in North American women. Cure rates have not improved in 30 years as high-grade subtypes continue to be diagnosed in Stage III/IV. Attempts at early diagnosis have failed because high-grade cancer cells exfoliate and metastasize while the primary cancer is small and undetectable by existing tests based on imaging and blood-based tumor markers. DOvEEgene (Detecting Ovarian and Endometrial cancers Early using genomics) is a genomic uterine pap test developed by a McGill team to screen and detect these cancers while they are confined to the gynecologic organs and curable by surgery. The test identifies pathogenic somatic mutations in uterine brush samples. Here we tested the Onso system, a highly accurate sequencing technology from PacBio, in order to potentially increase sensitivity while driving down sequencing costs by reducing required sequencing depth vs the current NGS standard. A highly sensitive, error-correcting capture technology (DOvEEgene-SureSelectHS) utilizing duplex error correction sequencing interrogated the exons of 23 genes involved in the development of sporadic and hereditary ovarian and endometrial cancers. We applied a combination of germline gene panel testing on saliva samples with deep duplex sequencing to detect somatic mutations at <0.1% VAF, interrogation of microsatellite loci for instability, and low coverage WGS for copy number analysis of uterine brush samples. We sequenced 20 duplex Illumina sequencing libraries produced using the DovEE assay at PE 100bp mode and compared Onso data in non-duplex sequencing mode as well as duplex sequencing mode to the original Illumina duplex sequencing method. Here, we present this comparison and highlight the benefits of high accuracy sequencing for the detection of very low frequency (<0.1%) somatic mutations.
We observed improved mismatch rates for Onso data compared to Illumina, even after duplex error correction was applied. In addition, we found that more individuals are called as displaying microsatellite instability from the Onso data, which may be due to improved sequencing performance in repetitive regions for Onso. Finally, we observed fewer potential false positive variant calls in the Onso data, highlighting the value of improved sequencing accuracy for rare variant detection.
AACR 2024  |  2024

Improved liquid biopsy assay performance using sequencing by binding (SBB) on the Onso system

D. Nasko, P. Pham, S. Joshi, K. Kim, N. Pezeshkian, Y. Kim, A. Sockell, J. Korlach; PacBio, Menlo Park, CA

Liquid biopsy is revolutionizing the field of early cancer detection research through non-invasive detection of tumor DNA in the blood. However, existing liquid biopsy assays are limited in their sensitivity for ctDNA detection at low variant allele frequencies (VAFs), with most relying on extreme sequencing depth and computational error correction to separate the true ctDNA signal from background errors. This limitation is particularly problematic in the area of early cancer detection, in which expected ctDNA allele frequencies are extremely low. Novel strategies are therefore needed to help improve liquid biopsy assay sensitivity and reduce per-sample sequencing requirements. Here we describe PacBio’s application of the Onso short-read sequencing system to enable detection of ctDNA at low VAFs using the SeraCare Complete ctDNA Mutation Mix reference standard. The Onso system makes use of a novel sequencing by binding (SBB) method to achieve up to 15x greater quality scores, with ≥90% of reads at Q40 or above. We performed targeted capture and sequencing of libraries prepared from the SeraCare reference mix diluted into WT human DNA at the following VAFs: 0.00% (WT), 0.05%, 0.10%, 0.25%, and 0.50%, and compared the sensitivity of SBB at each VAF against a competitor method using sequencing by synthesis (SBS) at varying sequencing depths. We observed superior sensitivity for ctDNA detection at low VAFs (0.05%, 0.1%) using SBB at half the sequencing depth compared to SBS, in part due to reduced false positive calling in the WT sample for SBB. Furthermore, SBB was able to achieve comparable sensitivity results to SBS using four-fold less sequencing, and without the use of computational error correction. Finally, combining SBB with computational error-correction methods boosted sensitivity even further, suggesting an additive value for these technologies.
Taken together, our results demonstrate the potential of SBB to improve upon existing methods of liquid biopsy and better enable research on early cancer detection.
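To put the Q40 figure in context, a Phred quality score Q corresponds to a per-base error probability of 10^(-Q/10). The sketch below applies that standard conversion; the function name is illustrative, and the comparison with Q30 is added here for reference rather than taken from the abstract.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


for q in (30, 40):
    p = phred_to_error_prob(q)
    print(f"Q{q}: about 1 error in {round(1 / p):,} bases ({p:.4%})")
```

At Q40 the expected per-base error rate is 0.01%, which is below the lowest non-zero VAF tested in the abstract (0.05%), whereas at Q30 it is 0.1%.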