PacBio HiFi sequencing technology continues to be the tool of choice for genomics professionals working at the forefront of discovery, enabling them to pursue new avenues of exploration across diverse domains of biology.
In this edition of our Powered by PacBio blog series, we feature scientific papers from January 2024. These publications highlight the power of PacBio sequencing for unraveling the complexities of microbiomes, pangenomes, cancer transcriptomes, and more.
Metagenomics
High-quality metagenome assembly from long accurate reads with metaMDBG
An international collaboration of researchers from the UK, France, and the US National Human Genome Research Institute has developed metaMDBG, a metagenomics assembler for PacBio HiFi reads.
Key takeaways:
- The authors point out that, when analyzing complex communities using metaMDBG, they obtained “up to twice as many high-quality circularized prokaryotic metagenome-assembled genomes as existing methods and had better recovery of viruses and plasmids.”
- In their preliminary compute-performance testing, the team noted that metaMDBG “was 1.5 to 12 times faster than the state of the art and required between one-tenth and one-thirtieth of the memory.”
- In discussing the biological value of the data generated with metaMDBG, the authors reported “better recovery of low-abundance organisms” and, for some samples, “a collection of near-complete MAGs that can map over 50% of reads.”
- They also noted that “the phyla that are unique to the metaMDBG near-complete MAGs include recently discovered phyla, with no cultured representatives.”
Tandem repeats
Characterization and visualization of tandem repeats at genome scale
A team of scientists and bioinformaticians at PacBio, Baylor, University of Utah, University of Miami, CMKC, Shriners, UC Davis, Emory, and the MIND Institute has now formally published a description of TRGT, a dedicated PacBio tool for tandem repeat (TR) genotyping and visualization.
In discussing TRGT’s capabilities, Dolzhenko et al. noted that “In six samples with known repeat expansions, TRGT detected all expansions while also identifying methylation signals and mosaicism and providing finer repeat length resolution than existing methods.” The team also released an accompanying TR database containing allele sequences and methylation levels for 937,122 TRs across 100 genomes.
Agrigenomics/pan-transcriptomes
An international team representing 19 institutions has released “a wheat pan-transcriptome with de novo annotation and differential expression analysis for nine wheat cultivars, across multiple different tissues.”
Key takeaways:
- In providing context for what motivated the project, the team pointed out that “polyploidy leads to complex effects on gene expression resulting from structural variation, gene duplication, deletion and neofunctionalization, ultimately increasing variation in gene expression and the plasticity of the species.” They also noted that “to date, studies of plant pan-transcriptomes either rely on read alignment to a single reference genome which can result in reference bias or generate de novo transcript assemblies from short read-data that can accumulate errors and technical artifacts.”
- With the novelty of their aims established, the group generated “de novo gene annotations, incorporating long reads for the nine assembled wheat cultivars, providing a valuable resource for wheat researchers and breeders.” From their results, the scientists found “evidence of widespread gene duplication and deletion, revealing the population structures imposed by repeated hybridizations from wild relatives and different breeding programs.”
- The authors concluded the preprint by stating: “We define the hexaploid wheat core and dispensable transcriptome and our analysis of gene expression and gene networks across different tissues and between cultivars reveals conservation and divergence in expression balance across homoeologous sub-genomes.”
Cancer transcriptomics
A team of scientists from Germany applied the Iso-Seq method to samples from 44 cancer patients with and without SF3B1 mutations (which can affect cancer prognosis) and found that more than 60% of the isoforms detected were novel. The group’s work highlights “the importance of long-read sequencing for comprehensive transcriptome profiling, particularly for detecting novel splice variants and alternative splicing events.” In addition to providing mechanistic insights into mutation effects, the study represents the “most complete long-read transcriptome sequencing study in chronic lymphocytic leukemia (CLL) and myelodysplastic syndromes (MDS) and provides a resource to study aberrant splicing in cancer.”
Cancer
Microsatellite break-induced replication generates highly mutagenized extrachromosomal circular DNAs
Researchers at Wright State University used inverse PCR and targeted PacBio HiFi sequencing to analyze the formation of extrachromosomal circular DNAs (eccDNAs) that are associated with the formation of certain cancers. The team investigated eccDNAs linked to four different microsatellites (G4, H-DNA, hairpin, and AT-rich DNA).
Key takeaways:
- “Extrachromosomal circular DNAs (eccDNAs) … have been implicated in oncogenesis, neoantigen production and resistance to chemotherapy.”
- “each of these microsatellites produces eccDNAs containing unique template switching events which are recurrent, nonrandom, and distinct from those of the other microsatellites.”
- “eccDNA recombinants are mutagenized at ∼1000-fold the wild-type rate. The microsatellite repeats themselves are hotspots for mutagenesis, including deletions, insertions, and base substitutions. Repeat-induced mutagenesis extends more than 5-10 kb bidirectionally from each microsatellite.”
Ready to kickstart breakthroughs of your own?
These recent publications exemplify the versatility and power of PacBio sequencing. From bioinformatics advances to the elucidation of complex cancer biology, PacBio technology is enabling scientific pioneers to make innovative breakthroughs like never before.
PacBio sequencing is now more accessible for research teams of all sizes, thanks to new options for instrument financing or collaboration with certified service providers. To learn how to incorporate PacBio data into your next project:
Connect with a PacBio scientist