PacBio HiFi long-read sequencing is one of the most advanced sequencing technologies for generating groundbreaking discoveries that can expand our understanding of human health. In case you missed it, we recently published a round-up of what we consider the most significant new research publications to employ PacBio sequencing technology. As covered in the round-up, August saw researchers achieve two exciting human genomics milestones that we think warrant some additional discussion.
Scientists with the Telomere-to-Telomere (T2T) Consortium and the Human Genome Structural Variation Consortium (HGSVC) both published the results of pioneering genomic research in the August 23 issue of the journal Nature. Tied together by a common focus, the two publications tackled the last remaining human chromosome to be fully sequenced, the Y chromosome.
The Y chromosome is now complete
The human Y chromosome plays a pivotal role in male sexual development and fertility. Despite the Y chromosome's high biomedical research value, its highly repetitive sequence has long made it difficult to sequence with conventional short-read methods. To get around this technical roadblock, an international research team led by the Telomere-to-Telomere (T2T) Consortium employed a combination of sequencing technologies that included PacBio HiFi long reads. The potent combination of read length (15-20 kb) and accuracy (Q30+) that defines HiFi sequencing was instrumental in enabling T2T scientists to access missing gDNA and RNA isoform information in the Y’s highly homologous regions. As a result, the team was able to fill the gaps that had previously comprised almost 50% of the chromosome’s sequence.
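For readers less familiar with quality scores, the "Q30+" figure above can be unpacked with the standard Phred relationship, in which a score Q corresponds to an error probability of 10^(-Q/10). The short Python sketch below is only an illustration: the function names are hypothetical, and treating a single Q value as a uniform per-base error rate across a whole read is a simplifying assumption, not a description of how HiFi read quality is actually computed.

```python
def phred_to_error_probability(q: float) -> float:
    """Standard Phred definition: P(error) = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q: float) -> float:
    """Accuracy is the complement of the error probability."""
    return 1.0 - phred_to_error_probability(q)


def expected_errors(read_length_bp: int, q: float) -> float:
    """Expected erroneous bases in a read, assuming (for illustration only)
    that every base shares the same error probability."""
    return read_length_bp * phred_to_error_probability(q)


if __name__ == "__main__":
    # Q30 is the lower bound quoted in the text; Q40 is shown for contrast.
    for q in (30, 40):
        print(f"Q{q}: accuracy = {phred_to_accuracy(q):.4%}")
    # Read lengths taken from the 15-20 kb range quoted in the text.
    for length in (15_000, 20_000):
        print(f"{length} bp at Q30: ~{expected_errors(length, 30):.0f} expected errors")
```

Under these assumptions, Q30 corresponds to 99.9% accuracy, so a 15-20 kb read at exactly Q30 would carry on the order of 15 to 20 errors; because the quoted figure is a floor ("Q30+"), typical reads would be expected to do better.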
This new assembly spans nearly 62.5 million base pairs and is the first complete, gapless human Y chromosome ever generated. Of the 62.5 million bases, 30 million were new to science, and project contributors also identified 41 predicted protein-coding genes. Previously unappreciated gene regulatory networks and structural rearrangements on the Y chromosome were also discovered, shedding light on male infertility and related conditions. This newly completed Y chromosome sequence is likely to be a valuable resource for researchers launching investigations into everything from sexual development and fertility to genealogy, cancer, evolution, and more.
Hidden Y chromosome variation is brought to light
In a second, allied research article, HGSVC scientists presented the results of a remarkably comprehensive pangenomic analysis of Y chromosomes from 43 individuals with diverse genetic backgrounds. The authors employed PacBio HiFi sequencing as the study’s foundation, along with other advanced techniques, to reveal some astounding insights into the characteristics of the Y chromosome at the population level.
Through their analysis, the team found large inversions and distinct mutation rates in male-specific sequences, among other forms of recurrent genetic variation. Most surprisingly, the study revealed that the size of the Y can vary from individual to individual by nearly two-fold, with a size range of approximately 45.2 to 84.9 Mbp! Diving deeper into this groundbreaking dataset, researchers were able to trace approximately 183,000 years of human evolution to reveal even more hidden complexity and remarkable differences in the size and structure of the Y chromosome. Interestingly, the HGSVC’s analysis pointed to low levels of base substitution, suggesting that most of the variation housed within the Y chromosome is likely to be structural in nature.
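As a quick sanity check on the "nearly two-fold" description, the ratio can be computed directly from the two endpoints quoted above. This is a minimal sketch using only those two rounded figures; the function name is hypothetical, and no per-sample data from the study is used or implied.

```python
def size_range_summary(smallest_mbp: float, largest_mbp: float) -> tuple[float, float]:
    """Return (fold difference, absolute difference in Mbp) between two sizes."""
    fold = largest_mbp / smallest_mbp
    difference_mbp = largest_mbp - smallest_mbp
    return fold, difference_mbp


if __name__ == "__main__":
    # Endpoints as quoted in the text (approximate values).
    fold, difference = size_range_summary(45.2, 84.9)
    print(f"Fold difference: {fold:.2f}x")      # prints "Fold difference: 1.88x"
    print(f"Absolute difference: {difference:.1f} Mbp")  # prints "Absolute difference: 39.7 Mbp"
```

The largest assembly is therefore roughly 1.88 times the size of the smallest, a gap of about 39.7 Mbp, which is consistent with the "nearly two-fold" wording.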
Enhanced capabilities make for new opportunities to discover
Together, these studies constitute a significant advancement in human genetics research, offering us a deeper understanding of the Y chromosome’s structure and variability with important implications for future work on human health and evolution. Both used PacBio long-read sequencing as an essential tool in generating their findings, and both underscore the value of highly accurate long reads for high-impact genomic research. As the conclusions on Y chromosome variability show, there is tremendous potential for undiscovered genetic variation to be found at a structural level in humans and other organisms. With the successful implementation of HiFi long-read sequencing, scientists everywhere now have the means to access this previously inaccessible type of genomic information.
Interested in using PacBio sequencing for your own research?
Connect with a PacBio scientist. We would love to hear from you!