July 12, 2022

Integrated heteroduplex correction in PacBio’s circular consensus algorithm

Author(s): Derek Barnett¹, John Harting¹, Walter Lee¹, Armin Töpfer¹, Fritz Sedlazeck², Jenny Ekholm¹, Nina Gonzaludo¹, Justin Blethrow¹, James Drake¹, Zev Kronenberg¹

¹Pacific Biosciences (PacBio), Menlo Park, United States; ²Baylor College of Medicine, Human Genome Sequencing Center, Houston, United States

Background/Objectives:

A heteroduplex is a double-stranded molecule composed of two strands that are not fully complementary; it can form during PCR. These mixed-template artifacts produce misleading results in downstream analysis, e.g., false haplotypes during diplotyping. Unlike short-read technologies, PacBio Single-Molecule Real-Time sequencing produces strand-level base calls, so heteroduplex signatures can be directly observed and corrected using the stranded subread data. Our new method is integrated into the circular consensus sequence algorithm, which generates accurate HiFi data from subreads.

Methods:

The circular consensus sequence (CCS) algorithm transforms PacBio subreads into high-accuracy HiFi reads. During CCS, an intermediate draft sequence is generated, and subreads are mapped and aligned to the draft. The heteroduplex algorithm (hd-finder) takes the subread alignments and generates a read pileup from which variants are identified. At each site, the bases are sorted and counted by strand. The resulting 2×2 table of counts is subjected to Fisher’s exact test. The fraction of significant sites across the draft is used to determine whether a read is a heteroduplex. Heteroduplex-flagged reads are split by strand and reprocessed, resulting in two HiFi reads, one for each strand.
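The per-site strand test described above can be sketched roughly as follows. This is an illustrative sketch only, not the hd-finder implementation: the function names, the input layout (counts of the two most common bases on each strand), the significance level `alpha`, the cutoff `min_fraction`, and the choice of draft length as the denominator are all assumptions made for the example. Fisher's exact test is written out with the standard library so the block is self-contained.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def fraction_significant(sites, draft_length, alpha=0.01):
    """sites: (fwd_base1, fwd_base2, rev_base1, rev_base2) counts per variant site."""
    significant = sum(
        1 for fwd1, fwd2, rev1, rev2 in sites
        if fisher_exact_two_sided(fwd1, fwd2, rev1, rev2) < alpha
    )
    # Assumed denominator: the length of the draft sequence.
    return significant / draft_length


def is_heteroduplex(sites, draft_length, alpha=0.01, min_fraction=0.01):
    # Hypothetical decision rule: flag when enough sites show strand bias.
    return fraction_significant(sites, draft_length, alpha) >= min_fraction
```

For example, with three sites where each strand carries a different base in all ten passes, plus one balanced site, `is_heteroduplex([(10, 0, 0, 10)] * 3 + [(5, 5, 5, 5)], draft_length=100)` returns `True` under these assumed thresholds, because 3 of 100 draft positions show significant strand bias.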

Results:

We demonstrate that the accuracy of the hd-finder algorithm is >94% using a heteroduplex-enriched amplicon library. We also show that applying hd-finder to amplified datasets improves the quality of downstream analysis of important human genes.

Conclusion:

The heteroduplex algorithm is a powerful new method for improving HiFi data for amplicon targets. The method has been released (v6.3) and is documented at https://ccs.how/faq/mode-heteroduplex-filtering.html.

Organization: PacBio
Year: 2022

