Estimating Gene Conversion Tract Length and Rate From PacBio HiFi Data

Anders Poulsen Charmouh, Peter Sørud Porsborg, Lasse Thorup Hansen, Søren Besenbacher, Sofia Boeg Winge, Kristian Almstrup, Asger Hobolth, Thomas Bataillon, Mikkel Heide Schierup

Publication: Contribution to journal › Journal article › Research › peer-reviewed

1 citation (Scopus)
5 downloads (Pure)

Abstract

Gene conversions are broadly defined as the transfer of genetic material from a "donor" to an "acceptor" sequence and can happen both in meiosis and mitosis. They are a subset of noncrossover (NCO) events and, like crossover (CO) events, gene conversion can generate new combinations of alleles and counteract mutation load by reverting germline mutations through GC-biased gene conversion. Estimating gene conversion rate and the distribution of gene conversion tract lengths remains challenging. We present a new method for estimating tract length, rate, and detection probability of NCO events directly in HiFi PacBio long read data. The method can be used to make inference from sequencing of gametes from a single individual. The method is unbiased even under low single nucleotide variant (SNV) densities and does not necessitate any demographic or evolutionary assumptions. We test the accuracy and robustness of our method using simulated datasets where we vary length of tracts, number of tracts, the genomic SNV density, and levels of correlation between SNV density and NCO event position. Our simulations show that under low SNV densities, like those found in humans, only a minute fraction (∼2%) of NCO events are expected to become visible as gene conversions by moving at least 1 SNV. We finally illustrate our method by applying it to PacBio sequencing data from human sperm.
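
The claim that only a small fraction of NCO events move at least one SNV can be illustrated with a toy calculation. The sketch below is not the authors' method: it assumes, purely for illustration, that tract lengths are geometrically distributed and that each site is independently heterozygous with a fixed probability. The function names and the parameter values (a mean tract length of 30 bp and an SNV density of 0.001 per site) are hypothetical and are not taken from the article.

```python
import random


def visible_fraction_analytic(mean_tract_len, snv_density):
    """Closed-form P(tract covers >= 1 SNV) under the toy model.

    Assumes tract length L ~ Geometric(p) on {1, 2, ...} with p = 1/mean,
    and each site heterozygous independently with probability snv_density.
    Then P(visible) = 1 - E[(1 - d)^L] = d / (d + p * (1 - d)).
    """
    p = 1.0 / mean_tract_len
    d = snv_density
    return d / (d + p * (1.0 - d))


def visible_fraction_simulated(mean_tract_len, snv_density, n_events, seed=0):
    """Monte Carlo estimate of the same quantity under the same assumptions."""
    rng = random.Random(seed)
    p = 1.0 / mean_tract_len
    visible = 0
    for _ in range(n_events):
        # Draw a geometric tract length (number of trials until first success).
        length = 1
        while rng.random() >= p:
            length += 1
        # The event is visible if any covered site carries an SNV.
        if any(rng.random() < snv_density for _ in range(length)):
            visible += 1
    return visible / n_events


if __name__ == "__main__":
    # Hypothetical illustrative values, not estimates from the article.
    print(visible_fraction_analytic(30, 0.001))
    print(visible_fraction_simulated(30, 0.001, n_events=50_000))
```

Under these illustrative values the closed form gives roughly 2.9%, which shows how short tracts combined with sparse SNVs leave most NCO events undetectable. The article's own estimates rest on its own model and simulations, which additionally consider correlation between SNV density and NCO event position.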

Original language: English
Article number: msaf019
Journal: Molecular Biology and Evolution
Volume: 42
Issue number: 2
Number of pages: 11
ISSN: 0737-4038
DOI
Status: Published - 2025
Published externally: Yes

Bibliographical note

© The Author(s) 2025. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.
