TY - JOUR
T1 - Efficient ancestry and mutation simulation with msprime 1.0
AU - Baumdicker, Franz
AU - Bisschop, Gertjan
AU - Goldstein, Daniel
AU - Gower, Graham
AU - Ragsdale, Aaron P.
AU - Tsambos, Georgia
AU - Zhu, Sha
AU - Eldon, Bjarki
AU - Ellerman, E. Castedo
AU - Galloway, Jared G.
AU - Gladstein, Ariella L.
AU - Gorjanc, Gregor
AU - Guo, Bing
AU - Jeffery, Ben
AU - Kretzschmar, Warren W.
AU - Lohse, Konrad
AU - Matschiner, Michael
AU - Nelson, Dominic
AU - Pope, Nathaniel S.
AU - Quinto-Cortes, Consuelo D.
AU - Rodrigues, Murillo F.
AU - Saunack, Kumar
AU - Sellinger, Thibaut
AU - Thornton, Kevin
AU - van Kemenade, Hugo
AU - Wohns, Anthony W.
AU - Wong, Yan
AU - Gravel, Simon
AU - Kern, Andrew D.
AU - Koskela, Jere
AU - Ralph, Peter L.
AU - Kelleher, Jerome
N1 - Publisher Copyright: © The Author(s) 2021.
PY - 2022
Y1 - 2022
N2 - Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime's many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.
KW - Ancestral Recombination Graphs
KW - coalescent
KW - mutations
KW - Simulation
U2 - 10.1093/genetics/iyab229
DO - 10.1093/genetics/iyab229
M3 - Journal article
C2 - 34897427
AN - SCOPUS:85125682611
VL - 220
JO - Genetics
JF - Genetics
SN - 1943-2631
IS - 3
M1 - iyab229
ER -