msprime

1.4.1 verified Fri May 01 auth: no python

msprime is a fast, scalable Python library for simulating genealogical trees (tree sequences) and genomic sequence data under population genetic models, primarily the coalescent with recombination. The current version, 1.4.1, requires Python >=3.11. It is released under the GPL and maintained by the tskit-dev team.

pip install msprime
error AttributeError: module 'msprime' has no attribute 'simulate'
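cause `msprime.simulate` exists in both 0.x and 1.x releases, so this usually means Python is importing something other than the installed package (for example, a local file or directory named `msprime`), or the attribute name has a typo.
fix
Check `msprime.__file__` to see what is being imported, rename any shadowing file, and reinstall if needed: pip install --upgrade msprime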
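error TypeError: sim_ancestry() got an unexpected keyword argument 'Ne'
cause Old code ported to `msprime.sim_ancestry()` still passes `Ne`. Only the legacy `msprime.simulate()` accepts `Ne`; `sim_ancestry()` takes `population_size`.
fix
Replace Ne with population_size when calling sim_ancestry(). See migration guide.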
error ValueError: sample_size must be a list if there is more than one population
cause Passing a single int for sample_size when multiple populations are defined.
fix
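Give one sample size per population. With the legacy API, set `sample_size` on each `msprime.PopulationConfiguration` passed via `population_configurations`. With `msprime.sim_ancestry()`, pass a mapping such as `samples={"A": 5, "B": 5}` together with a `demography`.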
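breaking In msprime 1.0, the API was overhauled: `msprime.sim_ancestry()` and `msprime.sim_mutations()` replace `msprime.simulate()` and `msprime.mutate()`. The legacy functions still work, but the new ones use different parameter names: `sim_ancestry()` takes `population_size` instead of `Ne`, `samples` instead of `sample_size`, and `sequence_length` instead of `length`.
fix When porting to `sim_ancestry()`, replace `Ne=X` with `population_size=X`, `sample_size` with `samples`, and `length` with `sequence_length`. Note that `samples` counts individuals (diploid by default), not sampled chromosomes.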
deprecated `msprime.simulate()` with the `mutation_rate` parameter and `msprime.mutate()` belong to the legacy (0.x) API. New code should add mutations with `msprime.sim_mutations()` applied to a tree sequence.
fix Run `ts = msprime.sim_ancestry(...)` then `ts = msprime.sim_mutations(ts, rate=1e-8)`. See docs for details.
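gotcha Simulations run with the same `random_seed` and the same parameters produce identical results. Parallel simulations that share a seed are therefore duplicates, not independent replicates.
fix Give each process its own seed. Drawing the seeds in advance from a single master seed keeps runs reproducible, which seeds derived from the process ID do not. Valid seeds are integers from 1 to 2^32 - 1.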
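gotcha With multiple populations, sample sizes must be given per population, not as a single total. In the legacy `msprime.simulate()` API they are set on each `msprime.PopulationConfiguration`; in `msprime.sim_ancestry()` they are given through `samples`. Incorrect usage leads to confusing errors.
fix For two populations of 5 samples each, use `PopulationConfiguration(sample_size=5)` twice (legacy) or `samples={"A": 5, "B": 5}` (1.x). Do not pass the total of 10.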
breaking Support for Python 3.8 and 3.9 dropped in msprime 1.4.0. Requires Python >=3.11 as of 1.4.1.
fix Upgrade Python to 3.11 or later.
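
For the `random_seed` gotcha above, one way to give parallel runs distinct but reproducible seeds is to draw them up front from a single master seed. This is a minimal sketch using only the standard library; `replicate_seeds` and `master_seed` are hypothetical names, and the bound of 2^32 - 1 is the upper end of the seed range that msprime documents.

```python
import random

MAX_SEED = 2**32 - 1  # msprime accepts seeds in the range 1..2**32 - 1


def replicate_seeds(master_seed, num_replicates):
    """Return num_replicates distinct seeds, reproducible from master_seed."""
    rng = random.Random(master_seed)
    # sample() draws without replacement, so no two replicates share a seed.
    return rng.sample(range(1, MAX_SEED + 1), num_replicates)


seeds = replicate_seeds(master_seed=42, num_replicates=4)
print(seeds)
```

Each value would then be passed as `random_seed=` to one simulation call in its own process.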

Simulate a basic coalescent tree sequence with recombination.

import msprime
# Legacy (0.x-style) API, still supported in 1.x: simulate 10 sampled chromosomes
recombination_rate = 1e-8
sequence_length = 1e4
ts = msprime.simulate(
    sample_size=10,
    recombination_rate=recombination_rate,
    length=sequence_length,
    random_seed=42
)
print(ts.num_trees, ts.num_samples)