tskit
1.0.2 verified Fri May 01 auth: no python
tskit is the tree sequence toolkit for storing, manipulating, and analyzing genealogical trees and genetic variation. The current version is 1.0.2. Originally developed as part of msprime, it has a slow release cadence, with maintenance releases as needed. Python >=3.11 is required as of version 1.0.2.
pip install tskit
Common errors
error AttributeError: module 'tskit' has no attribute 'Load'
cause You typed tskit.Load instead of tskit.load. Python attribute names are case-sensitive, and the function is tskit.load (lowercase l).
fix
Use tskit.load('file.trees')
error _tskit.LibraryError: Bad mutation parent
cause The mutation parent column is inconsistent: a parent mutation is at a different site from its child, or is not ancestral to the child in the tree at that site.
fix
Check and correct the mutation.parent column in the tables. A parent mutation must be at the same site as its child, appear earlier in the mutation table, and lie on the same branch or on an ancestral branch.
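error ValueError: The population ID must be an integer, not a string
cause Passing a population name string to ts.samples(population='pop_name').
fix
Use integer IDs. ts.population() also takes an integer ID, so look the name up in the population metadata (for example, match metadata['name'] while iterating over ts.populations()), or pass the ID directly, e.g. population=0.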
error ValueError: The number of nodes does not match
cause Attempting to load a tree sequence from tables with missing or extra nodes (e.g., node table not sorted correctly).
fix
Ensure node table has the correct number of nodes and that edges reference existing nodes.
Warnings
breaking TreeSequence.tables now returns a zero-copy immutable view (since 1.0.0b3). To modify tables, use TreeSequence.dump_tables() first.
fix Replace ts.tables with ts.dump_tables() if you need to mutate.
deprecated ts.samples(population=...) now raises ValueError if the population argument is a string (e.g. a population name) instead of silently returning no samples (since 1.0.1).
fix Use population IDs (integers) or map names to IDs first.
gotcha Mutation parents must be consistent with tree topology. Invalid mutation parent relationships cause LibraryError when constructing a TreeSequence from tables.
fix Ensure the mutation.parent column is valid: a parent mutation must be at the same site as its child, appear earlier in the mutation table, and lie on the same branch or on an ancestral branch.
gotcha When writing VCF with individuals, default node filtering changed: non-sample nodes from individuals are now excluded by default (since 0.6.4). Use include_non_sample_nodes=True to include them.
fix If you need non-sample nodes, pass include_non_sample_nodes=True to write_vcf().
Imports
- tskit
import tskit
Quickstart
import tskit
import msprime
# Simulate a tree sequence with msprime's legacy simulate() API (10 haploid samples, no mutations)
ts = msprime.simulate(10, length=1e4, recombination_rate=1e-8, random_seed=1)
print(ts.num_trees)
print(ts.num_samples)
# Load a tree sequence from a file (if available)
# ts = tskit.load('example.trees')
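# Basic statistics: diversity() defaults to mode="site", so it is 0.0 here because no mutations were simulated
print(ts.diversity())
# Branch-mode diversity is computed from branch lengths and does not need mutations
print(ts.diversity(mode="branch"))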