Bgen
1.9.9 · verified Fri May 01 · auth: no · python
A Python package for loading and manipulating data in BGEN files, a binary format for storing genotype data. The current version is 1.9.9, compatible with Python >=3.8. The package is actively maintained, with periodic releases.
pip install bgen
Common errors
error ModuleNotFoundError: No module named 'bgen'
cause The package is not installed, or it is installed in a different environment from the interpreter running your code.
fix Run 'pip install bgen' with the same Python interpreter that runs your code, for example 'python -m pip install bgen'.
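To check which interpreter is running and whether it can see the package, a minimal sketch using only the standard library can help. The function name `check_install` is illustrative and not part of bgen.

```python
import importlib.util
import sys


def check_install(package="bgen"):
    """Report the running interpreter and whether it can locate `package`."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "package": package,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    info = check_install()
    print(info)
    if not info["found"]:
        # Install with the same interpreter that will run the code
        print(f"Try: {info['interpreter']} -m pip install {info['package']}")
```

If `found` is False, the interpreter path shown is the one to use with `-m pip install`.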
error AttributeError: 'bgen.BGENFile' object has no attribute 'read'
cause The code targets an old API. BGENFile no longer has a read method; data is accessed by iteration, indexing, or properties.
fix Iterate with 'for variant in bgen: ...', or use 'bgen[0]' to access the first variant.
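error KeyError: 'rsid'
cause Variant metadata is being read with key-based access (variant['rsid']), but a variant object is not a dict. Depending on the object, this may instead surface as "TypeError: ... object is not subscriptable".
fix Use dot notation: variant.rsid, variant.chromosome, etc.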
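If downstream code expects key-based access, one option is to copy the attributes into a plain dict first. The sketch below is only an illustration: `variant_fields` is a hypothetical helper, the `SimpleNamespace` object stands in for a real variant, and the attribute names are the ones listed on this page.

```python
from types import SimpleNamespace


def variant_fields(variant, names=("rsid", "chromosome", "position")):
    """Copy the named attributes of `variant` into a plain dict.

    Attributes missing from the object are stored as None.
    """
    return {name: getattr(variant, name, None) for name in names}


# Hypothetical stand-in for a variant object, for illustration only
fake_variant = SimpleNamespace(rsid="rs123", chromosome="01", position=1000)

record = variant_fields(fake_variant)
print(record["rsid"])  # key access now works on the copied dict
```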
Warnings
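gotcha BGENFile indexing is 0-based, not 1-based: bgen[0] is the first variant. The index is a variant's order in the file, which users often confuse with a 1-based position.
fix Check the documentation for indexing behavior, and use .rsid rather than an index to identify a variant.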
deprecated Functions like 'bgen_open' are deprecated in favor of direct class instantiation.
fix Use 'from bgen import BGENFile; bgen = BGENFile(filename)' instead.
gotcha Memory usage can be high when calling .genotype() on a variant in a file with many samples, because it loads the data for all samples into memory.
fix Use the 'subsample' or 'probs' arguments to load a subset, or iterate over variants one at a time for large files.
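A rough size estimate shows why holding many variants at once adds up. The sketch below is plain arithmetic under stated assumptions, not a measurement of bgen itself: it assumes 8-byte floats and three genotype probabilities per sample (diploid, biallelic, unphased). The sample and variant counts are hypothetical.

```python
def probs_bytes(n_samples, n_probs=3, bytes_per_value=8):
    """Estimate bytes for one variant's probabilities.

    Assumes a dense array of n_samples x n_probs values,
    each stored in bytes_per_value bytes (8 for float64).
    """
    return n_samples * n_probs * bytes_per_value


# Hypothetical cohort of 500,000 samples
per_variant = probs_bytes(500_000)
print(per_variant / 1e6, "MB per variant")

# Keeping 10,000 such arrays alive at the same time
held = per_variant * 10_000
print(held / 1e9, "GB if all are kept in memory")
```

Under these assumptions a single variant is modest, but keeping every array while looping over a file is not, which is why processing one variant at a time matters.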
Imports
- BGENFile
from bgen import BGENFile
- BGENX
from bgen import BGENX
Quickstart
from bgen import BGENFile
# Replace with your actual BGEN file path
bgen = BGENFile('example.bgen')
# Get number of samples
print('Number of samples:', bgen.nsamples)
# Get number of variants
print('Number of variants:', bgen.nvariants)
# Iterate over first 5 variants
for i, variant in enumerate(bgen):
    if i >= 5:
        break
    print('Variant:', variant.rsid, variant.chromosome, variant.position)
    # Access genotype probabilities (3D array: samples x alleles x ploidy)
    probs = variant.genotype()
    print('Genotype probs shape:', probs.shape)
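The counter-and-break pattern in the loop above can also be expressed with `itertools.islice`, which works on any iterable. This is a sketch: `first_n` is an illustrative helper, not part of bgen, and a `range` stands in for an open BGEN file so the snippet runs without data.

```python
from itertools import islice


def first_n(iterable, n=5):
    """Return at most the first `n` items of `iterable` as a list."""
    return list(islice(iterable, n))


# A range stands in for an open BGEN file here
print(first_n(range(100)))  # [0, 1, 2, 3, 4]
```

With a real file, the call would be `first_n(bgen)`, assuming the file object is iterable as described on this page.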