{"id":2714,"library":"pysam","title":"PySAM","description":"Pysam is a Python module for reading, manipulating, and writing genomic datasets. It is a lightweight wrapper of the HTSlib API, providing facilities to work with SAM/BAM/CRAM, VCF/BCF, BED, GFF/GTF, FASTA/FASTQ files, and access samtools/bcftools command-line functionality. The module supports compression and random access through indexing. Pysam is actively maintained with regular releases, often wrapping new versions of the underlying htslib, samtools, and bcftools C libraries.","status":"active","version":"0.23.3","language":"en","source_language":"en","source_url":"https://github.com/pysam-developers/pysam","tags":["genomics","bioinformatics","DNA sequencing","SAM","BAM","CRAM","VCF","BCF","FASTA","FASTQ","NGS"],"install":[{"cmd":"pip install pysam","lang":"bash","label":"PyPI (recommended)"},{"cmd":"conda install pysam -c bioconda","lang":"bash","label":"Conda (recommended for bioinformatics environments)"}],"dependencies":[{"reason":"Core C library wrapped by pysam for genomic file manipulation. A copy is bundled with pysam's source distribution and PyPI wheels, so a separate install is not normally needed.","package":"htslib","optional":false},{"reason":"Provides command-line functionality wrapped by pysam. Its source is bundled with pysam, so a separate install is not normally needed.","package":"samtools","optional":false},{"reason":"Provides command-line functionality for VCF/BCF files wrapped by pysam. Its source is bundled with pysam, so a separate install is not normally needed.","package":"bcftools","optional":false},{"reason":"Required for installing pysam from source code or repository, but not typically needed when installing from pre-built wheels via pip.","package":"Cython","optional":true}],"imports":[{"symbol":"pysam","correct":"import pysam"},{"note":"Classes like AlignmentFile, VariantFile, and FastaFile are top-level members of the pysam module.","wrong":"import pysam.AlignmentFile","symbol":"AlignmentFile","correct":"from pysam import AlignmentFile"},{"symbol":"VariantFile","correct":"from pysam import VariantFile"},{"symbol":"FastaFile","correct":"from pysam import FastaFile"}],"quickstart":{"code":"import pysam\nimport os\n\n# --- Example 1: Read 
a BAM/SAM file ---\n# Create a small SAM file for demonstration\n# In a real scenario, you would open an existing BAM file and its index.\n# For this quickstart, we write a single unmapped read to a text SAM file.\n# NOTE: pysam.AlignmentFile requires a header when writing; mode 'wh' writes SAM text including the header.\n\nheader = {\n    'HD': {'VN': '1.0'},\n    'SQ': [{'LN': 1000, 'SN': 'chr1'}]\n}\n\ndummy_sam_path = \"example_reads.sam\"\nwith pysam.AlignmentFile(dummy_sam_path, \"wh\", header=header) as outfile:\n    # Create a simple unmapped read (no reference_id or pos)\n    # AlignedSegment expects an AlignmentHeader object, not a plain dict\n    read = pysam.AlignedSegment(outfile.header)\n    read.query_name = \"read1\"\n    read.query_sequence = \"ATGCATGC\"\n    # Convert a Phred+33 quality string to an array of integer qualities\n    read.query_qualities = pysam.qualitystring_to_array(\"BBBBBBBB\")\n    read.flag = 4 # UNMAPPED\n    outfile.write(read)\n\nprint(f\"Reading from {dummy_sam_path}:\")\nwith pysam.AlignmentFile(dummy_sam_path, \"r\") as samfile:\n    for read in samfile.fetch(until_eof=True): # fetch(until_eof=True) for unindexed or SAM files\n        print(f\"  Read: {read.query_name}, Sequence: {read.query_sequence}, Mapped: {not read.is_unmapped}\")\nos.remove(dummy_sam_path)\n\n# --- Example 2: Read a VCF file ---\n# Create a dummy VCF file\ndummy_vcf_path = \"example_variants.vcf\"\nwith open(dummy_vcf_path, \"w\") as f:\n    f.write(\"##fileformat=VCFv4.2\\n\")\n    f.write(\"##contig=<ID=chr1,length=1000>\\n\")\n    # Declare the GT field so htslib does not warn about an undefined FORMAT tag\n    f.write('##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">\\n')\n    f.write(\"#CHROM\\tPOS\\tID\\tREF\\tALT\\tQUAL\\tFILTER\\tINFO\\tFORMAT\\tSAMPLE1\\n\")\n    f.write(\"chr1\\t100\\t.\\tA\\tG\\t100\\tPASS\\t.\\tGT\\t0/1\\n\")\n    f.write(\"chr1\\t200\\t.\\tC\\tT\\t90\\tPASS\\t.\\tGT\\t1/1\\n\")\n\nprint(f\"\\nReading from {dummy_vcf_path}:\")\nvcf_file = pysam.VariantFile(dummy_vcf_path, \"r\")\nfor variant in vcf_file:\n    print(f\"  Variant: {variant.chrom}:{variant.pos} {variant.ref}>{variant.alts}\")\nvcf_file.close()\nos.remove(dummy_vcf_path)\n\n# --- Example 3: Read a FASTA file ---\n# Create a dummy FASTA file\ndummy_fasta_path = \"example_reference.fasta\"\nwith 
open(dummy_fasta_path, \"w\") as f:\n    f.write(\">chr1\\n\")\n    f.write(\"ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC\\n\")\n    f.write(\">chr2\\n\")\n    f.write(\"GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG\\n\")\n\nprint(f\"\\nReading from {dummy_fasta_path}:\")\nfasta_file = pysam.FastaFile(dummy_fasta_path)\nsequence = fasta_file.fetch(\"chr1\", 5, 15) # 0-based, half-open interval: returns 10 bases\nprint(f\"  Fetched sequence from chr1 (5-15): {sequence}\")\nfasta_file.close()\nos.remove(dummy_fasta_path)\n# FastaFile builds a .fai index next to the FASTA if none exists; remove it too\nif os.path.exists(dummy_fasta_path + \".fai\"):\n    os.remove(dummy_fasta_path + \".fai\")","lang":"python","description":"This quickstart demonstrates how to open and read data from common genomic file formats (SAM, VCF, FASTA) using `pysam.AlignmentFile`, `pysam.VariantFile`, and `pysam.FastaFile`. It creates small temporary input files so that the example is self-contained."},"warnings":[{"fix":"Upgrade to Python 3.8 or newer, or pin pysam to a version <=0.23.0.","message":"Pysam v0.23.0 was the last release to officially support Python 3.6 and 3.7. Subsequent versions (v0.23.1 and later) require Python 3.8 or newer.","severity":"breaking","affected_versions":">=0.23.1"},{"fix":"Migrate your project to Python 3.x and update pysam to a compatible version.","message":"Pysam v0.20.0 was the final release to support Python 2.x. All versions after 0.20.0 are Python 3-only.","severity":"breaking","affected_versions":">0.20.0"},{"fix":"Pass `start`/`end` arguments as 0-based, half-open values. Note that samtools-style `region` strings (e.g. `chr1:101-200`) are 1-based and inclusive, and that `VariantRecord.pos` is 1-based while `VariantRecord.start` is 0-based.","message":"When using `AlignmentFile.fetch()` or similar methods, coordinates are generally 0-based and half-open (exclusive end). 
This is a common convention in bioinformatics but can lead to off-by-one errors if 1-based or fully-inclusive ranges are expected.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Cython projects that compiled against pysam v0.23.1 should update to v0.23.2 or later to restore binary compatibility.","message":"Pysam v0.23.1 inadvertently broke binary compatibility for Cython projects that depend on pysam. This was fixed in v0.23.2. Pure Python projects using pysam were not affected.","severity":"gotcha","affected_versions":"0.23.1"},{"fix":"Ensure a C compiler (e.g., GCC) and `cython` (`pip install cython`) are available in your environment before attempting a source installation.","message":"While `pip install pysam` typically works by installing pre-built wheels (which include compiled htslib and do not require Cython), installing from source or a repository clone requires a C compiler and Cython to be pre-installed.","severity":"gotcha","affected_versions":"All versions (when installing from source)"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}