{"id":28090,"library":"pyfastx","title":"pyfastx","description":"pyfastx is a Python module for fast random access to sequences from plain and gzipped FASTA/Q files. It provides an efficient, low-memory interface for reading, indexing, and querying biological sequences. Current version: 2.3.0. Releases are irregular, roughly every 6-12 months.","status":"active","version":"2.3.0","language":"python","source_language":"en","source_url":"https://github.com/lmdu/pyfastx","tags":["bioinformatics","FASTA","FASTQ","sequence","indexing","gzip"],"install":[{"cmd":"pip install pyfastx","lang":"bash","label":"latest release from PyPI"}],"dependencies":[],"imports":[{"note":"Wildcard imports are discouraged; use 'import pyfastx' and access classes through the module, e.g. pyfastx.Fasta, pyfastx.Fastq, or pyfastx.Fastx.","wrong":"from pyfastx import *","symbol":"pyfastx","correct":"import pyfastx"}],"quickstart":{"code":"import pyfastx\n\n# Open a FASTA file (gzip-compressed input is detected automatically)\nfa = pyfastx.Fasta('example.fasta')\n\n# Number of sequences in the file\nprint(len(fa))\n\n# Total length of all sequences, in bases\nprint(fa.size)\n\n# Random access by sequence name\nseq = fa['chr1']\nprint(seq.seq[:10])  # first 10 bases\n\n# Iterate over all sequences\nfor seq in fa:\n    print(seq.name, len(seq))\n\n# For FASTQ:\nfq = pyfastx.Fastq('example.fastq')\nfor read in fq:\n    print(read.name, read.qual)\n","lang":"python","description":"Basic usage: open a FASTA or FASTQ file, count and iterate over records, and fetch a sequence by name."},"warnings":[{"fix":"Pass the index_file parameter to the Fasta or Fastq constructor to set the index path explicitly: pyfastx.Fasta('file.fa', index_file='./index/file.fa.fxi')","message":"In version 2.0.0, the default index file path changed. Indexes are now saved in a default location (e.g., ~/.pyfastx/index) unless a path is specified. This can break scripts that rely on custom index locations or that expect the index next to the input file.","severity":"breaking","affected_versions":"<=1.x vs >=2.0.0"},{"fix":"Do not mix iteration and random access. If you need both, create separate instances: one for iteration, one for indexed access.","message":"Breaking out of a loop over a Fastx object early may cause indexing errors in subsequent random access, because iteration and index-based access share internal state.","severity":"gotcha","affected_versions":"all"},{"fix":"Use len(seq) or seq.len instead of seq.seq_len.","message":"The 'seq_len' attribute on Sequence objects was renamed to 'len' in version 2.0.0. 'seq_len' is now deprecated.","severity":"deprecated","affected_versions":">=2.0.0"}],"env_vars":null,"last_verified":"2026-05-09T00:00:00.000Z","next_check":"2026-08-07T00:00:00.000Z","problems":[{"fix":"Use len(seq) or seq.len instead.","cause":"Using the old 'seq_len' attribute, which was renamed to 'len' in v2.0.0.","error":"AttributeError: 'Fastx' object has no attribute 'seq_len'"},{"fix":"Use the index_file parameter to write the index to a writable location: pyfastx.Fasta('path/to/readonly/file.fa', index_file='/tmp/file.fa.fxi')","cause":"pyfastx creates an index file (.fxi extension) in the same directory as the FASTA file and needs write permission there. If the directory is read-only, index creation fails.","error":"FileNotFoundError: [Errno 2] No such file or directory: 'example.fa.fxi'"},{"fix":"Check the actual IDs with fa.keys() or by iterating over the file. Use the exact first word of the header (e.g., for '>chr1 some description', the ID is 'chr1').","cause":"The sequence ID used for random access does not match the header in the file. pyfastx uses the first word of the FASTA header (up to the first whitespace) as the ID.","error":"ValueError: Sequence not found: chr1"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}