{"id":1949,"library":"biotite","title":"Biotite","description":"Biotite is a Python library (current version 1.6.0) for computational molecular biology. It provides tools for sequence analysis and structural bioinformatics, and for fetching data from biological databases. Sequence and structure data are stored in NumPy arrays for efficient, vectorized operations, and interfaces to external bioinformatics software let the library serve both short analysis scripts and full software packages. The library is under active development.","status":"active","version":"1.6.0","language":"en","source_language":"en","source_url":"https://github.com/biotite-dev/biotite","tags":["molecular biology","bioinformatics","structural biology","sequence analysis","protein","DNA","NumPy","computational biology","cheminformatics"],"install":[{"cmd":"pip install biotite","lang":"bash","label":"PyPI"},{"cmd":"conda install -c conda-forge biotite","lang":"bash","label":"Conda"}],"dependencies":[{"reason":"The core data model for sequences and structures is built on NumPy ndarrays, which provide performance and NumPy-style indexing.","package":"numpy","optional":false},{"reason":"Used for accessing biological databases via REST APIs (e.g., NCBI Entrez, UniProt, PDB).","package":"requests","optional":false},{"reason":"Used for efficient data serialization.","package":"msgpack","optional":false},{"reason":"Used for graph representations; for example, `BondList.as_graph()` in `biotite.structure` and `Tree.as_graph()` in `biotite.sequence.phylo` return NetworkX graphs.","package":"networkx","optional":false},{"reason":"Mandatory dependency for trajectory file interfaces within `biotite.structure.io` as of v1.6.0, replacing `mdtraj`.","package":"biotraj","optional":false}],"imports":[{"symbol":"ProteinSequence","correct":"from biotite.sequence import ProteinSequence"},{"symbol":"entrez","correct":"from biotite.database import entrez"},{"symbol":"FastaFile","correct":"from 
biotite.sequence.io.fasta import FastaFile"},{"symbol":"align_optimal","correct":"from biotite.sequence.align import align_optimal"}],"quickstart":{"code":"import biotite.sequence.align as align\nimport biotite.sequence.io.fasta as fasta\nimport biotite.database.entrez as entrez\nimport os\n\n# Download a FASTA file with the sequences of avidin and streptavidin.\n# A fixed file name keeps the example simple; in real code, prefer a temporary file.\nfile_name = \"sequences.fasta\"\nuids = [\"CAC34569\", \"ACL82594\"]  # Example UIDs for avidin and streptavidin\nentrez.fetch_single_file(\n    uids=uids,\n    file_name=file_name,\n    db_name=\"protein\",\n    ret_type=\"fasta\"\n)\n\n# Parse the downloaded FASTA file and create 'ProteinSequence' objects\nfasta_file = fasta.FastaFile.read(file_name)\navidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()\n\n# Align sequences using the BLOSUM62 matrix with affine gap penalty\n# (gap opening -10, gap extension -1); terminal gaps are not penalized\nmatrix = align.SubstitutionMatrix.std_protein_matrix()\nalignments = align.align_optimal(\n    avidin_seq, streptavidin_seq, matrix,\n    gap_penalty=(-10, -1),\n    terminal_penalty=False\n)\n\nprint(f\"Number of alignments: {len(alignments)}\")\nif alignments:\n    print(\"First optimal alignment:\")\n    print(alignments[0])\n\n# Clean up the downloaded file\nos.remove(file_name)","lang":"python","description":"Downloads two protein sequences (avidin and streptavidin) from the NCBI Entrez database, parses them from a FASTA file, and performs a pairwise optimal sequence alignment using the BLOSUM62 matrix with affine gap penalties and unpenalized terminal gaps."},"warnings":[{"fix":"Ensure `biotraj` is installed (`pip install biotraj`). If your own code calls `mdtraj` directly, install it separately, since Biotite does not require it.","message":"As of Biotite v1.6.0, the `biotraj` package is a mandatory dependency for trajectory file interfaces in `biotite.structure.io`, and `mdtraj` is no 
longer required for this purpose. Projects relying on `mdtraj` through Biotite's internal interfaces might require adjustment.","severity":"breaking","affected_versions":">=1.6.0"},{"fix":"Familiarize yourself with NumPy array operations and indexing. Biotite's documentation provides examples of how to interact with its NumPy-based data structures.","message":"Biotite internally stores most sequence and structure data as NumPy `ndarray` objects. This offers high performance and NumPy-style indexing, but users accustomed to other bioinformatics libraries (e.g., Biopython) might need to adapt to this NumPy-centric data model.","severity":"gotcha","affected_versions":"all"},{"fix":"Always refer to the official documentation or example gallery to find the correct import paths for the specific classes or functions you intend to use.","message":"Biotite is organized into several subpackages (e.g., `biotite.sequence`, `biotite.structure`, `biotite.database`). Functionality resides in these subpackages, so it must be imported explicitly from the relevant subpackage; a top-level `import biotite` alone is not sufficient.","severity":"gotcha","affected_versions":"all"},{"fix":"Explicitly set `color_scheme='rainbow'` to retain the old default, or adopt the new `flower` default. Check whether the interpretation of existing figures depends on the old color scheme.","message":"In `biotite.sequence.graphics`, the default color scheme for visualizing sequence alignments changed from `rainbow` to `flower` in v1.6.0. The `flower` scheme is considered to represent amino acid similarity more effectively.","severity":"deprecated","affected_versions":">=1.6.0"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}