Biopython

1.87 · active · verified Thu Apr 09

Biopython is a collection of freely available Python tools for computational molecular biology and bioinformatics. It covers common tasks such as parsing bioinformatics file formats (e.g., FASTA, GenBank, BLAST output), querying online biological databases (e.g., NCBI Entrez), working with sequences and alignments, and analyzing macromolecular structures. The current version is 1.87, and the project is actively maintained with several releases per year.

Warnings

Install

Imports

Quickstart

This quickstart creates and manipulates a Bio.Seq.Seq object, then parses a FASTA file with Bio.SeqIO.parse, a common task in bioinformatics. To keep the example runnable, it first writes a temporary FASTA file and deletes it at the end.

import os
from Bio.Seq import Seq
from Bio import SeqIO

# 1. Working with a basic sequence
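# Length is a multiple of three, so translate() sees only complete codons
# (a trailing partial codon triggers a BiopythonWarning)
my_dna = Seq("ATGACGTACGTA")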
print(f"Original DNA: {my_dna}")
print(f"Complement: {my_dna.complement()}")
print(f"Reverse Complement: {my_dna.reverse_complement()}")
print(f"Translated protein: {my_dna.translate()}")

# 2. Parsing a FASTA file
# Create a dummy FASTA file for demonstration
fasta_content = (
    ">seq1 description for sequence 1\n"
    "ATGCGTACGTAGCTAGCTAGCATGCAGCTAGCATGCGATGC\n"
    ">seq2 description for sequence 2\n"
    "GATCGATCGATCGATCGATCGATCGATCGATCGATCGA"
)

with open("example.fasta", "w") as f:
    f.write(fasta_content)

print("\n--- Parsing example.fasta ---")
for seq_record in SeqIO.parse("example.fasta", "fasta"):
    print(f"ID: {seq_record.id}")
    print(f"Description: {seq_record.description}")
    print(f"Sequence: {seq_record.seq}")
    print(f"Length: {len(seq_record.seq)}")

# Clean up the dummy file
os.remove("example.fasta")
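
The same parsing workflow can be run without touching the disk. The sketch below is a minimal variation on the quickstart, assuming that an in-memory io.StringIO handle is acceptable in place of a real file. The FASTA text, the record ID seq1_rc, and its description are made up for illustration. It parses the records, indexes them by ID with SeqIO.to_dict, and writes a new record back out in FASTA format with SeqIO.write.

```python
from io import StringIO

from Bio import SeqIO
from Bio.SeqRecord import SeqRecord

# Hypothetical FASTA text held in memory instead of in a file on disk.
fasta_text = (
    ">seq1 first example record\n"
    "ATGCGTACGTAG\n"
    ">seq2 second example record\n"
    "GATCGATCGATC\n"
)

# SeqIO.parse accepts a text handle, so StringIO stands in for an open file.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

# Index the records by their IDs for direct lookup.
by_id = SeqIO.to_dict(records)
print(f"seq2: {by_id['seq2'].seq}")

# Build a new record from the reverse complement of seq1.
# The ID and description here are illustrative, not a Biopython convention.
rc_record = SeqRecord(
    by_id["seq1"].seq.reverse_complement(),
    id="seq1_rc",
    description="reverse complement of seq1",
)

# SeqIO.write returns the number of records it wrote to the handle.
out_handle = StringIO()
written = SeqIO.write([rc_record], out_handle, "fasta")
fasta_out = out_handle.getvalue()
print(f"Wrote {written} record(s):")
print(fasta_out)
```

SeqIO.to_dict loads every record into memory, which is fine for small inputs like this one; for large files, iterating over SeqIO.parse directly, as in the quickstart, keeps memory use low.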
