{"id":21379,"library":"gffutils","title":"gffutils","description":"gffutils is a Python package for working with GFF and GTF files in a flexible database framework. It stores annotations in a SQLite database for fast querying, manipulation, and export. Current version is 0.14 (released 2024). It supports Python >=3.8 and is released on a semi-regular cadence.","status":"active","version":"0.14","language":"python","source_language":"en","source_url":"https://github.com/daler/gffutils","tags":["bioinformatics","GFF","GTF","genomics","database"],"install":[{"cmd":"pip install gffutils","lang":"bash","label":"latest"}],"dependencies":[],"imports":[{"note":"create_db is exposed at the package top level, so `from gffutils import create_db` also works; calling it as gffutils.create_db after `import gffutils` keeps the namespace explicit and matches the examples in this entry.","wrong":"from gffutils import create_db","symbol":"create_db","correct":"import gffutils"}],"quickstart":{"code":"import gffutils\n\n# Two GFF3 records (a gene and its child exon) as tab-separated text\ngff_text = (\n    '##gff-version 3\\n'\n    'chr1\\t.\\tgene\\t1\\t1000\\t.\\t+\\t.\\tID=gene1\\n'\n    'chr1\\t.\\texon\\t100\\t200\\t.\\t+\\t.\\tID=exon1;Parent=gene1\\n'\n)\n\n# create_db takes the data first and the database path second;\n# from_string=True marks the data as GFF text rather than a filename\ndb = gffutils.create_db(gff_text, dbfn=':memory:', from_string=True, force=True, keep_order=True)\n\ngene = db['gene1']\nprint(gene.id, gene.seqid, gene.start, gene.end)\n\n# Iterate over the exons that are children of gene1\nfor exon in db.children('gene1', featuretype='exon'):\n    print(exon.id, exon.start, exon.end)","lang":"python","description":"Create an in-memory GFF database from a string and query features."},"warnings":[{"fix":"Always include force=True if you intend to recreate the database, or use a new file path.","message":"The force=True parameter is required when overwriting an existing database file. 
If you omit it and a database already exists at that path, create_db raises an error instead of overwriting it.","severity":"gotcha","affected_versions":"all"},{"fix":"Use gffutils.create_db('file.gtf', dbfn='file.db', id_spec={'gene': 'gene_id', 'transcript': 'transcript_id'})","message":"GFF3 files identify features with the ID attribute, whereas GTF files use attributes such as gene_id and transcript_id. Use the 'id_spec' parameter to control which attribute becomes each feature's primary key.","severity":"gotcha","affected_versions":"all"},{"fix":"Replace db.all_features() with db.features()","message":"The method `db.all_features()` is deprecated since version 0.11; use `db.features()` instead.","severity":"deprecated","affected_versions":">=0.11"},{"fix":"Set merge_strategy='merge' in create_db if you need to handle multiple features that share the same ID.","message":"In version 0.12, the default for 'merge_strategy' changed from 'merge' to 'error'. As a result, database creation fails when multiple features share the same ID unless you choose a different strategy, such as merge_strategy='merge'.","severity":"breaking","affected_versions":">=0.12"}],"env_vars":null,"last_verified":"2026-04-27T00:00:00.000Z","next_check":"2026-07-26T00:00:00.000Z","problems":[{"fix":"Delete the existing database file and recreate it with force=True.","cause":"The database file exists but is empty or not a valid gffutils database (e.g., created without force=True and not overwritten).","error":"sqlite3.OperationalError: no such table: features"},{"fix":"Rename any local file or directory called gffutils that shadows the package, then use: import gffutils; db = gffutils.create_db(...)","cause":"The name gffutils resolves to something other than the installed package, typically a local script named gffutils.py or a directory named gffutils, so the imported module does not expose create_db.","error":"AttributeError: module 'gffutils' has no attribute 'create_db'"},{"fix":"Provide an id_spec dictionary mapping feature types to attribute names, e.g., id_spec={'gene': 'gene_id', 'transcript': 'transcript_id'}","cause":"The GFF/GTF file does not have an 
'ID' attribute for genes (common in GTF format) or the attribute name differs.","error":"ValueError: No valid ID found for feature type 'gene'. Please specify an id_spec."}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}