GOATOOLS

1.6.4 verified Mon Apr 27 auth: no python

A Python library for finding enrichment of Gene Ontology (GO) terms in a set of genes. It provides tools for GO enrichment analysis (GOEA) based on Fisher's exact test, for handling gene-to-GO associations, and for visualizing GO hierarchies. Current version 1.6.4, released June 2024. Updates are released every few months.

pip install goatools
error AttributeError: 'GODag' object has no attribute 'get'
cause Using `GODag` like a dictionary before creating an instance with `GODag('go-basic.obo')`.
fix
Instantiate GODag first: `obodag = GODag('go-basic.obo')`. Then look up terms by ID, e.g. `obodag['GO:0008150']`.
error ValueError: Read associations: No data parsed, check file format
cause The associations file is malformed (e.g., wrong delimiter or extra columns).
fix
Ensure the file is tab-separated with exactly two columns: gene_id and GO_id. Run `head -n5 gene2go.tsv` to inspect the first lines.
error ModuleNotFoundError: No module named 'goatools'
cause goatools is not installed, or is installed in a different environment from the one running your script.
fix
Run `pip install goatools` in your active environment.
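
The two-column association format described above can be checked before the file is handed to goatools. The following is a minimal sketch using only the standard library; the helper name `check_association_lines` is hypothetical and not part of goatools. It also assumes that several GO IDs may be joined by semicolons in the second column, which should be confirmed against the goatools version in use.

```python
import re

# GO IDs look like "GO:" followed by seven digits, e.g. GO:0008150.
GO_ID = re.compile(r"GO:\d{7}")


def check_association_lines(lines):
    """Return a list of (line_number, problem) tuples; an empty list means OK.

    Hypothetical helper, not a goatools function.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # blank lines are ignored
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append(
                (lineno, f"expected 2 tab-separated columns, found {len(fields)}")
            )
            continue
        gene_id, go_field = fields
        if not gene_id.strip():
            problems.append((lineno, "empty gene ID"))
        # Assumption: multiple GO IDs may be separated by semicolons.
        bad = [g for g in go_field.split(";") if not GO_ID.fullmatch(g.strip())]
        if bad:
            problems.append((lineno, f"malformed GO ID(s): {bad}"))
    return problems


# Line 2 uses a comma instead of a tab; line 3 has a truncated GO ID.
sample = ["GENE1\tGO:0008150\n", "GENE2,GO:0003674\n", "GENE3\tGO:12\n"]
for lineno, problem in check_association_lines(sample):
    print(f"line {lineno}: {problem}")
```

To check a real file, pass an open file handle, for example `check_association_lines(open('gene2go.tsv'))`.
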
gotcha `GODag` expects the path to a local OBO file, and many users forget to download go-basic.obo first. Passing a URL instead of a file path should not be relied on. Download the file once and keep the local copy so that analyses stay reproducible.
fix Download go-basic.obo from http://geneontology.org/ontology/go-basic.obo (or with `goatools.base.download_go_basic_obo()`) and pass the local path to `GODag`.
gotcha The associations file must be tab-separated with two columns: gene_id and GO_id. Extra columns or non-tab separators can cause lines to be misparsed without any visible error.
fix Ensure exactly two columns per line, e.g. gene_symbol\tGO:0000000, where \t is a literal tab character.
deprecated The function `goatools.evidence._evidence_filter` was removed in v1.0. Use `goatools.associations` instead.
fix Use `from goatools.associations import read_associations`.
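gotcha Enrichment results may contain many rows with the same p-value because of count propagation: when gene counts are propagated from a term to its ancestors, related parent and child terms can end up with identical counts. The keyword that controls this is `propagate_counts` (plural), not `propagate_count`.
fix Set `propagate_counts` explicitly in the GOEnrichmentStudy constructor: `True` counts each gene toward ancestor terms as well, `False` uses direct annotations only.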
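
The effect of count propagation can be illustrated on a toy hierarchy. This is a minimal sketch in plain Python, not goatools code: the term IDs, the `parents` mapping, and the helper functions are all hypothetical. It shows how propagating annotations to ancestors can give a parent and a grandparent identical gene counts, and therefore identical p-values.

```python
# Toy hierarchy (hypothetical IDs): child term -> set of direct parents.
parents = {
    "GO:C": {"GO:B"},
    "GO:B": {"GO:A"},
    "GO:A": set(),
}


def ancestors(term, parents):
    """Return every ancestor of `term` by following parent links."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def propagate(assoc, parents):
    """Return a copy of `assoc` in which each gene also carries ancestor terms."""
    return {
        gene: set(terms).union(*(ancestors(t, parents) for t in terms))
        for gene, terms in assoc.items()
    }


def count_per_term(assoc):
    """Count how many genes are annotated to each term."""
    counts = {}
    for terms in assoc.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    return counts


# Direct annotations only.
direct = {"GENE1": {"GO:C"}, "GENE2": {"GO:C"}, "GENE3": {"GO:B"}}
propagated = propagate(direct, parents)

print("direct:    ", sorted(count_per_term(direct).items()))
print("propagated:", sorted(count_per_term(propagated).items()))
```

With direct annotations, GO:A has no genes at all. After propagation, GO:A and GO:B both cover the same three genes, so any test based on those counts reports the same p-value for both.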

Basic workflow: load GO DAG, load associations, run enrichment study.

from goatools.go_enrichment import GOEnrichmentStudy
from goatools.obo_parser import GODag
from goatools.associations import read_associations

# Download go-basic.obo from http://geneontology.org/ontology/go-basic.obo
obodag = GODag('go-basic.obo')

# Load gene-to-GO associations (tab-separated: gene_id\tGO_id)
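# Pass the DAG by keyword: the second positional parameter is the annotation type, not the DAG
assoc = read_associations('gene2go.tsv', godag=obodag, no_top=True)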

# List of gene IDs of interest (e.g., from RNA-seq)
gene_list = ['GENE1', 'GENE2', 'GENE3']

# Background population (e.g., all genes in genome)
population = list(assoc.keys())

# Run enrichment
goea = GOEnrichmentStudy(population, assoc, obodag, propagate_counts=True, alpha=0.05)
results = goea.run_study(gene_list)

print(results[:5])
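
For intuition about what the enrichment step computes per GO term, the following is a minimal sketch of a one-sided enrichment p-value as a hypergeometric tail, using only the standard library. It is not the goatools implementation, which may use a different tail or a different statistics backend; the function name `enrichment_pvalue` and its parameters are hypothetical.

```python
from math import comb


def enrichment_pvalue(study_hits, study_size, pop_hits, pop_size):
    """Probability of seeing at least `study_hits` annotated genes by chance.

    Models drawing `study_size` genes without replacement from a population
    of `pop_size` genes, of which `pop_hits` carry the GO term.
    Hypothetical helper, not a goatools function.
    """
    upper = min(study_size, pop_hits)
    total = comb(pop_size, study_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, study_size - k)
        for k in range(study_hits, upper + 1)
    )
    return tail / total


# Made-up numbers: 3 of 5 study genes carry the term,
# versus 5 of 20 genes in the background population.
p = enrichment_pvalue(study_hits=3, study_size=5, pop_hits=5, pop_size=20)
print(f"p = {p:.4f}")
```

A small p-value here means the term is over-represented in the study set relative to the background. In a real analysis the per-term p-values still need multiple-testing correction.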