gprofiler-official
The official Python 3 interface to the g:Profiler toolkit. It provides functional enrichment analysis of GO and other terms, conversion between identifier namespaces, and mapping of orthologous genes. The current version is 1.0.0, released in April 2019; judging by its release history, updates are infrequent.
Warnings
- breaking The 1.0.x series introduced breaking changes compared to the 0.3.x series. Code written for older versions (e.g., 0.3.5) will not be compatible with 1.0.0 without modifications.
- gotcha The library has been Python 3 only since version 0.3. Earlier versions (e.g., 0.2.x) were Python 2 compatible. Running recent versions under Python 2 will result in errors.
- gotcha If identifiers in your query have multiple possible mappings in g:Profiler, they might be excluded by default. This can lead to incomplete results.
- gotcha Using `return_dataframe=True` (as in the Quickstart) requires the `pandas` library to be installed. Without `pandas`, an `ImportError` (or its subclass `ModuleNotFoundError`) is raised when results are being processed.
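One way to avoid the `pandas` gotcha is to check for the package before asking for DataFrames. The following is a minimal sketch, not part of the library: `module_available` and `make_client` are hypothetical helper names, and the only `GProfiler` arguments used are the `user_agent` and `return_dataframe` options described in this document.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


def make_client(user_agent: str = "MyAwesomeBioTool/1.0"):
    """Hypothetical helper: request DataFrames only when pandas is installed."""
    # Imported lazily so this helper loads even where gprofiler-official is absent.
    from gprofiler import GProfiler

    return GProfiler(
        user_agent=user_agent,
        return_dataframe=module_available("pandas"),
    )
```

With this approach the calling code has to handle both return types, so it suits scripts where `pandas` is an optional extra rather than a hard dependency.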
Install
pip install gprofiler-official
Imports
- GProfiler
from gprofiler import GProfiler
Quickstart
import os
from gprofiler import GProfiler
# Initialize GProfiler object
# user_agent is optional but good practice to identify your application.
# Setting return_dataframe=True makes results easier to work with using pandas.
gp = GProfiler(
    user_agent=os.environ.get('GPROFILER_USER_AGENT', 'MyAwesomeBioTool/1.0'),
    return_dataframe=True
)
# Example: Functional enrichment analysis (g:GOSt)
# Query a list of genes for Homo sapiens
genes_query = ['NR1H4', 'TRIP12', 'UBC', 'FCRL3', 'PLXNA3', 'GDNF', 'VPS11']
print(f"Querying genes: {genes_query}")
results = gp.profile(organism='hsapiens', query=genes_query)
print("\nFunctional Enrichment Results (head):")
if results is not None and not results.empty:
    print(results.head())
else:
    print("No enrichment results found or query failed.")
# Example: Gene ID conversion (g:Convert)
# Convert gene IDs from one namespace to another
convert_query = ['NR1H4', 'TRIP12']
print(f"\nConverting gene IDs: {convert_query}")
converted_genes = gp.convert(
    organism='hsapiens',
    query=convert_query,
    target_namespace='ENTREZGENE_ACC'
)
print("\nGene ID Conversion Results (head):")
if converted_genes is not None and not converted_genes.empty:
    print(converted_genes.head())
else:
    print("No conversion results found or query failed.")
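If `return_dataframe` is not enabled, enrichment results can be post-processed as plain Python records instead. The sketch below assumes each record is a dict with `source`, `name`, and `p_value` keys; that shape is an assumption to verify against your installed version. `top_terms` is a hypothetical helper, and the rows are hand-written placeholders, not real g:Profiler output.

```python
from typing import Any, Dict, List


def top_terms(
    results: List[Dict[str, Any]], source: str, max_p: float = 0.05
) -> List[Dict[str, Any]]:
    """Keep terms from one data source at or below a p-value cutoff, smallest p first."""
    kept = [
        r for r in results
        if r.get("source") == source and r.get("p_value", 1.0) <= max_p
    ]
    return sorted(kept, key=lambda r: r["p_value"])


# Hand-written placeholder rows (NOT real output), shaped like the assumed records.
placeholder = [
    {"source": "GO:BP", "name": "term A", "p_value": 0.01},
    {"source": "GO:BP", "name": "term B", "p_value": 0.20},
    {"source": "KEGG", "name": "term C", "p_value": 0.001},
    {"source": "GO:BP", "name": "term D", "p_value": 0.002},
]

for row in top_terms(placeholder, source="GO:BP"):
    print(row["name"], row["p_value"])
```

Filtering after the query, rather than re-querying with different settings, keeps the number of requests to the g:Profiler service low.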