Refgenie
Refgenie creates a standardized folder structure for reference genome files and indexes, facilitating their sharing and usage across different bioinformatics tools and analyses. The current version is 0.13.0, with new releases typically occurring a few times per year, focusing on bug fixes, feature enhancements, and occasional schema updates.
Common errors
- Error: Config file is outdated or invalid. Please run 'refgenie config upgrade'
  cause: Your `refgenie.yaml` file's schema version is older than what the installed `refgenie` library expects.
  fix: Open your terminal and run `refgenie config upgrade`.
- FileNotFoundError: [Errno 2] No such file or directory: '/path/to/refgenie/data/asset_name'
  cause: The requested genome asset (e.g., `hg38/fasta`) has not been downloaded or built locally by Refgenie, or the client is not configured to point to the correct data directory.
  fix: First, ensure your Refgenie client points to the correct config file and data directory. Then pull or build the missing asset using the CLI, for example: `refgenie pull hg38/fasta` or `refgenie build hg38/fasta`.
- Error: No such command 'add'. Did you mean 'build'?
  cause: You are using the deprecated `refgenie add` command with a Refgenie version (0.9.0+) that has replaced it with `refgenie build`.
  fix: Replace `refgenie add` with `refgenie build` in your command-line invocations.
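The `FileNotFoundError` entry above often surfaces deep inside a downstream tool, far from the actual cause. A minimal sketch of a guard that checks an asset path before it is used and points at the fix: `require_asset_path` is a hypothetical helper, not part of Refgenie, and the `genome/asset` string is only used to build the hint in the error message.

```python
import os


def require_asset_path(path: str, registry_path: str) -> str:
    """Return `path` if it exists on disk; otherwise raise with a hint.

    `registry_path` is a genome/asset string such as "hg38/fasta". It is used
    only to suggest CLI commands in the error message (hypothetical helper).
    """
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"{path} not found. The asset may not be pulled or built yet; "
            f"try `refgenie pull {registry_path}` or "
            f"`refgenie build {registry_path}`."
        )
    return path
```

Calling this on whatever path the client returns, before handing it to an aligner or other tool, turns a bare "No such file or directory" into an actionable message.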
Warnings
- breaking The Refgenie config file schema has had breaking changes between releases (e.g., in 0.10.0). Using an older `refgenie.yaml` with a newer version of the library can lead to errors until the config is upgraded with `refgenie config upgrade`.
- breaking The primary CLI command for adding new assets changed from `refgenie add` to `refgenie build` in version 0.9.0. Using the old command with newer versions fails with a "No such command 'add'" error.
- gotcha The `RefgenieClient` constructor will by default attempt to load `~/.refgenie/refgenie.yaml`. If this file doesn't exist, it will initialize an empty client without raising an error. This can lead to unexpected behavior if you expect an error for a missing configuration or an automatically populated client.
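Because a missing config silently yields an empty client, it can be safer to fail early. The sketch below checks for the file before any client is constructed; `require_config` and `DEFAULT_CONFIG` are hypothetical names introduced for illustration, and the default path is the one named in the gotcha above.

```python
import os

# Default location named in the gotcha above.
DEFAULT_CONFIG = os.path.expanduser("~/.refgenie/refgenie.yaml")


def require_config(config_file: str = DEFAULT_CONFIG) -> str:
    """Return `config_file` if it is an existing file, else raise.

    Hypothetical helper: it only checks the filesystem and does not
    validate the contents or schema version of the config.
    """
    if not os.path.isfile(config_file):
        raise FileNotFoundError(f"Refgenie config not found: {config_file}")
    return config_file
```

The returned path can then be passed as `config_file=` when creating the client, so a missing file raises immediately instead of producing an empty client.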
Install
- pip install refgenie
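Since the warnings above depend on the installed version, it can help to confirm what pip actually installed. A small sketch using only the standard library (`importlib.metadata`, Python 3.8+); `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "refgenie") -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if refgenie is not installed.
print(installed_version())
```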
Imports
- RefgenieClient
from refgenie import RefgenieClient
- RefgenieProject
from refgenie import RefgenieProject
- cli
from refgenie import cli
import refgenie.cli
Quickstart
import os

import refgenie

# RefgenieClient looks for ~/.refgenie/refgenie.yaml by default; pass a path
# to use a temporary or custom config instead. If no config is found at the
# default path, an empty client is initialized.
config_path = os.path.expanduser("~/.refgenie/refgenie.yaml")

# Create a dummy refgenie.yaml for demonstration if it doesn't exist.
os.makedirs(os.path.dirname(config_path), exist_ok=True)
if not os.path.exists(config_path):
    with open(config_path, "w") as f:
        f.write("refgenie:\n genome_config_version: 0.2\n genomes: {}\n")

rg = refgenie.RefgenieClient(config_file=config_path)
print(f"Refgenie client initialized. Config file: {rg.config_file}")

# List available genomes (empty if no assets have been pulled).
genomes = rg.list_genomes()
print(f"Available genomes: {list(genomes.keys())}")

# Asset retrieval is left commented out so the example runs without
# pre-pulled assets. In a real scenario, run `refgenie pull human_genome/fasta`
# (or build your own assets) first, then uncomment the two lines below.
try:
    # asset_path = rg.get_asset("human_genome/fasta")
    # print(f"Path to human_genome fasta: {asset_path}")
    print("To get an asset, first ensure it's pulled or built locally.")
    print("Example: 'refgenie pull hg38/fasta' from command line.")
except Exception as e:
    print(f"Could not retrieve asset (expected if not pulled): {e}")

# Clean up the dummy config created for this demonstration.
# os.remove(config_path)
# os.rmdir(os.path.dirname(config_path))  # Only if directory is empty