RefGenConf
RefGenConf is a Python package that provides a standardized configuration object for reference genome assemblies. It centralizes the paths to reference genome assets so that bioinformatics tools resolve them consistently. The current version is 0.13.1, and the project is under active development.
Common errors
- FileNotFoundError: [Errno 2] No such file or directory: '~/.refgenie/refgenie.yaml'
  cause: The default RefGenConf configuration file was not found at its expected location.
  fix: Initialize `refgenie` by running `refgenie init` in your terminal. If you intend to use a different file, pass its path explicitly to `RefGenConf(filepath='your/custom/path.yaml')`.
- refgenconf.exceptions.RefgenconfError: Malformed configuration file
  cause: The `refgenie.yaml` file does not conform to the expected schema, possibly due to manual edits or an outdated format.
  fix: Run `refgenie upgrade` to attempt an automatic migration. If that doesn't work, review the RefGenConf documentation for the correct schema or regenerate the configuration by re-initializing `refgenie` and adding assets.
- KeyError: 'some_genome_or_asset_name'
  cause: You are trying to access a genome assembly or an asset that is not defined in your `refgenie.yaml` file, or the name is misspelled or incorrectly cased.
  fix: Check your `refgenie.yaml` file or use `refgenie list` to confirm the exact names of registered genomes and assets. Ensure your Python code uses these names precisely as they appear in the configuration.
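The literal `~` in the first error message suggests one avoidable cause: Python's file APIs do not expand `~` on their own. Below is a minimal sketch of a pre-flight check that expands the path and fails with a clearer message. `resolve_config_path` is a hypothetical helper, not part of refgenconf, and the fallback to a `REFGENIE` environment variable is an assumption about how your environment is set up.

```python
import os
from typing import Optional


def resolve_config_path(path: Optional[str] = None) -> str:
    """Return an absolute path to an existing config file.

    Hypothetical helper, not part of refgenconf.
    Raises FileNotFoundError if nothing exists at the resolved location.
    """
    # Assumption: fall back to $REFGENIE, then to the conventional default.
    candidate = path or os.environ.get("REFGENIE") or "~/.refgenie/refgenie.yaml"
    # Expand "~" and "$VAR" explicitly; open() will not do this for us.
    expanded = os.path.abspath(os.path.expandvars(os.path.expanduser(candidate)))
    if not os.path.isfile(expanded):
        raise FileNotFoundError(
            f"No refgenie config at {expanded!r}; run `refgenie init` or pass a path."
        )
    return expanded
```

The returned string can then be passed on as `RefGenConf(filepath=...)`.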
Warnings
- breaking: RefGenConf versions prior to 0.10.0 used a different configuration file schema. Upgrading from older versions (e.g., 0.9.x to 0.10.x or newer) often requires migrating your `refgenie.yaml` file.
- gotcha: RefGenConf typically relies on a `refgenie.yaml` configuration file, usually located at `~/.refgenie/refgenie.yaml`. If this file is missing or malformed, initialization will fail or assets will not be found.
- gotcha: Asset names and genome assembly identifiers are case-sensitive and must exactly match the entries in the `refgenie.yaml` configuration. Incorrect casing or typos will result in `KeyError` or 'Asset not found' errors.
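Because lookups are case-sensitive, it can help to validate a name before using it and to report near misses. The sketch below assumes you already have the registered names as a plain list of strings. `check_name` is a hypothetical helper, not part of refgenconf, and `hg38` and `mm10` are only illustrative names.

```python
import difflib
from typing import Iterable


def check_name(name: str, registered: Iterable[str]) -> str:
    """Return `name` if it is registered exactly, else raise KeyError with hints.

    Hypothetical helper, not part of refgenconf.
    """
    names = list(registered)
    if name in names:
        return name
    # Names that differ only by case are the most likely culprit.
    case_hits = [n for n in names if n.lower() == name.lower()]
    # Otherwise fall back to similarity-based suggestions for typos.
    fuzzy_hits = difflib.get_close_matches(name, names, n=3, cutoff=0.6)
    suggestions = case_hits or fuzzy_hits
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise KeyError(f"'{name}' is not registered.{hint}")


# Exact matches pass through unchanged.
print(check_name("hg38", ["hg38", "mm10"]))
```

Calling `check_name("HG38", ["hg38", "mm10"])` raises a `KeyError` whose message suggests `hg38`.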
Install
- pip install refgenconf
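After installing, it can be useful to confirm which release is active, since the configuration schema changed at 0.10.0. The following is a small sketch using only the standard library. The helper names `parse_version`, `installed_version`, and `uses_new_schema` are hypothetical, and the version parsing is deliberately simplistic (leading digits of the first three components only).

```python
import itertools
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.13.1' into (0, 13, 1); stop at the first non-numeric component."""
    parts = []
    for piece in text.split(".")[:3]:
        lead = "".join(itertools.takewhile(str.isdigit, piece))
        if not lead:
            break
        parts.append(int(lead))
    return tuple(parts)


def installed_version(package: str = "refgenconf") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def uses_new_schema(version_text: str) -> bool:
    """True for 0.10.0 and later, the releases with the newer config schema."""
    return parse_version(version_text) >= (0, 10, 0)


version = installed_version()
if version is None:
    print("refgenconf is not installed")
else:
    print(version, "new schema" if uses_new_schema(version) else "old schema")
```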
Imports
- RefGenConf
from refgenconf import RefGenConf
Quickstart
import os

import refgenconf
import yaml

# --- Quickstart Setup: Create a dummy config file for demonstration ---
# In a real scenario, this would typically be '~/.refgenie/refgenie.yaml'
# initialized via 'refgenie init' command.
config_path = "temp_refgenie_config.yaml"
dummy_config_data = {
    "refgenie": {
        "genome_assembly": {
            "assets": {
                "fasta": {"path": "/path/to/fasta.fa", "description": "Genome fasta file"},
                "chrom_sizes": {"path": "/path/to/chrom.sizes", "description": "Chromosome sizes file"},
            }
        }
    }
}
with open(config_path, "w") as f:
    yaml.dump(dummy_config_data, f)
# ---------------------------------------------------------------------

# Initialize RefGenConf with the config file path
rgc = refgenconf.RefGenConf(config_path)

# Access a genome assembly's assets
genome = "genome_assembly"
asset_key = "fasta"

if rgc.has_genome(genome) and rgc.has_asset(genome, asset_key):
    asset_path = rgc.seek(genome, asset_key)
    print(f"Path to {asset_key} for {genome}: {asset_path}")

    # Example: Access metadata
    metadata = rgc.get_asset_data(genome, asset_key)
    print(f"Metadata for {asset_key}: {metadata}")
else:
    print(f"Genome '{genome}' or asset '{asset_key}' not found in config.")

# Clean up the dummy config file
os.remove(config_path)
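The Quickstart removes its dummy file only if every earlier line succeeds. One possible alternative, sketched here with the standard library alone, is to write the throwaway config inside a temporary directory that is cleaned up even when an exception is raised. `temp_config` is a hypothetical helper, and the `placeholder: true` content is a stand-in, not a valid refgenie configuration.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_config(text: str, name: str = "refgenie.yaml") -> Iterator[str]:
    """Write `text` to a throwaway file and yield its path.

    The containing directory is deleted on exit, including when the body raises.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, name)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text)
        yield path


# The body of the with-block stands in for the RefGenConf calls above.
with temp_config("placeholder: true\n") as cfg:
    print(os.path.exists(cfg))  # True
print(os.path.exists(cfg))  # False
```

This also keeps the dummy file out of the current working directory.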