spaCy Loggers
spacy-loggers is a Python package that provides training loggers for spaCy v3.2 and newer. It ships loggers for Weights & Biases, MLflow, ClearML, PyTorch, and CuPy, so training metrics and model artifacts can be tracked without changes to the core spaCy library. The current version is 1.0.5; releases add new loggers and update existing ones.
Warnings
- breaking Starting with `spacy.WandbLogger.v5`, `spacy.MLflowLogger.v2`, and `spacy.ClearMLLogger.v2`, these loggers no longer automatically call the default console logger. If you want console output alongside these, you must explicitly use `spacy.ChainLogger.v1` and include `spacy.ConsoleLogger.v2` in the chain.
- gotcha The `prodigy train` command (from Prodigy, a related annotation tool) overrides logger settings in its configuration. Therefore, `spacy-loggers` integrations might not function as expected when training via `prodigy train`. It is recommended to use `python -m spacy train` with your `config.cfg` for full logger functionality.
- gotcha Each external logger (Weights & Biases, MLflow, ClearML) requires its respective library to be installed separately and may need initial configuration (e.g., `pip install wandb && wandb login`, `pip install mlflow`, `pip install clearml && clearml-init`). Without these, the loggers will not function correctly.
- gotcha When using `spacy.MLflowLogger.v2` for remote MLflow tracking, environment variables such as `MLFLOW_TRACKING_URI` must be correctly set before launching `spacy train`. Forgetting to do so may result in silent failures or local logging instead of remote.
- gotcha The `log_custom_stats` parameter of `spacy.MLflowLogger.v2` is intended to filter logged metrics, but it currently logs all metrics internally before applying the regex filters. As a result, more data may be logged than the filters suggest.
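Two of the warnings above (missing backend packages and an unset `MLFLOW_TRACKING_URI`) can be caught before training starts. The following is a minimal preflight sketch using only the standard library. The function names `missing_backends` and `mlflow_env_problems` are hypothetical helpers, not part of spacy-loggers, and the import names `wandb`, `mlflow`, and `clearml` are taken from the `pip install` commands quoted above.

```python
import importlib.util
import os

# Import names of the optional backends mentioned in the warnings above.
OPTIONAL_BACKENDS = ("wandb", "mlflow", "clearml")


def missing_backends(names=OPTIONAL_BACKENDS):
    """Return the packages from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def mlflow_env_problems(env=None):
    """Return a list with one warning if MLFLOW_TRACKING_URI is unset or empty."""
    env = os.environ if env is None else env
    if not env.get("MLFLOW_TRACKING_URI"):
        return ["MLFLOW_TRACKING_URI is not set; MLflow may log locally instead of remotely."]
    return []
```

Running both checks before `python -m spacy train` and printing their results gives an early hint, although it does not verify credentials or that the tracking server is reachable.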
Install
- pip install spacy-loggers
- pip install spacy-loggers[wandb] spacy-loggers[mlflow] spacy-loggers[clearml] spacy-loggers[torch] spacy-loggers[cupy]
Imports
- WandbLogger.v5
@loggers = "spacy.WandbLogger.v5"
- MLflowLogger.v2
@loggers = "spacy.MLflowLogger.v2"
- ClearMLLogger.v2
@loggers = "spacy.ClearMLLogger.v2"
- ChainLogger.v1
@loggers = "spacy.ChainLogger.v1"
- PyTorchLogger.v1
@loggers = "spacy.PyTorchLogger.v1"
- CupyLogger.v1
@loggers = "spacy.CupyLogger.v1"
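A typo in the registered name (for example a wrong version suffix) only surfaces when spaCy resolves the config. As a rough illustration, the sketch below reads the `@loggers` value from a config snippet and compares it with the names listed above. It uses the standard-library `configparser` and `json` modules as an approximation of how spaCy config values are written (JSON-encoded values in INI-style sections); it is not spaCy's own loader, and `logger_name` and `is_known_logger` are hypothetical helper names.

```python
import configparser
import json

# Registered names exactly as listed in this section.
KNOWN_LOGGERS = {
    "spacy.WandbLogger.v5",
    "spacy.MLflowLogger.v2",
    "spacy.ClearMLLogger.v2",
    "spacy.ChainLogger.v1",
    "spacy.PyTorchLogger.v1",
    "spacy.CupyLogger.v1",
}


def logger_name(config_text, section="training.logger"):
    """Return the decoded `@loggers` value from the given config section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # Values are written as JSON, so the surrounding quotes are decoded here.
    return json.loads(parser[section]["@loggers"])


def is_known_logger(config_text):
    """Check the configured logger name against the names listed above."""
    return logger_name(config_text) in KNOWN_LOGGERS
```

The check only covers the names in this list; loggers registered by spaCy itself or by other packages would need to be added to `KNOWN_LOGGERS`.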
Quickstart
# Example config.cfg snippet for Weights & Biases logging
# This file is typically used with `python -m spacy train config.cfg`
[training.logger]
@loggers = "spacy.WandbLogger.v5"
project_name = "my_spacy_project"
remove_config_values = ["paths.train", "paths.dev"]
# To combine with console logging (required for v5+ if console output is desired),
# replace the block above with a chain; members are passed as logger1, logger2, ...
# [training.logger]
# @loggers = "spacy.ChainLogger.v1"
# logger1 = {"@loggers": "spacy.ConsoleLogger.v2"}
# logger2 = {"@loggers": "spacy.WandbLogger.v5", "project_name": "my_spacy_project"}
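When several loggers need to be combined, the chained block can be generated rather than written by hand. The sketch below renders a `[training.logger]` section for `spacy.ChainLogger.v1`, assuming the chain receives its members through keys named `logger1` through `logger10`, as described in the spacy-loggers README. `render_chain_logger` is a hypothetical helper, not a function provided by the library.

```python
import json


def render_chain_logger(loggers):
    """Render a [training.logger] block that chains the given logger blocks.

    Each item is a dict such as {"@loggers": "spacy.ConsoleLogger.v2"}.
    Assumption: ChainLogger.v1 accepts members as logger1 ... logger10.
    """
    if not 1 <= len(loggers) <= 10:
        raise ValueError("expected between 1 and 10 loggers")
    lines = ["[training.logger]", '@loggers = "spacy.ChainLogger.v1"']
    for index, block in enumerate(loggers, start=1):
        # json.dumps produces the inline-dict form used in spaCy configs.
        lines.append(f"logger{index} = {json.dumps(block)}")
    return "\n".join(lines)
```

For example, passing the console logger block followed by the Weights & Biases block from the snippet above yields a section that can be pasted into `config.cfg` in place of the single-logger block.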