ML Collections
ML Collections is a library of Python collections designed for ML use cases. It provides dict-like data structures, primarily `ConfigDict` and `FrozenConfigDict`, which offer dot-based attribute access, type safety, and other features useful for managing experiment configurations in a structured way. The current version is 1.1.0, and the library is actively maintained.
Warnings
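- gotcha Assigning the result of `ConfigDict.get_ref()` to another field links the two fields in both directions: a change made through either field is visible through the other, so writing to the referencing field also changes the original. For a one-way reference, where writing to the referencing field leaves the original untouched, use `ConfigDict.get_oneway_ref()` instead.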
- gotcha ConfigDicts are largely type-safe: once a field is set with a particular type, reassigning a value of an incompatible type (e.g., `int` to a `str` field) will raise a `TypeError`. An exception is made for `int` values being assigned to `float` fields, which are automatically converted.
- gotcha When using `ml_collections.config_flags` to override boolean values from the command line, the syntax is specific: `--config.boolean_field` sets it to `True`, and `--noconfig.boolean_field` sets it to `False`. Standard `--config.boolean_field=value` (with 'true', 'false', 'True', 'False') is also supported.
- breaking Initializing a `ConfigDict` with an `initial_dictionary` that contains lists or tuples with nested dictionaries, `ConfigDict`s, or `FieldReference`s directly can lead to errors. The internal reference structure must form a Directed Acyclic Graph (DAG).
Install
pip install ml-collections
Imports
- ConfigDict
from ml_collections import config_dict
- FrozenConfigDict
from ml_collections import config_dict
- FieldReference
from ml_collections import config_dict
- DEFINE_config_dict
from ml_collections import config_flags
- DEFINE_config_file
from ml_collections import config_flags
Quickstart
from ml_collections import config_dict
# Create a ConfigDict
cfg = config_dict.ConfigDict()
# Assign values with dot notation
cfg.learning_rate = 0.001
cfg.optimizer = 'Adam'
cfg.model = config_dict.ConfigDict()
cfg.model.name = 'ResNet50'
cfg.model.num_layers = 50
# Access values
print(f"Learning rate: {cfg.learning_rate}")
print(f"Model name: {cfg.model.name}")
# ConfigDicts are type-safe (mostly)
try:
    cfg.learning_rate = 'high'  # This will raise a TypeError
except TypeError as e:
    print(f"Caught expected error: {e}")
# Integer can be assigned to float fields
cfg.weight_decay = 1e-5
cfg.weight_decay = 0 # This works as int -> float conversion is allowed
print(f"Weight decay: {cfg.weight_decay}")