Outerbounds (Metaflow Distribution)
Outerbounds is an opinionated distribution of Metaflow, designed to streamline machine learning workflows by providing pre-configured cloud infrastructure and a curated set of dependencies. It aims to reduce administrative overhead, allowing data scientists to focus more on model development. The current version is 0.12.28 and it follows a continuous release cadence, often aligning with Metaflow updates.
Common errors
-
ModuleNotFoundError: No module named 'metaflow'
cause The `outerbounds` package, despite being a distribution, explicitly depends on `metaflow`. This error indicates that `metaflow` itself was not installed or is not accessible in the current Python environment.
fix Ensure `outerbounds` (and thus `metaflow`) is installed correctly: `pip install outerbounds`. If using virtual environments, activate the correct environment.
-
RuntimeError: Failed to connect to backend: Failed to connect to S3 (AWS) / Failed to connect to Kubernetes (K8s)
cause This typically means your cloud environment (AWS, Kubernetes) is not properly configured, or the credentials Metaflow needs to store and retrieve data or orchestrate tasks are missing or invalid.
fix Verify your cloud credentials and configuration. For AWS, run `aws configure` or ensure `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_DEFAULT_REGION` are set. For Kubernetes, ensure your `kubectl` context is correct and `metaflow configure kubernetes` has been run.
-
Flow 'MyFlow' did not find any steps. Did you decorate your methods with '@step'?
cause A Metaflow flow requires methods decorated with `@step`, including a `start` step and an `end` step. This error occurs if no methods in your `FlowSpec` subclass are marked as steps.
fix Ensure that all methods intended as steps in your Metaflow flow are decorated with `@step` (e.g., `@step def start(self):`).
-
AttributeError: 'SomeFlow' object has no attribute 'some_data'
cause The flow accesses a data attribute (e.g., `self.some_data`) before a preceding step has set it, or the attribute name is misspelled.
fix Metaflow data attributes are passed between steps. Ensure `self.some_data = value` is set in an earlier step before it is accessed in a later one, and check attribute names for typos.
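Several of the checks above can be scripted before a flow is launched. The following is a minimal preflight sketch using only the Python standard library. The helper names (`module_available`, `missing_aws_vars`, `step_methods`) are hypothetical and not part of Metaflow or Outerbounds, and the `ast`-based scan only recognizes the bare `@step` decorator form shown in this document.

```python
import ast
import importlib.util
import os

# Environment variables named in the S3 fix above.
AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def module_available(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


def missing_aws_vars(env=None):
    """Return the AWS variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [var for var in AWS_VARS if not env.get(var)]


def step_methods(source):
    """Return names of functions decorated with a bare `@step` in `source`."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            for dec in node.decorator_list:
                if isinstance(dec, ast.Name) and dec.id == "step":
                    names.append(node.name)
    return names


if __name__ == "__main__":
    print("metaflow importable:", module_available("metaflow"))
    print("missing AWS variables:", missing_aws_vars())
```

An empty result from `missing_aws_vars()` does not prove the credentials are valid, and a non-empty one is not necessarily a problem if credentials were stored with `aws configure` instead of environment variables. Passing the text of a flow file to `step_methods` gives a quick way to spot a method that lost its `@step` decorator.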
Warnings
- gotcha Outerbounds is primarily a distribution and configuration layer for Metaflow. It does not introduce new Python APIs compared to `metaflow`. All core functionalities and imports come directly from the `metaflow` package.
- gotcha The primary value of `outerbounds` comes from its pre-configured cloud integration (AWS, Kubernetes). While you can run flows locally, leveraging its full capabilities requires proper cloud setup and authentication.
- gotcha When running flows in a distributed cloud environment, ensure all necessary Python dependencies are specified. `outerbounds` bundles common ML dependencies, but custom or less common packages may need to be explicitly listed using the `@conda` or `@pypi` step decorators (or their flow-level counterparts `@conda_base` and `@pypi_base`) in your Metaflow code.
- breaking While `outerbounds` aims for stability, underlying `metaflow` versions may introduce breaking changes. Upgrading `outerbounds` implicitly upgrades `metaflow` and its dependencies, which can occasionally lead to unexpected behavior if your code relies on deprecated `metaflow` features.
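Because an `outerbounds` upgrade can change the installed `metaflow` version, it can help to record both versions before and after upgrading. The following is a small sketch using the standard library's `importlib.metadata`; the function names `installed_version` and `report_versions` are hypothetical helpers, not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report_versions(packages=("outerbounds", "metaflow")):
    """Map each package name to its installed version (None if not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing the output from before and after an upgrade shows whether `metaflow` moved along with `outerbounds`, which narrows down where a behavior change came from.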
Install
-
pip install outerbounds
Imports
- FlowSpec
from metaflow import FlowSpec
- step
from metaflow import step
- current
from metaflow import current
- project
wrong: from outerbounds import project
correct: from metaflow import project
Quickstart
from metaflow import FlowSpec, step


class MyFirstFlow(FlowSpec):
    @step
    def start(self):
        # Attributes assigned to self are passed on to later steps.
        self.message = 'Hello, Outerbounds!'
        print(self.message)
        self.next(self.end)

    @step
    def end(self):
        # self.message was set in the start step.
        print(f"Flow finished with message: {self.message}")


if __name__ == '__main__':
    MyFirstFlow()
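A flow file like the one above is executed through Metaflow's `run` subcommand, for example `python my_first_flow.py run`. The sketch below wraps that command in Python for use from a script or test. The file name `my_first_flow.py` and the helper names `build_run_command` and `run_flow` are assumptions for illustration only.

```python
import subprocess
import sys


def build_run_command(flow_path):
    """Build the command that runs a flow file with Metaflow's `run` subcommand."""
    return [sys.executable, flow_path, "run"]


def run_flow(flow_path):
    """Run the flow in a child process and return its exit code."""
    return subprocess.run(build_run_command(flow_path)).returncode
```

Calling `run_flow("my_first_flow.py")` uses the same interpreter as the calling script, so the flow runs in the environment where `outerbounds` and `metaflow` are installed. A non-zero return value indicates the run failed.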