{"id":4108,"library":"metaflow","title":"Metaflow","description":"Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Originally developed at Netflix, it provides a unified API to the infrastructure stack required for data science projects, from prototype to production. It is actively maintained with frequent patch releases.","status":"active","version":"2.19.22","language":"en","source_language":"en","source_url":"https://github.com/Netflix/metaflow","tags":["MLOps","workflow orchestration","data science","machine learning","cloud computing","AWS"],"install":[{"cmd":"pip install metaflow","lang":"bash","label":"Install Metaflow"}],"dependencies":[{"reason":"Used for integrations with AWS services like S3, Batch, and Step Functions. Metaflow is tightly integrated with AWS.","package":"boto3","optional":true},{"reason":"Required for Kubernetes and Argo Workflows orchestration.","package":"kubernetes","optional":true},{"reason":"Used for managing Python environments and dependencies via the @conda decorator.","package":"conda","optional":true},{"reason":"An increasingly popular package manager supported by Metaflow (new in 2.15.8) for managing PyPI dependencies.","package":"uv","optional":true}],"imports":[{"note":"The base class for defining a Metaflow workflow.","symbol":"FlowSpec","correct":"from metaflow import FlowSpec"},{"note":"A decorator to mark a method as a step in the workflow DAG.","symbol":"step","correct":"from metaflow import step"},{"note":"Used to define command-line parameters for a flow.","symbol":"Parameter","correct":"from metaflow import Parameter"},{"note":"Decorator for creating rich UI cards for steps.","symbol":"card","correct":"from metaflow import card"},{"note":"Decorator for managing step-specific PyPI dependencies for local and remote execution.","symbol":"pypi","correct":"from metaflow import pypi"},{"note":"Decorator for managing step-specific Conda 
dependencies for local and remote execution.","symbol":"conda","correct":"from metaflow import conda"}],"quickstart":{"code":"from metaflow import FlowSpec, step\n\nclass HelloFlow(FlowSpec):\n    \"\"\"A simple Metaflow flow that prints 'Hi'.\"\"\"\n\n    @step\n    def start(self):\n        \"\"\"This is the 'start' step. All flows must have a step named 'start'.\"\"\"\n        print(\"HelloFlow is starting.\")\n        self.message = \"Metaflow says: Hi!\"\n        self.next(self.hello)\n\n    @step\n    def hello(self):\n        \"\"\"A step for Metaflow to introduce itself.\"\"\"\n        print(self.message)\n        self.next(self.end)\n\n    @step\n    def end(self):\n        \"\"\"This is the 'end' step. All flows must have an 'end' step.\"\"\"\n        print(\"HelloFlow is all done.\")\n\nif __name__ == \"__main__\":\n    HelloFlow()","lang":"python","description":"This quickstart defines a basic Metaflow workflow. It consists of three sequential steps: `start`, `hello`, and `end`. The `start` step initializes a message, the `hello` step prints it, and the `end` step marks the completion of the flow. To run this, save it as a Python file (e.g., `hello_flow.py`) and execute `python hello_flow.py run` in your terminal."},"warnings":[{"fix":"Consult the `Release Notes` section on the Metaflow documentation or GitHub releases page for specific breaking changes and migration steps before upgrading.","message":"While Metaflow generally prioritizes backward compatibility, minor breaking changes can occur, especially in patch versions addressing bug fixes or internal architectural improvements. 
Always review the GitHub release notes before upgrading.","severity":"breaking","affected_versions":"All versions (check release notes for specifics)"},{"fix":"Declare all external Python dependencies with `@pypi(packages={'package_name': 'version'})` or `@conda(packages={'package_name': 'version'})` on individual `@step` methods, or with `@pypi_base` or `@conda_base` on the `FlowSpec` class to apply them flow-wide, and run the flow with `--environment=pypi` or `--environment=conda`.","message":"When scaling Metaflow flows to remote compute environments (e.g., AWS Batch, Kubernetes), locally `pip install`'d or `conda install`'d third-party dependencies are not automatically available. You must explicitly declare these dependencies using the step-level `@pypi` or `@conda` decorators, or the flow-level `@pypi_base` or `@conda_base` decorators, to ensure reproducibility and correct execution in remote environments.","severity":"gotcha","affected_versions":"All versions when using remote execution."},{"fix":"Review Metaflow's documentation for specific cloud provider integrations to understand the scope of support and any necessary configurations for non-AWS environments.","message":"Metaflow's most mature and battle-tested integrations are with Amazon Web Services (AWS), including S3 for storage, Batch for compute, and Step Functions for orchestration. While it supports other cloud providers like Azure and GCP, the level of integration and available features may vary, potentially requiring more manual configuration.","severity":"gotcha","affected_versions":"All versions when using non-AWS cloud providers."},{"fix":"Install and configure Windows Subsystem for Linux (WSL) and then install Metaflow within the WSL environment.","message":"Metaflow does not offer native support for Windows operating systems. Users on Windows must utilize the Windows Subsystem for Linux (WSL) to install and run Metaflow, as it relies on a *nix-like environment.","severity":"gotcha","affected_versions":"All versions on Windows."},{"fix":"Always pass data between steps by assigning it to instance variables (e.g., `self.data_artifact = value`). 
Avoid relying on global state or external files for inter-step communication, as Metaflow handles serialization and deserialization of `self.` attributes automatically.","message":"Data artifacts (instance variables prefixed with `self.`) are automatically persisted and passed between steps. Directly relying on global variables or modifying external state outside of Metaflow's artifact management can lead to non-reproducible runs, especially in distributed or resumed executions, as these changes might not be tracked or correctly restored.","severity":"gotcha","affected_versions":"All versions."}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}