Metaflow

2.19.22 · active · verified Sat Apr 11

Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Originally developed at Netflix, it provides a unified API to the infrastructure stack required for data science projects, from prototype to production. It is actively maintained with frequent patch releases.

Warnings

Install

Imports

Quickstart

This quickstart defines a basic Metaflow workflow with three sequential steps: `start`, `hello`, and `end`. The `start` step stores a message on `self`, the `hello` step reads that attribute and prints it, and the `end` step marks the completion of the flow. Each step names its successor by calling `self.next(...)`. To run the flow, save the code as a Python file (e.g., `hello_flow.py`) and execute `python hello_flow.py run` in your terminal.

from metaflow import FlowSpec, step

class HelloFlow(FlowSpec):
    """A simple Metaflow flow that prints a greeting."""

    @step
    def start(self):
        """This is the 'start' step. All flows must have a step named 'start'."""
        print("HelloFlow is starting.")
        self.message = "Metaflow says: Hi!"
        self.next(self.hello)

    @step
    def hello(self):
        """A step for Metaflow to introduce itself."""
        print(self.message)
        self.next(self.end)

    @step
    def end(self):
        """This is the 'end' step. All flows must have an 'end' step."""
        print("HelloFlow is all done.")

if __name__ == "__main__":
    HelloFlow()
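
To make the control flow of the quickstart concrete, the following plain-Python sketch mimics how a linear chain of steps runs in order and how an attribute set in one step is visible in the next. It is a toy illustration only, not Metaflow's implementation: `ToyFlow` and `run_linear` are hypothetical names invented here, and the sketch does not import Metaflow. Real Metaflow does considerably more, such as persisting the attributes between steps and running each step as its own task.

```python
# Toy illustration only: a plain-Python stand-in for linear step chaining.
# ToyFlow and run_linear are hypothetical names, not part of Metaflow.


class ToyFlow:
    def __init__(self):
        self._next_step = None  # set by next(), read by the runner

    def next(self, step_method):
        # Record which step should run after the current one returns.
        self._next_step = step_method

    def start(self):
        self.message = "Metaflow says: Hi!"
        self.next(self.hello)

    def hello(self):
        # The attribute assigned in start() is visible here.
        self.seen = self.message
        self.next(self.end)

    def end(self):
        # No next() call, so the runner stops after this step.
        pass


def run_linear(flow):
    """Run steps from start until a step does not name a successor."""
    order = []
    step = flow.start
    while step is not None:
        flow._next_step = None  # clear so a missing next() ends the run
        step()
        order.append(step.__name__)
        step = flow._next_step
    return order


flow = ToyFlow()
executed = run_linear(flow)
print(executed)
```

The runner follows whatever successor each step registers, which is why the order of the method definitions in the class does not determine the execution order.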
