{"id":10224,"library":"sf-hamilton","title":"Hamilton","description":"Hamilton (current version 1.89.0) is a Python micro-framework for defining dataflows as functions, enabling modular, testable, and maintainable data pipelines. It represents data transformations as a directed acyclic graph (DAG) where nodes are Python functions and edges are dependencies, making it easy to build complex dataframes. It has an active release cadence with frequent updates.","status":"active","version":"1.89.0","language":"en","source_language":"en","source_url":"https://github.com/DAGWorks-Inc/hamilton","tags":["data-engineering","data-science","workflow","dataframe","DAG","ETL"],"install":[{"cmd":"pip install sf-hamilton","lang":"bash","label":"Install core library"},{"cmd":"pip install \"sf-hamilton[pandas, visualization]\"","lang":"bash","label":"Install with common extras"}],"dependencies":[{"reason":"Fundamental for working with DataFrames, which is Hamilton's primary use case. Often installed as part of a `[pandas]` extra or by the user's project.","package":"pandas","optional":true},{"reason":"Required for visualizing the DAG (e.g., `driver.visualize_execution()`). 
Also requires the system-level `graphviz` executable.","package":"pygraphviz","optional":true}],"imports":[{"symbol":"Driver","correct":"from hamilton import driver"},{"symbol":"function_modifiers","correct":"from hamilton import function_modifiers as fm"}],"quickstart":{"code":"import sys\n\nimport pandas as pd\n\nfrom hamilton import base, driver\nfrom hamilton import function_modifiers as fm\n\n# Define functions representing nodes in the DAG\ndef initial_transactions() -> pd.DataFrame:\n    \"\"\"Simulate initial transaction data.\"\"\"\n    return pd.DataFrame({\n        'user_id': [1, 1, 2, 2, 3],\n        'amount': [10.0, 15.0, 5.0, 20.0, 30.0],\n        'date': pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-01', '2024-01-03', '2024-01-02'])\n    })\n\ndef daily_spend(initial_transactions: pd.DataFrame) -> pd.DataFrame:\n    \"\"\"Calculate daily spend per user.\"\"\"\n    return initial_transactions.groupby(['user_id', 'date'])['amount'].sum().reset_index()\n\n@fm.config.when(period='30_day')\ndef avg_spend__30_day(daily_spend: pd.DataFrame) -> pd.DataFrame:\n    \"\"\"Calculate average daily spend over a configured 30-day period.\"\"\"\n    # In a real scenario, this would filter for the last 30 days\n    return daily_spend.groupby('user_id')['amount'].mean().reset_index().rename(columns={'amount': 'avg_30_day_spend'})\n\n# Build the driver from the functions defined in this script.\n# DictResult returns the requested outputs as a plain dict.\nadapter = base.SimplePythonGraphAdapter(base.DictResult())\ndr = driver.Driver({'period': '30_day'}, sys.modules[__name__], adapter=adapter)\n\n# `config.when` strips the `__30_day` suffix, so the node is named `avg_spend`\nresult = dr.execute(final_vars=['avg_spend'])\n\nprint(result['avg_spend'])","lang":"python","description":"This quickstart defines a simple dataflow: initial transactions are aggregated into daily spend, and the average daily spend per user is then computed for a configured period. It demonstrates function-based node definition, a function modifier (`@fm.config.when`), and running the `Driver` against the current module to obtain one output. Because `config.when` strips the `__30_day` suffix from the function name, the output is requested as `avg_spend`."},"warnings":[{"fix":"Consult the official migration guide for versions 1.0.0 and above. 
Redesign dataflow functions to align with the new Hamilton paradigm.","message":"Major Refactor in Version 1.0.0. If upgrading from versions prior to 1.0.0, expect significant breaking changes, including how configuration is handled and the removal of `base_functions`. The `function_modifiers.extract_fields` replaced older patterns.","severity":"breaking","affected_versions":"<1.0.0"},{"fix":"Ensure that the parameter names in your functions exactly match the names of the functions producing their required inputs, or the names of initial inputs provided to the Driver.","message":"Function Parameter Naming is Crucial. Hamilton resolves dependencies by matching function parameter names to other function names (or to configuration and input names). A parameter name that matches no function is treated as an external input, so a typo surfaces as a missing-input error instead of linking to the intended upstream function.","severity":"gotcha","affected_versions":"All"},{"fix":"Pass the name of every desired output to `final_vars` in `Driver.execute()`. With `Driver.materialize()`, request outputs through the materializers and, for extra values to return, `additional_vars`.","message":"Outputs Must Be Explicit. `Driver.execute()` computes only the nodes listed in its `final_vars` parameter, plus their upstream dependencies; `Driver.materialize()` likewise computes only what its materializers and `additional_vars` require. A function that is defined but is neither requested nor a dependency of a requested output is not executed.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Check the spelling of `some_output_name`. Ensure the function `def some_output_name(...)` is defined in a module that is passed to the `Driver` (or in the script itself, if the script's module is passed). If the function is decorated with `config.when`, request it without the `__suffix`.","cause":"The requested output function `some_output_name` does not exist in the DAG. 
This could be due to a typo, the function not being defined, or its module not being passed to the `Driver`.","error":"hamilton.graph.GraphException: Nodes ['some_output_name'] were not found in the graph."},{"fix":"Define a function `def some_dependency_name(...)` to provide the required input, or supply `some_dependency_name` as an initial value, either in the config passed to the `Driver` constructor or through the `inputs` argument of `execute` (e.g., `dr.execute(final_vars=[...], inputs={'some_dependency_name': ...})`).","cause":"The function `my_transform` requires an input named `some_dependency_name`, but no function named `some_dependency_name` exists in the graph, nor was it provided as an initial input to the `Driver`.","error":"TypeError: Missing required parameter for function 'my_transform': 'some_dependency_name'"},{"fix":"Install `pygraphviz` via `pip install \"sf-hamilton[visualization]\"`. For the underlying `graphviz` command-line tool, install it via your system's package manager (e.g., `sudo apt-get install graphviz` on Debian/Ubuntu, `brew install graphviz` on macOS) and ensure it's in your system's PATH.","cause":"You are trying to visualize the DAG (e.g., `driver.visualize_execution()`) but the necessary visualization tools (`pygraphviz` Python package and/or the `graphviz` system tool) are not installed or not in the system PATH.","error":"ModuleNotFoundError: No module named 'pygraphviz' or OSError: failed to execute ['dot', '-V'], exit code 1, stderr: b'sh: dot: command not found\\n'"}]}