{"id":8060,"library":"dagster-snowflake-pandas","title":"Dagster Snowflake Pandas Integration","description":"The `dagster-snowflake-pandas` library provides a robust integration for using Pandas DataFrames with Snowflake within the Dagster data orchestration framework. It enables reading and writing Pandas DataFrames directly to and from Snowflake tables via Dagster's I/O manager system. This package is currently at version 0.29.0 and typically aligns its release cadence with the main Dagster core library.","status":"active","version":"0.29.0","language":"en","source_language":"en","source_url":"https://github.com/dagster-io/dagster/tree/master/python_modules/libraries/dagster-snowflake-pandas","tags":["dagster","snowflake","pandas","etl","data-integration","io-manager","dataframe"],"install":[{"cmd":"pip install dagster-snowflake-pandas","lang":"bash","label":"Install dagster-snowflake-pandas"}],"dependencies":[{"reason":"Core Dagster framework for asset and resource definitions.","package":"dagster"},{"reason":"Required for DataFrame manipulation and storage.","package":"pandas"},{"reason":"Provides the underlying Snowflake I/O manager and resource functionality. 
Installed automatically as a dependency of `dagster-snowflake-pandas`.","package":"dagster-snowflake"},{"reason":"The official Python connector for Snowflake, used internally by the Dagster integration.","package":"snowflake-connector-python"}],"imports":[{"symbol":"SnowflakePandasIOManager","correct":"from dagster_snowflake_pandas import SnowflakePandasIOManager"}],"quickstart":{"code":"import pandas as pd\nimport os\nfrom dagster import asset, Definitions, EnvVar\nfrom dagster_snowflake_pandas import SnowflakePandasIOManager\n\n@asset\ndef my_pandas_table() -> pd.DataFrame:\n    # Example: Create a simple Pandas DataFrame\n    data = {'col1': [1, 2], 'col2': ['A', 'B']}\n    df = pd.DataFrame(data)\n    return df\n\n@asset\ndef downstream_asset(my_pandas_table: pd.DataFrame) -> None:\n    # Example: Use the DataFrame loaded from Snowflake.\n    # Annotated `-> None` so the Snowflake I/O manager is not asked to store a non-DataFrame output.\n    print(f\"Loaded DataFrame from Snowflake:\\n{my_pandas_table}\")\n\ndefs = Definitions(\n    assets=[my_pandas_table, downstream_asset],\n    resources={\n        \"io_manager\": SnowflakePandasIOManager(\n            account=EnvVar(\"SNOWFLAKE_ACCOUNT\"),\n            user=EnvVar(\"SNOWFLAKE_USER\"),\n            password=EnvVar(\"SNOWFLAKE_PASSWORD\"),\n            database=EnvVar(\"SNOWFLAKE_DATABASE\"),\n            # EnvVar takes only the variable name, so optional values with a\n            # fallback are read with os.getenv when the definitions are loaded.\n            schema=os.getenv(\"SNOWFLAKE_SCHEMA\", \"public\"),\n            warehouse=os.getenv(\"SNOWFLAKE_WAREHOUSE\", \"compute_wh\"),\n        )\n    },\n)\n\n# To run this locally, set the following environment variables:\n# os.environ[\"SNOWFLAKE_ACCOUNT\"] = \"your_account_identifier\"\n# os.environ[\"SNOWFLAKE_USER\"] = \"your_username\"\n# os.environ[\"SNOWFLAKE_PASSWORD\"] = \"your_password\"\n# os.environ[\"SNOWFLAKE_DATABASE\"] = \"your_database\"\n# os.environ[\"SNOWFLAKE_SCHEMA\"] = \"your_schema\" # Optional, defaults to 
'public'\n# os.environ[\"SNOWFLAKE_WAREHOUSE\"] = \"your_warehouse\" # Optional, defaults to 'compute_wh'\n\n# Example of how you might test this in-process (requires valid Snowflake credentials):\n# from dagster import materialize\n# if __name__ == \"__main__\":\n#     result = materialize([my_pandas_table, downstream_asset], resources=defs.resources)\n#     assert result.success\n","lang":"python","description":"This quickstart configures `SnowflakePandasIOManager` as the default I/O manager so that Dagster stores and loads Pandas DataFrames in Snowflake. It defines two assets: one that creates a DataFrame, which is written to a Snowflake table, and another that receives that table loaded back as a DataFrame. Snowflake connection details are read from environment variables rather than hard-coded."},"warnings":[{"fix":"Be explicit about timezones in Pandas DataFrames. Keep `store_timestamps_as_strings=False` (the default) in the `SnowflakePandasIOManager` config if you want `TIMESTAMP` types in Snowflake, and stay mindful of timezone conversions. If you require exact timestamp representation, ensure your Pandas DataFrames have explicit timezones (e.g., UTC) before writing to Snowflake.","message":"Handling of Pandas timestamp data in Snowflake can be problematic. The underlying `snowflake-connector-python` may corrupt timestamp data without timezones or convert non-UTC timestamps to UTC. `dagster-snowflake-pandas` attempts to mitigate this by assigning UTC by default, or by converting timestamps to strings if `store_timestamps_as_strings=True` is configured, but this can lead to unexpected type changes or data loss if not carefully managed.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure your Pandas DataFrame column names adhere to Snowflake's valid identifier rules (start with a letter or underscore, contain only letters, numbers, and underscores). 
Alternatively, rename the offending columns in a preprocessing step before returning the DataFrame; if the original names must be preserved in Snowflake, you might need to implement a custom type handler that enforces quoting.","message":"Beginning with Dagster core versions around 1.6.x (and corresponding `dagster-snowflake-pandas` versions), the `SnowflakePandasIOManager` changed its behavior regarding column identifiers. It now explicitly sets `quote_identifiers=False` when writing to Snowflake. This can cause `SQL compilation error: invalid identifier` if your Pandas DataFrame column names are not valid Snowflake identifiers (e.g., contain spaces, start with numbers).","severity":"breaking","affected_versions":"Versions 0.22.x and above (corresponding to Dagster core 1.6.x and above)"},{"fix":"Always install `dagster` and `dagster-snowflake-pandas` (and other `dagster-*` libraries) with compatible versions. Upgrade all `dagster`-related packages together, aligning the library versions with your core `dagster` version (e.g., if core is 1.13.0, use `dagster-snowflake-pandas~=0.29.0`).","message":"`dagster-snowflake-pandas` releases are tightly coupled with `dagster` core releases. For example, `dagster-snowflake-pandas==0.29.0` is released alongside `dagster==1.13.0`. Mismatched versions between the core framework and libraries can lead to unexpected behavior or runtime errors.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"To resolve this, either drop and recreate the Snowflake table with the timestamp column as `VARCHAR`, or set `store_timestamps_as_strings=False` in your `SnowflakePandasIOManager` configuration to allow Dagster to write Pandas timestamps directly to Snowflake `TIMESTAMP` columns. 
Ensure Pandas DataFrames have explicit timezones if storing as `TIMESTAMP` to avoid potential data corruption.","cause":"The `SnowflakePandasIOManager` was configured to convert Pandas timestamp columns to strings (`store_timestamps_as_strings=True`), but the target column in Snowflake already exists and is defined as a `TIMESTAMP` type.","error":"DagsterInvariantViolationError: Snowflake I/O manager configured to convert time data in DataFrame column 'my_timestamp_column' to strings, but the corresponding MY_TIMESTAMP_COLUMN column in table 'MY_TABLE' is not of type VARCHAR, it is of type TIMESTAMP."},{"fix":"Rename your Pandas DataFrame columns to be valid Snowflake identifiers before returning the DataFrame from your asset function. Valid identifiers typically start with a letter or underscore and contain only alphanumeric characters and underscores (e.g., change `5_stars` to `_5_stars` or `five_stars`).","cause":"Your Pandas DataFrame contains column names that are not valid Snowflake identifiers (e.g., they start with a number or contain spaces/special characters that require quoting). Recent versions of `SnowflakePandasIOManager` explicitly set `quote_identifiers=False` when writing data, which means invalid names will cause SQL errors.","error":"snowflake.connector.errors.ProgrammingError: SQL compilation error: invalid identifier '5_STARS'"},{"fix":"Install the library using `pip install dagster-snowflake-pandas`. If you use a virtual environment, ensure it is activated. If you use a dependency manager such as `pip-tools` or `Poetry`, ensure the package is added to your project's dependencies and installed.","cause":"The `dagster-snowflake-pandas` library has not been installed in your Python environment, or the active environment is not the one where it was installed.","error":"ModuleNotFoundError: No module named 'dagster_snowflake_pandas'"}]}