Snowflake Snowpark Python

Version 1.48.0 · verified Tue May 12 · auth: no · python install: draft

Snowflake Snowpark for Python provides an intuitive API for querying and processing data in Snowflake using Python. It enables data engineers and data scientists to build scalable data pipelines and machine learning workflows directly within Snowflake, leveraging its elastic and secure engine. The library is actively maintained, with releases typically every few weeks bringing new features, improvements, and bug fixes.

pip install snowflake-snowpark-python
error ModuleNotFoundError: No module named 'snowflake.snowpark'
cause The `snowflake-snowpark-python` library is not installed in the Python environment or the environment is not correctly activated.
fix Run `pip install snowflake-snowpark-python` in your terminal (with the correct virtual environment activated) to install the library.
error snowflake.snowpark.exceptions.SnowparkClientException: Failed to connect to Snowflake. Please check your connection parameters.
cause One or more required connection parameters (e.g., account, user, password, role, warehouse, database, schema) are missing, incorrect, or the provided credentials are invalid.
fix Verify all connection parameters are correctly provided in the dictionary passed to `Session.builder.configs()` and ensure your network can reach Snowflake.
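A quick pre-flight check can surface missing parameters before the connection attempt fails. This is a plain-Python sketch (no Snowpark calls); the key names mirror the dictionary expected by `Session.builder.configs()`:

```python
REQUIRED_KEYS = ("account", "user", "password", "role", "warehouse", "database", "schema")

def missing_connection_params(cfg: dict) -> list:
    """Return the required keys that are absent or empty in cfg."""
    return [k for k in REQUIRED_KEYS if not cfg.get(k)]

cfg = {"account": "myorg-myaccount", "user": "alice", "password": "secret",
       "role": "ANALYST", "warehouse": "COMPUTE_WH", "database": "DEMO_DB"}
print(missing_connection_params(cfg))  # → ['schema']
```

Running this before `Session.builder.configs(cfg).create()` turns a vague connection failure into a precise list of what is missing.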
error AttributeError: 'SessionBuilder' object has no attribute 'account'
cause The Snowpark `SessionBuilder` API uses a single `configs()` method to pass all connection parameters as a dictionary, rather than individual setter methods like `account()`, `user()`, etc.
fix Use `Session.builder.configs({'account': 'your_account', 'user': 'your_user', 'password': 'your_password'}).create()` to build the session.
error snowflake.snowpark.exceptions.SnowparkSQLException: SQL compilation error: Statement is too large or complex to compile.
cause The sequence of Snowpark DataFrame transformations generates an underlying SQL query that exceeds Snowflake's internal limits for statement size or complexity.
fix Break complex DataFrame pipelines into smaller steps: materialize intermediate results with `.cache_result()`, or persist them with `df.write.save_as_table(...)` and continue from the saved table.
breaking Snowpark Python has dropped support for Python 3.8. Version 1.24.0 was the last release to support it, and that release already emitted deprecation warnings when running on 3.8.
fix Upgrade your Python environment to 3.9 or greater. The library now requires Python >=3.9, <3.14.
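A fail-fast interpreter check (a sketch; the bounds mirror the `>=3.9, <3.14` requirement above) can turn an obscure install-time failure into a clear message:

```python
import sys

def snowpark_supported(version_info=sys.version_info) -> bool:
    """True if the interpreter falls in Snowpark's supported range (>=3.9, <3.14)."""
    return (3, 9) <= tuple(version_info[:2]) < (3, 14)

print(snowpark_supported((3, 8, 18)))  # → False
print(snowpark_supported((3, 12, 1)))  # → True
```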
gotcha The default 'overwrite' mode for `DataFrameWriter.save_as_table` drops and recreates the target table, so rows not present in the new data are lost and existing grants on the table are dropped. This surprises users who expect a partial update.
fix For targeted delete-insert operations, use the `overwrite_condition` parameter (available since v1.44.0) with `mode='overwrite'`. If you need to preserve grants, specify `copy_grants=True` where applicable.
gotcha When registering UDFs/SPROCs, specifying an empty list (`[]`) for the `imports` or `packages` argument now explicitly means *no* imports/packages for that specific UDF/SPROC. This behavior changed from older versions where an empty list implicitly meant using session-level imports/packages.
fix To use session-level imports or packages, pass `None` or omit the `imports`/`packages` argument. To explicitly use no imports/packages, pass an empty list `[]`.
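The `None` vs `[]` distinction can be illustrated with a small stand-in function (this is not Snowpark's internal code, just a model of the resolution rule):

```python
def resolve_packages(explicit, session_level):
    """Mimic the resolution rule: None inherits the session-level list, [] means none."""
    if explicit is None:
        return list(session_level)
    return list(explicit)

session_pkgs = ["numpy", "pandas"]
print(resolve_packages(None, session_pkgs))  # → ['numpy', 'pandas']
print(resolve_packages([], session_pkgs))    # → []
```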
bug A bug existed where `Session.udf.register_from_file` did not properly process the `strict` and `secure` parameters, potentially leading to UDFs not being created with the intended security or null-handling characteristics.
fix Upgrade to `snowflake-snowpark-python` version 1.47.0 or higher.
gotcha Managing Python packages not available in Snowflake's Anaconda channel for UDFs and stored procedures can be complex. These often require manual zipping and uploading to Snowflake stages, and careful management of `imports` and `packages` parameters.
fix Prioritize packages from Snowflake's Anaconda channel. For custom or unavailable packages, zip them and upload to a Snowflake stage. Reference these staged files using the `imports` parameter when registering UDFs/SPROCs.
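The zipping step can be done with the standard library alone. A minimal sketch (the directory and stage names are hypothetical; the resulting file is referenced via `imports=['@my_stage/my_pkg.zip']` when registering the UDF/SPROC):

```python
import pathlib
import zipfile

def zip_package(src_dir: str, out_zip: str) -> str:
    """Zip a local Python package directory so it can be uploaded to a stage."""
    src = pathlib.Path(src_dir)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the zip root is the package name
                zf.write(path, path.relative_to(src.parent))
    return out_zip

# Upload afterwards with, e.g.:
# session.file.put("my_pkg.zip", "@my_stage", auto_compress=False)
```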
breaking Attempting to create a Snowpark session with incorrect or incomplete connection parameters can result in an `HttpError: 404 Not Found` during the login process, indicating that the Snowflake endpoint could not be reached or is invalid.
fix Ensure all required connection parameters (e.g., `account`, `user`, `password`/`authenticator`, `role`, `warehouse`, `database`, `schema`) are correctly provided and formatted. Double-check the account identifier and region in your connection string. Refer to the Snowflake documentation for correct connection string formats and parameter requirements.
breaking Building `snowflake-connector-python` fails due to a missing C/C++ compiler toolchain (e.g., for Arrow support) in the environment. This error typically appears as 'g++: No such file or directory' during the wheel build process.
fix Install the required C/C++ compiler toolchain in your environment. For Alpine Linux, use `apk add build-base`. For Debian/Ubuntu, use `apt-get install build-essential`. For other operating systems, refer to their documentation for installing development tools.
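For containerized builds on Alpine, installing the toolchain before `pip install` and removing it afterwards keeps the image small. A sketch for an Alpine-based image (the base tag and cleanup choices are assumptions; a source build may still require additional development headers):

```dockerfile
FROM python:3.12-alpine
# musl-based images have no prebuilt wheels for the connector,
# so pip falls back to a source build that needs a C/C++ toolchain
RUN apk add --no-cache build-base \
 && pip install snowflake-snowpark-python \
 && apk del build-base
```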
python  os / libc      status       install  import  disk
3.9     alpine (musl)  build_error  -        -       -
3.9     alpine (musl)  -            -        -       -
3.9     slim (glibc)   wheel        9.9s     2.68s   110M
3.9     slim (glibc)   -            -        2.48s   109M
3.10    alpine (musl)  build_error  -        -       -
3.10    alpine (musl)  -            -        -       -
3.10    slim (glibc)   wheel        8.8s     2.39s   111M
3.10    slim (glibc)   -            -        2.20s   109M
3.11    alpine (musl)  build_error  -        -       -
3.11    alpine (musl)  -            -        -       -
3.11    slim (glibc)   wheel        7.7s     3.63s   119M
3.11    slim (glibc)   -            -        3.45s   117M
3.12    alpine (musl)  build_error  -        -       -
3.12    alpine (musl)  -            -        -       -
3.12    slim (glibc)   wheel        7.6s     4.24s   119M
3.12    slim (glibc)   -            -        4.87s   117M
3.13    alpine (musl)  build_error  -        -       -
3.13    alpine (musl)  -            -        -       -
3.13    slim (glibc)   wheel        7.4s     3.94s   118M
3.13    slim (glibc)   -            -        4.03s   117M

This quickstart demonstrates how to establish a Snowpark Session, create a DataFrame from local data, perform a basic transformation, and display the results. Connection parameters are loaded from environment variables for secure and flexible setup. Remember to replace placeholder values with your Snowflake account details.

import os
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

# Establish a Snowpark Session using environment variables
# Replace with your actual connection parameters, or configure ~/.snowflake/connections.toml
connection_parameters = {
    "account": os.environ.get("SNOWFLAKE_ACCOUNT", "your_account_identifier"),
    "user": os.environ.get("SNOWFLAKE_USER", "your_username"),
    "password": os.environ.get("SNOWFLAKE_PASSWORD", "your_password"),
    "role": os.environ.get("SNOWFLAKE_ROLE", "your_role"),
    "warehouse": os.environ.get("SNOWFLAKE_WAREHOUSE", "your_warehouse"),
    "database": os.environ.get("SNOWFLAKE_DATABASE", "your_database"),
    "schema": os.environ.get("SNOWFLAKE_SCHEMA", "your_schema"),
}

session = Session.builder.configs(connection_parameters).create()
print("Snowpark Session created successfully.")

try:
    # Create a simple DataFrame
    data = [("Alice", 1), ("Bob", 2), ("Charlie", 3)]
    df = session.create_dataframe(data, schema=["name", "id"])

    # Keep only rows with id > 1 and print them
    df.filter(col("id") > 1).show()
finally:
    # Always close the session, even if a transformation fails
    session.close()
    print("Snowpark Session closed.")