Snowpark Connect

1.21.1 · active · verified Thu Apr 16

Snowpark Connect (current version 1.21.1) lets developers run Snowpark Python code locally against a local Spark cluster, emulating Snowpark functionality without requiring a direct Snowflake connection. This enables offline development, testing, and CI/CD pipelines. Releases typically track Snowpark Python and the underlying Spark/Snowflake connector versions, and the project is actively maintained by Snowflake Labs.
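One way to take advantage of this in CI is to keep session creation behind a small factory, so application code never hard-codes a connection and tests can swap in a local Snowpark Connect session. A minimal stdlib-only sketch, assuming your own wiring (the `SNOWPARK_CONNECT_LOCAL` variable and `make_session` helper are illustrative, not part of the library):

```python
import os

def make_session(local_factory, remote_factory):
    """Return a session from the local (Snowpark Connect) factory when
    SNOWPARK_CONNECT_LOCAL=1 is set, otherwise from the remote factory."""
    if os.environ.get("SNOWPARK_CONNECT_LOCAL") == "1":
        return local_factory()
    return remote_factory()

# In a CI job, export SNOWPARK_CONNECT_LOCAL=1 so tests run offline:
os.environ["SNOWPARK_CONNECT_LOCAL"] = "1"
session = make_session(lambda: "local-session", lambda: "remote-session")
```

In real code the two lambdas would build a Snowpark Connect session and a regular Snowflake-backed session, respectively; the factory keeps that choice out of the code under test.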

Quickstart

This quickstart shows how to initialize a local Snowpark Connect session with `connect_with_spark_session_builder`, obtain a Snowpark session from it, and run a basic DataFrame operation. A Java Runtime Environment (JRE) must be installed and `JAVA_HOME` configured for Spark to start.

from snowpark_connect.session import connect_with_spark_session_builder
from snowflake.snowpark.types import StructType, StructField, StringType, IntegerType

# Create a local Spark session that emulates Snowpark behavior
# Ensure these JARs are compatible with your Spark and Snowflake versions.
spark_session = connect_with_spark_session_builder(
    app_name="SnowparkConnectLocalApp",
    config={
        "spark.jars.packages": "net.snowflake:snowflake-jdbc:3.13.29,net.snowflake:spark-snowflake_2.12:2.11.0-spark_3.4",
        "spark.jars.repositories": "https://repo1.maven.org/maven2"
    }
)

# Use the Spark session to create a Snowpark session
session = spark_session.getOrCreateSnowparkSession()

# Example: Create a Snowpark DataFrame and show its content
schema = StructType([
    StructField("name", StringType()),
    StructField("age", IntegerType())
])
data = [("Alice", 30), ("Bob", 25)]
df = session.create_dataframe(data, schema=schema)
df.show()

session.close()
spark_session.stop()
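Because Spark will not start without Java, a quick preflight check can save a confusing stack trace. A stdlib-only sketch, assuming nothing beyond the quickstart's JRE/`JAVA_HOME` requirement (the `java_configured` helper name is illustrative):

```python
import os
import shutil

def java_configured() -> bool:
    """Return True if a Java runtime looks available: either JAVA_HOME
    points at an existing directory, or a `java` binary is on PATH."""
    java_home = os.environ.get("JAVA_HOME")
    if java_home and os.path.isdir(java_home):
        return True
    return shutil.which("java") is not None

if not java_configured():
    print("Warning: no JRE found; install Java and set JAVA_HOME before starting Spark")
```

Running this before `connect_with_spark_session_builder` surfaces a missing JRE immediately instead of deep inside Spark's launcher.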
