{"id":8657,"library":"snowpark-connect","title":"Snowpark Connect","description":"Snowpark Connect (current version 1.21.1) allows developers to run Snowpark Python code locally using a local Spark cluster, emulating Snowpark functionalities without requiring a direct Snowflake connection. This facilitates offline development, testing, and CI/CD pipelines. It receives updates typically aligned with Snowpark Python and underlying Spark/Snowflake connector releases, and is actively maintained by Snowflake Labs.","status":"active","version":"1.21.1","language":"en","source_language":"en","source_url":"https://github.com/Snowflake-Labs/snowpark-connect","tags":["snowflake","spark","data-processing","etl","local-development","testing"],"install":[{"cmd":"pip install snowpark-connect","lang":"bash","label":"Install Snowpark Connect"}],"dependencies":[{"reason":"Core library that Snowpark Connect emulates for local execution.","package":"snowpark-python","optional":false},{"reason":"The underlying Apache Spark framework used for local execution.","package":"pyspark","optional":false},{"reason":"Aids in locating PySpark installations, especially in non-standard environments.","package":"findspark","optional":true}],"imports":[{"symbol":"connect_with_spark_session_builder","correct":"from snowpark_connect.session import connect_with_spark_session_builder"},{"note":"The actual Snowpark Session class is imported from `snowpark`, not `snowpark_connect`.","wrong":"from snowpark_connect.session import Session","symbol":"Session","correct":"from snowpark.snowpark_session import Session"},{"note":"Snowpark DataFrame objects are part of the `snowpark` library, not `snowpark_connect`.","wrong":"from snowpark_connect.dataframe import DataFrame","symbol":"DataFrame","correct":"from snowpark.dataframe import DataFrame"}],"quickstart":{"code":"import os\nfrom snowpark_connect.session import connect_with_spark_session_builder\nfrom snowpark.types import StructType, StructField, 
StringType, IntegerType\n\n# Create a local Spark session that emulates Snowpark behavior\n# Ensure these JARs are compatible with your Spark and Snowflake versions.\nspark_session = connect_with_spark_session_builder(\n    app_name=\"SnowparkConnectLocalApp\",\n    config={\n        \"spark.jars.packages\": \"net.snowflake:snowflake-jdbc:3.13.29,net.snowflake:spark-snowflake_2.12:2.11.0-spark_3.4\",\n        \"spark.jars.repositories\": \"https://repo1.maven.org/maven2\"\n    }\n)\n\n# Use the Spark session to create a Snowpark session\nsession = spark_session.getOrCreateSnowparkSession()\n\n# Example: Create a Snowpark DataFrame and show its content\nschema = StructType([\n    StructField(\"name\", StringType()),\n    StructField(\"age\", IntegerType())\n])\ndata = [(\"Alice\", 30), (\"Bob\", 25)]\ndf = session.create_dataframe(data, schema=schema)\ndf.show()\n\nsession.close()\nspark_session.stop()","lang":"python","description":"This quickstart demonstrates how to initialize a local Snowpark Connect session using `connect_with_spark_session_builder`, create a Snowpark session from it, and perform a basic DataFrame operation. It requires a Java Runtime Environment (JRE) to be installed and `JAVA_HOME` configured for Spark to run."},"warnings":[{"fix":"Ensure your Python environment meets the `requires_python` specification. Consider using `pyenv` or `conda` to manage Python versions.","message":"Snowpark Connect has specific Python version requirements (currently >=3.10, <3.13). 
Using incompatible Python versions can lead to installation failures or runtime errors.","severity":"breaking","affected_versions":"All versions"},{"fix":"Install a compatible Java Runtime Environment (JRE) or Java Development Kit (JDK) (e.g., OpenJDK 8 or 11) and set the `JAVA_HOME` environment variable to its installation directory.","message":"A `java.io.IOException` or similar error indicating 'Cannot run program \"java\"' often means the `JAVA_HOME` environment variable is not set correctly, or Java is not installed or discoverable in your system's PATH.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Refer to the official Snowpark Connect documentation or GitHub README for the recommended `spark.jars.packages` values compatible with your desired Spark and Snowpark Python versions.","message":"Incorrect or outdated `spark.jars.packages` values in the `config` dictionary can lead to runtime errors when Snowpark Connect tries to load Spark-Snowflake connector JARs, preventing proper emulation.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always import `Session`, `DataFrame`, `functions`, etc. from `snowflake.snowpark` (e.g., `from snowflake.snowpark import Session`), and `connect_with_spark_session_builder` from `snowpark_connect.session`.","message":"It's common to confuse imports: `snowpark_connect` provides the session *builder*, but core Snowpark objects like `Session`, `DataFrame`, and `functions` are imported directly from the `snowflake.snowpark` library.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Set the `JAVA_HOME` environment variable to the path of your Java JDK/JRE installation (e.g., `/usr/lib/jvm/java-11-openjdk-amd64`). 
Also ensure Java is in your system's PATH.","cause":"The `JAVA_HOME` environment variable, which points to your Java installation, is not configured.","error":"Error: JAVA_HOME is not set"},{"fix":"Verify that Java is installed and that its `bin` directory is on your system's PATH, or that `JAVA_HOME` points to the Java installation directory whose `bin` subdirectory contains the `java` executable.","cause":"The 'java' executable is not found in your system's PATH, or `JAVA_HOME` is incorrectly set.","error":"java.io.IOException: Cannot run program \"java\": error=2, No such file or directory"},{"fix":"Ensure `connect_with_spark_session_builder()` and `getOrCreateSnowparkSession()` are called successfully before any Snowpark DataFrame operations. Check the Spark configuration, especially `spark.jars.packages`, for correctness.","cause":"This error typically indicates that the Spark session was not properly initialized or was stopped before being used, or that there is an issue with the underlying Spark environment.","error":"org.apache.spark.SparkException: Cannot find any SparkSession in the current JVM."},{"fix":"Confirm `pyspark` is installed (`pip show pyspark`), then add `import findspark; findspark.init()` at the beginning of your script, before importing `pyspark`.","cause":"Although `pyspark` might be installed, Python cannot find it, often due to environment path issues or `findspark` not being initialized.","error":"ModuleNotFoundError: No module named 'pyspark'"}]}