{"library":"pyspark-pandas","title":"PySpark-Pandas","description":"PySpark-Pandas (version 0.0.7) is an early project that aimed to provide tools and algorithms for pandas DataFrames distributed on PySpark. Its last release was in 2016, and the project has since been abandoned. The PyPI description itself advises users to consider alternatives like SparklingPandas, and the official Apache Spark project now includes its own 'Pandas API on Spark' (formerly Koalas) for similar functionality, which is the recommended modern solution.","language":"python","status":"abandoned","last_verified":"Sun May 17","install":{"commands":["pip install pyspark-pandas"],"cli":null},"imports":["from pyspark_pandas import DataFrame"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"# The 'pyspark-pandas' (0.0.7) library is abandoned and lacks a functional, self-contained quickstart example\n# compatible with modern Spark/Python environments.\n# Its primary functionality would have involved wrapping Spark RDDs or DataFrames with a pandas-like interface.\n#\n# For modern 'Pandas API on Spark' functionality, use pyspark.pandas:\nfrom pyspark.sql import SparkSession\nimport pyspark.pandas as ps\nimport pandas as pd\n\n# Create a SparkSession\nspark = SparkSession.builder.appName(\"PandasOnSparkQuickstart\").getOrCreate()\n\n# Create a pandas-on-Spark DataFrame from a pandas DataFrame\npd_df = pd.DataFrame({\"col1\": [1, 2, 3], \"col2\": [4, 5, 6]})\nps_df = ps.from_pandas(pd_df)\n\nprint(\"Pandas-on-Spark DataFrame:\")\nprint(ps_df)\nprint(f\"Type: {type(ps_df)}\")\n\n# Perform a simple operation\nps_df['col3'] = ps_df['col1'] + ps_df['col2']\nprint(\"\\nDataFrame after operation:\")\nprint(ps_df)\n\n# Convert back to a pandas DataFrame (collects data to driver)\npandas_result = ps_df.to_pandas()\nprint(\"\\nResult as pandas DataFrame:\")\nprint(pandas_result)\n\nspark.stop()","lang":"python","description":"The `pyspark-pandas` library (version 0.0.7) is largely unmaintained 
and does not offer a readily available, functional quickstart. The provided code demonstrates a quickstart using the official 'Pandas API on Spark' (`pyspark.pandas`), which is the recommended alternative for distributed pandas-like operations in a modern PySpark environment.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":{"tag":null,"tag_description":null,"last_tested":"2026-05-17","installed_version":"0.0.7","pypi_latest":"0.0.7","is_stale":false,"summary":{"python_range":"3.9–3.13","success_rate":100,"avg_install_s":8.7,"avg_import_s":null,"wheel_type":"sdist"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"166.0M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":8.2,"import_time_s":null,"mem_mb":null,"disk_size":"159M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"179.2M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":7.8,"import_time_s":null,"mem_mb":null,"disk_size":"171M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine 
(musl)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"162.5M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":9.1,"import_time_s":null,"mem_mb":null,"disk_size":"154M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"161.5M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":8.7,"import_time_s":null,"mem_mb":null,"disk_size":"153M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"173.9M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"pyspark-pandas","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":9.9,"import_time_s":null,"mem_mb":null,"disk_size":"169M"}]}}