{"id":14798,"library":"parquet","title":"Parquet","description":"The `parquet` library (parquet-python) is a pure-Python implementation for working with the Apache Parquet file format. As of its last update (version 1.3.1), it offers read-only support for Parquet files, allowing users to extract data as JSON or TSV. The project explicitly states that performance has not been optimized and that many features, including writing, are not implemented. Development appears to have ceased on GitHub in 2017, and the last PyPI upload was in 2020, indicating the project is unmaintained.","status":"abandoned","version":"1.3.1","language":"en","source_language":"en","source_url":"https://github.com/jcrobak/parquet-python","tags":["data serialization","file format","apache parquet","read-only","unmaintained"],"install":[{"cmd":"pip install parquet","lang":"bash","label":"Install base package"},{"cmd":"pip install 'parquet[snappy]'","lang":"bash","label":"Install with Snappy compression support"}],"dependencies":[{"reason":"Required for parsing the Thrift-encoded Parquet file metadata.","package":"thriftpy2","optional":false},{"reason":"Optional dependency for reading Snappy-compressed Parquet files.","package":"python-snappy","optional":true}],"imports":[{"note":"The PyPI examples use `import parquet` and then access methods via `parquet.MethodName`.","wrong":"from parquet import DictReader","symbol":"DictReader","correct":"import parquet\nparquet.DictReader(...)"},{"symbol":"reader","correct":"import parquet\nparquet.reader(...)"}],"quickstart":{"code":"import parquet\nimport json\n\n# parquet-python is read-only: it cannot create Parquet files.\n# Provide an existing 'test.parquet' file (written with another tool,\n# e.g. pyarrow) before running this example.\n\n# Assuming 'test.parquet' contains rows such as:\n# {'foo': 1, 'bar': 2, 'baz': 3}\n# {'foo': 4, 'bar': 5, 'baz': 6}\n\ntry:\n    with open(\"test.parquet\", \"rb\") as fo:\n        print(\"Reading 'test.parquet' with DictReader (columns 'foo', 'bar'):\")\n        for row in parquet.DictReader(fo, columns=['foo', 'bar']):\n            print(json.dumps(row))\n\n    with open(\"test.parquet\", \"rb\") as fo:\n        print(\"\\nReading 'test.parquet' with reader (columns 'foo', 'bar'):\")\n        for row in parquet.reader(fo, columns=['foo', 'bar']):\n            print(\",\".join(str(r) for r in row))\nexcept FileNotFoundError:\n    print(\"Error: 'test.parquet' not found. Please create one for testing.\")","lang":"python","description":"The quickstart demonstrates reading a Parquet file with `DictReader`, which yields rows as dictionaries, and `reader`, which yields rows as lists. This library is strictly read-only: it cannot create Parquet files, so an existing Parquet file must be provided for testing."},"warnings":[{"fix":"For writing Parquet files, use actively maintained libraries such as `pyarrow` or `fastparquet`.","message":"The `parquet` library (parquet-python) is explicitly a read-only implementation of the Parquet format; it does not support writing Parquet files.","severity":"breaking","affected_versions":"All versions (1.0 - 1.3.1)"},{"fix":"Consider migrating to `pyarrow` or `fastparquet` for full feature support, better performance, and active maintenance; `pyarrow` is generally recommended, especially as the pandas ecosystem increasingly relies on it.","message":"This library is largely unmaintained and has not seen significant development since 2017 (GitHub) / 2020 (PyPI). Many features of the Parquet format, including nested data, are not fully implemented or tested, and performance is explicitly stated as 'not yet optimized'.","severity":"gotcha","affected_versions":"All versions (1.0 - 1.3.1)"},{"fix":"On modern Python versions, use `pyarrow` or `fastparquet`, which are actively maintained and support current Python environments.","message":"The library officially supports Python 2.7, 3.6, and 3.7. Compatibility with newer Python versions (3.8+) is not guaranteed and is unlikely to be addressed given the project's abandonment.","severity":"deprecated","affected_versions":"Python 3.8+"},{"fix":"Be aware of potential bugs and incomplete features. For stable, production-ready Parquet handling, `pyarrow` is the recommended choice.","message":"The project is labeled 'Development Status :: 3 - Alpha' on PyPI, indicating an unstable, experimental project despite its age.","severity":"gotcha","affected_versions":"All versions (1.0 - 1.3.1)"},{"fix":"For efficient processing of large datasets, especially with vectorized operations or in parallel computing environments, use `pyarrow` or `fastparquet`.","message":"The `fastparquet` library was forked from `parquet-python` in 2016 specifically because `parquet-python` was 'not designed for vectorised loading of big data or parallel access', highlighting its performance limitations for large-scale data.","severity":"gotcha","affected_versions":"All versions (1.0 - 1.3.1)"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[],"ecosystem":"pypi"}