{"id":857,"library":"deltalake","title":"Delta Lake Python","description":"Deltalake is an open-source Python library providing native Delta Lake bindings based on the `delta-rs` Rust library, offering efficient and robust interaction with Delta Lake tables without requiring Apache Spark or JVM dependencies. It includes seamless integration with data manipulation libraries like Pandas, Polars, and PyArrow. The library is actively developed, with its current version being 1.5.0, and receives frequent updates to enhance performance and features.","status":"active","version":"1.5.0","language":"python","source_language":"en","source_url":"https://github.com/delta-io/delta-rs","tags":["delta lake","data lake","etl","parquet","dataframe","rust","native"],"install":[{"cmd":"pip install deltalake pandas pyarrow","lang":"bash","label":"Install with Pandas and PyArrow integration"}],"dependencies":[{"reason":"Required for DataFrame integration and `write_deltalake` functionality.","package":"pandas","optional":false},{"reason":"Underpins data handling, especially for DataFrame conversions and Arrow-native operations.","package":"pyarrow","optional":false},{"reason":"Needed for pretty-printing DataFrames in some quickstart examples.","package":"tabulate","optional":true}],"imports":[{"symbol":"DeltaTable","correct":"from deltalake import DeltaTable"},{"symbol":"write_deltalake","correct":"from deltalake import write_deltalake"}],"quickstart":{"code":"import os\nimport shutil\n\nimport pandas as pd\nfrom deltalake import DeltaTable, write_deltalake\n\n# Define a Delta Lake table path\ntable_path = \"./tmp_delta_table\"\n\n# Remove any table left over from a previous run for a fresh start\nif os.path.exists(table_path):\n    shutil.rmtree(table_path)\n\n# 1. Create a Pandas DataFrame\ndf = pd.DataFrame({\"id\": [1, 2], \"value\": [\"A\", \"B\"]})\n\n# 2. Write the DataFrame to a Delta Lake table\nwrite_deltalake(table_path, df)\nprint(f\"Initial Delta table created at: {table_path}\")\n\n# 3. Load the Delta table\ndt = DeltaTable(table_path)\nprint(f\"Current table version: {dt.version()}\")\nprint(\"Current table data:\")\n# to_string() needs no optional pandas dependencies (to_markdown() requires tabulate)\nprint(dt.to_pandas().to_string(index=False))\n\n# 4. Append new data to the table\nnew_df = pd.DataFrame({\"id\": [3, 4], \"value\": [\"C\", \"D\"]})\nwrite_deltalake(table_path, new_df, mode=\"append\")\nprint(\"\\nData appended.\")\ndt_updated = DeltaTable(table_path)\nprint(f\"Current table version: {dt_updated.version()}\")\nprint(\"Updated table data:\")\nprint(dt_updated.to_pandas().to_string(index=False))\n\n# 5. Read an older version of the table (Time Travel)\ndt_v0 = DeltaTable(table_path, version=0)\nprint(\"\\nData from version 0 (time travel):\")\nprint(dt_v0.to_pandas().to_string(index=False))\n\n# Clean up temporary files (optional)\n# shutil.rmtree(table_path)","lang":"python","description":"This quickstart demonstrates how to create, append data to, and read different versions (time travel) of a Delta Lake table using `deltalake` and Pandas. It first creates an initial table, then appends new records, and finally shows how to access a previous state of the table by specifying a version."},"warnings":[{"fix":"Update code to expect and handle `pyarrow.Table` objects from `get_add_actions`. Adjust API calls accordingly (e.g., `to_pandas()` might still work on `ArrowTable`).","message":"In `deltalake` v1.5.0, the `get_add_actions` method now returns an `ArrowTable` instead of an `ArrowRecordBatch`. Code relying on the specific `ArrowRecordBatch` type or its API will break.","severity":"breaking","affected_versions":">=1.5.0"},{"fix":"There is currently no direct migration path provided in the library for this specific checkpoint schema change. 
Users might need to recreate tables or use the version of `deltalake` that created the original checkpoints for continued compatibility.","message":"Checkpoint schema changes between `deltalake` versions, notably around `0.25.5` and `1.0.2`, can lead to `DeltaError: Failed to parse parquet: Arrow: Incompatible type` when attempting to read or create checkpoints from older tables, especially if `nullable` properties for fields like `path`, `size`, `modificationTime` changed from `True` to `False`.","severity":"breaking","affected_versions":">=1.0.2 when interacting with tables created by <=0.25.5"},{"fix":"If integrating with Spark, use `delta-spark`. For Python-native operations without Spark, use `deltalake`. Avoid mixing imports or expectations from the two libraries.","message":"The `deltalake` Python library is a native implementation distinct from `delta-spark`. While both interact with Delta Lake, `deltalake` does not require Apache Spark or a JVM. Ensure you are using the correct library for your ecosystem, as `delta-spark` imports (e.g., `from delta.tables import DeltaTable`) are not compatible with `deltalake`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Implement retry logic with exponential backoff and jitter for write operations. Consider partitioning tables strategically to minimize file-level conflicts.","message":"Concurrent write operations (e.g., multiple processes appending or updating a table simultaneously) can lead to `ConcurrentAppendException`, `ConcurrentDeleteReadException`, or `ConcurrentModificationException` due to optimistic concurrency control. While Delta Lake guarantees ACID properties, conflicts require handling.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Regularly run `DeltaTable.vacuum()` on your tables to physically remove stale data files. 
Pass `dry_run=False` to actually delete files, since `vacuum()` defaults to a dry run that only lists them. Be aware that vacuuming breaks time travel to versions older than the retention period (default 7 days).","message":"Operations like `DeltaTable.delete()` or `write_deltalake(mode=\"overwrite\")` only mark files for deletion in the Delta transaction log. The physical files are not immediately removed from storage. This can lead to increased storage costs if not managed.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Use the disk spilling configuration options available for `MERGE` operations in `deltalake` v1.5.0+ when dealing with large datasets.","message":"`MERGE` operations on large datasets can fail with out-of-memory errors unless disk spilling is configured.","severity":"gotcha","affected_versions":">=1.5.0 for disk spilling feature, older versions prone to OOM for large merges"},{"fix":"Ensure the `tabulate` library is installed in your environment (e.g., `pip install tabulate`) if you intend to use `pandas.DataFrame.to_markdown()`.","message":"When using `dt.to_pandas().to_markdown()` to display table data, `pandas` requires the optional `tabulate` library to be installed. Without it, an `ImportError: Missing optional dependency 'tabulate'` will occur.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure all necessary optional dependencies for pandas display methods are installed in your environment (e.g., `pip install tabulate`, `pip install jinja2`). Check pandas documentation for specific requirements of each display function.","message":"When converting Delta tables to pandas DataFrames and then attempting to use display methods like `to_markdown()` or `to_latex()`, you might encounter `ImportError` for packages like `tabulate` or `jinja2`. 
These are optional dependencies for pandas display functionalities, not direct dependencies of `deltalake`.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-05-12T20:28:43.838Z","next_check":"2026-06-27T00:00:00.000Z","problems":[{"fix":"Install the 'deltalake' library using pip: `pip install deltalake`","cause":"The 'deltalake' package is not installed in your Python environment or is not accessible on the Python path.","error":"ModuleNotFoundError: No module named 'deltalake'"},{"fix":"Pass an explicit `schema` to `write_deltalake` when the data is a plain iterable of record batches, or materialize the data into a pandas DataFrame or PyArrow table first, since those carry their own schema. For pandas DataFrames, ensure column data types are consistent so the schema converts cleanly.","cause":"`write_deltalake` was given a generic iterable of record batches (for example a generator) without a `schema`. Unlike a pandas DataFrame or PyArrow table, a plain iterable carries no schema the writer can read, so one must be provided explicitly.","error":"ValueError: you must provide schema if data is iterable"},{"fix":"To add new columns during an append and fill missing ones with nulls, use `schema_mode='merge'`: `write_deltalake(path, data, mode='append', schema_mode='merge')`. To replace the schema entirely, use `mode='overwrite'` with `schema_mode='overwrite'`: `write_deltalake(path, data, mode='overwrite', schema_mode='overwrite')`. Older releases used `overwrite_schema=True` for full replacement; that parameter was deprecated in favor of `schema_mode`.","cause":"Attempting to write data to an existing Delta Lake table where the schema of the incoming data differs from the table's current schema without explicitly handling schema evolution.","error":"ValueError: Schema of data does not 
match the existing table's schema."},{"fix":"Either upgrade to a newer version of `deltalake` if support has been added, disable the unsupported table features if possible (which might require re-creating the table in a compatible way), or use an alternative reader that supports these features (e.g., DuckDB with its Delta Lake extension, or Spark if available).","cause":"The Delta Lake table being read utilizes advanced features (like deletion vectors or column mapping) that are not yet fully supported by the `deltalake` Python library (which is based on `delta-rs`).","error":"DeltaProtocolError: The table has set these reader features: {'deletionVectors'} but these are not yet supported by the deltalake reader."},{"fix":"Update your code to use the new API, replacing `to_pyarrow()` with `to_arrow()`. If using dependent libraries, ensure they are compatible with your `deltalake` version, or pin `deltalake` to a version compatible with your dependencies.","cause":"This error typically occurs due to breaking API changes in `deltalake` version 1.0 or later, where `to_pyarrow()` was renamed to `to_arrow()` or other schema-related methods were updated, leading to incompatibility with older code or dependent libraries like Polars.","error":"AttributeError: type object 'deltalake._internal.Schema' has no attribute 'to_pyarrow'. 
Did you mean: 'to_arrow'?"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":null,"quickstart_tag":null,"pypi_latest":"1.5.1","cli_name":"deltalake","install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","installed_version":"1.2.1","pypi_latest":"1.5.1","is_stale":true,"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.11,"mem_mb":3.5,"disk_size":"456.8M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.08,"mem_mb":3.5,"disk_size":"443.0M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":11.3,"import_time_s":0.06,"mem_mb":3.5,"disk_size":"425M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.06,"mem_mb":3.5,"disk_size":"412M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.15,"mem_mb":4,"disk_size":"471.8M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.18,"mem_mb":4.1,"disk_size":"457.9M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":10.5,"import_time_s":0.13,"mem_mb":4,"disk_size":"439M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.12,"mem_mb":4.1,"disk_size":"426M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.13,"mem_mb":3.5,"disk_size":"456.4M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.14,"mem_mb":3.5,"disk_size":"442.5M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":10.5,"import_time_s":0.11,"mem_mb":3.5,"disk_size":"424M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.12,"mem_mb":3.5,"disk_size":"411M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.12,"mem_mb":3.6,"disk_size":"455.4M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.11,"mem_mb":3.5,"disk_size":"441.4M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":10.6,"import_time_s":0.1,"mem_mb":3.4,"disk_size":"423M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.11,"mem_mb":3.4,"disk_size":"410M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.13,"mem_mb":4.8,"disk_size":"479.3M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.14,"mem_mb":4.8,"disk_size":"479.3M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"deltalake","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":13.3,"import_time_s":0.11,"mem_mb":4.8,"disk_size":"494M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"deltalake","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.11,"mem_mb":4.8,"disk_size":"494M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":null,"tag_description":null,"results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]}}