{"id":2694,"library":"pyiceberg-core","title":"PyIceberg-Core","description":"PyIceberg-core is a foundational Python library that provides a Rust-powered core for PyIceberg, enabling efficient access to Apache Iceberg tables without a JVM. It's primarily intended as an internal dependency for the main PyIceberg library but offers performance optimizations for Iceberg data operations. The current version is 0.9.0, and it is actively maintained as part of the broader Apache Iceberg Python project with frequent releases aligning with PyIceberg.","status":"active","version":"0.9.0","language":"en","source_language":"en","source_url":"https://github.com/apache/iceberg-python","tags":["apache-iceberg","data-lakehouse","table-format","python","data-processing","rust"],"install":[{"cmd":"pip install pyiceberg-core","lang":"bash","label":"Basic Installation"},{"cmd":"pip install \"pyiceberg[pyiceberg-core,pyarrow]\"","lang":"bash","label":"Recommended for PyIceberg with Rust core and PyArrow I/O"}],"dependencies":[{"reason":"pyiceberg-core is an internal dependency and provides an optimized core for the main pyiceberg library.","package":"pyiceberg","optional":false},{"reason":"Commonly used for data interchange and file I/O operations with Iceberg tables.","package":"pyarrow","optional":true},{"reason":"Required for S3 object storage interaction.","package":"s3fs","optional":true},{"reason":"Required for Azure Data Lake Storage interaction.","package":"adlfs","optional":true},{"reason":"Required for Google Cloud Storage interaction.","package":"gcsfs","optional":true},{"reason":"Required for experimental DataFusion integration.","package":"datafusion","optional":true}],"imports":[{"symbol":"load_catalog","correct":"from pyiceberg.catalog import load_catalog"},{"symbol":"Schema","correct":"from pyiceberg.schema import Schema"},{"symbol":"NestedField, StringType, LongType","correct":"from pyiceberg.types import NestedField, StringType, 
LongType"}],"quickstart":{"code":"import os\nimport shutil\nimport pyarrow as pa\nfrom pyiceberg.catalog import load_catalog\nfrom pyiceberg.schema import Schema\nfrom pyiceberg.types import NestedField, StringType, LongType, IntegerType\n\n# Define a temporary warehouse directory\nWAREHOUSE_PATH = \"/tmp/pyiceberg_warehouse\"\nCATALOG_DB_PATH = os.path.join(WAREHOUSE_PATH, \"pyiceberg_catalog.db\")\n\n# Start from a clean warehouse directory\nif os.path.exists(WAREHOUSE_PATH):\n    shutil.rmtree(WAREHOUSE_PATH)\nos.makedirs(WAREHOUSE_PATH, exist_ok=True)\n\n# Configure and load a local SQL catalog backed by SQLite\n# (the SQL catalog needs SQLAlchemy, e.g. via the pyiceberg[sql-sqlite] extra)\ncatalog = load_catalog(\n    \"default\",\n    type=\"sql\",\n    uri=f\"sqlite:///{CATALOG_DB_PATH}\",\n    warehouse=f\"file://{WAREHOUSE_PATH}\"\n)\n\n# Create a namespace (database)\nNAMESPACE = \"my_namespace\"\ncatalog.create_namespace(NAMESPACE, properties={\"comment\": \"My first Iceberg namespace\"})\nprint(f\"Created namespace: {NAMESPACE}\")\n\n# Define a schema for the Iceberg table\nschema = Schema(\n    NestedField(1, \"id\", LongType(), required=True),\n    NestedField(2, \"name\", StringType()),\n    NestedField(3, \"age\", IntegerType())\n)\n\n# Create an Iceberg table\nTABLE_NAME = \"my_table\"\ntable = catalog.create_table(f\"{NAMESPACE}.{TABLE_NAME}\", schema, properties={\n    \"format-version\": \"2\",\n    \"write.parquet.compression-codec\": \"zstd\"\n})\n# Table.name() is a method that returns the identifier as a tuple\nprint(f\"Created table: {'.'.join(table.name())}\")\n\n# Prepare data with PyArrow. The Arrow schema must match the Iceberg schema:\n# id is required (non-nullable) and age is a 32-bit integer. Inferred types\n# (nullable int64) would be rejected by append().\narrow_schema = pa.schema([\n    pa.field(\"id\", pa.int64(), nullable=False),\n    pa.field(\"name\", pa.string()),\n    pa.field(\"age\", pa.int32()),\n])\ndata = pa.table({\n    \"id\": [1, 2, 3],\n    \"name\": [\"Alice\", \"Bob\", \"Charlie\"],\n    \"age\": [30, 24, 35]\n}, schema=arrow_schema)\n\n# Append data to the table\ntable.append(data)\nprint(\"Appended data to the table.\")\n\n# Read data back as a PyArrow table (no pandas required)\nresult = table.scan().to_arrow()\nprint(\"\\nData read from Iceberg table:\")\nprint(result)\n\n# Clean up\nshutil.rmtree(WAREHOUSE_PATH)\nprint(f\"Cleaned up warehouse at {WAREHOUSE_PATH}\")","lang":"python","description":"This quickstart demonstrates how to use PyIceberg 
(which leverages pyiceberg-core) to set up a local SQLite catalog, create a namespace and a table with a defined schema, append data using PyArrow, and then read the data back. It includes necessary cleanup."},"warnings":[{"fix":"For general usage, install `pyiceberg` and optionally include `pyiceberg-core` as an extra: `pip install \"pyiceberg[pyiceberg-core]\"`. `pyiceberg` will then leverage the Rust core for optimized operations.","message":"`pyiceberg-core` is an internal dependency of `pyiceberg`. While it can be installed separately, it is typically managed as an extra by `pyiceberg` (e.g., `pip install \"pyiceberg[pyiceberg-core]\"`). Directly using `pyiceberg-core` without `pyiceberg` is not the standard pattern and may not expose a full public API.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Install the appropriate filesystem package for your storage solution, e.g., `pip install \"pyiceberg[s3fs]\"` for S3 or `pip install \"pyiceberg[pyarrow]\"` for PyArrow-backed I/O.","message":"File I/O with object storage (S3, ADLS, GCS) requires installing specific optional dependencies such as `s3fs`, `adlfs`, `gcsfs`, or `pyarrow` (for local filesystem and some cloud storage via PyArrow's filesystem abstractions). Not installing these will lead to runtime errors when attempting to read/write files.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If using DataFusion, check the latest PyIceberg documentation for compatible DataFusion versions. The integration is evolving and may have breaking changes or strict version requirements. Avoid using in production without careful testing.","message":"The DataFusion integration with PyIceberg (which uses `pyiceberg-core`) is considered experimental and currently has strict version dependencies. 
For `pyiceberg-core 0.9.0`, the compatible release may be `datafusion == 51`; confirm the exact pin in the PyIceberg documentation.","severity":"gotcha","affected_versions":"<=0.9.0"},{"fix":"If installation fails with build errors, install a compatible Rust toolchain. On most common platforms, pre-built wheels make this unnecessary.","message":"Installing `pyiceberg-core` from source requires a Rust toolchain. This applies when no pre-built wheel is available for your platform (e.g., some non-x86_64 architectures or specialized environments).","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}