{"id":9491,"library":"arrow-odbc","title":"arrow-odbc library","description":"arrow-odbc is a Python library that reads data from any ODBC data source directly into Apache Arrow record batches. Built with Rust, it bridges relational databases reachable through ODBC and Python's Arrow-based data analysis ecosystem. The current version is 10.1.0; major versions may introduce breaking changes.","status":"active","version":"10.1.0","language":"en","source_language":"en","source_url":"https://github.com/pacman82/arrow-odbc-python","tags":["database","odbc","apache-arrow","data-engineering","rust-bindings"],"install":[{"cmd":"pip install arrow-odbc","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Required for creating and handling Apache Arrow tables and record batches.","package":"pyarrow"}],"imports":[{"symbol":"read_arrow_tables","correct":"from arrow_odbc import read_arrow_tables"}],"quickstart":{"code":"import os\nfrom arrow_odbc import read_arrow_tables\nimport pyarrow.parquet as pq\n\n# NOTE: You must have an ODBC driver installed on your system\n# for the target database (e.g., SQL Server, PostgreSQL, MySQL).\n# The connection string below is an example. Adjust it for your setup.\n\n# Example connection strings:\n# SQL Server (Windows/Linux): DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=testdb;UID=user;PWD=password\n# PostgreSQL (Linux): DRIVER={PostgreSQL Unicode};SERVER=localhost;DATABASE=testdb;UID=user;PASSWORD=password\n\n# Prefer a connection string from the environment; the fallback is a local SQL Server example.\nconnection_string = os.environ.get(\n    'ARROW_ODBC_CONNECTION_STRING',\n    'DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=testdb;UID=user;PWD=password'\n)\n\n# Example query. 
Adjust 'YourTable' and syntax for your database.\n# For SQL Server: \"SELECT TOP 100 * FROM YourTable\"\n# For PostgreSQL: \"SELECT * FROM YourTable LIMIT 100\"\nquery = \"SELECT TOP 100 * FROM YourTable\"\n\ntry:\n    # Read data into a PyArrow Table\n    arrow_table = read_arrow_tables(\n        connection_string=connection_string,\n        query=query\n    )\n\n    print(f\"Successfully read {arrow_table.num_rows} rows.\")\n    print(f\"Schema:\\n{arrow_table.schema}\")\n    if arrow_table.num_rows > 0:\n        print(f\"First 5 rows:\\n{arrow_table.slice(0, min(5, arrow_table.num_rows)).to_pylist()}\")\n\n    # Example: Save to Parquet\n    # pq.write_table(arrow_table, \"output.parquet\")\n\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\n    print(\"Please ensure your ODBC driver is installed and the connection string/query are correct.\")","lang":"python","description":"This quickstart demonstrates how to connect to an ODBC data source and retrieve data as an Apache Arrow Table using `arrow-odbc`. It highlights the `read_arrow_tables` function, which is the primary entry point for data retrieval. Users must replace the example connection string and query with their specific database details and ensure the appropriate ODBC driver is installed on their system."},"warnings":[{"fix":"Update calls from `read_all_tables(...)` to `read_arrow_tables(...)`.","message":"The primary data retrieval function `read_all_tables` was renamed to `read_arrow_tables` in version 10.0.0. Older code using `read_all_tables` will fail after upgrading.","severity":"breaking","affected_versions":">=10.0.0"},{"fix":"Consult your database vendor's documentation for instructions on installing and configuring the necessary ODBC driver (e.g., `unixodbc-dev` and `msodbcsql17` for SQL Server on Linux).","message":"Installation of the appropriate ODBC driver for your specific database and operating system is a prerequisite and is handled outside of Python. 
`arrow-odbc` relies on a correctly configured ODBC environment.","severity":"gotcha","affected_versions":"all"},{"fix":"Verify the connection string format against your specific ODBC driver's documentation and database requirements. Pay close attention to driver name, server address, database name, and credentials.","message":"ODBC connection string syntax is highly specific to the ODBC driver and database being used. Incorrectly formatted connection strings are a common source of connection errors.","severity":"gotcha","affected_versions":"all"},{"fix":"For very large result sets, stream the data instead of materializing a single table: `read_arrow_batches_from_odbc` returns a reader that yields record batches of at most `batch_size` rows, so each batch can be processed or appended to a Parquet file before the next one is fetched (check the documentation for your installed version). Paging in SQL with `LIMIT`/`OFFSET` also bounds memory, but it re-executes the query for every page.","message":"A large result set loaded as a single Apache Arrow Table is held entirely in memory, which can lead to out-of-memory errors.","severity":"gotcha","affected_versions":"all"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Run `pip install arrow-odbc` to install the library.","cause":"The `arrow-odbc` library is not installed in the active Python environment.","error":"ModuleNotFoundError: No module named 'arrow_odbc'"},{"fix":"Install the appropriate ODBC driver for your database and OS. 
For example, for SQL Server on Ubuntu, add Microsoft's package repository first, then run `sudo ACCEPT_EULA=Y apt-get install unixodbc-dev msodbcsql17`. `odbcinst -q -d` lists the driver names unixODBC has registered; the name in the connection string must match one of them exactly.","cause":"The ODBC driver named in the connection string (e.g., 'ODBC Driver 17 for SQL Server') is not installed, or is not registered with the driver manager under that name.","error":"[unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found"},{"fix":"Ensure your code uses `connection_string` for `arrow-odbc` versions 10.0.0 and above. If you are intentionally using an older version (9.x), use `_connection_string`.","cause":"The code passes `_connection_string`, the keyword used by 9.x, to a 10.x or later release, which expects `connection_string`. The reverse mismatch (passing `connection_string` to 9.x) raises the same error with the other keyword name.","error":"TypeError: read_arrow_tables() got an unexpected keyword argument '_connection_string'"},{"fix":"Cast the affected column(s) in your SQL query to a wider type (e.g., `VARCHAR`, or a larger integer type if the database supports one), or retrieve them as text and convert them explicitly in Python.","cause":"A numeric value from the database exceeds the range or precision of the target Apache Arrow type. This typically happens with very large integers or high-precision decimals.","error":"arrow_odbc.ArrowOdbcError: ODBC error: [State: 22003, Native: 0, Message: [Microsoft][ODBC Driver 17 for SQL Server]Numeric value out of range]"}]}