{"id":1649,"library":"pyhive","title":"PyHive: Python Interface for Hive and Presto","description":"PyHive provides Python DB-API and SQLAlchemy interfaces for data warehouses, primarily Apache Hive and Presto. It lets Python applications connect to these systems, run queries, and fetch results. The current version is 0.7.0; releases are sporadic, with long gaps between them.","status":"active","version":"0.7.0","language":"en","source_language":"en","source_url":"https://github.com/dropbox/PyHive","tags":["database","hive","presto","data warehouse","sql","db-api","sqlalchemy"],"install":[{"cmd":"pip install pyhive[hive]","lang":"bash","label":"For Hive"},{"cmd":"pip install pyhive[presto]","lang":"bash","label":"For Presto"},{"cmd":"pip install pyhive[hive,presto,hive_kerberos,sqlalchemy]","lang":"bash","label":"Full installation with all optional dependencies"}],"dependencies":[{"reason":"Required for Hive connectivity. Specific versions of PyHive may mandate specific Thrift versions.","package":"thrift","optional":true},{"reason":"Required for Hive Kerberos authentication (`pyhive[hive_kerberos]` extra).","package":"sasl","optional":true},{"reason":"Required for Presto connectivity (`pyhive[presto]` extra).","package":"requests","optional":true},{"reason":"Required for SQLAlchemy integration (`pyhive[sqlalchemy]` extra).","package":"sqlalchemy","optional":true}],"imports":[{"symbol":"connect","correct":"from pyhive.hive import connect"},{"symbol":"connect","correct":"from pyhive.presto import connect"},{"symbol":"HiveConnection","correct":"from pyhive.hive import Connection as HiveConnection"},{"symbol":"PrestoConnection","correct":"from pyhive.presto import Connection as PrestoConnection"}],"quickstart":{"code":"import os\nfrom pyhive import hive\n\n# Example for Hive connection\n# Ensure HiveServer2 is running and accessible\n# Replace with your actual host, port, username, database\nhost = 
os.environ.get('HIVE_HOST', 'localhost')\nport = int(os.environ.get('HIVE_PORT', 10000))\nusername = os.environ.get('HIVE_USERNAME', 'anonymous')\ndatabase = os.environ.get('HIVE_DATABASE', 'default')\n\nconnection = None\ncursor = None\ntry:\n    connection = hive.connect(host=host, port=port, username=username, database=database)\n    cursor = connection.cursor()\n\n    # Execute a query\n    cursor.execute('SELECT 1 + 1')\n\n    # Fetch results\n    result = cursor.fetchone()\n    print(f\"Query result: {result}\")\n\n    cursor.execute('SHOW TABLES')\n    tables = cursor.fetchall()\n    print(\"Available tables:\")\n    for table in tables:\n        print(f\"  {table[0]}\")\n\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\nfinally:\n    if cursor:\n        cursor.close()\n    if connection:\n        connection.close()\n","lang":"python","description":"This quickstart demonstrates how to establish a connection to Apache Hive using `pyhive.hive`, execute a simple query, and fetch results. It uses environment variables for connection details for security and flexibility. Remember to install `pyhive[hive]` to get the necessary dependencies. For Presto, replace `pyhive.hive` with `pyhive.presto` and adjust connection parameters."},"warnings":[{"fix":"Check the release notes for your PyHive version for required HiveServer2 versions or Thrift binding compatibility. Upgrade HiveServer2 or downgrade PyHive as necessary to match protocols.","message":"Major protocol and Thrift binding updates in PyHive versions can cause incompatibility with older HiveServer2 versions. Specifically, v0.2.0 changed to Hive protocol V6 (requiring Hive 0.13+), and v0.5.0 updated Thrift bindings to V11. 
Ensure your PyHive version matches the protocol and Thrift version that your HiveServer2 expects.","severity":"breaking","affected_versions":"0.2.0, 0.5.0, 0.7.0 (and potentially others)"},{"fix":"Update your code to expect tuples for rows and byte strings for binary data when upgrading from PyHive versions prior to 0.2.0.","message":"PyHive v0.2.0 changed data return types: rows are returned as tuples instead of lists, and binary data is returned as byte strings instead of Unicode strings. This can break existing code that relies on the old types.","severity":"breaking","affected_versions":"0.2.0 and newer"},{"fix":"Install PyHive with the extras you need, e.g., `pip install pyhive[hive]`, or `pip install pyhive[hive,presto,hive_kerberos,sqlalchemy]` for full functionality.","message":"PyHive's core installation (`pip install pyhive`) does not include the dependencies needed for Hive, Presto, Kerberos, or SQLAlchemy integration. These must be installed via optional extras.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For SQLAlchemy integration, use a recent PyHive release with a compatible, reasonably modern SQLAlchemy version (the latest stable versions typically work best).","message":"PyHive releases have explicitly dropped support for older SQLAlchemy versions (e.g., v0.5.0 dropped SQLAlchemy 0.6, v0.5.1 dropped SQLAlchemy 0.7). Newer PyHive versions generally aim for compatibility, but verify compatibility before pairing an older PyHive with a very old or very new SQLAlchemy version.","severity":"gotcha","affected_versions":"0.5.0, 0.5.1, and potentially others"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}