PyHive: Python Interface for Hive and Presto

0.7.0 · active · verified Thu Apr 09

PyHive provides Python DB-API and SQLAlchemy interfaces for several data warehouses, primarily Apache Hive and Presto. It lets Python applications connect to these systems, run queries, and fetch results. The current version is 0.7.0. Releases are sporadic, with significant gaps between them.

Warnings

Install

Imports

Quickstart

This quickstart connects to Apache Hive through `pyhive.hive`, runs a simple query, and fetches the results. Connection details are read from environment variables so that they stay out of the source code and can vary between environments. Install `pyhive[hive]` first to pull in the required dependencies. For Presto, use `pyhive.presto` in place of `pyhive.hive` and adjust the connection parameters.

import os
from pyhive import hive

# Example Hive connection.
# HiveServer2 must be running and reachable from this machine.
# Connection details come from environment variables; the fallbacks suit a local test setup.
host = os.environ.get('HIVE_HOST', 'localhost')
port = int(os.environ.get('HIVE_PORT', 10000))
username = os.environ.get('HIVE_USERNAME', 'anonymous')
database = os.environ.get('HIVE_DATABASE', 'default')

connection = None
cursor = None
try:
    connection = hive.connect(host=host, port=port, username=username, database=database)
    cursor = connection.cursor()

    # Execute a query
    cursor.execute('SELECT 1 + 1')

    # Fetch results
    result = cursor.fetchone()
    print(f"Query result: {result}")

    cursor.execute('SHOW TABLES')
    tables = cursor.fetchall()
    print("Available tables:")
    for table in tables:
        print(f"  {table[0]}")

except Exception as e:
    print(f"An error occurred: {e}")
finally:
    if cursor:
        cursor.close()
    if connection:
        connection.close()
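
The Presto variant mentioned above might look roughly like the following. This is a minimal sketch, not a definitive implementation: the `PRESTO_*` environment variable names, the fallback values, and the helper names `presto_params` and `run_presto_query` are assumptions made for illustration and are not part of PyHive. The keyword arguments (`host`, `port`, `username`, `catalog`, `schema`) are the ones `pyhive.presto.connect` is expected to accept. Presto addresses tables by catalog and schema rather than by a single database name. The `pyhive` import is deferred so the parameter helper can be used without the package installed.

```python
import os


def presto_params(env=os.environ):
    """Build keyword arguments for pyhive.presto.connect.

    The PRESTO_* variable names and fallbacks are hypothetical.
    """
    return {
        "host": env.get("PRESTO_HOST", "localhost"),
        "port": int(env.get("PRESTO_PORT", "8080")),
        "username": env.get("PRESTO_USERNAME", "anonymous"),
        "catalog": env.get("PRESTO_CATALOG", "hive"),
        "schema": env.get("PRESTO_SCHEMA", "default"),
    }


def run_presto_query(sql, env=os.environ):
    """Run one query against Presto and return all rows (hypothetical helper)."""
    # Imported here so that presto_params works without pyhive installed.
    from pyhive import presto

    connection = presto.connect(**presto_params(env))
    try:
        cursor = connection.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        connection.close()
```

With a reachable Presto coordinator, `run_presto_query('SELECT 1 + 1')` would play the same role as the first Hive query in the quickstart.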
