Hive Metastore Client

1.0.9 · active · verified Thu Apr 16

The `hive-metastore-client` library provides a Pythonic interface for connecting to and performing Data Definition Language (DDL) operations on a Hive Metastore using the Thrift protocol. It simplifies interactions with Hive metadata, enabling users to programmatically create and manage databases, tables, and partitions. Actively maintained by QuintoAndar, the library is currently at version 1.0.9, offering a high-level abstraction over the underlying Thrift APIs.

Quickstart

This quickstart instantiates `HiveMetastoreClient`, creates a database, creates a table in that database, and reads the table back. It builds the Thrift objects with the library's builders and wraps the calls in a catch-all exception handler. Make sure the Hive Metastore service is running and reachable at the configured host and port: the script reads `HIVE_METASTORE_HOST` and `HIVE_METASTORE_PORT` from the environment and falls back to localhost:9083.

import os

from hive_metastore_client import HiveMetastoreClient
from hive_metastore_client.builders import (
    ColumnBuilder,
    DatabaseBuilder,
    SerDeInfoBuilder,
    StorageDescriptorBuilder,
    TableBuilder,
)

HIVE_HOST = os.environ.get('HIVE_METASTORE_HOST', 'localhost')
HIVE_PORT = int(os.environ.get('HIVE_METASTORE_PORT', '9083'))

db_name = "my_test_database"
table_name = "my_test_table"

try:
    with HiveMetastoreClient(HIVE_HOST, HIVE_PORT) as hive_client:
        # 1. Create the database; the call does nothing if it already exists
        database = DatabaseBuilder(name=db_name).build()
        hive_client.create_database_if_not_exists(database)
        print(f"Database '{db_name}' created successfully (or already exists).")

        # 2. Describe the table: columns, SerDe, and storage layout.
        #    Parquet is assumed here; swap in the SerDe and formats your table uses.
        columns = [
            ColumnBuilder("id", "int").build(),
            ColumnBuilder("name", "string").build(),
        ]
        serde_info = SerDeInfoBuilder(
            serialization_lib="org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
        ).build()
        storage_descriptor = StorageDescriptorBuilder(
            columns=columns,
            location="/path/to/my_test_table",  # placeholder: set to the table's storage path
            input_format="org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            output_format="org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            serde_info=serde_info,
        ).build()
        table = TableBuilder(
            table_name=table_name,
            db_name=db_name,
            storage_descriptor=storage_descriptor,
        ).build()

        # 3. Create the table; this raises if the table already exists,
        #    and the handler below reports it
        hive_client.create_table(table)
        print(f"Table '{db_name}.{table_name}' created successfully.")

        # 4. Read the table back (the Thrift Table object exposes tableName and dbName)
        retrieved_table = hive_client.get_table(db_name, table_name)
        print(f"Retrieved table: {retrieved_table.tableName} in database {retrieved_table.dbName}")

except Exception as e:
    print(f"An error occurred: {e}")
    print("Ensure HIVE_METASTORE_HOST and HIVE_METASTORE_PORT environment variables are set or defaults are correct.")
    print("Also, verify that the Hive Metastore service is running and accessible.")
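
The quickstart's error messages point at two likely causes: a wrong host or port, and a metastore that is not reachable. Both can be checked before opening a Thrift connection. The following is a minimal sketch using only the standard library; `metastore_address` and `is_reachable` are hypothetical helper names and are not part of `hive-metastore-client`.

```python
import os
import socket


def metastore_address(env=None):
    """Return (host, port) read from HIVE_METASTORE_HOST / HIVE_METASTORE_PORT.

    Hypothetical helper, not part of hive-metastore-client. Falls back to
    localhost:9083 and rejects ports that are not integers in 1..65535.
    """
    env = os.environ if env is None else env
    host = env.get("HIVE_METASTORE_HOST", "localhost")
    raw_port = env.get("HIVE_METASTORE_PORT", "9083")
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(
            f"HIVE_METASTORE_PORT is not an integer: {raw_port!r}"
        ) from None
    if not 1 <= port <= 65535:
        raise ValueError(f"HIVE_METASTORE_PORT out of range: {port}")
    return host, port


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only shows that something accepts connections on the port; it does
    not confirm that the listener is a Hive Metastore.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `metastore_address()` and then `is_reachable(host, port)` before constructing the client separates configuration mistakes from network problems, which the quickstart's catch-all handler reports with the same message.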
