HMSClient
HMSClient is a Python package designed to interact with the Apache Hive Metastore via the Thrift protocol. It provides a thin Python wrapper around generated Thrift code to facilitate operations like checking for partitions. The current version is 0.1.1, but the project appears to be unmaintained, with its last release in April 2018.
Warnings
- breaking The `hmsclient` library has had no release since April 2018, and its GitHub repository shows no recent activity since then. It may be incompatible with newer Python versions (e.g., Python 3.8+) or recent Apache Hive Metastore versions, and may need manual patches to work.
- gotcha This library is primarily designed for unsecured Thrift connections. Using it in environments requiring Kerberos or other secure authentication mechanisms for the Hive Metastore is not directly supported and may expose security vulnerabilities.
- gotcha The client lacks built-in automatic reconnection logic. If the Hive Metastore becomes temporarily unstable or the connection drops, the client does not try to re-establish the connection, so calls keep failing until your application reconnects or is restarted. This is a common challenge with HMS clients.
- deprecated The existence of `hmsclient-hive-3` (a fork explicitly rebuilt for Hive Metastore API version 3.0) suggests that the original `hmsclient` (version 0.1.1) may not be compatible with Hive Metastore 3.0 and newer.
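Because the client does not reconnect on its own, one option is to open a fresh connection for each attempt and retry on failure. The following is a minimal sketch, not part of `hmsclient`: `call_with_reconnect`, `make_client`, and `operation` are hypothetical names, and it assumes the client can be used as a context manager that opens the connection on entry.

```python
import time


def call_with_reconnect(make_client, operation, retries=3, delay=1.0, sleep=time.sleep):
    """Run operation(client) on a fresh connection, retrying on failure.

    make_client: zero-argument callable returning a context-manager client
                 (hypothetical factory supplied by the caller).
    operation:   callable that takes the opened client and returns a result.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            # Build a new client per attempt so a dropped connection is never reused.
            with make_client() as client:
                return operation(client)
        except Exception as exc:  # assumption: narrow this to transport errors in real code
            last_error = exc
            if attempt < retries:
                sleep(delay * attempt)  # linear backoff between attempts
    raise last_error
```

With the real library, `make_client` could be `lambda: hmsclient.HMSClient(host=host, port=port)` and `operation` could be `lambda c: c.check_for_named_partition('your_db', 'your_table', 'date=20180101')`. Catching every `Exception` is deliberately broad in this sketch; narrowing it to transport errors avoids retrying genuine application errors.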
Install
pip install hmsclient
Imports
- hmsclient
from hmsclient import hmsclient
- HMSClient
from hmsclient import hmsclient
client = hmsclient.HMSClient(...)
Quickstart
import os

from hmsclient import hmsclient

host = os.environ.get('HMS_HOST', 'localhost')
port = int(os.environ.get('HMS_PORT', 9083))

# Example: Connect and check for a named partition
try:
    client = hmsclient.HMSClient(host=host, port=port)
    with client as c:
        # Replace 'your_db', 'your_table', and 'date=...' with actual values
        partition_exists = c.check_for_named_partition('your_db', 'your_table', 'date=20180101')
        print(f"Partition exists: {partition_exists}")
except Exception as e:
    print(f"An error occurred: {e}")
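The partition argument in the call above is a Hive partition name of the form `key=value`, with multiple partition columns joined by `/`. A minimal sketch of a helper that builds such a name follows. `build_partition_name` is a hypothetical name, not part of `hmsclient`, and it assumes partition values contain no characters that Hive would escape, so it rejects `/` and `=` rather than guessing at escaping rules.

```python
def build_partition_name(parts):
    """Join (column, value) pairs into a Hive-style partition name.

    parts: ordered iterable of (column, value) pairs, given in the
           table's partition column order.
    """
    pieces = []
    for column, value in parts:
        text = str(value)
        # Assumption: values needing escaping are out of scope for this sketch.
        if "/" in text or "=" in text:
            raise ValueError(f"unsupported character in partition value: {text!r}")
        pieces.append(f"{column}={text}")
    return "/".join(pieces)


print(build_partition_name([("date", "20180101")]))  # prints date=20180101
```

The result can be passed as the third argument to `check_for_named_partition` in place of a hand-written string.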