Apache Kylin Python Client Library
The `kylinpy` library is a Python client for Apache Kylin, an OLAP engine for big data. It wraps Kylin's REST API so that you can execute queries, manage projects, and retrieve metadata programmatically. The current version is 2.8.4; releases are frequent and often track Apache Kylin server updates.
Warnings
- gotcha Many Kylin operations are scoped to a specific project. If you forget to call `client.use_project(project_name)` after initializing `KylinClient`, requests can fail with 'project not found' or authentication errors even when your credentials are correct.
- gotcha `query_to_dataframe` is convenient when you want pandas output, but DataFrame conversion adds overhead on very large result sets. If you do not strictly need pandas, consider `client.query_to_list` or the raw API responses to manage memory and performance more directly.
- gotcha As of `v2.8.0`, `kylinpy` provides `KylinAsyncClient` for `asyncio` support. In an asynchronous application, import it with `from kylinpy import KylinAsyncClient` and use `await` with its methods. Calling synchronous `KylinClient` methods from `asyncio` code without proper wrapping can block the event loop or lead to runtime errors.
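The "proper wrapping" mentioned in the last warning can be sketched with the standard library alone. The example below is a minimal illustration, not kylinpy code: `blocking_query` is a hypothetical stand-in for any synchronous client call, and `asyncio.to_thread` (Python 3.9+) moves it to a worker thread so the event loop stays responsive.

```python
import asyncio
import time


def blocking_query(sql):
    """Hypothetical stand-in for a synchronous client call (not kylinpy API)."""
    time.sleep(0.05)  # simulate network latency
    return [("FP-GTC", 100.0), ("Auction", 80.0)]


async def run_query(sql):
    # Run the blocking call in a worker thread so the event loop
    # remains free to schedule other tasks in the meantime.
    return await asyncio.to_thread(blocking_query, sql)


async def main():
    # Both queries are in flight at the same time rather than back to back.
    return await asyncio.gather(
        run_query("SELECT 1"),
        run_query("SELECT 2"),
    )


if __name__ == "__main__":
    for rows in asyncio.run(main()):
        print(rows)
```

If the library's own async client covers your use case, prefer it; thread wrapping is mainly useful when only a synchronous call is available.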
Install
-
pip install kylinpy
Imports
- KylinClient
from kylinpy import KylinClient
- KylinAsyncClient
from kylinpy import KylinAsyncClient
Quickstart
import os
from kylinpy import KylinClient
# Configure connection details using environment variables for security
HOST = os.environ.get('KYLIN_HOST', 'localhost')
PORT = os.environ.get('KYLIN_PORT', '7070')
USERNAME = os.environ.get('KYLIN_USERNAME', 'ADMIN')
PASSWORD = os.environ.get('KYLIN_PASSWORD', 'KYLIN')
PROJECT = os.environ.get('KYLIN_PROJECT', 'learn_kylin')
try:
    # Initialize the KylinClient
    client = KylinClient(
        host=HOST,
        port=PORT,
        username=USERNAME,
        password=PASSWORD
    )
    # Select the project context for operations
    client.use_project(PROJECT)
    print(f"Connected to Kylin at {HOST}:{PORT}, project: {PROJECT}")
    # Execute a sample query and get results as a pandas DataFrame
    sql_query = "SELECT LSTG_FORMAT_NAME, sum(PRICE) FROM KYLIN_SALES GROUP BY LSTG_FORMAT_NAME ORDER BY sum(PRICE) DESC LIMIT 5"
    df = client.query_to_dataframe(sql_query)
    print("\nQuery Results (top 5 rows):")
    print(df.to_string())
except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure KYLIN_HOST, KYLIN_PORT, KYLIN_USERNAME, KYLIN_PASSWORD, and KYLIN_PROJECT environment variables are correctly set or provide valid defaults.")
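For result sets much larger than the five rows fetched in the quickstart, one way to act on the memory warning is to fetch rows in pages instead of all at once. The following is a rough sketch under stated assumptions: `iter_pages` is a hypothetical helper, `run_query` is any callable you supply that takes a SQL string and returns a list of rows (for example, a thin wrapper around `client.query_to_list`), and the server is assumed to accept `LIMIT ... OFFSET` on the statement. The base query should carry an `ORDER BY` and no `LIMIT` of its own so that pages are stable.

```python
from typing import Callable, Iterator, List, Sequence

Row = Sequence[object]


def iter_pages(
    run_query: Callable[[str], List[Row]],
    base_sql: str,
    page_size: int = 1000,
) -> Iterator[List[Row]]:
    """Yield successive pages of rows (hypothetical helper, not kylinpy API)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = run_query(f"{base_sql} LIMIT {page_size} OFFSET {offset}")
        if not page:
            return  # nothing left to fetch
        yield page
        if len(page) < page_size:
            return  # a short page means this was the last one
        offset += page_size
```

Each page can be processed and discarded before the next one is requested, so peak memory depends on `page_size` rather than on the full result size.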