Qubole Data Service SDK (qds-sdk)
The Qubole Data Service (QDS) Python SDK provides a programmatic interface to the Qubole Data Service API for managing clusters, submitting commands (e.g., Hive, Spark, Presto), and retrieving query results. The library is actively maintained; the latest version is 1.17.0, and it typically sees several releases per year that add features and maintain compatibility.
Common errors
- qds_sdk.exception.ApiException: HTTP 401 Unauthorized
  cause: The Qubole API token is missing, invalid, or expired, so the request cannot be authenticated.
  fix: Verify that the `QDS_API_TOKEN` environment variable is set and contains a valid, active Qubole API token. You can generate a new token from your Qubole account settings.
- qds_sdk.exception.ValidationError: '<parameter_name>' is a required field
  cause: A parameter required to create or update a Qubole resource (e.g., `cluster_id` for a command, `name` for a new cluster) was not provided.
  fix: Consult the QDS SDK documentation or the Qubole API reference for the command or resource you are using, and pass every mandatory argument to the SDK method.
- AttributeError: module 'qds_sdk' has no attribute 'some_old_module_or_class'
  cause: A module, class, or function has usually been renamed, moved, or removed in a newer version of the SDK.
  fix: Check the release notes for the `qds-sdk` version you upgraded to, then update your import statements and code to use the new names or paths. For example, most command classes live in `qds_sdk.commands`.
- requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.qubole.com', port=443): Max retries exceeded with url: /api/v1.2/...
  cause: The SDK could not establish a connection to the Qubole API endpoint. Typical reasons are network problems, an incorrect API URL, or proxy misconfiguration.
  fix: Verify your network connection. Check that the `QDS_API_URL` environment variable points to the correct Qubole API base endpoint for your region (e.g., `https://api.qubole.com/api/`, to which the API version such as `v1.2` is appended). If you are behind a corporate proxy, make sure the proxy settings are configured for Python's `requests` library.
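The authentication and connection fixes above both come down to environment variables being set correctly before any request is made. A minimal sketch of a pre-flight check follows, using only the standard library. The helper name `load_qds_settings` is hypothetical and not part of the SDK, and it assumes the token and URL are supplied through `QDS_API_TOKEN` and `QDS_API_URL` as described above.

```python
import os


def load_qds_settings(env=None):
    """Return (token, api_url) or raise a clear error before any API call.

    `api_url` is None when QDS_API_URL is unset, so the caller can fall
    back to whatever default its client uses.
    """
    env = os.environ if env is None else env

    token = (env.get("QDS_API_TOKEN") or "").strip()
    if not token:
        raise RuntimeError("QDS_API_TOKEN is not set or is empty")

    api_url = (env.get("QDS_API_URL") or "").strip() or None
    if api_url is not None and not api_url.startswith("https://"):
        raise RuntimeError(f"QDS_API_URL is not an HTTPS URL: {api_url!r}")

    return token, api_url
```

Failing early this way turns a late 401 or connection error into a message that names the variable to fix.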
Warnings
- breaking: Support for Python 2.6 was dropped. Users on older Python 2.x versions might encounter compatibility issues or errors.
- breaking: The default S3 Signature Version for API requests changed to V4. This might affect interactions with older S3 buckets or specific S3 endpoints that only support V2 signatures.
- gotcha: Versions prior to 1.17.0 might have compatibility issues or emit syntax warnings on Python 3.12 because of deprecated constructs and third-party library changes.
- gotcha: API commands and resources frequently receive new parameters (e.g., `--upload-to-source` for JupyterNotebookCommand). These are usually additions, but incorrect usage or assumptions about default behavior can lead to errors.
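The Python 3.12 gotcha can be checked mechanically. The sketch below flags an SDK version older than 1.17.0 when running on Python 3.12 or later. The function names are hypothetical, the threshold comes from the warning above, and the version parser is deliberately simple (it only reads leading digits of each dotted part), so treat it as an illustration and not as a full version comparator.

```python
import re
import sys

MIN_SDK_FOR_PY312 = (1, 17, 0)  # threshold taken from the warning above


def parse_version(text):
    """Turn '1.17.0' into (1, 17, 0), padding short versions with zeros."""
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric part
        numbers.append(int(match.group()))
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def sdk_too_old_for_python(sdk_version, python_version=None):
    """Return True when the SDK predates 1.17.0 and Python is 3.12+."""
    if python_version is None:
        python_version = tuple(sys.version_info[:2])
    if python_version < (3, 12):
        return False
    return parse_version(sdk_version) < MIN_SDK_FOR_PY312
```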
Install
- `pip install qds-sdk==1.17.0`
- `pip install qds-sdk`
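After installing, it can help to confirm which version actually ended up in the active environment. A small sketch using the standard library's `importlib.metadata` (Python 3.8+) follows. The helper name `installed_version` is hypothetical, and it assumes the distribution name matches the `pip` name `qds-sdk`.

```python
from importlib import metadata


def installed_version(dist_name="qds-sdk"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("qds-sdk is not installed in this environment")
    else:
        print(f"qds-sdk {version} is installed")
```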
Imports
- Qubole
  from qds_sdk.qubole import Qubole
- HiveCommand
  from qds_sdk.commands import HiveCommand
- Cluster
  from qds_sdk.cluster import Cluster
- Command
  from qds_sdk.commands import Command
Quickstart
import os
import time

from qds_sdk.commands import Command, HiveCommand
from qds_sdk.qubole import Qubole

# Configure the Qubole API token and base endpoint (the SDK appends the API version)
Qubole.configure(
    api_token=os.environ['QDS_API_TOKEN'],
    api_url=os.environ.get('QDS_API_URL', 'https://api.qubole.com/api/'),
)

# Define and submit a Hive command; set QDS_CLUSTER_LABEL to target an existing cluster by label
params = {'query': 'SELECT 1 + 1 AS result;'}
if os.environ.get('QDS_CLUSTER_LABEL'):
    params['label'] = os.environ['QDS_CLUSTER_LABEL']
hive_command = HiveCommand.create(**params)
print(f"Submitted Hive Command ID: {hive_command.id}")

# Poll until the command reaches a terminal state (optional, for synchronous execution)
while not Command.is_done(hive_command.status):
    time.sleep(5)
    hive_command = HiveCommand.find(hive_command.id)
print(f"Command {hive_command.id} finished with status: {hive_command.status}")

# Fetch results (written to standard output by default)
if hive_command.status == 'done':
    print("Query Result:")
    hive_command.get_results()
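The wait step in the quickstart blocks for as long as the command runs, which can be a problem in scripts that need an upper bound. The sketch below shows bounded polling with a timeout. `wait_until_done`, `fetch_status`, and `is_done` are hypothetical names, and the helper is SDK-agnostic: with qds-sdk, `fetch_status` might wrap a re-fetch of the command (for example via `HiveCommand.find`) and `is_done` might be `Command.is_done`, but both mappings are assumptions to verify against your installed version.

```python
import time


def wait_until_done(fetch_status, is_done, poll_interval=5.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch_status()` until `is_done(status)` is true or time runs out.

    Returns the final status; raises TimeoutError if the deadline passes
    while the command is still in a non-terminal state.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if is_done(status):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(poll_interval)


if __name__ == "__main__":
    # Simulated status sequence standing in for real API calls.
    statuses = iter(["waiting", "running", "done"])
    final = wait_until_done(
        fetch_status=lambda: next(statuses),
        is_done=lambda s: s in ("done", "error", "cancelled"),
        poll_interval=0,
    )
    print(final)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays.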