Qubole Data Service SDK (qds-sdk)

1.17.0 · active · verified Thu Apr 16

The Qubole Data Service (QDS) Python SDK is a programmatic interface to the QDS API for managing clusters, submitting commands (e.g., Hive, Spark, Presto), and retrieving query results. The library is actively maintained; version 1.17.0 is the latest, and it typically sees several releases per year that add features and maintain compatibility.

Quickstart

This quickstart configures the SDK with API credentials and submits a simple Hive command. It reads `QDS_API_TOKEN` (required) and `QDS_API_URL` (optional) from environment variables, plus an optional `QDS_CLUSTER_LABEL` for targeting a specific cluster by its label. The example submits a query, polls until the command finishes, and then prints the results.

import os
import time

from qds_sdk.commands import HiveCommand
from qds_sdk.qubole import Qubole

# Configure the API token and endpoint; the API version is passed separately
Qubole.configure(
    api_token=os.environ['QDS_API_TOKEN'],  # fail fast if the token is missing
    api_url=os.environ.get('QDS_API_URL', 'https://api.qubole.com/api/'),
    version='v1.2',
)

# Build the command arguments; target a specific cluster only if a label is set
params = {'query': 'SELECT 1 + 1 AS result;'}
cluster_label = os.environ.get('QDS_CLUSTER_LABEL')
if cluster_label:
    params['label'] = cluster_label

# Submit the Hive command; create() returns without waiting for completion
hive_command = HiveCommand.create(**params)
print(f"Submitted Hive Command ID: {hive_command.id}")

# Poll until the command reaches a terminal state (done, error, or cancelled)
while not HiveCommand.is_done(hive_command.status):
    time.sleep(Qubole.poll_interval)
    hive_command = HiveCommand.find(hive_command.id)
print(f"Command {hive_command.id} finished with status: {hive_command.status}")

# Fetch results; get_results() writes them to stdout by default
if HiveCommand.is_success(hive_command.status):
    print("Query Result:")
    hive_command.get_results()
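
The quickstart waits for the command to finish before fetching results. When that wait should not block forever, one option is a small polling helper with a timeout. The sketch below is hypothetical and not part of qds-sdk: `wait_for_status`, its parameters, and the `TERMINAL_STATUSES` set are illustrative names, and the terminal status strings are an assumption that should be checked against the SDK version in use. It takes any zero-argument callable that returns the current status string, so it can be exercised without a Qubole account.

```python
import time
from typing import Callable

# Assumed terminal states for a QDS command; verify against your SDK version.
TERMINAL_STATUSES = frozenset({"done", "error", "cancelled"})


def wait_for_status(
    fetch_status: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll fetch_status() until it returns a terminal status.

    Raises TimeoutError if no terminal status is seen before the deadline.
    The sleep and clock parameters exist so the helper can be tested
    without real delays.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"command still '{status}' after {timeout} seconds"
            )
        sleep(poll_interval)
```

With the SDK, the callable would wrap whatever call returns the command's current status, for example a lambda that re-fetches the command by ID and reads its status. The caller then decides what to do on `TimeoutError`, such as cancelling the command or continuing to wait.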
