Azure Kusto Data Client
The Azure Kusto Data Client is a Python library for querying Azure Data Explorer (Kusto) clusters. It is Python 3.x compatible and supports various data types through a familiar Python DB API interface, so it can be used in environments such as Jupyter Notebooks. The library is actively maintained; the current version is 6.0.3.
Common errors
- ModuleNotFoundError: No module named 'azure.kusto.data'
  cause The `azure-kusto-data` package or its dependencies were not installed correctly or are not accessible in the current Python environment. This can happen if `pip install azure-kusto-data` was not run, or if there are virtual environment issues.
  fix Ensure the package is installed in your active Python environment. If using a virtual environment, activate it before installing. Restart your kernel if in a notebook. Install with `pip install azure-kusto-data`.
- KustoAuthenticationError: AADSTS700030: Invalid certificate - subject name in certificate is not authorized.
  cause Authentication failed because the Azure Active Directory (AAD) application's certificate, or the service principal used, does not have the necessary permissions or is improperly configured (e.g., incorrect subject name, missing role assignment) for the Kusto cluster. Another common authentication error is `action Principal 'aaduser=[AAD account id];[ AAD tenant id ]' is not authorized to perform operation.`, which is caused by insufficient database permissions.
  fix Verify that the Azure AD application registration's certificate subject name is correct and authorized. Ensure the service principal has the 'Viewer' (or appropriate) role assigned at the cluster or database level in Azure Data Explorer. For cross-tenant access, additional configurations might be needed.
- Failed to connect to cluster. Please verify the URI and check if the cluster is available.
  cause The Kusto cluster URI is incorrect, the cluster is stopped, or there are network connectivity issues preventing the client from reaching the cluster endpoint. This can also manifest as a `KustoNetworkError`.
  fix Double-check the cluster URI for typos (it should be `https://<ClusterName>.<Region>.kusto.windows.net`). Confirm the Azure Data Explorer cluster's status in the Azure portal and start it if it's stopped. Review network security group (NSG) rules if the cluster is in a virtual network.
- Entity 'table name that doesn't exist' of kind 'Table' wasn't found.
  cause The Kusto Query Language (KQL) query attempts to access a table or other entity (like a function or materialized view) that does not exist in the specified database or is misspelled. This can also be caused by missing or incorrect ingestion mappings if the data is being ingested.
  fix Verify the exact name and existence of the table (or entity) in your Kusto database. Check for typos in the query. If ingesting data, ensure that the table exists and any required ingestion mappings are correctly defined and applied.
- azure.kusto.data.exceptions.KustoClientError: AADSTS700016: Application with identifier '{app_id}' was not found in the directory '{tenant_id}'.
  cause The provided Azure AD application ID or tenant ID is incorrect, or the application registration is missing or misconfigured in the specified Azure AD tenant.
  fix Verify that the Azure AD application ID and tenant ID are correct, and ensure the application is properly registered and configured in Azure Active Directory with appropriate permissions.
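The errors above can be told apart by a few distinctive substrings in their messages. The following is a minimal sketch of a triage helper built on that idea. The names `KNOWN_ERRORS` and `diagnose` are hypothetical and not part of `azure-kusto-data`; the markers are simply copied from the messages quoted in this section, so real messages may differ in wording.

```python
# Hypothetical triage helper: maps the error messages listed in this section
# to a short hint. Nothing here is part of the azure-kusto-data API.
KNOWN_ERRORS = [
    ("No module named 'azure.kusto.data'",
     "install: run `pip install azure-kusto-data` in the active environment"),
    ("AADSTS700030",
     "auth: check the certificate subject name and the role assignment"),
    ("AADSTS700016",
     "auth: check the application ID and tenant ID"),
    ("is not authorized to perform operation",
     "permissions: assign a database role such as 'Viewer' to the principal"),
    ("Failed to connect to cluster",
     "network: check the cluster URI, cluster status, and NSG rules"),
    ("wasn't found",
     "query: check the table or entity name and any ingestion mappings"),
]


def diagnose(error):
    """Return a hint for an exception or message, based on substring markers."""
    text = str(error)
    for marker, hint in KNOWN_ERRORS:
        if marker in text:
            return hint
    return "unknown: not one of the errors listed in this section"


if __name__ == "__main__":
    print(diagnose("Entity 'Foo' of kind 'Table' wasn't found."))
```

Because the helper only calls `str()` on its argument, it accepts either a caught exception or a log line. Substring matching is deliberately crude; it is meant as a first pointer to the right entry above, not as a replacement for reading the full error.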
Warnings
- breaking Version 6.0.0 raised the minimum supported Python version to 3.9, aligning with other Azure SDKs. Earlier versions (e.g., 5.x) supported Python 3.7+.
- breaking Version 5.0.0 introduced breaking changes to the `KustoConnectionStringBuilder` keywords, removing `msi_auth`, `msi_authentication`, `msi_params`, and `msi_type`. While building connection strings via the builder methods still works, direct parsing of these keywords will fail.
- gotcha When using `dataframe_from_result_table`, check that your installed pandas version is supported. Version 6.0.2 fixed an issue with pandas 3.0 that occurred when datetime columns contained only null values, and the minimum supported pandas version is now 2.3.1.
- gotcha The `KustoClient` instance should be reused across multiple operations for optimal performance, as it manages a pool of connections. Frequently recreating clients can lead to performance issues and increased load on your Kusto cluster.
- gotcha Managed Streaming ingestion in earlier versions might not correctly handle throttling events. This was addressed in version 6.0.1.
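The Python version requirement in the first warning can be checked before anything is imported. The sketch below uses only the standard library. `check_environment` and `installed_version` are hypothetical helper names, and the rule encoded here (6.x needs Python 3.9+) is taken from the warning above rather than read from the package metadata.

```python
import sys
from importlib import metadata

MIN_PYTHON_FOR_V6 = (3, 9)  # from the warning above: 6.0.0 requires Python 3.9+


def installed_version(package="azure-kusto-data"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(python_version, kusto_version):
    """Return a list of problems; an empty list means none were found."""
    if kusto_version is None:
        return ["azure-kusto-data is not installed"]
    problems = []
    major = int(kusto_version.split(".")[0])
    if major >= 6 and tuple(python_version) < MIN_PYTHON_FOR_V6:
        problems.append("azure-kusto-data 6.x needs Python 3.9 or newer")
    return problems


if __name__ == "__main__":
    for problem in check_environment(sys.version_info[:2], installed_version()):
        print(problem)
```

Passing the interpreter and package versions in as arguments keeps the check easy to exercise for combinations other than the one currently installed.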
Install
- pip install azure-kusto-data
- pip install azure-kusto-data[aio]
- pip install azure-kusto-data[pandas]
Imports
- KustoClient
from azure.kusto.data import KustoClient
- KustoConnectionStringBuilder
from azure.kusto.data import KustoConnectionStringBuilder
- KustoClient (async)
from azure.kusto.data.aio import KustoClient
- dataframe_from_result_table
from azure.kusto.data.helpers import dataframe_from_result_table
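Whether these imports will resolve depends on what is installed. As a rough sketch, the standard library can report which modules are locatable without running any Kusto code. `module_available` and `CHECKS` are hypothetical names, and the mapping of the `[aio]` and `[pandas]` extras to the `aiohttp` and `pandas` modules is an assumption about what those extras install.

```python
from importlib import util


def module_available(dotted_name):
    """Return True if the module can be located.

    find_spec imports parent packages but does not execute the target module.
    """
    try:
        return util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `azure.kusto`) is itself missing.
        return False


# Assumption: the extras are detected through the third-party modules
# they are expected to pull in.
CHECKS = {
    "core": "azure.kusto.data",
    "aio extra": "aiohttp",
    "pandas extra": "pandas",
}


def report():
    """Map each label in CHECKS to whether its module was found."""
    return {label: module_available(name) for label, name in CHECKS.items()}


if __name__ == "__main__":
    for label, found in report().items():
        print(f"{label}: {'found' if found else 'missing'}")
```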
Quickstart
import os

from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
from azure.kusto.data.exceptions import KustoServiceError

# Replace with your Kusto cluster URI
CLUSTER_URI = os.environ.get('KUSTO_CLUSTER_URI', 'https://<your_cluster_name>.kusto.windows.net')
# Replace with your AAD application ID (client ID)
CLIENT_ID = os.environ.get('KUSTO_CLIENT_ID', 'your_aad_application_id')
# Replace with your AAD application key (client secret)
CLIENT_SECRET = os.environ.get('KUSTO_CLIENT_SECRET', 'your_aad_application_key')
# Replace with your AAD tenant ID
TENANT_ID = os.environ.get('KUSTO_TENANT_ID', 'your_aad_tenant_id')

DB_NAME = 'Samples'
QUERY = 'StormEvents | take 5'


def main():
    if 'your_aad_application_id' in CLIENT_ID or 'your_aad_application_key' in CLIENT_SECRET:
        print("Please set KUSTO_CLUSTER_URI, KUSTO_CLIENT_ID, KUSTO_CLIENT_SECRET, and KUSTO_TENANT_ID environment variables or replace placeholders.")
        return

    # Build connection string for AAD application key authentication
    kcsb = KustoConnectionStringBuilder.with_aad_application_key_authentication(
        CLUSTER_URI, CLIENT_ID, CLIENT_SECRET, TENANT_ID
    )

    # It is good practice to re-use the KustoClient instance, as it maintains a pool of connections.
    try:
        with KustoClient(kcsb) as client:
            print(f"Executing query on database '{DB_NAME}'...")
            response = client.execute(DB_NAME, QUERY)
            for row in response.primary_results[0]:
                print(f"Timestamp: {row['StartTime']}, EventType: {row['EventType']}, State: {row['State']}")
    except KustoServiceError as e:
        print(f"Kusto service error: {e}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")


if __name__ == '__main__':
    main()
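The advice to reuse a `KustoClient` rather than recreate it can be sketched as a small cache keyed by cluster URI. This is one possible pattern, not something the library prescribes. `cached_client_factory` is a hypothetical helper, and `FakeClient` is a stand-in so the sketch runs without a cluster or the package installed.

```python
from functools import lru_cache


def cached_client_factory(build_client):
    """Wrap `build_client` so each cluster URI gets exactly one client."""

    @lru_cache(maxsize=None)
    def get_client(cluster_uri):
        return build_client(cluster_uri)

    return get_client


class FakeClient:
    """Stand-in for a real client; it only remembers its cluster URI."""

    def __init__(self, cluster_uri):
        self.cluster_uri = cluster_uri


get_client = cached_client_factory(FakeClient)
first = get_client("https://example.kusto.windows.net")
second = get_client("https://example.kusto.windows.net")
print(first is second)  # True: the second call returned the cached instance
```

In real code, `build_client` would be a function that creates a `KustoConnectionStringBuilder` for the given URI and returns `KustoClient(kcsb)`, as in the Quickstart. A client obtained this way is meant to outlive individual queries, so it should not be wrapped in a `with` block on each use, since leaving the block would presumably close it.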