PyApacheAtlas
PyApacheAtlas is a Python package that simplifies working with the Apache Atlas REST APIs and with Azure Purview (now Microsoft Purview), which exposes Atlas-compatible endpoints. It provides clients for authentication, entity management, and glossary operations, plus Excel template parsing for bulk uploads. The library is actively maintained: new minor versions appear roughly every 1-2 months and track the latest Apache Atlas and Azure Purview capabilities. The current version is 0.16.0.
Warnings
- breaking The `PurviewClient.search_entities` method was deprecated and moved. Users should now use `PurviewClient.discover.search_entities`.
- breaking The `TablesLineage` and `FineGrainColumnLineage` entity types were deprecated from the Excel bulk upload template. Attempting to use these types might lead to errors or unexpected behavior.
- gotcha PyApacheAtlas relies heavily on dictionary structures and on specific `typeName` and `qualifiedName` conventions for Atlas entities. Incorrectly formatted entities are a common source of API errors.
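To make the gotcha above concrete, here is a minimal sketch of a pre-upload shape check. It assumes the usual Atlas entity JSON layout, where `typeName` sits at the top level and `qualifiedName` and `name` sit inside `attributes`. The helper `find_entity_problems` and the sample dictionaries are hypothetical and are not part of PyApacheAtlas.

```python
from typing import Any, Dict, List

# Assumed layout: typeName at the top level, identifying fields under "attributes".
REQUIRED_TOP_LEVEL = ("typeName", "attributes")
REQUIRED_ATTRIBUTES = ("qualifiedName", "name")


def find_entity_problems(entity: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the shape looks plausible."""
    problems: List[str] = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in entity:
            problems.append(f"missing top-level key: {key}")

    attributes = entity.get("attributes", {})
    if not isinstance(attributes, dict):
        problems.append("attributes must be a dict")
        return problems

    for key in REQUIRED_ATTRIBUTES:
        if not attributes.get(key):
            problems.append(f"missing or empty attribute: {key}")
    return problems


# Hypothetical entities for illustration only.
good_entity = {
    "typeName": "azure_sql_table",
    "guid": "-100",
    "attributes": {
        "name": "my_example_table",
        "qualifiedName": "mssql://server/database/my_example_table",
    },
}

# Common mistake: qualifiedName placed at the top level instead of under attributes.
bad_entity = {
    "typeName": "azure_sql_table",
    "qualifiedName": "mssql://server/database/my_example_table",
}

print(find_entity_problems(good_entity))
print(find_entity_problems(bad_entity))
```

A check like this only catches structural slips before a request is sent. Whether a given `typeName` exists, and which attributes it actually requires, still depends on the type definitions in the target catalog.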
Install
pip install pyapacheatlas
Imports
- PurviewClient
from pyapacheatlas.core import PurviewClient
- ExcelReader
from pyapacheatlas.readers import ExcelReader
- AtlasEntity
from pyapacheatlas.core import AtlasEntity
- AtlasClassification
from pyapacheatlas.core import AtlasClassification
- DefaultAzureCredential
from azure.identity import DefaultAzureCredential
Quickstart
import os
from itertools import islice

from azure.identity import DefaultAzureCredential
from pyapacheatlas.core import AtlasEntity, PurviewClient

# Replace with your Purview account name, or set the PURVIEW_NAME environment variable
purview_account_name = os.environ.get("PURVIEW_NAME", "YOUR_PURVIEW_ACCOUNT_NAME")

try:
    # Authenticate using DefaultAzureCredential (requires the 'azure-identity' package)
    credential = DefaultAzureCredential()
    client = PurviewClient(
        account_name=purview_account_name,
        authentication=credential
    )

    # Example: Define a simple Atlas entity (e.g., a table).
    # Note that the constructor argument is qualified_name, not qualifiedName.
    table_entity = AtlasEntity(
        name="my_example_table",
        typeName="azure_sql_table",
        qualified_name=f"mssql://{purview_account_name}/server/database/my_example_table",
        guid="-100"  # Use negative GUIDs for temporary entities in bulk uploads
    )

    # Example: Upload the entity to Purview (uncomment to run against a real account).
    # upload_entities returns the parsed response as a dict, so there is no .json() call.
    # response = client.upload_entities(batch=[table_entity])
    # print(f"Upload response: {response}")

    # Example: Search for entities (using the recommended discover method).
    # The results are iterated lazily, so take only the first five.
    search_results = client.discover.search_entities(query="my_example_table")
    print(f"Search results (first 5): {list(islice(search_results, 5))}")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Ensure the 'PURVIEW_NAME' environment variable is set or replace 'YOUR_PURVIEW_ACCOUNT_NAME'.")
    print("Also ensure your Azure AD account has permissions on the Purview account.")
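The quickstart hard-codes a single negative GUID. When a batch contains several new entities, each one needs its own distinct placeholder so that they can reference one another within the same upload. The sketch below shows one way to hand out such placeholders in plain Python. The generator `temporary_guids`, its starting value, and the table names are hypothetical. The library is understood to ship its own helper for this (`GuidTracker` in `pyapacheatlas.core.util`), which is likely the better choice in real code.

```python
from itertools import count
from typing import Dict, Iterator


def temporary_guids(start: int = -1000) -> Iterator[str]:
    """Yield distinct negative placeholder GUIDs as strings, counting downward."""
    for number in count(start, -1):
        yield str(number)


guids = temporary_guids()

# Hypothetical table names; each gets a unique placeholder GUID for one batch.
placeholder_by_name: Dict[str, str] = {
    name: next(guids) for name in ("orders", "customers", "order_items")
}

for name, guid in placeholder_by_name.items():
    print(f"{name}: {guid}")
```

Each placeholder can then be passed as the `guid` argument when constructing the corresponding entity. The placeholders are only meaningful within a single upload request.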