AzureML DataPrep Native Extensions
This package provides the underlying native extensions for Azure Machine Learning's Data Preparation capabilities. It is typically consumed as a dependency by higher-level AzureML SDK components like `azureml-dataprep` or `azureml.core`, rather than being directly imported by end-users. The current version is 42.1.0, and its release cadence is tied to the broader AzureML SDK.
Warnings
- gotcha This package is a native extension component and is not designed for direct user-level Python imports. Functionality it provides is exposed through higher-level SDK packages like `azureml-dataprep` or `azureml.core.Dataset`.
- breaking Due to its native components (C++/Rust), `azureml-dataprep-native` has specific platform (OS) and Python version requirements. Installation may fail or lead to runtime issues on unsupported environments.
- gotcha Installing `azureml-dataprep-native` alone does not provide a complete data preparation solution. It must be used in conjunction with `azureml-dataprep` or other AzureML SDK components to access user-facing APIs.
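Because the package ships compiled binaries, it can help to record the interpreter and platform details before installing or when reporting an installation failure. The sketch below only reports the environment using the standard library. It does not encode which combinations are supported, since that list is not given here. The function name `describe_environment` is hypothetical.

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect the details pip uses when selecting a native wheel."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "machine": platform.machine(),
        # True on 64-bit interpreters, False on 32-bit ones.
        "is_64bit": sys.maxsize > 2**32,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Comparing this output against the platform tags of the wheels published for the package is one way to spot an unsupported environment before pip fails.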
Install
- pip install azureml-dataprep-native
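Since this package is normally pulled in as a dependency, a quick way to confirm what actually got installed is to query the distribution metadata instead of importing anything. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_version` is an assumption made for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("azureml-dataprep-native", "azureml-dataprep"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Querying metadata avoids importing the native extension directly, which matches the guidance that it is not meant for user-level imports.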
Imports
- NoDirectUserImports
Functionality is exposed via azureml.dataprep or azureml.core.Dataset
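To check whether the higher-level entry point is available without triggering a full import, something like the following could be used. It is a sketch built on the standard library's `importlib.util.find_spec`. The helper name `is_importable` is hypothetical, and `azureml.dataprep` is simply the module name mentioned above.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located on the current path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages first; a missing parent
        # raises ModuleNotFoundError, a subclass of ImportError.
        return False


if __name__ == "__main__":
    if is_importable("azureml.dataprep"):
        print("azureml.dataprep is available")
    else:
        print("azureml.dataprep not found; try: pip install azureml-dataprep")
```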
Quickstart
# Install azureml-dataprep-native to make its native components available.
# Users typically interact with the higher-level azureml.dataprep or
# azureml.core.Dataset APIs, which leverage this package internally.
# No direct user-facing imports are typically made from azureml-dataprep-native.
# To use data prep functionality, install azureml-dataprep:
# pip install azureml-dataprep
# Example of how functionality (implicitly powered by this library) is accessed:
import os

# Create a dummy CSV file
csv_content = "id,name\n1,Alice\n2,Bob\n3,Charlie"
with open("sample.csv", "w") as f:
    f.write(csv_content)

# Read data using azureml.dataprep (which uses azureml-dataprep-native internally).
# The import sits inside the try block so that a missing azureml-dataprep
# install is reported by the ImportError handler instead of failing at import time.
try:
    from azureml.dataprep import read_csv, Dataflow

    dataflow: Dataflow = read_csv("sample.csv")
    print("Dataflow created successfully (backed by azureml-dataprep-native).")
    # Further operations would typically follow, e.g., dataflow.to_pandas_dataframe()
    # For demonstration, preview the first rows as a pandas DataFrame
    print(dataflow.head(5))
except ImportError:
    print("azureml-dataprep not installed. Please install it to use data prep features:")
    print("pip install azureml-dataprep")
except Exception as e:
    print(f"An error occurred during data prep operation: {e}")
finally:
    # Clean up the dummy file
    if os.path.exists("sample.csv"):
        os.remove("sample.csv")

print("\nazureml-dataprep-native is primarily an underlying dependency.")