Acryl Great Expectations
Acryl Great Expectations is Acryl Data's opinionated flavor of the Great Expectations data validation library, providing a specific set of configurations and integrations primarily for use with the DataHub metadata platform. It currently pins `great-expectations` to versions `<0.16.0` and therefore targets the older V2 configuration API. Its release cadence is tied to DataHub releases.
Common errors
- ModuleNotFoundError: No module named 'great_expectations.data_context.data_context'
  cause: Attempting to import `DataContext` from a V3-specific path, which might not be directly available or work as expected with the V2 API.
  fix: Use `from great_expectations.data_context import DataContext` (V2 common import) or confirm V2-compatible paths.
- AttributeError: 'DataContext' object has no attribute 'get_context'
  cause: Attempting to use the V3 API method `DataContext.get_context()` with a DataContext instance from the V2 API, which lacks this method.
  fix: Instantiate the `DataContext` directly via its constructor (e.g., `DataContext('/path/to/project')`) as per V2 API documentation.
- great_expectations.exceptions.exceptions.InvalidConfigurationBundleError: The great_expectations.yml file in /path/to/great_expectations is not valid. The following errors were found: - ... (schema validation errors related to V3 fields)
  cause: Using a `great_expectations.yml` file formatted for the V3 API with `acryl-great-expectations`, which expects the V2 configuration schema.
  fix: Revert to a `great_expectations.yml` configuration adhering to the V2 API schema, or manually adjust the file to remove V3-specific elements like `fluent_datasources`.
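The configuration error above can sometimes be caught before Great Expectations loads the file at all. The following is a minimal sketch, not part of either library: `find_v3_keys` and `V3_ONLY_KEYS` are hypothetical names, and the key set contains only `fluent_datasources` because that is the one V3-specific element named here. It scans the top-level keys of the YAML text with the standard library only, so it does not validate the schema.

```python
import re
from typing import List

# Assumption: only `fluent_datasources` is listed, since it is the one
# V3-specific element named in this document. Extend the set as needed.
V3_ONLY_KEYS = {"fluent_datasources"}

# Matches a key that starts in column 0, i.e. a top-level YAML mapping key.
_TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:")


def find_v3_keys(yaml_text: str) -> List[str]:
    """Return the top-level keys in yaml_text that belong to V3_ONLY_KEYS."""
    found = []
    for line in yaml_text.splitlines():
        match = _TOP_LEVEL_KEY.match(line)
        if match and match.group(1) in V3_ONLY_KEYS:
            found.append(match.group(1))
    return found


if __name__ == "__main__":
    sample = "fluent_datasources:\n  my_ds:\n    type: pandas\n"
    print(find_v3_keys(sample))
```

A non-empty result suggests the file was written for a newer Great Expectations release and should be reverted or trimmed before use with `acryl-great-expectations`.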
Warnings
- breaking `acryl-great-expectations` explicitly pins its `great-expectations` dependency to versions `<0.16.0`. It therefore supports *only* the legacy V2 API of Great Expectations, which is fundamentally incompatible with the V3 API used by `great-expectations>=0.17.0`.
- gotcha Configuration files (e.g., `great_expectations.yml`) generated or modified for the V3 Great Expectations API will cause parsing errors or unexpected behavior when used with `acryl-great-expectations` due to its V2 API dependency. The structure and available parameters differ significantly.
- gotcha The primary intent of `acryl-great-expectations` is to provide Great Expectations functionality specifically for DataHub integrations. While it can be used standalone, users new to Great Expectations might find clearer guidance and more up-to-date examples using the main `great-expectations` library (which supports V3 API).
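To make the version pin in the first warning concrete, here is a small sketch that checks whether a `great-expectations` version string falls under the `<0.16.0` bound. The helper names `is_within_pin` and `installed_ge_version` are hypothetical, and the comparison looks only at the major and minor components, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata
from typing import Optional

# Mirrors the `<0.16.0` pin described above (major, minor).
V2_UPPER_BOUND = (0, 16)


def is_within_pin(version: str) -> bool:
    """Return True if version's major.minor is below the 0.16 bound."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor < V2_UPPER_BOUND


def installed_ge_version() -> Optional[str]:
    """Return the installed great-expectations version, or None if absent."""
    try:
        return metadata.version("great-expectations")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_ge_version()
    if version is None:
        print("great-expectations is not installed")
    else:
        print(f"{version} within pin: {is_within_pin(version)}")
```

Running such a check at the start of a pipeline can flag an environment where another package has pulled in a newer `great-expectations` release.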
Install
- pip install acryl-great-expectations
Imports
- PandasDataset
from great_expectations.dataset import PandasDataset
- ExpectationSuite
from great_expectations.core.expectation_suite import ExpectationSuite
- DataContext
wrong: from great_expectations.data_context.data_context import DataContext
right: from great_expectations.data_context import DataContext
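Because the import path for `DataContext` is a common source of errors, a defensive import can make the failure mode explicit. This is a sketch only: `load_data_context_class` is a hypothetical helper, and it assumes that an unavailable or incompatible installation surfaces as an `ImportError`.

```python
from typing import Optional


def load_data_context_class() -> Optional[type]:
    """Return the DataContext class via the V2-style import, or None."""
    try:
        # V2 common import path, as recommended in the list above.
        from great_expectations.data_context import DataContext
    except ImportError:
        # Covers a missing package as well as releases that no longer
        # expose DataContext at this path.
        return None
    return DataContext


if __name__ == "__main__":
    cls = load_data_context_class()
    if cls is None:
        print("DataContext is not importable in this environment")
    else:
        print(f"Loaded {cls.__module__}.{cls.__name__}")
```

Returning `None` rather than raising lets calling code decide whether to fall back, skip validation, or report a clearer installation error.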
Quickstart
import pandas as pd
from great_expectations.dataset import PandasDataset

# Sample data
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "value": [10, 20, 30, 40, 50],
    "category": ["A", "B", "A", "C", "B"],
})

# Wrap the DataFrame in a PandasDataset (V2 API style for in-memory validation)
batch = PandasDataset(df)

# Define and add expectations; each call is evaluated immediately and recorded
batch.expect_column_to_exist("id")
batch.expect_column_values_to_be_between("value", min_value=0, max_value=100)
batch.expect_column_distinct_values_to_be_in_set("category", ["A", "B", "C"])

# Validate the batch against all recorded expectations
validation_result = batch.validate()
print(f"Validation successful: {validation_result.success}")
if not validation_result.success:
    print("Validation failed details:")
    for result in validation_result.results:
        if not result.success:
            print(f"  Expectation: {result.expectation_config.expectation_type}, Status: {result.success}")
# Expected output: Validation successful: True