Acryl Great Expectations

0.15.50.1 verified Thu Apr 16 auth: no en

Acryl Great Expectations is Acryl Data's opinionated flavor of the Great Expectations data validation library, providing a specific set of configurations and integrations primarily for use with the DataHub metadata platform. It currently pins Great Expectations to versions `<0.16.0` and therefore targets the older V2 configuration API. Its release cadence is tied to DataHub releases.

pip install acryl-great-expectations
cli great_expectations
error ModuleNotFoundError: No module named 'great_expectations.data_context.data_context'
cause The import targets a V3-specific module path for `DataContext` that is not available, or does not behave as expected, under the V2 API.
fix
Use `from great_expectations.data_context import DataContext` (the common V2 import), or confirm that the path is V2-compatible.
error AttributeError: 'DataContext' object has no attribute 'get_context'
cause Attempting to use the V3 API method `DataContext.get_context()` with a DataContext instance from the V2 API, which lacks this method.
fix
Instantiate the `DataContext` directly via its constructor (e.g., `DataContext('/path/to/project')`), as described in the V2 API documentation.
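A minimal sketch of the constructor-based fix, assuming the conventional project layout in which `great_expectations.yml` sits inside a `great_expectations/` directory. `find_context_root` and `load_context` are hypothetical helper names, and the `context_root_dir` keyword is assumed to match the constructor in the 0.15.x line; verify it against the installed version.

```python
from pathlib import Path
from typing import Optional


def find_context_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` to find a directory holding great_expectations.yml."""
    for candidate in [start, *start.parents]:
        project_dir = candidate / "great_expectations"
        if (project_dir / "great_expectations.yml").is_file():
            return project_dir
    return None


def load_context(start: Path):
    """Build a DataContext via its constructor rather than get_context()."""
    # Imported lazily so the path helper works without the library installed.
    from great_expectations.data_context import DataContext

    root = find_context_root(start)
    if root is None:
        raise FileNotFoundError(f"No great_expectations.yml found at or above {start}")
    # Assumption: `context_root_dir` is the keyword accepted in 0.15.x.
    return DataContext(context_root_dir=str(root))
```

Calling `load_context(Path.cwd())` from anywhere inside a project would then resolve the project directory before constructing the context.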
error great_expectations.exceptions.exceptions.InvalidConfigurationBundleError: The great_expectations.yml file in /path/to/great_expectations is not valid. The following errors were found: - ... (schema validation errors related to V3 fields)
cause Using a `great_expectations.yml` file formatted for the V3 API with `acryl-great-expectations`, which expects the V2 configuration schema.
fix
Revert to a `great_expectations.yml` configuration that adheres to the V2 API schema, or manually adjust the file to remove V3-specific elements such as `fluent_datasources`.
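Before editing the file by hand, it can help to see which V3-only keys it contains. The sketch below is a rough, line-based heuristic rather than a YAML parser; `top_level_keys`, `find_v3_keys`, and the `V3_ONLY_KEYS` set are hypothetical names, and the sample text is invented for illustration.

```python
from typing import List

# Assumption: `fluent_datasources` is the only V3-only key checked here;
# extend the set with other keys your installed version rejects.
V3_ONLY_KEYS = {"fluent_datasources"}


def top_level_keys(yaml_text: str) -> List[str]:
    """Return top-level mapping keys using a line-based heuristic."""
    keys = []
    for line in yaml_text.splitlines():
        # Top-level keys start in column 0; skip blanks, indented lines,
        # comments, and list items or document markers.
        if not line or line[0] in " \t#-":
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.append(key.strip())
    return keys


def find_v3_keys(yaml_text: str) -> List[str]:
    """List V3-only top-level keys present in the config text."""
    return [k for k in top_level_keys(yaml_text) if k in V3_ONLY_KEYS]


# Hypothetical config text, for illustration only.
sample = """\
config_version: 3.0
datasources: {}
fluent_datasources:
  my_ds:
    type: pandas
"""
print(find_v3_keys(sample))  # prints ['fluent_datasources']
```

A real check would read the text with `Path("great_expectations/great_expectations.yml").read_text()`; a proper YAML parser is the better choice when one is available.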
breaking `acryl-great-expectations` explicitly pins its `great-expectations` dependency to versions `<0.16.0`. It therefore supports *only* the legacy V2 API of Great Expectations, which is fundamentally incompatible with the V3 API of newer `great-expectations` releases; every release from `0.16.0` onward falls outside the pin.
fix Ensure all Great Expectations code and configuration used with `acryl-great-expectations` adheres to V2 API patterns. Do not use V3-specific classes, methods, or configuration formats (e.g., `DataContext.get_context()` or the V3 YAML schema).
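One way to guard against an accidental upgrade is to check the installed version against the `<0.16.0` pin at startup. This is a sketch under assumptions: the helper names are hypothetical, and the distribution may be registered as either `great-expectations` or `acryl-great-expectations`, so both are tried.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.15.50'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_within_pin(version: str) -> bool:
    """True when the version satisfies the `<0.16.0` pin."""
    return parse_major_minor(version) < (0, 16)


def installed_gx_version() -> Optional[str]:
    """Return the installed version, or None if neither distribution is found."""
    # Assumption: one of these two distribution names is registered.
    for dist in ("great-expectations", "acryl-great-expectations"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


version = installed_gx_version()
if version is None:
    print("Great Expectations is not installed")
else:
    print(f"{version} within pin: {is_within_pin(version)}")
```

Comparing only the major and minor components keeps the check tolerant of four-part versions such as `0.15.50.1`.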
gotcha Configuration files (e.g., `great_expectations.yml`) generated or modified for the V3 Great Expectations API will cause parsing errors or unexpected behavior when used with `acryl-great-expectations` due to its V2 API dependency. The structure and available parameters differ significantly.
fix Always refer to V2 API documentation for `great_expectations.yml` structure and best practices when working with `acryl-great-expectations`.
gotcha The primary intent of `acryl-great-expectations` is to provide Great Expectations functionality specifically for DataHub integrations. While it can be used standalone, users new to Great Expectations might find clearer guidance and more up-to-date examples using the main `great-expectations` library (which supports V3 API).
fix If not explicitly integrating with DataHub, consider using the main `pip install great-expectations` package to access the latest features and documentation (V3 API).

This quickstart demonstrates basic data validation using Great Expectations' V2 API, which `acryl-great-expectations` provides. It uses an in-memory Pandas DataFrame to create a `PandasDataset`, adds a few common expectations, and runs a validation.

import pandas as pd
from great_expectations.dataset import PandasDataset

# Sample data
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "value": [10, 20, 30, 40, 50],
    "category": ["A", "B", "A", "C", "B"]
})

# Wrap the DataFrame in a PandasDataset (V2 API style for in-memory validation).
# Unrecognized keyword arguments are forwarded to pandas.DataFrame, so none are passed here.
batch = PandasDataset(df)

# Define and add expectations
batch.expect_column_to_exist("id")
batch.expect_column_values_to_be_between("value", min_value=0, max_value=100)
batch.expect_column_distinct_values_to_be_in_set("category", ["A", "B", "C"])

# Validate the batch
validation_result = batch.validate()

print(f"Validation successful: {validation_result.success}")
if not validation_result.success:
    print("Validation failed details:")
    for result in validation_result.results:
        if not result.success:
            print(f"  Expectation: {result.expectation_config.expectation_type}, Status: {result.success}")

# Expected output: Validation successful: True
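When a validation fails, a compact list of the failing expectation types is often easier to log than the full result. The sketch below works on a plain dictionary and assumes its shape mirrors what `validation_result.to_json_dict()` returns; `failed_expectation_types` is a hypothetical helper, and the sample dictionary is hand-written for illustration, not real output.

```python
from typing import Any, Dict, List


def failed_expectation_types(result: Dict[str, Any]) -> List[str]:
    """Collect expectation types whose individual result did not succeed."""
    return [
        item["expectation_config"]["expectation_type"]
        for item in result.get("results", [])
        if not item.get("success", False)
    ]


# Hypothetical, hand-written result for illustration only.
example_result = {
    "success": False,
    "results": [
        {
            "success": True,
            "expectation_config": {"expectation_type": "expect_column_to_exist"},
        },
        {
            "success": False,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_be_between"
            },
        },
    ],
}
print(failed_expectation_types(example_result))
# prints ['expect_column_values_to_be_between']
```

With the quickstart objects in scope, the call would be `failed_expectation_types(validation_result.to_json_dict())`, assuming that method is available in the installed version.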