Apache Airflow Provider for Weaviate

Version: 3.3.3 | verified: Fri May 01 | auth: no | language: python

This is the Apache Airflow provider package for the Weaviate vector database. It lets Airflow DAGs interact with Weaviate for vector search, object import and export, and schema management. The current version is 3.3.3, which requires Airflow >=2.10.0 and Python >=3.10. Releases appear roughly quarterly.

pip install apache-airflow-providers-weaviate
error ModuleNotFoundError: No module named 'airflow.hooks.weaviate'
cause The hook is being imported from an old, deprecated path.
fix Use `from airflow.providers.weaviate.hooks.weaviate import WeaviateHook`.
error AttributeError: 'WeaviateHook' object has no attribute 'get_conn'
cause The code still calls a deprecated method after a provider update.
fix Replace `hook.get_conn()` with `hook.get_client()`.
error weaviate.exceptions.WeaviateClosedError: the client is closed
cause A closed client is being reused, or the hook's lifecycle is not managed properly.
fix Obtain a fresh client from the hook in each task; do not share a client across tasks without reopening it.
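One way to keep a client's lifetime tied to a single task is a small context manager that creates the client on entry and closes it on exit. The sketch below is illustrative only: `fresh_client` is a hypothetical helper, not part of the provider, and it assumes the client object exposes a `close()` method (as the Weaviate v4 Python client does).

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def fresh_client(make_client: Callable[[], T]) -> Iterator[T]:
    """Yield a newly created client and close it when the block exits."""
    client = make_client()
    try:
        yield client
    finally:
        # Assumption: the client has a close() method.
        client.close()


# Hypothetical usage inside a task callable (method name as used on this page):
#
# def my_task():
#     hook = WeaviateHook(...)  # build the hook inside the task as well
#     with fresh_client(hook.get_client) as client:
#         print(client.is_ready())
```

Because a hook instance may cache the client it created, constructing the hook inside the task callable, rather than at module level, avoids handing a closed client to a later task.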
breaking In version 3.0.0, the hook import path changed from `airflow.providers.weaviate.hooks.weaviate_hook` to `airflow.providers.weaviate.hooks.weaviate`. Old imports will break.
fix Update import to `from airflow.providers.weaviate.hooks.weaviate import WeaviateHook`.
deprecated The `WeaviateHook` method `get_conn` is deprecated; use `get_client` instead.
fix Replace `hook.get_conn()` with `hook.get_client()`.
gotcha The Airflow connection for Weaviate must use the `weaviate` connection type, not `http` or `generic`. A wrong conn_type causes a silent fallback or errors.
fix Create a connection with `conn_type='weaviate'`, set the host to the Weaviate URL, and optionally supply the API key via login/password.
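Airflow can also read a connection from an `AIRFLOW_CONN_<CONN_ID>` environment variable holding a JSON document, which makes the `conn_type` explicit. The following is a minimal sketch: `weaviate_conn_json` is a hypothetical helper, the URL is a placeholder, and storing the API key in `password` is an assumption based on the fix above, so check the provider's connection documentation for the exact field it expects.

```python
import json
import os
from typing import Optional


def weaviate_conn_json(host: str, api_key: Optional[str] = None) -> str:
    """Build a JSON connection definition with conn_type set to 'weaviate'."""
    conn = {"conn_type": "weaviate", "host": host}
    if api_key:
        # Assumption: the API key is carried in the password field.
        conn["password"] = api_key
    return json.dumps(conn)


# Placeholder URL; the variable name maps to the connection id 'weaviate_default'.
os.environ["AIRFLOW_CONN_WEAVIATE_DEFAULT"] = weaviate_conn_json("http://localhost:8080")
```

Setting the variable in the scheduler and worker environment (rather than in DAG code) is the usual choice; the assignment here only shows the expected shape of the value.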
gotcha When using `WeaviateIngestOperator`, the `class_name` must already exist in the Weaviate schema. The operator does NOT automatically create the class; it only ingests objects.
fix First run a schema creation operation (e.g., using `WeaviateCreateSchemaOperator` or direct client call) before ingestion.
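The "create before ingest" step can be sketched as a direct client call. This example assumes the Weaviate v4 Python client, where classes are called collections and the client exposes `client.collections.exists(name)` and `client.collections.create(name=...)`; `ensure_collection` is a hypothetical helper name, and a real schema would normally define properties and a vectorizer as well.

```python
def ensure_collection(client, name: str) -> bool:
    """Create the collection `name` if it is missing.

    Returns True when the collection was created, False when it already existed.
    """
    if client.collections.exists(name):
        return False
    # Assumption: a bare collection is enough for the illustration;
    # pass properties/vectorizer settings here in real use.
    client.collections.create(name=name)
    return True
```

Calling such a helper in an upstream task, and making the ingestion task depend on it, keeps the ordering explicit in the DAG.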
pip install 'apache-airflow[weaviate]'

A minimal DAG that tests the Weaviate connection using the provider hook.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.weaviate.hooks.weaviate import WeaviateHook


def test_connection():
    # The hook reads the Airflow connection named 'weaviate_default'.
    hook = WeaviateHook(conn_id='weaviate_default')
    client = hook.get_client()
    # is_ready() reports whether the Weaviate instance is reachable and ready.
    print(client.is_ready())


# schedule=None means the DAG runs only when triggered manually.
with DAG(
    dag_id='weaviate_test',
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    task = PythonOperator(
        task_id='test_conn',
        python_callable=test_connection,
    )