{"id":3445,"library":"dagster-gcp","title":"Dagster GCP","description":"Dagster-gcp is a Python library that provides components for interacting with Google Cloud Platform (GCP) services within the Dagster data orchestration framework. It includes resources and I/O managers for services like BigQuery, Google Cloud Storage (GCS), and Dataproc. The library is actively maintained with frequent releases, often in conjunction with the core Dagster framework, to ensure compatibility and introduce new features.","status":"active","version":"0.29.0","language":"en","source_language":"en","source_url":"https://github.com/dagster-io/dagster/tree/master/python_modules/libraries/dagster-gcp","tags":["data orchestration","etl","gcp","google cloud","bigquery","gcs","dataproc"],"install":[{"cmd":"pip install dagster-gcp","lang":"bash","label":"Install dagster-gcp"}],"dependencies":[{"reason":"Core Dagster framework is required for functionality.","package":"dagster","optional":false},{"reason":"Requires Python 3.10 or higher, but less than 3.15.","package":"python","optional":false},{"reason":"Optional dependency for Dataproc integration.","package":"dataproc","optional":true},{"reason":"Optional dependency, potentially for optimized data handling with certain I/O managers.","package":"pyarrow","optional":true}],"imports":[{"symbol":"BigQueryResource","correct":"from dagster_gcp import BigQueryResource"},{"symbol":"BigQueryIOManager","correct":"from dagster_gcp import BigQueryIOManager"},{"symbol":"GCSResource","correct":"from dagster_gcp.gcs import GCSResource"},{"symbol":"GCSPickleIOManager","correct":"from dagster_gcp.gcs import GCSPickleIOManager"},{"symbol":"DataprocResource","correct":"from dagster_gcp.dataproc import DataprocResource"},{"note":"This API is currently in preview and may have breaking changes in patch releases; not recommended for production.","symbol":"PipesDataprocJobClient","correct":"from dagster_gcp.pipes import 
PipesDataprocJobClient"}],"quickstart":{"code":"import os\nfrom dagster import Definitions, asset, EnvVar\nfrom dagster_gcp import BigQueryResource\n\n# Ensure GOOGLE_APPLICATION_CREDENTIALS or a similar env var is set for local execution.\n# For simplicity, table coordinates are read from env vars with placeholder defaults.\n\n@asset\ndef my_bq_table(bigquery: BigQueryResource):\n    \"\"\"An asset that queries a BigQuery table.\"\"\"\n    project_id = os.environ.get('GCP_PROJECT_ID', 'your-gcp-project')\n    dataset_id = os.environ.get('BIGQUERY_DATASET', 'my_dataset')\n    table_id = os.environ.get('BIGQUERY_TABLE', 'my_table')\n\n    # Example: execute a simple row-count query\n    query = f\"SELECT COUNT(*) FROM `{project_id}.{dataset_id}.{table_id}`\"\n\n    with bigquery.get_client() as client:\n        query_job = client.query(query)\n        results = query_job.result()\n        print(f\"Query executed successfully. First row: {list(results)[0]}\")\n\ndefs = Definitions(\n    assets=[my_bq_table],\n    resources={\n        \"bigquery\": BigQueryResource(\n            project=EnvVar(\"GCP_PROJECT_ID\"),  # Resolved at run time; use EnvVar for production\n            # EnvVar takes only a variable name (no default), so fall back via os.environ here\n            location=os.environ.get(\"GCP_REGION\", \"us-central1\"),\n            # You can also pass gcp_credentials as a base64 encoded JSON string via EnvVar\n        )\n    },\n)\n\n# To run this locally:\n# 1. Set environment variables, e.g., GOOGLE_APPLICATION_CREDENTIALS, GCP_PROJECT_ID, BIGQUERY_DATASET, BIGQUERY_TABLE\n# 2. Run `dagster dev -f your_file.py`\n# 3. Open the Dagster UI, find the 'my_bq_table' asset, and materialize it.","lang":"python","description":"This quickstart demonstrates defining a Dagster asset that interacts with Google BigQuery using `BigQueryResource`. It shows how to configure the resource and execute a simple SQL query. 
Authentication is expected via standard GCP mechanisms (such as the `GOOGLE_APPLICATION_CREDENTIALS` environment variable) or configured directly on the resource via `gcp_credentials`."},"warnings":[{"fix":"Refer to the specific library's changelog for each upgrade. Test integrations thoroughly after updates, especially for 'beta' or 'preview' features.","message":"Dagster integration libraries, including `dagster-gcp`, follow a pre-1.0 versioning track (e.g., `0.x.y`) even though Dagster core is at `1.x.y`. While `0.16+` library releases are generally compatible with `Dagster 1.x`, their APIs are less mature than core: 'Beta APIs may have breaking changes in minor version releases, with behavior changes in patch releases'.","severity":"gotcha","affected_versions":"All 0.x.y versions of `dagster-gcp`."},{"fix":"Ensure `GOOGLE_APPLICATION_CREDENTIALS` points to a valid service account key file, or configure `gcp_credentials` on the resource with a base64 encoded service account JSON string. Verify the service account has the necessary permissions for the GCP services being accessed.","message":"Proper GCP authentication is critical. `dagster-gcp` components rely on standard Google Cloud authentication mechanisms. A misconfigured `GOOGLE_APPLICATION_CREDENTIALS` environment variable or an incorrect `gcp_credentials` value in the resource configuration is a common source of failures.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always explicitly define the target BigQuery dataset for assets or ops using `BigQueryIOManager` via the `dataset` config, `key_prefix` for assets, or output `metadata={'schema': '...'}` for ops.","message":"When using `BigQueryIOManager` for assets or ops, if a dataset is not explicitly specified via the `dataset` configuration, `key_prefix` on assets, or `schema` in op output metadata, the dataset defaults to 'public'. 
This might lead to data being written to an unintended or incorrect dataset.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Avoid using 'preview' or 'beta' APIs in production environments. If used, be prepared for frequent breaking changes and thoroughly test after every update.","message":"APIs marked as 'preview' or 'beta' (e.g., `PipesDataprocJobClient`) are not considered ready for production use and may introduce breaking changes in patch or minor releases.","severity":"deprecated","affected_versions":"Specific components, check documentation for 'preview' or 'beta' markers."}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}