{"id":3816,"library":"sqllineage","title":"SQL Lineage Analysis Tool","description":"SQLLineage is a Python library designed for SQL lineage analysis, capable of identifying source and target tables, as well as providing column-level lineage from SQL queries. It leverages popular SQL parser libraries like sqlfluff and sqlparse, and uses networkx for graph representation. The library is actively maintained, with its current version being 1.5.7, and sees regular minor releases to introduce enhancements and bug fixes.","status":"active","version":"1.5.7","language":"en","source_language":"en","source_url":"https://github.com/reata/sqllineage.git","tags":["sql","lineage","data-governance","etl","data-observability","metadata"],"install":[{"cmd":"pip install sqllineage","lang":"bash","label":"Install latest stable version"}],"dependencies":[{"reason":"Core SQL parsing engine.","package":"sqlfluff","optional":false},{"reason":"Core SQL parsing engine.","package":"sqlparse","optional":false},{"reason":"Default graph operator for lineage representation.","package":"networkx","optional":false},{"reason":"Required for MetaDataProviders to enhance column lineage accuracy by retrieving metadata from SQL databases.","package":"sqlalchemy","optional":true},{"reason":"Experimental alternative graph operator for improved performance on large SQL/graphs.","package":"rustworkx","optional":true}],"imports":[{"symbol":"LineageRunner","correct":"from sqllineage.runner import LineageRunner"}],"quickstart":{"code":"from sqllineage.runner import LineageRunner\n\nsql = \"INSERT INTO target_schema.target_table SELECT col1, col2 FROM source_schema.source_table WHERE col3 > 100\"\n\nrunner = LineageRunner(sql)\n\nprint(f\"Source Tables: {[str(t) for t in runner.source_tables]}\")\nprint(f\"Target Tables: {[str(t) for t in runner.target_tables]}\")\n\n# For column-level lineage (requires metadata for full accuracy)\n# from sqllineage.core.metadata.dummy import DummyMetaDataProvider\n# metadata = {\n#     'source_schema.source_table': ['col1', 'col2', 'col3']\n# }\n# runner_with_metadata = LineageRunner(sql, metadata_provider=DummyMetaDataProvider(metadata))\n# # Each path is a tuple of Column objects, ordered from source to target\n# for path in runner_with_metadata.get_column_lineage():\n#     print(f\"Column Lineage: {path[0]} -> {path[-1]}\")","lang":"python","description":"This quickstart demonstrates how to initialize `LineageRunner` with a SQL statement and extract the source and target tables. For accurate column-level lineage, especially with `SELECT *` or unqualified columns, passing a metadata provider (e.g., `DummyMetaDataProvider` or `SQLAlchemyMetaDataProvider`) is crucial."},"warnings":[{"fix":"Always explicitly pass the `dialect` parameter to `LineageRunner` or the CLI for your specific SQL dialect (e.g., `sparksql`, `hive`, `tsql`) to ensure correct parsing, especially for complex or dialect-specific SQL. Use `sqllineage --dialects` to see available options.","message":"Starting with v1.5.x, `ansi` is the default SQL dialect. This might cause parsing issues for non-ANSI compliant SQL that previously worked without explicit dialect specification. Non-validating dialects are targeted for deprecation in v1.6.","severity":"breaking","affected_versions":">=1.5.0"},{"fix":"Pass a metadata provider (such as `DummyMetaDataProvider` or `SQLAlchemyMetaDataProvider`) to `LineageRunner` via the `metadata_provider` argument. This allows `sqllineage` to access schema details and fully resolve column-level lineage.","message":"Column-level lineage can be inaccurate or incomplete without providing table metadata. For queries involving `SELECT *` or unqualified column names, `sqllineage` cannot fully resolve column dependencies without schema information.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade your Python environment to a supported version (e.g., Python 3.10 to 3.14 for `v1.5.7`). Check the official documentation for the latest Python compatibility matrix.","message":"Python 3.9 support was dropped, and Python 3.14 support was added in `v1.5.7`. Python 3.8 was deprecated in `v1.5.4`. Ensure your environment uses a compatible Python version.","severity":"breaking","affected_versions":">=1.5.4 (for 3.8), >=1.5.7 (for 3.9)"},{"fix":"For highly complex or non-standard SQL, manual inspection and possibly pre-processing the SQL (e.g., by expanding templates, normalizing vendor-specific syntax) may be necessary to improve lineage accuracy. Supplying a metadata provider also helps in such cases.","message":"Complex SQL patterns, deeply nested subqueries, large UNION chains, dynamic table/column name resolution, or vendor-specific functions can challenge generic SQL parsers, including `sqllineage`, potentially leading to incomplete or incorrect lineage.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}