{"id":9791,"library":"graph-retriever","title":"Graph Retriever","description":"Graph Retriever is a Python library that combines unstructured similarity search with structured document traversal to enhance Retrieval-Augmented Generation (RAG) applications. It enables traversing relationships between documents to find more relevant context than simple similarity search alone. The current version is 0.8.0, with minor releases occurring frequently, often driven by integration updates with DataStax Astra DB and other components.","status":"active","version":"0.8.0","language":"en","source_language":"en","source_url":"https://github.com/datastax/graph-rag","tags":["graph","rag","retrieval","astra","datastax","llm","ai"],"install":[{"cmd":"pip install graph-retriever","lang":"bash","label":"Install core library"}],"dependencies":[{"reason":"Required for using AstraGraphStore, the primary graph store implementation for DataStax Astra DB.","package":"astrapy","optional":true}],"imports":[{"symbol":"GraphRetriever","correct":"from graph_retriever.graph_retriever import GraphRetriever"},{"symbol":"Document","correct":"from graph_retriever.document import Document"},{"note":"The Id class moved to the top-level `id.py` module.","wrong":"from graph_retriever.id_object import Id","symbol":"Id","correct":"from graph_retriever.id import Id"},{"symbol":"BFSTraversalStrategy","correct":"from graph_retriever.retriever_strategies.graph_traversal import BFSTraversalStrategy"},{"symbol":"AstraGraphStore","correct":"from graph_retriever.graph_store.astra_graph_store import AstraGraphStore"}],"quickstart":{"code":"import os\nfrom astrapy.db import AstraDB\nfrom graph_retriever.graph_retriever import GraphRetriever\nfrom graph_retriever.retriever_strategies.graph_traversal import BFSTraversalStrategy\nfrom graph_retriever.document import Document\nfrom graph_retriever.graph_store.astra_graph_store import AstraGraphStore\n\n# Initialize AstraDB connection\ntoken = 
os.environ.get(\"ASTRA_DB_APPLICATION_TOKEN\")\napi_endpoint = os.environ.get(\"ASTRA_DB_API_ENDPOINT\")\n\n# Fail fast if either credential is missing\nif not token or not api_endpoint:\n    raise ValueError(\"Please set ASTRA_DB_APPLICATION_TOKEN and ASTRA_DB_API_ENDPOINT environment variables.\")\n\nastra_db = AstraDB(token=token, api_endpoint=api_endpoint)\n\n# Initialize GraphStore (using a test collection)\ngraph_store = AstraGraphStore(astra_db=astra_db, collection_name=\"my_rag_collection\")\n\n# Example documents and edges\ndocs = [\n    Document(id=\"doc1\", content=\"Python is a versatile programming language.\", metadata={\"topic\": \"programming\"}),\n    Document(id=\"doc2\", content=\"Generative AI models are changing software development.\", metadata={\"topic\": \"AI\"}),\n    Document(id=\"doc3\", content=\"Large Language Models (LLMs) are a type of Generative AI.\", metadata={\"topic\": \"AI\"})\n]\ngraph_store.add_documents(docs)\ngraph_store.add_edge(\"doc1\", \"doc2\", label=\"discusses_impact_on\")\ngraph_store.add_edge(\"doc2\", \"doc3\", label=\"explains\")\n\n# Initialize Retriever Strategy\nretriever_strategy = BFSTraversalStrategy(k=2, max_depth=1)  # Retrieve 2 nodes, traverse to depth 1\n\n# Initialize GraphRetriever\n# embedding_dimension is required for vector search; ensure it matches your embedding model\nretriever = GraphRetriever(\n    graph_store=graph_store,\n    retriever_strategy=retriever_strategy,\n    embedding_dimension=1536  # Example: for OpenAI embeddings\n)\n\n# Example query\nquery_doc = Document(id=\"query_id\", content=\"What are LLMs?\")\nretrieved_nodes = retriever.get_relevant_documents(query_doc)\n\nprint(\"\\nRetrieved Nodes:\")\nfor node in retrieved_nodes:\n    print(f\"  ID: {node.id}, Content: {node.content[:50]}...\")\n\n# Optional: Clean up the collection (uncomment to run)\n# graph_store.clear_collection()","lang":"python","description":"This quickstart demonstrates how to set up 
`GraphRetriever` with `AstraGraphStore`. It involves initializing an `AstraDB` connection, adding documents and edges to the graph store, defining a retrieval strategy (e.g., BFS), and then using the `GraphRetriever` to fetch relevant documents based on a query, considering both content similarity and graph structure. Ensure `ASTRA_DB_APPLICATION_TOKEN` and `ASTRA_DB_API_ENDPOINT` environment variables are set."},"warnings":[{"fix":"Update edge creation calls to pass string IDs directly instead of wrapping them in `Id()` objects, especially for system fields like `'$id'`.","message":"The representation of document IDs on edges changed in `v0.6.0`. Previously, `Id()` objects were used; now, string IDs (e.g., `'$id'`) are preferred or required in many contexts.","severity":"breaking","affected_versions":">=0.6.0"},{"fix":"Review the documentation or source code for the specific `RetrieverStrategy` class you are using to confirm correct parameter names and types after upgrading to `v0.5.0` or later.","message":"The internal design of retriever strategies was significantly refactored in `v0.5.0`, leading to potential changes in constructor parameters for various `RetrieverStrategy` implementations.","severity":"breaking","affected_versions":">=0.5.0"},{"fix":"If encountering issues with the `k` parameter, ensure you are running `v0.5.1` or later. If stuck on `v0.5.0`, remove the `k` parameter from strategy constructors or upgrade.","message":"The `k` parameter, used to specify the number of nodes to retrieve in strategies, was temporarily removed in `v0.5.0` and then restored in `v0.5.1`. 
As a result, passing `k` raises a `TypeError` on `v0.5.0` but works again on `v0.5.1` and later.","severity":"gotcha","affected_versions":"0.5.0"},{"fix":"Run `pip install astrapy` if you plan to use `AstraGraphStore` for connecting to DataStax Astra DB.","message":"While `graph-retriever` itself has minimal direct dependencies, using the `AstraGraphStore` (a common use case) explicitly requires the `astrapy` library, which is not automatically installed with `graph-retriever`.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Install the `astrapy` library: `pip install astrapy`.","cause":"The `AstraGraphStore` class, used for integrating with DataStax Astra DB, requires the `astrapy` library, which is not a direct dependency of `graph-retriever`.","error":"ModuleNotFoundError: No module named 'astrapy'"},{"fix":"Instead of `graph_store.add_edge(Id('doc_a'), Id('doc_b'))`, use `graph_store.add_edge('doc_a', 'doc_b')`. Ensure you are passing string IDs.","cause":"Attempting to use `Id()` objects directly as string IDs or in contexts where a string is expected, particularly for edge identifiers, after `v0.6.0`.","error":"TypeError: 'Id' object is not callable"},{"fix":"Upgrade to `graph-retriever` `v0.5.1` or newer, where the `k` parameter was restored. If you must use `v0.5.0`, remove the `k` parameter from the strategy constructor.","cause":"The `k` parameter was temporarily removed in `v0.5.0` for retriever strategies.","error":"TypeError: BFSTraversalStrategy.__init__() got an unexpected keyword argument 'k'"},{"fix":"Double-check your `ASTRA_DB_APPLICATION_TOKEN` and `ASTRA_DB_API_ENDPOINT`. 
Ensure they are correctly set as environment variables or passed explicitly when initializing `AstraDB`.","cause":"The provided Astra DB application token or API endpoint is incorrect, expired, or improperly formatted.","error":"astrapy.exceptions.AstraDBConnectionError: Failed to connect to Astra DB"}]}