Vanna
Vanna is a Python library that generates SQL queries from natural-language questions, runs them, and visualizes the results. The 2.x line (current version 2.0.2) is a complete rewrite that adds enterprise security and user-aware permissions and improves support for agentic models. The library is under active development and releases are frequent.
Common errors
- ModuleNotFoundError: No module named 'vanna.openai'
  cause: The optional `openai` dependency was not installed along with `vanna`.
  fix: Install Vanna with the OpenAI extra: `pip install "vanna[openai]"`.
- AuthenticationError: Incorrect API key provided
  cause: The OpenAI (or other LLM) API key is missing, incorrect, or expired.
  fix: Set the `OPENAI_API_KEY` environment variable correctly, or pass a valid API key via the `config` dictionary during LLM initialization.
- AttributeError: 'Vanna' object has no attribute 'train_sql'
  cause: Attempting to use pre-2.0 training methods like `train_sql` or `train_ddl` with Vanna 2.0.0+.
  fix: In Vanna 2.0.0+, all training methods are consolidated under `vn.train()`. Use `vn.train(sql=...)`, `vn.train(ddl=...)`, `vn.train(question=...)`, etc.
- TypeError: __init__ missing 1 required positional argument: 'config'
  cause: When creating a custom Vanna instance by inheriting from `VannaBase`, `LLM`, and `VectorStore` components, the `__init__` methods of the base classes (like `ChromaDB_VectorStore` or `OpenAI_Chat`) expect a `config` dictionary.
  fix: Ensure that your custom Vanna class's `__init__` method correctly calls the `__init__` methods of its parent components with a `config` dictionary, even if empty. Example: `ChromaDB_VectorStore.__init__(self, config=config)`.
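The `config` fix in the last entry can be illustrated without installing Vanna. The sketch below uses two stand-in classes, `StubVectorStore` and `StubChat`, which are hypothetical and only mimic the constructor shape described above (a required `config` argument). With the real library, `ChromaDB_VectorStore` and `OpenAI_Chat` would presumably take their place.

```python
# Stand-in components that mimic the constructor shape described above.
# They are illustrative only and are NOT Vanna's real classes.
class StubVectorStore:
    def __init__(self, config):
        self.vector_config = dict(config or {})


class StubChat:
    def __init__(self, config):
        self.chat_config = dict(config or {})


class MyVanna(StubVectorStore, StubChat):
    def __init__(self, config=None):
        # Call each parent explicitly and always pass a dictionary,
        # falling back to an empty one when the caller gives nothing.
        config = config if config is not None else {}
        StubVectorStore.__init__(self, config=config)
        StubChat.__init__(self, config=config)


vn = MyVanna()                                  # works with no arguments
vn_with_model = MyVanna(config={"model": "gpt-4o"})
```

Calling `StubChat()` directly, with no `config`, raises the same kind of `TypeError` as in the entry above; the subclass avoids it by always forwarding a dictionary.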
Warnings
- breaking Vanna 2.0.0 represents a complete architectural rewrite. Code written for pre-2.0 versions will likely break due to significant API changes, especially around `Vanna` class instantiation, training methods, and connector configuration.
- gotcha Vanna requires additional packages for specific LLMs (e.g., OpenAI, Anthropic) and Vector Stores (e.g., ChromaDB, Pinecone, pgvector). The base `pip install vanna` does not include these.
- gotcha API keys for LLMs (like OpenAI) are critical for Vanna's functionality. Incorrect or missing keys will lead to authentication errors.
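Because a missing or placeholder key only surfaces as an authentication error on the first LLM call, it can help to check for it up front. The helper below, `require_api_key`, is a hypothetical name and not part of Vanna; it is a minimal sketch that assumes the key lives in the `OPENAI_API_KEY` environment variable and treats values starting with `YOUR_` as unfilled placeholders.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the key stored under `name`, or raise a clear error if it is unusable."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    # Reject empty values and template placeholders such as 'YOUR_OPENAI_API_KEY'.
    if not key or key.startswith("YOUR_"):
        raise RuntimeError(
            f"{name} is not set; export it before creating the Vanna instance"
        )
    return key
```

Passing a plain dictionary as `env` keeps the check easy to exercise in tests; in normal use, call `require_api_key()` with no arguments to read the process environment.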
Install
- pip install vanna
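Since the base install omits the optional LLM and vector-store packages, a quick check before importing Vanna's submodules can turn a `ModuleNotFoundError` into a readable message. This is a rough sketch using only the standard library; `missing_packages` is a hypothetical helper, and the names `openai` and `chromadb` are assumed to be the top-level import names of the packages the extras pull in.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the subset of top-level package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names for the OpenAI and ChromaDB dependencies.
missing = missing_packages(["openai", "chromadb"])
if missing:
    print("Missing optional packages:", ", ".join(missing))
```

If anything is reported missing, installing the matching extra (for example `pip install "vanna[openai]"`) should resolve it.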
Imports
- VannaDefault
from vanna import Vanna
from vanna.local import VannaDefault
- VannaBase
from vanna.base import VannaBase
- OpenAI_Chat
from vanna.openai import OpenAI_Chat
- ChromaDB_VectorStore
from vanna.chromadb import ChromaDB_VectorStore
Quickstart
import os
from vanna.local import VannaDefault
import pandas as pd
# For local testing, VannaDefault uses DuckDB and OpenAI as defaults.
# Ensure OPENAI_API_KEY is set in your environment.
# pip install "vanna[duckdb]" "vanna[openai]"
# Initialize VannaDefault with your OpenAI API key and preferred model
vn = VannaDefault(
    model='gpt-4o',
    api_key=os.environ.get('OPENAI_API_KEY', 'YOUR_OPENAI_API_KEY'),  # Replace with actual key or set env var
    path='my_vanna_db'  # Path for local ChromaDB and DuckDB storage
)
# Connect to a database (DuckDB is default for VannaDefault)
vn.run_sql(sql="CREATE TABLE sales (id INT, product TEXT, amount INT)")
vn.run_sql(sql="INSERT INTO sales (id, product, amount) VALUES (1, 'Apple', 100), (2, 'Banana', 150)")
# Train Vanna with DDL, documentation, and example questions/SQL
vn.train(ddl="CREATE TABLE sales (id INT, product TEXT, amount INT)")
vn.train(question="What are the total sales for each product?", sql="SELECT product, SUM(amount) FROM sales GROUP BY product")
vn.train(documentation="The sales table has one row per sale; amount is the value summed when reporting total sales.")
# Ask a question
question = "Show me the sum of sales for each product"
sql = vn.generate_sql(question=question)
print(f"Generated SQL: {sql}")
# Run the generated SQL
if sql:
    results = vn.run_sql(sql=sql)
    print(f"Query Results:\n{results}")
else:
    print("Could not generate SQL for the question.")