Vanna

2.0.2 · active · verified Fri Apr 17

Vanna is a Python library for generating SQL queries from natural language, visualizing the results, and getting answers from a database, now with enterprise security and user-aware permissions. Version 2.0.2 belongs to the 2.0 line, a complete rewrite focused on better support for agentic models and an improved user-aware framework. The library is actively developed and updated frequently.

Quickstart

This quickstart shows how to use `VannaDefault` in a local setup. It initializes Vanna with OpenAI and a local DuckDB database, trains it on a schema and example questions, then generates and runs SQL for a natural-language question. Set the `OPENAI_API_KEY` environment variable before running, or replace the placeholder in the code with your key.

import os
from vanna.local import VannaDefault
import pandas as pd

# For local testing, VannaDefault uses DuckDB and OpenAI as defaults.
# Ensure OPENAI_API_KEY is set in your environment.
# pip install "vanna[duckdb]" "vanna[openai]"

# Initialize VannaDefault with your OpenAI API key and preferred model
vn = VannaDefault(
    model='gpt-4o',
    api_key=os.environ.get('OPENAI_API_KEY', 'YOUR_OPENAI_API_KEY'), # Replace with actual key or set env var
    path='my_vanna_db' # Path for local ChromaDB and DuckDB storage
)

# Create and populate a sample table (DuckDB is the default database for VannaDefault)
vn.run_sql("CREATE TABLE sales (id INT, product TEXT, amount INT)")
vn.run_sql("INSERT INTO sales (id, product, amount) VALUES (1, 'Apple', 100), (2, 'Banana', 150)")

# Train Vanna with DDL, documentation, and example questions/SQL
# The DDL must describe the table the questions are about (sales)
vn.train(ddl="CREATE TABLE sales (id INT, product TEXT, amount INT)")
vn.train(documentation="The sales table has one row per sale; amount is the quantity sold for that product.")
vn.train(question="What are the total sales for each product?", sql="SELECT product, SUM(amount) FROM sales GROUP BY product")

# Ask a question
question = "Show me the sum of sales for each product"
sql = vn.generate_sql(question=question)
print(f"Generated SQL: {sql}")

# Run the generated SQL
if sql:
    results = vn.run_sql(sql=sql)
    print(f"Query Results:\n{results}")
else:
    print("Could not generate SQL for the question.")
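
To judge whether the generated SQL is correct, it helps to know what the reference query should return for the sample data. The sketch below is independent of Vanna and of any LLM: it rebuilds the `sales` table from the quickstart and runs the example query with Python's built-in `sqlite3` module, used here only as a stand-in for DuckDB. The helper name `expected_sales_totals` is hypothetical, and the `ORDER BY` clause is an addition that makes the row order deterministic.

```python
import sqlite3


def expected_sales_totals():
    """Rebuild the quickstart's sample table and run its reference query.

    sqlite3 stands in for DuckDB so that the sketch needs no extra packages.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (id INT, product TEXT, amount INT)")
        conn.execute(
            "INSERT INTO sales (id, product, amount) "
            "VALUES (1, 'Apple', 100), (2, 'Banana', 150)"
        )
        # Same aggregation as the training example, with ORDER BY added
        # so the rows come back in a predictable order.
        rows = conn.execute(
            "SELECT product, SUM(amount) FROM sales "
            "GROUP BY product ORDER BY product"
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    for product, total in expected_sales_totals():
        print(f"{product}: {total}")
```

Comparing these rows with the output of the generated SQL is a quick way to spot a query that runs but aggregates the wrong column or table.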
