What is Vanna?
Vanna is an open-source AI SQL agent that learns your specific database. It uses RAG (Retrieval-Augmented Generation): you train it on your DDL schemas, documentation, and example queries, and it retrieves the relevant pieces to generate accurate SQL from natural-language questions. Unlike generic text-to-SQL tools, Vanna improves over time as you add more training data specific to your database.
Answer-Ready: Vanna is an AI SQL agent that uses RAG to generate accurate SQL from natural language. It trains on your schema, docs, and query history. Supports PostgreSQL, MySQL, BigQuery, Snowflake, and 15+ databases. Built-in web UI (Flask) for non-technical users. 13k+ GitHub stars.
Best for: Data teams wanting natural language database access.
Works with: Any SQL database, OpenAI, Anthropic Claude, Ollama.
Setup time: Under 5 minutes.
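
The retrieval idea behind RAG can be illustrated with a small, standard-library-only sketch. This is not Vanna's implementation: a real setup would use vector embeddings and a vector store, while this sketch substitutes plain word overlap. The names `TrainingItem`, `retrieve`, and `build_prompt` are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class TrainingItem:
    kind: str  # "ddl", "documentation", or "sql"
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(question: str, items: list[TrainingItem], k: int = 2) -> list[TrainingItem]:
    """Return up to k items that share the most words with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(item.text)), item) for item in items]
    scored = [pair for pair in scored if pair[0] > 0]  # drop unrelated items
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


def build_prompt(question: str, items: list[TrainingItem]) -> str:
    """Assemble the text an LLM would receive: retrieved context plus the question."""
    context = "\n".join(f"[{item.kind}] {item.text}" for item in retrieve(question, items))
    return f"{context}\nQuestion: {question}\nSQL:"


# Hypothetical training store; the products table is made up for contrast.
store = [
    TrainingItem("ddl", "CREATE TABLE users (id INT, name VARCHAR, email VARCHAR)"),
    TrainingItem("documentation", "The users table stores all registered users. The email field is unique."),
    TrainingItem("ddl", "CREATE TABLE products (sku INT, price DECIMAL)"),
]

prompt = build_prompt("How many users have an email?", store)
print(prompt)
```

Only the items related to the question reach the prompt, which is why adding more database-specific training data tends to help: there is more relevant context to retrieve.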
Core Features
1. Train on Your Data
# Assumes vn is an already-configured Vanna instance (LLM backend plus vector store)
# Train on DDL
vn.train(ddl="CREATE TABLE users (id INT, name VARCHAR, email VARCHAR)")
# Train on documentation
vn.train(documentation="The users table stores all registered users. The email field is unique.")
# Train on example queries
vn.train(sql="SELECT name, COUNT(*) as order_count FROM users JOIN orders ON users.id = orders.customer_id GROUP BY name")
2. Multi-Database Support
| Database | Connector |
|---|---|
| PostgreSQL | psycopg2 |
| MySQL | pymysql |
| BigQuery | google-cloud-bigquery |
| Snowflake | snowflake-connector-python |
| DuckDB | duckdb |
| SQLite | sqlite3 |
| Databricks | databricks-sql-connector |
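
Whatever the connector, its job is the same: execute the generated SQL and hand back rows. As a minimal sketch, the following uses the standard-library `sqlite3` module to run the example query from the training section against an in-memory database. The `orders` table layout and the sample rows are assumptions made up for this illustration, `run_sql` here is a local helper rather than Vanna's own method, and an `ORDER BY` is added only to make the output deterministic.

```python
import sqlite3

# In-memory database with hypothetical sample data
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INT, name VARCHAR, email VARCHAR);
    CREATE TABLE orders (id INT, customer_id INT);
    INSERT INTO users VALUES (1, 'Ada', 'ada@example.com'), (2, 'Lin', 'lin@example.com');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")


def run_sql(sql: str) -> list[tuple]:
    """Execute a query on the in-memory database and return all rows."""
    return conn.execute(sql).fetchall()


rows = run_sql(
    "SELECT name, COUNT(*) AS order_count "
    "FROM users JOIN orders ON users.id = orders.customer_id "
    "GROUP BY name ORDER BY name"
)
print(rows)  # [('Ada', 2), ('Lin', 1)]
```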
3. Built-in UI
from vanna.flask import VannaFlaskApp
app = VannaFlaskApp(vn)
app.run()
# Opens web UI at localhost:8084
4. Auto-Visualization
# Generate SQL + run it + create chart
vn.ask("Show monthly revenue trend as a line chart")
# Returns: SQL query, DataFrame result, and Plotly chart
FAQ
Q: How accurate is it?
A: Accuracy depends on training data quality. With good schema docs and example queries, expect roughly 85-95% accuracy on typical business questions.
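
One way to put a number on accuracy for your own database is execution accuracy: run each generated query and a hand-written reference query, then count how often the results match. The sketch below assumes that definition and uses `sqlite3` with made-up data. The `execution_accuracy` helper and the query pairs are hypothetical, and the "generated" queries are written by hand here rather than produced by Vanna.

```python
import sqlite3


def execution_accuracy(conn, pairs):
    """Fraction of (generated_sql, reference_sql) pairs that return the same rows."""
    matches = 0
    for generated_sql, reference_sql in pairs:
        try:
            got = sorted(conn.execute(generated_sql).fetchall())
        except sqlite3.Error:
            continue  # SQL that fails to run counts as a miss
        expected = sorted(conn.execute(reference_sql).fetchall())
        matches += got == expected
    return matches / len(pairs)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INT, name VARCHAR, email VARCHAR);
    INSERT INTO users VALUES (1, 'Ada', 'ada@example.com'), (2, 'Lin', 'lin@example.com');
""")

# (generated, reference) pairs; all hypothetical
pairs = [
    ("SELECT COUNT(*) FROM users", "SELECT COUNT(id) FROM users"),       # same result
    ("SELECT name FROM users WHERE id = 1", "SELECT name FROM users"),   # wrong filter
    ("SELECT nme FROM users", "SELECT name FROM users"),                 # invalid column
    ("SELECT email FROM users ORDER BY id", "SELECT email FROM users"),  # same rows
]

accuracy = execution_accuracy(conn, pairs)
print(f"{accuracy:.0%}")  # 50%
```

Comparing sorted result sets ignores row order, which suits questions where ordering does not matter; an order-sensitive check would compare the lists directly.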
Q: Does it support Claude?
A: Yes, use vanna.anthropic.Anthropic_Chat as the LLM backend with any Anthropic model.
Q: Is my data safe?
A: Vanna sends schema metadata and questions to the LLM, not raw data. Use local models via Ollama for full privacy.