PandasAI — Chat with Your Data Using AI
Conversational data analysis with LLMs. Chat with SQL databases, CSV, Parquet files using natural language. Auto-generates Python/SQL and visualizations. 23K+ stars.
What it is
PandasAI is a Python library that adds conversational capabilities to data analysis. You ask questions about your data in natural language, and PandasAI generates the Python code or SQL query to answer them. It works with pandas DataFrames, SQL databases, CSV files, and Parquet files. It can also generate visualizations automatically.
PandasAI targets data analysts, scientists, and business users who want to explore datasets without writing code manually. It bridges the gap between natural language questions and programmatic data analysis.
How it saves time or tokens
PandasAI eliminates the translation step between 'what do I want to know' and 'how do I write the code.' Instead of composing groupby, merge, and pivot operations manually, you describe what you need. The library generates and executes the code, returning results or charts. For exploratory analysis where you run dozens of ad-hoc queries, this reduces each query from minutes to seconds. Estimated token usage is around 500 tokens per query.
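For a sense of the work being replaced, here is a hand-written sketch of the groupby and pivot code that sits behind a question like "total sales by region and quarter". The `sales` table and its column names are made up for illustration. This is plain pandas, not code generated by PandasAI.

```python
import pandas as pd

# Hypothetical sales data; the column names are assumptions for this sketch.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "quarter": ["Q1", "Q2", "Q1", "Q1", "Q2"],
    "amount": [100, 150, 80, 20, 120],
})

# Total sales per region.
totals_by_region = sales.groupby("region")["amount"].sum()

# Regions as rows, quarters as columns, summed amounts as values.
by_region_quarter = sales.pivot_table(
    index="region", columns="quarter", values="amount", aggfunc="sum"
)

print(totals_by_region)
print(by_region_quarter)
```

Each new question usually means another variation of these few lines, which is the step a natural language prompt is meant to skip.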
How to use
- Install PandasAI:
pip install pandasai
- Create a SmartDataframe and ask questions:
import pandas as pd
from pandasai import SmartDataframe
df = pd.DataFrame({
    'country': ['USA', 'China', 'Japan', 'Germany', 'India'],
    'gdp': [21400, 14700, 5100, 3800, 2900],
    'population': [331, 1400, 126, 83, 1380]
})
sdf = SmartDataframe(df)
result = sdf.chat('Which country has the highest GDP per capita?')
print(result)
- Generate visualizations:
sdf.chat('Plot a bar chart of GDP by country')
# Generates and displays a matplotlib chart
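Because the answer comes from generated code, it can help to check a result by hand now and then. A minimal sketch in plain pandas for the GDP per capita question, using the same sample data (no PandasAI call involved):

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["USA", "China", "Japan", "Germany", "India"],
    "gdp": [21400, 14700, 5100, 3800, 2900],
    "population": [331, 1400, 126, 83, 1380],
})

# Ratio of the two columns; the sample data does not state its units.
gdp_per_capita = df["gdp"] / df["population"]

# Row label of the largest ratio, then the matching country name.
top_country = df.loc[gdp_per_capita.idxmax(), "country"]
print(top_country)  # prints USA
```

If the chat answer disagrees with a check like this, that is a sign to rephrase the question or inspect the generated code.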
Example
Connecting to a SQL database:
from pandasai import SmartDataframe
from pandasai.connectors import PostgreSQLConnector
connector = PostgreSQLConnector(
    config={
        'host': 'localhost',
        'port': 5432,
        'database': 'analytics',
        'username': 'user',
        'password': 'pass',
        'table': 'sales'
    }
)
sdf = SmartDataframe(connector)
sdf.chat('What were the total sales by region last quarter?')
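A question like this hides two decisions: what "last quarter" means as a date range, and how to aggregate. The sketch below spells both out by hand so there is something to compare a generated answer against. It uses an in-memory SQLite table as a stand-in for the PostgreSQL `sales` table. The columns `region`, `amount`, and `sale_date` and the helper `previous_quarter_range` are assumptions for illustration, not part of PandasAI.

```python
import sqlite3
from datetime import date


def previous_quarter_range(today: date) -> tuple[date, date]:
    """Return (start, end) of the quarter before `today`, with `end` exclusive."""
    # First month of the current quarter: 1, 4, 7, or 10.
    current_start_month = 3 * ((today.month - 1) // 3) + 1
    end = date(today.year, current_start_month, 1)
    # Step back three months, wrapping into the previous year for Q1.
    if current_start_month == 1:
        start = date(today.year - 1, 10, 1)
    else:
        start = date(today.year, current_start_month - 3, 1)
    return start, end


# In-memory SQLite stands in for the real database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL, sale_date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", 100.0, "2024-01-15"),
        ("North", 50.0, "2024-03-31"),
        ("South", 70.0, "2024-02-10"),
        ("South", 999.0, "2024-04-01"),  # falls in the current quarter, excluded
    ],
)

start, end = previous_quarter_range(date(2024, 5, 15))
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales "
    "WHERE sale_date >= ? AND sale_date < ? "
    "GROUP BY region ORDER BY region",
    (start.isoformat(), end.isoformat()),
).fetchall()
print(rows)
```

Stating the date range explicitly in the question (for example "between 2024-01-01 and 2024-03-31") removes one source of ambiguity for the model.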
Related on TokRepo
- AI coding tools — AI-powered development and analysis tools
- Research tools — data exploration and research frameworks
Common pitfalls
- PandasAI executes generated code automatically. Review the generated code before running on sensitive datasets, especially when connected to production databases.
- Complex multi-step queries sometimes produce incorrect code. Break complex questions into simpler sub-questions for more reliable results.
- LLM API costs accumulate during heavy exploratory sessions. Consider using local models via Ollama for high-volume analysis.
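To see how costs add up over a session, a back-of-the-envelope estimate is enough. The sketch below uses the roughly 500 tokens per query figure quoted earlier on this page. The price per 1,000 tokens is a placeholder assumption, not a real provider rate, and `estimate_session_cost` is a hypothetical helper.

```python
def estimate_session_cost(
    queries: int,
    tokens_per_query: int = 500,       # estimate quoted on this page
    usd_per_1k_tokens: float = 0.01,   # placeholder price, replace with your provider's rate
) -> float:
    """Rough spend estimate in USD for a session of `queries` questions."""
    total_tokens = queries * tokens_per_query
    return total_tokens / 1000 * usd_per_1k_tokens


# 200 ad-hoc questions at the assumed rate.
print(round(estimate_session_cost(200), 2))
```

Plugging in your own query counts and rates gives a quick sense of when a local model starts to pay off.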
Frequently Asked Questions
PandasAI supports Ollama and other local model providers. Configure the LLM parameter to point to your local endpoint, which eliminates API costs for exploratory sessions.
PandasAI provides connectors for PostgreSQL, MySQL, and other databases. It generates SQL queries or loads data into DataFrames, depending on the complexity of the question.
PandasAI executes generated Python code, which carries risk. Use read-only database credentials and review generated code before execution. Do not point it at production databases without safeguards.
PandasAI generates matplotlib and seaborn charts including bar charts, line plots, scatter plots, histograms, and pie charts. Specify the chart type in your natural language query.
Accuracy depends on the LLM used and the clarity of your question. Simple aggregations and filters are highly reliable. Complex joins or multi-step transformations may require iteration or manual correction.
Source & Thanks
Created by Sinaptik AI. Licensed under a custom license. Sinaptik-AI/pandas-ai (23,000+ GitHub stars)