Introduction
Ibis provides a unified Python dataframe API that compiles analytics expressions to SQL or native query plans for many different backends. You write your data transformations once in Python, and Ibis translates them to run on DuckDB, Polars, PostgreSQL, Spark, BigQuery, Snowflake, or other engines without changing your code.
What Ibis Does
- Provides a pandas-like API that produces deferred expressions instead of eager computation
- Compiles expressions to optimized SQL or query plans for the target backend engine
- Supports 20+ backends including DuckDB, Polars, PostgreSQL, Spark, BigQuery, and Snowflake
- Enables interactive exploration with the same code that runs in production pipelines
- Offers a consistent type system across backends for predictable behavior
Architecture Overview
Ibis uses a two-layer architecture. The top layer is the expression API, which builds a lazy computation graph of dataframe operations. The bottom layer is a compiler that translates the expression graph into the target backend's query language (SQL for relational databases, native API calls for Polars or DataFusion). Backends implement a standard interface so that new engines can be added as plugins. No data moves between systems until the user explicitly materializes results.
Getting Started
- Install with pip, including your backend extra, e.g., `pip install 'ibis-framework[duckdb]'`
- Connect to a backend with `ibis.<backend>.connect()`, passing connection parameters
- Read data from files, tables, or existing database schemas
- Chain operations using the expression API (filter, select, group_by, join, mutate)
- Materialize results with `.to_pandas()`, `.to_polars()`, or `.execute()`
Key Features
- Backend portability: switch from DuckDB in development to BigQuery in production with no code changes
- Lazy evaluation: operations build an expression tree and execute only when results are requested
- SQL output: call `.compile()` on any expression to see the generated SQL for debugging
- Type-safe expressions: operations are validated at expression-build time, catching errors before execution
- Composable: build reusable transformation functions that work across any backend
Comparison with Similar Tools
- pandas — eager in-memory computation; Ibis is lazy and pushes computation to the backend engine
- Polars — single fast engine; Ibis is a multi-backend API layer that can target Polars as one of many backends
- SQLAlchemy — ORM and SQL toolkit for application development; Ibis is an analytics-focused dataframe API
- dbt — SQL-based transformation layer; Ibis provides Python-native analytics with SQL compilation
- PySpark — Spark-specific dataframe API; Ibis supports Spark and 20+ other backends with one API
FAQ
Q: Is Ibis a replacement for pandas?
A: Ibis can replace pandas for analytics workflows where you want backend portability or to work with data that does not fit in memory. For small in-memory data manipulation, pandas remains a fine choice.
Q: Can I see the SQL that Ibis generates?
A: Yes. Call ibis.to_sql(expression) or expression.compile() to inspect the generated SQL for any relational backend.
Q: Does Ibis load all data into memory?
A: No. Ibis pushes computation to the backend engine. Data stays in the database or engine until you explicitly call .execute() or .to_pandas() to fetch results.
Q: Which backends are supported?
A: DuckDB, Polars, PostgreSQL, MySQL, SQLite, Spark, BigQuery, Snowflake, Trino, ClickHouse, DataFusion, Impala, MSSQL, Oracle, Exasol, Flink, and others. The list grows as the community adds backends.