Narwhals
Narwhals is an extremely lightweight and extensible compatibility layer between dataframe libraries. It provides a unified API, largely mirroring the Polars API, so that users can write dataframe-agnostic code that works across backends such as pandas, Polars, cuDF, PyArrow, Dask, DuckDB, Ibis, PySpark, and SQLFrame. It is currently at version 2.18.1 and is actively developed, with releases for bug fixes and minor enhancements often arriving weekly or bi-weekly.
Warnings
- gotcha Narwhals is a compatibility layer and does not provide dataframe functionality itself. You must install the underlying dataframe libraries (e.g., `pandas`, `polars`, `pyarrow`) separately for Narwhals to function with those backends. Not installing them will lead to `ModuleNotFoundError` or `TypeError` when `from_native` is used.
- breaking The main `narwhals` namespace may introduce breaking changes, deprecations, or API shifts in new releases. For library code that needs long-term stability, import a stable namespace instead, e.g. `import narwhals.stable.v1 as nw_stable`. Narwhals promises not to change or remove public functions within a stable namespace; when breaking changes are necessary, they are introduced in a new stable namespace (e.g., `v2`, `v3`) rather than in an existing one.
- gotcha Narwhals implements a *subset* of the Polars API. Not all Polars functions, arguments, or behaviors are necessarily supported or identically replicated across all backends. Complex or less common Polars operations might not be available or might behave differently in specific backend implementations. Always consult the official Narwhals API completeness documentation.
- gotcha Narwhals preserves the eager/lazy execution model of the underlying dataframe. If you pass a lazy frame (e.g., Polars LazyFrame, Dask DataFrame), operations remain lazy. Explicitly call `.collect()` when an eager result is required, especially before operations like `.shape` or `.pivot()`, which necessitate materializing the data. Failing to do so can lead to errors or unexpected behavior.
- deprecated Deprecations in the underlying backend libraries can affect Narwhals. For example, DuckDB 1.5 deprecated `fetch_arrow_table`. While Narwhals strives to adapt, relying on specific backend versions might expose you to these upstream changes. Backend-specific behavior has also needed several fixes in recent Narwhals versions, notably for `when/then` conditions, `join` operations, and null value handling, so results in these areas can depend on the Narwhals and backend versions installed.
Install
pip install narwhals
Imports
- narwhals
import narwhals as nw
- IntoFrameT
from narwhals.typing import IntoFrameT
- stable.v1
import narwhals.stable.v1 as nw_stable
Quickstart
import narwhals as nw
import pandas as pd
import polars as pl
from narwhals.typing import IntoFrameT


def process_data(df_native: IntoFrameT) -> IntoFrameT:
    # Wrap the native dataframe in a Narwhals frame.
    df = nw.from_native(df_native)
    # Mean of `value` per category, largest mean first.
    result = (
        df.group_by(nw.col('category'))
        .agg(nw.col('value').mean().alias('mean_value'))
        .sort('mean_value', descending=True)
    )
    # Return the same native type that was passed in.
    return result.to_native()


# Example with pandas
pd_df = pd.DataFrame({'category': ['A', 'B', 'A', 'C'], 'value': [10, 20, 15, 25]})
pd_result = process_data(pd_df)
print('Pandas Result:')
print(pd_result)

# Example with polars
pl_df = pl.DataFrame({'category': ['A', 'B', 'A', 'C'], 'value': [10, 20, 15, 25]})
pl_result = process_data(pl_df)
print('\nPolars Result:')
print(pl_result)