pandas

3.0.1 verified Tue May 12 auth: no python install: verified quickstart: verified

The standard Python DataFrame library for data analysis. Current version is 3.0.1 (Feb 2026). pandas 3.0 is a major release with two ecosystem-wide breaking changes: Copy-on-Write (CoW) is now the only mode, and string columns now default to str dtype instead of object. Requires Python >=3.11.

pip install pandas

Common errors

error ModuleNotFoundError: No module named 'pandas'

cause The pandas library is not installed in the Python environment where the code is being executed.

fix

Install pandas into the same interpreter that runs your code: python -m pip install pandas (or conda install pandas if using Anaconda).
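When the install appears to succeed but the import still fails, the package has usually gone into a different environment. A minimal diagnostic sketch, using only the standard library, that prints which interpreter is running and whether it can see pandas:

```python
import importlib.util
import sys

# The interpreter actually executing this script.
print("interpreter:", sys.executable)

# find_spec returns None when the module is not importable from here.
spec = importlib.util.find_spec("pandas")
print("pandas found at:", spec.origin if spec else "not installed in this environment")
```

If the reported interpreter is not the one pip installed into, run pip through that interpreter instead.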

error KeyError: 'some_column_name'

cause Attempting to access a DataFrame column or index label that does not exist in the DataFrame, often due to typos, incorrect casing, or hidden whitespace in column names.

fix

Verify the exact column names using df.columns and correct any typos or casing issues. It's often helpful to strip whitespace from column names: df.columns = df.columns.str.strip().
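A short sketch of that check, using an invented DataFrame whose headers carry stray whitespace and mixed casing (the column names are illustrative only):

```python
import pandas as pd

# Hypothetical frame with messy headers: trailing/leading spaces, mixed case.
df = pd.DataFrame({"Name ": ["Alice", "Bob"], " score": [85, 92]})

# list() shows the exact labels, including whitespace that printing hides.
print(list(df.columns))

# Normalise the headers: strip whitespace, then lower-case.
df.columns = df.columns.str.strip().str.lower()

# This lookup raised KeyError before the headers were cleaned.
names = df["name"].tolist()
```

Lower-casing is an extra step beyond the fix above; drop it if the casing of the headers is meaningful.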

error AttributeError: 'DataFrame' object has no attribute 'append'

cause The `append()` method for DataFrames and Series was deprecated in pandas 1.4.0 and completely removed in pandas 2.0 and later versions.

fix

Replace df.append() with pd.concat() for combining DataFrames or Series. Example: new_df = pd.concat([df1, df2]).
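A slightly fuller sketch of the migration, with invented data. It also covers the common case of adding rows in a loop, where collecting the pieces and concatenating once is usually preferable to growing the DataFrame step by step:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3]})

# Old (removed in 2.0): combined = df1.append(df2, ignore_index=True)
combined = pd.concat([df1, df2], ignore_index=True)

# Rows gathered elsewhere (hypothetical records): build one DataFrame
# from the list and concatenate a single time.
rows = [{"a": 4}, {"a": 5}]
combined = pd.concat([combined, pd.DataFrame(rows)], ignore_index=True)
```

Passing ignore_index=True rebuilds a clean 0..n-1 index; omit it to keep the original index labels.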

error AttributeError: module 'pandas' has no attribute 'dataframe'

cause Incorrect capitalization when trying to create a DataFrame; the class name for DataFrame must start with a capital 'D'.

fix

Use pd.DataFrame() with a capital 'D' for DataFrame. Example: df = pd.DataFrame({'col1': [1, 2]}).

error ChainedAssignmentError: A value is trying to be set on a copy of a slice from a DataFrame.

cause In pandas 3.0, Copy-on-Write (CoW) is always enabled, so a chained assignment (e.g., `df[condition]['column'] = value`) always writes to a temporary copy and never to the original DataFrame. pandas flags this with `ChainedAssignmentError`, which is emitted as a warning rather than raised as an exception. It replaces the `SettingWithCopyWarning` of earlier versions, under which the write sometimes reached the original and sometimes did not.

fix

Use .loc for a single-step, explicit assignment to ensure modification of the original DataFrame. Example: df.loc[df['column_a'] > 5, 'column_b'] = new_value.

Warnings

breaking Copy-on-Write (CoW) is now the only mode in pandas 3.0. Chained assignment df['col'][mask] = value never modifies the original DataFrame. pandas normally emits a ChainedAssignmentError warning, but no exception is raised, so the lost write is easy to overlook. This is the most common invisible bug when upgrading.

fix Use df.loc[mask, 'col'] = value for conditional assignment. Use df = df.assign(col=...) for derived columns. Remove all defensive .copy() calls added to silence old SettingWithCopyWarning.
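A minimal sketch of these three patterns, assuming pandas 3.0 semantics (the frame and column names are invented for illustration). On pandas 2.x the last step still works but may emit a SettingWithCopyWarning:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Conditional assignment in one .loc step writes to df itself.
df.loc[df["a"] > 1, "b"] = "hit"

# Derived column via assign(), which returns a new DataFrame.
df = df.assign(double_a=df["a"] * 2)

# A filtered result can be modified directly. Under Copy-on-Write it never
# shares visible state with df, so no defensive .copy() is needed.
subset = df[df["a"] > 1]
subset["b"] = subset["b"].str.upper()
```

After the last line, subset holds the upper-cased values while df keeps its own.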

breaking String columns now default to str dtype instead of numpy object dtype. Code checking dtype == object or dtype == 'O' to detect string columns will fail silently in pandas 3.0.

fix Replace dtype == object checks with pd.api.types.is_string_dtype(col) or dtype == 'str'. For library code: handle both 'object' and 'str' dtypes during transition.
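For library code that has to run on both sides of the transition, a version-agnostic check might look like the sketch below. The helper name string_columns is hypothetical, and the sketch assumes unique column labels:

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def string_columns(df: pd.DataFrame) -> list:
    """Return the labels of columns that hold strings (hypothetical helper)."""
    # is_string_dtype accepts the new str dtype as well as object columns
    # that contain strings, so one check covers pandas 2.x and 3.0.
    return [col for col in df.columns if is_string_dtype(df[col])]


df = pd.DataFrame({"name": ["Alice", "Bob"], "score": [85, 92]})
text_cols = string_columns(df)
```

One caveat to verify against your pandas version: on recent 2.x releases an object column with mixed value types is not reported as a string column.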

breaking Support for Python 3.10 and below has been dropped. pandas 3.0 requires Python >=3.11.

fix Pin pandas<3.0 for Python <=3.10 environments. Upgrade Python to 3.11+ to use pandas 3.0.

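breaking Datetime default resolution changed from nanoseconds to microseconds (or input resolution). pd.Timestamp arithmetic and comparisons with nanosecond precision may produce different results.

fix Convert explicitly where nanosecond resolution is required, e.g. pd.to_datetime(arr).as_unit('ns') for an index or ser.dt.as_unit('ns') for a Series. pd.to_datetime(arr, unit='ns') applies only to numeric epoch input, where unit states what the numbers mean.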
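One way to pin the resolution regardless of pandas version is as_unit, available on Timestamp, DatetimeIndex and the Series .dt accessor since pandas 2.0. A minimal sketch with invented timestamps:

```python
import pandas as pd

# Parsed strings: typically microsecond resolution in pandas 3.0,
# nanosecond in pandas 2.x.
idx = pd.to_datetime(["2026-01-01 12:00:00", "2026-01-02 12:00:00"])

# Request nanosecond resolution explicitly.
idx_ns = idx.as_unit("ns")

# Series and scalar equivalents.
ser_ns = pd.Series(idx).dt.as_unit("ns")
ts_ns = pd.Timestamp("2026-01-01 12:00:00").as_unit("ns")

print(idx.dtype, "->", idx_ns.dtype)
```

Converting once at the boundary where data enters the code keeps later arithmetic and comparisons on a single, known resolution.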

breaking The default of the observed parameter of DataFrame.groupby() changed from False to True. Grouping by a Categorical column now omits categories that do not appear in the data, where earlier versions included them, so aggregation results can silently lose rows.

fix Pass observed=False explicitly to restore old behavior if unobserved categories are needed.
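The difference is easiest to see side by side. A small sketch with an invented Categorical column in which the category "L" never occurs:

```python
import pandas as pd

df = pd.DataFrame({
    "tier": pd.Categorical(["S", "M", "S"], categories=["S", "M", "L"]),
    "qty": [1, 2, 3],
})

# observed=True (the pandas 3.0 default): only categories present in the data.
seen = df.groupby("tier", observed=True)["qty"].sum()

# observed=False (the old default): every category, with 0 for the empty "L".
every = df.groupby("tier", observed=False)["qty"].sum()
```

Passing observed explicitly in either direction makes the code behave identically on pandas 2.x and 3.0.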

deprecated The mode.copy_on_write option is deprecated; setting it has no effect in pandas 3.0 and it will be removed in 4.0.

fix Remove pd.options.mode.copy_on_write = True/False from your code — CoW is always on.

gotcha pyarrow is not required but strongly recommended for pandas 3.0. Without pyarrow, the new str dtype falls back to numpy object-backed storage, losing most performance benefits.

fix Install pyarrow alongside pandas: pip install pandas pyarrow. Note that pd.Series(['a']).dtype reports str with or without pyarrow, so the dtype alone does not reveal which backend is in use, although performance differs significantly.
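Because the dtype name is the same either way, a quick environment check can help. The sketch below tests whether pyarrow is importable. Reading a storage attribute from the dtype is an assumption about the string dtype object and is guarded with getattr so the code still runs where the attribute is absent:

```python
import importlib.util

import pandas as pd

# True when pyarrow can be imported in the current environment.
has_pyarrow = importlib.util.find_spec("pyarrow") is not None

ser = pd.Series(["a", "b"])

# Assumption: the string dtype exposes a `storage` attribute
# (e.g. "pyarrow" or "python"); falls back to None otherwise.
storage = getattr(ser.dtype, "storage", None)

print(f"pandas {pd.__version__}, pyarrow installed: {has_pyarrow}, "
      f"dtype: {ser.dtype}, storage: {storage}")
```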

gotcha Many third-party libraries (scikit-learn, seaborn, statsmodels, SHAP) had pandas 3.0 compatibility issues at release. Check library versions when upgrading.

fix Test your full dependency stack against pandas 3.0 before upgrading. Use pandas 2.3.x as a stepping stone to surface deprecation warnings first.
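As a starting point for that audit, the installed versions of the stack can be listed with the standard library. The helper installed_versions is hypothetical, and the package list simply mirrors the libraries named above by their PyPI distribution names:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


stack = ["pandas", "pyarrow", "scikit-learn", "seaborn", "statsmodels", "shap"]
report = installed_versions(stack)
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

The output only records what is installed. Whether a given version supports pandas 3.0 still has to be checked against that library's release notes.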

gotcha Building `psycopg2` (a dependency of the `pandas[all]` extra) from source fails when the PostgreSQL development headers and libraries (`pg_config`) are missing. This is a common issue in minimal environments.

fix Ensure PostgreSQL development headers are installed, or install the `psycopg2-binary` package instead of `psycopg2` (e.g., `pip install psycopg2-binary`). For Debian/Ubuntu, `apt-get install libpq-dev`.

gotcha Installing `pandas[performance]` (which requires numba and llvmlite) may fail in minimal environments (e.g., Alpine Linux) due to missing system build tools like `gcc` and `cmake`, which are necessary for compiling these dependencies.

fix Ensure that essential build tools are installed in your environment before installing `pandas[performance]`. For Alpine Linux, use `apk add build-base cmake`.

Install

pip install "pandas[performance]"

pip install pandas pyarrow

pip install "pandas[all]"

Install compatibility verified last tested: 2026-05-12

python os / libc variant status wheel install import disk

3.10 alpine (musl) pandas - - 0.78s 164.6M

3.10 alpine (musl) pandas - - 0.99s 334.2M

3.10 alpine (musl) all - - - -

3.10 alpine (musl) performance - - - -

3.10 slim (glibc) pandas - - 0.56s 157M

3.10 slim (glibc) pandas - - 0.71s 304M

3.10 slim (glibc) all - - - -

3.10 slim (glibc) performance - - 0.58s 348M

3.11 alpine (musl) pandas - - 1.09s 177.1M

3.11 alpine (musl) pandas - - 1.40s 348.9M

3.11 alpine (musl) all - - - -

3.11 alpine (musl) performance - - - -

3.11 slim (glibc) pandas - - 0.92s 169M

3.11 slim (glibc) pandas - - 1.15s 318M

3.11 slim (glibc) all - - - -

3.11 slim (glibc) performance - - 0.94s 368M

3.12 alpine (musl) pandas - - 0.93s 162.3M

3.12 alpine (musl) pandas - - 1.20s 333.9M

3.12 alpine (musl) all - - - -

3.12 alpine (musl) performance - - - -

3.12 slim (glibc) pandas - - 0.95s 154M

3.12 slim (glibc) pandas - - 1.24s 303M

3.12 slim (glibc) all - - - -

3.12 slim (glibc) performance - - 0.98s 352M

3.13 alpine (musl) pandas - - 0.90s 161.2M

3.13 alpine (musl) pandas - - 1.15s 332.8M

3.13 alpine (musl) all - - - -

3.13 alpine (musl) performance - - - -

3.13 slim (glibc) pandas - - 0.93s 153M

3.13 slim (glibc) pandas - - 1.17s 302M

3.13 slim (glibc) all - - - -

3.13 slim (glibc) performance - - 0.94s 351M

3.9 alpine (musl) pandas - - 0.75s 172.4M

3.9 alpine (musl) pandas - - 0.92s 328.5M

3.9 alpine (musl) all - - - -

3.9 alpine (musl) performance - - - -

3.9 slim (glibc) pandas - - 0.67s 167M

3.9 slim (glibc) pandas - - 0.86s 306M

3.9 slim (glibc) all - - - -

3.9 slim (glibc) performance - - 0.69s 326M

Imports

pandas

wrong

df['b'][df['a'] > 1] = 'z'  # chained assignment: never updates df in pandas 3.0 (CoW)

correct

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})
# Modify using loc in one step
df.loc[df['a'] > 1, 'b'] = 'z'

Copy-on-Write is now mandatory in 3.0. Chained assignment (df['col'][mask] = val) no longer modifies the DataFrame. pandas emits a ChainedAssignmentError warning rather than raising an exception, so the write is simply lost.

string dtype

wrong

if df['col'].dtype == object:  # no longer true for string columns in pandas 3.0
    ...

correct

ser = pd.Series(['a', 'b'])
ser.dtype  # prints as str in pandas 3.0 (object in pandas 2.x)

# Check for string dtype in 3.0-compatible way:
if pd.api.types.is_string_dtype(ser):
    ...

String columns now infer to str dtype instead of numpy object. Code checking dtype == object or dtype == 'O' for string detection will silently miss string columns.

Quickstart verified last tested: 2026-04-23

pandas 3.0 patterns: use .loc for in-place modification, and expect string columns to have str dtype rather than object.

import pandas as pd
import numpy as np

# Create DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78],
    'dept': ['eng', 'eng', 'mkt']
})

# Correct modification in pandas 3.0 (CoW)
df.loc[df['score'] > 80, 'grade'] = 'pass'

# Or use assign() for derived columns (returns new DataFrame)
df = df.assign(grade=lambda x: np.where(x['score'] > 80, 'pass', 'fail'))

# Check dtypes — strings are now 'str', not 'object'
print(df.dtypes)
# name     str
# score    int64
# dept     str