pandas

3.0.1 verified Tue May 12 auth: no python install: verified quickstart: verified

The standard Python DataFrame library for data analysis. Current version is 3.0.1 (Feb 2026). pandas 3.0 is a major release with two ecosystem-wide breaking changes: Copy-on-Write (CoW) is now the only mode, and string columns now default to str dtype instead of object. Requires Python >=3.11.

pip install pandas
error ModuleNotFoundError: No module named 'pandas'
cause The pandas library is not installed in the Python environment where the code is being executed.
fix
Install pandas using pip: pip install pandas (or conda install pandas if using Anaconda).
error KeyError: 'some_column_name'
cause Attempting to access a DataFrame column or index label that does not exist in the DataFrame, often due to typos, incorrect casing, or hidden whitespace in column names.
fix
Verify the exact column names using df.columns and correct any typos or casing issues. It's often helpful to strip whitespace from column names: df.columns = df.columns.str.strip().
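A minimal sketch of that cleanup, assuming a hypothetical frame whose headers picked up stray whitespace and mixed casing (the column names are illustrative only):

```python
import pandas as pd

# Hypothetical data: headers with stray whitespace, as often seen after reading a CSV.
df = pd.DataFrame({" name ": ["Alice", "Bob"], "Score\t": [85, 92]})

# Inspect the exact labels; repr() makes hidden whitespace visible.
print([repr(c) for c in df.columns])

# Strip surrounding whitespace, then normalize casing.
df.columns = df.columns.str.strip().str.lower()

# Check for the label up front so the failure message lists what is available.
wanted = "score"
if wanted not in df.columns:
    raise KeyError(f"{wanted!r} not found; available: {list(df.columns)}")
scores = df[wanted]
```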
error AttributeError: 'DataFrame' object has no attribute 'append'
cause The `append()` method for DataFrames and Series was deprecated in pandas 1.4.0 and completely removed in pandas 2.0 and later versions.
fix
Replace df.append() with pd.concat() for combining DataFrames or Series. Example: new_df = pd.concat([df1, df2]).
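A short sketch of the two most common append() migrations, using made-up frames; the single-row case assumes the row arrives as a dict:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3]})

# Old: df1.append(df2, ignore_index=True)
combined = pd.concat([df1, df2], ignore_index=True)

# Old: combined.append({"a": 4}, ignore_index=True)
# Wrap the single row in a one-row DataFrame before concatenating.
new_row = {"a": 4}
with_row = pd.concat([combined, pd.DataFrame([new_row])], ignore_index=True)
```

When rows are added in a loop, it is usually cheaper to collect them in a list and call pd.concat once at the end.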
error AttributeError: module 'pandas' has no attribute 'dataframe'
cause Incorrect capitalization when trying to create a DataFrame; the class name for DataFrame must start with a capital 'D'.
fix
Use pd.DataFrame() with a capital 'D' for DataFrame. Example: df = pd.DataFrame({'col1': [1, 2]}).
error ChainedAssignmentError: A value is trying to be set on a copy of a DataFrame or Series through chained assignment.
cause In pandas 3.0, Copy-on-Write (CoW) is always enabled, so a chained assignment (e.g., `df[condition]['column'] = value`) always operates on a temporary copy and never updates the original DataFrame. ChainedAssignmentError replaces the old `SettingWithCopyWarning` and is issued as a warning, so the code keeps running unless warnings are escalated to errors.
fix
Use .loc for a single-step, explicit assignment to ensure modification of the original DataFrame. Example: df.loc[df['column_a'] > 5, 'column_b'] = new_value.
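breaking Copy-on-Write (CoW) is now the only mode in pandas 3.0. Chained assignment such as df['col'][mask] = value never modifies the original DataFrame. pandas emits a ChainedAssignmentError warning when it detects the pattern, but if warnings are filtered or ignored the assignment is a silent no-op, which makes this the most common invisible bug when upgrading.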
fix Use df.loc[mask, 'col'] = value for conditional assignment. Use df = df.assign(col=...) for derived columns. Remove all defensive .copy() calls added to silence old SettingWithCopyWarning.
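To illustrate the point about defensive .copy() calls, here is a sketch assuming pandas 3.0 CoW semantics; top_scorers is a hypothetical helper. On pandas 2.x without CoW the same code would emit SettingWithCopyWarning, which is why the .copy() used to be added.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob", "Charlie"], "score": [85, 92, 78]})

def top_scorers(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: filter rows, then add a column to the result."""
    out = frame[frame["score"] > 80]  # no defensive .copy() needed under CoW
    out["flag"] = True                # writes to `out` only, never to `frame`
    return out

result = top_scorers(df)
```

The parent frame is left untouched: the new column exists only on the returned object.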
breaking String columns now default to str dtype instead of numpy object dtype. Code checking dtype == object or dtype == 'O' to detect string columns will fail silently in pandas 3.0.
fix Replace dtype == object checks with pd.api.types.is_string_dtype(col) or dtype == 'str'. For library code: handle both 'object' and 'str' dtypes during transition.
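A minimal sketch of a dtype check meant to work during the transition; string_columns is a hypothetical helper name, not a pandas API:

```python
import pandas as pd
from pandas.api.types import is_string_dtype

def string_columns(frame: pd.DataFrame) -> list[str]:
    """Return the names of columns that pandas considers string-typed."""
    return [col for col in frame.columns if is_string_dtype(frame[col])]

df = pd.DataFrame({"name": ["Alice", "Bob"], "score": [85, 92]})
cols = string_columns(df)
```

How object columns holding non-string values are classified can differ between pandas versions, so treat this as a starting point rather than a complete compatibility layer.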
breaking Python 3.10 and below dropped. pandas 3.0 requires Python >=3.11.
fix Pin pandas<3.0 for Python <=3.10 environments. Upgrade Python to 3.11+ to use pandas 3.0.
breaking Datetime default resolution changed from nanoseconds to microseconds (or input resolution). pd.Timestamp arithmetic and comparisons with nanosecond precision may produce different results.
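fix Convert explicitly where nanosecond resolution is required: pd.to_datetime(arr).as_unit('ns') for an index, or ser.dt.as_unit('ns') for a Series. Note that the unit argument of pd.to_datetime describes the unit of numeric (epoch) input; it does not set the resolution of the result.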
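One way to pin the resolution is as_unit, which converts an existing datetime object to a given resolution (available since pandas 2.0). A sketch with made-up timestamps:

```python
import pandas as pd

# Parsed strings no longer default to nanosecond resolution in pandas 3.0.
idx = pd.to_datetime(["2024-01-01 12:00:00", "2024-01-02 12:00:00"])
print(idx.dtype)  # resolution depends on the pandas version and the input

# Convert explicitly where downstream code assumes nanoseconds.
idx_ns = idx.as_unit("ns")

# Series equivalent via the .dt accessor.
ser_ns = pd.Series(idx).dt.as_unit("ns")
```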
breaking The default of the observed parameter of DataFrame.groupby() changed to True. For Categorical grouping columns, unobserved categories used to be included by default; they are now dropped, which silently changes the shape of groupby aggregation results.
fix Pass observed=False explicitly to restore old behavior if unobserved categories are needed.
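The difference is easiest to see on a small categorical column with one category that never occurs. The data below is invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "size": pd.Categorical(["S", "M", "S"], categories=["S", "M", "L"]),
    "qty": [1, 2, 3],
})

# Only categories that actually appear in the data ("S" and "M").
observed_only = df.groupby("size", observed=True)["qty"].sum()

# All declared categories; the unobserved "L" gets a sum of 0.
all_categories = df.groupby("size", observed=False)["qty"].sum()
```

Passing observed explicitly in both cases keeps the result independent of the installed pandas version.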
deprecated mode.copy_on_write option deprecated — setting it has no effect in pandas 3.0 and will be removed in 4.0.
fix Remove pd.options.mode.copy_on_write = True/False from your code — CoW is always on.
gotcha pyarrow is not required but strongly recommended for pandas 3.0. Without pyarrow, the new str dtype falls back to numpy object-backed storage, losing most performance benefits.
fix Install pyarrow alongside pandas: pip install pandas pyarrow. Note that pd.Series(['a']).dtype reports 'str' either way, so the dtype alone does not reveal which backend is in use; only performance differs.
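Because the dtype does not reveal the backend, a quick environment check can help. This sketch only tests whether pyarrow is importable; it makes no claim about which storage pandas actually selected:

```python
import importlib.util

import pandas as pd

# find_spec returns None when the package cannot be located.
has_pyarrow = importlib.util.find_spec("pyarrow") is not None

dtype = pd.Series(["a"]).dtype
print(f"pandas {pd.__version__}, string dtype: {dtype}, pyarrow installed: {has_pyarrow}")
```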
gotcha Many third-party libraries (scikit-learn, seaborn, statsmodels, SHAP) had pandas 3.0 compatibility issues at release. Check library versions when upgrading.
fix Test your full dependency stack against pandas 3.0 before upgrading. Use pandas 2.3.x as a stepping stone to surface deprecation warnings first.
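One possible way to surface deprecations while still on pandas 2.3.x is to escalate warnings to errors around the code under test. run_strict is a hypothetical helper built on the standard warnings module, and the choice of warning categories is an assumption:

```python
import warnings

import pandas as pd

def run_strict(func, *args, **kwargs):
    """Hypothetical helper: call func with deprecation warnings raised as errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", FutureWarning)
        warnings.simplefilter("error", DeprecationWarning)
        return func(*args, **kwargs)

# Code that triggers no deprecation runs normally.
total = run_strict(lambda: pd.Series([1, 2, 3]).sum())

# Code that warns now fails loudly instead of passing unnoticed.
def legacy_call():
    warnings.warn("stand-in for a pandas deprecation", FutureWarning)
    return 1

try:
    run_strict(legacy_call)
    raised = False
except FutureWarning:
    raised = True
```

The same effect can be had for a whole test run with Python's -W error::FutureWarning flag.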
gotcha Building `psycopg2` from source fails due to missing PostgreSQL development headers and libraries (`pg_config`). This is a common issue in minimal environments.
fix Ensure PostgreSQL development headers are installed, or install the `psycopg2-binary` package instead of `psycopg2` (e.g., `pip install psycopg2-binary`). For Debian/Ubuntu, `apt-get install libpq-dev`.
gotcha Installing `pandas[performance]` (which requires numba and llvmlite) may fail in minimal environments (e.g., Alpine Linux) due to missing system build tools like `gcc` and `cmake`, which are necessary for compiling these dependencies.
fix Ensure that essential build tools are installed in your environment before installing `pandas[performance]`. For Alpine Linux, use `apk add build-base cmake`.
pip install pandas[performance]
pip install pandas pyarrow
pip install pandas[all]
python os / libc variant status wheel install import disk
3.10 alpine (musl) pandas - - 0.78s 164.6M
3.10 alpine (musl) pandas - - 0.99s 334.2M
3.10 alpine (musl) all - - - -
3.10 alpine (musl) performance - - - -
3.10 slim (glibc) pandas - - 0.56s 157M
3.10 slim (glibc) pandas - - 0.71s 304M
3.10 slim (glibc) all - - - -
3.10 slim (glibc) performance - - 0.58s 348M
3.11 alpine (musl) pandas - - 1.09s 177.1M
3.11 alpine (musl) pandas - - 1.40s 348.9M
3.11 alpine (musl) all - - - -
3.11 alpine (musl) performance - - - -
3.11 slim (glibc) pandas - - 0.92s 169M
3.11 slim (glibc) pandas - - 1.15s 318M
3.11 slim (glibc) all - - - -
3.11 slim (glibc) performance - - 0.94s 368M
3.12 alpine (musl) pandas - - 0.93s 162.3M
3.12 alpine (musl) pandas - - 1.20s 333.9M
3.12 alpine (musl) all - - - -
3.12 alpine (musl) performance - - - -
3.12 slim (glibc) pandas - - 0.95s 154M
3.12 slim (glibc) pandas - - 1.24s 303M
3.12 slim (glibc) all - - - -
3.12 slim (glibc) performance - - 0.98s 352M
3.13 alpine (musl) pandas - - 0.90s 161.2M
3.13 alpine (musl) pandas - - 1.15s 332.8M
3.13 alpine (musl) all - - - -
3.13 alpine (musl) performance - - - -
3.13 slim (glibc) pandas - - 0.93s 153M
3.13 slim (glibc) pandas - - 1.17s 302M
3.13 slim (glibc) all - - - -
3.13 slim (glibc) performance - - 0.94s 351M
3.9 alpine (musl) pandas - - 0.75s 172.4M
3.9 alpine (musl) pandas - - 0.92s 328.5M
3.9 alpine (musl) all - - - -
3.9 alpine (musl) performance - - - -
3.9 slim (glibc) pandas - - 0.67s 167M
3.9 slim (glibc) pandas - - 0.86s 306M
3.9 slim (glibc) all - - - -
3.9 slim (glibc) performance - - 0.69s 326M

pandas 3.0 patterns: use .loc for in-place modification; string columns have str dtype, not object.

import pandas as pd
import numpy as np

# Create DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78],
    'dept': ['eng', 'eng', 'mkt']
})

# Correct modification in pandas 3.0 (CoW)
df.loc[df['score'] > 80, 'grade'] = 'pass'

# Or use assign() for derived columns (returns new DataFrame)
df = df.assign(grade=lambda x: np.where(x['score'] > 80, 'pass', 'fail'))

# Check dtypes: string columns are now 'str', not 'object'
print(df.dtypes)
# name     str
# score    int64
# dept     str
# grade    str