pandas
The standard Python DataFrame library for data analysis. Current version is 3.0.1 (Feb 2026). pandas 3.0 is a major release with two ecosystem-wide breaking changes: Copy-on-Write (CoW) is now the only mode, and string columns now default to str dtype instead of object. Requires Python >=3.11.
Warnings
- breaking Copy-on-Write (CoW) is now the only mode in pandas 3.0. Chained assignment such as df['col'][mask] = value never modifies the original DataFrame. No exception is raised; pandas emits a ChainedAssignmentError warning where it can detect the pattern, which is easy to overlook. This is the most common hard-to-spot bug when upgrading.
- breaking String columns now default to str dtype instead of numpy object dtype. Code that checks dtype == object or dtype == 'O' to detect string columns now evaluates to False for those columns, so it silently skips them in pandas 3.0.
- breaking Support for Python 3.10 and below has been dropped. pandas 3.0 requires Python >=3.11.
- breaking Datetime default resolution changed from nanoseconds to microseconds (or input resolution). pd.Timestamp arithmetic and comparisons with nanosecond precision may produce different results.
- breaking The observed parameter of DataFrame.groupby() now defaults to True when grouping by Categorical columns. Previously the default (observed=False) included unobserved categories in the result, so aggregations that relied on those empty groups now silently return fewer rows.
- deprecated The mode.copy_on_write option is deprecated: setting it has no effect in pandas 3.0, and it will be removed in 4.0.
- gotcha pyarrow is not required but strongly recommended for pandas 3.0. Without pyarrow, the new str dtype falls back to numpy object-backed storage, losing most performance benefits.
- gotcha Many third-party libraries (scikit-learn, seaborn, statsmodels, SHAP) had pandas 3.0 compatibility issues at release. Check library versions when upgrading.
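Three of the warnings above can be sidestepped with patterns that behave the same before and after the upgrade. The following is a minimal sketch, not an official migration recipe: the column names and data are made up for illustration, and the as_unit call assumes pandas 2.0 or newer.

```python
import pandas as pd

# --- Copy-on-Write: assign in a single .loc step ---
df = pd.DataFrame({"score": [85, 92, 78], "grade": ["tbd", "tbd", "tbd"]})
# Anti-pattern, shown as a comment only: two indexing steps, so the write
# targets a temporary object and df stays unchanged under CoW.
# df["grade"][df["score"] > 80] = "pass"
df.loc[df["score"] > 80, "grade"] = "pass"
print(df["grade"].tolist())  # ['pass', 'pass', 'tbd']

# --- groupby on a Categorical: state `observed` explicitly ---
sales = pd.DataFrame({
    "region": pd.Categorical(["north", "south"],
                             categories=["north", "south", "west"]),
    "amount": [10, 20],
})
observed_only = sales.groupby("region", observed=True)["amount"].sum()
all_categories = sales.groupby("region", observed=False)["amount"].sum()
print(observed_only.index.tolist())   # ['north', 'south']
print(all_categories.index.tolist())  # ['north', 'south', 'west']

# --- Datetime resolution: pin the unit where nanoseconds are required ---
stamps = pd.to_datetime(["2024-01-01 12:00:00"]).as_unit("ns")
print(stamps.dtype)  # datetime64[ns]
```

Passing observed explicitly and pinning the datetime unit make the intent visible in the code, so the result does not depend on which pandas default happens to be in effect.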
Install
- pip install pandas
- pip install pandas[performance]
- pip install pandas pyarrow
- pip install pandas[all]
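After installing, it can help to confirm which pandas version is active and whether the optional pyarrow package is present. The snippet below is a small sketch using only the standard library and pandas itself; the variable names are illustrative.

```python
import importlib.util
import sys

import pandas as pd

# Major/minor of the installed pandas, e.g. (3, 0).
pandas_version = tuple(int(part) for part in pd.__version__.split(".")[:2])

# pyarrow is optional; find_spec returns None when it is not installed.
has_pyarrow = importlib.util.find_spec("pyarrow") is not None

print("python :", sys.version_info[:2])
print("pandas :", pandas_version)
print("pyarrow:", "installed" if has_pyarrow else "missing")
```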
Imports
- pandas
import pandas as pd
df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})
# Modify using loc in one step
df.loc[df['a'] > 1, 'b'] = 'z'
- string dtype
ser = pd.Series(['a', 'b'])
ser.dtype  # str in pandas 3.0 (object in pandas 2.x)
# Check for string dtype in a 3.0-compatible way:
if pd.api.types.is_string_dtype(ser):
    ...
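The same check extends to picking out every string column of a DataFrame without comparing against object. This is a minimal sketch with made-up column names; it assumes the string columns contain only strings, with no missing values or mixed types.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

df = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "score": [85, 92],
    "dept": ["eng", "mkt"],
})

# Fragile: matches object columns only, so it misses the new str dtype.
# string_cols = [c for c in df.columns if df[c].dtype == object]

# Version-tolerant: ask pandas whether each column holds strings.
string_cols = [col for col in df.columns if is_string_dtype(df[col])]
print(string_cols)  # ['name', 'dept']
```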
Quickstart
import pandas as pd
import numpy as np
# Create DataFrame
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'score': [85, 92, 78],
'dept': ['eng', 'eng', 'mkt']
})
# Correct modification in pandas 3.0 (CoW)
df.loc[df['score'] > 80, 'grade'] = 'pass'
# Or use assign() for derived columns (returns new DataFrame)
df = df.assign(grade=lambda x: np.where(x['score'] > 80, 'pass', 'fail'))
# Check dtypes — strings are now 'str', not 'object'
print(df.dtypes)
# name str
# score int64
# dept str