pybaseball
Version 2.2.7, verified Fri May 01, auth: no, python
pybaseball is a Python library for retrieving and analyzing baseball data from sources like Baseball Savant, FanGraphs, and Baseball-Reference. Version 2.2.7 (current) fixes FanGraphs leaderboard URLs and adds new features like PitchingBot/Stuff+ stat enums and strike zone plotting. Release cadence is irregular, with minor patches every few months.
pip install pybaseball
Common errors
error ValueError: URL does not return valid JSON
cause Statcast API endpoint temporarily down or changed.
fix Wait and retry, or check the pybaseball GitHub issue tracker for known API issues.
error AttributeError: module 'pybaseball' has no attribute 'statcast'
cause Importing from a submodule instead of the top-level package, or running a very old version (<2.0.0).
fix Import from the top-level package (from pybaseball import statcast) and upgrade to the latest version.
error HTTPError: 429 Client Error: Too Many Requests
cause Exceeding rate limits on Baseball Savant or FanGraphs.
fix Add delays between requests (for example, time.sleep(1)), or enable caching so repeated queries reuse the data from the first successful fetch.
error KeyError: 'events'
cause Statcast data column missing; schema changed.
fix Check df.columns for available columns; the 'events' column may be named differently or absent for certain date ranges.
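The column check described above can be sketched as a small helper. This is only an illustration: missing_columns is a hypothetical name, not part of pybaseball, and it assumes nothing beyond the object exposing a .columns attribute, as a pandas DataFrame does.

```python
def missing_columns(df, required):
    """Return the names in `required` that are absent from df.columns.

    Hypothetical helper, not a pybaseball function. Works with any object
    that exposes a `.columns` iterable, such as a pandas DataFrame.
    """
    present = set(df.columns)
    return [name for name in required if name not in present]


# Possible usage after a fetch (not executed here):
# df = statcast(start_dt='2024-05-01', end_dt='2024-05-02')
# absent = missing_columns(df, ['events', 'pitch_type'])
# if absent:
#     print('Columns not in this result:', absent)
```

Checking before indexing turns a KeyError deep inside analysis code into an explicit list of what is missing, which is easier to act on when the schema has changed.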
Warnings
breaking Statcast data schema changes frequently: column names, data types, and null handling can change without notice. Always check the actual columns after fetching.
fix Inspect df.columns after fetching and handle missing/renamed columns gracefully.
breaking FanGraphs leaderboard URL changed in v2.2.6; older versions cannot retrieve FanGraphs data.
fix Upgrade to pybaseball>=2.2.6.
deprecated Python 3.6 support was dropped in v2.2.5; Python 3.7 support was dropped in a later release.
fix Use Python 3.8+.
gotcha Caching is disabled by default. Once enabled, cached results are reused until the cache is cleared manually, so data can go stale.
fix Call cache.enable() to turn caching on and cache.disable() to turn it off again; to force fresh data, delete the cache files manually.
gotcha Web scraping can be unreliable: frequent HTTP errors (429, 503) and HTML structure changes may break scraping functions.
fix Wrap calls in retry logic; check GitHub issues for known outages.
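One way to wrap calls in retry logic is a small backoff helper. This is a minimal sketch, not something pybaseball provides: call_with_retry and its parameters are hypothetical names, and the default of retrying on any Exception is an assumption that should be narrowed to the errors actually seen (for example, HTTP errors).

```python
import random
import time


def call_with_retry(fetch, attempts=4, base_delay=1.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call fetch() and retry with exponential backoff when it raises.

    Hypothetical helper, not part of pybaseball. `sleep` is injectable
    so the waiting can be replaced in tests.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait roughly 1s, 2s, 4s, ... plus a little random jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


# Possible usage (not executed here):
# from pybaseball import statcast
# df = call_with_retry(
#     lambda: statcast(start_dt='2024-05-01', end_dt='2024-05-02'))
```

Backing off exponentially rather than sleeping a fixed interval gives a rate-limited or briefly unavailable site progressively more time to recover, while the attempt cap keeps a genuine outage from blocking forever.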
Imports
- statcast
wrong: from pybaseball.statcast import statcast
correct: from pybaseball import statcast
- batting_stats
wrong: from pybaseball.fangraphs import batting_stats
correct: from pybaseball import batting_stats
- playerid_lookup
wrong: from pybaseball.lookup import playerid_lookup
correct: from pybaseball import playerid_lookup
Quickstart
from pybaseball import cache, statcast

# Caching is off by default; disable it explicitly to avoid stale data
cache.disable()

# Fetch pitch-level Statcast data for a two-day window
df = statcast(start_dt='2024-05-01', end_dt='2024-05-02')
print(df.head())