pykrx: KRX Data Scraper
pykrx is a Python library for scraping data from the Korea Exchange (KRX) and related financial data sources (e.g., Naver Finance). It provides functions to retrieve stock prices, OHLCV data, market capitalization, and ETF and bond information. The library is actively maintained, with frequent updates that adapt to changes in upstream data sources and API policies. The current version is 1.2.7.
Common errors
- AttributeError: module 'pykrx' has no attribute 'stock'
  cause: Attempting to access submodules like `stock` directly from the top-level `pykrx` import, e.g., `import pykrx; pykrx.stock.get_market_ohlcv_by_date(...)`.
  fix: Import the specific submodule directly: `from pykrx import stock`, or import the function: `from pykrx.stock import get_market_ohlcv_by_date`.
- ValueError: invalid literal for int() with base 10: '2023-01-01'
  cause: Incorrect date format provided to a `pykrx` function. Dates must be in 'YYYYMMDD' string format.
  fix: Format your date strings correctly. For example, `start_date = '20230101'` or `datetime_obj.strftime('%Y%m%d')`.
- pykrx.errors.PyKRXError: Fail to read data
  cause: This generic error indicates that `pykrx` failed to retrieve data, often due to an invalid ticker symbol, an invalid or non-trading date, or a temporary issue with the data source.
  fix: Double-check the ticker symbol, ensure the date range includes trading days, and verify internet connectivity. If the issue persists, try updating `pykrx`, as the upstream API might have changed.
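The date-format error above can be avoided by normalizing every date before it reaches `pykrx`. The following is a minimal sketch; `to_krx_date` is a hypothetical helper name, not part of pykrx, and it assumes input is either a `date`/`datetime` object or a year-month-day string with optional `-`, `/`, or `.` separators.

```python
from datetime import date, datetime


def to_krx_date(value):
    """Return value as a 'YYYYMMDD' string (hypothetical helper, not a pykrx API)."""
    # datetime is a subclass of date, so one check covers both types.
    if isinstance(value, date):
        return value.strftime("%Y%m%d")
    # Drop common separators so '2023-01-01' becomes '20230101'.
    digits = str(value).replace("-", "").replace("/", "").replace(".", "")
    # strptime raises ValueError if the result is not a real calendar date.
    return datetime.strptime(digits, "%Y%m%d").strftime("%Y%m%d")


print(to_krx_date("2023-01-01"))
print(to_krx_date(date(2023, 1, 1)))
```

Calling the helper at the boundary, for example `stock.get_market_ohlcv_by_date(to_krx_date(start), to_krx_date(end), "005930")`, keeps the rest of the code free to use whatever date type is convenient.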
Warnings
- breaking: pykrx relies on scraping external websites (KRX, Naver Finance). Changes in their website structure or API policies (e.g., 'New 2026 Login Policy' mentioned in v1.1.1 release) can break existing functionality. Always keep pykrx updated to the latest version to ensure compatibility.
- gotcha: Date parameters (start_date, end_date) for most functions expect a strict 'YYYYMMDD' string format. Providing dates in other formats (e.g., 'YYYY-MM-DD', `datetime` objects) will lead to errors.
- gotcha: Data availability can be inconsistent. Queries for non-trading days (weekends, holidays), delisted tickers, or very old/future dates might return empty dataframes or raise specific `PyKRXError` exceptions. Always handle potential empty results.
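One way to handle empty results and upstream failures in a single place is a small wrapper around each call. This is a sketch under stated assumptions: `fetch_or_none` is a hypothetical name, it relies only on the returned object exposing an `empty` attribute (as pandas DataFrames do), and it catches `Exception` broadly because the exact exception types raised for bad tickers or dates may vary between pykrx versions.

```python
def fetch_or_none(fetch, *args, **kwargs):
    """Call a data-fetching function; return None on failure or empty result.

    Hypothetical helper, not part of pykrx.
    """
    try:
        df = fetch(*args, **kwargs)
    except Exception as exc:  # broad on purpose: upstream failure modes vary
        print(f"fetch failed: {exc!r}")
        return None
    # pandas DataFrames expose `.empty`; treat a missing attribute as non-empty.
    if df is None or getattr(df, "empty", False):
        return None
    return df
```

With pykrx installed, a call would look like `df = fetch_or_none(stock.get_market_ohlcv_by_date, "20230101", "20230131", "005930")`, followed by an `if df is None:` check before any further processing.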
Install
- `pip install pykrx`
Imports
- get_market_ohlcv_by_date
  wrong: `import pykrx` (then calling `pykrx.stock.get_market_ohlcv_by_date(...)`)
  correct: `from pykrx import stock`
Quickstart
from pykrx import stock
from datetime import datetime, timedelta
# Get today's date and 30 days ago
today = datetime.now().strftime('%Y%m%d')
start_date = (datetime.now() - timedelta(days=30)).strftime('%Y%m%d')
# Example: Get OHLCV data for Samsung Electronics (ticker: 005930) for the last 30 days
df = stock.get_market_ohlcv_by_date(start_date, today, "005930")
print(f"OHLCV data for 005930 from {start_date} to {today}:")
print(df.head())
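# Example: Get market cap data for all KOSPI tickers on a specific date.
# get_market_cap_by_date expects (fromdate, todate, ticker), so a single-date,
# whole-market query uses get_market_cap_by_ticker instead.
df_cap = stock.get_market_cap_by_ticker("20231026", market="KOSPI")
print("\nMarket cap data for KOSPI on 20231026:")
print(df_cap.head())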
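Single-date queries like the market cap example return nothing useful when the date falls on a weekend. The sketch below rolls a date back to the most recent weekday before formatting it. `last_weekday` is a hypothetical helper, not a pykrx function, and it does not know about Korean market holidays, so an empty result is still possible.

```python
from datetime import date, timedelta


def last_weekday(day):
    """Return day itself if it is Mon-Fri, else the preceding Friday.

    Hypothetical helper; Korean market holidays are NOT handled.
    """
    # weekday() returns 5 for Saturday and 6 for Sunday.
    while day.weekday() >= 5:
        day -= timedelta(days=1)
    return day


query_date = last_weekday(date.today()).strftime("%Y%m%d")
print(query_date)
```

The resulting `query_date` string can be passed wherever the Quickstart uses a hard-coded date such as "20231026".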