pytrends
Pytrends is an unofficial Python library that provides a pseudo-API for Google Trends, automating the download of search interest data. The current version, 4.9.2, was released in April 2023. Because the library is unofficial, it has no fixed release cadence; releases appear as needed to adapt to changes in Google's backend, and those backend changes can break the library without warning.
Warnings
- breaking Pytrends is an unofficial API and is prone to frequent breaking changes when Google updates its backend or internal API. This can lead to unexpected errors (e.g., 404, 500) or changes in expected behavior without prior notice.
- gotcha HTTP 429 (Too Many Requests) errors are common because of Google's rate limiting. They can occur even at seemingly low request volumes, especially from shared IP addresses (e.g., school networks).
- gotcha Google Trends data is relative: each request is scaled independently to a 0-100 range. Values for keywords pulled in two separate requests therefore cannot be compared directly. Comparisons are only valid between keywords requested within the *same* `build_payload` call.
- gotcha Incorrect timezone handling is a common pitfall. The `tz` parameter in `TrendReq` is a timezone offset in minutes, but its sign is inverted relative to the usual UTC offset: zones west of UTC take positive values (e.g., 360 for US CST at UTC-6, not -360). The constructor default is `tz=360`, so pass `tz=0` explicitly if you want UTC.
- gotcha HTTP 400 (Bad Request) errors can occur if the structure of the request payload is incorrect, unrelated to rate limits. This might involve invalid keyword combinations, unsupported timeframes, or malformed parameters.
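One way to cope with the HTTP 429 errors described above is to retry with exponential backoff and jitter. The following is a minimal sketch, not part of pytrends: `call_with_backoff` and all of its parameters are hypothetical names, and the exception types to retry on are left to the caller. `TrendReq` also accepts `retries` and `backoff_factor` arguments, which may be enough on their own.

```python
import random
import time


def call_with_backoff(fetch, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry with exponential backoff plus jitter.

    fetch:        zero-argument callable, e.g. a bound pytrends method
    retry_on:     tuple of exception types that should trigger a retry
    max_attempts: total number of calls before the last error is re-raised
    base_delay:   delay in seconds before the first retry; doubles each time
    sleep:        injectable sleep function (handy for testing)
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            # Random jitter avoids retrying in lockstep with other clients.
            sleep(delay + random.uniform(0, delay / 2))
```

With pytrends, a call such as `call_with_backoff(pytrends.interest_over_time, retry_on=(ResponseError,))` should work, assuming `ResponseError` is importable from `pytrends.exceptions` in your installed version. Rate limits on shared IP addresses may need a much larger `base_delay` than the one shown here.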
Install
pip install pytrends
Imports
- TrendReq
from pytrends.request import TrendReq
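Because `TrendReq` takes `tz` in minutes with positive values west of UTC, it can help to make the sign flip explicit before constructing the object. This is a small sketch under that assumption; `pytrends_tz` is a hypothetical helper name, not a pytrends function.

```python
def pytrends_tz(utc_offset_hours: float) -> int:
    """Convert a conventional UTC offset in hours to the `tz` value for TrendReq.

    Google Trends expects minutes with the sign inverted, so UTC-6 (US CST)
    becomes 360 and UTC+5:30 becomes -330.
    """
    return int(round(-utc_offset_hours * 60))


print(pytrends_tz(-6))   # 360
print(pytrends_tz(5.5))  # -330
```

The result can then be passed straight to the constructor, for example `TrendReq(hl='en-US', tz=pytrends_tz(-6))`. The helper takes a fixed offset and does not account for daylight saving time.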
Quickstart
from pytrends.request import TrendReq
import pandas as pd
# Initialize pytrends object
# hl: host language, tz: timezone offset in minutes (e.g., 360 for US CST, NOT -360)
pytrends = TrendReq(hl='en-US', tz=360)
# Define keyword list
kw_list = ["Python programming", "Data Science"]
# Build payload for the request
# timeframe: e.g., 'today 5-y' for last 5 years
pytrends.build_payload(kw_list, cat=0, timeframe='today 5-y', geo='US', gprop='')
# Get Interest Over Time data
interest_over_time_df = pytrends.interest_over_time()
# Print the first few rows of the DataFrame
print("Interest Over Time (first 5 rows):")
print(interest_over_time_df.head())
# Get Interest by Region data; with geo='US', resolution='REGION' requests state-level rows
interest_by_region_df = pytrends.interest_by_region(resolution='REGION', inc_low_vol=True, inc_geo_code=True)
print("\nInterest By Region (first 5 rows):")
print(interest_by_region_df.head())
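The quickstart compares its two keywords inside a single payload, which is the only directly comparable case. When keywords have to be fetched in separate requests, a common workaround (not a pytrends feature) is to include one shared "anchor" keyword in every request and rescale each batch so that the anchor lines up. The sketch below is an approximation under that assumption: `rescale_to_anchor` is a hypothetical helper, it works on plain lists rather than DataFrames, and matching on the anchor's mean is one choice among several.

```python
def rescale_to_anchor(reference, batch, anchor):
    """Rescale `batch` so its anchor series matches the anchor in `reference`.

    reference, batch: dicts mapping keyword -> list of interest values,
                      both containing the `anchor` keyword over the same dates
    Returns a new dict with every series in `batch` multiplied by one factor.
    Rescaled values are no longer bounded by 100.
    """
    ref_mean = sum(reference[anchor]) / len(reference[anchor])
    batch_mean = sum(batch[anchor]) / len(batch[anchor])
    if batch_mean == 0:
        raise ValueError("anchor keyword has no signal in this batch")
    factor = ref_mean / batch_mean
    return {kw: [v * factor for v in values] for kw, values in batch.items()}


# Toy numbers, not real Trends data: "python" is the shared anchor.
first = {"python": [50, 100], "java": [25, 50]}
second = {"python": [25, 50], "rust": [50, 100]}
print(rescale_to_anchor(first, second, "python"))
# {'python': [50.0, 100.0], 'rust': [100.0, 200.0]}
```

With real data, the lists could come from `interest_over_time()` columns, for example via `df[kw].tolist()`. An anchor with very low interest makes the factor unstable, so a keyword of similar magnitude to the others is the safer choice.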