CrawlerDetect
0.3.2 verified Mon Apr 27 auth: no python
CrawlerDetect is a Python library that identifies bots, crawlers, and spiders by analyzing user agent strings. It is a port of the PHP Crawler-Detect library. The current version is 0.3.2; releases are infrequent.
pip install crawlerdetect
Common errors
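Both errors listed here usually trace back to which interpreter is running and which version of the package it can see. The following is a minimal diagnostic sketch using only the standard library; the helper name `diagnose` is hypothetical and not part of crawlerdetect, and it assumes the installed distribution is named `crawlerdetect`, as in the pip command.

```python
import importlib.metadata
import importlib.util
import sys


def diagnose(package: str = "crawlerdetect") -> dict:
    """Report which interpreter is running and whether `package` is visible to it."""
    report = {
        # Path of the Python executable actually running this code.
        "interpreter": sys.executable,
        # True if the import system can locate the top-level module.
        "importable": importlib.util.find_spec(package) is not None,
        "version": None,
    }
    try:
        # Reads the installed distribution's metadata without importing the package.
        report["version"] = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        pass  # No distribution with this name is installed in this environment.
    return report


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

If `importable` is False, the interpreter path shows which environment needs the install; running `<that interpreter> -m pip install crawlerdetect` targets it directly.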
error ModuleNotFoundError: No module named 'crawlerdetect'
cause Library not installed or installed in a different environment.
fix Run `pip install crawlerdetect` in the correct Python environment.
error AttributeError: module 'crawlerdetect' has no attribute 'CrawlerDetect'
cause Using an outdated version or incorrect import statement.
fix Upgrade to the latest version with `pip install --upgrade crawlerdetect`, then use `from crawlerdetect import CrawlerDetect`.
Warnings
gotcha The library uses a static list of crawler signatures. If you need custom detection or real-time updates, you must manage the data file yourself.
fix Fork or extend the library to update the signatures from the PHP source.
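As a lighter alternative to forking, extra signatures can be layered on top of whatever check the library provides. This is a sketch of that wrapper approach, not the library's own extension mechanism: `make_extended_check`, `stub_check`, and the example patterns are all hypothetical names introduced for illustration, and the stub stands in for the real detector so the snippet runs without crawlerdetect installed.

```python
import re
from typing import Callable, Iterable


def make_extended_check(
    base_check: Callable[[str], bool],
    extra_patterns: Iterable[str],
) -> Callable[[str], bool]:
    """Wrap `base_check` so user agents matching any extra pattern also count as crawlers."""
    patterns = list(extra_patterns)
    # Join the extra patterns into one case-insensitive regex (None if there are none).
    extra = (
        re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
        if patterns
        else None
    )

    def check(user_agent: str) -> bool:
        # Custom signatures are tried first, then the wrapped check decides.
        if extra is not None and extra.search(user_agent):
            return True
        return bool(base_check(user_agent))

    return check


# Hypothetical stand-in for the library's detection call.
def stub_check(user_agent: str) -> bool:
    return "Googlebot" in user_agent


check = make_extended_check(stub_check, [r"internal-monitor", r"acme-scraper/\d+"])
print(check("acme-scraper/3 (+https://example.invalid)"))
```

In real use, the detector's check method from the Quickstart would be passed in place of `stub_check`, so the custom patterns live in your own code and survive library upgrades.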
gotcha User agent parsing may be case-sensitive. Ensure you pass the exact user agent string without modification.
fix Pass the raw user agent string as obtained from request headers.
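For a WSGI application, the raw header is available in the request environ under the `HTTP_USER_AGENT` key. The sketch below shows one way to read it without lowercasing, trimming, or otherwise altering it; the helper name `raw_user_agent` is hypothetical, and other frameworks expose the header through their own request objects.

```python
def raw_user_agent(environ: dict) -> str:
    """Return the User-Agent header from a WSGI environ exactly as received."""
    # WSGI exposes request headers as HTTP_* keys; a missing header yields "".
    return environ.get("HTTP_USER_AGENT", "")


# Example environ with only the keys relevant here.
environ = {
    "REQUEST_METHOD": "GET",
    "HTTP_USER_AGENT": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}
ua = raw_user_agent(environ)
print(ua)
```

The returned string can be handed to the detector as is. An empty string for a missing header keeps the call site free of `None` checks.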
deprecated Older versions used a different import pattern (e.g., `from crawlerdetect import CrawlerDetect as cd`). This still works but is deprecated in favor of the direct import.
fix Use `from crawlerdetect import CrawlerDetect`.
Imports
- CrawlerDetect
from crawlerdetect import CrawlerDetect
Quickstart
from crawlerdetect import CrawlerDetect
detector = CrawlerDetect()
user_agent = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
is_crawler = detector.isCrawler(user_agent)
print(f"Is crawler: {is_crawler}") # True