CrawlerDetect

0.3.2 verified Mon Apr 27 auth: no python

CrawlerDetect is a Python library that identifies bots, crawlers, and spiders by analyzing user agent strings. It is a port of the PHP Crawler-Detect library. The current version is 0.3.2; new releases are infrequent.

pip install crawlerdetect
error ModuleNotFoundError: No module named 'crawlerdetect'
cause Library not installed or installed in a different environment.
fix Run `python -m pip install crawlerdetect` with the same interpreter (or virtual environment) that runs your code.
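To confirm which environment is affected, a small standard-library check can report whether the package is visible to the interpreter currently running. This is only a diagnostic sketch; the helper name `package_status` is made up for illustration and is not part of crawlerdetect.

```python
import importlib.util
import sys


def package_status(name: str = "crawlerdetect") -> str:
    """Report whether `name` is importable by the interpreter running this script."""
    # find_spec returns None when the top-level package cannot be located.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name} is NOT installed for {sys.executable}"
    return f"{name} found at {spec.origin} for {sys.executable}"


print(package_status())
```

If the report names a different interpreter than the one you installed into, install again using that interpreter's own `python -m pip`.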
error AttributeError: module 'crawlerdetect' has no attribute 'CrawlerDetect'
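cause An outdated version, an incorrect import statement, or a local file named crawlerdetect.py that shadows the installed package.
fix Upgrade to the latest version with `pip install --upgrade crawlerdetect`, rename any local crawlerdetect.py, then use `from crawlerdetect import CrawlerDetect`.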
gotcha The library uses a static list of crawler signatures. If you need custom detection or real-time updates, you must manage the data file yourself.
fix Fork or extend the library to update the signatures from the PHP source.
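A lighter alternative to forking, sketched here under assumptions, is to layer locally maintained patterns on top of whatever check the library provides. The names `make_extended_check` and `builtin_check` and the example patterns are hypothetical; `builtin_check` is a stand-in so the sketch runs without the library installed.

```python
import re
from typing import Callable, Iterable


def make_extended_check(
    base_check: Callable[[str], bool],
    extra_patterns: Iterable[str],
) -> Callable[[str], bool]:
    """Combine an existing crawler check with extra, locally maintained regex patterns."""
    # Compiled once up front. IGNORECASE is a choice made for this sketch,
    # not a statement about how the library itself matches.
    extra = [re.compile(p, re.IGNORECASE) for p in extra_patterns]

    def check(user_agent: str) -> bool:
        # Local patterns are tried first, then the underlying detector.
        if any(rx.search(user_agent) for rx in extra):
            return True
        return base_check(user_agent)

    return check


# Stand-in for the library's detection call (assumption, for a runnable demo).
def builtin_check(user_agent: str) -> bool:
    return "Googlebot" in user_agent


is_bot = make_extended_check(builtin_check, [r"my-internal-monitor", r"acme-scraper/\d+"])
print(is_bot("acme-scraper/3 (+https://example.com)"))  # True
print(is_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/126.0"))  # False
```

In real use, the library's own detection method would be passed as `base_check`, so the bundled signatures stay untouched and the extra patterns live in your own code.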
gotcha User agent matching may be case-sensitive, so lowercasing or otherwise normalizing the string before detection may cause missed matches.
fix Pass the raw user agent string exactly as received in the request headers.
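One way to follow this advice is to look up the header by name without touching its value. The helper below is a minimal sketch that assumes headers arrive as a plain mapping; `raw_user_agent` is a hypothetical name, and web frameworks usually offer their own case-insensitive header lookup.

```python
from typing import Mapping, Optional


def raw_user_agent(headers: Mapping[str, str]) -> Optional[str]:
    """Return the User-Agent value unchanged, or None if the header is absent."""
    for name, value in headers.items():
        # Header *names* are case-insensitive, so only the name is normalized.
        if name.lower() == "user-agent":
            return value  # no .lower(), .strip(), or other modification
    return None


headers = {"Host": "example.com", "user-agent": "MyBot/1.0 (+HTTP://Example.com/Bot)"}
print(raw_user_agent(headers))
```

The returned string can then be handed to the detector as is.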
deprecated Older versions used a different import pattern (e.g., `from crawlerdetect import CrawlerDetect as cd`). This still works but is deprecated in favor of the direct import.
fix Use `from crawlerdetect import CrawlerDetect`.

Basic usage: create a CrawlerDetect instance and call isCrawler with a user agent string.

from crawlerdetect import CrawlerDetect

detector = CrawlerDetect()
# Googlebot's user agent, passed unmodified
user_agent = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# The method name is camelCase, carried over from the PHP library
is_crawler = detector.isCrawler(user_agent)
print(f"Is crawler: {is_crawler}")  # True