{"id":1058,"library":"protego","title":"Protego","description":"Protego is a pure-Python robots.txt parser with support for modern conventions like those defined by Google. As of version 0.6.0, it actively supports Python 3.10 and newer, with regular updates aligning with new Python releases. It is widely used for web scraping and compliance checking.","status":"active","version":"0.6.0","language":"python","source_language":"en","source_url":"https://github.com/scrapy/protego","tags":["robots.txt","web scraping","parser","compliance"],"install":[{"cmd":"pip install protego","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"symbol":"Protego","correct":"from protego import Protego"}],"quickstart":{"code":"from protego import Protego\n\nrobotstxt_content = \"\"\"\nUser-agent: *\nDisallow: /admin/\nAllow: /admin/login\nCrawl-delay: 5\nSitemap: http://example.com/sitemap.xml\n\"\"\"\n\nrp = Protego.parse(robotstxt_content)\n\n# Check if a URL can be fetched by a user agent\ncan_fetch_admin = rp.can_fetch(\"http://example.com/admin/settings\", \"mybot\")\ncan_fetch_login = rp.can_fetch(\"http://example.com/admin/login\", \"mybot\")\n\nprint(f\"Can 'mybot' fetch /admin/settings? {can_fetch_admin}\")\nprint(f\"Can 'mybot' fetch /admin/login? {can_fetch_login}\")\nprint(f\"Crawl delay for 'mybot': {rp.crawl_delay('mybot')} seconds\")\nprint(f\"Sitemaps: {list(rp.sitemaps)}\")","lang":"python","description":"Initialize the parser with robots.txt content and check URL access permissions for a specific user agent, retrieve crawl delay, and sitemaps."},"warnings":[{"fix":"Upgrade to Python 3.10+ or pin Protego to a version <0.6.0 (e.g., `protego<0.6.0`).","message":"Version 0.6.0 dropped official support for Python 3.9 and PyPy 3.10. 
Users on these Python versions should use Protego 0.5.x or upgrade their Python environment.","severity":"breaking","affected_versions":">=0.6.0"},{"fix":"Upgrade to Python 3.9+ or pin Protego to a version <0.4.0.","message":"Version 0.4.0 dropped official support for Python 3.8.","severity":"breaking","affected_versions":">=0.4.0"},{"fix":"Upgrade to Python 3.8+ and remove `six` if it was only a Protego dependency. Pin Protego to a version <0.3.0 for older Python environments.","message":"Version 0.3.0 dropped support for Python 2.7, 3.5, 3.6, and 3.7. The `six` dependency was also removed in this version, making it Python 3 only.","severity":"breaking","affected_versions":">=0.3.0"},{"fix":"Ensure that the `robotstxt_body` passed to `Protego.parse()` is always a string.","message":"In Protego 0.3.0 and later, `Protego.parse()` will raise a `ValueError` if the `robotstxt_body` argument is not a string.","severity":"gotcha","affected_versions":">=0.3.0"},{"fix":"Upgrade to Protego 0.1.16 or newer to ensure correct interpretation of absolute URLs in `robots.txt` directives.","message":"Version 0.1.16 fixed an issue where absolute URLs in `Allow` and `Disallow` directives were incorrectly parsed, ignoring their protocol and netloc. Older versions might misinterpret these directives, leading to incorrect access decisions.","severity":"gotcha","affected_versions":"<0.1.16"},{"fix":"Avoid importing internal modules; rely only on the documented public API (e.g., `from protego import Protego`).","message":"Version 0.5.0 restructured the internal code from a single `protego.py` file into multiple modules. 
While the public API `from protego import Protego` remains stable, direct imports of internal modules (if any were used) would have broken.","severity":"gotcha","affected_versions":">=0.5.0"}],"env_vars":null,"last_verified":"2026-05-12T23:16:52.038Z","next_check":"2026-06-30T00:00:00.000Z","problems":[{"fix":"Install the library using pip: `pip install protego`","cause":"The `protego` library has not been installed in the current Python environment.","error":"ModuleNotFoundError: No module named 'protego'"},{"fix":"Ensure that the `robotstxt_body` passed to `Protego.parse()` is decoded into a string (e.g., from UTF-8) before parsing. Example: `Protego.parse(response.text)` if using `requests`, or `Protego.parse(robotstxt_bytes.decode('utf-8'))`.","cause":"The `Protego.parse()` method expects its input (`robotstxt_body`) to be a string, but it received a non-string type (e.g., bytes). This check applies in Protego 0.3.0 and later.","error":"ValueError: content is not a string"},{"fix":"Import the `Protego` class and call `parse()` on the class: `from protego import Protego` then `robots = Protego.parse(robots_txt_content)`.","cause":"This error occurs when attempting to call `parse()` directly on the `protego` module (e.g., `protego.parse(...)`) instead of on the `Protego` class itself. 
The `parse` method is a static method of the `Protego` class.","error":"AttributeError: module 'protego' has no attribute 'parse'"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":null,"quickstart_tag":null,"pypi_latest":"0.6.0","cli_name":"","cli_version":null,"install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","installed_version":"0.5.0","pypi_latest":"0.6.0","is_stale":true,"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.03,"mem_mb":1.8,"disk_size":"17.8M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.03,"mem_mb":1.8,"disk_size":"17.8M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.4,"import_time_s":0.02,"mem_mb":1.8,"disk_size":"18M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":1.8,"disk_size":"18M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.07,"mem_mb":2.2,"disk_size":"19.7M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.08,"mem_mb":2.2,"disk_size":"19.7M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.6,"import_time_s":0.06,"mem_mb":2.2,"disk_size":"20M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.06,"mem_mb":2.2,"disk_size":"20M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.05,"mem_mb":1.8,"disk_size":"11.6M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.05,"mem_mb":1.8,"disk_size":"11.6M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.4,"import_time_s":0.05,"mem_mb":1.8,"disk_size":"12M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.07,"mem_mb":1.8,"disk_size":"12M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.05,"mem_mb":2.1,"disk_size":"11.3M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.05,"mem_mb":2.1,"disk_size":"11.2M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.4,"import_time_s":0.05,"mem_mb":1.9,"disk_size":"12M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.05,"mem_mb":1.9,"disk_size":"12M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.03,"mem_mb":1.8,"disk_size":"17.3M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.04,"mem_mb":1.8,"disk_size":"17.3M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"protego","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":1.7,"import_time_s":0.03,"mem_mb":1.8,"disk_size":"18M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"protego","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.03,"mem_mb":1.8,"disk_size":"18M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":null,"tag_description":null,"results":[{"runtime":"python:3.10-alpine","exit_code":0},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":0},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":0},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":0},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":0},{"runtime":"python:3.9-slim","exit_code":0}]}}