{"id":7706,"library":"scrapfly-sdk","title":"Scrapfly Python SDK","description":"The Scrapfly Python SDK (current version 0.10.0) provides a robust interface to the Scrapfly API for web scraping, screenshot capture, AI-powered data extraction, and website crawling. It helps developers bypass anti-bot measures, manage proxies, render JavaScript, and integrates seamlessly with frameworks like Scrapy, LlamaIndex, and LangChain. The library maintains an active development and release cadence.","status":"active","version":"0.10.0","language":"en","source_language":"en","source_url":"https://github.com/scrapfly/python-sdk","tags":["scraping","web scraping","data extraction","api client","headless browser","proxy rotation","anti-bot bypass"],"install":[{"cmd":"pip install scrapfly-sdk","lang":"bash","label":"Basic Installation"},{"cmd":"pip install \"scrapfly-sdk[all]\"","lang":"bash","label":"Full Installation with all extras"}],"dependencies":[{"reason":"Optional: For performance improvements (compression) via `[speedups]` extra.","package":"brotli","optional":true},{"reason":"Optional: For performance improvements (serialization) via `[speedups]` extra.","package":"msgpack","optional":true},{"reason":"Optional: Required for built-in HTML parsing via `ScrapeApiResponse.selector` property, installed with `[parser]` or `[all]` extra.","package":"parsel","optional":true},{"reason":"Optional: For Scrapy integration via `[scrapy]` extra. 
Includes `parsel`.","package":"scrapy","optional":true},{"reason":"Optional: Required for the built-in webhook server via `[webhook-server]` extra.","package":"flask","optional":true},{"reason":"Optional: For concurrency features via the `[concurrency]` extra; `asyncio` itself is part of the Python standard library.","package":"asyncio","optional":true}],"imports":[{"symbol":"ScrapflyClient","correct":"from scrapfly import ScrapflyClient"},{"symbol":"ScrapeConfig","correct":"from scrapfly import ScrapeConfig"}],"quickstart":{"code":"import asyncio\nimport os\n\nfrom scrapfly import ScrapflyClient, ScrapeConfig\n\nSCRAPFLY_API_KEY = os.environ.get('SCRAPFLY_API_KEY', 'YOUR_SCRAPFLY_API_KEY')\n\nasync def main():\n    client = ScrapflyClient(key=SCRAPFLY_API_KEY)\n    try:\n        # async_scrape() is the awaitable counterpart of the synchronous scrape()\n        result = await client.async_scrape(ScrapeConfig(url='https://web-scraping.dev/product/1', render_js=True, country='us'))\n        print(f\"Status: {result.status_code}\")\n        print(f\"Content length: {len(result.content)} characters\")\n        # If 'parsel' or 'scrapy' is installed, you can use .selector\n        # print(f\"Product Title: {result.selector.css('h3::text').get()}\")\n    except Exception as e:\n        print(f\"An error occurred: {e}\")\n    finally:\n        # close() is a regular method, not a coroutine, so it is not awaited\n        client.close()\n\nif __name__ == '__main__':\n    asyncio.run(main())\n","lang":"python","description":"This quickstart initializes the Scrapfly client and performs a basic scrape request against a test page using `async_scrape`, the awaitable counterpart of the synchronous `scrape` method. It enables JavaScript rendering and specifies a proxy country. Replace 'YOUR_SCRAPFLY_API_KEY' with your actual key or set the SCRAPFLY_API_KEY environment variable.
For HTML parsing with `.selector`, ensure `parsel` or `scrapy` is installed as an extra dependency."},"warnings":[{"fix":"Install the `parser` or `scrapy` extra: `pip install \"scrapfly-sdk[parser]\"`","message":"Accessing the `ScrapeApiResponse.selector` property for built-in HTML parsing requires installing either `parsel` or `scrapy` as an optional dependency (e.g., `pip install \"scrapfly-sdk[parser]\"`). Without these, attempting to use `.selector` will result in an `AttributeError`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Use `os.environ.get('SCRAPFLY_API_KEY', 'default_or_error_key')` to load your API key, and set the environment variable `SCRAPFLY_API_KEY`.","message":"Hardcoding your Scrapfly API key directly in your code is insecure and inflexible. It's best practice to retrieve it from environment variables or a secure configuration system.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Implement robust `try-except` blocks for `ScrapflyError` and its subclasses. Refer to the Scrapfly API troubleshooting guide for specific error codes and their meanings.","message":"Scrapfly API errors (e.g., HTTP 400, 401, 429, 5xx) are encapsulated by `scrapfly.errors.ScrapflyError` subclasses. Incorrect handling or misinterpretation of these can lead to brittle scrapers. Consult the official Scrapfly error documentation for detailed explanations and suggested remedies.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Before upgrading, check the `scrapfly-sdk` GitHub repository's release notes or the official Scrapfly documentation's changelog for any API parameter renames or deprecations.","message":"While not explicitly documented as a breaking change for the Python SDK `0.10.0` specifically, other Scrapfly SDKs (e.g., TypeScript SDK v0.6.9) have undergone parameter renames (e.g., `ephemeral_template` to `extraction_ephemeral_template` in the Extraction API). 
Always review the official changelog or release notes for API parameter changes when upgrading, especially across minor or major versions.","severity":"breaking","affected_versions":"Future minor/major versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install the `parser` extra: `pip install \"scrapfly-sdk[parser]\"`. If you are using Scrapy, run `pip install \"scrapfly-sdk[scrapy]\"` instead.","cause":"You are trying to use the `.selector` property on a `ScrapeApiResponse` object for HTML parsing, but the necessary optional dependencies (`parsel` or `scrapy`) have not been installed.","error":"AttributeError: 'ScrapeApiResponse' object has no attribute 'selector'"},{"fix":"Verify your API key in your Scrapfly dashboard (https://scrapfly.io/dashboard) and ensure it is passed correctly during client initialization, ideally from an environment variable.","cause":"The Scrapfly API key provided to `ScrapflyClient` is either missing, incorrect, expired, or has insufficient permissions.","error":"scrapfly.errors.ScrapflyError: Invalid API key (HTTP 401 Unauthorized)"},{"fix":"Implement exponential backoff and retry logic in your scraping code.
Review your Scrapfly plan limits or consider upgrading your plan if sustained higher rates are needed.","cause":"Your Scrapfly account has exceeded its allocated request rate limit or concurrent request limit for the given time period.","error":"scrapfly.errors.ScrapflyError: Too Many Requests (HTTP 429)"},{"fix":"In your `ScrapeConfig`, set `render_js=True` and `asp=True` (Anti-Scraping Protection), and consider specifying `proxy_pool='public_residential_pool'` and `country='us'` (or the relevant target country).","cause":"The target website's anti-bot protection mechanisms successfully identified and blocked the scraping request, or the page content requires JavaScript rendering.","error":"Scraped content is empty, incomplete, or shows an anti-bot page (no SDK error, but unexpected content)."}]}