Firecrawl Python SDK
The Firecrawl Python SDK is a client for the Firecrawl API that lets developers scrape, crawl, and interact with web pages efficiently. It is designed for AI agents and data-extraction tasks, supporting multiple output formats and JavaScript-rendered (dynamic) content. The current version is 4.22.1; the library is under active development with frequent minor and patch releases.
Warnings
- breaking The library's package name changed from `firecrawl-py` to `firecrawl` with version 3.0.0. Projects using the older `firecrawl-py` will need to update their `pip install` command and import paths.
- breaking Version 3.0.0 introduced a complete rewrite of the SDK to align with the new Firecrawl API structure. This may lead to changes in method signatures, available parameters, and the structure of API responses, even for seemingly similar operations like `scrape` or `crawl`.
- gotcha A valid Firecrawl API key is mandatory for all API calls. Requests made without a properly configured API key will result in authentication errors or failed operations.
- gotcha Firecrawl API usage consumes credits. Billing is unified on the API side: scrapes, crawls, and page interactions all deduct from the same pool of account credits.
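The package rename noted above usually means swapping the distribution name in your environment; a minimal sketch (assuming no other pinned dependencies rely on the old name):

```shell
# Before 3.0.0 the SDK was distributed as firecrawl-py
pip uninstall firecrawl-py

# From 3.0.0 onward, install under the new name
pip install firecrawl
```

Remember to update `firecrawl-py` to `firecrawl` in any requirements.txt or pyproject.toml as well, and review your imports against the 3.0.0 rewrite notice.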
Install
- pip install firecrawl
Imports
- FirecrawlApp
from firecrawl import FirecrawlApp
Quickstart
import os
from firecrawl import FirecrawlApp

# Ensure your Firecrawl API key is set as an environment variable
api_key = os.environ.get('FIRECRAWL_API_KEY', '')
if not api_key:
    print("Error: FIRECRAWL_API_KEY environment variable not set.")
    print("Please set FIRECRAWL_API_KEY to your Firecrawl API key.")
    exit(1)

try:
    app = FirecrawlApp(api_key=api_key)

    # Scrape a URL
    url_to_scrape = "https://www.firecrawl.ai/"
    print(f"Scraping: {url_to_scrape}")
    scraped_data = app.scrape(url_to_scrape)
    print("Scrape successful!")

    # Access the content from the 'data' field, which is a list of results
    if isinstance(scraped_data, dict) and scraped_data.get('data'):
        first_item = scraped_data['data'][0]
        print(f"Title: {first_item.get('metadata', {}).get('title', 'N/A')}")
        print(f"Text content length: {len(first_item.get('content', ''))}")
    else:
        print("No data found in scrape result.")

    # Example: crawl a website (optional; slower and consumes more credits)
    # print(f"\nCrawling: {url_to_scrape}, up to 1 page")
    # crawled_data = app.crawl(url_to_scrape, params={'limit': 1})
    # if crawled_data and 'data' in crawled_data:
    #     print(f"Crawled {len(crawled_data['data'])} pages.")
except Exception as e:
    print(f"An error occurred: {e}")
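The defensive checks in the quickstart can be factored into a small helper that is easy to test offline. The response shape assumed here (a dict with a 'data' list whose items carry 'metadata' and 'content' keys) is taken from the quickstart example, not verified against the current API, so adjust the keys if your SDK version returns a different structure:

```python
def first_page_summary(scraped):
    """Return (title, content_length) for the first result in a scrape
    response, assuming the dict-with-'data'-list shape used in the
    quickstart; return None when no usable data is present."""
    if not (isinstance(scraped, dict) and scraped.get("data")):
        return None
    item = scraped["data"][0]
    title = item.get("metadata", {}).get("title", "N/A")
    return title, len(item.get("content", ""))

# Exercise the helper on a canned response, without hitting the API:
sample = {"data": [{"metadata": {"title": "Firecrawl"}, "content": "hello"}]}
print(first_page_summary(sample))  # → ('Firecrawl', 5)
print(first_page_summary({}))      # → None
```

Keeping response parsing in one function like this means a future change to the API's response schema only needs to be handled in one place.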