{"id":228,"library":"beautifulsoup4","title":"Beautiful Soup 4","description":"Beautiful Soup 4 (often imported as `bs4`) is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree, commonly saving programmers hours or days of work in web scraping and data extraction. The library is actively maintained with an irregular release cadence, focusing on Python 3 development. The `bs4` package on PyPI is a dummy package, and the actual library to install is `beautifulsoup4`.","status":"active","version":"4.14.3","language":"python","source_language":"en","source_url":"https://www.crummy.com/software/BeautifulSoup/","tags":["web scraping","html parsing","xml parsing","data extraction","dom traversal"],"install":[{"cmd":"pip install beautifulsoup4","lang":"bash","label":"Install Beautiful Soup 4"},{"cmd":"pip install beautifulsoup4 lxml html5lib","lang":"bash","label":"Install Beautiful Soup 4 with recommended parsers"}],"dependencies":[{"reason":"Required for CSS selectors.","package":"soupsieve","optional":false},{"reason":"Highly recommended for faster parsing and robust handling of malformed HTML/XML. Can be used as `BeautifulSoup(markup, 'lxml')`.","package":"lxml","optional":true},{"reason":"Highly recommended for extremely lenient parsing of malformed HTML, similar to how web browsers render it. Can be used as `BeautifulSoup(markup, 'html5lib')`.","package":"html5lib","optional":true},{"reason":"Optional for improved character encoding detection.","package":"chardet","optional":true},{"reason":"Optional, faster alternative to chardet for character encoding detection.","package":"cchardet","optional":true}],"imports":[{"note":"The `BeautifulSoup` package (capital B) is Beautiful Soup 3, which is deprecated. 
Always import from `bs4` for Beautiful Soup 4.","wrong":"from BeautifulSoup import BeautifulSoup","symbol":"BeautifulSoup","correct":"from bs4 import BeautifulSoup"}],"quickstart":{"code":"from bs4 import BeautifulSoup\n\n# Example HTML content\nhtml_doc = \"\"\"\n<html><head><title>The Dormouse's story</title></head>\n<body>\n<p class=\"title\"><b>The Dormouse's story</b></p>\n<p class=\"story\">Once upon a time there were three little sisters; and their names were\n<a href=\"http://example.com/elsie\" class=\"sister\" id=\"link1\">Elsie</a>,\n<a href=\"http://example.com/lacie\" class=\"sister\" id=\"link2\">Lacie</a> and\n<a href=\"http://example.com/tillie\" class=\"sister\" id=\"link3\">Tillie</a>;\nand they lived at the bottom of a well.</p>\n<p class=\"story\">...</p>\n</body></html>\n\"\"\"\n\n# Create a BeautifulSoup object\nsoup = BeautifulSoup(html_doc, 'html.parser')\n\n# Pretty-print the HTML\nprint(\"\\n--- Pretty Printed HTML ---\")\nprint(soup.prettify())\n\n# Accessing tags\nprint(\"\\n--- Page Title ---\")\nprint(soup.title.string)\n\n# Finding all links\nprint(\"\\n--- All Links ---\")\nfor link in soup.find_all('a'):\n    print(link.get('href'))\n\n# Finding an element by ID\nprint(\"\\n--- Link with ID 'link3' ---\")\nlink3 = soup.find(id=\"link3\")\nif link3 is not None:  # find() returns None when nothing matches\n    print(link3.get_text())\n\n# Using CSS selectors (requires soupsieve, which is a dependency)\nprint(\"\\n--- Paragraphs with class 'story' ---\")\nfor p_tag in soup.select('p.story'):\n    print(p_tag.get_text(strip=True))","lang":"python","description":"This quickstart parses an inline HTML document, pretty-prints it, reads the title tag, lists the `href` of every link, finds an element by its ID, and uses a CSS selector to locate paragraphs with a specific class. It needs only `beautifulsoup4`; no network access or HTTP library is required."},"warnings":[{"fix":"Ensure your project uses Python 3.7 or newer. 
If migrating from Beautiful Soup 3, review the porting guide for significant API changes.","message":"Beautiful Soup 4 dropped official support for Python 2 on December 31, 2020. The last version to support Python 2 was 4.9.3; current versions require Python 3.7 or newer. Running current BS4 releases on Python 2, or Python 2 era BS3 code on Python 3, results in `ImportError` or unexpected behavior.","severity":"breaking","affected_versions":"4.10.0+"},{"fix":"Consult the 'Porting code to BS4' section in the official documentation for a comprehensive list of changes. Update import statements and attribute/method calls accordingly.","message":"When migrating from Beautiful Soup 3 to Beautiful Soup 4, many names changed. Method names were converted to PEP 8 style (for example, `findAll` became `find_all`), and a few attributes were renamed to avoid words with special meaning in Python: `Tag.next` became `Tag.next_element`, and `Tag.previous` became `Tag.previous_element`. The primary import also changed from `from BeautifulSoup import BeautifulSoup` to `from bs4 import BeautifulSoup`.","severity":"breaking","affected_versions":"All BS4 versions when migrating from BS3"},{"fix":"Install `lxml` for speed or `html5lib` for browser-like leniency (`pip install lxml html5lib`). Always specify the parser explicitly (e.g., `BeautifulSoup(markup, 'lxml')` or `BeautifulSoup(markup, 'html5lib')`).","message":"Beautiful Soup relies on an underlying HTML/XML parser. If you do not name one, it picks the best parser installed (`lxml`, then `html5lib`, then Python's built-in `html.parser`) and emits a `GuessedAtParserWarning`, so the same code can build different parse trees in different environments. The built-in `html.parser` needs no extra install, but it is slower than `lxml` and less lenient with malformed HTML than `html5lib`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Update your code to use the recommended, non-deprecated alternatives. 
Review the `DeprecationWarning` messages for specific guidance or consult the latest Beautiful Soup documentation.","message":"Starting with Beautiful Soup 4.13.0, many names that were previously only documented as deprecated now issue a `DeprecationWarning` when used. These names, such as the `BeautifulStoneSoup` class and the `parentGenerator` method, are scheduled for removal in a future version (e.g., 4.15.0).","severity":"deprecated","affected_versions":"4.13.0+"},{"fix":"If using Beautiful Soup 4.13.0 or newer, uninstall the `types-beautifulsoup4` package (`pip uninstall types-beautifulsoup4`). The built-in type hints are now sufficient.","message":"For Beautiful Soup versions 4.13.0 and newer, type annotations are included directly within the `beautifulsoup4` package. Keeping the older `types-beautifulsoup4` stub package installed alongside it can lead to conflicts or incorrect type resolution.","severity":"gotcha","affected_versions":"4.13.0+"},{"fix":"Install the `requests` library using pip: `pip install requests`.","message":"The `requests` library is frequently used with Beautiful Soup to fetch web pages, but it is not a dependency of the `beautifulsoup4` package and is not installed alongside it. A script that imports `requests` in an environment where it is missing fails with `ModuleNotFoundError`.","severity":"breaking","affected_versions":"N/A (external dependency issue)"}],"env_vars":null,"last_verified":"2026-05-12T12:04:05.878Z","next_check":"2026-06-27T00:00:00.000Z","problems":[{"fix":"Install the library using pip: `pip install beautifulsoup4` (note the 'beautifulsoup4' package name, not 'bs4'). If already installed, ensure your IDE or script is using the correct Python interpreter where it's installed.","cause":"The beautifulsoup4 library is not installed in the Python environment being used, or there's a mismatch between the installed package name and the import statement.","error":"ModuleNotFoundError: No module named 'bs4'"},{"fix":"Always check that the result of `find()` or `select_one()` is not `None` before accessing its attributes or methods. For example, assign `element = soup.find('div', class_='my-class')`, then guard the access with `if element is not None: print(element.get_text())`.","cause":"`find()` and `select_one()` return `None` when no element matches. Calling a method (like `get_text()`) on that `None` result raises the `AttributeError`.","error":"AttributeError: 'NoneType' object has no attribute 'get_text' (or 'find', 'contents', etc.)"},{"fix":"Ensure you are importing the `BeautifulSoup` class directly: `from bs4 import BeautifulSoup`. Then, create your soup object as `soup = BeautifulSoup(markup, 'html.parser')`.","cause":"This typically happens when you import the `bs4` module itself and then try to call `bs4()` as if it were the `BeautifulSoup` class, instead of importing the `BeautifulSoup` class specifically from `bs4`.","error":"TypeError: 'module' object is not callable"},{"fix":"Use the `.get()` method to safely access attributes, which returns `None` if the attribute does not exist, preventing a `KeyError`. 
For example: `link = tag.get('href')`.","cause":"This error occurs when you try to access an attribute using dictionary-style lookup (`tag['attribute']`) on a tag that does not possess that specific attribute.","error":"KeyError: 'href' (or 'class', etc.)"},{"fix":"Extract the HTML content from the `requests.Response` object using `.text` (for string content) or `.content` (for bytes content) before passing it to BeautifulSoup. For example: `soup = BeautifulSoup(response.text, 'html.parser')`.","cause":"You are passing a `requests.Response` object directly to the `BeautifulSoup` constructor instead of its text or content.","error":"TypeError: Incoming markup is of an invalid type: <Response [200]> (or expected string or buffer)"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":0,"quickstart_tag":"stale","pypi_latest":null,"install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.35,"mem_mb":5.4,"disk_size":"19.1M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.58,"mem_mb":11.4,"disk_size":"32.4M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.26,"mem_mb":5.4,"disk_size":"20M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.45,"mem_mb":11.4,"disk_size":"33M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.85,"mem_mb":5.6,"disk_size":"21.2M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.14,"mem_mb":12,"disk_size":"34.6M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.69,"mem_mb":5.7,"disk_size":"22M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.92,"mem_mb":12,"disk_size":"35M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.54,"mem_mb":5.4,"disk_size":"13.0M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.82,"mem_mb":11.8,"disk_size":"26.5M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.58,"mem_mb":5.4,"disk_size":"13M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.88,"mem_mb":11.8,"disk_size":"27M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.49,"mem_mb":5.8,"disk_size":"12.6M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.74,"mem_mb":11.6,"disk_size":"26.2M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.54,"mem_mb":5.8,"disk_size":"13M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.8,"mem_mb":11.6,"disk_size":"27M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.3,"mem_mb":5.4,"disk_size":"18.6M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.52,"mem_mb":11.4,"disk_size":"31.8M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.25,"mem_mb":5.4,"disk_size":"19M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.47,"mem_mb":11.4,"disk_size":"32M"}]},"quickstart_checks":{"last_tested":"2026-04-23","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]}}