{"id":9931,"library":"mechanicalsoup","title":"MechanicalSoup","description":"MechanicalSoup is a Python library for automating interaction with websites. It builds on top of `requests` and `BeautifulSoup4` to provide a stateful browser experience, making it easy to navigate, fill forms, and submit data without a full-fledged browser. The current version is 1.4.0, and it maintains a moderate release cadence, typically releasing minor versions every 6-12 months with occasional patch releases.","status":"active","version":"1.4.0","language":"en","source_language":"en","source_url":"https://github.com/MechanicalSoup/MechanicalSoup","tags":["web scraping","automation","forms","headless browser","http client"],"install":[{"cmd":"pip install mechanicalsoup","lang":"bash","label":"Standard installation"},{"cmd":"pip install mechanicalsoup[full]","lang":"bash","label":"Installation with optional HTML parsers (lxml, html5lib)"}],"dependencies":[{"reason":"Used for making HTTP requests, minimum version increased in 1.1.0.","package":"requests","optional":false},{"reason":"Used for parsing HTML and navigating the DOM, minimum version increased in 1.1.0.","package":"beautifulsoup4","optional":false},{"reason":"Underlying HTTP client, minimum version specified in 1.4.0 to mitigate security vulnerabilities.","package":"urllib3","optional":false},{"reason":"Provides Mozilla's carefully curated collection of Root Certificates for validating the trustworthiness of SSL certificates, minimum version specified in 1.4.0.","package":"certifi","optional":false},{"reason":"Default (and recommended) HTML parser, part of `mechanicalsoup[full]` install.","package":"lxml","optional":true},{"reason":"Alternative HTML parser, part of `mechanicalsoup[full]` install.","package":"html5lib","optional":true}],"imports":[{"symbol":"StatefulBrowser","correct":"from mechanicalsoup import StatefulBrowser"},{"symbol":"Browser","correct":"from mechanicalsoup import Browser"},{"note":"Form is directly 
exposed by the top-level package in recent versions.","wrong":"from mechanicalsoup.form import Form","symbol":"Form","correct":"from mechanicalsoup import Form"}],"quickstart":{"code":"import mechanicalsoup\nimport os\n\n# Create a headless browser instance\nbrowser = mechanicalsoup.StatefulBrowser()\n\n# In a real scenario, you would open a target URL:\n# browser.open(\"http://example.com/login\")\n\n# For demonstration, load a page from a string so no network access is needed.\nhtml_content = '''\n<html><body>\n  <form action=\"/login\" method=\"post\">\n    <input type=\"text\" name=\"username\" value=\"\">\n    <input type=\"password\" name=\"password\" value=\"\">\n    <input type=\"submit\" value=\"Login\">\n  </form>\n</body></html>\n'''\nbrowser.open_fake_page(html_content, url=\"http://example.com/login\")\n\n# Select the form (by CSS selector, optionally with an index via nr=)\nbrowser.select_form('form[action=\"/login\"]')\n\n# Fill in the form fields\nbrowser[\"username\"] = os.environ.get('TEST_USERNAME', 'testuser')\nbrowser[\"password\"] = os.environ.get('TEST_PASSWORD', 'testpass')\n\n# Submit the form\n# In a real scenario, this would send the request to the action URL\n# response = browser.submit_selected()\n\n# StatefulBrowser supports item assignment only, so read the values back\n# from the underlying BeautifulSoup tag (browser.form.form).\nusername_input = browser.form.form.find(\"input\", {\"name\": \"username\"})\npassword_input = browser.form.form.find(\"input\", {\"name\": \"password\"})\n\nprint(f\"Form selected: {browser.form}\")\nprint(f\"Username field value: {username_input['value']}\")\nprint(f\"Password field value: {password_input['value']}\")\n# print(f\"Response URL after submission: {browser.url}\")\n# print(f\"Response content: {browser.page.text}\")\n","lang":"python","description":"This quickstart demonstrates how to initialize a `StatefulBrowser`, load a page from an HTML string with `open_fake_page` (or open a URL), select a form, fill its fields, and prepare to submit it. 
For actual interaction with a website, replace the `open_fake_page` call with `browser.open(\"http://your.site/login\")` and uncomment the submission and response handling lines."},"warnings":[{"fix":"Instead of `browser['upload_field'] = '/path/to/file'`, use `browser['upload_field'] = open('/path/to/file', 'rb')`.","message":"As of v1.3.0, uploading a file through a form requires passing an explicitly opened file object (e.g., `open('/path/to/file', 'rb')`) instead of the file path as a string. This change was implemented to prevent malicious web servers from reading arbitrary local files.","severity":"breaking","affected_versions":">=1.3.0"},{"fix":"Upgrade your Python interpreter to version 3.9 or newer. For older Python versions, use MechanicalSoup < 1.4.0.","message":"MechanicalSoup v1.4.0 dropped support for Python 3.6, 3.7, and 3.8. An earlier release (v1.1.0) dropped Python 2.7 and 3.5. Ensure your environment uses Python 3.9 or higher.","severity":"breaking","affected_versions":">=1.4.0"},{"fix":"Prefer using the properties directly: `browser.page`, `browser.form`, `browser.url`.","message":"Since v1.0.0, `StatefulBrowser` exposes properties (`.page`, `.form`, `.url`) for the current page, form, and URL. The older method calls (`.get_current_page()`, `.get_current_form()`, `.get_url()`) are still present but are considered deprecated and may be removed in future versions.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"Initialize your browser with `browser = mechanicalsoup.StatefulBrowser(raise_on_404=True)` so that `LinkNotFoundError` is raised on 404 responses.","message":"The `StatefulBrowser` and `Browser` constructors accept a `raise_on_404` argument; setting it to `True` is recommended. 
It defaults to `False` for backward compatibility, so HTTP 404 responses do not raise an exception and can lead to silent failures.","severity":"gotcha","affected_versions":">=0.8.0"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Pass an explicitly opened file object: `browser[\"upload_field\"] = open(\"/path/to/file.txt\", \"rb\")`. Close the file handle after submission, for example by opening it in a `with` block.","cause":"Attempting to upload a file in MechanicalSoup 1.3.0+ by passing a string path directly to a form field, instead of an opened file object.","error":"FileNotFoundError: [Errno 2] No such file or directory: '/path/to/file'"},{"fix":"Use the direct properties introduced in 1.0.0: `browser.page`, `browser.form`, `browser.url`.","cause":"Calling a deprecated accessor method (`get_current_page`, `get_current_form`, or `get_url`) that is unavailable in the installed version. These methods were superseded by properties in 1.0.0 and are generally still present as of 1.4.0, but they may be removed in a future release.","error":"AttributeError: 'StatefulBrowser' object has no attribute 'get_current_page'"},{"fix":"Verify your link selector is correct and matches an existing link. 
If the error follows a 404 response, check that the URL is valid, or catch `LinkNotFoundError` when `raise_on_404=True`.","cause":"The specified link selector did not match any link on the current page, or a 404 Not Found error occurred and `raise_on_404` is enabled for the browser.","error":"mechanicalsoup.LinkNotFoundError: No link found with selector 'a.broken-link'"},{"fix":"Ensure `browser.select_form()` is called with a valid selector (e.g., `browser.select_form('form[action=\"/login\"]')`) before trying to manipulate form fields or submit.","cause":"Attempting to interact with form fields (e.g., `browser['field_name'] = 'value'`) or submit a form without first successfully selecting one using `browser.select_form()`.","error":"AttributeError: No form has been selected yet on this page."}]}