{"library":"maincontentextractor","title":"MainContentExtractor","description":"MainContentExtractor is a Python library that extracts the main content from HTML documents. It aims to address limitations found in other extraction tools, such as the inability to output clean HTML directly. It returns output as HTML, plain text, or Markdown, which makes it useful for LLM-related tasks and for feeding data into frameworks such as LangChain and LlamaIndex. The current version is 0.0.4, and development is relatively active.","language":"python","status":"active","last_verified":"Fri May 15","install":{"commands":["pip install MainContentExtractor"],"cli":null},"imports":["from main_content_extractor import MainContentExtractor"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"from main_content_extractor import MainContentExtractor\n\n# Example HTML content (or fetch from a URL)\nhtml_content = \"\"\"\n<html>\n<head><title>Example Page</title></head>\n<body>\n    <header>Navigation Bar</header>\n    <main>\n        <h1>Important Article Title</h1>\n        <p>This is the main content paragraph.</p>\n        <p>Another paragraph with <a href=\"#\">a link</a> inside.</p>\n    </main>\n    <footer>Footer content</footer>\n</body>\n</html>\n\"\"\"\n\n# Or, fetch from a URL (requires 'requests')\n# import requests\n# url = \"https://www.example.com\"\n# response = requests.get(url)\n# response.encoding = 'utf-8'\n# html_content = response.text\n\n# Extract main content as HTML\nextracted_html = MainContentExtractor.extract(html_content)\nprint(\"--- Extracted HTML ---\")\nprint(extracted_html)\n\n# Extract main content as Markdown\nextracted_markdown = MainContentExtractor.extract(html_content, output_format=\"markdown\")\nprint(\"\\n--- Extracted Markdown ---\")\nprint(extracted_markdown)\n\n# Extract main content as plain text\nextracted_text = MainContentExtractor.extract(html_content, output_format=\"text\")\nprint(\"\\n--- Extracted Text 
---\")\nprint(extracted_text)","lang":"python","description":"This quickstart demonstrates how to extract the main content from an HTML string using MainContentExtractor. It shows output in HTML, Markdown, and plain text formats. If fetching HTML from a URL, ensure `requests` is installed (`pip install requests`).","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":{"tag":null,"tag_description":null,"last_tested":"2026-05-15","installed_version":"0.0.4","pypi_latest":"0.0.4","is_stale":false,"summary":{"python_range":"3.9–3.13","success_rate":100,"avg_install_s":4.8,"avg_import_s":3.04,"wheel_type":"wheel"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"80.9M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":5.2,"import_time_s":null,"mem_mb":null,"disk_size":"82M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":3.6,"mem_mb":24,"disk_size":"84.6M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":4.8,"import_time_s":3.49,"mem_mb":24,"disk_size":"86M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine 
(musl)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.84,"mem_mb":24,"disk_size":"76.2M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":4.1,"import_time_s":3.06,"mem_mb":24,"disk_size":"77M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.48,"mem_mb":24.5,"disk_size":"76.0M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":4.1,"import_time_s":2.76,"mem_mb":24.5,"disk_size":"77M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"80.4M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"MainContentExtractor","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":5.9,"import_time_s":null,"mem_mb":null,"disk_size":"81M"}]}}