{"id":9070,"library":"layoutparser","title":"LayoutParser","description":"LayoutParser is a unified toolkit for deep-learning-based document image analysis. It provides tools for document layout detection, OCR, and visualization. This entry covers version 0.3.4; the 0.3.x series introduced new models and multi-backend support.","status":"active","version":"0.3.4","language":"en","source_language":"en","source_url":"https://github.com/Layout-Parser/layout-parser","tags":["OCR","document analysis","deep learning","layout parsing","computer vision","AI","PDF processing"],"install":[{"cmd":"pip install layoutparser","lang":"bash","label":"Minimal installation (core library)"},{"cmd":"pip install 'layoutparser[layoutmodels]' 'layoutparser[ocr]'","lang":"bash","label":"Broader installation (deep learning layout model toolkit plus the Tesseract and Google Cloud Vision OCR agents; Detectron2 is not included)"},{"cmd":"pip install layoutparser torchvision && pip install 'git+https://github.com/facebookresearch/detectron2.git' && pip install 'layoutparser[ocr]'","lang":"bash","label":"Example: Install with Detectron2 models and Tesseract OCR (Detectron2 is installed from its repository; see the upstream installation guide for the recommended version)"}],"dependencies":[{"reason":"Deep learning framework required by the layout detection models; not needed for the core library.","package":"torch","optional":true},{"reason":"Computer vision utilities that complement torch; required by the layout detection models.","package":"torchvision","optional":true},{"reason":"Image processing library.","package":"Pillow","optional":false},{"reason":"Numerical computing.","package":"numpy","optional":false},{"reason":"OpenCV bindings for image operations.","package":"opencv-python","optional":false},{"reason":"Required for using Detectron2-based layout models (e.g., PubLayNet).","package":"detectron2","optional":true},{"reason":"Required for using the Tesseract OCR agent.","package":"pytesseract","optional":true},{"reason":"Required for using the Google Cloud Vision OCR agent.","package":"google-cloud-vision","optional":true},{"reason":"Required for using PaddleDetection-based layout 
models.","package":"paddlepaddle","optional":true}],"imports":[{"note":"Standard convention for importing the library.","symbol":"layoutparser","correct":"import layoutparser as lp"},{"note":"LayoutParser does not export an `Image` class. Load images with PIL (or as a NumPy array, e.g., via OpenCV) and pass them directly to `model.detect()`.","wrong":"from layoutparser import Image","symbol":"Image","correct":"from PIL import Image"},{"note":"Introduced in v0.3.0, the recommended way to load pre-trained models from various backends.","symbol":"AutoLayoutModel","correct":"from layoutparser import AutoLayoutModel"},{"note":"Explicit class for Detectron2 models. Still valid, but AutoLayoutModel is often more convenient since v0.3.0.","symbol":"Detectron2LayoutModel","correct":"from layoutparser import Detectron2LayoutModel"},{"note":"Class for integrating Tesseract OCR.","symbol":"TesseractAgent","correct":"from layoutparser import TesseractAgent"},{"note":"Utility function for visualizing layout elements.","symbol":"draw_box","correct":"from layoutparser import draw_box"}],"quickstart":{"code":"import layoutparser as lp\nfrom PIL import Image\nimport io\nimport requests\n\n# Download a sample image\nimage_url = \"https://layout-parser.github.io/assets/images/publaynet.png\"\nresponse = requests.get(image_url, timeout=30)\nresponse.raise_for_status()  # Fail early if the download did not succeed\nimage_bytes = io.BytesIO(response.content)\n\n# Load the image with PIL, then normalize it to RGB for the model\n# (detect() accepts PIL images and NumPy arrays directly)\npil_image = Image.open(image_bytes)\nlp_image = pil_image.convert(\"RGB\")\n\n# Load a pre-trained layout model (AutoLayoutModel is available since v0.3.0)\n# Requires PyTorch and Detectron2 to be installed.\n# The lp:// config path is passed as the first positional argument.\nmodel = lp.AutoLayoutModel(\"lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config\")\n\n# Detect the layout\nlayout = model.detect(lp_image)\n\n# Print detected blocks and their types\nprint(f\"Detected {len(layout)} blocks:\")\nfor block in layout:\n    print(f\"  - Type: {block.type}, Box: {block.coordinates}\")\n\n# (Optional) Visualize the layout\n# You might need matplotlib for this to 
display the image\n# import matplotlib.pyplot as plt\n# fig = lp.draw_box(lp_image, layout, box_width=3)\n# plt.imshow(fig)\n# plt.show()\n","lang":"python","description":"This quickstart downloads an image from a URL, uses `AutoLayoutModel` (recommended for v0.3.0+) to detect the document layout, and prints the detected blocks. For visualization, ensure `matplotlib` is installed and uncomment the relevant lines."},"warnings":[{"fix":"Install the extras you need: `pip install 'layoutparser[layoutmodels]'` for the deep learning layout models and `pip install 'layoutparser[ocr]'` for the OCR agents. Detectron2 is not bundled in any extra and must be installed separately from its repository. Refer to the official installation guide for the full list of extras.","message":"LayoutParser relies on various deep learning backends and OCR engines. The minimal `pip install layoutparser` only installs core dependencies. To use layout detection models or OCR agents, you must also install extras such as `layoutparser[layoutmodels]` or `layoutparser[ocr]`, and install Detectron2 separately if you use Detectron2 models.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Migrate model loading to `lp.AutoLayoutModel(\"lp://...\")`, passing the config path as the first argument. This simplifies model instantiation and allows easier switching between backends.","message":"Starting from v0.3.0, LayoutParser introduced `AutoLayoutModel` for multi-backend support. Direct `Detectron2LayoutModel` usage still works, but `AutoLayoutModel` is the recommended and more flexible way to load models. The explicit backend classes tie your code to a single backend.","severity":"gotcha","affected_versions":">=0.3.0"},{"fix":"Install Tesseract OCR engine globally on your operating system (e.g., `sudo apt-get install tesseract-ocr` on Debian/Ubuntu, or follow instructions for Windows/macOS). 
Ensure its executable is accessible via your system's PATH.","message":"When using `TesseractAgent` for OCR, the Tesseract OCR engine itself must be installed on your system (not just the Python `pytesseract` package) and be available on your system's PATH. This is a common oversight.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install Detectron2 separately, since no LayoutParser extra bundles it: `pip install 'git+https://github.com/facebookresearch/detectron2.git'` (PyTorch must already be installed). See the official installation guide for the recommended Detectron2 version.","cause":"You attempted to use a Detectron2-based model (e.g., `Detectron2LayoutModel` or a model requiring Detectron2 via `AutoLayoutModel`) without installing the `detectron2` dependency.","error":"ModuleNotFoundError: No module named 'detectron2'"},{"fix":"Install the Tesseract OCR engine on your operating system (e.g., `brew install tesseract` on macOS, or see Tesseract's official documentation for other OSes) and ensure it is on your system's PATH environment variable.","cause":"The Python `pytesseract` library is installed, but the Tesseract OCR executable is not found in your system's PATH.","error":"FileNotFoundError: [Errno 2] No such file or directory: 'tesseract'"},{"fix":"Double-check the `lp://` config path string for typos. Ensure the dependencies for that model's backend (e.g., `detectron2` for Detectron2 models) are installed. If you suspect a corrupted download, clear the model cache (downloads go through iopath, which caches under `~/.torch/iopath_cache` by default).","cause":"This usually happens when `AutoLayoutModel` cannot match the path to an installed backend and returns `None` instead of a model. Common reasons include an incorrect config path, missing dependencies for the chosen backend, or a corrupted model cache.","error":"AttributeError: 'NoneType' object has no attribute 'detect'"}]}