img2table
Version 1.4.2, verified Mon Apr 27, auth: no, language: python
img2table is a table identification and extraction library for PDFs and images, based on OpenCV image processing. Current version: 1.4.2. Supports Python 3.9-3.13. Released on PyPI with moderate cadence.
pip install img2table
Common errors
error ModuleNotFoundError: No module named 'paddleocr'
cause PaddleOCR is an extra dependency, not installed by default with img2table.
fix
pip install paddleocr
error ImportError: cannot import name 'PaddleOCR' from 'img2table.ocr'
cause The class name was typed with the wrong casing (for example PaddleOcr); the correct name is PaddleOCR, with capital O, C, and R.
fix
Use: from img2table.ocr import PaddleOCR
error AttributeError: 'Image' object has no attribute 'extract_tables'
cause Incorrect import; the Image object in use is not img2table's document class, which lives in img2table.document rather than the top-level package.
fix
Use: from img2table.document import Image
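The first error in this list comes from an optional backend that is not installed. As a minimal sketch, a pre-flight check using only the standard library could report what is missing before any OCR code runs. The helper name missing_modules is hypothetical and is not part of img2table.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "paddleocr" is the module named in the ModuleNotFoundError above;
    # for these two packages the pip name matches the module name.
    missing = missing_modules(["img2table", "paddleocr"])
    if missing:
        print("Install first: pip install " + " ".join(missing))
```

Checking with find_spec avoids importing the heavy OCR packages just to learn whether they are present.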
Warnings
breaking In v1.4.0, the PDF backend was migrated from PyMuPDF/fitz to pypdfium2 for license compliance. Existing code expecting fitz will break.
fix No action needed if using Document classes; only affects direct use of PDF library internals.
deprecated The old TesseractOCR class used Tesseract 4.x, and future versions may remove support. Migrating to PaddleOCR or SuryaOCR is recommended.
fix Switch to PaddleOCR or SuryaOCR. Each backend is installed separately, with pip install paddleocr or pip install surya-ocr.
gotcha OCR initialization is heavy, so avoid recreating the OCR instance for every image in a loop. Reuse the same OCR object across documents.
fix Create one OCR object and pass it to multiple extract_tables calls.
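The reuse advice in the last warning can be sketched as follows. The helper extract_from_many and the file names are hypothetical; the document constructor is passed in as an argument so the loop itself does not depend on img2table being installed, and main() assumes the PaddleOCR backend is available.

```python
def extract_from_many(paths, ocr, make_document):
    """Run extract_tables on every path, reusing a single OCR instance.

    `make_document` builds a document object from a path, for example
    a wrapper around img2table.document.Image.
    """
    results = {}
    for path in paths:
        doc = make_document(path)
        results[path] = doc.extract_tables(ocr=ocr)
    return results


def main():
    # Local imports keep the helper above importable without the OCR extras.
    from img2table.document import Image
    from img2table.ocr import PaddleOCR

    ocr = PaddleOCR(lang="en")  # built once, outside the loop
    # Placeholder file names, not real files.
    tables = extract_from_many(["a.png", "b.png"], ocr, lambda p: Image(src=p))
    for path, found in tables.items():
        print(path, len(found))
```

Calling main() requires img2table, paddleocr, and the two image files; the helper alone has no such requirement.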
Imports
- OCR
wrong: from img2table.ocr import PaddleOcr
correct: from img2table.ocr import PaddleOCR, TesseractOCR
- Document
wrong: from img2table import Image
correct: from img2table.document import Image, PDF
Quickstart
from img2table.document import Image
from img2table.ocr import PaddleOCR

# PaddleOCR runs locally; the img2table wrapper takes no api_key argument.
ocr = PaddleOCR(lang='en')
# Load the image, then detect tables and fill their cells with OCR text.
img = Image(src='table.png')
tables = img.extract_tables(ocr=ocr)
# For an Image, extract_tables returns a list of extracted tables.
print(tables)
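Printing the list shows only object representations. As a rough sketch of inspecting the results, the helper below reads the title and df attributes that img2table's extracted tables expose (df being a pandas DataFrame). The name summarize is hypothetical and not part of the library.

```python
def summarize(tables):
    """Return a (title, row_count, column_count) tuple for each extracted table."""
    summary = []
    for table in tables:
        rows, cols = table.df.shape  # shape of the table's DataFrame
        summary.append((table.title, rows, cols))
    return summary


# Usage, assuming `tables` comes from img.extract_tables(ocr=ocr):
# for title, rows, cols in summarize(tables):
#     print(title, rows, cols)
```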