{"id":3730,"library":"paddleocr","title":"PaddleOCR","description":"PaddleOCR is a multilingual OCR and document parsing toolkit built on the PaddlePaddle deep learning framework. It provides text detection, text recognition, and structured document understanding, and can convert images and PDFs into structured data (JSON/Markdown) for AI and LLM-based applications. The library is currently at version 3.4.0 and has a frequent release cadence: minor versions ship every few months with new models and features.","status":"active","version":"3.4.0","language":"en","source_language":"en","source_url":"https://github.com/PaddlePaddle/PaddleOCR","tags":["OCR","AI","ML","Computer Vision","Document Processing","PaddlePaddle","Text Recognition","Text Detection","Multilingual","Document Understanding"],"install":[{"cmd":"pip install paddlepaddle\npip install paddleocr","lang":"bash","label":"Basic OCR (CPU)"},{"cmd":"pip install paddlepaddle-gpu # Or specific CUDA version, see PaddlePaddle docs\npip install \"paddleocr[all]\"","lang":"bash","label":"Full features (GPU)"}],"dependencies":[{"reason":"Essential deep learning framework runtime for all PaddleOCR functionalities.","package":"paddlepaddle","optional":false},{"reason":"Used for PDF document processing, but specific version conflicts can occur.","package":"pymupdf","optional":true},{"reason":"Required for advanced layout analysis features.","package":"layoutparser","optional":true}],"imports":[{"note":"The primary class for OCR inference is PaddleOCR.","wrong":"import paddleocr","symbol":"PaddleOCR","correct":"from paddleocr import PaddleOCR"},{"note":"Used for advanced Vision-Language model (VLM) document parsing tasks; available with the [doc-parser] extra.","symbol":"PaddleOCRVL","correct":"from paddleocr import PaddleOCRVL"}],"quickstart":{"code":"from paddleocr import PaddleOCR\nimport os\nimport cv2\nimport numpy as np\n\n# Create a dummy image for 
demonstration\nimg_path = 'temp_ocr_test_image.png'\nimg = np.zeros((100, 300, 3), dtype=np.uint8)\ncv2.putText(img, 'Hello PaddleOCR!', (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)\ncv2.imwrite(img_path, img)\n\n# Initialize PaddleOCR for English; omit `lang` to use the default (Chinese & English) models\n# Models are downloaded automatically on first use\nocr = PaddleOCR(\n    lang='en',\n    use_doc_orientation_classify=False,  # skip whole-page orientation detection\n    use_doc_unwarping=False,  # skip document unwarping\n    use_textline_orientation=True,  # 3.x name for the 2.x `use_angle_cls` option\n)\n\n# Run OCR on the image (3.x API; `predict` replaces the 2.x `ocr.ocr(img_path, cls=True)` call)\nresult = ocr.predict(input=img_path)\n\n# Print detected text and confidence scores (one result object per input image)\nfor res in result:\n    for text, score in zip(res['rec_texts'], res['rec_scores']):\n        print(f\"Text: {text}, Confidence: {score:.2f}\")\n\n# Clean up dummy image\nos.remove(img_path)","lang":"python","description":"This quickstart demonstrates how to perform basic OCR on an image using the `PaddleOCR` class with the 3.x API. It initializes the OCR engine (downloading models if not present), processes a dummy image with `predict()`, and prints each recognized text line along with its confidence score. For document parsing with VLM, use `PaddleOCRVL`."},"warnings":[{"fix":"Refer to the official documentation for PaddleOCR 3.x to update code. Pin PaddleOCR and PaddlePaddle versions in your project's dependencies.","message":"PaddleOCR 3.x introduced significant interface changes relative to 2.x, so code written for 2.x will likely break. For example, the `show_log` constructor argument and the `cls` argument of `ocr()` are no longer accepted, and `predict()` is the preferred inference method.","severity":"breaking","affected_versions":"3.0.0 and above"},{"fix":"Always install the `paddlepaddle` framework first, ensuring its version is compatible with your `paddleocr` installation and GPU environment (e.g., CUDA version). Consult PaddlePaddle's official installation guide for specific platform/CUDA versions.","message":"Incompatible `PaddlePaddle` framework versions can lead to runtime errors, especially with GPU usage (CUDA/cuDNN issues). 
`paddleocr` requires PaddlePaddle 3.0 or above.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If `paddleocr[all]` causes issues with PDFs, try uninstalling `pymupdf` and installing a specific compatible version, for example: `pip uninstall pymupdf` then `pip install pymupdf==1.19.0`.","message":"When processing PDF files, an `AttributeError` related to `pymupdf` (e.g., 'Document' object has no attribute 'metadata') can occur due to version conflicts.","severity":"gotcha","affected_versions":"All versions (specific to pymupdf dependency)"},{"fix":"Ensure `libcublas.so` and `libcudnn.so` (and potentially other cuDNN files) are correctly linked or discoverable in `/usr/lib` or other system library paths. Creating symbolic links to the actual library locations is a common solution.","message":"GPU installations frequently fail with `RuntimeError: (PreconditionNotMet) Cannot load cudnn shared library` or a similar error. This usually means PaddleOCR cannot find the required CUDA/cuDNN libraries.","severity":"gotcha","affected_versions":"All GPU-enabled versions"},{"fix":"Experiment with different PaddleOCR model versions (e.g., PP-OCRv5 for general scenes), preprocess images to improve quality, or fine-tune models on domain-specific datasets. Adjust parameters like `rec_image_shape`, or enable text-line orientation classification (`use_textline_orientation` in 3.x, `use_angle_cls` in 2.x), as needed.","message":"Default models may not achieve optimal accuracy for specific text types (e.g., numeric-only) or challenging image conditions (low contrast, noise).","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}