BabelDOC

0.5.24 verified Fri May 01 auth: no python

Yet Another Document Translator. Current version 0.5.24. Supports Python >=3.10 <3.14. Released roughly weekly.

pip install babeldoc[ocr]
error ModuleNotFoundError: No module named 'babeldoc'
cause The package is not installed or installed in a different environment.
fix Run 'pip install babeldoc' in the correct Python environment.
error ImportError: cannot import name 'DocumentTranslator' from 'babeldoc'
cause Old version of babeldoc or incorrect import path. The class was renamed in v0.5.
fix Upgrade to the latest release with 'pip install --upgrade babeldoc', then import with 'from babeldoc import DocumentTranslator'.
breaking Python 3.14 and later are not supported; requires Python >=3.10 <3.14.
fix Use Python 3.10, 3.11, 3.12, or 3.13.
gotcha The 'ocr' extra is required for OCR-based translation; without it, only text extraction is available.
fix Install with 'pip install babeldoc[ocr]' and ensure you have Tesseract installed separately.
gotcha An API key must be passed explicitly or set via an environment variable; there is no default.
fix Set OPENAI_API_KEY environment variable or pass api_key parameter.
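
Several of the entries above (wrong environment, unsupported interpreter, missing API key) can be caught before any translation starts. The following is a minimal preflight sketch using only the standard library. It takes the stated range >=3.10 <3.14 at face value. The helper names (python_supported, package_importable, resolve_api_key, preflight) are hypothetical and not part of babeldoc, and the sketch does not check for the 'ocr' extra or a Tesseract install.

```python
import importlib.util
import os
import sys

SUPPORTED_MIN = (3, 10)  # inclusive, from ">=3.10"
SUPPORTED_MAX = (3, 14)  # exclusive, from "<3.14"


def python_supported(version=None) -> bool:
    """True if (major, minor) falls inside the stated supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor < SUPPORTED_MAX


def package_importable(name: str = "babeldoc") -> bool:
    """True if `name` can be imported by the running interpreter."""
    return importlib.util.find_spec(name) is not None


def resolve_api_key(explicit=None, env=None):
    """Prefer an explicit key, then OPENAI_API_KEY; None if neither is usable."""
    env = os.environ if env is None else env
    for candidate in (explicit, env.get("OPENAI_API_KEY")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


def preflight() -> list[str]:
    """Collect readable problems; an empty list means every check passed."""
    problems = []
    if not python_supported():
        major, minor = sys.version_info[:2]
        problems.append(f"Python {major}.{minor} is outside >=3.10,<3.14")
    if not package_importable():
        # sys.executable shows which environment is actually running.
        problems.append(f"babeldoc is not importable from {sys.executable}")
    if resolve_api_key() is None:
        problems.append("OPENAI_API_KEY is not set and no api_key was passed")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Printing sys.executable in the import check is deliberate: when the package is installed in a different environment, the interpreter path is usually the quickest clue.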
pip install babeldoc

Basic usage: instantiate DocumentTranslator with API key and model, then call translate with a TranslationConfig.

import os

from babeldoc import DocumentTranslator, TranslationConfig

translator = DocumentTranslator(
    # Falls back to an empty string if OPENAI_API_KEY is unset.
    api_key=os.environ.get('OPENAI_API_KEY', ''),
    model='gpt-4o',
    source_lang='en',
    target_lang='zh'
)
config = TranslationConfig(
    input_file='input.pdf',
    output_file='output.pdf'
)
translator.translate(config)