{"id":3205,"library":"pagefind","title":"Pagefind Python API","description":"Pagefind is a Python API wrapper for the Pagefind Rust binary, providing static, low-bandwidth full-text search capabilities for websites. It excels at indexing large static sites, supporting HTML files, custom records, and multilingual content. The library is actively maintained, with version 1.5.0 released on April 6, 2026, offering an asynchronous interface for programmatic indexing.","status":"active","version":"1.5.0","language":"en","source_language":"en","source_url":"https://github.com/Pagefind/pagefind","tags":["static site generation","search","full-text search","indexing","offline search"],"install":[{"cmd":"pip install 'pagefind[bin]'","lang":"bash","label":"With standard binary"},{"cmd":"pip install 'pagefind[extended]'","lang":"bash","label":"With extended binary (Chinese/Japanese support)"},{"cmd":"pip install pagefind","lang":"bash","label":"Python wrapper only (requires pre-installed Pagefind binary)"}],"dependencies":[{"reason":"Runtime dependency","package":"python","version":">=3.9"},{"reason":"Core indexing engine, bundled via extras or requires manual installation","package":"pagefind (Rust binary)","optional":false}],"imports":[{"symbol":"PagefindIndex","correct":"from pagefind.index import PagefindIndex"},{"symbol":"IndexConfig","correct":"from pagefind.index import IndexConfig"}],"quickstart":{"code":"import asyncio\nimport json\nimport logging\nimport os\nfrom pagefind.index import PagefindIndex, IndexConfig\n\nlogging.basicConfig(level=os.environ.get(\"LOG_LEVEL\", \"INFO\"))\nlog = logging.getLogger(__name__)\n\nhtml_content = (\n    \"<html>\"\n    \" <body>\"\n    \" <main>\"\n    \" <h1>Example HTML</h1>\"\n    \" <p>This is an example HTML page.</p>\"\n    \" </main>\"\n    \" </body>\"\n    \"</html>\"\n)\n\nasync def main():\n    config = IndexConfig(\n        root_selector=\"main\",\n        logfile=\"index.log\",\n        
output_path=\"./pagefind_output\", # Using a custom output path for example\n        verbose=True\n    )\n\n    async with PagefindIndex(config=config) as index:\n        log.debug(\"Opened index\")\n        new_file, new_record = await asyncio.gather(\n            index.add_html_file(\n                content=html_content,\n                url=\"https://example.com/some-page\",\n                source_path=\"example.html\",\n            ),\n            index.add_custom_record(\n                url=\"/elephants/\",\n                content=\"Some testing content regarding elephants\",\n                language=\"en\",\n                meta={\"title\": \"Elephants\"},\n            ),\n        )\n\n        print(f\"Indexed HTML file: {json.dumps(new_file, indent=2)}\")\n        print(f\"Added custom record: {json.dumps(new_record, indent=2)}\")\n\n    print(\"Indexing complete. Output written to ./pagefind_output\")\n\nif __name__ == \"__main__\":\n    # To run this, ensure you have installed with `pip install 'pagefind[bin]'`\n    # or have the pagefind binary available in your PATH.\n    asyncio.run(main())","lang":"python","description":"This quickstart demonstrates how to initialize Pagefind, add HTML content directly, and include custom records in the search index using its asynchronous Python API. The output index files will be saved to `./pagefind_output`."},"warnings":[{"fix":"Update your build process or configuration to expect `pagefind` as the output directory, or explicitly set `output_path` in `IndexConfig`.","message":"Pagefind 1.0 (and subsequent versions) changed the default output directory from `_pagefind` to `pagefind`. 
Existing build scripts or configurations that reference `_pagefind` need updating.","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"Adjust any direct CLI calls or subprocess commands to use the new option names (`--site`, `--output-subdir`).","message":"Pagefind 1.0 renamed two CLI options: `source` became `site`, and `bundle-dir` became `output-subdir`. This primarily affects direct CLI usage but can also affect Python scripts that invoke the CLI.","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"Always use `pip install 'pagefind[bin]'` or `pip install 'pagefind[extended]'` unless you intend to provide the Pagefind binary yourself.","message":"`pip install pagefind` installs only the Python wrapper. To get the required Pagefind Rust binary automatically, install with an extra: `pip install 'pagefind[bin]'` for the standard binary, or `pip install 'pagefind[extended]'` for extended language support.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Wrap `PagefindIndex` instantiation and usage within an `async with PagefindIndex(...) as index:` block.","message":"The `PagefindIndex` object is an asynchronous context manager and must be used with `async with`. Otherwise the index is not properly opened or written, and the backing service is not shut down.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Expect search result order to differ from earlier versions. If fine-tuning is required, ranking weights are set at search time through the `ranking` option of the Pagefind JS API or Default UI, not through `IndexConfig`.","message":"Pagefind v1.1.0 improved its core result ranking algorithm to align with BM25. This changes the ordering of search results compared to earlier versions, potentially providing better relevance by default.","severity":"behavioral","affected_versions":">=1.1.0"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}