Wikipedia-API

0.13.0 verified Tue May 12 auth: no python install: verified

Wikipedia-API is a Python wrapper for Wikipedia's MediaWiki API. It extracts text, sections, links, categories, and translations from Wikipedia pages, and offers both synchronous and asynchronous clients. The library is actively maintained with frequent releases; the current version is 0.13.0.

pip install wikipedia-api
error ModuleNotFoundError: No module named 'wikipedia_api'
cause The Python module name to import is `wikipediaapi` (without hyphen or underscore), even though the installation package name is `wikipedia-api`.
fix
import wikipediaapi
error AttributeError: 'WikipediaPage' object has no attribute 'content'
cause The `wikipedia-api` library uses the attribute `text` to access the main textual content of a Wikipedia page, not `content`.
fix
page_object.text
error AttributeError: 'WikipediaPage' object has no attribute 'info'
cause The `WikipediaPage` object in `wikipedia-api` does not have a direct `info` attribute; detailed information is accessed via other attributes like `sections`, `links`, or `categories`.
fix
Use page_object.sections, page_object.links, page_object.categories, or page_object.summary to access specific information.
error TypeError: 'builtin_function_or_method' object is not subscriptable
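cause Square brackets were applied to an object that does not support indexing. A common trigger is reading page data with dictionary-style bracket notation (e.g., `page['sections']`) instead of dot notation; the type named in the message depends on which object was subscripted.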
fix
Use dot notation, such as page.sections (or page.links, page.categories, etc.), to access properties of the WikipediaPage object.
breaking The `user_agent` parameter became mandatory in the `Wikipedia` and `AsyncWikipedia` constructors starting from version 0.6.0. Failing to provide it will raise an error.
fix Always pass a descriptive `user_agent` string as the first argument to the constructor, e.g., `wikipediaapi.Wikipedia(user_agent='MyProjectName (contact@example.com)', ...)`.
breaking The position of the `variant` parameter in the `Wikipedia` and `AsyncWikipedia` constructors changed to the third argument in versions 0.7.x. If you were explicitly passing `variant` by position in older versions, this will break.
fix Pass `variant` as a keyword argument (e.g., `wikipediaapi.Wikipedia(..., language='zh', variant='zh-tw')`), or as the third positional argument if you rely on positional arguments.
gotcha When using the `AsyncWikipedia` client, all data-fetching attributes (e.g., `summary`, `text`, `sections`, `links`, `fullurl`, `pageid`) are coroutines and must be awaited. Reading one without `await` yields a coroutine object instead of the expected data.
fix Prepend `await` to calls for data-fetching attributes, e.g., `await page.summary` instead of `page.summary`.
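The difference between reading and awaiting such an attribute can be seen without touching the network. The sketch below does not import `wikipediaapi`; `StubPage` is a hypothetical stand-in whose `summary` property returns a coroutine, mirroring the behaviour described above for `AsyncWikipedia` pages.

```python
import asyncio
import inspect


class StubPage:
    """Hypothetical stand-in for an async page; not part of wikipediaapi."""

    @property
    def summary(self):
        # Reading the property only creates a coroutine; nothing has run yet.
        return self._fetch_summary()

    async def _fetch_summary(self):
        await asyncio.sleep(0)  # placeholder for the real network call
        return "Python is a high-level programming language."


async def demo():
    page = StubPage()

    pending = page.summary  # missing `await`: this is a coroutine object
    forgot_await = inspect.iscoroutine(pending)
    pending.close()  # discard it cleanly to avoid a "never awaited" warning

    text = await page.summary  # correct: resolves to the string
    return forgot_await, text


forgot_await, text = asyncio.run(demo())
print(forgot_await)
print(text)
```

If a value prints as `<coroutine object ...>`, a missing `await` is the first thing to check.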
gotcha Providing an inappropriate or missing `User-Agent` string can lead to your requests being blocked by Wikimedia servers. It's essential to follow Wikimedia's User-Agent policy, including providing a project name and contact information.
fix Use a descriptive `user_agent` like `'MyProjectName/1.0 (https://myproject.com; contact@example.com)'` or `'MyProjectName (contact@example.com)'`.
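A small helper can keep the string in one of the two formats shown above. This is a minimal sketch: `build_user_agent` and its parameters are hypothetical names, not part of `wikipediaapi`.

```python
def build_user_agent(project, version, contact, url=None):
    """Return 'Project/1.0 (url; contact)', or 'Project/1.0 (contact)' without a URL."""
    if not project.strip() or not contact.strip():
        raise ValueError("project and contact must be non-empty")
    details = f"{url}; {contact}" if url else contact
    return f"{project}/{version} ({details})"


user_agent = build_user_agent(
    "MyProjectName", "1.0", "contact@example.com", url="https://myproject.com"
)
print(user_agent)
```

The resulting string can then be passed as the `user_agent` argument of the client constructor.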
gotcha Excessive or rapid requests to the Wikipedia API, especially without proper delays, can lead to your IP being temporarily or permanently blocked by Wikimedia's servers due to rate limiting.
fix Implement rate limiting or introduce delays between requests, particularly for automated scraping or high-volume data retrieval. Consider using `time.sleep()` between calls if performing many requests.
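One way to space out requests is a small throttle that sleeps only as long as needed to keep a minimum interval between calls. The sketch below is not part of `wikipediaapi`: `Throttle`, `fetch_all`, and the interval value are hypothetical, and `fetch` stands for whatever callable retrieves a page (for example, a client's `page` method).

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(titles, fetch, throttle):
    """Call fetch(title) for each title, pausing between calls as needed."""
    results = {}
    for title in titles:
        throttle.wait()
        results[title] = fetch(title)
    return results


# Usage sketch (assumed one-second interval; tune it to your workload):
# pages = fetch_all(["Python", "Rust"], wiki_wiki.page, Throttle(1.0))
```

The first call never sleeps, and later calls sleep only for the time still missing from the interval, so slow requests are not delayed further.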
breaking `wikipedia-api` versions 0.8.0 and newer use Python 3.10+ type hint syntax (e.g., `X | Y` for union types). Importing these versions on Python older than 3.10 raises `TypeError: unsupported operand type(s) for |: 'type' and 'type'`.
fix Upgrade your Python environment to 3.10 or newer. If you must stay on an older Python version (e.g., 3.9), pin `wikipedia-api` to a release prior to 0.8.0.
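An application can check the interpreter version up front and report a readable message instead of the `TypeError` above. This is a minimal sketch using only the standard library; `supports_union_syntax` and `MIN_PYTHON` are hypothetical names.

```python
import sys

MIN_PYTHON = (3, 10)  # first version that accepts `X | Y` between types


def supports_union_syntax(version_info=None):
    """Return True if the given (or running) interpreter is 3.10 or newer."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if supports_union_syntax():
    print("Python version is new enough for wikipedia-api 0.8.0+")
else:
    print("Python 3.10+ is required for wikipedia-api 0.8.0+")
```

Running the check before `import wikipediaapi` keeps the failure message under your control.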
python  os / libc      status  wheel  install  import  disk   mem    side effects
3.10    alpine (musl)          wheel  -        0.29s   24.2M  8.9M   clean
3.10    alpine (musl)          -      -        0.34s   24.2M  8.8M   -
3.10    slim (glibc)           wheel  2.2s     0.21s   25M    8.9M   clean
3.10    slim (glibc)           -      -        0.23s   25M    8.8M   -
3.11    alpine (musl)          wheel  -        0.44s   26.7M  10.1M  clean
3.11    alpine (musl)          -      -        0.53s   26.7M  10.1M  -
3.11    slim (glibc)           wheel  2.4s     0.40s   27M    10.1M  clean
3.11    slim (glibc)           -      -        0.41s   27M    10.1M  -
3.12    alpine (musl)          wheel  -        0.38s   18.4M  9.7M   clean
3.12    alpine (musl)          -      -        0.44s   18.3M  9.7M   -
3.12    slim (glibc)           wheel  2.1s     0.45s   19M    9.7M   clean
3.12    slim (glibc)           -      -        0.47s   19M    9.7M   -
3.13    alpine (musl)          wheel  -        0.39s   17.7M  10.1M  clean
3.13    alpine (musl)          -      -        0.55s   17.6M  10.1M  -
3.13    slim (glibc)           wheel  2.2s     0.37s   18M    10.1M  clean
3.13    slim (glibc)           -      -        0.41s   18M    10.1M  -
3.9     alpine (musl)          sdist  -        -       22.8M  -      broken
3.9     alpine (musl)          -      -        -       -      -      -
3.9     slim (glibc)           sdist  3.1s     -       23M    -      broken
3.9     slim (glibc)           -      -        -       -      -      -

Demonstrates how to initialize the synchronous Wikipedia client, retrieve a page by title, check its existence, and access its summary, URL, and sections. A descriptive user_agent is mandatory.

import wikipediaapi
import os

# It's crucial to provide a User-Agent that identifies your project.
# Replace 'MyProjectName (contact@example.com)' with your actual project name and contact info.
# See https://meta.wikimedia.org/wiki/User-Agent_policy
wiki_wiki = wikipediaapi.Wikipedia(
    user_agent=os.environ.get('WIKIPEDIA_USER_AGENT', 'MyPythonApp/1.0 (contact@example.com)'),
    language='en'
)

page_py = wiki_wiki.page('Python (programming language)')

if page_py.exists():
    print(f"Page title: {page_py.title}")
    print(f"Summary: {page_py.summary[0:200]}...")
    print(f"Full URL: {page_py.fullurl}")
    # Example of accessing sections
    for s in page_py.sections:
        print(f"Section: {s.title}")
        break  # Just print the first section title for brevity
else:
    print("Page 'Python (programming language)' does not exist.")
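The quickstart prints only the first top-level section. To list the whole outline, a recursive walk is a natural fit. The sketch below assumes each section object exposes `title` and a nested `sections` list; `format_sections` is a hypothetical helper, and the `SimpleNamespace` data is a stand-in so the example runs offline.

```python
from types import SimpleNamespace


def format_sections(sections, depth=0):
    """Return indented section titles, descending into nested subsections."""
    lines = []
    for section in sections:
        lines.append("  " * depth + section.title)
        lines.extend(format_sections(section.sections, depth + 1))
    return lines


# Stand-in data; with the library installed, pass page_py.sections instead.
demo_sections = [
    SimpleNamespace(title="History", sections=[
        SimpleNamespace(title="Early years", sections=[]),
    ]),
    SimpleNamespace(title="Syntax", sections=[]),
]

outline = format_sections(demo_sections)
print("\n".join(outline))
```

Returning a list of lines rather than printing directly makes the helper easy to test and to reuse for other output formats.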