{"id":542,"library":"nltk","title":"Natural Language Toolkit (NLTK)","description":"NLTK (Natural Language Toolkit) is a leading open-source Python library for Natural Language Processing (NLP). It provides easy-to-use interfaces to over 50 corpora and lexical resources, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. Currently at version 3.9.4, NLTK generally follows a release cadence of a few minor versions per year, with more significant updates addressing security and Python compatibility as needed.","status":"active","version":"3.9.4","language":"python","source_language":"en","source_url":"https://github.com/nltk/nltk","tags":["NLP","Natural Language Processing","text analysis","tokenization","stemming","tagging","corpora"],"install":[{"cmd":"pip install nltk","lang":"bash","label":"Install NLTK"}],"dependencies":[],"imports":[{"note":"Most common; provides access to core functionality and submodules such as nltk.word_tokenize and nltk.pos_tag.","symbol":"nltk","correct":"import nltk"},{"note":"Direct import for specific tokenization functions.","symbol":"word_tokenize","correct":"from nltk.tokenize import word_tokenize"},{"note":"Direct import for a specific stemmer.","symbol":"PorterStemmer","correct":"from nltk.stem import PorterStemmer"}],"quickstart":{"code":"import nltk\nfrom nltk.tokenize import word_tokenize\nfrom nltk.tag import pos_tag\n\n# Download necessary NLTK data (run once); nltk.data.find raises LookupError when a resource is missing\ntry:\n    nltk.data.find('tokenizers/punkt_tab')\nexcept LookupError:\n    nltk.download('punkt_tab')\ntry:\n    nltk.data.find('taggers/averaged_perceptron_tagger_eng')\nexcept LookupError:\n    nltk.download('averaged_perceptron_tagger_eng')\n\ntext = \"NLTK is a powerful library for natural language processing.\"\n\n# Tokenization\ntokens = word_tokenize(text)\nprint(f\"Tokens: {tokens}\")\n\n# Part-of-Speech Tagging\ntagged_tokens = pos_tag(tokens)\nprint(f\"POS Tagged: {tagged_tokens}\")","lang":"python","description":"This quickstart demonstrates basic text tokenization and Part-of-Speech (POS) tagging using NLTK. It checks for the 'punkt_tab' tokenizer models and the 'averaged_perceptron_tagger_eng' tagger (the pickle-free packages used by NLTK 3.9+) and downloads them if absent; `nltk.data.find` signals a missing resource by raising the built-in `LookupError`, so that is the exception to catch. This ensures the example is runnable out-of-the-box."},"warnings":[{"fix":"Upgrade NLTK to version 3.9 or higher. Ensure your application is updated to use the new `_tab` packages or re-download corpora with `nltk.download()` after upgrading. Specifically, NLTK 3.9.3 fixed CVE-2025-14009 related to secure ZIP extraction.","message":"NLTK 3.9 introduced a breaking change by replacing pickled models (e.g., for `punkt`, chunkers, taggers) with new pickle-free `_tab` packages to fix security vulnerability CVE-2024-39705. Older versions using pickled models may be insecure or incompatible with newer NLTK versions.","severity":"breaking","affected_versions":"<3.9"},{"fix":"Before using a specific NLTK module that relies on external data, ensure the necessary data is downloaded. For production, explicitly download only the required packages using `nltk.download('package_name')` once during setup, or use `nltk.data.path.append('/path/to/nltk_data')` to point to pre-downloaded data. For example, `nltk.download('punkt_tab')` for the Punkt tokenizer on NLTK 3.9+ (`punkt` on older versions).","message":"Many NLTK functionalities (e.g., tokenizers, taggers, corpora) require downloading specific datasets. Failing to download them will result in `LookupError: Resource not found` errors. 
Running `nltk.download('all')` can be resource-intensive and unsuitable for production environments.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Detect missing resources by calling `nltk.data.find()` and catching the built-in `LookupError` it raises, e.g. `try: nltk.data.find('tokenizers/punkt_tab')` followed by `except LookupError: nltk.download('punkt_tab')`.","message":"The `nltk.downloader` module does not provide a `DownloadError` exception class, so code written as `except nltk.downloader.DownloadError:` fails with an `AttributeError` when the except clause is evaluated. The supported pattern is to call `nltk.data.find()` and catch the built-in `LookupError` it raises for a missing resource.","severity":"breaking","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-05-12T14:50:57.394Z","next_check":"2026-06-26T00:00:00.000Z","problems":[{"fix":"Install NLTK using pip: `pip install nltk` or `pip3 install nltk`.","cause":"The NLTK library is not installed in the Python environment you are currently using, or there's an issue with your Python PATH.","error":"ModuleNotFoundError: No module named 'nltk'"},{"fix":"Open a Python interpreter and run `import nltk; nltk.download('punkt_tab')` (NLTK 3.9+; `nltk.download('punkt')` on older versions) to download the missing tokenizer models. For other resources, replace the name with the one reported as missing (e.g., 'stopwords', 'wordnet', 'averaged_perceptron_tagger_eng'); run `nltk.download('popular')` for the most commonly used collections, or `nltk.download('all')` to download everything (large and rarely needed).","cause":"NLTK requires additional data packages (like 'punkt_tab' for tokenization, 'stopwords' for stop word lists, 'wordnet' for lexical resources, etc.) that are not included in the initial library installation and must be downloaded separately.","error":"LookupError: Resource 'punkt' not found. Please use the NLTK Downloader to obtain the resource:"},{"fix":"Bypass SSL verification for the NLTK download. In your Python script or interpreter, add the following before calling `nltk.download()`:\n\nimport ssl\ntry:\n    _create_unverified_https_context = ssl._create_unverified_context\nexcept AttributeError:\n    pass\nelse:\n    ssl._create_default_https_context = _create_unverified_https_context\n\nnltk.download('popular')  # or the specific resource you need\n\nDisabling certificate verification is a workaround; prefer updating your Python certificate store (e.g., `pip install --upgrade certifi`) where possible.","cause":"The NLTK data downloader is encountering an SSL certificate verification issue, often due to corporate network proxies, firewalls, or an outdated Python installation's certificate store.","error":"Error loading [resource name]: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed:"},{"fix":"Rename your Python script if it's named `nltk.py` (or any other name that conflicts with an NLTK module). If that's not the case, ensure NLTK is properly installed and updated by running `pip install --upgrade nltk`.","cause":"This usually happens if you've inadvertently named one of your Python files 'nltk.py', which causes Python to import your local file instead of the actual NLTK library, or if you're using a very old or corrupted NLTK installation.","error":"AttributeError: module 'nltk' has no attribute 'download'"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":0,"quickstart_tag":"stale","pypi_latest":null,"install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.99,"mem_mb":17.7,"disk_size":"35.4M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.66,"mem_mb":17.7,"disk_size":"36M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.63,"mem_mb":20.4,"disk_size":"40.5M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.35,"mem_mb":20.4,"disk_size":"41M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.2,"mem_mb":19.5,"disk_size":"31.5M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.29,"mem_mb":19.5,"disk_size":"32M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.83,"mem_mb":19.9,"disk_size":"31.0M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.32,"mem_mb":19.9,"disk_size":"32M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.8,"mem_mb":17.3,"disk_size":"34.5M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.69,"mem_mb":17.3,"disk_size":"36M"}]},"quickstart_checks":{"last_tested":"2026-04-23","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]}}