LinkChecker
LinkChecker is a free, GPL-licensed website validator that checks links in individual web documents or entire websites. It supports recursive and multithreaded checking, several output formats (text, HTML, SQL, CSV, XML), and proxies, and it honors the robots.txt exclusion protocol. The current version is 10.6.0, and the project is actively maintained, often with multiple releases a year.
Common errors
- linkchecker: command not found
  cause: The `linkchecker` executable is not on your system's PATH, or the shell profile was not reloaded after the pip/pipx installation.
  fix: Ensure your shell's PATH includes the directory where pip/pipx installs executables. If using `pipx`, verify that `pipx ensurepath` has been run, then open a new shell. You can also try running the package as a module with `python -m linkcheck --help` (the importable package is named `linkcheck`, not `linkchecker`).
- Error: problems with a configuration file or cookie file detected on startup
  cause: Since v10.4.0, LinkChecker exits explicitly if it cannot parse the configuration file (`~/.linkchecker/linkcheckerrc`) or a specified cookie file. Older versions might have raised an exception instead.
  fix: Review your configuration file and any cookie files for syntax errors or malformed content. Ensure they are valid and readable.
- TypeError: 'NoneType' object is not iterable (or similar FTP-related errors)
  cause: Prior to version 10.3.0, the FTP checker module had bugs, including raising `TypeError` and ignoring `maxfilesizedownload`.
  fix: Upgrade LinkChecker to version 10.3.0 or later to resolve the known issues with FTP link checking.
- linkchecker: error: argument -p/--password: ignored
  cause: In versions prior to 10.3.0, the `-p`/`--password` command-line option for specifying passwords was ignored.
  fix: Upgrade to LinkChecker 10.3.0 or later. If you keep credentials in a configuration file, verify the format, since login entries may have changed in v10.0.0.
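The "command not found" case above can be narrowed down with a small script. This is a minimal sketch, not part of LinkChecker: the helper names are hypothetical, and `~/.local/bin` is only an assumption about where a default pipx or `pip install --user` setup places executables on Linux/macOS.

```python
import shutil
import sysconfig
from pathlib import Path


def find_executable(command="linkchecker"):
    """Return the full path of `command` if it is on PATH, else None."""
    return shutil.which(command)


def candidate_script_dirs():
    """List directories where the executable commonly ends up.

    The first entry is this interpreter's own scripts directory. The second,
    ~/.local/bin, is an assumption about a default pipx or user-level pip
    setup and may differ on your system.
    """
    return [Path(sysconfig.get_path("scripts")), Path.home() / ".local" / "bin"]


def diagnose(command="linkchecker"):
    """Return a one-line, human-readable diagnosis."""
    found = find_executable(command)
    if found:
        return f"{command} found at {found}"
    dirs = ", ".join(str(d) for d in candidate_script_dirs())
    return f"{command} is not on PATH; check whether PATH is missing one of: {dirs}"


if __name__ == "__main__":
    print(diagnose())
```

If the script reports that the command is missing, compare the listed directories with the output of `echo $PATH` before reinstalling.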
Warnings
- breaking LinkChecker transitioned from Python 2 to Python 3 with version 10.0.0. Older versions (pre-10.0.0) are incompatible with Python 3 environments.
- breaking The minimum required Python version has progressively increased. LinkChecker 10.4.0 and later requires Python 3.9+. Earlier versions had lower requirements (e.g., 3.7 for v10.2.0, 3.8 for v10.3.0).
- deprecated Support for checking NNTP and Telnet links was removed in version 10.3.0.
- gotcha Prior to version 10.6.0, LinkChecker did not verify the SSL/TLS certificate of the HTTPS connection when an HTTP URL redirected to HTTPS, so an invalid certificate on the redirect target could go unreported.
- gotcha LinkChecker often reports HTTP 301 (Moved Permanently) redirects as warnings by default, which can result in verbose output.
- gotcha When checking slow or under-development sites, LinkChecker may report many timeouts.
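The last two gotchas can usually be handled through configuration. The sketch below generates a small configuration file that raises the connection timeout and ignores the redirect warning. The section and option names (`[checking] timeout`, `[filtering] ignorewarnings`) and the warning name `http-redirected` reflect my reading of the linkcheckerrc(5) manual page and should be verified against your installed version; the helper names and the 120-second value are illustrative assumptions.

```python
import configparser
import io


def build_config(timeout_seconds=120, ignored_warnings=("http-redirected",)):
    """Build a configuration object with a longer timeout and ignored warnings.

    Option names are assumptions taken from linkcheckerrc(5); check them
    against the documentation for your LinkChecker version.
    """
    config = configparser.ConfigParser()
    config["checking"] = {"timeout": str(timeout_seconds)}
    config["filtering"] = {"ignorewarnings": ",".join(ignored_warnings)}
    return config


def render(config):
    """Serialize the configuration as INI text without spaces around '='."""
    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the generated file so it can be reviewed before saving it.
    print(render(build_config()), end="")
```

Save the output to a file and pass it with `-f`/`--config`, or merge the two sections into your existing configuration file.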
Install
- pip install linkchecker
- pipx install linkchecker
Imports
- linkchecker
import linkchecker
This is primarily a command-line tool. Importing it programmatically is not the standard usage pattern, and the installed Python package is named `linkcheck`, so `import linkchecker` is not expected to work. Instead, execute it as a shell command.
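When LinkChecker has to be driven from Python, one option is to call the command-line tool through `subprocess` rather than importing it. The following is a minimal sketch under stated assumptions: `check_site` and `describe_exit_code` are hypothetical helpers, and the exit-code meanings follow my understanding of the linkchecker(1) manual page.

```python
import subprocess

# Exit-code meanings as described in linkchecker(1); treat as an assumption
# and confirm against the manual page for your version.
EXIT_MEANINGS = {
    0: "no invalid links found",
    1: "invalid links found (or warnings, when warnings are enabled)",
    2: "program error",
}


def describe_exit_code(code):
    """Translate a linkchecker exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def check_site(url, executable="linkchecker"):
    """Run the CLI on `url` and return (exit_code, captured_stdout).

    Raises FileNotFoundError if `executable` is not on PATH.
    """
    completed = subprocess.run(
        [executable, url],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code is a result here, not a failure
    )
    return completed.returncode, completed.stdout
```

Calling `check_site("https://www.example.com")` returns the exit code and the text report, and `describe_exit_code` turns the code into a message suitable for a log or CI summary.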
Quickstart
linkchecker https://www.example.com
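Beyond the bare command, the recursive, multithreaded, and output-format features are selected with command-line options. The sketch below assembles such a command as an argument list. `build_command` is a hypothetical helper; the flags `--recursion-level`, `--threads`, `--output`, and `--check-extern` are taken from the linkchecker(1) manual page as I understand it, so confirm them with `linkchecker --help`.

```python
import shlex


def build_command(url, recursion_level=None, threads=None, output=None,
                  check_extern=False):
    """Return a linkchecker argument list; options left as None are omitted."""
    cmd = ["linkchecker"]
    if recursion_level is not None:
        cmd.append(f"--recursion-level={recursion_level}")
    if threads is not None:
        cmd.append(f"--threads={threads}")
    if output is not None:
        # Output types include text, html, sql, csv, and xml.
        cmd.append(f"--output={output}")
    if check_extern:
        cmd.append("--check-extern")
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Print the shell-quoted command instead of running it.
    command = build_command("https://www.example.com", recursion_level=1,
                            threads=4, output="csv")
    print(shlex.join(command))
```

Building a list rather than a single string avoids shell-quoting problems when a URL contains characters such as `&` or `?`, and the list can be passed directly to `subprocess.run`.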