Datefinder
Datefinder is a Python library that extracts datetime objects from natural-language text. It supports a range of date and time formats, including both absolute and relative expressions. The current version is 1.0.0; the project is actively released, and this major update changed the default parsing engine.
Warnings
- breaking In version 1.0.0, the default parsing engine for `find_dates` changed from 'legacy' to a new 'v2' engine. Results may differ if your application relied on specific behaviors of the 'legacy' engine; pass `engine='legacy'` to keep the pre-1.0.0 behavior.
- gotcha The `find_dates` function returns an iterable of `DateMatch` objects, not raw `datetime` objects. Each `DateMatch` exposes the extracted `datetime` object as `.datetime` and the original matched string as `.substring`.
- breaking Support for Python 2 was dropped in version 0.7.0. Older applications running on Python 2 will need to upgrade their Python version to use recent datefinder releases.
- gotcha By default, `datefinder` returns naive `datetime` objects (without timezone information) unless explicit timezone details are present in the input text. If timezone awareness is critical, attach or convert the timezone yourself after parsing.
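- gotcha The `strict=True` parameter significantly restricts the types of dates and times `datefinder` will parse: only matches with complete date information (day, month, and year) are returned, so partial expressions such as "July 2016" or "Monday" are skipped. Expect fewer matches than with the default setting.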
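The timezone gotcha above can be handled after parsing. The following is a minimal sketch using only the standard library. `ensure_aware` is a hypothetical helper name, not part of datefinder, and it assumes that naive values should be interpreted in a timezone you choose (UTC here). The sample values stand in for datetimes extracted from text.

```python
from datetime import datetime, timedelta, timezone


def ensure_aware(dt, default_tz=timezone.utc):
    """Return an aware datetime; naive input is assumed to be in default_tz."""
    if dt.tzinfo is None or dt.tzinfo.utcoffset(dt) is None:
        # Naive: attach the assumed timezone without shifting the clock time.
        return dt.replace(tzinfo=default_tz)
    # Already aware: leave the original timezone untouched.
    return dt


# Sample values standing in for datetimes extracted from text.
naive = datetime(2024, 10, 25, 15, 0)
aware = datetime(2024, 10, 25, 15, 0, tzinfo=timezone(timedelta(hours=-4)))

print(ensure_aware(naive))  # 2024-10-25 15:00:00+00:00
print(ensure_aware(aware))  # 2024-10-25 15:00:00-04:00
```

Attaching a timezone with `replace` keeps the wall-clock time as written in the text. If the text is known to be in a different local timezone, pass that as `default_tz` instead of UTC.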
Install
pip install datefinder
Imports
- find_dates
from datefinder import find_dates
- find_dates_legacy
from datefinder import find_dates_legacy
- extract
from datefinder import extract
Quickstart
from datefinder import find_dates

text = "I have a meeting on October 25th, 2024 at 3 PM and another one next Tuesday."

print("Dates found:")
for match in find_dates(text):
    print(f"  Match: '{match.substring}' -> Datetime: {match.datetime}")

# To use the legacy engine (pre-1.0.0 behavior)
print("\nDates found (legacy engine):")
for match in find_dates(text, engine='legacy'):
    print(f"  Match: '{match.substring}' -> Datetime: {match.datetime}")
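If downstream code expects plain `datetime` values, the match objects can be unwrapped in one pass. This is a minimal sketch: `FakeMatch` is a hypothetical stand-in that only mirrors the `.datetime` and `.substring` attributes described in the warnings, so the example runs without the library installed. `to_datetimes` is likewise an illustrative helper, not a datefinder function. As a defensive assumption, it also passes through any value that is already a plain `datetime`.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass
class FakeMatch:
    """Hypothetical stand-in with the same attributes as a match object."""
    datetime: dt.datetime
    substring: str


def to_datetimes(matches):
    """Collect plain datetimes from match-like objects or raw datetimes."""
    result = []
    for m in matches:
        if isinstance(m, dt.datetime):
            # Already a plain datetime: keep it unchanged.
            result.append(m)
        else:
            # Match-like object: unwrap its .datetime attribute.
            result.append(m.datetime)
    return result


# Sample input mixing a match-like object and a raw datetime.
matches = [
    FakeMatch(dt.datetime(2024, 10, 25, 15, 0), "October 25th, 2024 at 3 PM"),
    dt.datetime(2024, 10, 29),
]
print(to_datetimes(matches))
```

With the real library, the same helper would be called as `to_datetimes(find_dates(text))`, assuming the returned objects behave as described above.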