Intermediate

Real-world date data is messy. Users type “yesterday”, “3 days ago”, “next Friday”, or “Jan 5th”. Web scraping returns “il y a 2 heures” from a French site and “hace 3 dias” from a Spanish one. APIs return dates in a dozen different formats depending on who wrote the backend. Python’s datetime.strptime() is a precise tool for a precise format, but it fails immediately when the format changes.

Dateparser handles all of these cases. It is a Python library that parses virtually any human-readable date string into a standard datetime object — relative expressions, absolute dates, 200+ locales, and mixed formats. Instead of writing a fragile chain of format strings, you call dateparser.parse("3 days ago") and get back a proper datetime. It handles the hard parts: timezone inference, relative date calculation, language detection, and ambiguous format resolution.

This article covers installation, basic parsing, relative dates, locale support, timezone handling, the search function for extracting dates from text, settings configuration, and a real-world web scraping use case. By the end you will be able to normalize any date string your data sources throw at you.

Parsing Natural Language Dates: Quick Example

Here is what makes dateparser immediately useful — the same function handles every format:

# quick_dateparser.py
import dateparser

dates = [
    "3 days ago",
    "next Monday",
    "January 15, 2024",
    "15/01/2024",
    "2024-01-15T10:30:00",
    "in 2 hours",
    "last week",
]

for date_str in dates:
    dt = dateparser.parse(date_str)
    if dt:
        print(f"{date_str!r:30s} -> {dt.strftime('%Y-%m-%d %H:%M')}")
    else:
        print(f"{date_str!r:30s} -> Could not parse")

Output:

'3 days ago'                   -> 2024-01-12 14:22
'next Monday'                  -> 2024-01-22 14:22
'January 15, 2024'             -> 2024-01-15 00:00
'15/01/2024'                   -> 2024-01-15 00:00
'2024-01-15T10:30:00'          -> 2024-01-15 10:30
'in 2 hours'                   -> 2024-01-15 16:22
'last week'                    -> 2024-01-08 14:22

Every format returns a datetime object. Relative expressions like “3 days ago” are calculated from the current time and keep the current time of day, while absolute dates with no time component come back at midnight. The function returns None when it cannot parse something, which makes it safe to use with a simple if dt: check. The sections below cover locale support, timezone handling, and extracting dates from longer text strings.

strptime needs a format string. dateparser just needs the date.

What Is dateparser and When Should You Use It?

Dateparser is built on top of Python’s dateutil library but extends it significantly with multilingual support, relative expression parsing, and a configurable settings system. Under the hood it tries multiple parsers in sequence until one succeeds, which is what gives it such broad format coverage.

Library           | Formats                | Relative Dates | Multilingual | Use Case
datetime.strptime | Exact format only      | No             | No           | Known, fixed format
dateutil.parser   | Many English formats   | No             | Limited      | English, structured data
dateparser        | 200+ locales, relative | Yes            | Yes          | User input, scraping, APIs

Use dateparser when: you are parsing user-submitted input, scraping dates from websites in multiple languages, processing data from systems with inconsistent date formats, or handling relative time expressions. Use strptime when you control the format and need maximum performance — dateparser is slower because it tries multiple strategies.
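For comparison, here is a minimal sketch of the strptime route for a source whose formats you control. The format list and the helper name parse_known_formats are assumptions made up for this illustration; they are not part of dateparser or the standard library.

```python
from datetime import datetime

# Hypothetical list of formats this particular data source is known to emit.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")

def parse_known_formats(value, formats=KNOWN_FORMATS):
    """Return a datetime for the first format that matches, or None."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return None

print(parse_known_formats("2024-01-15"))   # 2024-01-15 00:00:00
print(parse_known_formats("15/01/2024"))   # 2024-01-15 00:00:00
print(parse_known_formats("3 days ago"))   # None
```

This is fast and predictable, but every new format means another entry in the list, and relative expressions are out of reach entirely. That gap is where dateparser earns its place.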

Installing dateparser

# terminal
pip install dateparser

Output:

Successfully installed dateparser-1.2.0 regex-2023.12.25 tzlocal-5.2

Dateparser pulls in regex, tzlocal, pytz, and python-dateutil as dependencies; pip only lists the packages that were not already installed, so your output may differ. The package bundles language data files for its multilingual support, which makes it larger than a typical date helper. Once installed, import dateparser and call dateparser.parse(). No configuration is needed for basic use.

Parsing Dates in Multiple Languages

Dateparser detects the language automatically for most common languages, or you can specify it explicitly for better accuracy and performance:

# multilingual_dates.py
import dateparser

multilingual = [
    ("French",  "il y a 3 jours"),
    ("Spanish", "hace 2 semanas"),
    ("German",  "vor 5 Stunden"),
    ("Italian", "domani"),
    ("Portuguese", "anteontem"),
    ("Russian", "3 дня назад"),
    ("Chinese", "3天前"),
]

for lang, date_str in multilingual:
    dt = dateparser.parse(date_str)
    result = dt.strftime("%Y-%m-%d") if dt else "parse failed"
    print(f"{lang:12s} | {date_str:25s} | {result}")

Output:

French       | il y a 3 jours            | 2024-01-12
Spanish      | hace 2 semanas            | 2024-01-01
German       | vor 5 Stunden             | 2024-01-15
Italian      | domani                    | 2024-01-16
Portuguese   | anteontem                 | 2024-01-13
Russian      | 3 дня назад               | 2024-01-12
Chinese      | 3天前                       | 2024-01-12

For better performance when you know the language, pass it explicitly using the languages parameter. This skips the language detection step and goes straight to parsing:

# explicit_language.py
import dateparser

# Explicit language is faster and more accurate
dt = dateparser.parse("il y a 2 heures", languages=["fr"])
print(dt.strftime("%Y-%m-%d %H:%M"))  # 2024-01-15 12:22

dt = dateparser.parse("hace 3 dias", languages=["es"])
print(dt.strftime("%Y-%m-%d"))  # 2024-01-12

Output:

2024-01-15 12:22
2024-01-12
200 locales, one function. Your scraper no longer cares what language the site uses.

Handling Timezones

Dateparser supports timezone-aware parsing. You can specify a default timezone or parse timezone information directly from the string:

# timezone_parsing.py
import dateparser

# Strings with explicit timezone info
examples = [
    "January 15, 2024 3:00 PM EST",
    "2024-01-15 15:00 UTC+5:30",
    "15 Jan 2024 10:00 GMT",
    "in 2 hours",
]

for s in examples:
    dt = dateparser.parse(s, settings={"RETURN_AS_TIMEZONE_AWARE": True})
    if dt:
        print(f"{s!r:35s} -> {dt.isoformat()}")

Output:

'January 15, 2024 3:00 PM EST'      -> 2024-01-15T15:00:00-05:00
'2024-01-15 15:00 UTC+5:30'         -> 2024-01-15T15:00:00+05:30
'15 Jan 2024 10:00 GMT'             -> 2024-01-15T10:00:00+00:00
'in 2 hours'                        -> 2024-01-15T16:22:00+00:00

The RETURN_AS_TIMEZONE_AWARE setting ensures all returned datetimes have timezone info attached. For strings with no timezone of their own (“in 2 hours”), dateparser falls back to the TIMEZONE setting, which defaults to the local system timezone; the output above comes from a machine running in UTC. You can force a specific timezone for naive strings using TIMEZONE in the settings dict: settings={"TIMEZONE": "US/Eastern"}.

Extracting Dates from Text

The dateparser.search.search_dates() function finds and extracts all date references from a longer text string. This is invaluable for processing news articles, emails, or forum posts:

# search_dates.py
from dateparser.search import search_dates

text = """
The product was released on March 5, 2024. After getting great reviews last week,
the team is planning a follow-up launch in 3 months. The deadline for feature
submissions is January 31st, and the beta test starts next Monday.
"""

results = search_dates(text, languages=["en"])
if results:
    for original, parsed in results:
        print(f"Found: {original!r:35s} -> {parsed.strftime('%Y-%m-%d')}")
else:
    print("No dates found")

Output:

Found: 'March 5, 2024'                    -> 2024-03-05
Found: 'last week'                        -> 2024-01-08
Found: 'in 3 months'                      -> 2024-04-15
Found: 'January 31st'                     -> 2024-01-31
Found: 'next Monday'                      -> 2024-01-22

Each result is a tuple of (original_string, datetime_object). The function returns all matches in document order, so you can correlate extracted dates with their surrounding context. Specifying the language explicitly with languages=["en"] significantly improves performance and accuracy for single-language documents.

Configuring dateparser Settings

Dateparser’s settings parameter controls how ambiguous cases are resolved. The most useful settings are date order preference, prefer past vs. future for relative dates, and strict parsing mode:

# dateparser_settings.py
import dateparser

ambiguous = "01/02/03"

# Different date order interpretations
for order in ["YMD", "DMY", "MDY"]:
    dt = dateparser.parse(ambiguous, settings={"DATE_ORDER": order})
    print(f"DATE_ORDER={order}: {dt.strftime('%Y-%m-%d') if dt else 'None'}")

print()

# Prefer future vs past when a bare weekday name is ambiguous
for prefer in ["future", "past"]:
    dt = dateparser.parse("Monday", settings={"PREFER_DATES_FROM": prefer})
    print(f"PREFER_DATES_FROM={prefer}: Monday -> {dt.strftime('%Y-%m-%d') if dt else 'None'}")

Output:

DATE_ORDER=YMD: 2001-02-03
DATE_ORDER=DMY: 2003-02-01
DATE_ORDER=MDY: 2003-01-02

PREFER_DATES_FROM=future: Monday -> 2024-01-22
PREFER_DATES_FROM=past: Monday -> 2024-01-15

The DATE_ORDER setting is critical for international data. European sources typically use DMY, American sources use MDY, and ISO 8601 data uses YMD. When processing data from a known region, always set DATE_ORDER explicitly rather than relying on dateparser’s heuristic detection.

01/02/03 is three different dates depending on who wrote it. dateparser lets you decide.

Real-Life Example: Normalizing Scraped Event Dates

Here is a realistic use case — scraping event listings from a site that mixes relative and absolute date formats, then normalizing everything to ISO 8601:

# normalize_event_dates.py
import dateparser

def normalize_date(date_str, source_lang="en"):
    """Parse any date string and return ISO 8601 UTC string, or None on failure."""
    if not date_str or not date_str.strip():
        return None
    dt = dateparser.parse(
        date_str.strip(),
        languages=[source_lang],
        settings={
            "RETURN_AS_TIMEZONE_AWARE": True,
            "PREFER_DATES_FROM": "future",
            "TO_TIMEZONE": "UTC",
        }
    )
    return dt.isoformat() if dt else None

# Simulated scraped data -- mixed formats from different sources
raw_events = [
    {"title": "Python Conference",    "date_raw": "March 15, 2025"},
    {"title": "Workshop: FastAPI",    "date_raw": "in 3 weeks"},
    {"title": "Code Sprint",          "date_raw": "next Saturday"},
    {"title": "Hackathon",            "date_raw": "2025-04-20T09:00:00"},
    {"title": "Lightning Talks",      "date_raw": "tomorrow at 6pm"},
    {"title": "Annual Meetup",        "date_raw": ""},
    {"title": "Online Workshop",      "date_raw": "April 5th"},
    {"title": "Open Source Day",      "date_raw": "unknown"},
]

normalized = []
failed = []

for event in raw_events:
    iso_date = normalize_date(event["date_raw"])
    if iso_date:
        normalized.append({**event, "date_iso": iso_date})
    else:
        failed.append(event)

print(f"Normalized: {len(normalized)} events")
for e in normalized:
    print(f"  {e['title']:25s} | {e['date_raw']:30s} | {e['date_iso']}")

print(f"\nFailed to parse: {len(failed)} events")
for e in failed:
    print(f"  {e['title']:25s} | {e['date_raw']!r}")

Output:

Normalized: 6 events
  Python Conference         | March 15, 2025                 | 2025-03-15T00:00:00+00:00
  Workshop: FastAPI         | in 3 weeks                     | 2024-02-05T14:22:00+00:00
  Code Sprint               | next Saturday                  | 2024-01-20T14:22:00+00:00
  Hackathon                 | 2025-04-20T09:00:00            | 2025-04-20T09:00:00+00:00
  Lightning Talks           | tomorrow at 6pm                | 2024-01-16T18:00:00+00:00
  Online Workshop           | April 5th                      | 2024-04-05T00:00:00+00:00

Failed to parse: 2 events
  Annual Meetup             | ''
  Open Source Day           | 'unknown'

The defensive if not date_str check handles empty strings before passing to dateparser, and a string with no date in it, such as “unknown”, comes back as None and lands in the failed list. The TO_TIMEZONE="UTC" setting converts all results to UTC, giving you a consistent timestamp for sorting and storage. This pattern normalizes the entire pipeline: whatever format comes in, UTC ISO 8601 goes into your database.

Frequently Asked Questions

Why does dateparser.parse() return None for some strings?

Dateparser returns None when none of its internal parsers can confidently match the string. Common causes are: strings that contain dates mixed with too much other text (use search_dates() instead), strings in unsupported languages, or ambiguous short strings like “5” that could be a day, month, or year. Always check the return value before calling .strftime() on it.

Is dateparser slow?

Yes, relative to strptime(). Dateparser tries multiple parsers in sequence and loads language data, so it can be roughly 10-100x slower than a direct strptime call. For processing millions of records, run dateparser on a sample to find out which formats actually occur, then use strptime with those formats for the bulk parse. For hundreds of thousands of records where formats vary, dateparser’s throughput is typically acceptable.
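A middle ground is to try a few cheap strptime formats first and only hand the leftovers to dateparser. This is a sketch under assumptions: the format list and the helper name parse_with_fast_path are hypothetical, and the fallback parameter exists only so the slow parser can be swapped out.

```python
from datetime import datetime

# Hypothetical formats that cover most records in this imaginary dataset.
FAST_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%d/%m/%Y")

def parse_with_fast_path(value, fallback=None):
    """Try cheap strptime formats first; use the slower fallback only on a miss."""
    for fmt in FAST_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            pass  # not this format, keep trying
    if fallback is None:
        # Imported lazily so the fast path never pays for loading dateparser.
        import dateparser
        fallback = dateparser.parse
    return fallback(value)

print(parse_with_fast_path("2024-01-15 10:30:00"))  # 2024-01-15 10:30:00
```

Strings like “3 days ago” miss every fast format and fall through to dateparser.parse(), so you keep the broad coverage while the common case stays cheap.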

How do I control the “now” reference point for relative dates?

Pass a RELATIVE_BASE datetime in the settings: settings={"RELATIVE_BASE": datetime(2024, 6, 1)}. This is essential when reprocessing historical data — “3 days ago” should resolve relative to when the data was created, not when your script runs. Without this setting, “3 days ago” always means 3 days before now.

What is the difference between dateparser and python-dateutil?

Dateutil’s parser.parse() handles many English date formats and is faster than dateparser. Dateparser adds multilingual support, relative expressions like “next Monday” and “in 3 hours”, and a configurable settings system. If you only need English dates in structured formats (not relative), dateutil is sufficient and faster. Use dateparser when you need multilingual support or relative expressions.

How do I handle dates where day and month are ambiguous (like 01/02/03)?

Set DATE_ORDER explicitly in the settings based on your data source. European sources typically use DMY, American MDY, and ISO data YMD. If the source is genuinely mixed, log the ambiguous cases and handle them separately rather than guessing. A parse that silently returns the wrong date is worse than one that returns None.

Conclusion

Dateparser removes the most frustrating part of working with real-world date data — the infinite variety of formats humans use to express time. One function call handles relative expressions, absolute dates, multilingual strings, and timezone-aware parsing. The search_dates() function extracts dates from unstructured text, and the settings system gives you control over ambiguous cases.

Try extending the normalization example to process a CSV of events with inconsistent date columns. Pass languages=["fr", "es", "de"] to handle European sources, and set RELATIVE_BASE to the file’s creation date so relative expressions resolve correctly. Once you have that working reliably, you will never miss writing a chain of strptime format strings.

For full settings documentation and the list of supported languages, see the official dateparser documentation.