
Your inbox is a goldmine of structured data — receipts, order confirmations, alerts, reports — and most developers are leaving it completely untouched. If you have ever wanted to automatically archive invoices from a particular sender, extract tracking numbers from shipping emails, or build a lightweight email-to-ticket pipeline, Python’s built-in imaplib module is what you need. No third-party packages, no complicated setup — just the IMAP protocol wrapped in a clean Python interface.

The imaplib module ships with Python’s standard library, so there is nothing to install. You will need an email account with IMAP access enabled and, for Gmail, an App Password (not your regular password). That is the only setup step. Everything else is Python.

This tutorial walks you through connecting to an IMAP server, authenticating securely, listing and selecting mailboxes, searching for messages by sender or date, fetching full email bodies, and parsing headers with the companion email module. By the end, you will have a working email reader that you can adapt for any automation task.

Reading Your Inbox: Quick Example

Here is the shortest path to reading your 5 most recent unread emails. This script connects, searches, fetches, and prints subject lines — all in about 20 lines:

# quick_email_reader.py
import imaplib
import email

IMAP_HOST = 'imap.gmail.com'
USERNAME  = 'you@gmail.com'
PASSWORD  = 'your-app-password'   # Gmail App Password, not account password

with imaplib.IMAP4_SSL(IMAP_HOST) as mail:
    mail.login(USERNAME, PASSWORD)
    mail.select('INBOX')

    _, uids = mail.search(None, 'UNSEEN')
    recent = uids[0].split()[-5:]          # message numbers of the 5 newest unread

    for uid in recent:
        _, data = mail.fetch(uid, '(RFC822)')
        msg = email.message_from_bytes(data[0][1])
        print(f"From:    {msg['From']}")
        print(f"Subject: {msg['Subject']}")
        print(f"Date:    {msg['Date']}")
        print('---')

Output:

From:    "GitHub" <noreply@github.com>
Subject: [python-project] Pull request merged
Date:    Wed, 13 May 2026 09:14:32 +0000
---
From:    orders@amazon.com
Subject: Your order has shipped
Date:    Wed, 13 May 2026 08:03:11 +0000
---

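The IMAP4_SSL context manager handles the secure connection and logs out automatically when the block exits. mail.search(None, 'UNSEEN') returns the message sequence numbers that match the IMAP search criterion; these are positions in the selected mailbox rather than persistent UIDs (the scripts here keep the variable name uids for brevity, and true UIDs come from mail.uid()). The email module then parses the raw RFC 822 message bytes into a navigable object with named header access. Note that fetching (RFC822) also marks each message as read on the server. The sections below break down each part of this pattern so you can customise it for your use case.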
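Sequence numbers shift whenever messages are expunged, so scripts that remember messages between runs usually want UIDs instead. The following is a minimal sketch of the UID variant, assuming `mail` is an already logged-in connection with a mailbox selected; the helper name `recent_unseen_headers` is hypothetical.

```python
# uid_reader.py
import email

def recent_unseen_headers(mail, limit=5):
    """Return (uid, From, Subject) for the newest unseen messages.

    `mail` is assumed to be a logged-in imaplib connection with a mailbox selected.
    """
    status, data = mail.uid('SEARCH', None, 'UNSEEN')
    if status != 'OK':
        raise RuntimeError(f"UID SEARCH failed: {data!r}")
    uids = data[0].split()[-limit:]
    results = []
    for uid in uids:
        # BODY.PEEK[HEADER] fetches headers only and leaves the \Seen flag untouched
        status, parts = mail.uid('FETCH', uid, '(BODY.PEEK[HEADER])')
        if status != 'OK' or not parts or not isinstance(parts[0], tuple):
            continue
        msg = email.message_from_bytes(parts[0][1])
        results.append((uid.decode(), msg['From'], msg['Subject']))
    return results
```

Inside the `with imaplib.IMAP4_SSL(...)` block you would call `recent_unseen_headers(mail)` after `mail.select('INBOX')`. Because it peeks at headers only, the messages stay unread.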

What Is IMAP and Why Use imaplib?

IMAP (Internet Message Access Protocol) is the standard protocol for reading email from a server. Unlike POP3, which downloads and deletes messages, IMAP keeps mail on the server and lets you query it without permanently removing it — making it safe to run scripts against your real inbox. Almost every email provider supports IMAP: Gmail, Outlook, Yahoo Mail, Fastmail, and most self-hosted mail servers.

Approach        | Best For                            | Setup Required              | Library
imaplib (IMAP)  | Reading, searching, archiving email | IMAP enabled + App Password | Built-in
smtplib (SMTP)  | Sending email only                  | SMTP credentials            | Built-in
Gmail API       | Full Gmail integration, OAuth       | Google Cloud project        | google-api-python-client
Microsoft Graph | Outlook/365 integration             | Azure app registration      | msal

imaplib wins when you want something that works with any provider, has zero dependencies, and gives you full control over search queries. The Gmail API is better if you need push notifications, label management, or you are already in the Google ecosystem and want OAuth instead of passwords.

Connecting and Authenticating

Before you can read any email, you need a live, authenticated IMAP connection. The two main connection classes are IMAP4 (plain-text, port 143) and IMAP4_SSL (TLS-encrypted, port 993). Always use IMAP4_SSL in production — never send credentials over an unencrypted connection.

# connect_imap.py
import imaplib

# Gmail IMAP settings
IMAP_HOST = 'imap.gmail.com'
IMAP_PORT = 993

mail = imaplib.IMAP4_SSL(IMAP_HOST, IMAP_PORT)

# Returns ('OK', [b'...capability string...'])
status, caps = mail.capability()
print(f"Status: {status}")
print(f"Server supports: {caps[0].decode()[:80]}...")

# Authenticate -- use App Password for Gmail (Settings > Security > App Passwords)
mail.login('you@gmail.com', 'your-app-password')
print("Logged in successfully")

mail.logout()

Output:

Status: OK
Server supports: CAPABILITY IMAP4rev1 UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN...
Logged in successfully

The capability() call tells you what extensions the server supports. Most of the time you do not need to inspect it, but it is useful for debugging connection problems. imaplib commands return a (status, data) tuple in which status is 'OK' on success and typically 'NO' when the server refuses the request; a failed login() and protocol-level 'BAD' responses raise imaplib.IMAP4.error instead. Check the status in production scripts before proceeding.
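One way to make the status check hard to forget is a small wrapper that either hands back the data or raises. This is only a sketch; `expect_ok` is a hypothetical helper name, not part of imaplib.

```python
# imap_check.py
import imaplib

def expect_ok(response, action):
    """Return the data part of an imaplib response, or raise if status is not 'OK'."""
    status, data = response
    if status != 'OK':
        raise imaplib.IMAP4.error(f"{action} failed: {status} {data!r}")
    return data
```

A call such as `data = expect_ok(mail.select('INBOX'), 'SELECT INBOX')` then fails loudly instead of letting a later `int(data[0])` blow up on an error message.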

Listing and Selecting Mailboxes

An IMAP account contains multiple mailboxes — INBOX, Sent, Drafts, Trash, and any custom folders or Gmail labels you have created. Before you can read messages, you must select a mailbox. Use list() to discover what is available:

# list_mailboxes.py
import imaplib

with imaplib.IMAP4_SSL('imap.gmail.com') as mail:
    mail.login('you@gmail.com', 'your-app-password')

    _, mailboxes = mail.list()
    for mb in mailboxes:
        print(mb.decode())

Output:

(\HasNoChildren) "/" "INBOX"
(\HasNoChildren) "/" "[Gmail]/Sent Mail"
(\HasNoChildren) "/" "[Gmail]/Drafts"
(\HasNoChildren) "/" "[Gmail]/Spam"
(\HasNoChildren) "/" "[Gmail]/Starred"
(\HasNoChildren) "/" "[Gmail]/All Mail"

Each line shows the mailbox flags, the hierarchy delimiter (/), and the mailbox name. Gmail prefixes its system folders with [Gmail]/. Once you know the name, select it with mail.select() before running any search() or fetch() calls. select() returns the number of messages in that mailbox:

# select_mailbox.py
import imaplib

with imaplib.IMAP4_SSL('imap.gmail.com') as mail:
    mail.login('you@gmail.com', 'your-app-password')

    status, data = mail.select('INBOX')
    count = int(data[0])
    print(f"INBOX has {count} messages")

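    # Names containing spaces must be double-quoted inside the string;
    # imaplib does not add the quotes for you
    status, data = mail.select('"[Gmail]/Sent Mail"')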
    count = int(data[0])
    print(f"Sent Mail has {count} messages")

Output:

INBOX has 1482 messages
Sent Mail has 247 messages
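The lines printed by list() can be split into their three fields, and the resulting name can be quoted before it is passed to select(). The sketch below assumes the simple quoted form shown in the listing above (it does not handle names sent as IMAP literals or decode modified UTF-7); both helper names are hypothetical.

```python
# parse_mailboxes.py
import re

# Matches lines such as: (\HasNoChildren) "/" "[Gmail]/Sent Mail"
LIST_LINE = re.compile(r'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.*)')

def parse_list_line(line):
    """Split one decoded LIST response line into (flags, delimiter, name)."""
    match = LIST_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognised LIST line: {line!r}")
    flags = match.group('flags').split()
    raw_delim = match.group('delim')
    delim = None if raw_delim == 'NIL' else raw_delim.strip('"')
    name = match.group('name').strip('"')
    return flags, delim, name

def quote_mailbox(name):
    """Wrap a mailbox name in double quotes so names with spaces survive select()."""
    escaped = name.replace('\\', '\\\\').replace('"', '\\"')
    return f'"{escaped}"'
```

With these in place, `mail.select(quote_mailbox(name))` works for both "INBOX" and "[Gmail]/Sent Mail".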

Searching for Messages

IMAP search criteria are powerful and composable. You can combine conditions like FROM, SUBJECT, SINCE, BEFORE, UNSEEN, and SEEN to build precise queries without downloading thousands of messages first.

# search_emails.py
import imaplib

with imaplib.IMAP4_SSL('imap.gmail.com') as mail:
    mail.login('you@gmail.com', 'your-app-password')
    mail.select('INBOX')

    # Search by sender
    _, uids = mail.search(None, 'FROM', '"github.com"')
    print(f"GitHub messages: {len(uids[0].split())}")

    # Search by subject keyword
    _, uids = mail.search(None, 'SUBJECT', '"invoice"')
    print(f"Invoice messages: {len(uids[0].split())}")

    # Combine: unread emails from a specific sender since a date
    _, uids = mail.search(None, '(UNSEEN FROM "noreply@github.com" SINCE "01-May-2026")')
    print(f"Unread GitHub since May 1: {len(uids[0].split())}")

    # All messages this month
    _, uids = mail.search(None, 'SINCE', '01-May-2026')
    print(f"Messages since May 1: {len(uids[0].split())}")

Output:

GitHub messages: 143
Invoice messages: 22
Unread GitHub since May 1: 8
Messages since May 1: 94

The first argument to search() is the character set (use None for US-ASCII, which covers most searches). The search criteria string follows IMAP search syntax — criteria are AND-ed together by default, and you wrap a group in parentheses to nest them. Dates must be in the DD-Mon-YYYY format like '13-May-2026'.
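Besides the implicit AND, IMAP search syntax has prefix operators OR (which takes exactly two search keys) and NOT. A rough sketch of composing them as strings follows; the helper names `all_of`, `any_of`, and `not_` are made up for illustration.

```python
# build_criteria.py

def all_of(*criteria):
    """AND: keys listed side by side inside parentheses must all match."""
    return '(' + ' '.join(criteria) + ')'

def any_of(first, *rest):
    """OR accepts exactly two keys, so longer lists are nested pairwise."""
    query = first
    for criterion in rest:
        query = f'OR ({query}) ({criterion})'
    return query

def not_(criterion):
    """Negate a single criterion."""
    return f'NOT ({criterion})'

query = all_of(
    'UNSEEN',
    any_of('FROM "github.com"', 'FROM "gitlab.com"'),
    'SINCE "01-May-2026"',
)
print(query)
```

The resulting string is passed as the second argument: `mail.search(None, query)`.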

Fetching and Parsing Email Content

Once you have a list of message numbers from search(), use fetch() to download the actual message. The fetch specification controls what parts you retrieve: RFC822 fetches the full raw message, while RFC822.HEADER fetches only headers (much faster when you only need subject and sender). A full RFC822 fetch also sets the \Seen flag; use BODY.PEEK[] instead if the message should stay unread.

# fetch_emails.py
import imaplib
import email
from email.header import decode_header

def decode_str(value):
    """Decode encoded email header values (handles UTF-8, base64, etc.)."""
    if value is None:
        return ''
    parts = decode_header(value)
    result = []
    for part, charset in parts:
        if isinstance(part, bytes):
            result.append(part.decode(charset or 'utf-8', errors='replace'))
        else:
            result.append(part)
    return ''.join(result)

with imaplib.IMAP4_SSL('imap.gmail.com') as mail:
    mail.login('you@gmail.com', 'your-app-password')
    mail.select('INBOX')

    _, uids = mail.search(None, 'UNSEEN')
    uid_list = uids[0].split()

    for uid in uid_list[:3]:          # process first 3 unread
        _, data = mail.fetch(uid, '(RFC822)')
        raw_email = data[0][1]
        msg = email.message_from_bytes(raw_email)

        subject = decode_str(msg['Subject'])
        sender  = decode_str(msg['From'])
        date    = msg['Date']

        print(f"From:    {sender}")
        print(f"Subject: {subject}")
        print(f"Date:    {date}")

        # Extract plain text body, decoding with the charset the part declares
        if msg.is_multipart():
            for part in msg.walk():
                if part.get_content_type() == 'text/plain':
                    charset = part.get_content_charset() or 'utf-8'
                    body = part.get_payload(decode=True).decode(charset, errors='replace')
                    print(f"Body:    {body[:120].strip()}...")
                    break
        else:
            charset = msg.get_content_charset() or 'utf-8'
            body = msg.get_payload(decode=True).decode(charset, errors='replace')
            print(f"Body:    {body[:120].strip()}...")
        print('---')

Output:

From:    GitHub <noreply@github.com>
Subject: [my-repo] New issue opened by user123
Date:    Wed, 13 May 2026 10:21:05 +0000
Body:    A new issue has been opened in your repository. View it at https://github.com/...
---
From:    "AWS Billing" <aws-billing@amazon.com>
Subject: Your AWS bill for April 2026 is now available
Date:    Wed, 13 May 2026 08:00:44 +0000
Body:    Your AWS monthly bill for April 2026 is now available. Total charges: $14.72...
---

The decode_header() call from email.header handles encoded subject lines like =?UTF-8?B?SGVsbG8gV29ybGQ=?= that appear when senders use non-ASCII characters. The is_multipart() check is essential — HTML emails bundle a plain text part and an HTML part together; msg.walk() lets you pick the one you want.
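The email package also offers a newer API that does the header decoding and part selection for you when a message is parsed with policy.default. The sketch below is an alternative to the manual approach, not a replacement the tutorial requires; it builds a sample message locally so it runs without a mail server, and `summarise` is a hypothetical helper name.

```python
# modern_parse.py
import email
from email import policy
from email.message import EmailMessage

def summarise(raw_bytes):
    """Parse raw message bytes and return (subject, plain-text body)."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    subject = msg['Subject'] or ''          # already decoded under policy.default
    part = msg.get_body(preferencelist=('plain',))
    body = part.get_content() if part is not None else ''
    return str(subject), body

# Build a small multipart message locally to exercise the parser
sample = EmailMessage()
sample['Subject'] = 'Héllo Wörld'
sample.set_content('Plain text version')
sample.add_alternative('<p>HTML version</p>', subtype='html')

subject, body = summarise(sample.as_bytes())
print(subject)
print(body.strip())
```

In a real script, `raw_bytes` would be the `data[0][1]` value returned by fetch().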

Marking, Moving, and Deleting Messages

Reading email is only half the job. Production scripts often need to mark messages as read, move them to a folder, or delete them after processing. IMAP handles all of this through flags and the copy() + store() pair:

# manage_emails.py
import imaplib

with imaplib.IMAP4_SSL('imap.gmail.com') as mail:
    mail.login('you@gmail.com', 'your-app-password')
    mail.select('INBOX')

    # Find messages from a specific sender
    _, uids = mail.search(None, 'FROM', '"invoices@acme.com"')
    uid_list = uids[0].split()
    print(f"Found {len(uid_list)} invoice emails")

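    archived = 0
    for uid in uid_list:
        # Mark as read (add the \Seen flag)
        mail.store(uid, '+FLAGS', '\\Seen')

        # Copy to an archive folder (the folder must already exist)
        status, _ = mail.copy(uid, 'Archive')
        if status != 'OK':
            print(f"Copy failed for message {uid.decode()}; leaving it in INBOX")
            continue

        # Flag the original for deletion only after a successful copy
        mail.store(uid, '+FLAGS', '\\Deleted')
        archived += 1

    # Permanently remove all messages flagged for deletion
    mail.expunge()
    print(f"Archived and expunged {archived} messages")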

Output:

Found 7 invoice emails
Archived and expunged 7 messages

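The store() method modifies IMAP flags. '+FLAGS' adds a flag, '-FLAGS' removes it. The standard flags are \Seen, \Deleted, \Answered, \Flagged, and \Draft; the backslash is doubled only inside Python string literals. Marking a message \Deleted does not remove it immediately. It only disappears when you call expunge(), which gives you a safety window to undo mistakes. Keep in mind that expunge() removes every message in the selected mailbox that carries \Deleted, not just the ones your script flagged.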
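The safety window can be used like this: before calling expunge(), search for messages that carry the deletion flag and clear it. This is a sketch under the assumption that the server has not already expunged them (some servers, Gmail among them depending on account settings, may expunge automatically); `undelete_all` is a hypothetical helper name.

```python
# undelete.py

def undelete_all(mail):
    """Clear the deletion flag from every flagged message in the selected mailbox.

    Returns the number of messages restored.
    """
    status, data = mail.search(None, 'DELETED')
    if status != 'OK':
        return 0
    restored = 0
    for num in data[0].split():
        status, _ = mail.store(num, '-FLAGS', '\\Deleted')
        if status == 'OK':
            restored += 1
    return restored
```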

Real-Life Example: Invoice Extractor

Let us build a practical tool that scans your inbox for invoice emails, extracts key details from each one, and saves them to a CSV file for accounting purposes. This is a pattern you can adapt for any structured-data extraction task.

# invoice_extractor.py
import imaplib
import email
from email.header import decode_header
import csv
import re

IMAP_HOST = 'imap.gmail.com'
USERNAME  = 'you@gmail.com'
PASSWORD  = 'your-app-password'

def decode_str(value):
    if value is None:
        return ''
    parts = decode_header(value)
    result = []
    for part, charset in parts:
        if isinstance(part, bytes):
            result.append(part.decode(charset or 'utf-8', errors='replace'))
        else:
            result.append(part)
    return ''.join(result)

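def get_body(msg):
    """Return the text/plain body, or the sole payload of a single-part message."""
    if msg.is_multipart():
        for part in msg.walk():
            if part.get_content_type() == 'text/plain':
                return part.get_payload(decode=True).decode('utf-8', errors='replace')
        return ''   # multipart message with no plain-text part (e.g. HTML only)
    return msg.get_payload(decode=True).decode('utf-8', errors='replace')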

def extract_amount(body):
    # Look for dollar amounts such as "$14.72" or "$1,204.00"
    match = re.search(r'\$[\d,]+\.\d{2}', body)
    return match.group(0) if match else 'Unknown'

invoices = []

with imaplib.IMAP4_SSL(IMAP_HOST) as mail:
    mail.login(USERNAME, PASSWORD)
    mail.select('INBOX')

    # Search for likely invoice emails since a fixed date (adjust to your reporting period)
    _, uids = mail.search(None, '(SUBJECT "invoice" SINCE "01-Apr-2026")')
    uid_list = uids[0].split()
    print(f"Found {len(uid_list)} potential invoice emails...")

    for uid in uid_list:
        _, data = mail.fetch(uid, '(RFC822)')
        msg = email.message_from_bytes(data[0][1])

        subject = decode_str(msg['Subject'])
        sender  = decode_str(msg['From'])
        date    = msg['Date']
        body    = get_body(msg)
        amount  = extract_amount(body)

        invoices.append({
            'date':    date,
            'from':    sender,
            'subject': subject,
            'amount':  amount,
        })
        # Mark as read once processed
        mail.store(uid, '+FLAGS', '\\Seen')

with open('invoices.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['date','from','subject','amount'])
    writer.writeheader()
    writer.writerows(invoices)

print(f"Saved {len(invoices)} invoices to invoices.csv")

Output:

Found 12 potential invoice emails...
Saved 12 invoices to invoices.csv

This script ties together everything from the tutorial: SSL connection, search by subject and date, full message fetch, multipart body extraction, regex parsing, and flag management. You can extend it by adding PDF attachment extraction with the email module’s get_payload(decode=True) on application/pdf parts, or by pushing rows directly to a Google Sheet instead of a CSV file.

Frequently Asked Questions

Why can’t I log in with my Gmail password?

Google requires an App Password for IMAP password logins, and App Passwords are only available once 2-Step Verification is turned on. Go to your Google Account > Security > 2-Step Verification > App Passwords (the exact menu labels change from time to time), create a password for this script, and use that 16-character string instead of your regular password. Never store credentials in source code; use environment variables or a secrets manager.
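Reading the credentials from environment variables might look like the sketch below. The variable names IMAP_USERNAME and IMAP_PASSWORD are illustrative choices, not a convention that imaplib knows about.

```python
# credentials.py
import os

def load_credentials(env=None):
    """Read IMAP credentials from environment variables (names are illustrative)."""
    env = os.environ if env is None else env
    try:
        return env['IMAP_USERNAME'], env['IMAP_PASSWORD']
    except KeyError as missing:
        raise RuntimeError(f"Set the {missing.args[0]} environment variable") from None
```

The scripts in this tutorial could then start with `USERNAME, PASSWORD = load_credentials()` in place of the hard-coded strings.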

How do I search for emails within a date range?

Combine SINCE and BEFORE in your search string. Dates must be in the DD-Mon-YYYY format: mail.search(None, '(SINCE "01-May-2026" BEFORE "13-May-2026")'). Note that SINCE is inclusive but BEFORE is exclusive, so the example above returns May 1 through May 12.
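Because BEFORE is exclusive, an inclusive range needs the upper bound pushed forward by one day. A small sketch of that arithmetic follows; `date_range_criteria` is a hypothetical helper, and the month names are hard-coded so the output does not depend on the system locale.

```python
# date_range.py
from datetime import date, timedelta

MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

def imap_date(d):
    """Format a date as DD-Mon-YYYY, the form IMAP search expects."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year}"

def date_range_criteria(first, last):
    """Criteria matching messages dated from `first` to `last`, both inclusive."""
    # BEFORE is exclusive, so use the day after `last` as the upper bound
    upper = last + timedelta(days=1)
    return f'(SINCE "{imap_date(first)}" BEFORE "{imap_date(upper)}")'

print(date_range_criteria(date(2026, 5, 1), date(2026, 5, 12)))
# prints (SINCE "01-May-2026" BEFORE "13-May-2026")
```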

How do I save email attachments?

Walk the message parts and check part.get_content_disposition() == 'attachment'. Then call part.get_payload(decode=True) to get the raw bytes, and write them to a file using the filename from part.get_filename(). Always sanitise the filename with os.path.basename() before writing to disk to prevent directory traversal attacks.
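Those steps can be sketched as a single function. The example builds a message with a fake PDF attachment locally so it can run without a mail server; `save_attachments` is a hypothetical helper name, and the sample filename deliberately contains a `../` prefix to show the sanitising step.

```python
# save_attachments.py
import os
import tempfile
from email.message import EmailMessage

def save_attachments(msg, directory):
    """Write each attachment in `msg` to `directory`; return the saved paths."""
    os.makedirs(directory, exist_ok=True)
    saved = []
    for part in msg.walk():
        if part.get_content_disposition() != 'attachment':
            continue
        # basename() drops any directory components a sender may have embedded
        filename = os.path.basename(part.get_filename() or '')
        payload = part.get_payload(decode=True)
        if not filename or payload is None:
            continue
        path = os.path.join(directory, filename)
        with open(path, 'wb') as handle:
            handle.write(payload)
        saved.append(path)
    return saved

# Local demonstration with a generated message
sample = EmailMessage()
sample['Subject'] = 'Invoice'
sample.set_content('See attached.')
sample.add_attachment(b'%PDF-1.4 demo', maintype='application', subtype='pdf',
                      filename='../invoice.pdf')

with tempfile.TemporaryDirectory() as tmp:
    paths = save_attachments(sample, tmp)
    saved_names = [os.path.basename(p) for p in paths]
    with open(paths[0], 'rb') as handle:
        saved_bytes = handle.read()
print(saved_names)
```

Note that this sketch overwrites files with the same name; a production version would probably add a unique prefix such as the message UID.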

Can imaplib watch for new emails in real time?

Before Python 3.14, the imaplib module had no built-in support for IMAP IDLE (push notifications); Python 3.14 added an IMAP4.idle() method. On older versions, use the imapclient third-party library, which wraps imaplib and adds IDLE support. Alternatively, poll every 60 seconds using a while True loop with time.sleep(60), which is less efficient but works with any server.
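The polling fallback could be sketched as follows, assuming `mail` is a logged-in connection with a mailbox selected. The function name `poll_unseen` and its `max_polls` and `sleep` parameters are hypothetical; the latter two exist mainly so the loop can be stopped and tested.

```python
# poll_inbox.py
import time

def poll_unseen(mail, handle, interval=60, max_polls=None, sleep=time.sleep):
    """Call `handle(uid)` once for each unseen message, checking every `interval` seconds."""
    already_handled = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        mail.noop()                       # lets the server report mailbox changes
        status, data = mail.uid('SEARCH', None, 'UNSEEN')
        if status == 'OK':
            for uid in data[0].split():
                if uid not in already_handled:
                    already_handled.add(uid)
                    handle(uid)
        polls += 1
        sleep(interval)
    return already_handled
```

With the defaults (`max_polls=None`), the loop runs until the process is interrupted.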

How do I handle connection drops mid-batch?

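Wrap your main loop in a try/except that catches imaplib.IMAP4.abort and imaplib.IMAP4.error (abort is a subclass of error, so list it first if you handle the two differently). On abort, re-connect and re-select the mailbox, then resume from the last processed UID. This only works if you address messages with mail.uid() commands, because plain sequence numbers can change between sessions. Store processed UIDs in a set so you do not reprocess messages on reconnect. Long-running scripts should also send a mail.noop() every few minutes to keep the connection alive.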
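A rough sketch of that retry pattern is shown below. It assumes the caller supplies `connect`, a function that returns a logged-in connection with the mailbox already selected, and `handle(mail, uid)`, which does the per-message work; all names here are hypothetical.

```python
# resilient_batch.py
import imaplib

def process_with_reconnect(connect, uids, handle, max_retries=3):
    """Process each UID once, opening a fresh connection if the current one aborts."""
    done = set()
    retries = 0
    mail = connect()
    pending = list(uids)
    while pending:
        uid = pending[0]
        try:
            handle(mail, uid)
        except imaplib.IMAP4.abort:
            retries += 1
            if retries > max_retries:
                raise
            mail = connect()              # reconnect, then retry the same UID
            continue
        done.add(uid)
        pending.pop(0)
    return done
```

Other imaplib.IMAP4.error exceptions are left to propagate here, since they usually signal a problem that reconnecting will not fix.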

Conclusion

Python’s imaplib module gives you direct, scriptable access to any IMAP email account without needing any third-party packages. In this tutorial, you connected over TLS with IMAP4_SSL, listed and selected mailboxes, searched with criteria like UNSEEN, FROM, SUBJECT, and SINCE, fetched full messages and parsed headers with the email module’s decode_header(), and managed messages with store() and expunge().

The invoice extractor project shows how these primitives combine into a real automation tool. Extend it further by adding PDF attachment parsing, pushing data to a database, or sending notifications by email with smtplib or to a chat tool such as Slack. The IMAP protocol is remarkably capable once you learn the query syntax.

For deeper reading, the official Python documentation covers every method and flag in detail: imaplib — IMAP4 protocol client. Pair it with the email package documentation for full parsing power.