Intermediate

You have a hot-reload development server, a log aggregation pipeline, or a file-sync tool — and you need to react the instant a file changes on disk. Polling with os.listdir() every few seconds works but wastes CPU and introduces latency. Python’s watchdog library solves this by hooking into the operating system’s native filesystem event APIs: inotify on Linux, FSEvents on macOS, ReadDirectoryChangesW on Windows. You get real-time, low-overhead change notifications delivered directly to your Python code.

Watchdog is a cross-platform filesystem monitoring library. You define an event handler — a class that responds to specific events like file created, modified, deleted, or moved — and attach it to an observer that watches a directory. The observer runs in a background thread, calling your handler the moment the OS reports a change. No polling, no busy-waiting.

In this article you will learn how to install watchdog, handle the four core filesystem event types, filter events by file pattern, watch directories recursively, build a debounced handler for editors that trigger multiple events on save, and build a real-world file-sync tool. By the end you will be able to react to filesystem changes in real time in any Python application.

Watchdog Quick Example

Here is the minimal watchdog setup — watch a directory and print every event:

# watchdog_quick.py
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import time

class PrintHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        if not event.is_directory:
            print(f"Event: {event.event_type} | Path: {event.src_path}")

observer = Observer()
observer.schedule(PrintHandler(), path=".", recursive=False)
observer.start()
print("Watching current directory. Press Ctrl+C to stop.")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()

Output (when you create/edit a file in the watched directory):

Watching current directory. Press Ctrl+C to stop.
Event: created | Path: ./notes.txt
Event: modified | Path: ./notes.txt
Event: modified | Path: ./notes.txt

The on_any_event method catches every filesystem event. The observer.schedule() call connects the handler to a directory path. With recursive=False, only the top-level directory is watched — set it to True to watch all subdirectories as well. Keep reading to see how to handle specific event types and filter by file pattern.

What Is Watchdog and How Does It Work?

Watchdog is a Python wrapper around three operating system APIs for filesystem events: inotify (Linux), FSEvents (macOS), and ReadDirectoryChangesW (Windows). Each API allows user-space programs to subscribe to filesystem events without polling. Watchdog abstracts these platform differences into a single Python interface.

Event Type | When It Fires                      | Handler Method
-----------|------------------------------------|-------------------
created    | New file or directory created      | on_created(event)
modified   | File content or metadata changed   | on_modified(event)
deleted    | File or directory removed          | on_deleted(event)
moved      | File or directory renamed or moved | on_moved(event)
Watchdog tutorial 1
inotify says a file changed. Watchdog says which one, what happened, and when.

Installing Watchdog

# terminal
pip install watchdog

Verify the install:

# verify_watchdog.py
import watchdog
print(watchdog.__version__)

Output:

4.0.1

Handling Specific Event Types

Subclass FileSystemEventHandler and override the specific methods you need. Each method receives an event object with src_path, is_directory, and (for move events) dest_path:

# watchdog_events.py
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
from datetime import datetime
import time

class DetailedHandler(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory:
            print(f"[{datetime.now():%H:%M:%S}] CREATED: {event.src_path}")

    def on_modified(self, event):
        if not event.is_directory:
            print(f"[{datetime.now():%H:%M:%S}] MODIFIED: {event.src_path}")

    def on_deleted(self, event):
        label = "DIR" if event.is_directory else "FILE"
        print(f"[{datetime.now():%H:%M:%S}] DELETED ({label}): {event.src_path}")

    def on_moved(self, event):
        print(f"[{datetime.now():%H:%M:%S}] MOVED: {event.src_path} -> {event.dest_path}")

observer = Observer()
observer.schedule(DetailedHandler(), path="./watched_dir", recursive=True)
observer.start()
print("Watching ./watched_dir recursively...")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()

Output:

Watching ./watched_dir recursively...
[09:30:15] CREATED: ./watched_dir/report.csv
[09:30:18] MODIFIED: ./watched_dir/report.csv
[09:30:22] MOVED: ./watched_dir/report.csv -> ./watched_dir/archive/report_2026.csv
[09:31:00] DELETED (FILE): ./watched_dir/old_log.txt

Always check event.is_directory before acting on events — many editors and build tools create temporary directories while saving or building, and treating those directory events as file events causes spurious actions in your pipeline.

Filtering by File Pattern

Use PatternMatchingEventHandler to restrict events to specific file types. This keeps extension-checking boilerplate out of your handler methods:

# watchdog_patterns.py
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
import time

class PythonFileHandler(PatternMatchingEventHandler):
    def __init__(self):
        super().__init__(
            patterns=["*.py", "*.pyi"],
            ignore_patterns=["*.pyc", "*__pycache__*"],
            ignore_directories=True,
            case_sensitive=False,
        )

    def on_modified(self, event):
        print(f"Python file changed: {event.src_path}")
        # Trigger linting, testing, hot-reload, etc.

    def on_created(self, event):
        print(f"New Python file: {event.src_path}")

observer = Observer()
observer.schedule(PythonFileHandler(), path="./src", recursive=True)
observer.start()
print("Watching ./src for Python file changes...")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()

The patterns list uses shell-style wildcards. The ignore_patterns list filters out events you never want to see — always ignore *.pyc and __pycache__ when watching Python source. The case_sensitive=False parameter matters on Windows where filenames are case-insensitive by default.
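The wildcard semantics are the familiar shell-glob rules, like those in Python's fnmatch module. The sketch below reimplements the select-then-ignore interaction of the two lists with fnmatch directly, so you can see which paths survive filtering (matches is an illustrative helper, not watchdog API):

```python
import fnmatch

paths = [
    "src/app.py",
    "src/app.pyc",
    "src/types.pyi",
    "src/__pycache__/app.cpython-312.pyc",
]
patterns = ["*.py", "*.pyi"]
ignore_patterns = ["*.pyc", "*__pycache__*"]

def matches(path):
    # Selected if any pattern matches AND no ignore pattern matches.
    wanted = any(fnmatch.fnmatch(path, p) for p in patterns)
    ignored = any(fnmatch.fnmatch(path, p) for p in ignore_patterns)
    return wanted and not ignored

for p in paths:
    print(f"{p} -> {matches(p)}")
```

Only app.py and types.pyi pass: the compiled .pyc file never matches the select patterns, and anything under __pycache__ is excluded by the ignore list.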

Watchdog tutorial 2
patterns=["*.py"], ignore_patterns=["*.pyc"]. Signal separated from noise.

Debouncing: Handling Multiple Events on Save

Most text editors trigger 2-4 events per save (write temp file, rename, modify original). If your handler triggers a long operation like running tests, you want to wait until the burst of events settles before acting. This is called debouncing:

# watchdog_debounce.py
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
import threading
import time

class DebouncedHandler(PatternMatchingEventHandler):
    def __init__(self, debounce_seconds=0.5):
        super().__init__(patterns=["*.py"], ignore_directories=True)
        self.debounce_seconds = debounce_seconds
        self._timer = None
        self._last_path = None

    def on_modified(self, event):
        self._last_path = event.src_path
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.debounce_seconds, self._run_action)
        self._timer.start()

    def on_created(self, event):
        self.on_modified(event)

    def _run_action(self):
        print(f"Running action for: {self._last_path}")
        # Run your linter, test suite, or hot-reload here

observer = Observer()
observer.schedule(DebouncedHandler(debounce_seconds=0.5), path="./src", recursive=True)
observer.start()
print("Watching ./src with 500ms debounce...")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()

The debounce pattern uses a threading.Timer: every new event cancels the previous timer and starts a fresh one. The action only fires once the timer expires without being cancelled — meaning no events occurred for at least 0.5 seconds. This is the standard approach used by development servers and live-reloaders.
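The same timer dance works without watchdog at all. This stdlib-only sketch simulates a burst of five rapid events and shows the debounced action firing exactly once, with the last payload (the Debouncer class and its names are illustrative, not part of watchdog):

```python
import threading
import time

class Debouncer:
    """Illustrative debouncer: every call cancels and restarts the timer."""
    def __init__(self, wait, action):
        self.wait = wait
        self.action = action
        self._timer = None
        self._lock = threading.Lock()

    def call(self, payload):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.wait, self.action, args=(payload,))
            self._timer.start()

fired = []
d = Debouncer(0.2, fired.append)
for i in range(5):           # simulate a burst of five rapid events
    d.call(f"event-{i}")
    time.sleep(0.01)
time.sleep(0.5)              # let the final timer expire
print(fired)                 # only the last event survives: ['event-4']
```

Note the lock around the timer state: watchdog invokes handler methods from its observer thread, so any debouncer shared across threads should guard the cancel-and-restart sequence.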

Real-Life Example: Automatic Log File Processor

Watchdog tutorial 3
Log files appear. They get processed. Nobody had to write a cron job.

Here is a complete log file processor that watches an input directory, processes new log files as they arrive, and moves them to an archive folder:

# log_processor.py
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
from pathlib import Path
from datetime import datetime
import time
import shutil

INPUT_DIR = Path("./logs/incoming")
PROCESSED_DIR = Path("./logs/processed")
ERROR_DIR = Path("./logs/errors")

# Ensure directories exist
for d in [INPUT_DIR, PROCESSED_DIR, ERROR_DIR]:
    d.mkdir(parents=True, exist_ok=True)

def parse_log_line(line):
    parts = line.strip().split(" ", 3)
    if len(parts) >= 4:
        return {"date": parts[0], "time": parts[1], "level": parts[2], "message": parts[3]}
    return None

def process_log_file(filepath):
    path = Path(filepath)
    errors = []
    warnings = []
    total = 0

    try:
        with open(path, "r", encoding="utf-8") as f:
            for line in f:
                total += 1
                parsed = parse_log_line(line)
                if parsed:
                    if parsed["level"] == "ERROR":
                        errors.append(parsed["message"])
                    elif parsed["level"] == "WARNING":
                        warnings.append(parsed["message"])

        print(f"Processed {path.name}: {total} lines, {len(errors)} errors, {len(warnings)} warnings")

        if errors:
            print("  Errors found:")
            for err in errors[:3]:
                print(f"    - {err}")

        # Move to processed dir with timestamp
        ts = datetime.now().strftime("%Y%m%d_%H%M%S")
        dest = PROCESSED_DIR / f"{ts}_{path.name}"
        shutil.move(str(path), str(dest))
        print(f"  Moved to: {dest}")
        return True

    except Exception as e:
        print(f"Failed to process {path.name}: {e}")
        error_dest = ERROR_DIR / path.name
        shutil.move(str(path), str(error_dest))
        return False


class LogFileHandler(PatternMatchingEventHandler):
    def __init__(self):
        super().__init__(
            patterns=["*.log", "*.txt"],
            ignore_directories=True
        )

    def on_created(self, event):
        # Wait briefly for the file write to complete
        time.sleep(0.2)
        print(f"\nNew log file detected: {event.src_path}")
        process_log_file(event.src_path)


observer = Observer()
observer.schedule(LogFileHandler(), path=str(INPUT_DIR), recursive=False)
observer.start()
print(f"Log processor watching: {INPUT_DIR}")
print(f"Processed files go to: {PROCESSED_DIR}\n")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
print("Log processor stopped.")

Output (when a log file is dropped into the input directory):

Log processor watching: ./logs/incoming
Processed files go to: ./logs/processed

New log file detected: ./logs/incoming/app_2026_05_03.log
Processed app_2026_05_03.log: 1247 lines, 3 errors, 12 warnings
  Errors found:
    - Database connection timeout after 30s
    - Failed to parse response from payment API
    - Disk usage exceeded 90% threshold
  Moved to: ./logs/processed/20260503_091534_app_2026_05_03.log

The time.sleep(0.2) in on_created is a practical necessity — the OS fires the created event as soon as the file appears, often before its contents are fully written. The brief pause gives the writer time to finish. For large files, use a loop that polls file size until it stops growing, or use the on_modified event with debouncing instead.
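One way to implement the size-polling fallback for large files is a helper that waits until the size is unchanged between two samples (a heuristic sketch — wait_until_stable is a hypothetical name, and a writer that pauses longer than interval between writes can still fool it):

```python
import os
import time

def wait_until_stable(path, interval=0.2, timeout=10.0):
    """Return True once the file size stops changing between samples."""
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file may not exist yet or was moved away
        if size >= 0 and size == last_size:
            return True
        last_size = size
        time.sleep(interval)
    return False  # gave up: the file never settled within the timeout
```

In the log processor above, you could call this at the top of on_created, in place of the fixed sleep, before handing the path to process_log_file.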

Watchdog tutorial 4
on_any_event fired 47 times. The debouncer fired once.

Frequently Asked Questions

How many files can watchdog monitor without performance issues?

Watchdog relies on the OS kernel’s inotify/FSEvents API, which is very efficient. On Linux, the default inotify limit is 8,192 watches per user. For large directory trees with thousands of subdirectories, you may need to increase this: echo 524288 | sudo tee /proc/sys/fs/inotify/max_user_watches. On macOS and Windows, FSEvents and ReadDirectoryChangesW have different (generally higher) limits but the same principle applies — monitoring tens of thousands of files is fine; millions requires tuning.

Does watchdog work on network drives or NFS mounts?

Not reliably. inotify and FSEvents only work on local filesystems — they do not receive events for changes made on remote systems. For network drives, fall back to polling: use watchdog.observers.polling.PollingObserver instead of Observer. The API is identical but it polls the filesystem at regular intervals. You lose the real-time benefit but gain cross-filesystem compatibility.

How do I watch multiple directories with the same observer?

Call observer.schedule() multiple times: observer.schedule(handler, path="/dir1") and observer.schedule(handler, path="/dir2"). You can use the same handler instance for both or different handler instances. The observer runs all scheduled watches in a single background thread, so avoid long-running operations in event handlers — offload heavy work to a thread pool or queue instead.

Does watchdog follow symlinks into other directories?

By default, no. Even with observer.schedule(handler, path, recursive=True), watchdog does not follow symlinks automatically. If your directory contains symlinks to other directories and you want events from those paths, you need to watch them separately by resolving the symlink targets with Path.resolve() and scheduling them individually.
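A sketch of that workaround: collect the watch root plus the resolved targets of any directory symlinks inside it, then schedule each path separately (expand_watch_paths is a made-up helper name; the observer.schedule() calls themselves are omitted):

```python
from pathlib import Path

def expand_watch_paths(root):
    """Return the root plus resolved targets of directory symlinks inside it."""
    root = Path(root)
    targets = [root]
    for entry in root.iterdir():
        if entry.is_symlink() and entry.is_dir():
            targets.append(entry.resolve())
    return targets

# Each returned path would then get its own observer.schedule(...) call.
```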

How do I safely read a file that is still being written?

The safest pattern: listen for on_closed events instead of on_modified. The on_closed event fires after the file writer closes the file descriptor, so the write is complete — note that it is available in watchdog 2.x+ and only emitted by backends that report close events, notably inotify on Linux. If you must use on_modified, poll the file size in a loop with a short sleep until it stops changing, then read it.

Conclusion

Watchdog turns filesystem events into Python callbacks with a clean, cross-platform API. The core pattern — create a handler subclass, schedule it with an observer, start the observer in a background thread — works the same on Linux, macOS, and Windows without any changes. Use PatternMatchingEventHandler to filter by file type, use debouncing for editor compatibility, and use the PollingObserver for network drives.

The log file processor example shows a complete real-world use case: detect new files, process them immediately, move them to an archive. Extend it with a work queue (using Python’s queue.Queue) to handle concurrent file arrivals safely, or add a Flask endpoint to report processing statistics.
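The work-queue extension can be sketched with the stdlib alone. A single worker thread drains a queue.Queue, so an event handler only ever enqueues a path and returns immediately (the worker function and sentinel shutdown shown here are one common shape, not a watchdog API):

```python
import queue
import threading

work_q = queue.Queue()
results = []

def worker():
    # Drain paths until the None sentinel arrives.
    while True:
        path = work_q.get()
        if path is None:
            work_q.task_done()
            break
        results.append(f"processed {path}")  # stand-in for process_log_file(path)
        work_q.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# In a handler, on_created would simply call work_q.put(event.src_path).
for p in ["a.log", "b.log", "c.log"]:
    work_q.put(p)

work_q.join()     # wait for all queued paths to be processed
work_q.put(None)  # ask the worker to exit
t.join()
print(results)
```

Because the handler never blocks, the observer thread stays responsive even when processing a single file is slow; the queue absorbs bursts of concurrent arrivals.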

For installation options, the full event API, and platform-specific notes, see the Watchdog documentation.