You have a hot-reload development server, a log aggregation pipeline, or a file-sync tool — and you need to react the instant a file changes on disk. Polling with os.listdir() every few seconds works but wastes CPU and introduces latency. Python’s watchdog library solves this by hooking into the operating system’s native filesystem event APIs: inotify on Linux, FSEvents on macOS, ReadDirectoryChangesW on Windows. You get real-time, low-overhead change notifications delivered directly to your Python code.
Watchdog is a cross-platform filesystem monitoring library. You define an event handler — a class that responds to specific events like file created, modified, deleted, or moved — and attach it to an observer that watches a directory. The observer runs in a background thread, calling your handler the moment the OS reports a change. No polling, no busy-waiting.
In this article you will learn how to install watchdog, handle the four core filesystem event types, filter events by file pattern, watch directories recursively, build a debounced handler for editors that trigger multiple events on save, and build a real-world file-sync tool. By the end you will be able to react to filesystem changes in real time in any Python application.
Watchdog Quick Example
Here is the minimal watchdog setup — watch a directory and print every event:
# watchdog_quick.py
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import time

class PrintHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        if not event.is_directory:
            print(f"Event: {event.event_type} | Path: {event.src_path}")

observer = Observer()
observer.schedule(PrintHandler(), path=".", recursive=False)
observer.start()
print("Watching current directory. Press Ctrl+C to stop.")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
Output (when you create/edit a file in the watched directory):
Watching current directory. Press Ctrl+C to stop.
Event: created | Path: ./notes.txt
Event: modified | Path: ./notes.txt
Event: modified | Path: ./notes.txt
The on_any_event method catches every filesystem event. The observer.schedule() call connects the handler to a directory path. With recursive=False, only the top-level directory is watched — set it to True to watch all subdirectories as well. Keep reading to see how to handle specific event types and filter by file pattern.
What Is Watchdog and How Does It Work?
Watchdog is a Python wrapper around three operating system APIs for filesystem events: inotify (Linux), FSEvents (macOS), and ReadDirectoryChangesW (Windows). Each API allows user-space programs to subscribe to filesystem events without polling. Watchdog abstracts these platform differences into a single Python interface.
| Event Type | When It Fires | Handler Method |
|---|---|---|
| created | New file or directory created | on_created(event) |
| modified | File content or metadata changed | on_modified(event) |
| deleted | File or directory removed | on_deleted(event) |
| moved | File or directory renamed/moved | on_moved(event) |
Installing Watchdog
# terminal
pip install watchdog
Verify the install:
# verify_watchdog.py
import watchdog
print(watchdog.__version__)
Output:
4.0.1
Handling Specific Event Types
Subclass FileSystemEventHandler and override the specific methods you need. Each method receives an event object with src_path, is_directory, and (for move events) dest_path:
# watchdog_events.py
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
from datetime import datetime
import time

class DetailedHandler(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory:
            print(f"[{datetime.now():%H:%M:%S}] CREATED: {event.src_path}")

    def on_modified(self, event):
        if not event.is_directory:
            print(f"[{datetime.now():%H:%M:%S}] MODIFIED: {event.src_path}")

    def on_deleted(self, event):
        label = "DIR" if event.is_directory else "FILE"
        print(f"[{datetime.now():%H:%M:%S}] DELETED ({label}): {event.src_path}")

    def on_moved(self, event):
        print(f"[{datetime.now():%H:%M:%S}] MOVED: {event.src_path} -> {event.dest_path}")

observer = Observer()
observer.schedule(DetailedHandler(), path="./watched_dir", recursive=True)
observer.start()
print("Watching ./watched_dir recursively...")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
Output:
Watching ./watched_dir recursively...
[09:30:15] CREATED: ./watched_dir/report.csv
[09:30:18] MODIFIED: ./watched_dir/report.csv
[09:30:22] MOVED: ./watched_dir/report.csv -> ./watched_dir/archive/report_2026.csv
[09:31:00] DELETED (FILE): ./watched_dir/old_log.txt
Always check event.is_directory before acting on events — the OS also reports modified events on a directory whenever its contents change, and treating those directory events as file events causes spurious actions in your pipeline.
Filtering by File Pattern
Use PatternMatchingEventHandler to restrict events to specific file types. This keeps the filtering declarative instead of scattering file-extension checks through your handler code:
# watchdog_patterns.py
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
import time

class PythonFileHandler(PatternMatchingEventHandler):
    def __init__(self):
        super().__init__(
            patterns=["*.py", "*.pyi"],
            ignore_patterns=["*.pyc", "*__pycache__*"],
            ignore_directories=True,
            case_sensitive=False,
        )

    def on_modified(self, event):
        print(f"Python file changed: {event.src_path}")
        # Trigger linting, testing, hot-reload, etc.

    def on_created(self, event):
        print(f"New Python file: {event.src_path}")

observer = Observer()
observer.schedule(PythonFileHandler(), path="./src", recursive=True)
observer.start()
print("Watching ./src for Python file changes...")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
The patterns list uses shell-style wildcards. The ignore_patterns list filters out events you never want to see — always ignore *.pyc and __pycache__ when watching Python source. The case_sensitive=False parameter matters on Windows where filenames are case-insensitive by default.
Debouncing: Handling Multiple Events on Save
Most text editors trigger 2-4 events per save (write temp file, rename, modify original). If your handler triggers a long operation like running tests, you want to wait until the burst of events settles before acting. This is called debouncing:
# watchdog_debounce.py
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
import threading
import time

class DebouncedHandler(PatternMatchingEventHandler):
    def __init__(self, debounce_seconds=0.5):
        super().__init__(patterns=["*.py"], ignore_directories=True)
        self.debounce_seconds = debounce_seconds
        self._timer = None
        self._last_path = None

    def on_modified(self, event):
        self._last_path = event.src_path
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.debounce_seconds, self._run_action)
        self._timer.start()

    def on_created(self, event):
        self.on_modified(event)

    def _run_action(self):
        print(f"Running action for: {self._last_path}")
        # Run your linter, test suite, or hot-reload here

observer = Observer()
observer.schedule(DebouncedHandler(debounce_seconds=0.5), path="./src", recursive=True)
observer.start()
print("Watching ./src with 500ms debounce...")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
The debounce pattern uses a threading.Timer: every new event cancels the previous timer and starts a fresh one. The action only fires once the timer expires without being cancelled — meaning no events occurred for at least 0.5 seconds. This is the standard approach used by development servers and live-reloaders.
Real-Life Example: Automatic Log File Processor
Here is a complete log file processor that watches an input directory, processes new log files as they arrive, and moves them to an archive folder:
# log_processor.py
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
from pathlib import Path
from datetime import datetime
import time
import shutil

INPUT_DIR = Path("./logs/incoming")
PROCESSED_DIR = Path("./logs/processed")
ERROR_DIR = Path("./logs/errors")

# Ensure directories exist
for d in [INPUT_DIR, PROCESSED_DIR, ERROR_DIR]:
    d.mkdir(parents=True, exist_ok=True)

def parse_log_line(line):
    parts = line.strip().split(" ", 3)
    if len(parts) >= 4:
        return {"date": parts[0], "time": parts[1], "level": parts[2], "message": parts[3]}
    return None

def process_log_file(filepath):
    path = Path(filepath)
    errors = []
    warnings = []
    total = 0
    try:
        with open(path, "r", encoding="utf-8") as f:
            for line in f:
                total += 1
                parsed = parse_log_line(line)
                if parsed:
                    if parsed["level"] == "ERROR":
                        errors.append(parsed["message"])
                    elif parsed["level"] == "WARNING":
                        warnings.append(parsed["message"])
        print(f"Processed {path.name}: {total} lines, {len(errors)} errors, {len(warnings)} warnings")
        if errors:
            print("  Errors found:")
            for err in errors[:3]:
                print(f"    - {err}")
        # Move to processed dir with timestamp
        ts = datetime.now().strftime("%Y%m%d_%H%M%S")
        dest = PROCESSED_DIR / f"{ts}_{path.name}"
        shutil.move(str(path), str(dest))
        print(f"  Moved to: {dest}")
        return True
    except Exception as e:
        print(f"Failed to process {path.name}: {e}")
        error_dest = ERROR_DIR / path.name
        shutil.move(str(path), str(error_dest))
        return False

class LogFileHandler(PatternMatchingEventHandler):
    def __init__(self):
        super().__init__(
            patterns=["*.log", "*.txt"],
            ignore_directories=True,
        )

    def on_created(self, event):
        # Wait briefly for the file write to complete
        time.sleep(0.2)
        print(f"\nNew log file detected: {event.src_path}")
        process_log_file(event.src_path)

observer = Observer()
observer.schedule(LogFileHandler(), path=str(INPUT_DIR), recursive=False)
observer.start()
print(f"Log processor watching: {INPUT_DIR}")
print(f"Processed files go to: {PROCESSED_DIR}\n")

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
print("Log processor stopped.")
print("Log processor stopped.")
Output (when a log file is dropped into the input directory):
Log processor watching: ./logs/incoming
Processed files go to: ./logs/processed
New log file detected: ./logs/incoming/app_2026_05_03.log
Processed app_2026_05_03.log: 1247 lines, 3 errors, 12 warnings
Errors found:
- Database connection timeout after 30s
- Failed to parse response from payment API
- Disk usage exceeded 90% threshold
Moved to: ./logs/processed/20260503_091534_app_2026_05_03.log
The time.sleep(0.2) in on_created is a practical necessity — the OS fires the created event as soon as the file appears, often before its content is fully written. The brief pause gives the writer time to finish. For large files, use a loop that polls file size until it stops growing, or use the on_modified event with debouncing instead.
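For the large-file case, the size-stability loop can be sketched as a small helper. This is an illustrative function, not part of watchdog, and the 0.1-second interval and 10-second timeout are arbitrary values to tune for your workload:

```python
# wait_stable.py
import os
import time

def wait_for_stable_size(path, interval=0.1, timeout=10.0):
    """Block until the file's size stops changing, then return it.

    Returns None if the file disappears or the timeout expires.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:          # file deleted or not yet visible
            return None
        if size == last_size:    # unchanged for one full interval: assume done
            return size
        last_size = size
        time.sleep(interval)
    return None
```

Calling wait_for_stable_size(event.src_path) at the top of on_created, and skipping the file when it returns None, replaces the fixed sleep with a check that adapts to the file's actual write time.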
Frequently Asked Questions
How many files can watchdog monitor without performance issues?
Watchdog relies on the OS kernel’s inotify/FSEvents API, which is very efficient. On Linux, the default inotify limit is 8,192 watches per user. For large directory trees with thousands of subdirectories, you may need to increase this: echo 524288 | sudo tee /proc/sys/fs/inotify/max_user_watches. On macOS and Windows, FSEvents and ReadDirectoryChangesW have different (generally higher) limits but the same principle applies — monitoring tens of thousands of files is fine; millions requires tuning.
Does watchdog work on network drives or NFS mounts?
Not reliably. inotify and FSEvents only work on local filesystems — they do not receive events for changes made on remote systems. For network drives, fall back to polling: use watchdog.observers.polling.PollingObserver instead of Observer. The API is identical but it polls the filesystem at regular intervals. You lose the real-time benefit but gain cross-filesystem compatibility.
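The swap is a one-line change; here is a sketch, where the watched path and the 2-second poll interval are placeholders for your setup:

```python
# polling_fallback.py
from watchdog.observers.polling import PollingObserver
from watchdog.events import FileSystemEventHandler

class PrintHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        print(f"{event.event_type}: {event.src_path}")

# Drop-in replacement for Observer: same schedule()/start()/stop() API,
# but it rescans the tree every `timeout` seconds instead of using OS events.
observer = PollingObserver(timeout=2)
observer.schedule(PrintHandler(), path=".", recursive=True)  # point path at your mount
observer.start()
observer.stop()
observer.join()
```

A longer timeout means less I/O against the remote filesystem but slower detection, so pick the largest interval your use case tolerates.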
How do I watch multiple directories with the same observer?
Call observer.schedule() multiple times: observer.schedule(handler, path="/dir1") and observer.schedule(handler, path="/dir2"). You can use the same handler instance for both or different handler instances. The observer runs all scheduled watches in a single background thread, so avoid long-running operations in event handlers — offload heavy work to a thread pool or queue instead.
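A sketch of sharing one observer across two watches (the two paths here are placeholders for your own directories):

```python
# multi_watch.py
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class PrintHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        print(f"{event.event_type}: {event.src_path}")

observer = Observer()
handler = PrintHandler()
# schedule() returns an ObservedWatch handle you can later unschedule()
watch_a = observer.schedule(handler, path=".")   # replace with /dir1
watch_b = observer.schedule(handler, path="..")  # replace with /dir2
observer.start()
observer.unschedule(watch_a)  # stop watching one directory, keep the other
observer.stop()
observer.join()
```

Keeping the watch handles around is what lets you add and remove directories at runtime without restarting the observer.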
Does watchdog follow symlinks?
By default, no. A watch scheduled with observer.schedule(handler, path, recursive=True) does not follow symlinks automatically. If your directory contains symlinks to other directories and you want events from those paths, watch them separately by resolving the symlink targets with Path.resolve() and scheduling each one individually.
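One way to do that resolution is a helper that collects the root plus the real target of every symlinked subdirectory; dirs_to_watch is an illustrative name, not a watchdog API:

```python
# watch_symlinks.py
from pathlib import Path

def dirs_to_watch(root):
    """Return the root plus the resolved target of every symlinked
    subdirectory, so each can be scheduled as its own watch."""
    root = Path(root)
    targets = [root]
    for entry in root.rglob("*"):
        if entry.is_symlink() and entry.is_dir():
            targets.append(entry.resolve())
    return targets

# Then schedule each directory separately:
# for d in dirs_to_watch("./project"):
#     observer.schedule(handler, path=str(d), recursive=True)
```

Note that events from a resolved target report the real path in src_path, not the symlink path, so downstream code should compare resolved paths.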
How do I safely read a file that is still being written?
The safest pattern on Linux: listen for on_closed events (available in watchdog 2.x+, but emitted only by the inotify backend, so not on macOS or Windows) instead of on_modified. The on_closed event fires after the file writer closes the file descriptor, guaranteeing the write is complete. On other platforms, or if you must use on_modified, poll the file size in a loop with a short sleep until it stops changing, then read it.
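A minimal sketch of an on_closed handler; remember that on platforms without inotify this method is simply never called, so keep a fallback path:

```python
# closed_handler.py
from watchdog.events import FileSystemEventHandler

class ClosedFileHandler(FileSystemEventHandler):
    def on_closed(self, event):
        # Fires once the writer has closed the file (Linux/inotify only),
        # so the content is complete and safe to read.
        if not event.is_directory:
            print(f"Write complete: {event.src_path}")
```

Schedule it exactly like any other handler; there is nothing special about the observer side.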
Conclusion
Watchdog turns filesystem events into Python callbacks with a clean, cross-platform API. The core pattern — create a handler subclass, schedule it with an observer, start the observer in a background thread — works the same on Linux, macOS, and Windows without any changes. Use PatternMatchingEventHandler to filter by file type, use debouncing for editor compatibility, and use the PollingObserver for network drives.
The log file processor example shows a complete real-world use case: detect new files, process them immediately, move them to an archive. Extend it with a work queue (using Python’s queue.Queue) to handle concurrent file arrivals safely, or add a Flask endpoint to report processing statistics.
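That work-queue extension can be sketched with the standard library alone; start_worker is an illustrative helper, not part of watchdog:

```python
# queued_processor.py
import queue
import threading

def start_worker(work_queue, process):
    """Run process(path) for each queued path in a background thread.

    A queued None value shuts the worker down cleanly.
    """
    def drain():
        while True:
            path = work_queue.get()
            if path is None:
                work_queue.task_done()
                return
            try:
                process(path)  # the slow step runs here, off the observer thread
            finally:
                work_queue.task_done()
    t = threading.Thread(target=drain, daemon=True)
    t.start()
    return t

# In the handler, just enqueue, so the observer thread never blocks:
# def on_created(self, event):
#     work_queue.put(event.src_path)
```

With this in place, LogFileHandler.on_created shrinks to a single work_queue.put() call, and files that arrive while one is being processed simply wait their turn instead of being handled concurrently.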
For installation options, the full event API, and platform-specific notes, see the Watchdog documentation.