Level: Intermediate

You have a Python script that works perfectly on your Mac, but the moment a teammate runs it on Windows, the file paths break. Maybe you have been joining paths with string concatenation and forward slashes, or juggling os.path.join(), os.path.exists(), and os.path.basename() calls scattered across your codebase. There is a better way.

Python’s pathlib module, available since Python 3.4 and now the recommended approach for file system operations, gives you an object-oriented interface for working with paths. Instead of treating paths as raw strings, you get Path objects with methods for reading, writing, searching, and navigating directories — all cross-platform by default.

In this article, we will start with a quick example to get you productive in 30 seconds, then cover what pathlib is and why it is usually preferred over os.path. From there, we will walk through creating and navigating paths, reading and writing files, searching with glob patterns, and working with file metadata. We will finish with a real-life project that ties everything together. By the end, you will not want to go back to string-based path manipulation.

Python Pathlib Quick Example

# quick_pathlib.py
from pathlib import Path

# Create a path and explore it
project = Path.cwd() / "my_project" / "data"
project.mkdir(parents=True, exist_ok=True)

# Write a file
config = project / "settings.txt"
config.write_text("debug=True\nport=8080")

# Read it back
print(config.read_text())
print(f"File size: {config.stat().st_size} bytes")
print(f"Parent folder: {config.parent.name}")

Output:

debug=True
port=8080
File size: 20 bytes
Parent folder: data

Notice how the / operator joins path components — no os.path.join() needed. The Path object handles platform-specific separators automatically. Methods like write_text(), read_text(), and stat() replace several lines of boilerplate file I/O code. Let us dig deeper into each of these features below.

What Is Pathlib and Why Use It?

The pathlib module provides classes representing filesystem paths with semantics appropriate for different operating systems. The main class you will use is Path, which automatically gives you a PosixPath on Linux/Mac or a WindowsPath on Windows.

Think of it like this: os.path treats paths as strings and gives you standalone functions to manipulate them. pathlib treats paths as objects that know how to manipulate themselves. It is the difference between calling len(my_string) (a function on data) and calling my_path.exists() (a method on an object that understands what it is).

Task            | os.path (old way)       | pathlib (modern way)
Join paths      | os.path.join(a, b)      | Path(a) / b
Check existence | os.path.exists(p)       | p.exists()
Get filename    | os.path.basename(p)     | p.name
Get extension   | os.path.splitext(p)[1]  | p.suffix
Read file       | open(p).read()          | p.read_text()
List directory  | os.listdir(p)           | p.iterdir()
Find files      | glob.glob(pattern)      | p.glob(pattern)

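Since Python 3.6, Path objects are accepted by most standard-library functions that take a file path, including open(), os.stat(), os.listdir(), and shutil.copy(), and by most third-party libraries. Functions such as json.load() and csv.reader() take an open file object rather than a path, so you pass them the result of path.open(). You rarely need to convert a Path to a string anymore.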
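To make that interoperability concrete, here is a minimal sketch that hands a Path straight to open() and os.path.getsize(), and uses os.fspath() for the rare API that insists on a plain string. The file name demo.txt and the temporary directory are placeholders for this illustration only.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo.txt"  # hypothetical file name

    # Built-in open() accepts a Path directly (Python 3.6+)
    with open(demo_file, "w", encoding="utf-8") as f:
        f.write("hello")

    # os and os.path functions accept Path objects too
    size = os.path.getsize(demo_file)

    # os.fspath() returns the plain string form when an API requires str
    as_string = os.fspath(demo_file)

print(size)             # 5
print(type(as_string))  # <class 'str'>
```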

[Image: os.path.join vs pathlib comparison. Caption: os.path.join() called 47 times in one file. There had to be a better way.]

Creating and Navigating Paths

There are several ways to create Path objects depending on your starting point. Let us look at the most common patterns.

# creating_paths.py
from pathlib import Path

# From a string
config_path = Path("/etc/hosts")
print(f"From string: {config_path}")

# Current working directory
cwd = Path.cwd()
print(f"Current dir: {cwd}")

# Home directory
home = Path.home()
print(f"Home dir: {home}")

# Join with the / operator
data_file = home / "Documents" / "data.csv"
print(f"Joined path: {data_file}")

# Path components
print(f"Name: {data_file.name}")
print(f"Stem: {data_file.stem}")
print(f"Suffix: {data_file.suffix}")
print(f"Parent: {data_file.parent}")
print(f"Parts: {data_file.parts}")

Output:

From string: /etc/hosts
Current dir: /home/user/projects
Home dir: /home/user
Joined path: /home/user/Documents/data.csv
Name: data.csv
Stem: data
Suffix: .csv
Parent: /home/user/Documents
Parts: ('/', 'home', 'user', 'Documents', 'data.csv')

The / operator is the star here — it replaces os.path.join() with something that reads like an actual file path. The .name, .stem, .suffix, and .parent properties give you instant access to path components without parsing strings.
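A few related attributes are worth knowing alongside the ones above. The sketch below uses PurePosixPath (a path class that never touches the file system) so the output is identical on every operating system; the archive path is made up for the example.

```python
from pathlib import PurePosixPath

archive = PurePosixPath("/home/user/backups/site.tar.gz")  # hypothetical path

# .suffix is only the last extension; .suffixes lists all of them
last_suffix = archive.suffix       # '.gz'
all_suffixes = archive.suffixes    # ['.tar', '.gz']

# .parents walks upward one directory at a time
ancestors = [str(p) for p in archive.parents]

# joinpath() is the method form of the / operator
sibling = archive.parent.joinpath("old", "site.tar.gz")

print(last_suffix)
print(all_suffixes)
print(ancestors)              # ['/home/user/backups', '/home/user', '/home', '/']
print(sibling)                # /home/user/backups/old/site.tar.gz
print(archive.is_absolute())  # True
```

Note that .stem only strips the last suffix, so archive.stem here is 'site.tar', not 'site'.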

Resolving and Converting Paths

# resolving_paths.py
from pathlib import Path

# Resolve relative paths to absolute
relative = Path("../data/config.json")
absolute = relative.resolve()
print(f"Resolved: {absolute}")

# Convert to string when needed
path_str = str(Path.home() / "file.txt")
print(f"As string: {path_str}")
print(f"Type: {type(path_str)}")

# Change extension
readme = Path("docs/guide.md")
html_version = readme.with_suffix(".html")
print(f"Changed suffix: {html_version}")

# Change filename
new_name = readme.with_name("tutorial.md")
print(f"Changed name: {new_name}")

Output:

Resolved: /home/user/data/config.json
As string: /home/user/file.txt
Type: <class 'str'>
Changed suffix: docs/guide.html
Changed name: docs/tutorial.md

The resolve() method is essential for turning relative paths into absolute ones — it also resolves symlinks. Use with_suffix() and with_name() to create path variations without string manipulation.
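If you want to see resolve() collapse a ".." segment without depending on your own directory layout, the following sketch builds a throwaway tree in a temporary directory. The folder and file names are arbitrary, and the file being resolved does not need to exist.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()  # resolve first: temp dirs can be symlinks
    (root / "projects").mkdir()
    (root / "data").mkdir()

    # A path that climbs out of projects/ and into data/
    messy = root / "projects" / ".." / "data" / "config.json"

    # resolve() collapses ".." and follows symlinks
    clean = messy.resolve()
    is_same = clean == root / "data" / "config.json"

# with_suffix() and with_name() return new Path objects; nothing on disk changes
readme = Path("docs/guide.md")
variants = (readme.with_suffix(".html").name, readme.with_name("tutorial.md").name)

print(is_same)   # True
print(variants)  # ('guide.html', 'tutorial.md')
```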

Reading and Writing Files

Path objects have built-in methods for common file I/O operations. These methods handle opening and closing files automatically, so you do not need context managers for simple read/write operations.

# file_io.py
from pathlib import Path

# Setup
demo_dir = Path.cwd() / "pathlib_demo"
demo_dir.mkdir(exist_ok=True)

# Write text
notes = demo_dir / "notes.txt"
notes.write_text("Line 1: Python is great\nLine 2: Pathlib is better")
print(f"Wrote {notes.stat().st_size} bytes")

# Read text
content = notes.read_text()
print(f"Content:\n{content}")

# Write bytes (useful for binary data)
binary_file = demo_dir / "data.bin"
binary_file.write_bytes(b"\x89PNG\r\n\x1a\n")
print(f"\nBinary file size: {binary_file.stat().st_size} bytes")

# Append text (use open() for append mode)
with notes.open("a") as f:
    f.write("\nLine 3: Appended with open()")

print(f"\nAfter append:\n{notes.read_text()}")

Output:

Wrote 49 bytes
Content:
Line 1: Python is great
Line 2: Pathlib is better

Binary file size: 8 bytes

After append:
Line 1: Python is great
Line 2: Pathlib is better
Line 3: Appended with open()

For simple read/write operations, read_text() and write_text() are all you need — they open the file, perform the operation, and close it in one call. For more complex operations like appending or reading line by line, use path.open() which works exactly like the built-in open() function.
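Here is a small sketch of the path.open() route for appending and for reading line by line. It also passes encoding="utf-8" explicitly, which is a suggestion on our part rather than something pathlib requires; without it, the platform's default encoding is used. The file name app.log is a placeholder.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "app.log"  # hypothetical file name

    # Passing encoding explicitly avoids depending on the platform default
    log_file.write_text("start\n", encoding="utf-8")

    # write_text() always overwrites, so appending goes through open("a")
    with log_file.open("a", encoding="utf-8") as f:
        f.write("step 1\n")
        f.write("done\n")

    # Iterating over the file object yields one line at a time
    with log_file.open(encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]

print(lines)  # ['start', 'step 1', 'done']
```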

[Image: File I/O with pathlib write_text and read_text. Caption: write_text() and read_text(): file I/O without the ceremony.]

Directory Operations

Creating, listing, and removing directories are everyday tasks. pathlib makes each of these a single method call.

# directory_ops.py
from pathlib import Path

base = Path.cwd() / "project_skeleton"

# Create nested directories
(base / "src" / "utils").mkdir(parents=True, exist_ok=True)
(base / "tests").mkdir(parents=True, exist_ok=True)
(base / "docs").mkdir(parents=True, exist_ok=True)

# Create some files
(base / "src" / "__init__.py").write_text("")
(base / "src" / "main.py").write_text("print('hello')")
(base / "src" / "utils" / "helpers.py").write_text("# helpers")
(base / "tests" / "test_main.py").write_text("# tests")
(base / "README.md").write_text("# My Project")

# List immediate children
print("Top-level contents:")
for item in sorted(base.iterdir()):
    icon = "D" if item.is_dir() else "F"
    print(f"  [{icon}] {item.name}")

# Check properties
print(f"\nIs directory: {base.is_dir()}")
print(f"Is file: {(base / 'README.md').is_file()}")
print(f"Exists: {(base / 'missing.txt').exists()}")

Output:

Top-level contents:
  [F] README.md
  [D] docs
  [D] src
  [D] tests

Is directory: True
Is file: True
Exists: False

The mkdir(parents=True, exist_ok=True) combination is your best friend — it creates the full directory tree and does not raise an error if it already exists. The iterdir() method returns a generator of all items in a directory, which you can filter with is_dir() and is_file().
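As a short sketch of that filtering idea, the snippet below splits a directory listing into folders and files. It works in a temporary directory with invented names, so it leaves nothing behind on your machine.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "src").mkdir()
    (base / "docs").mkdir()
    (base / "README.md").write_text("# demo", encoding="utf-8")
    (base / "setup.cfg").write_text("", encoding="utf-8")

    # iterdir() order is not guaranteed, so sort before displaying
    entries = sorted(base.iterdir())
    dir_names = [p.name for p in entries if p.is_dir()]
    file_names = [p.name for p in entries if p.is_file()]

    # exist_ok=True makes a second mkdir() call a no-op instead of an error
    (base / "src").mkdir(exist_ok=True)

print(dir_names)   # ['docs', 'src']
print(file_names)  # ['README.md', 'setup.cfg']
```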

Searching with Glob Patterns

Finding files by pattern is one of pathlib’s strongest features. The glob() method searches a directory, while rglob() searches recursively through all subdirectories.

# glob_search.py
from pathlib import Path

base = Path.cwd() / "project_skeleton"

# Find all Python files in src/
print("Python files in src/:")
for py_file in sorted((base / "src").glob("*.py")):
    print(f"  {py_file.name}")

# Recursive search - all .py files anywhere
print("\nAll Python files (recursive):")
for py_file in sorted(base.rglob("*.py")):
    print(f"  {py_file.relative_to(base)}")

# Find all markdown files
print("\nMarkdown files:")
for md_file in base.rglob("*.md"):
    print(f"  {md_file.name} ({md_file.stat().st_size} bytes)")

# Multiple patterns
print("\nPython and Markdown files:")
for pattern in ["*.py", "*.md"]:
    for f in base.rglob(pattern):
        print(f"  {f.relative_to(base)}")

Output:

Python files in src/:
  __init__.py
  main.py

All Python files (recursive):
  src/__init__.py
  src/main.py
  src/utils/helpers.py
  tests/test_main.py

Markdown files:
  README.md (12 bytes)

Python and Markdown files:
  src/__init__.py
  src/main.py
  src/utils/helpers.py
  tests/test_main.py
  README.md

The rglob() method is particularly powerful — rglob("*.py") is equivalent to glob("**/*.py") but shorter. Use relative_to() to display paths relative to a base directory, which is much cleaner than showing full absolute paths.
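The sketch below checks that equivalence on a small temporary tree, and shows one possible alternative to looping over several patterns: walk the tree once and filter by suffix. The set name wanted and the file names are our own choices for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "src" / "utils").mkdir(parents=True)
    (base / "src" / "main.py").write_text("", encoding="utf-8")
    (base / "src" / "utils" / "helpers.py").write_text("", encoding="utf-8")
    (base / "README.md").write_text("", encoding="utf-8")
    (base / "notes.txt").write_text("", encoding="utf-8")

    # rglob("*.py") and glob("**/*.py") return the same set of files
    via_rglob = sorted(base.rglob("*.py"))
    via_glob = sorted(base.glob("**/*.py"))
    same_results = via_rglob == via_glob

    # One pass over the tree, filtering by suffix instead of one rglob() per pattern
    wanted = {".py", ".md"}
    matches = sorted(
        p.relative_to(base).as_posix()
        for p in base.rglob("*")
        if p.is_file() and p.suffix in wanted
    )

print(same_results)  # True
print(matches)       # ['README.md', 'src/main.py', 'src/utils/helpers.py']
```

Sorting the results matters because glob() and rglob() yield files in whatever order the file system returns them.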

[Image: Searching files with pathlib rglob. Caption: rglob('*.py'): find every Python file, no matter how deep it hides.]

Working with File Metadata

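Every file carries metadata such as size, timestamps, and permissions. pathlib gives you access to all of this through the stat() method. One caveat: st_ctime is the creation time only on Windows; on Linux and macOS it is the time of the last metadata change.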

# file_metadata.py
from pathlib import Path
from datetime import datetime

target = Path.cwd() / "project_skeleton" / "src" / "main.py"

# Get file stats
stats = target.stat()
print(f"File: {target.name}")
print(f"Size: {stats.st_size} bytes")
print(f"Modified: {datetime.fromtimestamp(stats.st_mtime).strftime('%Y-%m-%d %H:%M:%S')}")
# st_ctime is the last metadata change on Linux/macOS and the creation time on Windows
print(f"Changed: {datetime.fromtimestamp(stats.st_ctime).strftime('%Y-%m-%d %H:%M:%S')}")

# Check file type
print(f"\nIs file: {target.is_file()}")
print(f"Is symlink: {target.is_symlink()}")
print(f"Suffix: {target.suffix}")

# Get directory size (sum of all files)
project = Path.cwd() / "project_skeleton"
total_size = sum(f.stat().st_size for f in project.rglob("*") if f.is_file())
file_count = sum(1 for f in project.rglob("*") if f.is_file())
print(f"\nProject: {file_count} files, {total_size} bytes total")

Output:

File: main.py
Size: 14 bytes
Modified: 2026-04-12 09:15:30
Changed: 2026-04-12 09:15:30

Is file: True
Is symlink: False
Suffix: .py

Project: 5 files, 42 bytes total

The stat() method returns the same information as os.stat() but you call it directly on the path object. Combining rglob() with stat() in a generator expression is a common pattern for calculating directory sizes or finding the largest files.
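For the "largest files" variant of that pattern, here is one way it might look. The helper name largest_files and its limit parameter are our own invention for this sketch, not part of pathlib.

```python
import tempfile
from pathlib import Path

def largest_files(root, limit=3):
    """Return up to `limit` (size, relative path) pairs, biggest first."""
    root = Path(root)
    sized = [
        (p.stat().st_size, p.relative_to(root).as_posix())
        for p in root.rglob("*")
        if p.is_file()
    ]
    return sorted(sized, reverse=True)[:limit]

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "logs").mkdir()
    (base / "small.txt").write_bytes(b"x" * 10)
    (base / "logs" / "big.log").write_bytes(b"x" * 300)
    (base / "medium.csv").write_bytes(b"x" * 120)

    top_two = largest_files(base, limit=2)

print(top_two)  # [(300, 'logs/big.log'), (120, 'medium.csv')]
```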

Real-Life Example: Project File Organizer

[Image: Building a file organizer with pathlib. Caption: Forty downloads, zero organization. Time for pathlib to sort this out.]

Let us build a practical file organizer that sorts files in a directory by their extension into categorized subfolders. This is something you might use to clean up a messy Downloads folder.

# file_organizer.py
from pathlib import Path
from collections import defaultdict

CATEGORIES = {
    "Images": [".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp"],
    "Documents": [".pdf", ".doc", ".docx", ".txt", ".md", ".csv", ".xlsx"],
    "Code": [".py", ".js", ".html", ".css", ".json", ".yaml", ".yml"],
    "Archives": [".zip", ".tar", ".gz", ".rar", ".7z"],
    "Media": [".mp3", ".mp4", ".wav", ".avi", ".mkv"],
}

def get_category(suffix):
    for category, extensions in CATEGORIES.items():
        if suffix.lower() in extensions:
            return category
    return "Other"

def organize_directory(source_dir):
    """Move each visible file in source_dir into a subfolder named after its category."""
    source = Path(source_dir)
    if not source.is_dir():
        print(f"Error: {source} is not a valid directory")
        return

    moved = defaultdict(list)

    # Snapshot the listing first, because the loop creates folders and moves files
    for item in list(source.iterdir()):
        # Skip subfolders and hidden files such as .DS_Store
        if item.is_file() and not item.name.startswith("."):
            category = get_category(item.suffix)
            target_dir = source / category
            target_dir.mkdir(exist_ok=True)

            # On a name clash, append _1, _2, ... until the name is free
            target_file = target_dir / item.name
            counter = 1
            while target_file.exists():
                target_file = target_dir / f"{item.stem}_{counter}{item.suffix}"
                counter += 1

            item.rename(target_file)
            moved[category].append(item.name)

    print("Organization complete!")
    for category, files in sorted(moved.items()):
        print(f"  {category}: {len(files)} files")
        for f in files[:3]:
            print(f"    - {f}")
        if len(files) > 3:
            print(f"    ... and {len(files) - 3} more")

# Demo with test files
demo = Path.cwd() / "messy_folder"
demo.mkdir(exist_ok=True)
for name in ["photo.jpg", "report.pdf", "script.py", "data.csv", "notes.txt", "archive.zip", "song.mp3"]:
    (demo / name).write_text("sample content")

organize_directory(demo)

Output:

Organization complete!
  Archives: 1 files
    - archive.zip
  Code: 1 files
    - script.py
  Documents: 3 files
    - report.pdf
    - data.csv
    - notes.txt
  Images: 1 files
    - photo.jpg
  Media: 1 files
    - song.mp3

This organizer uses nearly every pathlib feature we covered: iterdir() to list files, .suffix to check extensions, mkdir() to create category folders, .exists() to handle duplicates, and rename() to move files. You could extend this by adding logging, a dry-run mode, or custom category rules loaded from a JSON config file.
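As a starting point for the dry-run idea, here is a minimal sketch that only reports where each file would go and moves nothing. The names plan_moves and SKETCH_CATEGORIES are hypothetical, and the category table is deliberately cut down so the block stays self-contained.

```python
import tempfile
from pathlib import Path

# Trimmed-down category table for this sketch only
SKETCH_CATEGORIES = {".jpg": "Images", ".pdf": "Documents", ".py": "Code"}

def plan_moves(source_dir):
    """Return sorted (file name, category folder) pairs without touching the disk."""
    source = Path(source_dir)
    plan = []
    for item in source.iterdir():
        if item.is_file() and not item.name.startswith("."):
            category = SKETCH_CATEGORIES.get(item.suffix.lower(), "Other")
            plan.append((item.name, category))
    return sorted(plan)

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ["photo.JPG", "report.pdf", "script.py", "unknown.xyz"]:
        (folder / name).write_text("sample", encoding="utf-8")

    planned = plan_moves(folder)
    # Nothing was moved: all four files are still at the top level
    still_in_place = sorted(p.name for p in folder.iterdir())

for name, category in planned:
    print(f"{name} -> {category}/{name}")
```

Separating the planning step from the moving step like this also makes the logic easier to test, since the plan is just a list of tuples.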

[Image: Organized files with pathlib. Caption: Downloads folder: before pathlib, a landfill. After pathlib, a library.]

Frequently Asked Questions

Can I use pathlib with older Python versions?

pathlib was added in Python 3.4 and has gained features in later releases. Python 3.6 added the os.PathLike protocol and os.fspath(), so Path objects work with open() and other built-in functions. If you are on Python 3.6 or later (which you should be, since 3.5 reached end of life in 2020), you can use Path objects almost anywhere a path is expected.

Should I completely replace os.path with pathlib?

For new code, generally yes: pathlib covers almost everything os.path does, with a more readable interface. The main exception is a codebase that heavily uses os.path, where consistency may matter more than modernization. You can also mix both, because Path objects accept strings and str(path) converts back.

How do I handle permissions with pathlib?

Use path.chmod(mode) to set permissions and path.stat().st_mode to read them. For example, script.chmod(0o755) makes a file executable. The stat module helps decode permission bits: import stat; print(stat.filemode(path.stat().st_mode)) gives human-readable output like -rwxr-xr-x.
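The calls above can be sketched as follows. The file name run.sh is a placeholder, and the values in the closing note assume Linux or macOS; on Windows, chmod() can only toggle the read-only flag, so the result will differ.

```python
import stat
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "run.sh"  # hypothetical file name
    script.write_text("#!/bin/sh\necho hi\n", encoding="utf-8")

    # Owner: read/write/execute; group and others: read/execute
    script.chmod(0o755)

    mode = script.stat().st_mode
    permission_bits = stat.S_IMODE(mode)  # just the permission portion
    readable = stat.filemode(mode)        # ls-style string

print(oct(permission_bits))
print(readable)
```

On Linux or macOS this should print 0o755 followed by -rwxr-xr-x.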

What happens if I use pathlib on Windows with forward slashes?

pathlib handles this automatically. When you write Path("src") / "utils" / "helpers.py", it produces src\utils\helpers.py on Windows and src/utils/helpers.py on Linux/Mac. You never need to worry about separator characters — that is the whole point of using Path objects instead of strings.
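You can check this behaviour on any machine with the pure path classes, which apply each platform's rules without touching the file system. This is only an illustration; in real code you would normally just use Path.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths do no file system access, so either flavour works on any OS
win = PureWindowsPath("src") / "utils" / "helpers.py"
posix = PurePosixPath("src") / "utils" / "helpers.py"

# Forward slashes in the input are accepted and normalised on Windows
win_from_slashes = PureWindowsPath("src/utils/helpers.py")

print(win)                      # src\utils\helpers.py
print(posix)                    # src/utils/helpers.py
print(win == win_from_slashes)  # True
print(win.as_posix())           # src/utils/helpers.py
```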

How do I delete files and directories with pathlib?

Use path.unlink() to delete a file and path.rmdir() to remove an empty directory. For non-empty directories, use shutil.rmtree(path) from the standard library instead, because rmdir() fails unless the directory is empty. Since Python 3.8, unlink() accepts a missing_ok=True parameter so it does not raise an error if the file is already gone.
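Here is a short sketch that exercises all three calls inside a temporary directory, so nothing of yours is at risk. The directory and file names are invented for the example, and missing_ok requires Python 3.8 or later.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    empty_dir = base / "empty"
    full_dir = base / "full"
    empty_dir.mkdir()
    (full_dir / "nested").mkdir(parents=True)
    stale = full_dir / "nested" / "old.txt"
    stale.write_text("stale", encoding="utf-8")

    stale.unlink()                 # delete a single file
    stale.unlink(missing_ok=True)  # Python 3.8+: no error if already gone

    empty_dir.rmdir()              # only works on empty directories
    shutil.rmtree(full_dir)        # removes a directory and everything in it

    remaining = [p.name for p in base.iterdir()]

print(remaining)  # []
```

Both unlink() and shutil.rmtree() delete permanently rather than moving items to a recycle bin, so double-check the path before calling them.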

Conclusion

We covered the full range of pathlib operations: creating and joining paths with the / operator, reading and writing files with read_text() and write_text(), navigating directories with iterdir(), searching with glob() and rglob(), and inspecting file metadata with stat(). The real-life file organizer showed how these pieces come together in a practical tool.

Try extending the file organizer with features like recursive organization, file size filtering, or a configuration file that defines custom categories. The pathlib module has even more methods we did not cover — check the official pathlib documentation for the complete reference.