
Python’s built-in sorting handles strings lexicographically — which means “file10.txt” sorts before “file2.txt” because “1” comes before “2” in ASCII. This is fine for computers, but deeply frustrating for anyone who has to look at a sorted file listing. Natural sort order — the order humans expect when they see numbers inside strings — is what natsort provides. It is the difference between a file manager that makes sense and one that scrambles your numbered files.

The natsort library is a pure-Python solution that detects numeric substrings inside strings and compares them as numbers, not as character sequences. It handles filenames, version strings, IP addresses, biological sequences, and locale-aware sorting with a clean, minimal API. No need to write custom sort key functions for every new case.

In this article, you will learn how natsort works, how to use natsorted() for common cases, sort paths and version numbers, use natural sort keys with sort() and sorted(), handle case-insensitive and locale-aware sorting, and build a practical file organizer that uses natural sort to group and display files. Install natsort with pip install natsort.

Natural Sorting: Quick Example

# natsort_quick.py
from natsort import natsorted

files = ['file10.txt', 'file2.txt', 'file1.txt', 'file20.txt', 'file3.txt']

# Default Python sort -- lexicographic (wrong order for humans)
print("Default sort:", sorted(files))

# Natural sort -- numbers compared as integers
print("Natural sort:", natsorted(files))

Output:

Default sort: ['file1.txt', 'file10.txt', 'file2.txt', 'file20.txt', 'file3.txt']
Natural sort: ['file1.txt', 'file2.txt', 'file3.txt', 'file10.txt', 'file20.txt']

That one import and one function call is all it takes to fix the ordering. natsorted() returns a new list, just like sorted(). The rest of this article covers the full range of what natsort can do for you beyond basic filename sorting.

What Is Natural Sort Order?

Natural sort order is a collation algorithm that treats numeric substrings within strings as actual numbers when comparing. Instead of comparing character by character (‘2’ vs ‘1’ vs ‘0’), it identifies contiguous digit sequences, converts them to integers, and compares those integers. This produces an ordering that matches human intuition.

Input | Lexicographic (default) | Natural sort (natsort)
file1, file2, file10 | file1, file10, file2 | file1, file2, file10
v1.2, v1.10, v1.9 | v1.10, v1.2, v1.9 | v1.2, v1.9, v1.10
Chapter 1, Chapter 2, Chapter 10 | Chapter 1, Chapter 10, Chapter 2 | Chapter 1, Chapter 2, Chapter 10
img_001.jpg, img_010.jpg, img_100.jpg | Correct (leading zeros) | Correct (numeric value)

The key insight: lexicographic sorting is correct for most string data (names, words, identifiers), but breaks down whenever numeric sequences appear within strings. natsort detects these sequences and applies numeric comparison only where it is appropriate, leaving the rest of the string to standard character comparison.
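To make the mechanism concrete, here is a minimal standard-library sketch of the idea described above. It is not natsort's actual implementation, and the helper name natural_key is made up for illustration. It splits each string into text runs and digit runs, then converts the digit runs to integers so they compare numerically.

```python
# natural_key_sketch.py
# Illustrative only: a hand-rolled key, NOT how natsort works internally.
import re

_DIGIT_RUN = re.compile(r"(\d+)")

def natural_key(text: str) -> tuple:
    """Split text into alternating (text, number, text, ...) chunks."""
    chunks = _DIGIT_RUN.split(text)
    # re.split with a capture group puts the digit runs at odd indexes,
    # so text is always compared with text and numbers with numbers.
    return tuple(int(c) if i % 2 else c for i, c in enumerate(chunks))

files = ['file10.txt', 'file2.txt', 'file1.txt']
versions = ['v1.2', 'v1.10', 'v1.9']

sorted_files = sorted(files, key=natural_key)
sorted_versions = sorted(versions, key=natural_key)

print(natural_key('file10.txt'))  # ('file', 10, '.txt')
print(sorted_files)               # ['file1.txt', 'file2.txt', 'file10.txt']
print(sorted_versions)            # ['v1.2', 'v1.9', 'v1.10']
```

A sketch like this ignores everything natsort exposes through its flags (floats, signs, case, locale, paths), which is the reason to prefer the library for anything beyond a quick experiment.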

[Image: Loop Larry confused by file ordering. Caption: file10 before file2. Computers are right. Computers are also wrong.]

Using natsorted() for Common Cases

The main entry point is natsorted(), which accepts any iterable and optional keyword arguments to control sorting behavior:

# natsort_basics.py
from natsort import natsorted, ns

# Version strings
versions = ['1.10.2', '1.9.0', '2.0.1', '1.2.3', '1.10.0']
print("Versions:", natsorted(versions))

# Mixed strings with numbers and text
items = ['item_5', 'item_12', 'item_3', 'item_10', 'item_1']
print("Items:", natsorted(items))

# Case-insensitive natural sort
mixed_case = ['File10.txt', 'file2.TXT', 'FILE1.txt', 'File20.txt']
print("Case-insensitive:", natsorted(mixed_case, alg=ns.IGNORECASE))

# Reverse natural sort
print("Reversed:", natsorted(versions, reverse=True))

# Sort with a key function (e.g., sort tuples by second element)
data = [('b', 'v10'), ('a', 'v2'), ('c', 'v1'), ('d', 'v20')]
print("By version:", natsorted(data, key=lambda x: x[1]))

Output:

Versions: ['1.2.3', '1.9.0', '1.10.0', '1.10.2', '2.0.1']
Items: ['item_1', 'item_3', 'item_5', 'item_10', 'item_12']
Case-insensitive: ['FILE1.txt', 'file2.TXT', 'File10.txt', 'File20.txt']
Reversed: ['2.0.1', '1.10.2', '1.10.0', '1.9.0', '1.2.3']
By version: [('c', 'v1'), ('a', 'v2'), ('b', 'v10'), ('d', 'v20')]

Sorting File Paths

File path sorting has an extra wrinkle: you usually want to sort directory components separately from filenames, and you want numbers within both components to sort naturally. natsort’s PATH algorithm handles this correctly:

# natsort_paths.py
from natsort import natsorted, ns
from pathlib import Path

# Simulate a file listing with numbered paths
paths = [
    '/data/project10/output2.csv',
    '/data/project2/output10.csv',
    '/data/project1/output1.csv',
    '/data/project2/output2.csv',
    '/data/project10/output1.csv',
]

# PATH algorithm: sorts directory components and filenames separately
sorted_paths = natsorted(paths, alg=ns.PATH)
for p in sorted_paths:
    print(p)

Output:

/data/project1/output1.csv
/data/project2/output2.csv
/data/project2/output10.csv
/data/project10/output1.csv
/data/project10/output2.csv

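Without ns.PATH, path separators and file extensions are compared as ordinary characters, so a directory such as "folder (1)/" can sort ahead of "folder/" simply because the space character precedes the slash. The PATH algorithm splits on path separators first (and splits off the file extension), then applies natural sort within each component, giving you the result a file manager would show.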
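The effect of splitting on separators is easiest to see with directory names that differ only by a suffix. The following is a rough standard-library approximation of the idea, not natsort's own code; natural_key and path_key are hypothetical helper names, and the sketch assumes non-empty POSIX-style paths.

```python
# path_key_sketch.py
# Illustrative only: approximates per-component path sorting with the stdlib.
import re
from pathlib import PurePosixPath

_DIGIT_RUN = re.compile(r"(\d+)")

def natural_key(text: str) -> tuple:
    chunks = _DIGIT_RUN.split(text)
    return tuple(int(c) if i % 2 else c for i, c in enumerate(chunks))

def path_key(path: str) -> tuple:
    """Key each directory, the file stem, and the extension separately."""
    *dirs, name = PurePosixPath(path).parts  # assumes a non-empty path
    leaf = PurePosixPath(name)
    pieces = [*dirs, leaf.stem, leaf.suffix]
    return tuple(natural_key(piece) for piece in pieces)

paths = [
    './folder (10)/file.txt',
    './folder/file (1).txt',
    './folder (1)/file.txt',
    './folder/file.txt',
]

# Whole-string comparison: ' ' sorts before '/' and '.', so the
# "(1)" variants jump ahead of the plain names.
whole_string = sorted(paths, key=natural_key)

# Per-component comparison: 'folder' is compared with 'folder (1)'
# on its own, so the shorter name comes first.
per_component = sorted(paths, key=path_key)

for p in per_component:
    print(p)
```

In this sketch the per-component order is folder/file.txt, folder/file (1).txt, folder (1)/file.txt, folder (10)/file.txt, while the whole-string order starts with the two "folder (N)" directories.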

[Image: Cache Katie sorting files in natural order. Caption: O(n log n) plus one import. The price of sanity.]

Using natsort_keygen() with sort()

If you need to sort an existing list in place (rather than getting a new sorted list), use natsort_keygen() to get a key function you can pass to list.sort():

# natsort_key.py
from natsort import natsort_keygen, ns

# Sort a list in place
logs = ['error_10.log', 'error_2.log', 'error_1.log', 'error_100.log']
key = natsort_keygen()
logs.sort(key=key)
print("Sorted logs:", logs)

# Use as a sort key in more complex operations
students = [
    {'name': 'Alice', 'grade': 'Grade 10'},
    {'name': 'Bob', 'grade': 'Grade 2'},
    {'name': 'Carol', 'grade': 'Grade 1'},
    {'name': 'Dave', 'grade': 'Grade 20'},
]
grade_key = natsort_keygen(key=lambda s: s['grade'])
students.sort(key=grade_key)
for s in students:
    print(f"  {s['grade']}: {s['name']}")

Output:

Sorted logs: ['error_1.log', 'error_2.log', 'error_10.log', 'error_100.log']
  Grade 1: Carol
  Grade 2: Bob
  Grade 10: Alice
  Grade 20: Dave

Real-Life Example: File Report Generator

[Image: Sudo Sam organizing perfectly sorted files. Caption: A sorted file listing: the last thing you will take for granted again.]

This script scans a directory, groups files by type, sorts them naturally within each group, and produces a clean report with file sizes:

# file_report.py
from pathlib import Path
from natsort import natsorted, ns
from collections import defaultdict

def format_size(bytes_count: int) -> str:
    """Format bytes as human-readable size."""
    for unit in ['B', 'KB', 'MB', 'GB']:
        if bytes_count < 1024:
            return f"{bytes_count:.1f} {unit}"
        bytes_count /= 1024
    return f"{bytes_count:.1f} TB"

def generate_file_report(folder: str) -> None:
    base = Path(folder)
    if not base.exists():
        print(f"Folder not found: {folder}")
        return

    # Group files by extension
    groups = defaultdict(list)
    for path in base.rglob('*'):
        if path.is_file():
            ext = path.suffix.lower() or '.no_ext'
            groups[ext].append(path)

    # Sort extension groups naturally, then files within each group
    sorted_exts = natsorted(groups.keys())

    total_files = 0
    total_size = 0

    for ext in sorted_exts:
        files = natsorted(groups[ext], key=lambda p: p.name, alg=ns.PATH)
        ext_size = sum(f.stat().st_size for f in files)
        print(f"\n{ext} ({len(files)} files, {format_size(ext_size)})")
        print("-" * 40)
        for f in files[:5]:  # Show first 5 to keep output manageable
            size = f.stat().st_size
            rel = f.relative_to(base)
            print(f"  {str(rel):<35} {format_size(size):>8}")
        if len(files) > 5:
            print(f"  ... and {len(files) - 5} more")
        total_files += len(files)
        total_size += ext_size

    print(f"\nTotal: {total_files} files, {format_size(total_size)}")

generate_file_report('./my_project')

Output:

.csv (3 files, 48.2 KB)
----------------------------------------
  data/output1.csv                  12.1 KB
  data/output2.csv                  18.4 KB
  data/output10.csv                 17.7 KB

.py (5 files, 22.8 KB)
----------------------------------------
  src/module1.py                     4.2 KB
  src/module2.py                     5.1 KB
  src/module10.py                    6.8 KB
  src/module11.py                    3.9 KB
  src/utils.py                       2.8 KB

Total: 8 files, 71.0 KB

Frequently Asked Questions

Does natsort handle floating-point numbers in strings?

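Yes -- use the ns.FLOAT algorithm flag: natsorted(items, alg=ns.FLOAT). This treats sequences like 3.14 and 2.71 as floating-point numbers rather than as two separate integer sequences. You can combine flags: ns.FLOAT | ns.IGNORECASE. Do not use ns.FLOAT for dotted version strings, though: it would read "1.10" as the float 1.1. The default integer-based algorithm already orders versions such as 1.9.0 and 1.10.0 correctly, as the versions example earlier in this article shows; in current releases ns.VERSION (or ns.V) is just an alias for that default behavior.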
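The difference between integer and float parsing can be sketched with the standard library alone. This is an illustration of the concept, not natsort's implementation; chunk_key and both regular expressions are assumptions made for the example. It also shows why float parsing is a poor fit for dotted version strings.

```python
# float_key_sketch.py
# Illustrative only: contrasts integer chunks with float chunks.
import re

_INT_RUN = re.compile(r"(\d+)")
_FLOAT_RUN = re.compile(r"(\d+\.\d+|\d+)")

def chunk_key(text, pattern, convert):
    """Split on numeric runs and convert the runs (odd indexes) to numbers."""
    chunks = pattern.split(text)
    return tuple(convert(c) if i % 2 else c for i, c in enumerate(chunks))

readings = ['x3.2', 'x3.14', 'x10.5', 'x2.71']

# Integer chunks: "3.14" becomes 3, '.', 14, so x3.14 lands after x3.2.
as_ints = sorted(readings, key=lambda s: chunk_key(s, _INT_RUN, int))

# Float chunks: "3.14" becomes 3.14, so x3.14 lands before x3.2.
as_floats = sorted(readings, key=lambda s: chunk_key(s, _FLOAT_RUN, float))

# Pitfall: as floats, version "1.10.0" starts with 1.1 and sorts before "1.9.0".
versions_as_floats = sorted(
    ['1.9.0', '1.10.0'], key=lambda s: chunk_key(s, _FLOAT_RUN, float)
)

print(as_ints)             # ['x2.71', 'x3.2', 'x3.14', 'x10.5']
print(as_floats)           # ['x2.71', 'x3.14', 'x3.2', 'x10.5']
print(versions_as_floats)  # ['1.10.0', '1.9.0']
```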

How do I sort with locale-aware character ordering?

Use natsorted(items, alg=ns.LOCALE) after calling locale.setlocale(locale.LC_ALL, '') to set the system locale. This sorts accented characters (such as é, ä, and ö in European languages) according to the locale's collation rules rather than Unicode code point order. This is important for sorting names in languages where a and ä are considered equal or adjacent in the alphabet.

Can I use natsort with pandas DataFrames?

Yes. Use natsort_keygen() as the key parameter in pandas sort_values(): df.sort_values('column', key=lambda s: s.map(natsort_keygen())). For sorting DataFrame row labels (index), use df.reindex(natsorted(df.index)). The natsort documentation has a dedicated pandas section with more examples for multi-column natural sorting.

Does natsort handle negative numbers?

By default, natsort does not treat a leading minus sign as part of a number -- "file-10.txt" sorts as the string "file-" followed by natural sort on "10.txt". To enable signed number handling, use ns.SIGNED: natsorted(items, alg=ns.SIGNED). This is useful for data with negative values in filenames or log entries, but can produce unexpected results if hyphens are used as separators in non-numeric contexts.
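Both the benefit and the hyphen pitfall show up in a small standard-library sketch. As before, this only mimics the idea of signed versus unsigned parsing; chunk_key and the two patterns are hypothetical and are not taken from natsort.

```python
# signed_key_sketch.py
# Illustrative only: treats an optional leading sign as part of the number.
import re

_UNSIGNED = re.compile(r"(\d+)")
_SIGNED = re.compile(r"([-+]?\d+)")

def chunk_key(text, pattern):
    chunks = pattern.split(text)
    return tuple(int(c) if i % 2 else c for i, c in enumerate(chunks))

temps = ['temp3', 'temp-5', 'temp0', 'temp-10']

# Unsigned: the '-' stays in the text chunk, so 'temp-' sorts after 'temp'.
unsigned = sorted(temps, key=lambda s: chunk_key(s, _UNSIGNED))

# Signed: '-5' and '-10' are real negative numbers.
signed = sorted(temps, key=lambda s: chunk_key(s, _SIGNED))

# Pitfall: here the hyphen is only a separator, yet signed parsing
# reads -2 and -10 and puts report-10 first.
reports = ['report-2.txt', 'report-10.txt']
reports_signed = sorted(reports, key=lambda s: chunk_key(s, _SIGNED))

print(unsigned)        # ['temp0', 'temp3', 'temp-5', 'temp-10']
print(signed)          # ['temp-10', 'temp-5', 'temp0', 'temp3']
print(reports_signed)  # ['report-10.txt', 'report-2.txt']
```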

Is natsort fast enough for large lists?

natsort is implemented in pure Python with optional C acceleration via the fastnumbers package. Install it with pip install fastnumbers; on numeric-heavy data this can give a speedup of roughly 3-5x, depending on the input. For most real-world use cases (thousands of filenames, hundreds of version strings), natsort's performance is more than adequate. If you are sorting millions of strings, profile first -- the Python overhead of building the sort keys may matter more than the sorting algorithm itself.
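If you want to profile before committing, a small timing harness is enough to start with. The sketch below uses a hand-rolled stand-in key so that it runs without any third-party package; time_sort and natural_key are hypothetical helpers, and you would pass natsort_keygen() as the key to time the library itself.

```python
# sort_timing_sketch.py
# Illustrative only: times sorted() with and without a natural-sort key.
import re
import time

_DIGIT_RUN = re.compile(r"(\d+)")

def natural_key(text: str) -> tuple:
    # Stand-in key for the measurement; swap in natsort_keygen() to time natsort.
    chunks = _DIGIT_RUN.split(text)
    return tuple(int(c) if i % 2 else c for i, c in enumerate(chunks))

def time_sort(items, key=None):
    """Return the sorted list and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = sorted(items, key=key)
    return result, time.perf_counter() - start

# 50,000 names in descending order as a simple synthetic workload.
names = [f"file{i}.txt" for i in range(50_000, 0, -1)]

plain, plain_secs = time_sort(names)
natural, natural_secs = time_sort(names, key=natural_key)

print(f"plain sort:   {plain_secs:.4f} s, starts with {plain[:2]}")
print(f"natural sort: {natural_secs:.4f} s, starts with {natural[:2]}")
```

Timings vary by machine and data, so treat a single run as a rough indication and repeat the measurement on a sample of your real input.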

Conclusion

You now know how to use natsort to make string sorting behave the way humans expect. We covered natsorted() for common cases, the PATH algorithm for file system paths, natsort_keygen() for in-place sorting and complex key functions, case-insensitive and locale-aware sorting with ns flags, and a practical file report generator that groups and naturally sorts files by extension.

The key takeaway is simple: any time you are sorting strings that contain numbers -- filenames, version strings, log entries, chapter titles, IP addresses -- reach for natsort instead of writing a custom key function. It handles the edge cases (floats, negative numbers, leading zeros, locale) that custom one-liners miss.

See the natsort documentation for the full list of algorithm flags, pandas integration examples, and performance tips.