Pubs - Python How To Program

How To Work with ZIP Files in Python

Beginner

ZIP files are everywhere. Whether you’re downloading software, transferring files across the internet, or backing up critical data, you’ve almost certainly encountered a compressed archive. But what if you need to work with ZIP files programmatically? Python makes it surprisingly easy with the built-in zipfile module, which lets you create, read, extract, and modify ZIP archives directly from your code.

If you’ve ever felt intimidated by file compression or thought you needed external tools to handle archives, don’t worry. In this tutorial, we’ll walk you through everything you need to know. By the end, you’ll be able to create sophisticated backup systems, extract files on demand, apply password protection, and even compress data using different algorithms—all with clean, Pythonic code.

Here’s what we’ll cover: we’ll start with a quick example to see the module in action, then explore what ZIP files are and why they matter. We’ll build up from creating basic archives to handling complex scenarios like password-protected files and selective extraction. Finally, we’ll look at a real-world backup system and answer common questions you’ll encounter in production code.

Quick Example: Creating and Reading Your First ZIP File

Let’s jump straight in and see the zipfile module in action. This simple example creates a ZIP file containing a text file, then reads it back:

# quick_example.py
import zipfile

# Create a ZIP file and add content
with zipfile.ZipFile('archive.zip', 'w') as zf:
    zf.writestr('hello.txt', 'Hello from Python!')

# Read it back and print the contents
with zipfile.ZipFile('archive.zip', 'r') as zf:
    print(zf.read('hello.txt').decode('utf-8'))

Hello from Python!

See? In just a few lines, you’ve created a ZIP archive, added a file, and retrieved its contents. The with statement handles opening and closing the archive automatically, which keeps your code clean and prevents resource leaks. This pattern—using with for context management—will be your bread and butter when working with ZIP files.

What Are ZIP Files and Why Use Python?

ZIP is a widely supported archive format that combines file compression with a directory structure. Unlike raw compression formats (like GZIP), ZIP files are containers that can hold multiple files and folders while preserving their hierarchy and metadata. ZIP compression is lossless, meaning no data is lost during compression, and the format is supported natively on Windows, macOS, and Linux—no special software required.

You might ask: why not just use shell commands or GUI tools? Python offers several advantages. First, it lets you automate archival workflows inside your application. Second, you can process ZIP files without extracting them to disk, saving I/O overhead. Third, you get programmatic control over compression levels, passwords, and selective extraction. Fourth, your code becomes cross-platform instantly—the same script runs on any OS with Python.

Here’s how ZIP compares to other formats:

Format	Compression Ratio	Multiple Files	Directories	Password Support	Platform Support
ZIP	Good	Yes	Yes	Yes	Universal
TAR + GZIP	Excellent	Yes	Yes	No	Unix/Linux
7-Zip	Excellent	Yes	Yes	Yes	Limited
RAR	Good	Yes	Yes	Yes	Limited

ZIP strikes a sweet spot: it’s universally recognized, compresses reasonably well, and requires no external dependencies in Python. The standard library’s zipfile module gives you everything you need for most real-world scenarios.

Examining ZIP file contents in Python — When your compression algorithm is working perfectly.

Creating ZIP Files from Scratch

The most common task is creating a ZIP file from existing files on disk. The ZipFile class handles this elegantly. You instantiate it with a filename and a mode ('w' for write), then add files using write():

# create_archive.py
import zipfile
import os

# Create a ZIP file
with zipfile.ZipFile('my_archive.zip', 'w') as zf:
    zf.write('data.txt')
    zf.write('config.json')
    zf.write('README.md')

# Verify the contents
with zipfile.ZipFile('my_archive.zip', 'r') as zf:
    print('Files in archive:')
    for filename in zf.namelist():
        info = zf.getinfo(filename)
        print(f'  {filename} ({info.file_size} bytes)')

Files in archive:
  data.txt (142 bytes)
  config.json (89 bytes)
  README.md (256 bytes)

The namelist() method returns a list of all files in the archive, and getinfo() retrieves metadata like the original file size. Notice that the files are stored with their bare names—no directory paths. If you want to preserve directory structure, you need to be explicit about it:

# preserve_structure.py
import zipfile
import os

with zipfile.ZipFile('my_archive.zip', 'w') as zf:
    # Add files with their directory paths
    zf.write('src/main.py', arcname='src/main.py')
    zf.write('src/utils.py', arcname='src/utils.py')
    zf.write('data/config.txt', arcname='data/config.txt')

# Read and display structure
with zipfile.ZipFile('my_archive.zip', 'r') as zf:
    zf.printdir()

File Name                                             Modified             Size
src/main.py                                    2026-04-05 10:23:14       1024
src/utils.py                                   2026-04-05 10:23:14        512
data/config.txt                                2026-04-05 10:23:14        256

The arcname parameter sets the path inside the archive, allowing you to organize files hierarchically. You can also add entire directories recursively:

# add_directory.py
import zipfile
import os

def add_directory(zipf, directory_path, archive_path=''):
    """Recursively add a directory to the ZIP file"""
    for root, dirs, files in os.walk(directory_path):
        for file in files:
            file_path = os.path.join(root, file)
            arcname = os.path.join(archive_path, os.path.relpath(file_path, directory_path))
            zipf.write(file_path, arcname)

with zipfile.ZipFile('project.zip', 'w') as zf:
    add_directory(zf, 'my_project', 'my_project')

print(f'Created project.zip with {len(zf.namelist())} files')

Created project.zip with 47 files

Tangled up in ZIP file extraction — extractall() to a path you forgot to create. Classic.

Reading and Extracting ZIP Files

Once you have a ZIP file, you’ll need to read its contents and extract files. Python gives you fine-grained control over this process:

# read_archive.py
import zipfile

with zipfile.ZipFile('my_archive.zip', 'r') as zf:
    # Get list of all files
    all_files = zf.namelist()
    print(f'Total files: {len(all_files)}')

    # Read a specific file into memory
    content = zf.read('config.json')
    print(f'Config content type: {type(content)}')
    print(f'Config data: {content.decode("utf-8")}')

    # Get file info
    info = zf.getinfo('data.txt')
    print(f'Compressed size: {info.compress_size}')
    print(f'Uncompressed size: {info.file_size}')
    print(f'Compression ratio: {100 * info.compress_size / info.file_size:.1f}%')

Total files: 3
Config content type: 
Config data: {"setting": "value"}
Compressed size: 45
Uncompressed size: 89
Compression ratio: 50.6%

The read() method loads files into memory as bytes, which is efficient for small files but memory-intensive for large ones. For extracting all files to disk, use extractall():

# extract_all.py
import zipfile
import os

with zipfile.ZipFile('my_archive.zip', 'r') as zf:
    # Extract everything to a directory
    zf.extractall('output_folder')

# Verify extraction
for root, dirs, files in os.walk('output_folder'):
    for file in files:
        filepath = os.path.join(root, file)
        print(filepath)

output_folder/data.txt
output_folder/config.json
output_folder/README.md

For large files or streaming use cases, open() lets you read files as file-like objects without loading them entirely into memory:

# stream_large_file.py
import zipfile

with zipfile.ZipFile('archive.zip', 'r') as zf:
    # Open a file for streaming
    with zf.open('large_video.mp4') as f:
        # Process in chunks
        chunk_size = 8192
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Process chunk (e.g., write to disk, compute hash)
            print(f'Processed {len(chunk)} bytes')

Processed 8192 bytes
Processed 8192 bytes
Processed 7456 bytes

Adding Files to Existing Archives

Sometimes you need to add files to an archive that already exists. Use the 'a' (append) mode to open an existing ZIP file and add new content:

# append_to_archive.py
import zipfile
from datetime import datetime

# Create initial archive
with zipfile.ZipFile('log_archive.zip', 'w') as zf:
    zf.writestr('startup.log', 'Application started at 10:00 AM')

# Later, append new log data
with zipfile.ZipFile('log_archive.zip', 'a') as zf:
    timestamp = datetime.now().isoformat()
    zf.writestr('runtime.log', f'Running at {timestamp}')
    zf.writestr('shutdown.log', 'Application stopped at 11:30 AM')

# Verify all entries
with zipfile.ZipFile('log_archive.zip', 'r') as zf:
    for name in zf.namelist():
        print(name)

startup.log
runtime.log
shutdown.log

The writestr() method adds string content directly without needing a file on disk. This is perfect for generating content on the fly, such as logs, reports, or dynamically created data. You can also add binary data the same way:

# add_binary_content.py
import zipfile
import json

with zipfile.ZipFile('data.zip', 'w') as zf:
    # Add JSON data as a string
    user_data = {'name': 'Alice', 'role': 'Engineer', 'level': 5}
    zf.writestr('users.json', json.dumps(user_data, indent=2))

    # Add binary data
    binary_data = bytes([0x89, 0x50, 0x4E, 0x47])  # PNG header
    zf.writestr('image.bin', binary_data)

print('Archive created with mixed content types')

Archive created with mixed content types

Compressing files with Python zipfile — ZIP_DEFLATED vs ZIP_BZIP2 — pick your compression adventure.

Extracting Specific Files Without Extraction Spam

When working with large archives, extracting everything to disk can be wasteful. You might need only a single configuration file or a subset of data. The zipfile module lets you extract exactly what you need:

# selective_extraction.py
import zipfile

with zipfile.ZipFile('large_archive.zip', 'r') as zf:
    # Extract one file
    zf.extract('critical_config.json', path='configs')

    # Extract multiple specific files
    files_needed = ['user_list.csv', 'permissions.txt', 'system.log']
    for filename in files_needed:
        if filename in zf.namelist():
            zf.extract(filename, path='output')
        else:
            print(f'Warning: {filename} not found in archive')

print('Selective extraction complete')

Selective extraction complete

You can also check what files are in the archive before extracting, which is helpful for validating archives or building conditional logic:

# validate_and_extract.py
import zipfile
import sys

def is_safe_archive(zipf_path, max_files=1000, max_size_mb=500):
    """Validate archive before extraction"""
    with zipfile.ZipFile(zipf_path, 'r') as zf:
        # Check number of files
        if len(zf.namelist()) > max_files:
            return False, f'Archive contains too many files ({len(zf.namelist())})'

        # Check total uncompressed size (prevent zip bombs)
        total_size = sum(info.file_size for info in zf.infolist())
        if total_size > max_size_mb * 1024 * 1024:
            return False, f'Archive is too large ({total_size / (1024*1024):.1f} MB)'

        return True, 'Archive is safe'

# Validate before extracting
is_safe, message = is_safe_archive('archive.zip')
print(f'Validation: {message}')

if is_safe:
    with zipfile.ZipFile('archive.zip', 'r') as zf:
        zf.extractall('output')

Validation: Archive is safe

Working with Password-Protected ZIP Files

For sensitive data, ZIP archives can be encrypted with passwords. Python’s zipfile module supports reading encrypted archives and creating new ones with password protection:

# read_encrypted.py
import zipfile

# Read a password-protected archive
password = b'my_secret_password'

with zipfile.ZipFile('secure_archive.zip', 'r') as zf:
    # Set the password for the archive
    zf.setpassword(password)

    # Extract files (they'll be decrypted automatically)
    zf.extractall('secure_output')

    # Or read a specific file
    content = zf.read('secret.txt', pwd=password)
    print(content.decode('utf-8'))

This is a secret message

Important: Note that pwd must be bytes, not a string. Password protection in ZIP is not military-grade encryption—it’s suitable for casual protection but not for highly sensitive data. For maximum security, use the encryption parameter with the AES algorithm if your Python version supports it (Python 3.7+):

# create_encrypted.py
import zipfile

with zipfile.ZipFile('secure_archive.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    # Add files with password protection
    zf.setpassword(b'my_secret_password')
    zf.writestr('secret.txt', 'Confidential information')
    zf.write('important_document.pdf')

print('Password-protected archive created')

# To verify, try reading it back
with zipfile.ZipFile('secure_archive.zip', 'r') as zf:
    # Without password, listing works but reading fails
    print('Files in archive:', zf.namelist())

    try:
        content = zf.read('secret.txt')  # This will fail without password
    except RuntimeError as e:
        print(f'Expected error: {e}')

Password-protected archive created
Files in archive: ['secret.txt', 'important_document.pdf']
Expected error: Bad password for file 'secret.txt'

When creating password-protected archives, be aware that the default encryption method is quite basic. Newer versions support stronger AES-256 encryption, but this requires the pyminizip library for maximum compatibility. For production systems, consider encrypting sensitive data before zipping, or use an alternative format like encrypted containers.

Speed optimizing ZIP file operations — Your backup script runs in 0.3 seconds. Your coworker’s takes 45 minutes.

Choosing Compression Algorithms and Levels

The zipfile module supports multiple compression methods, each with different trade-offs between compression ratio and speed:

# compression_comparison.py
import zipfile
import os

test_file = 'large_data.txt'

# Create test data
with open(test_file, 'w') as f:
    f.write('The quick brown fox jumps over the lazy dog. ' * 10000)

original_size = os.path.getsize(test_file)

# Test different compression methods
methods = [
    (zipfile.ZIP_STORED, 'Stored (no compression)'),
    (zipfile.ZIP_DEFLATED, 'DEFLATE (default)')
]

results = []

for method, description in methods:
    archive_name = f'archive_{description.replace(" ", "_")}.zip'

    with zipfile.ZipFile(archive_name, 'w', method) as zf:
        zf.write(test_file)

    archive_size = os.path.getsize(archive_name)
    ratio = 100 * archive_size / original_size

    results.append({
        'method': description,
        'size': archive_size,
        'ratio': ratio
    })

    print(f'{description}: {archive_size} bytes ({ratio:.1f}%)')

# Cleanup
os.remove(test_file)

Stored (no compression): 458234 bytes (100.0%)
DEFLATE (default): 45823 bytes (10.0%)

The ZIP_DEFLATED method (the default) uses the DEFLATE algorithm, which offers excellent compression for text and code. ZIP_STORED adds no compression, useful only for files that are already compressed (like images) where re-compressing wastes CPU. You can control compression level when using DEFLATE:

# compression_level.py
import zipfile
import time

with open('test.txt', 'w') as f:
    f.write('Sample data. ' * 50000)

for level in [0, 1, 6, 9]:
    start = time.time()

    with zipfile.ZipFile(f'test_level_{level}.zip', 'w', zipfile.ZIP_DEFLATED, compresslevel=level) as zf:
        zf.write('test.txt')

    elapsed = time.time() - start
    size = os.path.getsize(f'test_level_{level}.zip')
    print(f'Level {level}: {size} bytes in {elapsed:.3f}s')

Level 0: 645234 bytes in 0.002s
Level 1: 89234 bytes in 0.015s
Level 6: 78923 bytes in 0.045s
Level 9: 78234 bytes in 0.089s

Higher levels give better compression but take longer. Level 6 is usually the sweet spot for production use—it offers 95% of the compression benefit with a fraction of the time cost.

Real-World Example: Building a Backup Manager

Let’s build a practical backup system that demonstrates multiple concepts together:

# backup_manager.py
import zipfile
import os
import json
from datetime import datetime
from pathlib import Path

class BackupManager:
    """Manages incremental backups with metadata tracking"""

    def __init__(self, backup_dir='./backups'):
        self.backup_dir = Path(backup_dir)
        self.backup_dir.mkdir(exist_ok=True)
        self.manifest_file = self.backup_dir / 'manifest.json'
        self.load_manifest()

    def load_manifest(self):
        """Load backup history"""
        if self.manifest_file.exists():
            with open(self.manifest_file, 'r') as f:
                self.manifest = json.load(f)
        else:
            self.manifest = {'backups': []}

    def save_manifest(self):
        """Save backup history"""
        with open(self.manifest_file, 'w') as f:
            json.dump(self.manifest, f, indent=2)

    def create_backup(self, source_dir, backup_name=None):
        """Create a new backup of the source directory"""
        if backup_name is None:
            timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
            backup_name = f'backup_{timestamp}'

        backup_path = self.backup_dir / f'{backup_name}.zip'
        file_count = 0
        total_size = 0

        with zipfile.ZipFile(backup_path, 'w', zipfile.ZIP_DEFLATED, compresslevel=6) as zf:
            for root, dirs, files in os.walk(source_dir):
                for file in files:
                    file_path = os.path.join(root, file)
                    arcname = os.path.relpath(file_path, source_dir)
                    zf.write(file_path, arcname)
                    file_count += 1
                    total_size += os.path.getsize(file_path)

        # Record in manifest
        backup_info = {
            'name': backup_name,
            'timestamp': datetime.now().isoformat(),
            'files': file_count,
            'uncompressed_size': total_size,
            'compressed_size': os.path.getsize(backup_path)
        }
        self.manifest['backups'].append(backup_info)
        self.save_manifest()

        return backup_path, backup_info

    def list_backups(self):
        """List all available backups"""
        for backup in self.manifest['backups']:
            ratio = 100 * backup['compressed_size'] / backup['uncompressed_size']
            print(f"{backup['name']}: {backup['files']} files, {ratio:.1f}% of original")

    def restore_backup(self, backup_name, restore_dir):
        """Restore a backup to a directory"""
        backup_path = self.backup_dir / f'{backup_name}.zip'

        if not backup_path.exists():
            raise FileNotFoundError(f'Backup {backup_name} not found')

        with zipfile.ZipFile(backup_path, 'r') as zf:
            zf.extractall(restore_dir)

        print(f'Restored {backup_name} to {restore_dir}')

# Usage example
if __name__ == '__main__':
    manager = BackupManager()

    # Create a backup
    backup_path, info = manager.create_backup('./my_project')
    print(f'Created backup: {backup_path}')
    print(f'Files: {info["files"]}, Compression: {100 * info["compressed_size"] / info["uncompressed_size"]:.1f}%')

    # List all backups
    manager.list_backups()

    # Restore if needed
    # manager.restore_backup('backup_20260405_143022', './restored_project')

Created backup: ./backups/backup_20260405_143022.zip
Files: 47, Compression: 23.4%
backup_20260405_143022: 47 files, 23.4% of original

This backup manager demonstrates several key techniques: directory traversal with os.walk(), metadata tracking with JSON, timestamp-based naming, compression statistics, and restoration capabilities. You can extend it further with incremental backups (only backing up files that changed), multiple backup retention policies, or automatic scheduled backups using the schedule library.

Building a backup system with ZIP files — A whole backup system in under 40 lines. DevOps just felt a disturbance.

Frequently Asked Questions

What’s a ZIP bomb and how do I protect against it?

A ZIP bomb is a malicious archive that expands to enormous size when extracted, potentially consuming all available disk space. For example, a 45 MB file might decompress to 45 GB. Protect yourself by validating archives before extraction: check the uncompressed size against available disk space, limit the number of files, and use timeouts for extraction operations. The validate_and_extract.py example earlier demonstrates this approach.

Does the zipfile module handle symbolic links?

The zipfile module doesn’t preserve symbolic links by default—it follows them and backs up the actual files. If you need to preserve symlink information, you’ll need a different approach, such as using the tarfile module (which natively supports symlinks) or custom code that stores symlink metadata separately in the archive.

How do I handle very large files (multi-GB)?

For large files, use the streaming approach with zf.open() to read files in chunks without loading them entirely into memory. When creating archives, avoid read() in memory and instead use write() directly from disk. For extremely large archives, consider splitting them into multiple ZIP files or using tar+gzip instead.

Can I modify files inside a ZIP without re-creating it?

The zipfile module doesn’t support in-place modification of individual files. To modify a file, you must create a new archive, copy over unchanged files, and write the modified file. Alternatively, extract everything, make changes, and re-create the archive. This is a limitation of the ZIP format itself.

How do I ensure archives are portable across Windows, macOS, and Linux?

Use forward slashes in archive paths (even on Windows), avoid characters that are illegal on some filesystems (like colons), normalize line endings in text files, and store file permissions with external_attr if needed. The code examples in this tutorial use os.path and os.walk(), which handle platform differences automatically.

Conclusion

You now have a complete toolkit for working with ZIP files in Python. From creating simple archives to building sophisticated backup systems, the zipfile module handles everything without requiring external dependencies. Remember the key patterns: use with statements for resource safety, validate archives before extraction, stream large files to conserve memory, and choose compression levels based on your speed/size trade-offs.

For deeper dives, check the official Python zipfile documentation, which includes advanced features like comment handling, timestamp preservation, and cross-archive operations.

How To Use uv: The Fast Python Package Manager

by Pubs | Beginner, Management

Beginner

Python package management is one of the most critical parts of Python development. Whether you’re installing libraries, managing dependencies, or creating reproducible environments, you need a reliable package manager. For years, pip has been the de facto standard, but it’s slow, fragmented, and sometimes frustrating to use. Enter uv—a blazing-fast Python package manager written in Rust that replaces pip, virtualenv, and poetry with a single, unified tool.

In this comprehensive guide, we’ll explore uv from the ground up. You’ll learn how to install it, use it to manage projects and dependencies, understand how it differs from traditional tools, and discover why developers are rapidly adopting it. By the end, you’ll understand why uv is being called “the next-generation Python package manager.”

What is uv?

uv is a modern Python package manager that’s designed to be ridiculously fast. Created by Astral Software, the makers of Ruff (the Python linter you might already be using), uv combines the functionality of pip, virtualenv, and pyenv into one cohesive tool. But that’s not the main selling point—the main point is speed.

Here’s what uv is NOT: it’s not a replacement for pip that works the same but faster. It’s a rethinking of what a Python package manager should be. It’s designed from scratch using Rust, with modern parallelization, caching, and optimization.

Why Choose uv?

10-100x faster: Installation is dramatically faster due to Rust performance and parallel downloads
Single tool: Replaces pip, virtualenv, and pyenv—no more context switching
Dependency resolution: Lightning-fast conflict detection and resolution
Cross-platform: Works on Windows, macOS, and Linux without modification
Built for modern Python: Designed with Python 3.8+ in mind from the start
Zero configuration needed: Works out of the box with sensible defaults

Installing uv

Cache Katie pressing a power button with lightning for uv installation — One curl command and you are done. pip never saw it coming.

Installing uv is incredibly simple. On macOS or Linux, just run:

curl -LsSf https://astral.sh/uv/install.sh | sh

On Windows, use:

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

That’s it. uv is now installed and ready to use.

Basic uv Commands

Pyro Pete catching package boxes on a fast conveyor belt — uv installs packages faster than you can alt-tab to check if it finished.

Creating a New Project

To create a new Python project with uv, simply run:

uv init my_project

This creates a new directory with a basic project structure:

my_project/
├── .python-version      # Python version specification
├── pyproject.toml       # Project configuration
└── src/
    └── my_project/
        └── __init__.py

Adding Dependencies

To add a package to your project:

uv add requests

This automatically:

Resolves the dependency
Installs it
Updates your pyproject.toml
Creates a uv.lock file for reproducibility

Installing from pyproject.toml

To install all dependencies from your pyproject.toml:

uv sync

This ensures exact version matching for reproducibility.

Running Python Scripts

With uv, you don’t need to manually activate virtual environments:

uv run python script.py

uv automatically creates and uses the appropriate environment.

Advanced Usage

Sudo Sam conducting an orchestra of interlocking gears — Virtual environments, Python versions, lockfiles — uv handles them all without breaking a sweat.

Python Version Management

uv can automatically manage Python versions. To use a specific Python version in your project:

uv init --python 3.11

To list available Python versions:

uv python list

Working with Virtual Environments

Create an environment explicitly:

uv venv

Activate it like you normally would:

source .venv/bin/activate  # On Windows: .venvScriptsactivate

Pre-Release and Development Versions

To include pre-release versions in dependency resolution:

uv add --pre package_name

Comparing with Pip and Poetry

Here’s how uv stacks up against traditional tools:

Feature	uv	pip	poetry
Installation Speed	10-100x faster	Baseline	2-3x faster than pip
Single Tool	Yes	No (+ virtualenv + pip)	Yes
Lock File	Yes (uv.lock)	No (requires pip-tools)	Yes (poetry.lock)
Ease of Use	Very Easy	Moderate	Very Easy
Performance	Excellent	Good	Good
Python Version Management	Built-in	Requires pyenv	Requires pyenv

Real-World Example: Setting Up a Data Science Project

API Alex presenting a holographic project structure — From zero to working data science project in under a minute. Your old setup script is crying.

Here’s how you’d set up a complete data science project with uv:

# Create the project
uv init data_science_project

# Enter the directory
cd data_science_project

# Add scientific computing dependencies
uv add numpy pandas scikit-learn jupyter matplotlib

# Add development dependencies (optional)
uv add --dev pytest pytest-cov black

# Run Jupyter notebooks
uv run jupyter notebook

# Run tests
uv run pytest

Notice how simple that is? No manual environment activation, no separate commands for different tools. Everything flows naturally.

Frequently Asked Questions

Is uv production-ready?

Absolutely. While it’s relatively new, it’s being used in production by many organizations. The Astral team is committed to stability, and it continues to improve with every release.

Will uv replace pip?

Eventually, yes. Many Python developers are switching to uv. However, pip will likely remain the standard for a while. The Python ecosystem moves slowly, and that’s a good thing.

Can I use uv alongside pip?

You shouldn’t mix package managers in the same environment, but you can use uv for some projects and pip for others.

What about compatibility?

uv is compatible with PyPI and all standard Python packages. There’s no special “uv-only” ecosystem—it works with everything pip does.

Conclusion

uv represents the future of Python package management. It’s fast, simple, and incredibly well-designed. Whether you’re building a small script, a data science project, or a large production application, uv makes package management feel effortless. If you haven’t tried it yet, I highly recommend giving it a shot. Your development workflow will thank you.

Key Takeaways:

uv is a faster, more unified replacement for pip, virtualenv, and pyenv
Installation is a single command
Project setup and dependency management are incredibly straightforward
It’s production-ready and actively maintained
Making the switch is risk-free—it’s fully compatible with the existing Python ecosystem

How To Use Pydantic V2 for Data Validation in Python

by Pubs | Data Processing, Intermediate

Intermediate

Data validation is one of the most critical tasks in modern Python applications. Whether you’re building APIs, processing user input, or managing configurations, ensuring that your data meets expected requirements can prevent bugs, security vulnerabilities, and runtime errors before they spiral into production incidents. Pydantic V2 brings powerful, developer-friendly validation tools that have become industry standard for Python developers worldwide.

If you’ve struggled with manual validation logic, complex error handling, or maintaining consistency across your codebase, Pydantic V2 transforms that experience. The library handles the heavy lifting: type checking, constraint validation, nested data structures, and serialization. You focus on business logic while Pydantic ensures your data is always clean and validated.

In this comprehensive guide, we’ll explore everything you need to master Pydantic V2. We’ll start with quick practical examples, understand the evolution from V1 to V2, dive into core concepts like BaseModel and field validators, and build a real-world API validator. By the end, you’ll confidently use Pydantic to build robust, maintainable Python applications with rock-solid data validation.

Quick Example

Let’s see Pydantic V2 in action before diving deep. This example shows how quickly you can define a validated data model:

# filename: quick_start.py
from pydantic import BaseModel, EmailStr, Field
from typing import Optional

class UserProfile(BaseModel):
    username: str = Field(min_length=3, max_length=50)
    email: EmailStr
    age: int = Field(ge=0, le=150)
    bio: Optional[str] = None

# Valid data - creates instance successfully
user = UserProfile(
    username="alice_dev",
    email="alice@example.com",
    age=28
)
print(user)

# Invalid data - raises ValidationError
try:
    invalid_user = UserProfile(
        username="ab",
        email="not-an-email",
        age=200
    )
except Exception as e:
    print(f"Validation failed: {e}")

username='alice_dev' email='alice@example.com' age=28 bio=None

In just a few lines, Pydantic V2 validates email format, enforces string length constraints, and checks numeric ranges. Try passing invalid data—the library immediately raises detailed validation errors that pinpoint exactly what went wrong.

Data validation filter with Pydantic — Your API accepts anything. Pydantic doesn’t. Guess who’s right.

What is Pydantic V2?

Pydantic is a Python library that provides runtime data validation and settings management using Python type hints. Version 2 represents a major evolution from V1, introducing a powerful new validation engine rewritten in Rust for dramatic performance improvements while maintaining the same intuitive Python API developers love.

Pydantic V2 shines in several key areas. First, it validates data before it reaches your application logic—catching errors early prevents cascading bugs. Second, it automatically serializes validated data to JSON, dictionaries, or other formats. Third, it provides clear, actionable error messages that help developers fix issues quickly. Finally, it integrates seamlessly with modern Python frameworks like FastAPI, SQLModel, and others.

Here’s how Pydantic V1 and V2 compare across key features:

Feature	Pydantic V1	Pydantic V2
Validation Engine	Pure Python	Rust-based (pydantic-core)
Field Validators	`@validator`	`@field_validator`
Model Validators	`@root_validator`	`@model_validator`
Performance	Good	5-50x faster
JSON Schema	Supported	Improved and standardized
Serialization	dict()	model_dump() with mode options
Config Style	Inner Config class	model_config attribute

The migration from V1 to V2 requires some syntax updates, but the core concepts remain familiar. Validators have new names, configuration is more explicit, and the overall developer experience is smoother.

BaseModel Basics

Every Pydantic model inherits from BaseModel, which provides validation and serialization out of the box. Let’s explore the fundamentals:

# filename: basemodel_intro.py
from pydantic import BaseModel, Field
from datetime import datetime
from typing import Optional

class Product(BaseModel):
    name: str
    price: float
    stock: int
    created_at: datetime = Field(default_factory=datetime.now)
    description: Optional[str] = None

product_data = {
    "name": "Wireless Headphones",
    "price": 79.99,
    "stock": 150,
    "description": "Noise-cancelling Bluetooth headphones"
}

product = Product(**product_data)
print(product)
print(f"Product: {product.name}, Price: ${product.price}")
print(product.model_dump())

name='Wireless Headphones' price=79.99 stock=150 created_at=datetime.datetime(2026, 4, 5, ...) description='Noise-cancelling Bluetooth headphones'
Product: Wireless Headphones, Price: $79.99
{'name': 'Wireless Headphones', 'price': 79.99, 'stock': 150, 'created_at': datetime.datetime(...), 'description': 'Noise-cancelling Bluetooth headphones'}

BaseModel automatically parses and validates incoming data. Type hints tell Pydantic what data types to expect. The Optional[str] syntax means the field can be a string or None. The Field(default_factory=...) creates a new timestamp whenever you instantiate a model, rather than reusing the same timestamp.

Sorting validation errors in Pydantic — ValidationError: 3 errors. All in the same dict. All your fault.

Field Types and Constraints

Pydantic’s Field function lets you add validation constraints to individual fields. This is where data validation becomes powerful and explicit:

# filename: field_constraints.py
from pydantic import BaseModel, Field, HttpUrl
from typing import Annotated
from datetime import date

class BlogPost(BaseModel):
    title: Annotated[str, Field(min_length=5, max_length=200)]
    slug: str = Field(pattern=r'^[a-z0-9\-]+$')
    content: str = Field(min_length=100)
    tags: list[str] = Field(min_length=1, max_length=10)
    published: bool = Field(default=False)
    published_date: Annotated[date, Field(title="Publication Date")]
    view_count: int = Field(ge=0)
    rating: float = Field(ge=0.0, le=5.0)
    source_url: HttpUrl

post = BlogPost(
    title="Mastering Pydantic Validation",
    slug="mastering-pydantic-validation",
    content="A" * 100,
    tags=["python", "validation", "pydantic"],
    published_date=date.today(),
    view_count=1250,
    rating=4.8,
    source_url="https://example.com/blog/pydantic"
)
print("Post created successfully")

try:
    invalid_post = BlogPost(
        title="Hi",
        slug="invalid slug",
        content="Short",
        tags=[],
        published_date=date.today(),
        view_count=-5,
        rating=6.5,
        source_url="not-a-url"
    )
except Exception as e:
    print(f"Validation errors:\n{e}")

Post created successfully
Validation errors:
8 validation errors for BlogPost
...

Field parameters like min_length, max_length, pattern, ge (greater or equal), and le (less or equal) enforce constraints at the field level. The Annotated syntax from the typing module provides type hints along with metadata. Specialized types like HttpUrl validate format automatically—pass an invalid URL and Pydantic rejects it immediately.

Field Validators

Sometimes constraints alone aren’t enough. Custom validation logic handles complex business rules, cross-field dependencies, and domain-specific requirements. In Pydantic V2, use the @field_validator decorator:

# filename: field_validator_example.py
from pydantic import BaseModel, field_validator
from datetime import datetime

class EventRegistration(BaseModel):
    attendee_name: str
    email: str
    event_date: datetime
    ticket_type: str

    @field_validator('attendee_name')
    @classmethod
    def validate_name(cls, value: str) -> str:
        if not value.strip():
            raise ValueError('Name cannot be empty')
        return value.strip().title()

    @field_validator('email')
    @classmethod
    def validate_email_domain(cls, value: str) -> str:
        if not value.endswith(('.com', '.org', '.edu', '.co.uk')):
            raise ValueError('Email domain not recognized')
        return value.lower()

    @field_validator('ticket_type')
    @classmethod
    def validate_ticket_type(cls, value: str) -> str:
        allowed = ['standard', 'vip', 'student']
        if value.lower() not in allowed:
            raise ValueError(f'Ticket type must be one of {allowed}')
        return value.lower()

registration = EventRegistration(
    attendee_name="  john doe  ",
    email="john.doe@example.com",
    event_date=datetime(2026, 6, 15),
    ticket_type="VIP"
)
print(f"Registered: {registration.attendee_name}")

Registered: John Doe

The @field_validator decorator transforms individual field values. Return the cleaned/validated value, or raise a ValueError with a descriptive message. Validators run after Pydantic’s built-in type checking, so you know the value has the correct type already.

Building nested Pydantic models — Nested models inside nested models. It’s validators all the way down.

Model Validators

When you need to validate relationships between multiple fields, use @model_validator. This runs after all fields are validated individually:

# filename: model_validator_example.py
from pydantic import BaseModel, model_validator, Field
from datetime import datetime, timedelta
from typing import Self

class ProjectTimeline(BaseModel):
    project_name: str
    start_date: datetime
    end_date: datetime
    milestone_date: datetime

    @model_validator(mode='after')
    def validate_timeline_order(self) -> Self:
        if self.start_date >= self.end_date:
            raise ValueError('End date must be after start date')
        if not (self.start_date <= self.milestone_date <= self.end_date):
            raise ValueError('Milestone must fall between start and end dates')
        return self

    @model_validator(mode='after')
    def validate_project_duration(self) -> Self:
        duration = self.end_date - self.start_date
        if duration < timedelta(days=1):
            raise ValueError('Project must span at least 1 day')
        return self

timeline = ProjectTimeline(
    project_name="Website Redesign",
    start_date=datetime(2026, 5, 1),
    milestone_date=datetime(2026, 6, 15),
    end_date=datetime(2026, 7, 31)
)
print(f"Project {timeline.project_name}: {timeline.start_date.date()} to {timeline.end_date.date()}")

Project Website Redesign: 2026-05-01 to 2026-07-31

@model_validator(mode='after') receives the entire model instance and validates cross-field relationships. Use mode='after' to validate after all fields are processed. Return self to confirm the model is valid, or raise ValueError with details about what failed.

Nested Models

Real applications have hierarchical data. Pydantic handles nested models effortlessly—validation cascades through all levels:

# filename: nested_models.py
from pydantic import BaseModel, EmailStr, Field

class Address(BaseModel):
    street: str
    city: str
    state: str
    postal_code: str = Field(pattern=r'^\d{5}(-\d{4})?$')
    country: str = "USA"

class ContactInfo(BaseModel):
    email: EmailStr
    phone: str = Field(pattern=r'^\+?1?\d{9,15}$')
    website: str = None

class Company(BaseModel):
    company_name: str
    headquarters: Address
    contact: ContactInfo
    employees: list[str] = Field(min_length=1)
    founded_year: int = Field(ge=1800, le=2026)

company_data = {
    "company_name": "TechCorp Industries",
    "headquarters": {
        "street": "123 Innovation Drive",
        "city": "San Francisco",
        "state": "CA",
        "postal_code": "94103",
        "country": "USA"
    },
    "contact": {
        "email": "info@techcorp.com",
        "phone": "+14155551234"
    },
    "employees": ["Alice Chen", "Bob Martinez", "Carol Williams"],
    "founded_year": 2010
}

company = Company(**company_data)
print(f"{company.company_name} is located in {company.headquarters.city}")
print(f"Contact: {company.contact.email}")

TechCorp Industries is located in San Francisco
Contact: info@techcorp.com

Nest models by using them as field types. Pydantic automatically validates nested data recursively. If a nested model fails validation, the parent model fails too, with error messages indicating exactly which nested field caused the problem.

Pydantic V2 speed validation performance — V2's Rust core validates faster than you can type the wrong type.

JSON Serialization and Configuration

Validated data needs to leave your Python application. Pydantic V2's model_dump() and model_dump_json() methods handle serialization with fine-grained control:

# filename: json_serialization.py
from pydantic import BaseModel, ConfigDict, Field
from datetime import datetime
from typing import Literal
import json

class APIResponse(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,
        validate_default=True,
        use_enum_values=True
    )

    request_id: str = Field(default_factory=lambda: "REQ-12345")
    status: Literal['success', 'error', 'pending']
    timestamp: datetime = Field(default_factory=datetime.now)
    message: str
    data: dict | None = None

response = APIResponse(
    status="success",
    message="  User created successfully  ",
    data={"user_id": 101, "username": "alice_dev"}
)

response_dict = response.model_dump(mode='json', exclude_none=True)
print("Dictionary:", response_dict)

response_json = response.model_dump_json(indent=2)
print("JSON String:", response_json)

json_string = '{"status":"success","message":"Test","timestamp":"2026-04-05T10:30:00"}'
parsed = APIResponse.model_validate_json(json_string)
print(f"Parsed from JSON: {parsed.status}")

Dictionary: {'request_id': 'REQ-12345', 'status': 'success', ...}
JSON String: {
  "request_id": "REQ-12345",
  "status": "success",
  ...
}
Parsed from JSON: success

The model_config attribute controls global model behavior. model_dump() converts to a dictionary with options like exclude_none to skip null fields and mode='json' to use JSON-compatible types. model_dump_json() creates a JSON string directly. model_validate_json() parses JSON back into a model instance.

Real-Life Example: API Request Validator

Let's build a practical example: validating incoming API requests for a user management system. This demonstrates multiple validation techniques working together:

# filename: api_validator.py
from pydantic import BaseModel, EmailStr, field_validator, model_validator, Field
from datetime import date
from typing import Self
from enum import Enum

class UserRole(str, Enum):
    admin = "admin"
    moderator = "moderator"
    user = "user"

class UserCreateRequest(BaseModel):
    username: str = Field(min_length=3, max_length=30, pattern=r'^[a-zA-Z0-9_]+$')
    email: EmailStr
    password: str = Field(min_length=8, max_length=128)
    first_name: str = Field(min_length=1, max_length=50)
    last_name: str = Field(min_length=1, max_length=50)
    birth_date: date
    role: UserRole = UserRole.user
    accepted_terms: bool = Field(default=False)

    @field_validator('username')
    @classmethod
    def username_unique(cls, value: str) -> str:
        reserved_names = ['admin', 'root', 'system', 'api']
        if value.lower() in reserved_names:
            raise ValueError(f'Username {value} is reserved')
        return value

    @field_validator('password')
    @classmethod
    def password_strength(cls, value: str) -> str:
        has_upper = any(c.isupper() for c in value)
        has_lower = any(c.islower() for c in value)
        has_digit = any(c.isdigit() for c in value)
        if not (has_upper and has_lower and has_digit):
            raise ValueError('Password must contain uppercase, lowercase, and digit')
        return value

    @model_validator(mode='after')
    def validate_age(self) -> Self:
        today = date.today()
        age = today.year - self.birth_date.year
        if age < 13:
            raise ValueError('Users must be at least 13 years old')
        return self

    @model_validator(mode='after')
    def validate_terms_acceptance(self) -> Self:
        if self.role == UserRole.admin and not self.accepted_terms:
            raise ValueError('Administrators must accept terms of service')
        return self

valid_request = UserCreateRequest(
    username="john_developer",
    email="john@example.com",
    password="SecurePass123",
    first_name="John",
    last_name="Developer",
    birth_date=date(1995, 5, 15),
    role=UserRole.user,
    accepted_terms=True
)
print(f"Request validated: {valid_request.username}")

try:
    invalid_request = UserCreateRequest(
        username="jane_dev",
        email="jane@example.com",
        password="password",
        first_name="Jane",
        last_name="Developer",
        birth_date=date(2020, 1, 1),
        role=UserRole.admin,
        accepted_terms=False
    )
except Exception as e:
    print(f"Request validation failed:\n{e}")

Request validated: john_developer
Request validation failed:
3 validation errors for UserCreateRequest
...

This example combines field validators for individual constraints, model validators for cross-field logic, custom types for enums, and field constraints. A real API endpoint would accept JSON, parse it into UserCreateRequest, and only proceed if validation succeeds.

Fitting types with Pydantic validation — Square peg, round hole. Pydantic would've caught this at line 1.

Frequently Asked Questions

How do I migrate from Pydantic V1 to V2?

The core concepts remain similar, but syntax changes. The main differences: @validator becomes @field_validator, @root_validator becomes @model_validator, and the inner Config class becomes model_config. Field constraint syntax stays mostly the same. Pydantic provides a migration guide in their documentation with detailed examples for each change.

Is Pydantic V2 faster than V1?

Yes, significantly. Pydantic V2 uses a Rust-based validation engine (pydantic-core) that is typically 5-50x faster than V1, depending on the validation complexity. Even simple validation tasks benefit from the optimized engine. This performance improvement makes Pydantic suitable for high-throughput applications.

How do I write custom validators for complex logic?

Use @field_validator for single-field logic and @model_validator for cross-field logic. For more complex transformations, use Annotated with BeforeValidator or AfterValidator. You can also create custom types by defining validator functions and combining them with Annotated.

How do I control JSON serialization in Pydantic V2?

Use model_dump_json() with options like exclude_none to skip null values, exclude to skip specific fields, or include to select specific fields. The model_config attribute controls global serialization behavior. Use Field(serialization_alias=...) to rename fields in serialized output.

How do I use Pydantic with databases?

Define Pydantic models for request/response validation separate from database ORM models. Use models like UserCreateRequest for API input, validate it with Pydantic, then convert to ORM objects for database storage. Alternatively, use libraries like SQLModel that integrate Pydantic with SQLAlchemy ORM.

How do I customize validation error messages?

Pass a ValueError with a custom message from validator functions. For field constraints, use the description parameter in Field(). Pydantic's error response includes field names, error types, and custom messages. In API applications, format these errors into user-friendly JSON responses.

Conclusion

Pydantic V2 is the modern Python developer's answer to data validation. Whether you're building APIs, processing configurations, or managing complex data structures, Pydantic handles validation elegantly, performs efficiently, and provides crystal-clear error messages when things go wrong.

Start with simple models and Field constraints, progress to custom validators for business logic, and scale to nested models for complex applications. The investment in learning Pydantic pays immediate dividends in code quality, fewer bugs, and faster development cycles.

For comprehensive documentation, examples, and community support, visit the official Pydantic documentation.

Building REST APIs with FastAPI and Pydantic
Advanced Data Validation Techniques in Python
Securing Python Applications with Input Validation
Database ORM Integration with SQLModel
Testing Pydantic Models: Unit Tests and Integration Tests

How To Build a REST API with FastAPI in Python

by Pubs | APIs, Intermediate

Intermediate

Building modern web applications often requires a robust API layer, and FastAPI has emerged as one of the most powerful and developer-friendly frameworks for this task. If you’ve been working with Flask or Django and found yourself wanting something faster, more intuitive, and built specifically for modern Python, FastAPI is exactly what you’ve been waiting for. In this tutorial, we’ll walk through creating production-ready REST APIs from scratch, complete with validation, error handling, and real-world examples.

You don’t need to be an API expert to follow along. We’ll start with a simple “Hello World” endpoint and progressively build toward a complete CRUD application. By the end of this guide, you’ll understand how to design endpoints, validate user input with Pydantic, structure your code professionally, and handle errors gracefully.

Here’s what we’ll cover: we’ll install FastAPI and Uvicorn, create our first endpoint, explore path and query parameters, implement request validation using Pydantic models, build a complete CRUD Todo API, handle errors properly, define response models, and finally answer the most common questions developers ask when getting started. Let’s build something real.

Quick Example: Your First FastAPI Application

Before we dive into the details, here’s a fully functional FastAPI application that you can run right now. This will give you a taste of how simple and elegant FastAPI can be:

# main.py
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Item(BaseModel):
    name: str
    description: str = None
    price: float

@app.get("/")
def read_root():
    return {"message": "Hello, FastAPI!"}

@app.get("/items/{item_id}")
def read_item(item_id: int, query_param: str = None):
    return {"item_id": item_id, "query_param": query_param}

@app.post("/items/")
def create_item(item: Item):
    return {"created_item": item}

Output: When you run this with uvicorn main.py:app --reload, you’ll have a fully functional API server with automatic documentation at http://localhost:8000/docs. This is the magic of FastAPI–your code is simultaneously your documentation.

API Alice unpacking a toolbox for FastAPI setup — pip install fastapi uvicorn. Three words between you and an API that actually works.

What is FastAPI and Why Use It?

FastAPI is a modern, fast web framework for building APIs with Python 3.6+. Created by Sebastian Ramirez, it combines the simplicity of Flask with the power of Django REST Framework, while being incredibly performant. FastAPI automatically generates interactive API documentation (Swagger UI and ReDoc), validates request data using Pydantic models, and provides type hints that make your code self-documenting.

The “fast” in FastAPI refers to both development speed and runtime performance. Development is fast because you write less boilerplate code and get built-in validation. Runtime is fast because it’s built on top of Starlette and uses async/await natively. Unlike older frameworks, FastAPI was designed from the ground up for modern Python async programming.

Here’s how FastAPI compares to other popular frameworks:

Feature	FastAPI	Flask	Django REST
Setup Complexity	Minimal	Minimal	Moderate
Built-in Validation	Yes (Pydantic)	No	Yes
Async Support	Native	Limited	Limited
Auto Documentation	Yes	No	Optional
Performance	Very High	High	High
Learning Curve	Low	Very Low	Moderate

FastAPI wins when you need a modern, high-performance API with minimal setup. It’s particularly valuable for microservices, data science applications, and any project where you want to move fast without sacrificing code quality.

Sudo Sam studying an API architecture blueprint — A well-designed route is like a good map. Nobody gets lost, and everybody finds the data.

Installing FastAPI and Uvicorn

FastAPI is just a framework–it needs a web server to run. The most common choice is Uvicorn, an ASGI server that’s fast, simple, and perfect for development and production. Let’s get everything installed.

First, make sure you have Python 3.7 or higher installed. You can check your version with python --version. Then, create a virtual environment to keep your project dependencies isolated:

# Setup (bash/zsh)
python -m venv fastapi_env
source fastapi_env/bin/activate  # On Windows: fastapi_env\Scripts\activate

# Install FastAPI and Uvicorn
pip install fastapi uvicorn[standard]

Output: The installation will complete in seconds. The [standard] extras include additional features like WebSocket support and automatic reload.

That’s it! You now have everything needed to build professional REST APIs. The next step is to create your first application file and start writing endpoints.

Creating Your First Endpoint

An endpoint is a URL on your API that clients can request. FastAPI uses Python decorators to define endpoints in a way that’s both elegant and powerful. Let’s create a simple application with a few endpoints:

# app.py
from fastapi import FastAPI

app = FastAPI(
    title="My First API",
    description="A simple API to learn FastAPI",
    version="1.0.0"
)

@app.get("/")
def read_root():
    """This is the root endpoint"""
    return {"message": "Welcome to FastAPI!"}

@app.get("/about")
def read_about():
    """Learn more about this API"""
    return {
        "app_name": "My First API",
        "version": "1.0.0",
        "author": "Your Name"
    }

@app.post("/data")
def receive_data(data: str):
    """Receive data from client"""
    return {"received": data}

Output: Save this as app.py and run uvicorn app:app --reload. Visit http://localhost:8000 in your browser and you’ll see the welcome message. Then visit http://localhost:8000/docs to see the interactive API documentation that FastAPI generates automatically.

The --reload flag makes Uvicorn automatically restart when you change your code–perfect for development. Each decorator specifies the HTTP method (@app.get, @app.post) and the path. The function name doesn’t matter to the API, but descriptive names help readability. The docstring automatically becomes the endpoint description in the documentation.

Debug Dee shattering a red X symbol for data validation — Pydantic said your request body was wrong. Pydantic is always right.

Path Parameters and Query Parameters

APIs need flexibility to handle different requests. Path parameters are part of the URL itself (like /users/123), while query parameters come after a question mark (like /users?page=1&limit=10). FastAPI makes both incredibly easy to implement.

Path parameters are the most common way to identify a specific resource. When you want users to retrieve a specific item by ID, you use a path parameter:

# parameters.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/users/{user_id}")
def get_user(user_id: int):
    """Get a specific user by ID"""
    return {"user_id": user_id, "name": f"User {user_id}"}

@app.get("/posts/{post_id}/comments/{comment_id}")
def get_comment(post_id: int, comment_id: int):
    """Get a specific comment on a specific post"""
    return {
        "post_id": post_id,
        "comment_id": comment_id,
        "text": "This is a comment"
    }

Output: When you request /users/42, the API returns {"user_id": 42, "name": "User 42"}. FastAPI automatically extracts the user_id from the URL and validates that it’s an integer. If someone sends /users/abc, FastAPI automatically returns a 422 validation error without any extra code from you.

Query parameters are optional filtering and pagination options. They appear after a question mark in the URL:

# query_params.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/search")
def search(
    query: str,
    page: int = 1,
    limit: int = 10,
    sort_by: str = "relevance"
):
    """Search with pagination and sorting"""
    return {
        "query": query,
        "page": page,
        "limit": limit,
        "sort_by": sort_by,
        "results": []
    }

@app.get("/products")
def list_products(
    category: str = None,
    min_price: float = 0,
    max_price: float = 1000,
    in_stock: bool = True
):
    """List products with optional filters"""
    return {
        "category": category,
        "price_range": {"min": min_price, "max": max_price},
        "in_stock": in_stock,
        "products": []
    }

Output: A request to /search?query=python&page=2&limit=20 will parse all three parameters correctly. Query parameters with default values are optional–you can call /search?query=python and page and limit will use their defaults. This is how you build flexible, user-friendly APIs.

Request Body with Pydantic Models

When clients need to send complex data to your API (like creating a new user or updating a product), you use request bodies. This is where Pydantic models become invaluable. Pydantic automatically validates the incoming data against your model definition, converts types, and provides helpful error messages if something is wrong.

A Pydantic model is a Python class that defines the structure of your data. Here’s how to use them for request bodies:

# models.py
from fastapi import FastAPI
from pydantic import BaseModel, Field
from typing import Optional

app = FastAPI()

class User(BaseModel):
    username: str
    email: str
    full_name: Optional[str] = None
    age: Optional[int] = None

class Product(BaseModel):
    name: str = Field(..., min_length=1, max_length=100)
    description: str = Field(default="", max_length=500)
    price: float = Field(..., gt=0, le=999999)
    stock: int = Field(default=0, ge=0)
    tags: list = Field(default_factory=list)

@app.post("/users")
def create_user(user: User):
    """Create a new user with validation"""
    return {
        "message": f"User {user.username} created successfully",
        "user": user
    }

@app.post("/products")
def create_product(product: Product):
    """Create a new product with detailed validation"""
    return {
        "message": f"Product '{product.name}' created",
        "product": product
    }

@app.put("/users/{user_id}")
def update_user(user_id: int, user: User):
    """Update a user by ID"""
    return {
        "user_id": user_id,
        "message": "User updated",
        "user": user
    }

Output: When you send this JSON to the /products endpoint:

{
  "name": "Python Book",
  "price": 29.99,
  "stock": 50,
  "tags": ["programming", "python", "learning"]
}

FastAPI validates that the price is positive, the stock is non-negative, and the name exists and is between 1 and 100 characters. If you send "price": -10, it rejects it with a clear error message. This validation is automatic–you don’t write any custom validation code. The Field function provides fine-grained control: gt=0 means “greater than zero”, le=999999 means “less than or equal to”, min_length and max_length validate string length.

Pyro Pete juggling CRUD operation orbs — Create, Read, Update, Delete. Four operations, infinite ways to mess up the fifth one you invent.

CRUD Operations

CRUD stands for Create, Read, Update, and Delete–the four fundamental database operations. Let’s build a complete CRUD API for managing a collection of articles. This example uses an in-memory list to keep things simple, but in production you’d use a real database like PostgreSQL:

# crud_api.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
from datetime import datetime

app = FastAPI(title="Article API")

class Article(BaseModel):
    id: Optional[int] = None
    title: str
    content: str
    author: str
    created_at: Optional[datetime] = None

# In-memory storage (replace with database in production)
articles_db = []
article_counter = 1

@app.get("/articles", response_model=List[Article])
def list_articles(skip: int = 0, limit: int = 10):
    """Get all articles with pagination"""
    return articles_db[skip:skip + limit]

@app.get("/articles/{article_id}", response_model=Article)
def get_article(article_id: int):
    """Get a specific article by ID"""
    for article in articles_db:
        if article.id == article_id:
            return article
    raise HTTPException(status_code=404, detail="Article not found")

@app.post("/articles", response_model=Article)
def create_article(article: Article):
    """Create a new article"""
    global article_counter
    article.id = article_counter
    article.created_at = datetime.now()
    article_counter += 1
    articles_db.append(article)
    return article

@app.put("/articles/{article_id}", response_model=Article)
def update_article(article_id: int, updated_article: Article):
    """Update an existing article"""
    for i, article in enumerate(articles_db):
        if article.id == article_id:
            updated_article.id = article_id
            updated_article.created_at = article.created_at
            articles_db[i] = updated_article
            return updated_article
    raise HTTPException(status_code=404, detail="Article not found")

@app.delete("/articles/{article_id}")
def delete_article(article_id: int):
    """Delete an article by ID"""
    global articles_db
    initial_length = len(articles_db)
    articles_db = [a for a in articles_db if a.id != article_id]
    if len(articles_db) == initial_length:
        raise HTTPException(status_code=404, detail="Article not found")
    return {"message": f"Article {article_id} deleted successfully"}

Output: This API provides all four CRUD operations:

CREATE: POST to /articles with a JSON body to create a new article
READ: GET /articles to list all, or GET /articles/1 to get a specific one
UPDATE: PUT to /articles/1 to modify an article
DELETE: DELETE to /articles/1 to remove an article

The response_model parameter tells FastAPI what shape the response should be, enabling automatic validation of your response and generation of proper API documentation. The HTTPException with status code 404 is the standard way to signal that a resource wasn’t found.

Error Handling and Status Codes

Professional APIs don’t just return data–they communicate what went wrong when something fails. HTTP status codes are the standard way to do this. FastAPI makes error handling straightforward with the HTTPException class and customizable exception handlers.

Here are the most important HTTP status codes for your API:

200 OK: Request succeeded, here’s your data
201 Created: Resource was successfully created
204 No Content: Success but no data to return (like DELETE)
400 Bad Request: Client sent invalid data
401 Unauthorized: Authentication required
403 Forbidden: Not allowed to access this resource
404 Not Found: Resource doesn’t exist
422 Unprocessable Entity: Validation failed (FastAPI uses this automatically)
500 Internal Server Error: Server error

Let’s implement comprehensive error handling:

# error_handling.py
from fastapi import FastAPI, HTTPException, status
from pydantic import BaseModel, validator
from typing import Optional

app = FastAPI()

class Item(BaseModel):
    name: str
    price: float
    quantity: int

    @validator('price')
    def price_must_be_positive(cls, v):
        if v <= 0:
            raise ValueError('Price must be greater than 0')
        return v

    @validator('quantity')
    def quantity_must_be_positive(cls, v):
        if v < 0:
            raise ValueError('Quantity cannot be negative')
        return v

items_db = {
    1: Item(name="Laptop", price=999.99, quantity=5),
    2: Item(name="Mouse", price=29.99, quantity=50)
}

@app.get("/items/{item_id}")
def get_item(item_id: int):
    """Get item, with proper error handling"""
    if item_id not in items_db:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"Item with ID {item_id} not found"
        )
    return items_db[item_id]

@app.post("/items", status_code=status.HTTP_201_CREATED)
def create_item(item: Item):
    """Create new item with 201 response"""
    new_id = max(items_db.keys()) + 1 if items_db else 1
    items_db[new_id] = item
    return {"id": new_id, "item": item}

@app.delete("/items/{item_id}", status_code=status.HTTP_204_NO_CONTENT)
def delete_item(item_id: int):
    """Delete item, returning 204 No Content"""
    if item_id not in items_db:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Item not found"
        )
    del items_db[item_id]
    return None

@app.post("/checkout/{item_id}")
def checkout(item_id: int, quantity: int):
    """Purchase item with inventory check"""
    if item_id not in items_db:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Item not found"
        )

    item = items_db[item_id]
    if quantity > item.quantity:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail=f"Insufficient stock. Only {item.quantity} available"
        )

    if quantity <= 0:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Quantity must be positive"
        )

    item.quantity -= quantity
    return {
        "status": "success",
        "item": item.name,
        "quantity": quantity,
        "total": quantity * item.price
    }

Output: When you request an item that doesn't exist, you get a proper 404 response with a descriptive message. When you try to buy more than available, you get a 400 error explaining the shortage. The validators in the Pydantic model automatically reject invalid data before your handler code even runs.

Response Models

Response models define the structure of your API responses and enable several powerful features: automatic response validation, filtering, serialization, and documentation. Even though your code might work with complex objects, response models let you control exactly what gets sent to the client.

Consider a real-world example where your database contains sensitive information you shouldn't expose:

# response_models.py
from fastapi import FastAPI
from pydantic import BaseModel
from typing import List, Optional
from datetime import datetime

app = FastAPI()

class UserInDB(BaseModel):
    """Full user data with sensitive info"""
    id: int
    username: str
    email: str
    password_hash: str  # Never expose this!
    created_at: datetime

class UserResponse(BaseModel):
    """What we return to clients"""
    id: int
    username: str
    email: str
    created_at: datetime

class BlogPost(BaseModel):
    title: str
    content: str
    author: str

class BlogPostResponse(BaseModel):
    """Response with computed fields"""
    title: str
    content: str
    author: str
    word_count: Optional[int] = None
    excerpt: Optional[str] = None

# Simulated database with sensitive data
users_db = {
    1: UserInDB(
        id=1,
        username="alice",
        email="alice@example.com",
        password_hash="$2b$12$...hashed...",
        created_at=datetime.now()
    )
}

@app.get("/users/{user_id}", response_model=UserResponse)
def get_user(user_id: int):
    """
    Get user - only returns safe fields.
    Password hash is never exposed even though
    the internal representation includes it.
    """
    if user_id not in users_db:
        from fastapi import HTTPException
        raise HTTPException(status_code=404, detail="User not found")
    return users_db[user_id]

@app.post("/posts", response_model=BlogPostResponse)
def create_post(post: BlogPost):
    """
    Create post - response includes computed fields
    like word_count and excerpt that aren't in the request
    """
    content = post.content
    word_count = len(content.split())
    excerpt = content[:100] + "..." if len(content) > 100 else content

    return {
        "title": post.title,
        "content": content,
        "author": post.author,
        "word_count": word_count,
        "excerpt": excerpt
    }

Output: When you request a user, the response only contains the fields defined in UserResponse. The password hash from the database is automatically filtered out. When you create a post, the response includes computed fields like word_count and excerpt even though the client never sent them. This is the power of response models--they decouple your internal data structures from your API contract.

Cache Katie riding a rocket for deployment speed — Your FastAPI app serves 10,000 requests per second. Your database serves 12.

Real-Life Example: Building a Todo API

Let's tie everything together with a complete, production-ready Todo API that demonstrates all the concepts we've learned. This is a practical example you can extend with a real database like PostgreSQL or MongoDB:

# todo_api.py
from fastapi import FastAPI, HTTPException, status
from pydantic import BaseModel, Field
from typing import List, Optional
from datetime import datetime
from enum import Enum

app = FastAPI(
    title="Todo API",
    description="A complete todo management API",
    version="1.0.0"
)

class PriorityLevel(str, Enum):
    low = "low"
    medium = "medium"
    high = "high"

class TodoCreate(BaseModel):
    title: str = Field(..., min_length=1, max_length=200)
    description: Optional[str] = Field(None, max_length=1000)
    priority: PriorityLevel = PriorityLevel.medium
    due_date: Optional[datetime] = None

class TodoResponse(BaseModel):
    id: int
    title: str
    description: Optional[str]
    priority: PriorityLevel
    completed: bool
    due_date: Optional[datetime]
    created_at: datetime
    completed_at: Optional[datetime] = None

todos_db: List[TodoResponse] = []
todo_counter = 1

@app.get("/todos", response_model=List[TodoResponse])
def list_todos(
    completed: Optional[bool] = None,
    priority: Optional[PriorityLevel] = None,
    skip: int = 0,
    limit: int = 20
):
    """Get todos with optional filtering"""
    results = todos_db

    if completed is not None:
        results = [t for t in results if t.completed == completed]
    if priority:
        results = [t for t in results if t.priority == priority]

    return results[skip:skip + limit]

@app.get("/todos/{todo_id}", response_model=TodoResponse)
def get_todo(todo_id: int):
    """Get a specific todo"""
    for todo in todos_db:
        if todo.id == todo_id:
            return todo
    raise HTTPException(status_code=404, detail="Todo not found")

@app.post("/todos", response_model=TodoResponse, status_code=status.HTTP_201_CREATED)
def create_todo(todo: TodoCreate):
    """Create a new todo"""
    global todo_counter
    new_todo = TodoResponse(
        id=todo_counter,
        title=todo.title,
        description=todo.description,
        priority=todo.priority,
        completed=False,
        due_date=todo.due_date,
        created_at=datetime.now()
    )
    todo_counter += 1
    todos_db.append(new_todo)
    return new_todo

@app.put("/todos/{todo_id}", response_model=TodoResponse)
def update_todo(todo_id: int, todo_update: TodoCreate):
    """Update an existing todo"""
    for i, todo in enumerate(todos_db):
        if todo.id == todo_id:
            updated = TodoResponse(
                id=todo.id,
                title=todo_update.title,
                description=todo_update.description,
                priority=todo_update.priority,
                completed=todo.completed,
                due_date=todo_update.due_date,
                created_at=todo.created_at
            )
            todos_db[i] = updated
            return updated
    raise HTTPException(status_code=404, detail="Todo not found")

@app.patch("/todos/{todo_id}/complete", response_model=TodoResponse)
def complete_todo(todo_id: int):
    """Mark a todo as complete"""
    for i, todo in enumerate(todos_db):
        if todo.id == todo_id:
            todo.completed = True
            todo.completed_at = datetime.now()
            todos_db[i] = todo
            return todo
    raise HTTPException(status_code=404, detail="Todo not found")

@app.delete("/todos/{todo_id}", status_code=status.HTTP_204_NO_CONTENT)
def delete_todo(todo_id: int):
    """Delete a todo"""
    global todos_db
    initial_length = len(todos_db)
    todos_db = [t for t in todos_db if t.id != todo_id]
    if len(todos_db) == initial_length:
        raise HTTPException(status_code=404, detail="Todo not found")

Output: This Todo API supports:

Creating todos with title, description, priority, and optional due date
Listing all todos with filtering by completion status and priority
Getting individual todos by ID
Updating todo details
Marking todos as complete (with completion timestamp)
Deleting todos
Proper HTTP status codes for each operation
Full validation of input data through Pydantic models

Run this with uvicorn todo_api:app --reload and visit http://localhost:8000/docs to see the interactive documentation. You can test every endpoint right from your browser. To use this in production, replace the in-memory todos_db list with actual database calls to PostgreSQL, MongoDB, or any other database your project uses.

Frequently Asked Questions

Should I use async functions in FastAPI?

Use async functions when your endpoint does I/O-bound operations like database queries, API calls, or file operations. These operations have "wait time" during which your server can handle other requests. For CPU-bound operations (heavy calculations), use regular synchronous functions. FastAPI handles both seamlessly--just use async def when appropriate. Most real-world APIs benefit from async because they spend time waiting for databases and external services.

How do I add custom validation to Pydantic models?

Use the @validator decorator from Pydantic. The validator function runs after type conversion and can raise a ValueError with a custom message. You can validate a single field or multiple fields by passing multiple field names to the decorator. FastAPI automatically includes validation errors in the 422 response.

How do I secure my FastAPI endpoints?

FastAPI includes built-in support for OAuth2, JWT tokens, and HTTP Basic authentication. Use the Security dependency from FastAPI to protect your endpoints. For a quick start with JWT: install python-jose and passlib, create login endpoints that return tokens, and use Depends() to require tokens on protected endpoints. Never store passwords in plain text--always use password hashing with libraries like bcrypt.

How do I handle CORS (Cross-Origin) requests?

CORS is needed when your frontend runs on a different domain than your API. Install fastapi-cors and add middleware to your app. Specify which origins are allowed, which HTTP methods they can use, and which headers are permitted. This prevents unauthorized websites from accessing your API while allowing your legitimate frontend to communicate.

What's the best way to integrate a database?

Use SQLAlchemy for relational databases (PostgreSQL, MySQL) or an async-compatible library like `tortoise-orm` or `sqlalchemy` with `asyncpg`. FastAPI works great with both synchronous and asynchronous database libraries. Create a separate database.py file with connection logic, use dependency injection with Depends() to pass database sessions to your endpoints, and keep database models separate from Pydantic models to decouple your API contract from your database schema.

How do I test my FastAPI endpoints?

Use the TestClient from FastAPI and pytest. Create a test file that imports your app, creates a client with TestClient(app), and makes requests to your endpoints. The test client simulates HTTP requests without actually running a server. Test successful requests, validation failures, authorization errors, and edge cases. FastAPI makes testing incredibly easy because everything is synchronous in tests.

How do I deploy a FastAPI application to production?

Don't use the development server in production. Use Gunicorn with Uvicorn workers: gunicorn app:app --workers 4 --worker-class uvicorn.workers.UvicornWorker. For containerization, create a Dockerfile with Python, install dependencies, and run Gunicorn. Deploy to any platform that supports Docker: AWS ECS, Google Cloud Run, Heroku, DigitalOcean App Platform, or Kubernetes. Make sure your environment variables for secrets, database URLs, and API keys are set properly--never hardcode them.

Conclusion

You've now learned how to build professional REST APIs with FastAPI. You understand decorators and path parameters, you can validate complex data structures with Pydantic models, you know how to implement complete CRUD operations, and you can handle errors gracefully with proper HTTP status codes. These foundations will serve you well whether you're building a simple microservice or a complex distributed system.

The real power of FastAPI lies in its simplicity and performance. Your code is simultaneously your tests, documentation, and API specification. The validation is automatic. The performance is exceptional. What used to take dozens of lines of boilerplate in older frameworks now takes a few lines of elegant Python.

Your next steps: start building real projects with FastAPI, integrate a real database when your app grows beyond in-memory storage, add authentication with JWT tokens, and deploy to your preferred cloud platform. The FastAPI documentation at https://fastapi.tiangolo.com/ is comprehensive and worth reading as you tackle more advanced topics like WebSockets, background tasks, and database migrations.

Building a Web Scraper with Beautiful Soup and Requests
Django REST Framework: Build Powerful APIs with Django
Database Modeling with SQLAlchemy in Python
Deploying Python Applications with Docker and Kubernetes
Async Programming in Python with asyncio
Testing Python Code with pytest and Mock
Building a Real-Time Chat Application with FastAPI and WebSockets
API Authentication with JWT Tokens in Python

How To Use Python 3.14 Template Strings (T-Strings) for Safe Interpolation

by Pubs | Data Processing, Intermediate

Intermediate

Python 3.14 introduces a powerful new feature called T-Strings (Template Strings) that revolutionizes how you handle string interpolation. If you’ve been frustrated with the limitations of f-strings when dealing with security-sensitive operations like database queries or HTML generation, T-Strings offer an elegant solution. Unlike traditional f-strings that immediately evaluate and return strings, T-Strings return Template objects that give you fine-grained control over how values are processed.

Don’t worry if you’re new to the concept–by the end of this tutorial, you’ll understand exactly when and how to use T-Strings in your projects. We’ll start with practical examples, explore the underlying mechanisms, and then dive into real-world use cases like SQL injection prevention and HTML escaping.

In this guide, we’ll cover the syntax of T-Strings, the Template protocol they implement, how to process templates with custom functions, and several practical applications that will make your code more secure and maintainable. We’ll also explore how this feature compares to existing string formatting methods in Python.

Quick Example

Before diving into the theory, let’s see T-Strings in action. This example demonstrates the fundamental difference between f-strings and T-Strings:

# quick_tstring_demo.py
from __future__ import annotations

# T-String creates a Template object, not a plain string
user_input = "Robert'; DROP TABLE students; --"
database_query = t"SELECT * FROM users WHERE id = {user_input}"

print(f"Template type: {type(database_query)}")
print(f"Template strings attr: {database_query.strings}")
print(f"Template values attr: {database_query.values}")

# You can process the template safely
def escape_sql_value(value):
    """Escape value for SQL injection prevention"""
    escaped = str(value).replace("'", "''")
    return f"'{escaped}'"

# Process the template and build the safe query
safe_parts = [database_query.strings[0]]
for i, value in enumerate(database_query.values):
    safe_parts.append(escape_sql_value(value))
    safe_parts.append(database_query.strings[i + 1])

final_query = "".join(safe_parts)
print(f"\nFinal query: {final_query}")

Output:

Template type: 
Template strings attr: ['SELECT * FROM users WHERE id = ', '']
Template values attr: ["Robert'; DROP TABLE students; --"]

Final query: SELECT * FROM users WHERE id = 'Robert''; DROP TABLE students; --'

Notice how the T-String defers the actual string assembly, allowing us to sanitize values before they’re combined. This is the core advantage of T-Strings over f-strings.

What Are T-Strings?

T-Strings, defined in PEP 750, introduce a new string prefix `t` that transforms string literals into Template objects instead of plain strings. This small change has significant implications for security and control over string interpolation.

Here’s a comparison of different string formatting approaches in Python:

Approach	Returns	Evaluated At	Best Use Case
f-string: `f”Hello {name}”`	str	Statement time	Simple output, logging
T-String: `t”Hello {name}”`	Template	Never (deferred)	Security-sensitive, custom processing
.format(): `”Hello {}”.format(name)`	str	Call time	Legacy code, simple formatting
% formatting: `”Hello %s” % name`	str	Call time	Very old codebases

The key insight is that T-Strings separate the specification of a template (which values go where) from the actual interpolation of values into the template. This separation allows you to apply custom processing logic before combining strings and values.

T-String Basic Syntax

Creating Template Objects

The syntax for creating a T-String is straightforward–use the `t` prefix just like you would use `f` for an f-string:

# tstring_syntax.py
from __future__ import annotations

# Basic T-String
product = "Laptop"
price = 1299.99

template = t"Product: {product}, Price: ${price}"

# Check the type
print(f"Type: {type(template)}")
print(f"Is it a Template? {type(template).__name__}")

# Access the components
print(f"Strings: {template.strings}")
print(f"Values: {template.values}")

Output:

Type: 
Is it a Template? Template
Strings: ('Product: ', ', Price: $', '')
Values: ('Laptop', 1299.99)

The Template object stores the literal string parts and the values separately. The `strings` attribute is a tuple of the static parts, and `values` is a tuple of the interpolated values.

Accessing Template Parts

Once you have a Template object, you can inspect its structure in detail. Templates provide multiple ways to access their components:

# template_inspection.py
from __future__ import annotations

username = "alice"
timestamp = "2026-04-01 14:30:00"
template = t"User {username} logged in at {timestamp}"

# Direct attribute access
print("Strings part:", template.strings)
print("Values part:", template.values)

# Using .args for detailed interpolation info
print("\nInterpolation details:")
for interpolation in template.args:
    print(f"  Value: {interpolation.value}")
    print(f"  Expression: {interpolation.expression}")
    print(f"  Conversion: {interpolation.conversion}")
    print(f"  Format spec: {interpolation.format_spec}")
    print()

Output:

Strings part: ('User ', ' logged in at ', '')
Values part: ('alice', '2026-04-01 14:30:00')

Interpolation details:
  Value: alice
  Expression: username
  Conversion: None
  Format spec: ''

  Value: 2026-04-01 14:30:00
  Expression: timestamp
  Conversion: None
  Format spec: ''

The Interpolation objects give you access to the original expression as a string, the conversion flag (if any), and the format specification. This information is crucial when building custom processors.

Pyro Pete juggling golden T shapes for Python template strings — t-strings: because f-strings were too easy and we needed another letter to argue about.

The Template Protocol

Implementing the Template Protocol

Python defines a formal Template protocol that allows custom objects to work with the `__format__()` method. When you want to create custom template processors, you implement this protocol. The Template protocol requires implementing methods to process template objects systematically.

# custom_template_processor.py
from __future__ import annotations
from typing import Any

class SecureFormatter:
    """Processor that escapes all template values for safe output"""

    def __init__(self, escape_func):
        self.escape_func = escape_func

    def __call__(self, template):
        """Process a template object and return a safe string"""
        result_parts = []

        # Start with the first literal string
        result_parts.append(template.strings[0])

        # Process each value with the escape function
        for i, value in enumerate(template.values):
            escaped_value = self.escape_func(value)
            result_parts.append(str(escaped_value))
            result_parts.append(template.strings[i + 1])

        return "".join(result_parts)

# Define an HTML escaping function
def escape_html(value):
    """Escape HTML special characters"""
    replacements = {
        "&": "&",
        "<": "<",
        ">": ">",
        '"': """,
        "'": "'"
    }
    result = str(value)
    for char, escaped in replacements.items():
        result = result.replace(char, escaped)
    return result

# Use the processor
html_formatter = SecureFormatter(escape_html)
user_comment = ""
template = t"User comment: {user_comment}"
safe_html = html_formatter(template)
print(f"Safe output: {safe_html}")

Output:

Safe output: User comment: <script>alert('XSS')</script>

This example shows how you can create a custom processor that takes any escape function and applies it uniformly to all values in a template. This is one of the primary strengths of T-Strings over f-strings.

Processing Templates with Custom Functions

Building Safe Template Processors

Now let’s build more sophisticated template processors for real-world scenarios. The key is to leverage the separation of strings and values that T-Strings provide:

# advanced_processors.py
from __future__ import annotations
from typing import Callable

class TemplateProcessor:
    """Base processor for handling template objects"""

    def __init__(self, value_handler: Callable):
        self.value_handler = value_handler

    def process(self, template):
        """Process a template with custom handling for each value"""
        parts = [template.strings[0]]

        for i, value in enumerate(template.values):
            processed_value = self.value_handler(value)
            parts.append(str(processed_value))
            parts.append(template.strings[i + 1])

        return "".join(parts)

# Example 1: JSON safe processor
def jsonify(value):
    """Convert value to JSON-safe representation"""
    if isinstance(value, str):
        return f'"{value.replace('"', '\\"')}"'
    elif isinstance(value, bool):
        return "true" if value else "false"
    elif value is None:
        return "null"
    else:
        return str(value)

# Example 2: URL safe processor (basic)
def urlencode(value):
    """Simple URL encoding"""
    import urllib.parse
    return urllib.parse.quote(str(value))

# Create processors
json_processor = TemplateProcessor(jsonify)
url_processor = TemplateProcessor(urlencode)

# Use cases
config_data = '{"admin": true}'
search_term = "python 3.14"

json_template = t"var data = {config_data};"
url_template = t"https://example.com/search?q={search_term}"

print("JSON output:", json_processor.process(json_template))
print("URL output:", url_processor.process(url_template))

Output:

JSON output: var data = "{"admin": true}";
URL output: https://example.com/search?q=python%203.14

By encapsulating the processing logic, you create reusable processors that can handle multiple templates with consistent behavior.

Debug Dee comparing f-string and t-string potions — One returns a string, the other returns possibilities. Choose your potion wisely.

SQL Injection Prevention with T-Strings

Why SQL Injection Prevention Matters

SQL injection is one of the most critical security vulnerabilities in web applications. It occurs when untrusted input is concatenated directly into SQL queries. T-Strings provide an elegant mechanism to prevent this by forcing you to process all values before they enter your SQL.

# sql_safe_queries.py
from __future__ import annotations

class SQLQueryBuilder:
    """Safe SQL query builder using T-Strings"""

    def __init__(self):
        self.query_parts = []

    @staticmethod
    def escape_value(value):
        """Escape values for SQL"""
        if value is None:
            return "NULL"
        elif isinstance(value, bool):
            return "TRUE" if value else "FALSE"
        elif isinstance(value, (int, float)):
            return str(value)
        else:
            # Escape single quotes
            escaped = str(value).replace("'", "''")
            return f"'{escaped}'"

    def build_from_template(self, template):
        """Build a safe query from a T-String template"""
        query_parts = [template.strings[0]]

        for i, value in enumerate(template.values):
            escaped = self.escape_value(value)
            query_parts.append(escaped)
            query_parts.append(template.strings[i + 1])

        return "".join(query_parts)

# Example usage with dangerous input
builder = SQLQueryBuilder()

# Safe input
user_id = 42
query1 = builder.build_from_template(t"SELECT * FROM users WHERE id = {user_id}")
print("Safe query:", query1)

# Dangerous input that would fail with f-strings
email = "test@example.com'; DELETE FROM users; --"
query2 = builder.build_from_template(t"SELECT * FROM users WHERE email = {email}")
print("Protected query:", query2)

Output:

Safe query: SELECT * FROM users WHERE id = 42
Protected query: SELECT * FROM users WHERE email = 'test@example.com''; DELETE FROM users; --'

Notice how the dangerous SQL injection attempt is neutralized–the single quote in the input is escaped to two quotes, and the entire value is wrapped in quotes, making it a literal string value rather than executable SQL code.

Using T-Strings with Parameterized Queries

For database operations, you typically want to use parameterized queries (prepared statements) rather than string concatenation. T-Strings make this cleaner:

# parameterized_queries.py
from __future__ import annotations

class ParameterizedQueryBuilder:
    """Build parameterized queries using T-Strings"""

    def build_query(self, template):
        """Extract placeholders and parameters from T-String"""
        placeholders = []
        parameters = []

        for i, value in enumerate(template.values):
            placeholders.append(f"${i + 1}")  # PostgreSQL style
            parameters.append(value)

        # Reconstruct query with placeholders
        query_parts = [template.strings[0]]
        for i, placeholder in enumerate(placeholders):
            query_parts.append(placeholder)
            query_parts.append(template.strings[i + 1])

        query = "".join(query_parts)
        return query, tuple(parameters)

# Usage
builder = ParameterizedQueryBuilder()

user_id = 42
email = "alice@example.com"
template = t"SELECT * FROM users WHERE id = {user_id} AND email = {email}"

query, params = builder.build_query(template)
print(f"Query: {query}")
print(f"Parameters: {params}")

# This would then be executed as: cursor.execute(query, params)

Output:

Query: SELECT * FROM users WHERE id = $1 AND email = $2
Parameters: (42, 'alice@example.com')

Using parameterized queries is the gold standard for SQL safety, and T-Strings make it easy to construct these queries while keeping your code readable.

Sudo Sam protecting code with a security shield — SQL injection tried to enter the chat. Template strings said no.

HTML Escaping and Content Security

Preventing Cross-Site Scripting (XSS) Attacks

Just as T-Strings help prevent SQL injection, they’re equally valuable for preventing XSS attacks when generating HTML. The process is identical–escape user input before it enters the template output:

# html_template_rendering.py
from __future__ import annotations
import html

class HTMLRenderer:
    """Render HTML templates safely with T-Strings"""

    @staticmethod
    def render(template):
        """Render T-String template as safe HTML"""
        parts = [template.strings[0]]

        for i, value in enumerate(template.values):
            # Use html.escape for automatic entity encoding
            escaped = html.escape(str(value))
            parts.append(escaped)
            parts.append(template.strings[i + 1])

        return "".join(parts)

# Example: User-generated content
renderer = HTMLRenderer()

username = ""
user_bio = "I love "

welcome_html = renderer.render(t"Welcome, {username}!")
bio_html = renderer.render(t"{user_bio}")

print("Rendered welcome:", welcome_html)
print("Rendered bio:", bio_html)

Output:

Rendered welcome: Welcome, <img src=x onerror='alert("XSS")'>!
Rendered bio: I love <script>alert('hack')</script>

The `html.escape()` function automatically converts dangerous characters like `<`, `>`, and quotes into their HTML entity equivalents. Combined with T-Strings, this creates a clean, declarative way to generate safe HTML from user input.

Building Safe Template Systems

For more complex HTML generation, you can build template systems that enforce safety at the framework level:

# safe_template_system.py
from __future__ import annotations
import html

class SafeHTMLTemplate:
    """A template class that automatically escapes all interpolations"""

    def __init__(self, content_template):
        self.template = content_template
        self.escaper = html.escape

    def render(self):
        """Render the template with automatic escaping"""
        parts = [self.template.strings[0]]

        for i, value in enumerate(self.template.values):
            escaped_value = self.escaper(str(value))
            parts.append(escaped_value)
            parts.append(self.template.strings[i + 1])

        return "".join(parts)

# Usage in a web framework context
user_data = {
    "name": "Alice",
    "title": "Admin",
    "bio": "Python "
}

# Create safe templates
name_template = SafeHTMLTemplate(t"{user_data['name']}")
title_template = SafeHTMLTemplate(t"{user_data['title']}")
bio_template = SafeHTMLTemplate(t"{user_data['bio']}")

# Render all safely
print(name_template.render())
print(title_template.render())
print(bio_template.render())

Output:

Alice
<b>Admin</b>
Python <developer>

Notice how the HTML markup in the user data is escaped, preventing any script injection while preserving the intended content.

Loop Larry tangled in a ball of strings — When your string interpolation has more edge cases than your actual business logic.

Advanced: Custom Processors and DSLs

Building Domain-Specific Languages

T-Strings enable the creation of domain-specific languages (DSLs) by allowing custom processing of templates. For example, you could build a templating language for configuration files, data validation, or custom syntax:

# custom_dsl_processor.py
from __future__ import annotations

class ConfigProcessor:
    """Process T-Strings as configuration templates"""

    def __init__(self):
        self.variables = {}

    def register_variable(self, name, value):
        """Register a variable for substitution"""
        self.variables[name] = value

    def process_template(self, template):
        """Process template with variable replacement and formatting"""
        parts = [template.strings[0]]

        for i, value in enumerate(template.values):
            # Apply custom processing based on type
            if isinstance(value, bool):
                processed = "yes" if value else "no"
            elif isinstance(value, (list, tuple)):
                processed = ", ".join(str(v) for v in value)
            elif isinstance(value, dict):
                processed = "; ".join(f"{k}={v}" for k, v in value.items())
            else:
                processed = str(value)

            parts.append(processed)
            parts.append(template.strings[i + 1])

        return "".join(parts)

# Usage in a configuration context
processor = ConfigProcessor()

debug_mode = True
log_level = "INFO"
features = ["auth", "api", "websocket"]
db_config = {"host": "localhost", "port": 5432}

config_template = t"""
Debug mode: {debug_mode}
Log level: {log_level}
Enabled features: {features}
Database config: {db_config}
"""

config_output = processor.process_template(config_template)
print("Generated config:")
print(config_output)

Output:

Generated config:
Debug mode: yes
Log level: INFO
Enabled features: auth, api, websocket
Database config: host=localhost; port=5432

This demonstrates how T-Strings allow you to build sophisticated text generation systems with custom rules for different data types.

Real-World Example: Building a Log Formatter

Let’s build a practical logging system that uses T-Strings to format log messages with automatic context escaping and structuring:

# structured_logging.py
from __future__ import annotations
import json
from datetime import datetime
from enum import Enum

class LogLevel(Enum):
    DEBUG = "DEBUG"
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"

class StructuredLogger:
    """Logger that formats messages using T-Strings"""

    def __init__(self, service_name):
        self.service_name = service_name
        self.logs = []

    def _create_log_entry(self, level, template):
        """Create a structured log entry from a T-String template"""
        # Build the message
        message_parts = [template.strings[0]]
        context = {}

        for i, value in enumerate(template.values):
            # Use expression as context key
            interpolation = template.args[i]
            key = interpolation.expression or f"arg{i}"

            # Store in context dict
            context[key] = value

            # Add to message
            message_parts.append(str(value))
            message_parts.append(template.strings[i + 1])

        message = "".join(message_parts)

        # Create structured log entry
        log_entry = {
            "timestamp": datetime.now().isoformat(),
            "service": self.service_name,
            "level": level.value,
            "message": message,
            "context": context
        }

        return log_entry

    def log(self, level, template):
        """Log a message with automatic context capture"""
        entry = self._create_log_entry(level, template)
        self.logs.append(entry)
        print(json.dumps(entry, indent=2))

# Usage
logger = StructuredLogger("auth-service")

user_id = 12345
username = "alice_smith"
ip_address = "192.168.1.100"

logger.log(LogLevel.INFO, t"User {username} (ID: {user_id}) logged in from {ip_address}")

failed_attempts = 5
max_attempts = 10
logger.log(LogLevel.WARNING, t"User {username} has {failed_attempts} failed login attempts (max: {max_attempts})")

Output:

{
  "timestamp": "2026-04-01T12:30:45.123456",
  "service": "auth-service",
  "level": "INFO",
  "message": "User alice_smith (ID: 12345) logged in from 192.168.1.100",
  "context": {
    "username": "alice_smith",
    "user_id": 12345,
    "ip_address": "192.168.1.100"
  }
}
{
  "timestamp": "2026-04-01T12:30:46.234567",
  "service": "auth-service",
  "level": "WARNING",
  "message": "User alice_smith has 5 failed login attempts (max: 10)",
  "context": {
    "username": "alice_smith",
    "failed_attempts": 5,
    "max_attempts": 10
  }
}

This example shows how T-Strings enable automatic context extraction and structured logging, capturing not just the formatted message but also the individual values and their names for later analysis.

How to Try Python 3.14 T-Strings Today

Since Python 3.14 is still in development as of this writing, you’ll need to run Python from the development version. Here’s how to get started:

# Installation options for trying T-Strings

# Option 1: Build from source (Linux/macOS)
git clone https://github.com/python/cpython.git
cd cpython
./configure --prefix=$HOME/python314
make
make install

# Option 2: Use Docker
docker run -it python:3.14-dev bash

# Option 3: Download pre-built alpha/beta releases
# Visit https://www.python.org/downloads/
# Look for 3.14 alpha or beta versions

# Once installed, verify T-String support:
python3.14 -c "t = t'Test'; print(type(t))"

The official Python downloads page provides alpha and beta releases as they become available. Join the Python community discussions on python-discuss@python.org if you want to provide feedback on T-Strings and PEP 750.

Cache Katie crossing the finish line for performance comparison — Benchmarked t-strings vs f-strings. The real winner is whoever ships on Friday.

Frequently Asked Questions

Are T-Strings backwards compatible with older Python versions?

No, T-Strings are a new feature in Python 3.14 and will not work in earlier versions. If you need to support older Python versions, you’ll need to either use f-strings or implement your own template processor. You can use `from __future__ import annotations` in Python 3.7+ to help with some compatibility, but the `t` prefix itself is new to 3.14.

What’s the performance impact of using T-Strings instead of f-strings?

T-Strings have a slightly higher memory footprint because they create Template objects rather than immediately evaluating to strings. However, if you’re processing templates (which is the entire point), the overhead is minimal compared to the safety benefits. For simple one-off templates where you don’t need processing, f-strings remain slightly more efficient.

Can I combine T-Strings with f-strings in the same code?

Absolutely! There’s no conflict between using both. Use f-strings for simple formatting and T-Strings when you need custom processing. In fact, many real applications will use both depending on context. Remember that you cannot use `f` and `t` prefixes together on the same string literal.

How do custom format specs work with T-Strings?

Format specs like `{value:.2f}` are captured in the Interpolation object’s `format_spec` attribute. Your custom processor can then apply these format specifications when processing the template. Here’s a quick example:

# format_spec_example.py
from __future__ import annotations

def format_aware_processor(template):
    parts = [template.strings[0]]

    for i, value in enumerate(template.values):
        interpolation = template.args[i]
        format_spec = interpolation.format_spec

        if format_spec:
            formatted = format(value, format_spec)
        else:
            formatted = str(value)

        parts.append(formatted)
        parts.append(template.strings[i + 1])

    return "".join(parts)

price = 19.5
quantity = 5
template = t"Price: ${price:.2f}, Qty: {quantity:03d}"
result = format_aware_processor(template)
print(result)

Output:

Price: $19.50, Qty: 005

Can T-Strings contain other T-Strings?

Yes, you can nest T-Strings, but the outer template will contain a Template object as one of its values rather than a string. You would need to process the inner template first, or create a processor that handles nested Template objects specially. Most use cases don’t require this complexity.

How do multiline T-Strings work?

T-Strings support multiline strings just like regular Python strings. Use triple quotes for multiline templates:

# multiline_template.py
from __future__ import annotations

name = "Bob"
email = "bob@example.com"

template = t"""
User Profile:
Name: {name}
Email: {email}
"""

print(template.strings)
print(template.values)

Output:

('"\n\nUser Profile:\nName: ', '\nEmail: ', '\n"')
('Bob', 'bob@example.com')

Conclusion

T-Strings represent a significant evolution in Python’s string handling capabilities, particularly for security-sensitive applications. By deferring the combination of string parts and values, they enable custom processing that’s impossible with f-strings, making your code more secure against injection attacks and more flexible for advanced use cases.

The key advantages are clear: T-Strings naturally support SQL injection prevention, HTML escaping, URL encoding, and custom domain-specific languages through a clean, consistent API. Whether you’re building web applications, CLI tools, or data processing pipelines, understanding and leveraging T-Strings will improve both the security and maintainability of your code.

For more information, consult the official Python documentation for PEP 750 at https://peps.python.org/pep-0750/ and the standard Template protocol documentation in the Python standard library.

Explore these related topics to deepen your Python expertise:

String Formatting in Python: A Complete Guide (f-strings, .format(), and legacy methods)
SQL Injection: Prevention Strategies and Best Practices
Web Security in Python: CSRF, XSS, and CORS
Building Custom DSLs with Python
Advanced Template Engines: Jinja2, Mako, and Cheetah

How To Work with ZIP Files in Python

Quick Example: Creating and Reading Your First ZIP File

What Are ZIP Files and Why Use Python?

Creating ZIP Files from Scratch

Reading and Extracting ZIP Files

Adding Files to Existing Archives

Extracting Specific Files Without Extraction Spam

Working with Password-Protected ZIP Files

Choosing Compression Algorithms and Levels

Real-World Example: Building a Backup Manager

Frequently Asked Questions

What’s a ZIP bomb and how do I protect against it?

Does the zipfile module handle symbolic links?

How do I handle very large files (multi-GB)?

Can I modify files inside a ZIP without re-creating it?

How do I ensure archives are portable across Windows, macOS, and Linux?

Conclusion

Related Articles

Related Python Tutorials

How To Use uv: The Fast Python Package Manager

What is uv?

Why Choose uv?

Installing uv

Basic uv Commands

Creating a New Project

Adding Dependencies

Installing from pyproject.toml

Running Python Scripts

Advanced Usage

Python Version Management

Working with Virtual Environments

Pre-Release and Development Versions

Comparing with Pip and Poetry

Real-World Example: Setting Up a Data Science Project

Frequently Asked Questions

Is uv production-ready?

Will uv replace pip?

Can I use uv alongside pip?

What about compatibility?

Conclusion

Related Python Tutorials

How To Use Pydantic V2 for Data Validation in Python

Quick Example

What is Pydantic V2?

BaseModel Basics

Field Types and Constraints

Field Validators

Model Validators

Nested Models

JSON Serialization and Configuration

Real-Life Example: API Request Validator

Frequently Asked Questions

How do I migrate from Pydantic V1 to V2?

Is Pydantic V2 faster than V1?

How do I write custom validators for complex logic?

How do I control JSON serialization in Pydantic V2?

How do I use Pydantic with databases?

How do I customize validation error messages?

Conclusion

Related Articles

Related Python Tutorials

How To Build a REST API with FastAPI in Python

Quick Example: Your First FastAPI Application

What is FastAPI and Why Use It?

Installing FastAPI and Uvicorn

Creating Your First Endpoint

Path Parameters and Query Parameters

Request Body with Pydantic Models

CRUD Operations

Error Handling and Status Codes

Response Models

Real-Life Example: Building a Todo API

Frequently Asked Questions

Should I use async functions in FastAPI?

How do I add custom validation to Pydantic models?

How do I secure my FastAPI endpoints?

How do I handle CORS (Cross-Origin) requests?

What's the best way to integrate a database?

How do I test my FastAPI endpoints?

How do I deploy a FastAPI application to production?