
Every Python application that receives data from the outside world — API requests, configuration files, CSV imports, form submissions — faces the same problem: how do you guarantee that the data matches what your code expects? A missing field crashes your function. A string where you expected an integer causes silent bugs. An email address without an “@” slips into your database. Manual validation with if-else chains works for one or two fields, but it does not scale.

Pydantic V2 solves this by letting you define data shapes as Python classes with type hints, then validating and converting incoming data automatically. Released in mid-2023, V2 is a complete rewrite of Pydantic with a Rust-powered core that runs 5-50x faster than V1. It is the validation engine behind FastAPI, and it works just as well standalone in any Python project. Install it with pip install pydantic.

This tutorial covers everything you need to use Pydantic V2 effectively: defining models with type annotations, using built-in validators and constraints, writing custom validation logic, working with nested models, serializing data to dictionaries and JSON, and handling validation errors gracefully. By the end, you will be able to validate any data structure your application encounters.

Pydantic Validation in 30 Seconds

Here is the smallest useful Pydantic model. It validates a user’s data and converts types automatically.

# quick_example.py
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

# Valid data -- works perfectly
user = User(name="Alice", age=30, email="alice@example.com")
print(user)
print(user.model_dump())

# Type coercion -- "25" becomes int 25
user2 = User(name="Bob", age="25", email="bob@example.com")
print(f"Bob's age: {user2.age} (type: {type(user2.age).__name__})")

Output:

name='Alice' age=30 email='alice@example.com'
{'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}
Bob's age: 25 (type: int)

Notice that Pydantic automatically converted the string "25" to the integer 25 because the age field is typed as int. This type coercion is one of Pydantic’s most practical features — it handles the messy reality of data that comes in as strings from JSON, forms, or environment variables.
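Coercion only succeeds when the conversion is unambiguous. Here is a trimmed-down sketch of the same User model showing what happens when a value cannot be parsed at all:

```python
# coercion_limits.py -- coercion fails loudly when parsing is impossible
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int

# "25" coerces cleanly, but "twenty-five" cannot be parsed as an int
try:
    User(name="Carol", age="twenty-five")
except ValidationError as e:
    print(e.errors()[0]["type"])  # int_parsing
```

Instead of silently storing a bad value, Pydantic raises a ValidationError with a machine-readable error type.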

What Is Pydantic and Why Use It?

Pydantic is a data validation library that uses Python type hints to define data structures and validate them at runtime. When you create a Pydantic model instance, it checks every field against its declared type, applies any constraints you have defined, and raises a detailed error if anything is wrong.

| Approach | Lines of Code | Type Coercion | Error Messages | Nested Validation |
| --- | --- | --- | --- | --- |
| Manual if-else | Many | Manual | You write them | You build it |
| dataclasses | Few | None | Basic TypeError | None |
| Pydantic V2 | Few | Automatic | Detailed, structured | Built-in |
| marshmallow | Moderate | Configurable | Detailed | Built-in |

The key advantage of Pydantic over alternatives like dataclasses or attrs is that it validates at runtime. A dataclass with age: int happily accepts age="hello" — it only declares the type hint without enforcing it. Pydantic actually checks and converts the value, raising a ValidationError if conversion fails.
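The difference is easy to demonstrate side by side. In this sketch, the dataclass accepts a string where an int was declared, while the Pydantic model rejects it:

```python
# runtime_check.py -- dataclasses annotate, Pydantic enforces
from dataclasses import dataclass
from pydantic import BaseModel, ValidationError

@dataclass
class DataclassUser:
    age: int

class PydanticUser(BaseModel):
    age: int

dc = DataclassUser(age="hello")   # no error -- the hint is never checked
print(type(dc.age).__name__)      # str

try:
    PydanticUser(age="hello")     # "hello" cannot be coerced to int
except ValidationError as e:
    print(f"Caught: {e.error_count()} error(s)")
```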

Creating Pydantic models
Twelve lines of __init__ boilerplate, or one BaseModel. Pick wisely.

Built-in Field Types and Constraints

Pydantic supports all standard Python types plus specialized types for common validation patterns. The Field function adds constraints like minimum length, numeric ranges, and regex patterns.

# field_types.py
from pydantic import BaseModel, Field, EmailStr
from typing import Optional
from datetime import datetime

class Product(BaseModel):
    name: str = Field(..., min_length=1, max_length=100)
    price: float = Field(..., gt=0, description="Price in dollars")
    quantity: int = Field(default=0, ge=0)
    sku: str = Field(..., pattern=r"^[A-Z]{2}-\d{4}$")
    description: Optional[str] = None
    created_at: datetime = Field(default_factory=datetime.now)

# Valid product
product = Product(name="Widget", price=19.99, sku="AB-1234")
print(product.model_dump())

# Invalid -- price is negative
try:
    bad = Product(name="Widget", price=-5, sku="AB-1234")
except Exception as e:
    print(f"Error: {e}")

Output:

{'name': 'Widget', 'price': 19.99, 'quantity': 0, 'sku': 'AB-1234', 'description': None, 'created_at': datetime(...)}
Error: 1 validation error for Product
price
  Input should be greater than 0 [type=greater_than, input_value=-5, input_type=int]

The Field function is where you add constraints beyond basic type checking. The ... (Ellipsis) means the field is required. The gt=0 constraint rejects zero and negative numbers. The pattern constraint validates the SKU format with a regex. All of these constraints are checked automatically when you create the model instance.
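Pydantic V2 also supports writing the same constraints with typing.Annotated, which keeps the annotation and the default value in their usual places. A trimmed version of the Product model above, rewritten in that style:

```python
# annotated_fields.py -- the same constraints via typing.Annotated
from typing import Annotated
from pydantic import BaseModel, Field

class Product(BaseModel):
    name: Annotated[str, Field(min_length=1, max_length=100)]
    price: Annotated[float, Field(gt=0)]
    quantity: Annotated[int, Field(ge=0)] = 0  # default stays on the right

print(Product(name="Widget", price=19.99).model_dump())
```

Both spellings behave identically; Annotated is handy when you want reusable constrained type aliases.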

Common Pydantic Types

Pydantic provides specialized types that go beyond basic Python types. These handle common validation patterns that would otherwise require custom code.

| Type | What It Validates | Example |
| --- | --- | --- |
| EmailStr | Valid email format | "user@example.com" |
| HttpUrl | Valid HTTP/HTTPS URL | "https://example.com" |
| IPvAnyAddress | Valid IPv4 or IPv6 address | "192.168.1.1" |
| SecretStr | String hidden in repr/logs | "s3cr3t" (shows "**********") |
| PositiveInt | Integer greater than 0 | 42 |
| FutureDatetime | Datetime in the future | "2027-01-01T00:00:00" |
| constr | Constrained string (length, pattern) | constr(min_length=3) |

To use EmailStr, install the optional dependency: pip install pydantic[email]. The other types are available in the core package.
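A quick sketch using a few of the core types (EmailStr is skipped here since it needs the optional dependency; the Service model is illustrative):

```python
# special_types.py -- core specialized types, no extra install needed
from pydantic import BaseModel, HttpUrl, SecretStr, PositiveInt, ValidationError

class Service(BaseModel):
    endpoint: HttpUrl
    api_key: SecretStr
    workers: PositiveInt

svc = Service(endpoint="https://example.com/api", api_key="s3cr3t", workers=4)
print(svc.api_key)                     # ********** -- hidden in repr and logs
print(svc.api_key.get_secret_value())  # s3cr3t -- explicit access only

try:
    Service(endpoint="not-a-url", api_key="x", workers=0)
except ValidationError as e:
    print(f"Errors: {e.error_count()}")  # bad URL and non-positive workers
```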

Custom Validators

When built-in constraints are not enough, Pydantic V2 provides the @field_validator and @model_validator decorators for custom validation logic.

Field Validators

# custom_validators.py
from pydantic import BaseModel, field_validator

class Registration(BaseModel):
    username: str
    password: str
    confirm_password: str

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must contain only letters and numbers")
        if len(v) < 3:
            raise ValueError("Username must be at least 3 characters")
        return v.lower()  # Normalize to lowercase

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in v):
            raise ValueError("Password must contain an uppercase letter")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain a digit")
        return v

# Valid registration
reg = Registration(username="Alice42", password="Secure1Pass", confirm_password="Secure1Pass")
print(f"Username: {reg.username}")

# Invalid username
try:
    Registration(username="a!", password="Secure1Pass", confirm_password="Secure1Pass")
except Exception as e:
    print(f"Error: {e}")

Output:

Username: alice42
Error: 1 validation error for Registration
username
  Value error, Username must contain only letters and numbers [type=value_error, ...]

Field validators receive the value after type validation and coercion (use mode="before" if you need the raw input) and can either return a transformed value (like v.lower()) or raise a ValueError with a descriptive message. The @classmethod decorator is the recommended idiom in V2: Pydantic treats validators as class methods either way, but being explicit keeps type checkers happy.

Pydantic field validators
@field_validator stamps your data with approval — or stamps it into the ground.

Model Validators

Model validators check relationships between multiple fields. Use them when validation depends on more than one field at a time.

# model_validator.py
from pydantic import BaseModel, model_validator

class DateRange(BaseModel):
    start_date: str
    end_date: str

    @model_validator(mode="after")
    def check_dates(self):
        # ISO-format date strings (YYYY-MM-DD) compare correctly as plain strings
        if self.start_date >= self.end_date:
            raise ValueError("end_date must be after start_date")
        return self

class Registration(BaseModel):
    password: str
    confirm_password: str

    @model_validator(mode="after")
    def passwords_match(self):
        if self.password != self.confirm_password:
            raise ValueError("Passwords do not match")
        return self

# Valid
dates = DateRange(start_date="2026-01-01", end_date="2026-12-31")
print(f"Range: {dates.start_date} to {dates.end_date}")

# Invalid -- passwords don't match
try:
    Registration(password="Secret1Pass", confirm_password="Different1Pass")
except Exception as e:
    print(f"Error: {e}")

Output:

Range: 2026-01-01 to 2026-12-31
Error: 1 validation error for Registration
  Value error, Passwords do not match [type=value_error, ...]

The mode="after" parameter means the validator runs after individual field validation is complete, so you can safely access all fields. Use mode="before" when you need to transform the raw input data before field-level validation runs.
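Here is a minimal sketch of mode="before" in action: the validator normalizes a comma-separated string into a list before the tags field is validated (the Tags model is illustrative):

```python
# before_validator.py -- normalizing raw input before field validation
from pydantic import BaseModel, model_validator

class Tags(BaseModel):
    tags: list[str]

    @model_validator(mode="before")
    @classmethod
    def split_string(cls, data):
        # Accept "a,b,c" as well as ["a", "b", "c"]
        if isinstance(data, dict) and isinstance(data.get("tags"), str):
            data["tags"] = [t.strip() for t in data["tags"].split(",")]
        return data

print(Tags(tags="python, pydantic, validation").tags)
```

Because the validator runs before field validation, the list[str] check still applies to whatever the validator returns.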

Nested Models

Real-world data is rarely flat. Pydantic handles nested structures by composing models inside other models. Validation cascades through every level automatically.

# nested_models.py
from pydantic import BaseModel
from typing import Optional

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class ContactInfo(BaseModel):
    email: str
    phone: Optional[str] = None
    address: Address

class Employee(BaseModel):
    name: str
    title: str
    department: str
    contact: ContactInfo

# Create from nested dictionaries
data = {
    "name": "Alice Johnson",
    "title": "Senior Developer",
    "department": "Engineering",
    "contact": {
        "email": "alice@company.com",
        "phone": "555-0123",
        "address": {
            "street": "123 Main St",
            "city": "Melbourne",
            "state": "VIC",
            "zip_code": "3000"
        }
    }
}

employee = Employee(**data)
print(f"Name: {employee.name}")
print(f"City: {employee.contact.address.city}")
print(f"Email: {employee.contact.email}")

Output:

Name: Alice Johnson
City: Melbourne
Email: alice@company.com

Pydantic validates every level of the nested structure. If the zip code is missing from the address, you get an error pointing to the exact path: contact -> address -> zip_code. This nested validation is especially valuable when parsing complex JSON from APIs or configuration files.
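You can see that path in the structured error data. This sketch uses a cut-down version of the models above with the zip code omitted:

```python
# nested_errors.py -- error paths point into the nested structure
from pydantic import BaseModel, ValidationError

class Address(BaseModel):
    city: str
    zip_code: str

class Employee(BaseModel):
    name: str
    address: Address

try:
    Employee(name="Alice", address={"city": "Melbourne"})  # zip_code missing
except ValidationError as e:
    print(e.errors()[0]["loc"])  # ('address', 'zip_code')
```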

Serialization: Models to Dictionaries and JSON

Pydantic models are not just for validation -- they also handle serialization. The model_dump() and model_dump_json() methods convert models back to dictionaries and JSON strings with fine-grained control.

# serialization.py
from pydantic import BaseModel, Field
from typing import Optional
from datetime import datetime

class Article(BaseModel):
    title: str
    content: str
    author: str
    published: bool = False
    created_at: datetime = Field(default_factory=datetime.now)
    internal_notes: Optional[str] = None

article = Article(
    title="Pydantic V2 Guide",
    content="Learn data validation...",
    author="Alice"
)

# Full dictionary
print("Full:", article.model_dump())

# Exclude internal fields
public = article.model_dump(exclude={"internal_notes", "created_at"})
print("Public:", public)

# Only include specific fields
summary = article.model_dump(include={"title", "author", "published"})
print("Summary:", summary)

# Skip fields with None values -- drops the unset internal_notes
clean = article.model_dump(exclude_none=True)
print("Clean:", clean)

# JSON string
json_str = article.model_dump_json(indent=2)
print("JSON:", json_str[:80], "...")

Output:

Full: {'title': 'Pydantic V2 Guide', 'content': 'Learn data validation...', 'author': 'Alice', 'published': False, 'created_at': datetime(...), 'internal_notes': None}
Public: {'title': 'Pydantic V2 Guide', 'content': 'Learn data validation...', 'author': 'Alice', 'published': False}
Summary: {'title': 'Pydantic V2 Guide', 'author': 'Alice', 'published': False}
Clean: {'title': 'Pydantic V2 Guide', 'content': 'Learn data validation...', 'author': 'Alice', 'published': False, 'created_at': datetime(...)}
JSON: {
  "title": "Pydantic V2 Guide",
  "content": "Learn data validation..." ...

The exclude and include parameters give you control over which fields appear in the output. This is essential for APIs where internal fields like notes or timestamps should not be sent to clients.
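Serialization also round-trips: model_dump_json produces a string that model_validate_json can turn back into an equal model. A minimal sketch with a trimmed Article:

```python
# round_trip.py -- serialize to JSON and validate back
from pydantic import BaseModel

class Article(BaseModel):
    title: str
    published: bool = False

a = Article(title="Pydantic V2 Guide")
json_str = a.model_dump_json()
restored = Article.model_validate_json(json_str)
print(restored == a)  # True -- models compare by field values
```

This round trip is handy for caching, message queues, or any place a model needs to survive a trip through text.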

Handling validation errors
Square peg, round hole. Pydantic will tell you exactly which corner doesn't fit.

Handling Validation Errors

When validation fails, Pydantic raises a ValidationError with detailed information about every problem. You can catch this exception and extract structured error data for API responses or logging.

# error_handling.py
from pydantic import BaseModel, Field, ValidationError

class Order(BaseModel):
    product: str = Field(..., min_length=1)
    quantity: int = Field(..., gt=0)
    price: float = Field(..., gt=0)
    email: str

# Multiple validation errors at once
try:
    Order(product="", quantity=-5, price="free", email="not-an-email")
except ValidationError as e:
    print(f"Error count: {e.error_count()}")
    print()
    for error in e.errors():
        print(f"Field: {error['loc']}")
        print(f"Message: {error['msg']}")
        print(f"Type: {error['type']}")
        print()

Output:

Error count: 3

Field: ('product',)
Message: String should have at least 1 character
Type: string_too_short

Field: ('quantity',)
Message: Input should be greater than 0
Type: greater_than

Field: ('price',)
Message: Input should be a valid number, unable to parse string as a number
Type: float_parsing

Pydantic collects all validation errors rather than stopping at the first one. Each error includes the field path (loc), a human-readable message (msg), and a machine-readable error type (type). In a FastAPI application, these errors are automatically converted to 422 responses with this same structured format.
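If you are building your own API responses rather than using FastAPI, a common pattern is flattening e.errors() into a field-to-message map. A sketch with a trimmed Order model:

```python
# error_map.py -- flatten errors into a field -> message dict
from pydantic import BaseModel, Field, ValidationError

class Order(BaseModel):
    quantity: int = Field(..., gt=0)
    price: float = Field(..., gt=0)

try:
    Order(quantity=-5, price=-1)
except ValidationError as e:
    # Join the loc tuple so nested paths become "contact.address.zip_code"
    error_map = {".".join(str(p) for p in err["loc"]): err["msg"]
                 for err in e.errors()}
    print(error_map)
```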

Model Configuration

Pydantic V2 uses model_config to customize model behavior. This replaces the inner class Config from V1.

# model_config.py
from pydantic import BaseModel, ConfigDict

class StrictUser(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,    # Strip leading/trailing whitespace
        str_min_length=1,             # No empty strings allowed
        frozen=True,                  # Immutable after creation
        extra="forbid",               # No extra fields allowed
    )

    name: str
    email: str

# Whitespace gets stripped automatically
user = StrictUser(name="  Alice  ", email="alice@example.com")
print(f"Name: '{user.name}'")

# Extra fields are rejected
try:
    StrictUser(name="Bob", email="bob@example.com", role="admin")
except Exception as e:
    print(f"Extra field error: {e}")

# Immutable -- cannot change after creation
try:
    user.name = "Charlie"
except Exception as e:
    print(f"Frozen error: {e}")

Output:

Name: 'Alice'
Extra field error: 1 validation error for StrictUser
role
  Extra inputs are not permitted [type=extra_forbidden, ...]
Frozen error: 1 validation error for StrictUser
name
  Instance is frozen [type=frozen_instance, ...]

The extra="forbid" setting is especially important for security -- it prevents attackers from injecting unexpected fields into your data models. The frozen=True setting makes models behave like named tuples, which is useful for configuration objects that should not be modified after creation.
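ConfigDict can also turn off type coercion entirely with strict=True, which is useful when you want exact types rather than best-effort conversion. A minimal sketch (the StrictOrder model is illustrative):

```python
# strict_mode.py -- strict=True disables coercion entirely
from pydantic import BaseModel, ConfigDict, ValidationError

class StrictOrder(BaseModel):
    model_config = ConfigDict(strict=True)
    quantity: int

try:
    StrictOrder(quantity="5")  # a lax model would coerce; strict rejects
except ValidationError as e:
    print(e.errors()[0]["type"])  # int_type

print(StrictOrder(quantity=5).quantity)  # an actual int passes
```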

Real-Life Example: Application Configuration Manager

Pydantic V2 performance
V2 rewrote the core in Rust. Your validation just got a turbocharger.

Here is a practical example: a type-safe application configuration system that reads from environment variables, validates every setting at startup, and provides clean access throughout your application.

# config_manager.py
import os
from pydantic import BaseModel, Field, field_validator, model_validator
from pydantic import ConfigDict
from typing import Optional

class DatabaseConfig(BaseModel):
    host: str = "localhost"
    port: int = Field(default=5432, ge=1, le=65535)
    name: str
    user: str
    password: str
    pool_size: int = Field(default=10, ge=1, le=100)

    @property
    def connection_url(self) -> str:
        return f"postgresql://{self.user}:{self.password}@{self.host}:{self.port}/{self.name}"

class CacheConfig(BaseModel):
    enabled: bool = True
    ttl_seconds: int = Field(default=300, ge=0)
    max_size: int = Field(default=1000, ge=1)

class LoggingConfig(BaseModel):
    level: str = "INFO"
    format: str = "json"

    @field_validator("level")
    @classmethod
    def validate_level(cls, v: str) -> str:
        allowed = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
        v_upper = v.upper()
        if v_upper not in allowed:
            raise ValueError(f"Log level must be one of {allowed}")
        return v_upper

    @field_validator("format")
    @classmethod
    def validate_format(cls, v: str) -> str:
        if v not in ("json", "text"):
            raise ValueError("Format must be 'json' or 'text'")
        return v

class AppConfig(BaseModel):
    model_config = ConfigDict(frozen=True)

    app_name: str = "MyApp"
    debug: bool = False
    api_version: str = "v1"
    database: DatabaseConfig
    cache: CacheConfig = CacheConfig()
    logging: LoggingConfig = LoggingConfig()

    @model_validator(mode="after")
    def check_debug_log_level(self):
        if self.debug and self.logging.level not in ("DEBUG", "INFO"):
            raise ValueError("Debug mode requires log level DEBUG or INFO")
        return self

# Create config from a settings dictionary
settings = {
    "app_name": "BookStore API",
    "debug": True,
    "database": {
        "host": "db.example.com",
        "port": 5432,
        "name": "bookstore",
        "user": "app_user",
        "password": "secure_password_here",
    },
    "cache": {"ttl_seconds": 600, "max_size": 5000},
    "logging": {"level": "debug", "format": "json"},
}

config = AppConfig(**settings)
print(f"App: {config.app_name}")
print(f"DB URL: {config.database.connection_url}")
print(f"Cache TTL: {config.cache.ttl_seconds}s")
print(f"Log Level: {config.logging.level}")
print(f"Debug: {config.debug}")

Output:

App: BookStore API
DB URL: postgresql://app_user:secure_password_here@db.example.com:5432/bookstore
Cache TTL: 600s
Log Level: DEBUG
Debug: True

This configuration model validates every setting at application startup. If the database port is out of range, the log level is invalid, or debug mode conflicts with the log level, you get a clear error immediately instead of a mysterious failure at runtime. The frozen=True config prevents accidental modification after initialization.

Frequently Asked Questions

What changed between Pydantic V1 and V2?

The biggest changes are: .dict() became .model_dump(), .json() became .model_dump_json(), inner class Config became model_config = ConfigDict(...), validators use @field_validator instead of @validator, and the core validation engine was rewritten in Rust for major performance improvements. The migration guide at docs.pydantic.dev/latest/migration covers every change.

How much faster is V2 than V1?

Pydantic V2 is 5-50x faster than V1 depending on the operation. Simple model creation is about 5x faster, while complex nested validation can see 50x improvements. The speed comes from the Rust core (pydantic-core) that handles parsing and validation natively.

Should I use Pydantic or dataclasses?

Use Pydantic when you need runtime validation (API inputs, config files, external data). Use dataclasses when you need simple data containers for internal application state where the data is already trusted. Pydantic also has a @pydantic.dataclasses.dataclass decorator that adds validation to standard dataclass syntax.
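A quick sketch of that decorator, which keeps dataclass syntax while adding V2 validation:

```python
# validated_dataclass.py -- dataclass syntax with Pydantic validation
from pydantic import ValidationError
from pydantic.dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

p = Point(x="3", y=4)   # "3" is coerced to int 3
print(p.x, type(p.x).__name__)

try:
    Point(x="oops", y=4)
except ValidationError as e:
    print(f"Errors: {e.error_count()}")
```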

How does Pydantic integrate with FastAPI?

FastAPI uses Pydantic models for request body validation, query parameter validation, and response serialization. When you define a FastAPI endpoint parameter as a Pydantic model, FastAPI automatically validates incoming JSON against it and returns 422 errors with Pydantic's structured error format.

Can I use Pydantic with SQLAlchemy or Django ORM?

Yes. Use model_config = ConfigDict(from_attributes=True) (formerly orm_mode) to create Pydantic models from ORM objects. This lets you validate and serialize database records through Pydantic models: UserSchema.model_validate(db_user) converts an ORM object into a validated Pydantic model.
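from_attributes works with any object that exposes the right attributes, so a plain class can stand in for an ORM instance in this sketch (ORMUser is a placeholder, not a real SQLAlchemy model):

```python
# from_attributes.py -- building a model from an attribute-bearing object
from pydantic import BaseModel, ConfigDict

class ORMUser:  # stand-in for a SQLAlchemy/Django model instance
    def __init__(self):
        self.id = 1
        self.name = "Alice"
        self.password_hash = "xxxx"  # present on the object, absent from the schema

class UserSchema(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
    name: str

user = UserSchema.model_validate(ORMUser())
print(user.model_dump())  # only the declared fields survive
```

This is also a clean way to keep sensitive columns like password hashes out of API responses: the schema simply never declares them.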

Conclusion

Pydantic V2 is the most practical data validation library in the Python ecosystem. Its combination of type hint-driven validation, automatic type coercion, structured error reporting, and Rust-powered performance makes it the right choice for any project that handles external data. The configuration manager example shows how a well-designed model catches errors at startup instead of letting them surface as runtime crashes.

Start by replacing your manual validation code with Pydantic models, then explore advanced features like computed fields, generic models, and custom types. The official documentation at docs.pydantic.dev is comprehensive and well-organized.