Every Python application that receives data from the outside world — API requests, configuration files, CSV imports, form submissions — faces the same problem: how do you guarantee that the data matches what your code expects? A missing field crashes your function. A string where you expected an integer causes silent bugs. An email address without an “@” slips into your database. Manual validation with if-else chains works for one or two fields, but it does not scale.
Pydantic V2 solves this by letting you define data shapes as Python classes with type hints, then validating and converting incoming data automatically. Released in mid-2023, V2 is a complete rewrite of Pydantic with a Rust-powered core that runs 5-50x faster than V1. It is the validation engine behind FastAPI, and it works just as well standalone in any Python project. Install it with pip install pydantic.
This tutorial covers everything you need to use Pydantic V2 effectively: defining models with type annotations, using built-in validators and constraints, writing custom validation logic, working with nested models, serializing data to dictionaries and JSON, and handling validation errors gracefully. By the end, you will be able to validate any data structure your application encounters.
## Pydantic Validation in 30 Seconds
Here is the smallest useful Pydantic model. It validates a user’s data and converts types automatically.
```python
# quick_example.py
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

# Valid data -- works perfectly
user = User(name="Alice", age=30, email="alice@example.com")
print(user)
print(user.model_dump())

# Type coercion -- "25" becomes int 25
user2 = User(name="Bob", age="25", email="bob@example.com")
print(f"Bob's age: {user2.age} (type: {type(user2.age).__name__})")
```

Output:

```
name='Alice' age=30 email='alice@example.com'
{'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}
Bob's age: 25 (type: int)
```
Notice that Pydantic automatically converted the string "25" to the integer 25 because the age field is typed as int. This type coercion is one of Pydantic’s most practical features — it handles the messy reality of data that comes in as strings from JSON, forms, or environment variables.
## What Is Pydantic and Why Use It?
Pydantic is a data validation library that uses Python type hints to define data structures and validate them at runtime. When you create a Pydantic model instance, it checks every field against its declared type, applies any constraints you have defined, and raises a detailed error if anything is wrong.
| Approach | Lines of Code | Type Coercion | Error Messages | Nested Validation |
|---|---|---|---|---|
| Manual if-else | Many | Manual | You write them | You build it |
| dataclasses | Few | None | Basic TypeError | None |
| Pydantic V2 | Few | Automatic | Detailed, structured | Built-in |
| marshmallow | Moderate | Configurable | Detailed | Built-in |
The key advantage of Pydantic over alternatives like dataclasses or attrs is that it validates at runtime. A dataclass with age: int happily accepts age="hello" — it only declares the type hint without enforcing it. Pydantic actually checks and converts the value, raising a ValidationError if conversion fails.
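Here is that difference in a short side-by-side sketch: the dataclass accepts a string where its hint says int, while the Pydantic model rejects it with a structured error.

```python
# dataclass_vs_pydantic.py
from dataclasses import dataclass
from pydantic import BaseModel, ValidationError

@dataclass
class DataclassUser:
    age: int  # just a hint -- never enforced

class PydanticUser(BaseModel):
    age: int  # checked and coerced at runtime

# The dataclass happily stores the wrong type
dc = DataclassUser(age="hello")
print(type(dc.age).__name__)  # str

# Pydantic raises a ValidationError with a machine-readable error type
try:
    PydanticUser(age="hello")
except ValidationError as e:
    print(e.errors()[0]["type"])  # int_parsing
```

The dataclass bug would surface later, far from its cause -- for example when arithmetic on age fails; Pydantic reports it at the boundary where the bad data entered.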
## Built-in Field Types and Constraints
Pydantic supports all standard Python types plus specialized types for common validation patterns. The Field function adds constraints like minimum length, numeric ranges, and regex patterns.
```python
# field_types.py
from pydantic import BaseModel, Field
from typing import Optional
from datetime import datetime

class Product(BaseModel):
    name: str = Field(..., min_length=1, max_length=100)
    price: float = Field(..., gt=0, description="Price in dollars")
    quantity: int = Field(default=0, ge=0)
    sku: str = Field(..., pattern=r"^[A-Z]{2}-\d{4}$")
    description: Optional[str] = None
    created_at: datetime = Field(default_factory=datetime.now)

# Valid product
product = Product(name="Widget", price=19.99, sku="AB-1234")
print(product.model_dump())

# Invalid -- price is negative
try:
    bad = Product(name="Widget", price=-5, sku="AB-1234")
except Exception as e:
    print(f"Error: {e}")
```

Output:

```
{'name': 'Widget', 'price': 19.99, 'quantity': 0, 'sku': 'AB-1234', 'description': None, 'created_at': datetime.datetime(...)}
Error: 1 validation error for Product
price
  Input should be greater than 0 [type=greater_than, input_value=-5, input_type=int]
```
The Field function is where you add constraints beyond basic type checking. The ... (Ellipsis) means the field is required. The gt=0 constraint rejects zero and negative numbers. The pattern constraint validates the SKU format with a regex. All of these constraints are checked automatically when you create the model instance.
### Common Pydantic Types
Pydantic provides specialized types that go beyond basic Python types. These handle common validation patterns that would otherwise require custom code.
| Type | What It Validates | Example |
|---|---|---|
| EmailStr | Valid email format | "user@example.com" |
| HttpUrl | Valid HTTP/HTTPS URL | "https://example.com" |
| IPvAnyAddress | Valid IPv4 or IPv6 address | "192.168.1.1" |
| SecretStr | String hidden in repr/logs | "s3cr3t" (shows "**********") |
| PositiveInt | Integer greater than 0 | 42 |
| FutureDatetime | Datetime in the future | "2027-01-01T00:00:00" |
| constr | Constrained string (length, pattern) | constr(min_length=3) |
To use EmailStr, install the optional dependency: pip install pydantic[email]. The other types are available in the core package.
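Here is a small sketch using two of the core types from the table: SecretStr masks its value everywhere except an explicit get_secret_value() call, and PositiveInt rejects zero and negatives.

```python
# specialized_types.py -- SecretStr and PositiveInt in action
from pydantic import BaseModel, PositiveInt, SecretStr, ValidationError

class ServiceAccount(BaseModel):
    api_key: SecretStr   # masked in repr, str(), and logs
    retries: PositiveInt # must be greater than 0

acct = ServiceAccount(api_key="s3cr3t", retries=3)
print(acct.api_key)                     # **********
print(acct.api_key.get_secret_value())  # s3cr3t -- explicit access only

# PositiveInt rejects zero
try:
    ServiceAccount(api_key="s3cr3t", retries=0)
except ValidationError as e:
    print(e.errors()[0]["type"])  # greater_than
```

SecretStr is particularly useful for configuration models that get dumped into logs: the secret stays readable to your code but never leaks into output by accident.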
## Custom Validators
When built-in constraints are not enough, Pydantic V2 provides the @field_validator and @model_validator decorators for custom validation logic.
### Field Validators
```python
# custom_validators.py
from pydantic import BaseModel, field_validator

class Registration(BaseModel):
    username: str
    password: str
    confirm_password: str

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must contain only letters and numbers")
        if len(v) < 3:
            raise ValueError("Username must be at least 3 characters")
        return v.lower()  # Normalize to lowercase

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in v):
            raise ValueError("Password must contain an uppercase letter")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain a digit")
        return v

# Valid registration
reg = Registration(username="Alice42", password="Secure1Pass", confirm_password="Secure1Pass")
print(f"Username: {reg.username}")

# Invalid username
try:
    Registration(username="a!", password="Secure1Pass", confirm_password="Secure1Pass")
except Exception as e:
    print(f"Error: {e}")
```

Output:

```
Username: alice42
Error: 1 validation error for Registration
username
  Value error, Username must contain only letters and numbers [type=value_error, ...]
```
Field validators receive the raw value and can either return a transformed value (like v.lower()) or raise a ValueError with a descriptive message. The @classmethod decorator is required in V2.
### Model Validators
Model validators check relationships between multiple fields. Use them when validation depends on more than one field at a time.
```python
# model_validator.py
from pydantic import BaseModel, model_validator

class DateRange(BaseModel):
    start_date: str
    end_date: str

    @model_validator(mode="after")
    def check_dates(self):
        # ISO date strings compare correctly as strings
        if self.start_date >= self.end_date:
            raise ValueError("end_date must be after start_date")
        return self

class Registration(BaseModel):
    password: str
    confirm_password: str

    @model_validator(mode="after")
    def passwords_match(self):
        if self.password != self.confirm_password:
            raise ValueError("Passwords do not match")
        return self

# Valid
dates = DateRange(start_date="2026-01-01", end_date="2026-12-31")
print(f"Range: {dates.start_date} to {dates.end_date}")

# Invalid -- passwords don't match
try:
    Registration(password="Secret1Pass", confirm_password="Different1Pass")
except Exception as e:
    print(f"Error: {e}")
```

Output:

```
Range: 2026-01-01 to 2026-12-31
Error: 1 validation error for Registration
  Value error, Passwords do not match [type=value_error, ...]
```
The mode="after" parameter means the validator runs after individual field validation is complete, so you can safely access all fields. Use mode="before" when you need to transform the raw input data before field-level validation runs.
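As a sketch of mode="before": this hypothetical Point model accepts a raw "x,y" string and converts it into a dict before field validation runs.

```python
# before_validator.py -- transforming raw input with mode="before"
from pydantic import BaseModel, model_validator

class Point(BaseModel):
    x: float
    y: float

    @model_validator(mode="before")
    @classmethod
    def split_coordinate_string(cls, data):
        # Accept "3.0,4.0" shorthand; pass dicts through untouched
        if isinstance(data, str):
            x, y = data.split(",")
            return {"x": x, "y": y}
        return data

p = Point.model_validate("3.0,4.0")
print(p.x, p.y)  # 3.0 4.0
```

Because the validator runs before type checking, it receives whatever raw input was passed in -- here a string -- and its return value is what field validation then sees.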
## Nested Models
Real-world data is rarely flat. Pydantic handles nested structures by composing models inside other models. Validation cascades through every level automatically.
```python
# nested_models.py
from pydantic import BaseModel
from typing import Optional

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class ContactInfo(BaseModel):
    email: str
    phone: Optional[str] = None
    address: Address

class Employee(BaseModel):
    name: str
    title: str
    department: str
    contact: ContactInfo

# Create from nested dictionaries
data = {
    "name": "Alice Johnson",
    "title": "Senior Developer",
    "department": "Engineering",
    "contact": {
        "email": "alice@company.com",
        "phone": "555-0123",
        "address": {
            "street": "123 Main St",
            "city": "Melbourne",
            "state": "VIC",
            "zip_code": "3000"
        }
    }
}

employee = Employee(**data)
print(f"Name: {employee.name}")
print(f"City: {employee.contact.address.city}")
print(f"Email: {employee.contact.email}")
```

Output:

```
Name: Alice Johnson
City: Melbourne
Email: alice@company.com
```
Pydantic validates every level of the nested structure. If the zip code is missing from the address, you get an error pointing to the exact path: contact -> address -> zip_code. This nested validation is especially valuable when parsing complex JSON from APIs or configuration files.
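To see that path reporting in action, here is a self-contained sketch (it redefines the three models from above) that omits the zip code and inspects the error's loc field.

```python
# nested_error_path.py -- where did the nested validation fail?
from pydantic import BaseModel, ValidationError
from typing import Optional

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class ContactInfo(BaseModel):
    email: str
    phone: Optional[str] = None
    address: Address

class Employee(BaseModel):
    name: str
    title: str
    department: str
    contact: ContactInfo

try:
    Employee(
        name="Bob",
        title="Developer",
        department="Engineering",
        # zip_code is missing from the innermost dict
        contact={"email": "bob@company.com",
                 "address": {"street": "1 Elm St", "city": "Perth", "state": "WA"}},
    )
except ValidationError as e:
    print(e.errors()[0]["loc"])   # ('contact', 'address', 'zip_code')
    print(e.errors()[0]["type"])  # missing
```

The loc tuple is the exact path through the nested structure, which makes it straightforward to map errors back to fields in a deeply nested form or JSON payload.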
## Serialization: Models to Dictionaries and JSON
Pydantic models are not just for validation -- they also handle serialization. The model_dump() and model_dump_json() methods convert models back to dictionaries and JSON strings with fine-grained control.
```python
# serialization.py
from pydantic import BaseModel, Field
from typing import Optional
from datetime import datetime

class Article(BaseModel):
    title: str
    content: str
    author: str
    published: bool = False
    created_at: datetime = Field(default_factory=datetime.now)
    internal_notes: Optional[str] = None

article = Article(
    title="Pydantic V2 Guide",
    content="Learn data validation...",
    author="Alice",
    internal_notes="Draft needs review"
)

# Full dictionary
print("Full:", article.model_dump())

# Exclude internal fields
public = article.model_dump(exclude={"internal_notes", "created_at"})
print("Public:", public)

# Only include specific fields
summary = article.model_dump(include={"title", "author", "published"})
print("Summary:", summary)

# Skip fields with None values (every field is set here, so this matches Full)
clean = article.model_dump(exclude_none=True)
print("Clean:", clean)

# JSON string
json_str = article.model_dump_json(indent=2)
print("JSON:", json_str[:80], "...")
```

Output:

```
Full: {'title': 'Pydantic V2 Guide', 'content': 'Learn data validation...', 'author': 'Alice', 'published': False, 'created_at': datetime.datetime(...), 'internal_notes': 'Draft needs review'}
Public: {'title': 'Pydantic V2 Guide', 'content': 'Learn data validation...', 'author': 'Alice', 'published': False}
Summary: {'title': 'Pydantic V2 Guide', 'author': 'Alice', 'published': False}
Clean: {'title': 'Pydantic V2 Guide', 'content': 'Learn data validation...', 'author': 'Alice', 'published': False, 'created_at': datetime.datetime(...), 'internal_notes': 'Draft needs review'}
JSON: {
  "title": "Pydantic V2 Guide",
  "content": "Learn data validation..." ...
```
The exclude and include parameters give you control over which fields appear in the output. This is essential for APIs where internal fields like notes or timestamps should not be sent to clients.
## Handling Validation Errors
When validation fails, Pydantic raises a ValidationError with detailed information about every problem. You can catch this exception and extract structured error data for API responses or logging.
```python
# error_handling.py
from pydantic import BaseModel, Field, ValidationError

class Order(BaseModel):
    product: str = Field(..., min_length=1)
    quantity: int = Field(..., gt=0)
    price: float = Field(..., gt=0)
    email: str

# Multiple validation errors at once
try:
    Order(product="", quantity=-5, price="free", email="not-an-email")
except ValidationError as e:
    print(f"Error count: {e.error_count()}")
    print()
    for error in e.errors():
        print(f"Field: {error['loc']}")
        print(f"Message: {error['msg']}")
        print(f"Type: {error['type']}")
        print()
```

Output:

```
Error count: 3

Field: ('product',)
Message: String should have at least 1 character
Type: string_too_short

Field: ('quantity',)
Message: Input should be greater than 0
Type: greater_than

Field: ('price',)
Message: Input should be a valid number, unable to parse string as a number
Type: float_parsing
```
Pydantic collects all validation errors rather than stopping at the first one. Each error includes the field path (loc), a human-readable message (msg), and a machine-readable error type (type). In a FastAPI application, these errors are automatically converted to 422 responses with this same structured format.
## Model Configuration
Pydantic V2 uses model_config to customize model behavior. This replaces the inner class Config from V1.
```python
# model_config.py
from pydantic import BaseModel, ConfigDict

class StrictUser(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,  # Strip leading/trailing whitespace
        str_min_length=1,           # No empty strings allowed
        frozen=True,                # Immutable after creation
        extra="forbid",             # No extra fields allowed
    )

    name: str
    email: str

# Whitespace gets stripped automatically
user = StrictUser(name="  Alice  ", email="alice@example.com")
print(f"Name: '{user.name}'")

# Extra fields are rejected
try:
    StrictUser(name="Bob", email="bob@example.com", role="admin")
except Exception as e:
    print(f"Extra field error: {e}")

# Immutable -- cannot change after creation
try:
    user.name = "Charlie"
except Exception as e:
    print(f"Frozen error: {e}")
```

Output:

```
Name: 'Alice'
Extra field error: 1 validation error for StrictUser
role
  Extra inputs are not permitted [type=extra_forbidden, ...]
Frozen error: 1 validation error for StrictUser
name
  Instance is frozen [type=frozen_instance, ...]
```
The extra="forbid" setting is especially important for security -- it prevents attackers from injecting unexpected fields into your data models. The frozen=True setting makes models behave like named tuples, which is useful for configuration objects that should not be modified after creation.
## Real-Life Example: Application Configuration Manager
Here is a practical example: a type-safe application configuration system that validates every setting at startup and provides clean access throughout your application. The settings dictionary is hard-coded here for clarity; in a real deployment you would build it from environment variables or a configuration file.
```python
# config_manager.py
from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator

class DatabaseConfig(BaseModel):
    host: str = "localhost"
    port: int = Field(default=5432, ge=1, le=65535)
    name: str
    user: str
    password: str
    pool_size: int = Field(default=10, ge=1, le=100)

    @property
    def connection_url(self) -> str:
        return f"postgresql://{self.user}:{self.password}@{self.host}:{self.port}/{self.name}"

class CacheConfig(BaseModel):
    enabled: bool = True
    ttl_seconds: int = Field(default=300, ge=0)
    max_size: int = Field(default=1000, ge=1)

class LoggingConfig(BaseModel):
    level: str = "INFO"
    format: str = "json"

    @field_validator("level")
    @classmethod
    def validate_level(cls, v: str) -> str:
        allowed = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
        v_upper = v.upper()
        if v_upper not in allowed:
            raise ValueError(f"Log level must be one of {allowed}")
        return v_upper

    @field_validator("format")
    @classmethod
    def validate_format(cls, v: str) -> str:
        if v not in ("json", "text"):
            raise ValueError("Format must be 'json' or 'text'")
        return v

class AppConfig(BaseModel):
    model_config = ConfigDict(frozen=True)

    app_name: str = "MyApp"
    debug: bool = False
    api_version: str = "v1"
    database: DatabaseConfig
    cache: CacheConfig = CacheConfig()
    logging: LoggingConfig = LoggingConfig()

    @model_validator(mode="after")
    def check_debug_log_level(self):
        if self.debug and self.logging.level not in ("DEBUG", "INFO"):
            raise ValueError("Debug mode requires log level DEBUG or INFO")
        return self

# Create config from a settings dictionary
settings = {
    "app_name": "BookStore API",
    "debug": True,
    "database": {
        "host": "db.example.com",
        "port": 5432,
        "name": "bookstore",
        "user": "app_user",
        "password": "secure_password_here",
    },
    "cache": {"ttl_seconds": 600, "max_size": 5000},
    "logging": {"level": "debug", "format": "json"},
}

config = AppConfig(**settings)
print(f"App: {config.app_name}")
print(f"DB URL: {config.database.connection_url}")
print(f"Cache TTL: {config.cache.ttl_seconds}s")
print(f"Log Level: {config.logging.level}")
print(f"Debug: {config.debug}")
```

Output:

```
App: BookStore API
DB URL: postgresql://app_user:secure_password_here@db.example.com:5432/bookstore
Cache TTL: 600s
Log Level: DEBUG
Debug: True
```
This configuration model validates every setting at application startup. If the database port is out of range, the log level is invalid, or debug mode conflicts with the log level, you get a clear error immediately instead of a mysterious failure at runtime. The frozen=True config prevents accidental modification after initialization.
## Frequently Asked Questions
### What changed between Pydantic V1 and V2?
The biggest changes are: .dict() became .model_dump(), .json() became .model_dump_json(), inner class Config became model_config = ConfigDict(...), validators use @field_validator instead of @validator, and the core validation engine was rewritten in Rust for major performance improvements. The migration guide at docs.pydantic.dev/latest/migration covers every change.
### How much faster is V2 than V1?
Pydantic V2 is 5-50x faster than V1 depending on the operation. Simple model creation is about 5x faster, while complex nested validation can see 50x improvements. The speed comes from the Rust core (pydantic-core) that handles parsing and validation natively.
### Should I use Pydantic or dataclasses?
Use Pydantic when you need runtime validation (API inputs, config files, external data). Use dataclasses when you need simple data containers for internal application state where the data is already trusted. Pydantic also has a @pydantic.dataclasses.dataclass decorator that adds validation to standard dataclass syntax.
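A quick sketch of that decorator -- standard dataclass syntax, but with Pydantic's validation and coercion applied at construction:

```python
# validated_dataclass.py -- pydantic.dataclasses in brief
from pydantic import dataclasses

@dataclasses.dataclass
class Point:
    x: int
    y: int

p = Point(x="3", y=4)  # "3" is coerced to int 3
print(p.x, p.y)        # 3 4
```

This is a convenient middle ground when you want dataclass ergonomics but the data crosses a trust boundary.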
### How does Pydantic integrate with FastAPI?
FastAPI uses Pydantic models for request body validation, query parameter validation, and response serialization. When you define a FastAPI endpoint parameter as a Pydantic model, FastAPI automatically validates incoming JSON against it and returns 422 errors with Pydantic's structured error format.
### Can I use Pydantic with SQLAlchemy or Django ORM?
Yes. Use model_config = ConfigDict(from_attributes=True) (formerly orm_mode) to create Pydantic models from ORM objects. This lets you validate and serialize database records through Pydantic models: UserSchema.model_validate(db_user) converts an ORM object into a validated Pydantic model.
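Here is a minimal sketch, using a plain class to stand in for an ORM object -- from_attributes tells Pydantic to read fields as attributes rather than dictionary keys:

```python
# orm_mode.py -- from_attributes with a stand-in for an ORM object
from pydantic import BaseModel, ConfigDict

class ORMUser:  # imagine a SQLAlchemy or Django model instance
    def __init__(self):
        self.id = 1
        self.name = "Alice"
        self.email = "alice@example.com"
        self.password_hash = "x9f..."  # not in the schema, so never exposed

class UserSchema(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    name: str
    email: str

schema = UserSchema.model_validate(ORMUser())
print(schema.model_dump())  # {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}
```

Note that only the fields declared on the schema are read, which is exactly what you want when serializing database rows for an API response.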
## Conclusion
Pydantic V2 is the most practical data validation library in the Python ecosystem. Its combination of type hint-driven validation, automatic type coercion, structured error reporting, and Rust-powered performance makes it the right choice for any project that handles external data. The configuration manager example shows how a well-designed model catches errors at startup instead of letting them surface as runtime crashes.
Start by replacing your manual validation code with Pydantic models, then explore advanced features like computed fields, generic models, and custom types. The official documentation at docs.pydantic.dev is comprehensive and well-organized.