Intermediate

Your Python service processes thousands of JSON messages per second: API responses, message queue events, webhook payloads. Every message needs to be decoded, validated against a schema, and encoded again. You are using Pydantic, which is excellent, but its validation overhead starts to show at scale. You need something faster. msgspec is a C-extension library for Python that encodes and decodes JSON (plus MessagePack and other formats) while validating against typed Python structs. It is often several times faster than Pydantic V2 (the rough figures in this article range from 3x to 10x, depending on the workload), which is itself already fast.

msgspec defines data models using its Struct class — similar to dataclasses but implemented in C for maximum performance. It handles encoding, decoding, and type validation in a single pass, with no intermediate Python dictionaries. The result is encode/decode cycles that are often faster than pure JSON parsing with the built-in json module, because msgspec validates and structures the data during parsing rather than after it.

In this article you will learn how to install msgspec, define Struct models, encode and decode JSON, handle nested structures and optional fields, validate incoming data with error handling, and benchmark msgspec against the standard json module. By the end you will know when msgspec is the right choice and how to integrate it into a real application.

msgspec Quick Example

Here is the minimal setup — define a Struct, encode it to JSON, and decode it back:

# msgspec_quick.py
import msgspec
import msgspec.json

class User(msgspec.Struct):
    id: int
    name: str
    email: str
    active: bool = True

# Create an instance
user = User(id=1, name="Alice Smith", email="alice@example.com")
print(user)

# Encode to JSON bytes
encoded = msgspec.json.encode(user)
print(encoded)

# Decode back from JSON bytes
decoded = msgspec.json.decode(encoded, type=User)
print(decoded)
print(decoded.name)

Output:

User(id=1, name='Alice Smith', email='alice@example.com', active=True)
b'{"id":1,"name":"Alice Smith","email":"alice@example.com","active":true}'
User(id=1, name='Alice Smith', email='alice@example.com', active=True)
Alice Smith

The Struct class uses standard Python type annotations. msgspec.json.encode() returns bytes (not a string) for maximum efficiency. msgspec.json.decode() takes the target type through its type keyword argument, which is what enables parsing and validation in a single pass. Keep reading to see how to handle optional fields, nested structs, and validation errors.

What Is msgspec and When To Use It?

msgspec is a C extension library that implements its own JSON encoder/decoder optimized specifically for Python type-annotated structs. It skips the intermediate Python dict that the standard json module creates and instead constructs the target type directly during parsing. This is why it is faster than even the fastest pure-Python alternatives.

Library            Relative speed (decode + validate)   Best for
msgspec            1x (baseline)                        High-throughput APIs, message queues
Pydantic V2        ~3-8x slower                         Complex validation rules, FastAPI
attrs + cattrs     ~5-10x slower                        Domain modeling with converters
json + dataclass   ~8-15x slower                        Simple apps, no validation needed
json + dict        ~2-4x slower                         Quick scripts, no type safety needed

Use msgspec when you need type-safe JSON handling and performance is a concern — high-traffic APIs, message queue consumers, data pipeline ingestion. Use Pydantic when you need complex validation rules, custom validators, or FastAPI’s dependency injection integration.

Parse JSON. Validate type. Build the struct. One C-extension pass.

Installing msgspec

# terminal
pip install msgspec

Verify:

# verify_msgspec.py
import msgspec
print(msgspec.__version__)

Output:

0.18.6

Defining Structs

msgspec Struct supports most of the standard type annotations you would expect, including Optional, list, dict, Union, Literal, and nested Structs:

# msgspec_structs.py
from typing import Optional
import msgspec

class Address(msgspec.Struct):
    street: str
    city: str
    country: str = "AU"

class Product(msgspec.Struct):
    sku: str
    name: str
    price: float
    tags: list[str] = []
    metadata: dict[str, str] = {}

class Order(msgspec.Struct):
    id: int
    customer_email: str
    items: list[Product]
    shipping_address: Address
    notes: Optional[str] = None
    discount_pct: float = 0.0

# Build a nested structure
order = Order(
    id=1001,
    customer_email="alice@example.com",
    items=[
        Product(sku="PY-001", name="Python Course", price=49.99, tags=["education", "tech"]),
        Product(sku="PY-002", name="VS Code Theme", price=9.99),
    ],
    shipping_address=Address(street="42 George Street", city="Sydney"),
    discount_pct=10.0,
)

encoded = msgspec.json.encode(order)
print(encoded.decode())

Output:

{"id":1001,"customer_email":"alice@example.com","items":[{"sku":"PY-001","name":"Python Course","price":49.99,"tags":["education","tech"],"metadata":{}},{"sku":"PY-002","name":"VS Code Theme","price":9.99,"tags":[],"metadata":{}}],"shipping_address":{"street":"42 George Street","city":"Sydney","country":"AU"},"notes":null,"discount_pct":10.0}

Default values work much like dataclasses, with one convenience: an empty mutable default such as [] or {} is allowed, and msgspec gives each instance its own copy. Optional[str] means the field can be a string or None. Nested Structs are encoded and decoded recursively. Lists of Structs (list[Product]) are fully supported and validated: each item in the JSON array must be a valid Product.

Decoding and Validation

The real power of msgspec is simultaneous decoding and validation. When a JSON input does not match the expected schema, msgspec raises a ValidationError with a precise error message:

# msgspec_validation.py
import msgspec
import msgspec.json

class Event(msgspec.Struct):
    event_type: str
    user_id: int
    payload: dict[str, str]
    version: int = 1

valid_json = b'{"event_type":"login","user_id":42,"payload":{"ip":"1.2.3.4"}}'
invalid_json_1 = b'{"event_type":"login","user_id":"not_an_int","payload":{}}'
invalid_json_2 = b'{"event_type":"login","payload":{}}'  # missing user_id

# Successful decode
event = msgspec.json.decode(valid_json, type=Event)
print(f"Decoded: {event}")

# Type mismatch
try:
    msgspec.json.decode(invalid_json_1, type=Event)
except msgspec.ValidationError as e:
    print(f"Type error: {e}")

# Missing required field
try:
    msgspec.json.decode(invalid_json_2, type=Event)
except msgspec.ValidationError as e:
    print(f"Missing field: {e}")

Output:

Decoded: Event(event_type='login', user_id=42, payload={'ip': '1.2.3.4'}, version=1)
Type error: Expected `int`, got `str` - at `$.user_id`
Missing field: Object missing required field `user_id` - at `$`

The error message includes a JSONPath-like location ($.user_id) that tells you exactly where the validation failed. This is critical for debugging and for returning meaningful error responses in API handlers. The ValidationError is lightweight and catches both type errors and structural errors (missing required fields, extra fields if you use forbid_unknown_fields=True).

Expected int, got str — at $.user_id. No stack trace archaeology required.

Using the Encoder and Decoder for Performance

For high-throughput code, create a reusable Encoder and Decoder instead of calling module-level functions. This avoids recreating type lookup tables on every call:

# msgspec_encoder_decoder.py
import msgspec
import msgspec.json
import time

class Metric(msgspec.Struct):
    name: str
    value: float
    timestamp: int
    labels: dict[str, str] = {}

# Create reusable encoder/decoder (do this once at module level)
encoder = msgspec.json.Encoder()
decoder = msgspec.json.Decoder(Metric)

# Sample JSON payloads
sample = b'{"name":"cpu_usage","value":72.5,"timestamp":1746201600,"labels":{"host":"web-01"}}'

# Benchmark: 100,000 decode+encode cycles
N = 100_000
start = time.perf_counter()

for _ in range(N):
    metric = decoder.decode(sample)
    encoded = encoder.encode(metric)

elapsed = time.perf_counter() - start
print(f"msgspec: {N:,} decode+encode cycles in {elapsed:.3f}s")
print(f"  Rate: {N/elapsed:,.0f} ops/sec")
print(f"  Per op: {elapsed/N*1000:.4f}ms")
print(f"\nSample output: {encoded}")

Output:

msgspec: 100,000 decode+encode cycles in 0.187s
  Rate: 534,759 ops/sec
  Per op: 0.0019ms

Sample output: b'{"name":"cpu_usage","value":72.5,"timestamp":1746201600,"labels":{"host":"web-01"}}'

At 500,000+ operations per second on the machine used for this run, a single Python process can handle very high message throughput; your numbers will vary with hardware and payload size. Reusing module-level Encoder and Decoder instances instead of calling msgspec.json.encode()/decode() directly can give a further speedup on hot paths (roughly 10-20% in this kind of micro-benchmark). Initialize them once at module level, not inside request handlers.

Real-Life Example: API Event Processor

Webhook at 50k messages/sec. msgspec handles it.

Here is a complete webhook event processor that uses msgspec for fast, type-safe event handling:

# event_processor.py
import msgspec
import msgspec.json
from typing import Optional, Literal
from datetime import datetime, timezone

class WebhookUser(msgspec.Struct):
    id: int
    email: str
    name: Optional[str] = None

class WebhookEvent(msgspec.Struct):
    event: Literal["user.created", "user.deleted", "order.placed", "order.refunded"]
    data: WebhookUser
    timestamp: int
    webhook_id: str
    api_version: str = "2026-01"

# Typed event handlers
def handle_user_created(event: WebhookEvent):
    print(f"New user: {event.data.name or 'Unknown'} <{event.data.email}>")

def handle_order_placed(event: WebhookEvent):
    # Render the Unix timestamp in UTC so the output does not depend on the local timezone
    ts = datetime.fromtimestamp(event.timestamp, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")
    print(f"Order placed by user {event.data.id} at {ts}")

HANDLERS = {
    "user.created": handle_user_created,
    "user.deleted": lambda e: print(f"User {e.data.id} deleted"),
    "order.placed": handle_order_placed,
    "order.refunded": lambda e: print(f"Order refunded for {e.data.email}"),
}

decoder = msgspec.json.Decoder(WebhookEvent)

def process_webhook(raw_json: bytes) -> bool:
    try:
        event = decoder.decode(raw_json)
        handler = HANDLERS.get(event.event)
        if handler:
            handler(event)
            return True
        else:
            print(f"Unknown event type: {event.event}")
            return False
    except msgspec.ValidationError as e:
        print(f"Invalid webhook payload: {e}")
        return False
    except Exception as e:
        print(f"Processing error: {e}")
        return False

# Test with sample payloads
payloads = [
    b'{"event":"user.created","data":{"id":1,"email":"alice@example.com","name":"Alice"},"timestamp":1746201600,"webhook_id":"wh_abc123"}',
    b'{"event":"order.placed","data":{"id":1,"email":"alice@example.com"},"timestamp":1746201700,"webhook_id":"wh_def456"}',
    b'{"event":"user.created","data":{"id":"not_int","email":"bad@example.com"},"timestamp":1746201800,"webhook_id":"wh_err"}',
]

print("Processing webhooks:\n")
for payload in payloads:
    success = process_webhook(payload)
    print(f"  -> {'OK' if success else 'FAILED'}\n")

Output (timestamps rendered in UTC):

Processing webhooks:

New user: Alice <alice@example.com>
  -> OK

Order placed by user 1 at 2025-05-02 16:01
  -> OK

Invalid webhook payload: Expected `int`, got `str` - at `$.data.id`
  -> FAILED

The Literal["user.created", "user.deleted", ...] annotation restricts the event field to exactly those four strings — msgspec validates this and raises a ValidationError for any other value. This eliminates a whole class of bugs where unexpected event types silently pass through to the wrong handler. The pattern — module-level decoder, try/except ValidationError, dispatch to typed handlers — is the production-ready approach for high-throughput webhook processing.

Pydantic for complex validation. msgspec for when you need the microseconds back.

Frequently Asked Questions

When should I use msgspec instead of Pydantic?

Use msgspec when performance is a primary concern and you do not need Pydantic’s advanced features, such as custom validators (@field_validator in Pydantic V2), computed fields, or FastAPI’s native Pydantic integration. msgspec does cover some of the same ground: it can rename fields and generate JSON Schema. msgspec is often 5-10x faster than Pydantic V2 for encode/decode, depending on the workload, but Pydantic has a richer ecosystem and more flexible validation. For most FastAPI projects, Pydantic is the right choice. For high-throughput background services, msgspec often wins.

Does msgspec support formats other than JSON?

Yes — msgspec supports MessagePack (msgspec.msgpack) which is a binary format that is both faster and more compact than JSON. It also supports YAML and TOML via optional extras. MessagePack is the right choice for internal service-to-service communication where human readability is not needed and you want maximum throughput and minimum bandwidth.

Can I use msgspec with FastAPI?

Yes, but with some effort. FastAPI uses Pydantic models natively for request/response validation and OpenAPI schema generation. You can bypass this by accepting Request objects directly and decoding the raw body (await request.body()) with msgspec.json.decode(). If you already have a parsed dict, for example from await request.json(), msgspec.convert() validates it into a Struct. For new projects where performance is critical, consider the Litestar framework, which has native msgspec support.

How do I handle nullable fields and missing values?

Use Optional[Type] (equivalent to Type | None in Python 3.10+) for fields that can be null. The JSON value null maps to Python None. For fields that can be absent from the JSON entirely, set a default value: field: str = "" or field: Optional[str] = None. Fields without defaults are required — msgspec raises ValidationError if they are missing from the input.

How do I handle custom types like datetime?

msgspec has built-in support for datetime.datetime (encoded as ISO 8601 strings), uuid.UUID (encoded as strings), decimal.Decimal, and enum.Enum. For truly custom types, pass enc_hook=... to msgspec.json.Encoder() and dec_hook=... to msgspec.json.Decoder(). The encoding hook receives the unsupported object and must return a value msgspec can serialize; the decoding hook receives the target type and the decoded value and must return an instance of that type.

Conclusion

msgspec delivers JSON encoding and decoding with type validation at speeds that make it practical for the highest-throughput Python services. The key patterns: define your schema with msgspec.Struct using standard type annotations; use Optional[T] for nullable fields and defaults for optional ones; create module-level Encoder and Decoder instances for hot paths; and always wrap decoder.decode() in a try/except for msgspec.ValidationError when handling external input.

The webhook processor example shows how to combine Literal type constraints, nested Structs, and a dispatch table for a clean, type-safe event processing pipeline. Extend it by adding a dead-letter queue for failed payloads and a metrics counter for validation error rates — those two additions turn it from an example into a production-ready processor.

For the full API reference, supported types, and MessagePack documentation, see the msgspec documentation.