
You ask an LLM to extract product information from a customer email, and it returns a beautifully formatted paragraph — but your code needed a JSON object with specific fields and specific types. So you write a prompt that says “respond in JSON format”, and sometimes it does, and sometimes it wraps the JSON in markdown code fences, and sometimes it adds an explanation paragraph after the closing brace. Parsing raw LLM output reliably is one of the most frustrating parts of building production AI applications. Pydantic AI solves this problem at the framework level by making structured, type-validated output a first-class feature of every LLM interaction.

Pydantic AI is a Python agent framework built by the Pydantic team. It uses your existing Pydantic models to define exactly what the LLM must return, handles the retry loop when the output does not validate, and gives you a fully typed result object that your IDE can autocomplete. It supports OpenAI, Anthropic, Google Gemini, Ollama, and other providers through a unified interface. If you already know Pydantic for data validation, Pydantic AI will feel immediately familiar — because it is literally the same BaseModel you already use.

In this article we will cover how Pydantic AI works and why it beats hand-crafted JSON prompts, how to define result types with Pydantic models, how to create agents with dependencies and system prompts, how to add tools to agents, how to handle validation retries, and how to build a real extraction pipeline. By the end you will be able to replace fragile text-parsing code with robust, type-safe agent responses.

Pydantic AI Quick Example

Here is the core pattern: define a Pydantic model, create an agent that returns that model, and call it with a user prompt. The agent validates the response against your schema and either returns a typed object or raises an error once its retries are exhausted:

# quick_pydantic_ai.py
from pydantic import BaseModel
from pydantic_ai import Agent

class MovieReview(BaseModel):
    title: str
    year: int
    rating: float  # 0.0 to 10.0
    genre: str
    one_line_summary: str

agent = Agent(
    "openai:gpt-4o-mini",
    result_type=MovieReview,
    system_prompt="You are a film critic. Extract structured data from movie descriptions.",
)

result = agent.run_sync(
    "The Matrix came out in 1999, it's a sci-fi action film about a simulated reality. Rating: 9.2/10."
)

movie = result.data
print(f"Title: {movie.title}")
print(f"Year: {movie.year}")
print(f"Rating: {movie.rating}")
print(f"Genre: {movie.genre}")
print(f"Summary: {movie.one_line_summary}")
print(f"Type: {type(movie)}")

Output:

Title: The Matrix
Year: 1999
Rating: 9.2
Genre: Sci-Fi Action
Summary: A mind-bending film about a simulated reality and humanity's fight for freedom.
Type: <class '__main__.MovieReview'>

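result.data is a fully validated MovieReview instance, not a string and not a dict. Your IDE knows its type, you get autocomplete on movie.rating, and if the LLM had returned a non-numeric rating, Pydantic AI would have sent the validation error back to the model and retried automatically. This is the fundamental shift from parsing text to receiving typed objects. One version note: the snippets in this article use the result_type parameter and the result.data attribute from early Pydantic AI releases. Newer releases rename these to output_type and result.output, so adjust the names to match your installed version.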


What Is Pydantic AI and Why Use It?

Pydantic AI is an agent framework that wraps LLM calls in a type-safe layer. When you set a result type, every successful run returns a validated instance of that type rather than raw text. If the LLM returns output that fails validation, the framework sends the validation error back to the LLM and asks it to fix the output, automatically, up to a configurable retry limit.
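
The validate-and-retry loop can be sketched without any LLM at all. What follows is a minimal illustration of the idea, not Pydantic AI's actual implementation: fake_llm is a hypothetical stand-in that answers badly on the first attempt and correctly once it has seen the error, and run_with_retries is a helper name invented for this example. It assumes Pydantic v2.

```python
from pydantic import BaseModel, ValidationError


class MovieReview(BaseModel):
    title: str
    year: int
    rating: float


def fake_llm(messages: list[str]) -> str:
    """Hypothetical stand-in for a model call (assumption, not a real API).

    Returns an invalid rating on the first attempt and a corrected answer
    once a validation error has been appended to the conversation.
    """
    if len(messages) == 1:
        return '{"title": "The Matrix", "year": 1999, "rating": "excellent"}'
    return '{"title": "The Matrix", "year": 1999, "rating": 9.2}'


def run_with_retries(prompt: str, retries: int = 1) -> MovieReview:
    """Validate the reply; on failure, feed the error back and try again."""
    messages = [prompt]
    for _attempt in range(retries + 1):
        reply = fake_llm(messages)
        try:
            return MovieReview.model_validate_json(reply)
        except ValidationError as exc:
            # The validation error becomes the next message in the conversation.
            messages.append(f"Validation failed, please fix: {exc}")
    raise RuntimeError("model did not produce valid output within the retry limit")


review = run_with_retries("Extract the review from: The Matrix (1999), 9.2/10")
print(review)
```

The point of the sketch is that the error text itself is the feedback: the model is not simply asked again, it is told which field failed and why.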

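| Approach                 | Reliability | Type safety        | Retry on failure                 |
|--------------------------|-------------|--------------------|----------------------------------|
| Raw prompt + str parsing | Low         | None               | Manual                           |
| JSON mode (OpenAI)       | Medium      | None (just a dict) | Manual                           |
| Instructor library       | High        | Yes (Pydantic)     | Automatic                        |
| Pydantic AI              | High        | Yes (Pydantic)     | Automatic + dependencies + tools |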

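Install Pydantic AI from PyPI. The full package bundles the supported providers; if you want a smaller install, the slim package takes one extra per provider (quote the argument so your shell does not interpret the brackets):

# terminal
pip install pydantic-ai                      # full install, all providers
pip install "pydantic-ai-slim[openai]"       # slim install, OpenAI only
pip install "pydantic-ai-slim[anthropic]"    # slim install, Anthropic Claude only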

Defining Result Types

The result type is a standard Pydantic BaseModel with field descriptions that help the LLM understand what to put in each field. Good field descriptions are the single biggest factor in extraction accuracy:

# result_types.py
from pydantic import BaseModel, Field
from typing import List, Optional
from pydantic_ai import Agent

class JobPosting(BaseModel):
    job_title: str = Field(description="The exact job title as listed")
    company: str = Field(description="Company name")
    location: str = Field(description="City, State or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum salary in USD per year, or None if not stated")
    salary_max: Optional[int] = Field(None, description="Maximum salary in USD per year, or None if not stated")
    required_skills: List[str] = Field(description="List of required technical skills")
    experience_years: int = Field(description="Minimum years of experience required (0 if not specified)")
    is_remote: bool = Field(description="True if remote work is allowed")

agent = Agent(
    "openai:gpt-4o-mini",
    result_type=JobPosting,
    system_prompt="Extract structured job posting data from the provided text.",
)

posting_text = """
Senior Python Developer -- Remote OK
TechCorp Inc. | San Francisco, CA (Remote)
Salary: $140,000 - $180,000/year

We need 5+ years Python experience. Skills: FastAPI, PostgreSQL, Docker, AWS.
Must be comfortable with distributed systems and async Python.
"""

result = agent.run_sync(posting_text)
job = result.data

print(f"Title: {job.job_title}")
print(f"Company: {job.company}")
print(f"Remote: {job.is_remote}")
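# salary fields are Optional, so guard against None before formatting
if job.salary_min is not None and job.salary_max is not None:
    print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
else:
    print("Salary: not stated")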
print(f"Skills: {', '.join(job.required_skills)}")
print(f"Experience: {job.experience_years}+ years")

Output:

Title: Senior Python Developer
Company: TechCorp Inc.
Remote: True
Salary: $140,000 - $180,000
Skills: FastAPI, PostgreSQL, Docker, AWS, distributed systems, async Python
Experience: 5+ years

The Field(description=...) text becomes part of the JSON schema that Pydantic AI sends to the model alongside your prompt, so you do not have to repeat it in the system prompt. When a field is Optional with a default of None, the schema marks it as not required, which lets the LLM leave it empty if the information is not in the text. That is the correct behavior for extraction tasks where some fields may not be present.
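
To see what the model is actually given, you can inspect the JSON schema Pydantic generates for a result type. The sketch below uses a trimmed two-field version of JobPosting and Pydantic v2's model_json_schema(); how a given provider receives the schema is up to the framework, so treat this only as a way to check your descriptions and required fields.

```python
from typing import Optional

from pydantic import BaseModel, Field


class JobPosting(BaseModel):
    # Trimmed version of the article's model, for illustration only.
    job_title: str = Field(description="The exact job title as listed")
    salary_min: Optional[int] = Field(
        None, description="Minimum salary in USD per year, or None if not stated"
    )


schema = JobPosting.model_json_schema()

# Field descriptions travel with the schema.
print(schema["properties"]["job_title"]["description"])
print(schema["properties"]["salary_min"]["description"])

# Only fields without a default are listed as required.
print(schema["required"])
```

Printing the schema while you iterate on descriptions is a cheap way to catch a field that is required when you meant it to be optional.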

Using Dependencies for Context

Pydantic AI has a dependency injection system that lets you pass runtime context (database connections, API clients, configuration) into your agent’s system prompt and tools without hardcoding it:

# dependencies.py
from dataclasses import dataclass
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext

@dataclass
class UserContext:
    user_name: str
    preferred_language: str
    timezone: str

class Greeting(BaseModel):
    message: str
    is_formal: bool
    language: str

agent = Agent(
    "openai:gpt-4o-mini",
    result_type=Greeting,
    deps_type=UserContext,
    system_prompt="Generate an appropriate greeting based on the user context provided.",
)

@agent.system_prompt
def build_system_prompt(ctx: RunContext[UserContext]) -> str:
    return (
        f"The user's name is {ctx.deps.user_name}. "
        f"They prefer {ctx.deps.preferred_language} and are in timezone {ctx.deps.timezone}. "
        f"Generate a greeting appropriate for their context."
    )

# Pass different contexts for different users
ctx_alice = UserContext("Alice", "English", "UTC+10")
ctx_marco = UserContext("Marco", "Italian", "UTC+1")

result1 = agent.run_sync("Morning greeting please", deps=ctx_alice)
result2 = agent.run_sync("Morning greeting please", deps=ctx_marco)

print(result1.data.message)
print(result2.data.language)

Output:

Good morning, Alice! Hope you're starting your day well in Australia.
Italian

The @agent.system_prompt decorator registers a function that builds the system prompt dynamically from the dependencies at runtime. This pattern is much cleaner than string-formatting dependencies into a hardcoded system prompt — you get typed access to context via ctx.deps, and the system prompt updates automatically when you pass different dependencies.
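
One practical benefit of this design is that the prompt builder is an ordinary function of its context, so it can be checked without calling a model. The sketch below does not import pydantic_ai: FakeRunContext is a hypothetical stand-in that models only the .deps attribute, and the function body mirrors the one above.

```python
from dataclasses import dataclass


@dataclass
class UserContext:
    user_name: str
    preferred_language: str
    timezone: str


@dataclass
class FakeRunContext:
    """Hypothetical stand-in for RunContext: only .deps is modelled."""

    deps: UserContext


def build_system_prompt(ctx: FakeRunContext) -> str:
    # Same logic as the decorated function in the article, minus the decorator.
    return (
        f"The user's name is {ctx.deps.user_name}. "
        f"They prefer {ctx.deps.preferred_language} and are in timezone {ctx.deps.timezone}."
    )


prompt_alice = build_system_prompt(FakeRunContext(UserContext("Alice", "English", "UTC+10")))
prompt_marco = build_system_prompt(FakeRunContext(UserContext("Marco", "Italian", "UTC+1")))
print(prompt_alice)
print(prompt_marco)
```

Swapping the dependency object is all it takes to produce a different prompt, which is the same mechanism the agent relies on at run time.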


Adding Tools to Pydantic AI Agents

Tools extend what your agent can do beyond just LLM calls — the agent can look up data, call APIs, or compute values during its reasoning process:

# tools_agent.py
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext
import httpx

class WeatherSummary(BaseModel):
    city: str
    temperature_celsius: float
    condition: str
    recommendation: str

agent = Agent(
    "openai:gpt-4o-mini",
    result_type=WeatherSummary,
    system_prompt="You are a weather assistant. Use the get_weather tool to fetch real data, then provide a structured summary.",
)

@agent.tool_plain
def get_weather(city: str) -> dict:
    """
    Fetch current weather for a city from wttr.in (free, no API key required).
    
    Args:
        city: City name to get weather for
    """
    try:
        r = httpx.get(f"https://wttr.in/{city}?format=j1", timeout=8)
        r.raise_for_status()
        data = r.json()
        current = data["current_condition"][0]
        return {
            "temp_c": int(current["temp_C"]),
            "feels_like_c": int(current["FeelsLikeC"]),
            "description": current["weatherDesc"][0]["value"],
            "humidity": current["humidity"],
        }
    except Exception as e:
        return {"error": str(e)}

result = agent.run_sync("What's the weather like in Sydney right now?")
summary = result.data
print(f"City: {summary.city}")
print(f"Temp: {summary.temperature_celsius}C")
print(f"Condition: {summary.condition}")
print(f"Recommendation: {summary.recommendation}")

Output:

City: Sydney
Temp: 18.0C
Condition: Partly Cloudy
Recommendation: Light jacket recommended. Good conditions for outdoor activities.

The @agent.tool_plain decorator registers a function as a tool without dependency injection. Use @agent.tool (without _plain) if your tool needs access to the run context and dependencies; in that case the function takes a RunContext as its first parameter. The LLM decides when to call the tool based on its name, its parameters, and the function’s docstring, so write your docstrings carefully, describing exactly when the tool should be used.
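
To make the role of the signature and docstring concrete, here is a simplified illustration of turning a function into a tool description using only the standard library. describe_tool is a hypothetical helper written for this example; the framework's real schema generation is more thorough, so read this as a sketch of the idea rather than its implementation.

```python
import inspect


def get_weather(city: str) -> dict:
    """Fetch current weather for a city. Use when the user asks about weather."""
    return {"city": city}


def describe_tool(func) -> dict:
    """Build a simplified tool description from a function (illustrative only)."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {
            # Use the annotation's class name when available, else its repr.
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


tool_spec = describe_tool(get_weather)
print(tool_spec)
```

Everything the model knows about the tool comes from this kind of description, which is why a vague docstring tends to produce missed or unnecessary tool calls.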

Real-Life Example: Email Triage Pipeline

This complete pipeline extracts structured data from customer emails and classifies them for routing to the right team — the kind of task that previously required fragile regex and string parsing:

# email_triage.py
from pydantic import BaseModel, Field
from pydantic_ai import Agent
from typing import List, Literal

class EmailTriage(BaseModel):
    subject_clean: str = Field(description="Clean, concise subject line (max 60 chars)")
    category: Literal["billing", "technical", "sales", "complaint", "general"] = Field(
        description="Best category for routing this email"
    )
    priority: Literal["urgent", "high", "normal", "low"] = Field(
        description="Priority based on urgency indicators in the email"
    )
    sentiment: Literal["positive", "neutral", "negative", "very_negative"] = Field(
        description="Overall customer sentiment"
    )
    key_issues: List[str] = Field(
        description="List of specific issues or questions the customer raised (max 3 items)"
    )
    suggested_response_tone: str = Field(
        description="One sentence describing the appropriate tone for the response"
    )
    requires_escalation: bool = Field(
        description="True if this should go to a senior team member"
    )

triage_agent = Agent(
    "openai:gpt-4o-mini",
    result_type=EmailTriage,
    system_prompt=(
        "You are a customer service triage specialist. Analyze incoming customer emails "
        "and extract structured classification data to route them to the right team. "
        "Be conservative with 'urgent' and 'requires_escalation' -- only use them when clearly warranted."
    ),
)

sample_emails = [
    {
        "from": "angry.customer@example.com",
        "body": "I have been charged TWICE for my subscription this month!! "
                "I need this fixed immediately or I am cancelling. "
                "This is absolutely unacceptable. My account is #12345."
    },
    {
        "from": "new.lead@bigcorp.com",
        "body": "Hi, we're a 500-person company looking for enterprise pricing. "
                "Could someone from your sales team reach out to discuss volume discounts? "
                "No rush, just exploring options."
    },
]

for email in sample_emails:
    result = triage_agent.run_sync(
        f"From: {email['from']}\n\nEmail body:\n{email['body']}"
    )
    t = result.data
    print(f"\n--- {email['from']} ---")
    print(f"Category: {t.category.upper()} | Priority: {t.priority} | Sentiment: {t.sentiment}")
    print(f"Subject: {t.subject_clean}")
    print(f"Issues: {t.key_issues}")
    print(f"Escalate: {t.requires_escalation}")
    print(f"Tone: {t.suggested_response_tone}")

Output:

--- angry.customer@example.com ---
Category: BILLING | Priority: urgent | Sentiment: very_negative
Subject: Double Charge on Subscription - Immediate Resolution Needed
Issues: ['Double charge on subscription', 'Threat to cancel account', 'Account #12345 affected']
Escalate: True
Tone: Apologetic, empathetic, and action-oriented -- acknowledge the error immediately.

--- new.lead@bigcorp.com ---
Category: SALES | Priority: normal | Sentiment: positive
Subject: Enterprise Pricing Inquiry - 500-Person Company
Issues: ['Interested in enterprise pricing', 'Requesting volume discount discussion']
Escalate: False
Tone: Enthusiastic and professional -- welcome the opportunity and offer a prompt follow-up.

Every field is typed. You can store these EmailTriage objects directly in a database, pass them to routing logic without parsing, and iterate over t.key_issues as a proper Python list. Replace the sample emails with real ones from an IMAP connection or an email webhook, add error handling and logging, and you have the core of a triage pipeline in roughly 60 lines.
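
As a sketch of the routing step, the example below maps a typed triage result to a queue and shows that values outside the Literal set are rejected at validation time. TriageResult is a trimmed, hypothetical version of EmailTriage, and the queue names and the route function are invented for illustration. It assumes Pydantic v2.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class TriageResult(BaseModel):
    # Trimmed, hypothetical version of EmailTriage for illustration.
    category: Literal["billing", "technical", "sales", "complaint", "general"]
    priority: Literal["urgent", "high", "normal", "low"]
    requires_escalation: bool


# Hypothetical queue names; substitute your own routing targets.
QUEUES = {
    "billing": "finance-support",
    "technical": "engineering-support",
    "sales": "sales-team",
    "complaint": "customer-care",
    "general": "front-desk",
}


def route(t: TriageResult) -> str:
    """Pick a queue from the typed triage result."""
    if t.requires_escalation or t.priority == "urgent":
        return "senior-" + QUEUES[t.category]
    return QUEUES[t.category]


urgent_queue = route(TriageResult(category="billing", priority="urgent", requires_escalation=True))
normal_queue = route(TriageResult(category="sales", priority="normal", requires_escalation=False))
print(urgent_queue)
print(normal_queue)

# Values outside the Literal set are rejected at validation time.
try:
    TriageResult(category="billing", priority="URGENT", requires_escalation=False)
except ValidationError as exc:
    print(f"rejected: {exc.error_count()} error(s)")
```

Because the category can only be one of five known strings, the dictionary lookup in route cannot fail on an unexpected value.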


Frequently Asked Questions

How does Pydantic AI compare to the Instructor library?

Instructor is a lightweight library focused specifically on getting structured output from LLMs using Pydantic models. Pydantic AI is a fuller agent framework — it adds dependency injection, tools, multi-turn conversation support, and streaming. For simple extraction tasks, either works well. For building agents that need to call tools and maintain state across multiple LLM turns, Pydantic AI is the better choice. Both use the same Pydantic model definition syntax, so switching between them is straightforward.

What happens when the LLM returns invalid JSON or wrong types?

Pydantic AI catches the validation error, formats it into a message describing which field failed and why (for example, that the rating field received "excellent" where a number was expected), and sends it back to the LLM as a new message in the same conversation. The LLM tries again. By default the agent allows 1 retry; when the limit is exhausted, the run raises an UnexpectedModelBehavior exception. You can raise the limit with Agent(retries=5). ModelRetry is a different exception: you raise it yourself inside a tool or validator to ask the model to try again. In practice, well-defined Pydantic models with good Field(description=...) text rarely need more than one retry.
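
Validation only catches what the model declares. The MovieReview model earlier in the article documents the 0 to 10 range in a comment but does not enforce it, so an out-of-range rating would pass. A minimal sketch of enforcing the range with Pydantic v2 field constraints, and of the error detail that validation produces, looks like this:

```python
from pydantic import BaseModel, Field, ValidationError


class MovieReview(BaseModel):
    title: str
    # ge/le make the documented range an enforced constraint.
    rating: float = Field(ge=0, le=10, description="Rating from 0.0 to 10.0")


try:
    MovieReview(title="The Matrix", rating=92)
except ValidationError as exc:
    errors = exc.errors()
    # Each entry names the failing field and the kind of failure.
    print(errors[0]["loc"], errors[0]["type"])
    print(errors[0]["msg"])
```

The tighter the constraints on the model, the more mistakes are caught and reported as validation errors instead of reaching your application.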

Does Pydantic AI support async?

Yes. Use await agent.run() instead of agent.run_sync(). All tools defined with async def are awaited automatically. The async API is recommended for production applications where you are running multiple agent calls concurrently, since it integrates cleanly with FastAPI, asyncio event loops, and other async frameworks. For standalone scripts, run_sync() is more convenient. In Jupyter notebooks an event loop is already running, so calling await agent.run() directly in a cell is usually the safer choice.
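
The concurrency pattern itself is plain asyncio. In the sketch below, fake_agent_run is a hypothetical stand-in for await agent.run(prompt) so the example runs without a network call, and triage_all and max_concurrency are names invented here. A semaphore caps how many calls are in flight, which helps with provider rate limits.

```python
import asyncio


async def fake_agent_run(prompt: str) -> str:
    """Hypothetical stand-in for `await agent.run(prompt)` (no network call)."""
    await asyncio.sleep(0.01)  # simulate waiting on the model
    return f"triaged: {prompt}"


async def triage_all(prompts: list[str], max_concurrency: int = 5) -> list[str]:
    """Run many calls concurrently while capping how many are in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> str:
        async with semaphore:
            return await fake_agent_run(prompt)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(p) for p in prompts))


results = asyncio.run(triage_all(["email 1", "email 2", "email 3"]))
print(results)
```

With a real agent, the only change would be replacing fake_agent_run with the agent call and reading the typed result from what it returns.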

Can I stream the output as it’s generated?

Yes, with some limitations. agent.run_stream() is used as an async context manager (async with agent.run_stream(...) as result), and the result object exposes the stream, for example result.stream_text() for plain-text output. For structured output, the framework can yield partial objects as the JSON arrives, but partial validation does not work equally well for every type, so treat intermediate values as provisional until the stream finishes. You can check result.is_complete to see whether the full response has been received. Streaming is most useful for long responses where you want to show progress to the user.

Does Pydantic AI work with local models (Ollama)?

Yes. Use Agent("ollama:llama3.2") or any model tag from your Ollama installation. Local models are less reliable at structured output than GPT-4o or Claude Sonnet, especially for complex schemas. Start with simpler schemas (3-5 fields, basic types) and increase complexity as you gain confidence in your model’s capabilities. The retry mechanism compensates for occasional failures, but if retries are frequent, consider adding more detail to your Field(description=...) text or simplifying the schema.

Conclusion

Pydantic AI eliminates the most error-prone part of LLM integration — parsing unstructured text output — by making validated, typed responses the default. We covered defining result schemas with Pydantic models and Field descriptions, using dependency injection for runtime context, registering tools the agent can call, handling retries automatically, and building a complete email triage pipeline. Every result your agent returns is a proper Python object with type hints your IDE understands.

The next step is to take one piece of existing code where you are parsing LLM output with string operations or json.loads() and rewrite it using a Pydantic AI agent. The reliability improvement is immediate. Explore the official Pydantic AI documentation for advanced topics like multi-agent orchestration, streaming validation, and custom validators on result models.