
You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and clients for OpenAI-compatible APIs) so that a completion call returns a validated Pydantic model instead of raw text. When validation fails, it retries automatically, up to a limit you control. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema or the retries run out. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, adding custom validation rules, configuring retries and modes, working with non-OpenAI models such as Anthropic's Claude and local models served by Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.


What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.
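To make the mechanism concrete, here is a minimal sketch that uses only Pydantic and makes no API call. It builds a tool definition in the shape OpenAI's chat API accepts and validates two hand-written argument strings. The description text and both argument strings are invented for illustration, and the payload instructor actually sends may differ in detail.

```python
import json
from pydantic import BaseModel, ValidationError

class Person(BaseModel):
    name: str
    age: int
    city: str

# JSON Schema generated from the Pydantic model
schema = Person.model_json_schema()

# Tool definition in the OpenAI "tools" format (illustrative, not instructor's exact payload)
tool = {
    "type": "function",
    "function": {
        "name": Person.__name__,
        "description": "Extract a Person from the text",
        "parameters": schema,
    },
}
print(json.dumps(tool, indent=2))

# A tool call carries its arguments as a JSON string; validation turns it into a typed object
good_arguments = '{"name": "Alice", "age": 32, "city": "Melbourne"}'
person = Person.model_validate_json(good_arguments)
print(person)

# A wrong type fails validation; an error like this is what a retry would be based on
bad_arguments = '{"name": "Alice", "age": "thirty-two", "city": "Melbourne"}'
error_text = ""
try:
    Person.model_validate_json(bad_arguments)
except ValidationError as exc:
    error_text = str(exc)
    print(error_text)
```

The point of the sketch is that the schema, the parsing, and the error message all come from the same Pydantic class, so there is no separate parsing code to keep in sync.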

| Approach | Reliability | Type Safety | Auto-Retry |
| --- | --- | --- | --- |
| Parse raw LLM text | Fragile | None | Manual |
| Parse JSON from prompt | Moderate | Manual | Manual |
| OpenAI function calling | Good | Partial | None |
| instructor + Pydantic | High | Full | Built-in |

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. The interface stays the same regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
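# Salary fields are Optional, so guard against None before formatting
if job.salary_min is not None and job.salary_max is not None:
    print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")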
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields should come back as None instead of hallucinated values. Always use Optional with a None default for fields that may not appear in the input. Without it, the field is required, and the model is pushed to invent a plausible-sounding value rather than leave it empty.
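The difference between optional and required fields can be checked without calling a model. The sketch below uses two cut-down, hypothetical models (Salary and StrictSalary are not part of the example above) and a small format_salary helper that handles the None case when printing.

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

class Salary(BaseModel):
    # Optional with a None default: missing data is allowed
    salary_min: Optional[int] = None
    salary_max: Optional[int] = None

class StrictSalary(BaseModel):
    # Required fields: missing data is a validation error
    salary_min: int
    salary_max: int

def format_salary(s: Salary) -> str:
    """Format a salary range, tolerating missing values."""
    if s.salary_min is None or s.salary_max is None:
        return "Salary: not listed"
    return f"Salary: ${s.salary_min:,} - ${s.salary_max:,}"

listed = Salary(salary_min=140000, salary_max=175000)
unlisted = Salary.model_validate({})  # no salary in the source text
print(format_salary(listed))
print(format_salary(unlisted))

# The same empty payload is rejected when the fields are required
strict_failed = False
try:
    StrictSalary.model_validate({})
except ValidationError:
    strict_failed = True
print(strict_failed)
```

Any code that formats or does arithmetic on an Optional field needs a None check like the one in format_salary, otherwise it fails on the first posting that omits the value.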

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $177.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $177.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.
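Recursive validation is easy to observe with Pydantic alone. In this sketch the payload is hand-written, with one deliberately bad line item, and stands in for what a model might return. The error location shows which nested field failed, which is the information a retry would be based on.

```python
from typing import List
from pydantic import BaseModel, ValidationError

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

# Hand-written payload; the second line item has a non-numeric quantity
payload = {
    "vendor": "CloudHost Solutions",
    "invoice_number": "INV-2024-0891",
    "items": [
        {"description": "Server instances", "quantity": 3, "unit_price": 45.0},
        {"description": "SSL Certificate", "quantity": "one", "unit_price": 12.0},
        {"description": "Domain registrations", "quantity": 2, "unit_price": 15.0},
    ],
    "total": 177.0,
}

error_locations = []
try:
    Invoice.model_validate(payload)
except ValidationError as exc:
    # Each error records the path from the top-level model to the offending field
    error_locations = [err["loc"] for err in exc.errors()]
print(error_locations)
```

The path names the list field, the index of the item, and the field inside that item, so a single bad value deep in the structure is reported precisely.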


Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message that includes the validation error (here, “Rating must be between 1 and 5, got 6”) and asks for a corrected answer. The exact wording of that follow-up depends on the library version and mode. The model then has a chance to self-correct. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
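The self-correction loop can be sketched in a few lines. This is not instructor's implementation: extract_with_retries and fake_model are hypothetical names, the model is replaced by a stub that returns canned answers, and the feedback message wording is made up. It only illustrates the idea of appending the failed answer and the validation error to the conversation before asking again.

```python
from typing import Callable, List
from pydantic import BaseModel, ValidationError, field_validator

class Rating(BaseModel):
    rating: int

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

def extract_with_retries(ask_model: Callable[[List[dict]], str],
                         messages: List[dict], max_attempts: int = 3) -> Rating:
    """Hypothetical helper: re-ask the model until its JSON validates."""
    history = list(messages)
    last_error = None
    for _ in range(max_attempts):
        raw = ask_model(history)
        try:
            return Rating.model_validate_json(raw)
        except ValidationError as exc:
            last_error = exc
            # Feed the failed answer and the error back so the model can correct itself
            history.append({"role": "assistant", "content": raw})
            history.append({"role": "user",
                            "content": f"Validation error:\n{exc}\nPlease correct the output."})
    raise RuntimeError(f"No valid output after {max_attempts} attempts") from last_error

# Stub model: the first answer is out of range, the second is valid
canned = iter(['{"rating": 6}', '{"rating": 4}'])
history_lengths = []

def fake_model(history: List[dict]) -> str:
    history_lengths.append(len(history))
    return next(canned)

result = extract_with_retries(fake_model, [{"role": "user", "content": "Rate this review"}])
print(result.rating, history_lengths)
```

The second call sees three messages instead of one, which is why retries cost more tokens than the first attempt.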

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.
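To see why a separate mode for fenced JSON is useful, consider what has to happen to a reply that wraps its JSON in markdown fences. The sketch below is a hand-rolled illustration using only the standard library. parse_markdown_json is a hypothetical helper, not the parser instructor uses, and the reply text is invented.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the listing readable
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)

def parse_markdown_json(text: str) -> dict:
    """Hypothetical helper: parse the first fenced block in a reply, or the whole reply."""
    match = FENCE_RE.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)

# A reply with a preamble and a fenced JSON body
fenced_reply = (
    "Sure! Here is the extracted data:\n"
    + FENCE + "json\n"
    + '{"name": "Tokyo", "country": "Japan"}\n'
    + FENCE
)
fenced_result = parse_markdown_json(fenced_reply)
print(fenced_result)

# A reply that is already bare JSON passes through unchanged
bare_result = parse_markdown_json('{"name": "Tokyo", "country": "Japan"}')
print(bare_result)
```

Choosing the mode that matches how your model tends to answer means this kind of cleanup happens inside the library rather than in your own code.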


Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
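            writer.writerow([
                j.title, j.company, j.location,
                # Compare against None so a legitimate 0 is not written as blank
                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,
                ", ".join(j.technologies),
                j.is_remote
            ])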

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
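One way to make the generator idea concrete is a small wrapper that consumes any iterable of texts and yields results lazily. This is a sketch under assumptions: extract_stream, read_queue, and fake_extract are hypothetical names, and the extractor is a stub so the example runs without an API key. In a real pipeline the extractor would wrap the client.chat.completions.create call shown above.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")

def extract_stream(texts: Iterable[str], extract_one: Callable[[str], T],
                   failures: List[Tuple[str, Exception]]) -> Iterator[T]:
    """Yield one result per text; record failures instead of stopping the run."""
    for text in texts:
        try:
            yield extract_one(text)
        except Exception as exc:
            failures.append((text, exc))

def read_queue() -> Iterator[str]:
    """Stand-in for a queue or database cursor."""
    yield "Senior Python Engineer at Nexaflow"
    yield ""  # a bad record
    yield "Staff ML Engineer at Quantra"

def fake_extract(text: str) -> dict:
    """Stub extractor; a real one would call the LLM with a response_model."""
    if not text:
        raise ValueError("empty posting")
    title, company = text.split(" at ")
    return {"title": title, "company": company}

failures: List[Tuple[str, Exception]] = []
jobs = list(extract_stream(read_queue(), fake_extract, failures))
print(jobs)
print(len(failures))
```

Collecting failures separately keeps one unparseable posting from aborting a long batch, and the failed inputs can be retried or inspected afterwards.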

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, well-designed schemas and clear field descriptions keep validation failures uncommon for most extraction tasks, though the rate depends on the model and the input, so measure it on your own data. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.
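A rough way to reason about the overhead is to estimate the expected number of calls per item. The sketch below assumes each attempt fails independently with the same probability, which is a simplification, and the 5% figure is a hypothetical input rather than a measured value. expected_calls is an illustrative helper, not part of instructor.

```python
def expected_calls(failure_rate: float, max_attempts: int) -> float:
    """Expected API calls per item if each attempt fails independently.

    Attempt k+1 only happens when the first k attempts all failed,
    which has probability failure_rate ** k.
    """
    return sum(failure_rate ** k for k in range(max_attempts))

calls = expected_calls(0.05, 3)
print(round(calls, 4))  # 1 + 0.05 + 0.0025
```

Under these assumptions the overhead in call count is small, but retry calls also carry the earlier answer and the error message, so their token cost is higher than that of a first attempt.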

Does instructor support streaming responses?

Yes. Pass stream=True together with response_model=Iterable[YourModel] to stream a list of objects one at a time, or response_model=Partial[YourModel] (Partial is imported from instructor) to stream progressively filled versions of a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
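Chunking can be as simple as splitting on words with a small overlap, so that an item cut at a boundary still appears whole in one of the chunks. The sketch below is one possible approach, not a recommendation from the instructor project. chunk_words and its default sizes are assumptions, and word counts are only a rough proxy for tokens.

```python
from typing import List

def chunk_words(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most chunk_size words, sharing overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the document
        if start + chunk_size >= len(words):
            break
    return chunks

document = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(document, chunk_size=5, overlap=1)
print(chunks)
```

Because of the overlap, the same item can be extracted from two neighbouring chunks, so results usually need de-duplicating after the per-chunk extraction.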

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not: the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling rather than prompt wording to request the schema, which makes stray prose and fences far less likely. It also runs Pydantic validation on the result and retries if types or constraints are violated, so output that does not fit the schema raises an error instead of reaching your code. The difference in reliability for production use is significant: JSON prompting is fine for experiments, but instructor is the better fit for pipelines where data quality matters.

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library addresses one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a reusable pattern for production pipelines.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.