Summary

Beginner

Python Environment Variables: Quick Example

Environment variables keep sensitive data like API keys and database passwords out of your code. Python’s os.environ reads them, and python-dotenv loads them from a .env file.

#quick_example.py
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env file into environment variables

api_key = os.environ.get('API_KEY', 'not-set')  # get with a fallback
db_host = os.environ.get('DB_HOST', 'localhost')
print(f"API Key: {api_key[:8]}...")  # only show first 8 chars
print(f"DB Host: {db_host}")

Output:

API Key: sk-abc12...
DB Host: db.example.com

The load_dotenv() function reads key-value pairs from a .env file and makes them available through os.environ. Your secrets stay out of your codebase.

Want more? Below we cover creating .env files, keeping secrets out of Git, and a real-life database connection manager.

Why Environment Variables Matter for Python Developers

Hardcoding secrets into your source code is one of the most common security mistakes developers make. Push your code to GitHub with an API key embedded and bots will find it within minutes — that’s not an exaggeration. Environment variables solve this by keeping configuration separate from code. Different environments (dev, staging, production) can use different values without changing a single line of Python.

Using os.environ to Read Environment Variables in Python

Python’s built-in os module gives you direct access to environment variables through os.environ, which behaves like a dictionary.

#os_environ.py
import os

# Read an environment variable (raises KeyError if missing)
# home = os.environ['HOME']

# Safer: use .get() with a default value
home = os.environ.get('HOME', '/tmp')
user = os.environ.get('USER', 'unknown')
path = os.environ.get('PATH', '')

print(f"Home: {home}")
print(f"User: {user}")
print(f"PATH entries: {len(path.split(os.pathsep))}")  # os.pathsep is ':' on Linux/macOS, ';' on Windows

# Check if a variable exists
if 'API_KEY' in os.environ:
    print("API_KEY is set")
else:
    print("API_KEY is NOT set — using defaults")

Output:

Home: /home/user
User: user
PATH entries: 8
API_KEY is NOT set — using defaults

Prefer .get() with a default value for optional settings. Direct dictionary access with os.environ['KEY'] raises a KeyError when the variable doesn’t exist, which crashes your script unless you catch it, so reserve it for values the program cannot run without.

Creating a .env File for Your Python Project

A .env file is a simple text file with key-value pairs. Create it in your project root:

# .env
# Database settings
DB_HOST=db.example.com
DB_PORT=5432
DB_NAME=myapp
DB_USER=admin
DB_PASSWORD=supersecretpassword123

# API keys
API_KEY=sk-abc123def456ghi789
STRIPE_SECRET=sk_test_abcdefgh

# App settings
DEBUG=True
LOG_LEVEL=INFO

Note: Lines starting with # are comments. No quotes needed around values unless they contain spaces. No spaces around the = sign.
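
To make these rules concrete, here is a minimal sketch of a parser for the format described above. It is an illustration only, not python-dotenv’s actual implementation (which also handles cases such as variable expansion); the function name parse_env_text and the APP_NAME entry are made up for this example.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith('#'):
            continue  # blank line or comment
        key, sep, value = line.partition('=')
        if not sep:
            continue  # no '=' on this line, so it is not a key-value pair
        value = value.strip()
        # Strip one pair of matching quotes, used when a value contains spaces
        if len(value) >= 2 and value[0] == value[-1] and value[0] in ('"', "'"):
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = '''
# Database settings
DB_HOST=db.example.com
DB_PORT=5432
APP_NAME="My App"
'''
config = parse_env_text(sample)
print(config)
```

Every value comes back as a string, including DB_PORT, which is why the later examples convert it with int().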

Installing and Using python-dotenv

pip install python-dotenv

Once installed, load_dotenv() reads your .env file and loads each variable into os.environ:

#using_dotenv.py
import os
from dotenv import load_dotenv

# Load .env (or pass an explicit path: load_dotenv('/path/to/.env'))
load_dotenv()  # searches for .env starting next to this script, then in parent dirs

# Now all .env variables are available via os.environ
db_config = {
    'host': os.environ.get('DB_HOST'),
    'port': int(os.environ.get('DB_PORT', 5432)),
    'name': os.environ.get('DB_NAME'),
    'user': os.environ.get('DB_USER'),
    'password': os.environ.get('DB_PASSWORD'),
}

print(f"Connecting to {db_config['name']}@{db_config['host']}:{db_config['port']}")
print(f"Debug mode: {os.environ.get('DEBUG')}")

Output:

Connecting to myapp@db.example.com:5432
Debug mode: True

By default, load_dotenv() won’t overwrite existing environment variables. If you need to override them (for testing), pass override=True.
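
The two modes are easy to picture with the standard library alone. The sketch below mimics them with os.environ.setdefault() (keep an existing value) versus plain assignment (replace it). It is an analogy for the behaviour, not how python-dotenv is implemented, and apply_values is a hypothetical helper.

```python
import os

def apply_values(values, override=False):
    """Copy values into os.environ, optionally replacing existing entries."""
    for key, value in values.items():
        if override:
            os.environ[key] = value            # replace whatever is there
        else:
            os.environ.setdefault(key, value)  # keep an existing value

os.environ['DB_HOST'] = 'prod-db.internal'     # pretend the platform set this
os.environ.pop('DB_PORT', None)                # and that DB_PORT is unset

file_values = {'DB_HOST': 'localhost', 'DB_PORT': '5432'}

apply_values(file_values)                      # default: existing values win
print(os.environ['DB_HOST'], os.environ['DB_PORT'])  # prod-db.internal 5432

apply_values(file_values, override=True)       # override: file values win
print(os.environ['DB_HOST'])                   # localhost
```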

Keeping Secrets Out of Git With .gitignore

The whole point of using .env files is to keep secrets out of version control. Add .env to your .gitignore immediately:

# .gitignore
.env
.env.local
.env.production
*.env

Create a .env.example file that shows the required variables without actual values. Commit this to Git so other developers know what to set up:

# .env.example — copy to .env and fill in your values
DB_HOST=
DB_PORT=5432
DB_NAME=
DB_USER=
DB_PASSWORD=
API_KEY=
DEBUG=False

Validating Environment Variables at Startup

Don’t wait until your app crashes halfway through to discover a missing variable. Validate everything at startup.

#validate_env.py
import os
import sys
from dotenv import load_dotenv

load_dotenv()

REQUIRED_VARS = ['DB_HOST', 'DB_NAME', 'DB_USER', 'DB_PASSWORD', 'API_KEY']

missing = [var for var in REQUIRED_VARS if not os.environ.get(var)]

if missing:
    print(f"ERROR: Missing required environment variables: {', '.join(missing)}")
    print("Copy .env.example to .env and fill in the values")
    sys.exit(1)

print("All required environment variables are set")

Output (when variables are missing):

ERROR: Missing required environment variables: API_KEY
Copy .env.example to .env and fill in the values

Real-Life Example: A Database Connection Manager

Here’s a practical example that combines everything — loading config from .env, validating required variables, and creating a reusable database configuration class.

#db_manager.py
import os
import sys
from dotenv import load_dotenv
from dataclasses import dataclass

load_dotenv()

@dataclass
class DatabaseConfig:
    host: str
    port: int
    name: str
    user: str
    password: str
    ssl: bool = True

    @classmethod
    def from_env(cls):
        """Create config from environment variables"""
        required = ['DB_HOST', 'DB_NAME', 'DB_USER', 'DB_PASSWORD']
        missing = [v for v in required if not os.environ.get(v)]
        if missing:
            print(f"Missing DB config: {', '.join(missing)}")
            sys.exit(1)

        return cls(
            host=os.environ['DB_HOST'],
            port=int(os.environ.get('DB_PORT', 5432)),
            name=os.environ['DB_NAME'],
            user=os.environ['DB_USER'],
            password=os.environ['DB_PASSWORD'],
            ssl=os.environ.get('DB_SSL', 'true').lower() == 'true'
        )

    @property
    def connection_string(self):
        ssl_param = '?sslmode=require' if self.ssl else ''
        return f"postgresql://{self.user}:{self.password}@{self.host}:{self.port}/{self.name}{ssl_param}"

# Usage
config = DatabaseConfig.from_env()
print(f"Database: {config.name}")
print(f"Host: {config.host}:{config.port}")
print(f"SSL: {config.ssl}")
# In production, you'd pass config.connection_string to your ORM
print(f"Connection string ready (password hidden)")

Output:

Database: myapp
Host: db.example.com:5432
SSL: True
Connection string ready (password hidden)

This pattern gives you type-safe configuration, validation at startup, sensible defaults, and a clean connection string builder — all powered by a simple .env file.
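
One detail worth spelling out: environment values always arrive as strings, so DEBUG=False is the non-empty, truthy string 'False'. The DB_SSL line above handles this with a string comparison; a small helper can make the same idea reusable. This is a sketch, and env_bool along with its accepted spellings are assumptions for illustration.

```python
import os

TRUE_VALUES = {'1', 'true', 'yes', 'on'}
FALSE_VALUES = {'0', 'false', 'no', 'off'}

def env_bool(name, default=False):
    """Read a boolean flag from the environment, rejecting unknown spellings."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"{name} must be a boolean flag, got {raw!r}")

os.environ['DEBUG'] = 'False'
print(bool(os.environ['DEBUG']))  # True: any non-empty string is truthy
print(env_bool('DEBUG'))          # False: parsed by its spelling
```

Raising on an unrecognised spelling, rather than silently treating it as False, surfaces typos such as DEBUG=ture at startup.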

Frequently Asked Questions

What is the difference between os.environ and os.getenv() in Python?

os.environ.get('KEY') and os.getenv('KEY') behave the same way: both return None (or the default you pass as a second argument) when the variable is missing. The difference that matters is direct access: os.environ['KEY'] raises a KeyError for a missing variable, while os.getenv() never raises.
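
A quick way to see the behaviours side by side (DEMO_MISSING_VAR is a made-up name, chosen so that it will not exist):

```python
import os

os.environ.pop('DEMO_MISSING_VAR', None)          # make sure it is unset

print(os.getenv('DEMO_MISSING_VAR'))              # None
print(os.environ.get('DEMO_MISSING_VAR'))         # None
print(os.getenv('DEMO_MISSING_VAR', 'fallback'))  # fallback

try:
    os.environ['DEMO_MISSING_VAR']
except KeyError as exc:
    print(f"KeyError: {exc}")                     # KeyError: 'DEMO_MISSING_VAR'
```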

Can I use .env files in production?

You can, but most production deployments set environment variables directly through the hosting platform (Heroku config vars, AWS Parameter Store, Docker environment). The .env file is primarily a development convenience.

Does python-dotenv work with Django and Flask?

Yes. When python-dotenv is installed, the flask command automatically loads variables from .env and .flaskenv files. For Django, call load_dotenv() at the top of your settings.py, before any code that reads os.environ.

Conclusion

Environment variables are the right way to manage configuration and secrets in Python. Use python-dotenv for local development, validate required variables at startup, never commit .env to Git, and provide a .env.example for your team. It takes five minutes to set up and saves you from a world of security headaches.

Reference

python-dotenv documentation: https://pypi.org/project/python-dotenv/

12-Factor App config: https://12factor.net/config

Further Reading: Python os.environ documentation: https://docs.python.org/3/library/os.html#os.environ

How To Use Python Requests To Call REST APIs With Authentication

by Pubs | APIs, Beginner

Calling REST APIs in Python: Quick Example

Python’s requests library makes calling REST APIs dead simple. Install it with pip install requests and you can make HTTP calls in one line.

#quick_example.py
import requests  # pip install requests

# Make a GET request to a public API
response = requests.get('https://jsonplaceholder.typicode.com/posts/1')

data = response.json()  # parse the JSON response into a dict
print(data['title'])     # access specific fields
print(response.status_code)  # check the HTTP status code

Output:

sunt aut facere repellat provident occaecati excepturi optio reprehenderit
200

The response.json() method converts the API’s JSON response directly into a Python dictionary. The status code 200 means everything went smoothly.

Want to go deeper? Below we cover sending POST requests, authentication methods, and a real-life weather dashboard project.

What is a REST API and Why Should You Care

A REST API is how two programs talk to each other over the internet. When you check the weather on your phone, your app is calling a weather API behind the scenes. When you log into a website using Google, that’s an API call too. As a Python developer, knowing how to call APIs opens up a world of data — weather, stock prices, social media, payment processing, you name it.

Python’s requests library is the gold standard for making HTTP calls. It wraps all the complexity of HTTP into a clean, readable interface.

Installing the Requests Library

The requests library doesn’t come with Python — you need to install it:

pip install requests

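If you have several Python versions installed, run pip through the interpreter you intend to use, ideally inside a virtual environment rather than with sudo:

python3 -m pip install requests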

Making GET Requests With Query Parameters

GET requests are for fetching data. Most APIs accept query parameters to filter or customize the response. You can pass them as a dictionary using the params argument instead of manually building the URL string.

#get_with_params.py
import requests

# Pass query parameters as a dictionary — much cleaner than building the URL
params = {
    'userId': 1,
    'completed': 'false'
}

response = requests.get(
    'https://jsonplaceholder.typicode.com/todos',
    params=params  # requests builds the URL for you
)

todos = response.json()  # list of todo items
print(f"Found {len(todos)} incomplete todos")
print(f"First todo: {todos[0]['title']}")

Output:

Found 11 incomplete todos
First todo: delectus aut autem

The params dictionary gets converted into a query string like ?userId=1&completed=false and appended to the URL automatically. This is safer and cleaner than string concatenation.
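
To see why the dictionary form is safer, the sketch below builds the same kind of query string with the standard library’s urllib.parse.urlencode, which applies the percent-encoding that manual concatenation tends to forget. It is shown as a stand-in for what happens behind params=, and the second set of values is invented for the example.

```python
from urllib.parse import urlencode

params = {'userId': 1, 'completed': 'false'}
print(urlencode(params))   # userId=1&completed=false

# Values containing spaces or '&' are escaped instead of corrupting the URL
tricky = {'q': 'salt & pepper', 'page': 2}
print(urlencode(tricky))   # q=salt+%26+pepper&page=2
```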

POST Requests With JSON Body in Python

POST requests send data to an API — creating new records, submitting forms, or triggering actions. Use the json parameter to send a Python dictionary as a JSON body.

#post_request.py
import requests

# Data to send — requests will serialize this to JSON automatically
new_post = {
    'title': 'My API Post',
    'body': 'This was created with Python requests',
    'userId': 1
}

response = requests.post(
    'https://jsonplaceholder.typicode.com/posts',
    json=new_post  # automatically sets Content-Type: application/json
)

print(f"Status: {response.status_code}")  # 201 = created
print(f"New post ID: {response.json()['id']}")

Output:

Status: 201
New post ID: 101

Status code 201 means the resource was created successfully. The API returns the newly created object with its assigned ID.

Authentication Methods for Python API Calls

Most real-world APIs require authentication. Here are the three most common methods you’ll encounter.

API Key in Headers

#api_key_auth.py
import requests

headers = {
    'X-API-Key': 'your_api_key_here'  # some APIs use different header names
}

response = requests.get('https://api.example.com/data', headers=headers)
print(response.status_code)

Bearer Token Authentication

#bearer_token.py
import requests

token = 'your_access_token_here'
headers = {
    'Authorization': f'Bearer {token}'  # standard OAuth2 format
}

response = requests.get('https://api.example.com/user', headers=headers)
print(response.json())

Basic Authentication

#basic_auth.py
import requests

# requests has built-in support for Basic Auth
response = requests.get(
    'https://api.example.com/account',
    auth=('username', 'password')  # tuple of (user, pass)
)
print(response.status_code)

Note: Never hardcode API keys or tokens directly in your code. Use environment variables or a .env file instead. Check out our article on managing environment variables with dotenv for the proper approach.
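
As a minimal sketch of that advice, the helper below reads a token from the environment and builds the Authorization header from it. The variable name API_TOKEN and the function build_auth_headers are placeholders, not part of any particular API.

```python
import os

def build_auth_headers(env_var='API_TOKEN'):
    """Return Bearer auth headers, failing early if the token is not configured."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {'Authorization': f'Bearer {token}'}

os.environ['API_TOKEN'] = 'demo-token-123'   # normally set outside the code
headers = build_auth_headers()
print(headers)
```

The resulting dictionary can be passed straight to requests.get(url, headers=headers, timeout=10).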

Handling API Errors and Status Codes in Python

APIs don’t always return what you expect. Network issues, invalid data, rate limits — things go wrong. Proper error handling separates production code from tutorial code.

#error_handling.py
import requests

def safe_api_call(url):
    try:
        response = requests.get(url, timeout=10)  # always set a timeout
        response.raise_for_status()  # raises exception for 4xx/5xx codes
        return response.json()
    except requests.exceptions.Timeout:
        print("Request timed out — the server took too long to respond")
    except requests.exceptions.HTTPError as e:
        print(f"HTTP error: {e.response.status_code} - {e.response.reason}")
    except requests.exceptions.ConnectionError:
        print("Connection failed — check your internet or the URL")
    except requests.exceptions.JSONDecodeError:
        print("Response wasn't valid JSON")
    return None

# Test with a valid URL
data = safe_api_call('https://jsonplaceholder.typicode.com/posts/1')
if data:
    print(f"Got: {data['title'][:40]}...")

# Test with a URL that returns 404
data = safe_api_call('https://jsonplaceholder.typicode.com/posts/99999')

Output:

Got: sunt aut facere repellat provident occae...
HTTP error: 404 - Not Found

The raise_for_status() method is your best friend. It throws an exception for any 4xx or 5xx status code, so you don’t accidentally process error responses as valid data.

Working With Response Headers and Pagination

Many APIs return data in pages. You need to check the response headers or body for pagination info and loop through all pages to get the complete dataset.

#pagination.py
import requests

def get_all_posts(base_url):
    all_posts = []
    page = 1

    while True:
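        response = requests.get(
            base_url,
            params={'_page': page, '_limit': 10},
            timeout=10,  # same rule as before: always set a timeout
        )
        response.raise_for_status()  # stop on 4xx/5xx instead of parsing an error body
        posts = response.json()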

        if not posts:  # empty list means no more pages
            break

        all_posts.extend(posts)
        print(f"Page {page}: got {len(posts)} posts")
        page += 1

    return all_posts

posts = get_all_posts('https://jsonplaceholder.typicode.com/posts')
print(f"\nTotal posts collected: {len(posts)}")

Output:

Page 1: got 10 posts
Page 2: got 10 posts
...
Page 10: got 10 posts

Total posts collected: 100
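
The same loop can also be written as a generator that takes the page-fetching function as an argument, which makes the stopping logic testable without a network connection. This is a sketch: iter_pages, fake_fetch, and the max_pages safety cap are all assumptions added for illustration.

```python
def iter_pages(fetch_page, max_pages=1000):
    """Yield items page by page until fetch_page returns an empty list."""
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:        # empty page means we have reached the end
            return
        yield from items

# Stand-in for an API call: 25 fake records served 10 per page
RECORDS = [{'id': i} for i in range(1, 26)]

def fake_fetch(page, limit=10):
    start = (page - 1) * limit
    return RECORDS[start:start + limit]

all_items = list(iter_pages(fake_fetch))
print(len(all_items))   # 25
```

In real use, fetch_page would wrap a requests.get() call with the _page and _limit parameters and a timeout; the cap guards against an API that never returns an empty page.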

Real-Life Example: Building a Weather Dashboard Script

Let’s put it all together with a practical script that fetches weather data from the Open-Meteo API (free, no API key needed) and displays a simple dashboard.

#weather_dashboard.py
import requests
from datetime import datetime

def get_weather(city_lat, city_lon, city_name):
    """Fetch current weather for a location using Open-Meteo API"""
    url = 'https://api.open-meteo.com/v1/forecast'
    params = {
        'latitude': city_lat,
        'longitude': city_lon,
        'current_weather': 'true',  # get current conditions (a Python True would be sent as 'True')
        'timezone': 'auto'        # detect timezone from coordinates
    }

    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()

        weather = data['current_weather']
        return {
            'city': city_name,
            'temp': weather['temperature'],
            'wind': weather['windspeed'],
            'time': weather['time']
        }
    except requests.exceptions.RequestException as e:
        print(f"Failed to get weather for {city_name}: {e}")
        return None

# Define cities with their coordinates
cities = [
    (-33.87, 151.21, 'Sydney'),
    (51.51, -0.13, 'London'),
    (40.71, -74.01, 'New York'),
    (35.68, 139.69, 'Tokyo'),
]

# Fetch and display weather for all cities
print("=" * 45)
print("  WEATHER DASHBOARD")
print("=" * 45)

for lat, lon, name in cities:
    w = get_weather(lat, lon, name)
    if w:
        print(f"  {w['city']:12s} | {w['temp']:5.1f} C | Wind: {w['wind']} km/h")

print("=" * 45)
print(f"  Updated: {datetime.now().strftime('%Y-%m-%d %H:%M')}")

Output:

=============================================
  WEATHER DASHBOARD
=============================================
  Sydney       |  22.3 C | Wind: 15.2 km/h
  London       |   8.1 C | Wind: 20.5 km/h
  New York     |  11.7 C | Wind: 12.8 km/h
  Tokyo        |  16.4 C | Wind: 8.3 km/h
=============================================
  Updated: 2026-03-13 09:15

This script demonstrates GET requests with query parameters, response parsing, error handling with timeouts, and looping through multiple API calls. You could easily extend it with a scheduler to run every hour or save results to a CSV for tracking trends over time.
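
As one possible take on the CSV idea, the sketch below writes dashboard rows as CSV using the standard csv module. It writes to an in-memory buffer so it runs anywhere; the sample readings are invented, and in practice you would open a file in append mode instead.

```python
import csv
import io

FIELDS = ['city', 'temp', 'wind', 'time']

def write_rows(rows, out, write_header=True):
    """Write weather dicts (same keys get_weather returns) as CSV rows."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerows(rows)

# Invented sample data in the same shape as get_weather() results
rows = [
    {'city': 'Sydney', 'temp': 22.3, 'wind': 15.2, 'time': '2026-03-13T09:00'},
    {'city': 'London', 'temp': 8.1, 'wind': 20.5, 'time': '2026-03-13T09:00'},
]

buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

When appending to an existing file, pass write_header=False so the header row is only written once.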

Frequently Asked Questions

What is the difference between requests.get() and requests.post() in Python?

GET fetches data from a server without changing anything. POST sends data to create or update a resource. Use GET when you’re reading, POST when you’re writing. Some APIs also use PUT for updates and DELETE for removals.

How do I send form data instead of JSON with Python requests?

Use the data parameter instead of json: requests.post(url, data={'key': 'value'}). This sends the data as application/x-www-form-urlencoded, which is what HTML forms use.
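
The difference is easiest to see in the body that gets sent. This sketch reproduces both encodings with the standard library (json.dumps and urlencode) rather than calling requests, so treat it as an approximation of the request body, not a capture of real traffic.

```python
import json
from urllib.parse import urlencode

payload = {'title': 'My API Post', 'userId': 1}

json_body = json.dumps(payload)   # roughly what json=payload sends
form_body = urlencode(payload)    # roughly what data=payload sends

print(json_body)   # {"title": "My API Post", "userId": 1}
print(form_body)   # title=My+API+Post&userId=1
```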

Should I use requests or urllib for API calls in Python?

requests is almost always the better choice. While urllib is built-in, its API is verbose and harder to use. The requests library handles cookies, sessions, redirects, and encoding automatically.

How do I handle API rate limits with Python requests?

Check the response headers for rate limit info (usually X-RateLimit-Remaining and Retry-After). If you get a 429 status code, wait the specified time before retrying. For robust solutions, use exponential backoff with the tenacity library.
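
A sketch of that waiting logic, kept free of network calls so it can be read in isolation: honour Retry-After when the server sends it as a number of seconds, otherwise fall back to exponential backoff. The function name, base delay, and cap are assumptions. Retry-After may also be an HTTP date, which this sketch does not parse.

```python
def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    retry_after = headers.get('Retry-After')
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            pass  # an HTTP-date value; fall through to backoff
    return min(base * (2 ** attempt), cap)

print(retry_delay({'Retry-After': '5'}, attempt=0))   # 5.0
print(retry_delay({}, attempt=3))                      # 8.0
print(retry_delay({}, attempt=10))                     # 60.0
```

In a retry loop you would call time.sleep(retry_delay(response.headers, attempt)) after a 429 and give up after a fixed number of attempts.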

Conclusion

The requests library gives you everything you need to interact with REST APIs in Python — from simple GET calls to authenticated POST requests with error handling. The key patterns to remember are: always set a timeout, use raise_for_status() for error detection, and never hardcode credentials. With these fundamentals, you can integrate almost any web service into your Python projects.

Reference

Official requests documentation: https://docs.python-requests.org/

Python urllib documentation: https://docs.python.org/3/library/urllib.html

Why Developers Use No Code User Authentication for Python Sites

by Pubs | Input Output, Intermediate

Beginner

Introduction

Building user authentication from scratch sounds like a good idea until you’re three weeks in, wrestling with password hashing edge cases, session management bugs, and the nagging realization that you’ve probably missed half the security best practices. Authentication is deceptively complex — you need to handle password resets, token expiration, email verification, brute-force protection, and compliance with frameworks like GDPR and SOC 2. Most Python developers who’ve built auth systems manually can tell you: it’s a massive rabbit hole that distracts from your actual product.

The good news? You don’t have to build it yourself anymore. A growing number of Python developers are abandoning custom authentication in favor of no-code authentication services — third-party platforms that handle all the heavy lifting. These services let you add enterprise-grade authentication to your Python web applications in minutes, not months, without writing a single line of password validation logic or worrying about whether your security implementation is bulletproof.

In this guide, we’ll explore why no-code authentication has become the default choice for modern Python developers, how it works, and which services actually deliver on their promises. By the end, you’ll understand when to use these services and when (if ever) rolling your own auth actually makes sense.

What is No-Code Authentication?

No-code authentication refers to third-party platforms that provide complete user authentication and identity management without requiring you to build the infrastructure yourself. Instead of storing passwords in your database, validating credentials, managing sessions, and implementing security protocols, you delegate all of this to a specialized service. Your application communicates with the authentication service via APIs or SDKs, and the service handles the heavy lifting.

These platforms operate on a simple principle: authentication is so critical to security that it deserves specialized infrastructure. They invest heavily in compliance certifications, security audits, penetration testing, and infrastructure redundancy — things that are expensive and time-consuming for individual developers to maintain. By using a dedicated service, you inherit that mature security posture immediately.

The flow is straightforward. A user visits your Python application and attempts to log in. Instead of your app checking a password hash against your database, your app redirects the user to the authentication service’s login page. The service handles the login, issues tokens or sessions, and redirects the user back to your app. Your app then trusts those tokens to grant access to protected resources. From your Python code’s perspective, you’re just validating tokens and reading user claims — the hard parts are gone.
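
To give a feel for what “validating tokens and reading user claims” involves, here is a standard-library sketch that signs and verifies a compact HS256 token. It is an illustration under stated assumptions: real providers typically sign with asymmetric keys (for example RS256) and you would verify with a maintained library rather than hand-written code. The helper names, the secret, and the claims are all invented for this example.

```python
import base64
import hashlib
import hmac
import json
import time

def b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b'=').decode('ascii')

def b64url_decode(text):
    padding = '=' * (-len(text) % 4)   # restore the stripped padding
    return base64.urlsafe_b64decode(text + padding)

def sign_token(claims, secret):
    """Build a compact HS256 token: header.payload.signature"""
    header = b64url_encode(json.dumps({'alg': 'HS256', 'typ': 'JWT'}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = f'{header}.{payload}'.encode('ascii')
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f'{header}.{payload}.{b64url_encode(signature)}'

def verify_token(token, secret, now=None):
    """Return the claims if algorithm, signature and expiry check out, else raise ValueError."""
    parts = token.split('.')
    if len(parts) != 3:
        raise ValueError('malformed token')
    header, payload, signature = parts
    if json.loads(b64url_decode(header)).get('alg') != 'HS256':
        raise ValueError('unexpected algorithm')
    signing_input = f'{header}.{payload}'.encode('ascii')
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError('bad signature')
    claims = json.loads(b64url_decode(payload))
    current = time.time() if now is None else now
    if claims.get('exp', 0) <= current:   # a token without exp is treated as expired
        raise ValueError('token expired')
    return claims

SECRET = b'demo-secret-not-for-production'
token = sign_token(
    {'sub': 'user_123', 'email': 'ada@example.com', 'exp': time.time() + 300},
    SECRET,
)
claims = verify_token(token, SECRET)
print(claims['sub'], claims['email'])   # user_123 ada@example.com
```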

Quick Example: Flask + Auth0

Here’s a minimal Flask application using Auth0 for authentication:

import os
from flask import Flask, redirect, url_for, session
from authlib.integrations.flask_client import OAuth

# The environment variable names below are placeholders; use your own
app = Flask(__name__)
app.secret_key = os.environ['FLASK_SECRET_KEY']  # session signing key, never hardcoded
oauth = OAuth(app)

AUTH0_DOMAIN = os.environ['AUTH0_DOMAIN']  # e.g. your-domain.auth0.com

auth0 = oauth.register(
    'auth0',
    client_id=os.environ['AUTH0_CLIENT_ID'],
    client_secret=os.environ['AUTH0_CLIENT_SECRET'],
    client_kwargs={'scope': 'openid profile email'},
    # OIDC discovery document: supplies the authorize, token, and JWKS URLs
    server_metadata_url=f'https://{AUTH0_DOMAIN}/.well-known/openid-configuration',
)

@app.route('/login')
def login():
    return auth0.authorize_redirect(redirect_uri=url_for('callback', _external=True))

@app.route('/callback')
def callback():
    token = auth0.authorize_access_token()  # exchanges the code and parses the ID token
    session['user'] = token
    return redirect(url_for('dashboard'))

@app.route('/dashboard')
def dashboard():
    user = session.get('user')
    if not user:  # not logged in yet
        return redirect(url_for('login'))
    return f"Hello, {user['userinfo']['email']}"

That’s it. The service handles password validation, token management, multi-factor authentication, and all the compliance headaches. Your app just needs to validate the token and read the user’s claims.

Rolling Your Own Auth vs No-Code Services

Let’s be direct about the tradeoffs. Building authentication yourself gives you complete control and customization options. You can design the exact user experience you want, integrate with proprietary identity systems, and avoid third-party dependencies. But control comes at a cost.

Factor	Roll Your Own Auth	No-Code Auth Service
Development Time	4-8 weeks minimum	30 minutes to 1 week
Security Compliance	Your responsibility, risky	SOC 2, GDPR, HIPAA certified
Password Storage	You manage hashing, salts	Provider handles securely
Token Management	Session handling, expiration logic	Automatic token lifecycle
Multi-Factor Auth	Build from scratch	Included out of the box
Social Login	Integrate each provider separately	Pre-built integrations
Breach Monitoring	Not typically implemented	Included, active alerts
Customization	Complete flexibility	Templated, some limits
Maintenance Burden	Ongoing patches, security updates	Managed by provider
Cost	Developer time (expensive)	$0-500/month depending on scale

For most Python developers and teams, the comparison is clear. The cost of building and maintaining auth incorrectly far exceeds the cost of a third-party service.

Top No-Code Authentication Services for Python

Auth0

Auth0 is the enterprise standard for no-code authentication. It provides comprehensive identity management, supports 30+ identity providers (Google, GitHub, Okta, Salesforce, etc.), and includes advanced features like passwordless authentication, risk-based access control, and detailed audit logs. For Python developers, Auth0’s Python quickstart uses the third-party Authlib library, and the platform also exposes a REST API you can call directly. The platform is SOC 2 certified and supports OAuth2, OpenID Connect, and SAML. Auth0’s pricing starts free for development and scales to $1,000+ per month for enterprise deployments.

Firebase Authentication

Firebase Auth is Google’s simplified authentication service, tightly integrated with the Firebase ecosystem. It’s lighter-weight than Auth0 and excels at rapid prototyping. Firebase supports email/password, phone authentication, and social login. For Python backends, you can verify Firebase tokens and manage users via their Admin SDK. The learning curve is shallow, and pricing is very reasonable — you pay for usage, typically under $100/month unless you’re at scale. Firebase Auth is ideal if you’re already invested in Google Cloud Platform or need quick, low-maintenance authentication.

Clerk

Clerk is a newer entrant focused on developer experience. It emphasizes pre-built authentication UI components and seamless session management. Clerk supports email, phone, OAuth (Google, GitHub, Apple), and passkeys. The platform includes organizational support out of the box, making it valuable for B2B applications. For Python backends, Clerk provides webhooks for user lifecycle events and middleware libraries for FastAPI and Flask. Clerk’s free tier is generous, and paid plans start around $99/month. It’s growing rapidly among startups and indie developers.

Supabase Auth

Supabase Auth is PostgreSQL-native and built on GoTrue (an open-source authentication service). If your Python application already uses Supabase for the database, adding auth is seamless — users are stored in a dedicated auth schema in your own database. Supabase supports email/password, OAuth, passwordless login, and magic links. For Python developers, Supabase provides the supabase-py SDK and REST API access. The major advantage is control — user data stays in your database, not a third-party silo. Pricing is based on usage and very affordable at scale.

Key Benefits for Python Developers

Security You Can’t Hack

Third-party auth services employ teams of security engineers, cryptographers, and compliance specialists. They undergo regular penetration testing, maintain bug bounty programs, and achieve certifications like SOC 2 and GDPR compliance. As an individual developer, achieving the same level of security would require thousands of hours and deep cryptographic expertise. When you use a no-code service, you’re inheriting a security posture that would cost your company hundreds of thousands of dollars to replicate.

Reclaim Weeks of Development Time

Authentication isn’t a differentiator for most applications. Your users don’t care if you built the login system yourself or outsourced it. What they care about is that it works reliably and securely. By using a no-code service, you redirect weeks of development effort toward features that actually move the needle — your product’s core value proposition. A typical auth implementation takes 4-8 weeks of developer time. A third-party service gets you to launch in hours.

Compliance Made Manageable

GDPR, HIPAA, SOC 2, CCPA — modern applications must meet increasingly complex compliance requirements. These standards demand careful handling of user data, audit trails, data retention policies, and security controls. Reputable auth services are already certified for these frameworks. Using them doesn’t eliminate your compliance responsibilities, but it dramatically simplifies them. You’re not starting from scratch trying to understand what GDPR requires of user authentication.

Automatic Scalability

Building auth at small scale is different from auth at large scale. At 1,000 users, a simple password database works fine. At 1 million users, you need distributed databases, caching layers, rate limiting, DDoS protection, and redundancy across regions. Third-party services handle this complexity invisibly. Your application scales from hobby project to enterprise system without changing how you call the auth API.

When to Build Your Own Authentication

Despite the overwhelming advantages of no-code services, there are legitimate scenarios where building custom auth makes sense. Be honest with yourself: you probably don’t have one of these reasons.

Extreme Customization Needs: If your authentication flow requires unconventional user workflows (like a game with progression-based access gates or a specialized medical application with role-based biology), you might need custom logic. Even then, you can often layer custom logic on top of a third-party provider rather than replacing it entirely.

Regulatory Isolation Mandate: Some regulated industries require complete data sovereignty. A hospital system might be legally required to store patient authentication data exclusively within a private data center. In that case, running your own auth server (hardened and based on proven open-source code, not from scratch) is sometimes necessary.

Offline-First Application: If your Python application runs offline with intermittent connectivity (like a mobile app or field tool), a third-party auth service won’t help you validate users without internet. You’ll need to build local authentication with cached credentials. But even then, you can sync to third-party auth when connectivity returns.

Zero External Dependencies: Some organizations have architectural policies against third-party dependencies for security or liability reasons. If your company forbids external SaaS, you have no choice but to build your own. Understand that this decision extracts a real cost in engineering time and risk.

For everyone else? Use a third-party service and ship your product faster.

Real-Life Example: Django + Clerk

Let’s look at a more complete example using Django and Clerk, showing how to implement protected routes and user profile management:

import os
import requests
from functools import wraps
from django.shortcuts import redirect
from django.http import JsonResponse
from django.conf import settings

CLERK_API_KEY = os.getenv('CLERK_API_KEY')
CLERK_DOMAIN = os.getenv('CLERK_DOMAIN')

def require_clerk_auth(view_func):
    @wraps(view_func)
    def wrapped_view(request, *args, **kwargs):
        auth_header = request.headers.get('Authorization', '')
        if not auth_header.startswith('Bearer '):
            return JsonResponse({'error': 'Unauthorized'}, status=401)

        token = auth_header.split(' ')[1]
        headers = {
            'Authorization': f'Bearer {CLERK_API_KEY}',
            'Content-Type': 'application/json'
        }

        response = requests.get(
            f'{CLERK_DOMAIN}/api/v1/tokens/decode',
            params={'token': token},
            headers=headers,
            timeout=5,  # avoid hanging the view if the auth API is slow
        )

        if response.status_code != 200:
            return JsonResponse({'error': 'Invalid token'}, status=401)

        request.clerk_user = response.json()
        return view_func(request, *args, **kwargs)

    return wrapped_view

# views.py
from django.http import JsonResponse
from django.views.decorators.http import require_http_methods

@require_http_methods(["GET"])
@require_clerk_auth
def get_profile(request):
    user_id = request.clerk_user.get('sub')
    return JsonResponse({
        'user_id': user_id,
        'email': request.clerk_user.get('email'),
        'created_at': request.clerk_user.get('iat')
    })

@require_http_methods(["POST"])
@require_clerk_auth
def update_profile(request):
    user_id = request.clerk_user.get('sub')
    data = request.POST

    # Update user in your database
    # (authenticate via Clerk token above)

    return JsonResponse({'status': 'updated'})

This example secures Django views by asking Clerk to validate the token. The decorator extracts the token from the Authorization header, sends it to the validation endpoint on your Clerk domain, and attaches the decoded user information to the request. Treat the endpoint path shown here as illustrative and confirm it against Clerk’s current API reference. Your view then has access to authenticated user data without ever touching passwords or sessions.
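The header handling inside the decorator can also be pulled out into a small helper, which makes it easy to test without Django or a network call. The sketch below is one possible shape; the helper name extract_bearer_token is hypothetical and not part of Django or Clerk.

```python
from typing import Optional


def extract_bearer_token(auth_header: str) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' value, or None."""
    scheme, _, token = auth_header.partition(" ")
    token = token.strip()
    # Reject other schemes (e.g. Basic) and headers with no token after the scheme
    if scheme.lower() != "bearer" or not token:
        return None
    return token


print(extract_bearer_token("Bearer abc123"))  # abc123
print(extract_bearer_token("Basic abc123"))   # None
print(extract_bearer_token("Bearer "))        # None
```

Inside the decorator, a None result would map to the same 401 response the original code returns for a missing or malformed header.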

Frequently Asked Questions

How much does no-code authentication cost?

Most services offer free tiers for development and small projects. Auth0 starts free with limited features, Firebase Auth charges per identity verification (typically $0.01-$0.05 per auth event), and Clerk offers a generous free tier up to 5,000 monthly active users. For production applications, expect $20-500/month depending on user volume and features. This is almost always cheaper than the developer time required to build your own system.

Am I locked into a vendor?

Switching auth providers is possible but requires refactoring code. Your application code is tightly integrated with your chosen provider’s SDK and API. However, the integration layer is usually concentrated in middleware or decorators, so switching is more like rewriting an adapter than rewriting the entire system. Consider this when choosing a provider, but don’t let lock-in fears paralyze you — using the wrong auth approach (building it yourself) has far worse lock-in consequences.
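To make the “rewriting an adapter” point concrete, here is a minimal sketch of keeping provider-specific code behind one small interface. All names here (AuthUser, TokenVerifier, FakeVerifier, current_user) are hypothetical, and the fake verifier stands in for a real adapter that would call your provider’s SDK or API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class AuthUser:
    user_id: str
    email: Optional[str] = None


class TokenVerifier(Protocol):
    """The only auth surface the rest of the application depends on."""

    def verify(self, token: str) -> Optional[AuthUser]: ...


class FakeVerifier:
    """Stand-in adapter for tests; a real one would wrap a provider SDK."""

    def __init__(self, known: Dict[str, AuthUser]) -> None:
        self._known = known

    def verify(self, token: str) -> Optional[AuthUser]:
        return self._known.get(token)


def current_user(verifier: TokenVerifier, token: str) -> Optional[AuthUser]:
    # Application code calls the interface, never the provider directly
    return verifier.verify(token)


verifier = FakeVerifier({"tok-1": AuthUser("user_1", "a@example.com")})
print(current_user(verifier, "tok-1"))  # AuthUser(user_id='user_1', email='a@example.com')
print(current_user(verifier, "nope"))   # None
```

Switching providers then means writing one new class with a verify method, while views and business logic stay unchanged.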

Where does my user data live?

Most no-code providers (Auth0, Clerk, Firebase) store user data in their infrastructure. Supabase is unique in storing auth data in your own PostgreSQL database. If data residency is critical, Supabase is your answer. If you’re in an industry with strict data privacy requirements, check the provider’s data center locations and compliance certifications. Most enterprise services offer data residency options (e.g., EU-only data storage).

Can I customize the login UI?

All major providers support white-label login pages. Auth0 and Clerk allow embedding authentication directly in your application using their UI libraries. Firebase offers pre-built UI components or headless APIs if you want complete control over the interface. Supabase provides the supabase-auth-ui for quick setup or raw API access for custom interfaces. The level of customization varies by provider, but all offer more flexibility than building from scratch.

What if I have legacy users from a custom auth system?

Most providers support user imports. You can bulk-import existing user records (with hashed passwords if you trust your hash algorithm) into Auth0, Clerk, or Firebase. The import process typically takes a few steps and a bit of data transformation. During the transition, you might temporarily support both old and new auth systems, gradually migrating users. This is a known problem with known solutions.

Can I use no-code auth for offline-first apps?

No-code services require internet connectivity to authenticate users initially. For offline-first applications, you’ll need to implement local authentication with cached credentials. Some services like Supabase provide offline SDKs that sync when connectivity returns. If offline operation is essential, plan for a hybrid approach: use third-party auth for online users and implement local fallback logic for offline scenarios.

Conclusion

No-code authentication has fundamentally changed how Python developers should approach user login and identity management. The era of building custom auth systems is over for most applications. The services available today — Auth0, Firebase Auth, Clerk, and Supabase Auth — offer security, compliance, and features that rival or exceed what you could build in a reasonable timeframe.

The practical decision is simple: unless you have a specific, documented reason to build your own auth system, use a third-party provider. Spend your engineering time on your product’s core value proposition. Let experts handle the complex, security-critical job of authentication.

Start with the quick example in this guide, pick a provider that matches your architecture, and add authentication to your Python application in an afternoon. Your future self will thank you when you’re not debugging password reset tokens at 2 AM.

How To Generate Random Numbers In Python

by Pubs | Beginner, Data Processing, Skill Level

Generating random numbers in Python is straightforward and usually takes only a few lines. There are many variations you may need: a fixed number of decimal places, random numbers between a start and end value, and more. We’ll go through many useful examples in this article.

The most basic way to generate random numbers in Python is with the random library:

import random

num = random.random()

print( f"Random number between 0.0 and 1.0 ={num}\n")

Output as follows:

You’ll see that each time it is run it has a new random number.

Generating the same random number each time and why this matters

Sometimes you want to generate random numbers but be able to reproduce the same numbers on every run. This may sound counterintuitive, since the whole point of random numbers is that they are random. One scenario where it helps is testing: if you find some unusual behaviour, you’ll need the same input to replicate it. You can do that in Python using the seed function from the random library.

You can think of the seed as a key that determines the entire series of random numbers generated from it. Use a different seed and you’ll generate a different series.

See the following example code, which seeds the generator and then prints four random numbers between 0 and 1:

import random

random.seed(1)

for i in range(1,5):
    num = random.random()
    print( f"Random number between 0.0 and 1.0 ={num}\n")

Output as follows:

No matter how many times it is run, since the seed is the same each time, it generates the same numbers.
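The same idea works with separate generator objects, which avoids changing the global random state that other code may rely on. The sketch below uses random.Random from the standard library; the seed values 42 and 7 are arbitrary choices for illustration.

```python
import random

# Two generators with the same seed produce the same series
rng_a = random.Random(42)
rng_b = random.Random(42)

# A generator with a different seed produces a different series
rng_c = random.Random(7)

seq_a = [rng_a.random() for _ in range(3)]
seq_b = [rng_b.random() for _ in range(3)]
seq_c = [rng_c.random() for _ in range(3)]

print(seq_a == seq_b)  # True
print(seq_a == seq_c)  # False
```

This is handy in tests: each test can create its own seeded generator without affecting any other test.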

Python Random Number Between 1 and 10

Now that we know how to generate random numbers, how do you generate one between two given numbers? This is easily done with either randint() for whole numbers or uniform() for decimal numbers. Note that randint() includes both endpoints, so randint(1,10) can return 1 or 10.

import random

num_int = random.randint(1,10)
print( f"Random whole number between 1 and 10 ={num_int}\n")

num_uni = random.uniform(1,10)
print( f"Random decimal number between 1 and 10 ={num_uni}\n")
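Two common variations are limiting the number of decimal places and stepping through a range. A short sketch using only the standard library (the variable names are just for illustration):

```python
import random

# A decimal number between 1 and 10, rounded to 2 decimal places
price = round(random.uniform(1, 10), 2)

# A whole number from 2 to 10 in steps of 2 (the stop value 11 is excluded)
even = random.randrange(2, 11, 2)

print(f"Rounded decimal number = {price}")
print(f"Random even number = {even}")
```

Unlike randint(), randrange() follows the same convention as range() and never returns the stop value.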

Python Generate Random Numbers From A Range

Suppose you need to pick a random item from a collection of data, whether that be numbers, names or even a pack of cards. One way is to choose the index randomly. For example, if you had a list of 5 items, you can randomly choose an index from 0 to 4 (where 0 is the index of the first item).

There is another and shorter way in python which is to use the random.choice() function. If you pass it an array, it will then randomly return one of the elements.

Here’s an example that randomly selects a name from a list, first using the index (to show you how it works), and then using the much more concise random.choice() library function:

import random

###### Selecting items from a list
names_list = [ "Judy", "Harry", "Sarah", "Tom", "Gloria"]

rand_index = random.randint( 0, len(names_list)-1 )

print( f"Randomly selected person 1 is = { names_list[ rand_index] }\n")
print( f"Randomly selected person 2 is = { random.choice( names_list) }\n")

The output is different each time you run it.
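If you need several picks at once, or some items should come up more often than others, random.choices() is a closely related function. The sketch below reuses the same list of names; the weights are made-up values for illustration.

```python
import random

names_list = ["Judy", "Harry", "Sarah", "Tom", "Gloria"]

# Three picks; the same name can appear more than once
picks = random.choices(names_list, k=3)

# Weighted picks: "Judy" is five times as likely as each other name
weighted_picks = random.choices(names_list, weights=[5, 1, 1, 1, 1], k=3)

print(f"Three random picks = {picks}")
print(f"Three weighted picks = {weighted_picks}")
```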

Generate Random String Of Length n in Python

If you want to generate a string of a specific length (e.g. to generate a password), the random and string libraries come in handy. You can use them to create a simple password generator as follows:

import random, string

###### Create a random password
def generate_password( pass_len=10):
    password = "" 

    for i in range(1,pass_len+1):
        password = password + random.choice( string.ascii_letters + string.punctuation )
    return password

print( f"Password generated = [{ generate_password(10) }] ")

This will output a new password each time, shown between square brackets.

If there are specific characters you want to include or exclude, you can simply replace string.punctuation with your own string of allowed characters.
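One caveat: the random module is not designed for security purposes. For real passwords or tokens, the standard library’s secrets module is the safer choice. Here is a sketch of the same generator using secrets; the default length of 12 and the inclusion of digits are choices made for this example.

```python
import secrets
import string


def generate_password(pass_len: int = 12) -> str:
    """Build a password from letters, digits and punctuation using a secure source."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(pass_len))


print(f"Password generated = [{generate_password(12)}]")
```

The call pattern is the same as before; only the source of randomness changes.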

Random Choice Without Replacement In Python

Suppose you wanted to randomly select items from a list without repeating any items. For example, you have a list of students and you have to select them in a random order to go first in a specific activity. In many programming languages you may need to generate a random list and remember the previously selected items to prevent any repeated selections. In the random library, there is a function called random.sample() that will do all that for you:

import random

#### Select unique random elements
students = ["John", "Tom", "Paul", "Sarah", "July", "Rachel"]

random_order = random.sample( students, 6)
print(random_order)

This returns the students in a random order without repeating any selections.
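Two related variations are worth knowing: picking only a few unique items, and shuffling a list in place. A short sketch using the same list of students:

```python
import random

students = ["John", "Tom", "Paul", "Sarah", "July", "Rachel"]

# Pick 3 different students (no repeats)
first_three = random.sample(students, 3)

# shuffle() reorders a list in place, so work on a copy to keep the original
shuffled = students.copy()
random.shuffle(shuffled)

print(f"Three unique picks = {first_three}")
print(f"Shuffled order = {shuffled}")
```

Note that random.sample() raises a ValueError if you ask for more items than the list contains.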

Generate Date Between Two Dates in Python

To generate a random date between two dates, first work out the number of days between them. Then use random.randint() to pick a number of days in that range and add it back to the start date:

import random, datetime  

#### Select a random date between two dates: 
d1 = datetime.date( 2013,  2, 26 )
d2 = datetime.date( 2015, 12, 15 )
diff = d2 - d1 
new_date_days = random.randint( 0, diff.days )

print( f"Random date is {   d1 + datetime.timedelta( days=new_date_days ) }")

The output is a single line containing a date somewhere between the two dates, and it changes from run to run.
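The same steps can be wrapped in a reusable function. This is a small sketch; the name random_date and the check for reversed dates are additions for illustration.

```python
import datetime
import random


def random_date(start: datetime.date, end: datetime.date) -> datetime.date:
    """Return a random date between start and end, both inclusive."""
    if end < start:
        raise ValueError("end date must not be before start date")
    span_days = (end - start).days
    return start + datetime.timedelta(days=random.randint(0, span_days))


d1 = datetime.date(2013, 2, 26)
d2 = datetime.date(2015, 12, 15)
picked = random_date(d1, d2)
print(f"Random date is {picked}")
```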

Generate Random Temporary Filename in Python

A common need is to generate a random filename, often for temporary storage. This might be for a log file, a cache file or some other scenario, and it can be done with string generation similar to the above. The first character is chosen from letters only, and the remaining characters can be letters or digits.

import random, string

def generate_random_filename( filename_len=10):
    filename = "" 
    filename = filename + random.choice( string.ascii_lowercase  )

    for i in range(2, filename_len+1):
        filename = filename + random.choice( string.ascii_lowercase + string.digits )
    return filename

print( f"Random filename = [{ generate_random_filename( 10) }.txt]")

Output as follows:

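There is in fact a standard library module, tempfile, that does this even more simply. Note that NamedTemporaryFile actually creates the file, and by default deletes it again when it is closed: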

import tempfile

# temp_file is an open file object; its .name attribute holds the generated path
temp_file = tempfile.NamedTemporaryFile( prefix="temp_" , suffix =".txt" )

print( f" Temporary filename is [{ temp_file.name }] ")

The output is the full path of the temporary file, which includes the temp_ prefix, a random part, and the .txt suffix.
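If you want to write to the temporary file and read it back later, a with block plus delete=False gives you control over when the file is removed. A minimal sketch (the file contents are placeholder text):

```python
import os
import tempfile

# delete=False keeps the file on disk after the with block closes it
with tempfile.NamedTemporaryFile(mode="w", prefix="temp_", suffix=".txt", delete=False) as tmp:
    tmp.write("cached data\n")
    temp_path = tmp.name

# The file still exists, so it can be reopened by name
with open(temp_path) as f:
    content = f.read()
print(f"Read back: {content!r}")

# Clean up explicitly once the file is no longer needed
os.remove(temp_path)
```

With delete=False, removing the file is your responsibility, so keep the os.remove() call (or an equivalent cleanup step).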

Conclusion

The random library has many uses, from generating numbers to generating strings of a given length for password generation. Some of these use cases have specialised libraries because there can be nuances (e.g. for passwords, you may not want a repeating sequence, which is possible through random luck); you can search for them on pypi.org. However, many can be created with a few simple lines of code as demonstrated above. Send comments below or email me to ask further questions.

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) so that each call returns a validated Pydantic model instead of free-form text. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema or the retry limit is reached. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models via LiteLLM, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

Instructor converts chaotic LLM output to clean schema — response_model= and the chaos becomes a schema.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in
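The validate-and-retry idea can be sketched without any LLM at all. The snippet below is not instructor’s implementation; it is a simplified illustration in which fake_model is a hypothetical stand-in for the model call and validate_rating plays the role of the schema.

```python
from typing import Callable, List


def validate_rating(raw: dict) -> int:
    """Stand-in for schema validation: rating must be an integer from 1 to 5."""
    rating = int(raw["rating"])
    if not 1 <= rating <= 5:
        raise ValueError(f"Rating must be between 1 and 5, got {rating}")
    return rating


def extract_with_feedback(ask_model: Callable[[List[str]], dict], max_attempts: int = 3) -> int:
    """Call the model, validate the result, and feed errors back until it passes."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        raw = ask_model(feedback)
        try:
            return validate_rating(raw)
        except (KeyError, TypeError, ValueError) as exc:
            feedback.append(str(exc))  # the next call sees what went wrong
    raise RuntimeError(f"No valid output after {max_attempts} attempts: {feedback}")


def fake_model(feedback: List[str]) -> dict:
    # Returns an invalid rating first, then corrects itself once it sees an error
    return {"rating": 6} if not feedback else {"rating": 4}


print(extract_with_feedback(fake_model))  # 4
```

The important detail is that the error message travels back into the next request, which gives the model something specific to fix.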

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. The interface stays largely the same regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Patching OpenAI client with Pydantic validator — One patch. Every completion call now speaks schema.

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
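# Salary fields are Optional, so guard against None before applying number formatting
if job.salary_min is not None and job.salary_max is not None:
    print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
else:
    print("Salary: not specified")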
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields should come back as None instead of hallucinated values. Use Optional for fields that may not appear in the input; without it, the field is required and the model may invent a plausible-sounding value rather than leaving it empty. Remember to check for None before formatting such fields.

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $222.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $222.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.
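Type validation does not check arithmetic, so it can be worth cross-checking extracted numbers yourself. The sketch below uses plain dataclasses as stand-ins for the Pydantic models, and the helper names and the 0.01 tolerance are assumptions made for illustration. Run against the sample invoice above, it shows that the line items add up to 177.00 while the stated total is 222.00, exactly the kind of mismatch a check like this would flag.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float


def items_total(items: List[LineItem]) -> float:
    """Sum quantity * unit_price over all line items."""
    return sum(item.quantity * item.unit_price for item in items)


def total_matches(items: List[LineItem], stated_total: float, tolerance: float = 0.01) -> bool:
    """True if the stated total agrees with the line items within a small tolerance."""
    return math.isclose(items_total(items), stated_total, abs_tol=tolerance)


items = [
    LineItem("Server instances", 3, 45.00),
    LineItem("SSL Certificate", 1, 12.00),
    LineItem("Domain registrations", 2, 15.00),
]

print(items_total(items))            # 177.0
print(total_matches(items, 222.00))  # False
print(total_matches(items, 177.00))  # True
```

In a Pydantic model, the same comparison could live in a validator so that a mismatch triggers a retry or is surfaced for manual review.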

Instructor nested Pydantic models — Nested Pydantic models: recursion that actually works.

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message, along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the instructor version). The model then gets a chance to self-correct. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
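For simple range and length rules, Pydantic’s declarative Field constraints are an alternative to hand-written validators. The sketch below assumes Pydantic v2 (the same version family that provides field_validator) and runs without any LLM call: it builds an invalid review by hand to show which fields fail.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(ge=1, le=5, description="Rating from 1 to 5")
    pros: List[str] = Field(min_length=1, description="At least one positive aspect")
    cons: List[str] = Field(default_factory=list, description="Negative aspects, can be empty")
    summary: str = Field(max_length=150, description="One-sentence summary")


try:
    # rating is out of range and pros is empty, so both fields should fail
    ProductReview(product_name="Python IDE", rating=6, pros=[], summary="Decent IDE.")
except ValidationError as exc:
    error_fields = sorted(err["loc"][0] for err in exc.errors())
    print(error_fields)  # ['pros', 'rating']
```

Constraints declared this way also appear in the generated JSON Schema, while custom validators remain the better fit for rules that span several fields or need custom error messages.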

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Instructor retry and self-correction loop — Three retries and a Pydantic error. That’s the whole self-correction system.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
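                # Compare against None so legitimate 0 values (e.g. "0-2 years") are kept
                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,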
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
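The generator idea can be sketched without touching the extraction call itself. The snippet below is a minimal illustration and not part of the original pipeline: iter_postings, process_stream, and fake_extract are hypothetical names, and fake_extract stands in for the instructor call so the sketch runs without an API key.

```python
# streaming_pipeline_sketch.py
from typing import Callable, Iterable, Iterator, List


def iter_postings(raw: str, separator: str = "\n---\n") -> Iterator[str]:
    """Yield one posting at a time (hypothetical source; could be a queue or DB cursor)."""
    for chunk in raw.split(separator):
        chunk = chunk.strip()
        if chunk:
            yield chunk


def process_stream(texts: Iterable[str], extract_one: Callable[[str], dict]) -> Iterator[dict]:
    """Apply extract_one to each text lazily, skipping postings that fail."""
    for text in texts:
        try:
            yield extract_one(text)
        except ValueError as exc:
            print(f"Skipped posting: {exc}")


def fake_extract(text: str) -> dict:
    """Stand-in for the instructor call; assumes 'Title at Company' on the first line."""
    first_line = text.splitlines()[0]
    if " at " not in first_line:
        raise ValueError(f"cannot parse {first_line!r}")
    title, company = first_line.split(" at ", 1)
    return {"title": title.strip(), "company": company.strip()}


raw_feed = "Senior Python Engineer at Nexaflow\n---\nmalformed posting\n---\nStaff ML Engineer at Quantra"
results: List[dict] = list(process_stream(iter_postings(raw_feed), fake_extract))
print(results)
```

In the real pipeline, extract_one would wrap the client.chat.completions.create(...) call from extract_jobs, and the except clause would catch whatever exception your instructor version raises once retries are exhausted.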

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Pass stream=True together with response_model=Iterable[YourModel] to stream a list of objects, or with response_model=Partial[YourModel] to stream partial updates to a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
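A chunking step like the one described can be sketched with the standard library alone. This is an illustrative helper, not something instructor provides: chunk_words is a hypothetical name, and it counts words as a rough proxy for tokens, which is an assumption you may want to replace with a real tokenizer.

```python
# chunking_sketch.py
from typing import List


def chunk_words(text: str, max_words: int = 800, overlap: int = 50) -> List[str]:
    """Split text into word-based chunks, repeating `overlap` words between neighbours."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks


document = " ".join(f"word{i}" for i in range(2000))
chunks = chunk_words(document, max_words=800, overlap=50)
print(len(chunks), [len(c.split()) for c in chunks])
```

Each chunk can then be passed to the same extraction call, and the per-chunk results merged afterwards. The overlap reduces the chance that an item straddling a chunk boundary is lost, at the cost of possible duplicates you need to filter.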

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not — the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting) to enforce the schema, so the model cannot deviate from the structure. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant — JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.
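To make the failure modes concrete, here is a small standard-library sketch. The three replies are made-up examples of what a model might return, and naive_parse and lenient_parse are hypothetical helpers. Even the lenient version only recovers the JSON text; it does not check field types or required fields, which is the gap schema validation fills.

```python
# fragile_json_parsing.py
import json
import re

FENCE = "`" * 3  # a markdown code fence, built here to keep this listing readable

replies = [
    '{"name": "Alice", "age": 32}',
    FENCE + 'json\n{"name": "Alice", "age": 32}\n' + FENCE,
    'Sure! Here is the extracted data:\n{"name": "Alice", "age": 32}',
]


def naive_parse(reply: str) -> dict:
    """Works only when the reply is pure JSON."""
    return json.loads(reply)


def lenient_parse(reply: str) -> dict:
    """Grab the first {...} span; still does no type or field validation."""
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found")
    return json.loads(match.group(0))


naive_ok = []
for reply in replies:
    try:
        naive_parse(reply)
        naive_ok.append(True)
    except json.JSONDecodeError:
        naive_ok.append(False)

lenient_results = [lenient_parse(r) for r in replies]
print(naive_ok)
print(lenient_results)
```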

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Further Reading: For more details, see the Python random module documentation.

Frequently Asked Questions

How do I generate a random number in Python?

Use random.randint(a, b) for integers or random.random() for a float between 0 and 1. Example: import random; num = random.randint(1, 100).

What is the difference between random and secrets?

The random module is for simulations and games but NOT for security. The secrets module provides cryptographically secure randomness for passwords, tokens, and security-sensitive applications.
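A short side-by-side, using only the standard library, shows where each module fits. The variable names are illustrative.

```python
import random
import secrets

# Fine for simulations and games: fast and reproducible, but predictable.
dice_roll = random.randint(1, 6)

# For anything security-sensitive, use secrets instead.
session_token = secrets.token_hex(16)       # 16 random bytes as 32 hex characters
reset_code = secrets.randbelow(1_000_000)   # integer in the range [0, 1_000_000)
url_token = secrets.token_urlsafe(32)       # URL-safe text token

print(dice_roll, session_token, f"{reset_code:06d}", url_token)
```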

How do I generate a random list of numbers?

Use [random.randint(1, 100) for _ in range(10)] for random integers. For unique numbers, use random.sample(range(1, 101), 10). For float arrays, use numpy.random.rand(10).

How do I set a random seed?

Call random.seed(42) before generating numbers. The same seed always produces the same sequence, useful for testing and reproducible experiments.
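The reproducibility is easy to check for yourself. The sketch below reseeds the global generator and also shows a separate random.Random instance, which keeps the seeding local to one object.

```python
import random

random.seed(42)
first_run = [random.randint(1, 100) for _ in range(5)]

random.seed(42)  # same seed again
second_run = [random.randint(1, 100) for _ in range(5)]

print(first_run == second_run)  # the two lists match

# A dedicated Random instance avoids touching the global generator's state.
rng = random.Random(42)
local_run = [rng.randint(1, 100) for _ in range(5)]
print(local_run == first_run)
```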

Can I generate numbers following a specific distribution?

Yes. Use random.gauss() for normal, random.uniform() for uniform. NumPy offers numpy.random.normal(), poisson(), binomial(), and many more.

Free Python Hosting with Deta.sh to Get Your Code on Cloud

Some of the web apps you develop in Python need to run in the cloud so that they are available 24/7. For smaller applications, a free Python hosting service saves you from worrying about monthly charges. These applications might be a website written in Flask or another web framework, or a Python script that runs in the background to handle your automation. This is where hosting services that offer a free plan and are still easy to set up come in.

To find the hosting platform that fits your needs, consider a few things:

Ease of access to upload projects
What type of support they provide
What specifications that virtual server environment has to offer

One such new platform is called deta.sh. Deta is a free hosting service that can be used to provide web hosting for deploying python web applications or other types of python applications that run in the background.

As of mid-2022, Deta is still in development and is expected to offer a permanent free Python hosting tier so that online Python applications can be set up and deployed quickly. It is a relatively new service intended to compete with PythonAnywhere, Heroku, and similar platforms for running Python on web servers. You deploy a Python script directly from the command line, much like checking code in to GitHub. Although it is new, it has the potential to become one of the best free Python hosting options.

The platform provides mini virtual environments (called ‘micros’) where you can host your Python scripts. These can be grouped into workspaces called ‘projects’ so that your environments are easier to manage. You access and upload your code from the command line, authenticating with an Access Token.

We will go through, step by step, how to run your Python code online. In this article, we use Deta to host a simple Flask-based web page so that Python acts as your web server.

Signing up for Deta.sh

Deta.sh is effectively a cloud Python hosting service that sits on top of AWS. It lets you deploy your Python code into a virtual machine (called a Deta Micro), store files (Deta Drive), and store data (Deta Base). Unlike AWS or other hosting services, you can host and run your script without the hassle of setting up servers, security configurations, and so on.

The Deta.sh team offers the service for free so that developers can monetize their solutions, with deta.sh sharing some of that revenue. To date, there are no paid Deta.sh plans for Python hosting and no stated intention to introduce them, so you can keep running Python code online at no cost.

To begin with, head over to the website https://deta.sh to first create an account.

Enter a unique username, password, and email address. The email address must be real, because it is used to activate your account.

Once you have submitted, go to your email and click on the verify link.

You will be taken to this “verification success” page. Here you can sign in, but also join the “Discord” channel. You can get any help very quickly from the community that’s there.

After you click on sign-in, enter the same username and password, and you will be taken to the default page where you will have the ability to “See My Key”

Click on “See My Key” to reveal your secret project key. It is shown only once, so copy it somewhere safe.

This is what the project key will look like:

You need both the key and the project ID.

Think of the “Project ID” as a username and the key as its password. Whenever you want to access deta.sh to upload programs or make changes, you will need your project key to access your space.

If you lose your project ID or key, you will not be able to recover it. However, you can create a new key with the Settings -> Create Key option.

Create a new project key with Settings -> Create Key (this key you see on the screen has already been deleted!)

One thing I’d like to call out is the Project ID. This is the ID of this particular space.

If you have multiple programs that access deta.sh, it is best to give each its own project key. That way, if one key is compromised, you can replace just that key without affecting your other applications.

Setting Up Your Remote Access For Deta.sh

We will first set up the deta.sh command line interface so that you can communicate with your deta.sh space in the cloud.

You can do this with either one of:

Mac / Linux:
curl -fsSL https://get.deta.dev/cli.sh | sh

Windows:
iwr https://get.deta.dev/cli.ps1 -useb | iex

Once that’s done, a hidden folder called $HOME/.deta is created (on Mac / Linux). This directory contains the deta command line application.

You can type deta --help to check that the command line tool was installed correctly.

Next, you will need to create an access token so that you can connect to your deta.sh account. Go to your deta.sh home page (e.g. https://web.deta.sh/) and then go back to the main projects page.

Next, click on Create Access Token under Settings.

This creates an Access Token so that you don’t need to log in each time.

Copy this Access Token and then create a file called tokens in the $HOME/.deta/ directory. Steps for Mac/Linux are:

cd $HOME/.deta
nano tokens

You can then add the following json inside the tokens file:

{
"deta_access_token": "<your access token created above>"
}
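If you would rather script this step than edit the file by hand, the same JSON can be written from Python. This is a minimal sketch using only the standard library: write_tokens_file is a hypothetical helper, and the demo writes to a temporary directory rather than your real $HOME/.deta folder.

```python
import json
import tempfile
from pathlib import Path


def write_tokens_file(deta_dir: Path, access_token: str) -> Path:
    """Write the tokens file in the JSON shape shown above and return its path."""
    deta_dir.mkdir(parents=True, exist_ok=True)
    tokens_path = deta_dir / "tokens"
    tokens_path.write_text(json.dumps({"deta_access_token": access_token}, indent=2))
    tokens_path.chmod(0o600)  # owner read/write only, since the token is a secret
    return tokens_path


# Demo against a temporary directory; for real use pass Path.home() / ".deta".
with tempfile.TemporaryDirectory() as tmp:
    path = write_tokens_file(Path(tmp) / ".deta", "<your access token created above>")
    saved = json.loads(path.read_text())
    print(saved)
```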

Finally, install the deta Python library, which is used to access the Deta components.

pip install deta

Have a Free Python Hosting Flask on Deta.sh

To host your Python code, you need to create something called a “micro”. This is like a mini virtual server with 128 MB of memory, but it does not run all the time: it wakes up, executes your code, and then goes back to sleep. Deta.sh is not designed for long-running applications with heavy computations (use one of the public cloud providers for that!). Also, each micro has its own private access in the cloud.

To begin, run the command deta new --python <micro name>, where <micro name> is the label for your micro (flask_test in this walkthrough).

The above command will create a directory called flask_test with a python script called main.py

The default code in the main.py is:

def app(event):
    return "Hello, world!"

At the same time, this code will be uploaded to deta.sh. If you go to the dashboard page https://web.deta.sh/ you will see a sub-menu under the Micro menu. You may need to refresh your browser if you had it open.

You will notice that there’s also a URL for this deta micro which is the end point where your application output can be accessed. Think of this simply as the console output.

If you encounter any errors, type deta logs in the command line to print the errors from the logs.

To make a more useful application, we can create a Flask application that shows a more functional webpage. To do this, you need to tell deta.sh to install the flask library. Unfortunately you cannot use pip install; you need to list the library in requirements.txt instead.

First, add flask into a requirements.txt file in your local directory. So your file should simply look like this:

#requirements.txt
flask

Then, in the main.py file in your local directory, add the following:

from flask import Flask

app = Flask(__name__)

@app.route('/', methods=["GET"])
def hello_world():
    return "Hello Flask World"
      
# def app(event):
#     return "Hello, world!"

In order to now upload the changes to your micro, you will need to run the command deta deploy. This will upload the files requirements.txt and updates to main.py into your micro.

deta deploy

When executed, this should upload the code and install the libraries:

Managing Flask Forms On Free Python Hosting

Now that we have a simple static web page, we can create a more complex example with a form that can be submitted. Using the OpenWeatherMap API, we can show the weather for a given location.

To get the weather data, we need the pyowm library, so add it to requirements.txt. The datetime module used below is part of the Python standard library and does not need to be listed.

#requirements.txt
flask
pyowm

Then for the code, the following can be updated in the main.py:

from flask import Flask, request, jsonify
import pyowm, datetime

app = Flask(__name__)

@app.route('/', methods=["GET"])
def get_location():
    return """<html>
                <body>
                    <form action="weather" method="POST">
                        <input name="location" type="text">
                        <input type="submit" value="submit">
                    </form>
                </body>
              </html>"""  

@app.route('/weather', methods=["POST", "GET"]) 
def get_weather():
    api_key = '<your open weather map API key>'
    owm = pyowm.OWM(api_key).weather_manager()

    # Use the location submitted by the form instead of a hard-coded city
    location = request.form['location']
    weather_data = owm.weather_at_place(location).weather
    ref_time = datetime.datetime.fromtimestamp(weather_data.ref_time).strftime('%Y-%m-%d %H:%M')

    weather_str =   f"<h1>Weather Report for: {location}</h1>"
    weather_str +=  f"<ul>"
    weather_str +=  f"<li><b>Time:</b> {  ref_time } </li>" 
    weather_str +=  f"<li><b>Overview:</b> {weather_data.detailed_status} </li>" 
    weather_str +=  f"<li><b>Wind Speed:</b> {weather_data.wind()} </li>" 
    weather_str +=  f"<li><b>Humidity:</b> {weather_data.humidity} </li>" 
    weather_str +=  f"<li><b>Temperature:</b> {weather_data.temperature('fahrenheit')} </li>" 
    weather_str +=  f"<li><b>Rain:</b> {weather_data.rain} </li>" 
    weather_str +=  f"</ul>"
    return weather_str

# def app(event):
#     return "Hello, world!"

Then, to upload the code to deta.sh, run the deploy command:

deta deploy

Once deployed, you can then go to the website – this is the endpoint that was automatically generated by deta.sh above.

The main webpage which calls the function `def get_location()`

Once submitted, then a call is made to OpenWeatherMap

When the form is submitted from the `/` URL, the function `def get_weather()` is called to process the form. The submitted value can be accessed through `request.form['location']`.

The above code works by first providing a form through the function def get_location() which generates a very simple form through HTML:

<html>
  <body>
    <form action="weather" method="POST">
      <input name="location" type="text">
      <input type="submit" value="submit">
    </form>
  </body>
</html>

When the submit button is pressed, the browser sends a POST request to the /weather URL with the field location. Flask then calls the Python function get_weather(), which queries OpenWeatherMap.org for the weather data at the given location.
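To see what actually travels between the form and the handler, here is a small standard-library sketch. It is not Flask code: it only mimics how a form body is encoded and decoded, and the example values are made up. It also shows html.escape, which is worth applying to any user input before it is placed into an HTML string like the weather report heading.

```python
from html import escape
from urllib.parse import parse_qs, urlencode

# What the browser sends as the POST body when the form is submitted
body = urlencode({"location": "New Delhi"})
print(body)

# Roughly what a framework such as Flask does before filling request.form
form = {key: values[0] for key, values in parse_qs(body).items()}
print(form["location"])

# User input should be escaped before it is placed into HTML
malicious = "<script>alert(1)</script>"
heading = f"<h1>Weather Report for: {escape(malicious)}</h1>"
print(heading)
```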

Conclusion

This is just the tip of the iceberg of what you can do with Deta. You can also run scheduled jobs, run a NoSQL database, and use file storage. Contact us if you’d like us to cover these areas too.

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models such as Anthropic’s Claude and local models served by Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.
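To give a feel for what a tool definition looks like, the sketch below builds one by hand from a plain dataclass. It is only an approximation for illustration: instructor derives the real schema from your Pydantic model, to_tool_definition and JSON_TYPES are hypothetical names, and the exact payload instructor sends may differ in detail.

```python
import json
from dataclasses import dataclass, fields

# Rough mapping from Python types to JSON Schema types (illustrative only)
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class Person:
    name: str
    age: int
    city: str


def to_tool_definition(model: type) -> dict:
    """Build an OpenAI-style tool definition from a flat dataclass."""
    properties = {f.name: {"type": JSON_TYPES[f.type]} for f in fields(model)}
    return {
        "type": "function",
        "function": {
            "name": model.__name__,
            "description": f"Extract a {model.__name__} object",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": [f.name for f in fields(model)],
            },
        },
    }


tool = to_tool_definition(Person)
print(json.dumps(tool, indent=2))
```

Because the model is asked to call this tool, its reply arrives as a set of arguments matching the schema rather than as free text.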

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This gives you the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields will be None instead of hallucinated values. Always use Optional for fields that may not appear in the input — without it, the model will invent plausible-sounding values rather than leaving the field empty.

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $177.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $177.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the library version). The model then self-corrects. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
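The retry-with-feedback loop can be sketched in a few lines of plain Python. This is not instructor’s implementation, only an illustration of the idea: extract_with_retries and fake_model are hypothetical, and fake_model stands in for the LLM by returning a bad value until it sees an error message in the conversation.

```python
from typing import Callable, List


def validate_rating(value: int) -> int:
    """Same rule as the rating validator: accept only 1 to 5."""
    if not 1 <= value <= 5:
        raise ValueError(f"Rating must be between 1 and 5, got {value}")
    return value


def extract_with_retries(ask_model: Callable[[List[str]], int], max_retries: int = 3) -> int:
    """Call ask_model, feeding each validation error back as an extra message."""
    messages = ["Extract the rating from the review."]
    for attempt in range(1, max_retries + 1):
        candidate = ask_model(messages)
        try:
            return validate_rating(candidate)
        except ValueError as exc:
            print(f"Attempt {attempt} failed: {exc}")
            messages.append(f"Validation error: {exc}. Please correct the value.")
    raise RuntimeError(f"No valid result after {max_retries} attempts")


def fake_model(messages: List[str]) -> int:
    """Hypothetical stand-in: wrong on the first call, right once it sees feedback."""
    return 4 if len(messages) > 1 else 6


rating = extract_with_retries(fake_model)
print(rating)
```

The important design point is that the error text becomes part of the conversation, so each retry gives the model more information than the previous attempt had.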

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
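            writer.writerow([
                j.title, j.company, j.location,
                # Compare against None so a legitimate 0 is not written as blank
                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,
                ", ".join(j.technologies),
                j.is_remote
            ])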

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
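As a minimal sketch of that last idea, the snippet below swaps the hard-coded list for a generator. It assumes postings arrive as blank-line-separated blocks in a text stream; the names iter_postings, run_pipeline, and fake_extract are hypothetical, and fake_extract is only a stand-in so the sketch runs offline. In the real pipeline the extract argument would wrap the client.chat.completions.create call from extract_jobs.

```python
import io
from typing import Callable, Iterable, Iterator

def iter_postings(stream: Iterable[str]) -> Iterator[str]:
    """Yield one posting at a time from a stream of lines, split on blank lines."""
    block: list[str] = []
    for line in stream:
        if line.strip():
            block.append(line.strip())
        elif block:
            yield " ".join(block)
            block = []
    if block:  # the last posting may have no trailing blank line
        yield " ".join(block)

def run_pipeline(postings: Iterable[str], extract: Callable[[str], dict]) -> Iterator[dict]:
    """Lazily apply an extraction function to each posting."""
    for text in postings:
        yield extract(text)

def fake_extract(text: str) -> dict:
    # Hypothetical stand-in for the LLM call; it only looks at the raw text.
    return {"title": text.split(" at ")[0], "is_remote": "remote" in text.lower()}

raw = io.StringIO(
    "Senior Python Engineer at Nexaflow -- Remote-first.\n"
    "$150k-$190k.\n"
    "\n"
    "Staff ML Engineer at Quantra -- San Francisco CA.\n"
)
results = list(run_pipeline(iter_postings(raw), fake_extract))
print(results)
```

Because both functions are generators, only one posting is held in memory at a time, so the same shape should work whether the source is a file, a queue consumer, or a database cursor.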

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Pass stream=True and use response_model=Iterable[YourModel] for streaming lists, or instructor.Partial[YourModel] for streaming partial updates to a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
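A minimal sketch of the chunking step, assuming a simple word-count budget with a small overlap so that an item straddling a boundary appears whole in at least one chunk. The function name chunk_words and the default sizes are illustrative choices, not part of instructor.

```python
def chunk_words(text: str, max_words: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most max_words words, overlapping by `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

document = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_words(document, max_words=500, overlap=50)
print(len(chunks), [len(c.split()) for c in chunks])
```

Each chunk can then be passed to the same extraction call, and the per-chunk results merged afterwards. With overlapping chunks you may need to de-duplicate items that were extracted twice.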

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not: the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting) to request the schema, which makes it much harder for the model to deviate from the structure. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant. JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.
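To make the failure modes concrete, here is a small standard-library sketch. The three response strings are made-up examples of what a model might return, and regex_parse is the kind of fragile workaround that prompt-only pipelines tend to accumulate.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable
PAYLOAD = '{"name": "Alice", "age": 32}'

responses = [
    PAYLOAD,                                            # clean JSON
    FENCE + "json\n" + PAYLOAD + "\n" + FENCE,          # wrapped in markdown fences
    "Sure! Here is the extracted data:\n" + PAYLOAD,    # preamble sentence
]

def naive_parse(raw: str):
    """Return parsed JSON, or None when the raw text is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None

def regex_parse(raw: str):
    """Fragile workaround: grab the first {...} span and hope it is the payload."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    return json.loads(match.group(0)) if match else None

naive_results = [naive_parse(r) for r in responses]
regex_results = [regex_parse(r) for r in responses]
print(naive_results)
print(regex_results)
```

The naive parser only handles the first response. The regex version recovers all three here, but it still does no type checking and would break on output containing more than one JSON object.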

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.


Frequently Asked Questions

Is Deta still free for hosting Python apps?

Deta Space offers a free tier for personal use. The original Deta.sh Micros service has evolved. For free Python hosting alternatives, consider Railway, Render, PythonAnywhere, or Google Cloud Run’s free tier.

What are the best free Python hosting alternatives?

PythonAnywhere offers a free tier for web apps. Render provides free static sites and web services. Railway has a free trial. Google Cloud Run and AWS Lambda have generous free tiers for serverless deployments.

How do I deploy a Python Flask app for free?

Use Render (connect GitHub repo), PythonAnywhere (upload directly), or Railway (deploy from GitHub). Each provides different advantages for hobby and small-scale projects.

What should I consider when choosing Python hosting?

Consider free tier limits, sleep/cold-start behavior, database availability, custom domain support, deployment method, Python version support, and scaling options.

Can I host a Python bot or script for free?

Yes. PythonAnywhere allows always-on tasks. Google Cloud Functions and AWS Lambda handle event-driven scripts. For Discord/Telegram bots, Railway and Render offer free tiers suitable for small bots.

Python Await Async Tutorial with Real Examples and Simple Explanations

by Pubs | Advanced, Management, Web Scraping

Advanced

Python’s async and await keywords are among the language’s more advanced features. They help your programs run faster by making sure the CPU spends as little time as possible waiting and as much time as possible working. If you have ever watched a capable chef, you’ll know what I mean. The chef does not just follow a recipe step by step (i.e. work synchronously); while the water for the pasta comes to a boil, the chef measures out the pasta and chops tomatoes for the sauce (i.e. works asynchronously). The chef minimizes idle time and is always working on a task. That’s the same idea behind async and await.

For this tutorial, we will focus on Python 3.7, as it has some of the more modern features of async and await. We will call out some of the differences for Python 3.4 – 3.6.

What is async await in Python?

The async and await keywords let you mark which parts of your program must run sequentially and which parts may take some time, so that other parts of the program can execute while they complete. A typical example is downloading a web page: it may take a few seconds, and while the download is in progress you can execute other parts of your program.

How does async await work in Python?

Sometimes the best way to explain something is to show how you would achieve the same thing without the feature.

Continuing with the restaurant theme, suppose you are running a hamburger stall (you’re the waiter and the chef). Collecting payment from a customer and serving the final hamburger are almost instant, but the most time-consuming task is cooking the beef patty, which takes 2 seconds (one could only wish!).

See the below diagram:

Figure 1: Sequentially serving customers at a hamburger stall

In the above diagram:

Step 1: you would first get the order and collect the money from Customer 1
Step 2: you would then put a beef patty on the cook top and then wait for 2 seconds for the beef patty to cook. At the same time, Customer 1 is also waiting for 2 seconds.
Step 3: when the beef patty is cooked, you can then plate this onto a hamburger bun
Step 4: pass the final hamburger to Customer 1
Step 5: You would then start to serve Customer 2 (who has already been waiting 2 seconds for you to serve Customer 1). You can then repeat steps 2-4

With the above approach, Customer 1 would have their burger in about 2 seconds, Customer 2 approx 4 seconds, and then Customer 3 approx 6 seconds.

The equivalent code would be as follows:

import time, timeit

customer_queue = [ "C1", "C2", "C3" ]

def get_next_customer():
    return customer_queue.pop(0)    #Get the first customer from list

def cook_hamburger(customer):
    start_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Start cooking hamburger for customer")
    time.sleep(2)   # It takes 2 seconds to cook the hamburger
    end_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Finish cooking hamburger for customer.  Total {end_customer_timer-start_customer_timer} seconds\n")

def run_shop():
    while customer_queue:
        curr_customer = get_next_customer()
        cook_hamburger(curr_customer)

def main():
    print('Hamburger Shop')
    start = timeit.default_timer()
    run_shop()
    stop = timeit.default_timer()
    print(f"** Total runtime: {stop-start} seconds ***")

if __name__ == '__main__':
    main()

The code above is fairly straightforward. Customers wait in the list customer_queue, and run_shop() loops over that list. For each customer (returned by get_next_customer()), we call cook_hamburger(), which cooks the hamburger for 2 seconds and blocks until it completes.

Running this code you would get the following output:

As expected, the total runtime for 3 customers is 6 seconds since each customer is served sequentially.

Cooking Hamburgers Asynchronously and coding the event loop manually

Instead of serving the customer and cooking the hamburger for each customer, you can obviously do some of the tasks asynchronously, meaning you can start the task but you don’t have to sit and wait, you can do something else. See the following diagram where the chef/waiter is serving multiple customers and cooking at the same time. It’s not explicitly shown here, but the chef/waiter is constantly checking on the status of the next task and if a task doesn’t require his/her attention they’ll move on to the next task. This process of always looking for something to do is the equivalent of the “event loop”. The Event Loop is a programming construct where the logic is to always look for a task to execute and if there’s a task which will take some time it can release control to the next task in the loop.

Figure 2: Example of how the event loop works in a real life example – the chef/waiter is always busy!

In the above example, the following is happening:

Step 1: you would first get the order and collect the money from Customer 1
Step 2: you would then put a beef patty on the cook top and then let it cook, then immediately move on to the next customer while the patty is cooking.
Step 3: you would first get the order and collect the money from Customer 2. You would also check if the first beef patty has completed cooking yet.
Step 4: you would then put another beef patty on the cook top and then let it cook, then immediately move on to the next customer while the patty is cooking.
…
Step 5: When any of the beef patties are done, you would plate it
Step 6: Pass the plated hamburger to the respective customer. Note, in the above example we’ve assumed it to be Customer 1, but it could be any customer depending on which beef patty cooked fully first.
Step 7: When any of the beef patties are done, you would plate it and serve it

This is the equivalent of the event loop. The chef/waiter is constantly checking whether to serve a customer or check on the hamburgers that are cooking. When a hamburger is placed on the stove and needs 2 seconds to cook, the chef/waiter moves to the next task instead of waiting for the 2 seconds to pass. When the hamburger is done, it is served to the customer.

How can this be done programmatically? Glad you asked:

import timeit

customer_queue = [ "C1", "C2", "C3" ]
hamburger_queue = []

def get_next_customer():
    if customer_queue: return customer_queue.pop(0)    #Get the first customer from list
    return None 

def start_cooking_hamburger(customer):
    print( f"[{customer}]: Start cooking hamburger for customer")
    hamburger = { "customer":customer, "start_cooking_time": timeit.default_timer(), "cooked":False}
    hamburger_queue.append( hamburger )

def check_hamburger_status():
    curr_timer = timeit.default_timer()

    #Check if it's cooking, but release control
    for index, hamburger in enumerate(hamburger_queue):         
        elapsed_time = curr_timer-hamburger['start_cooking_time']
        if elapsed_time > 2: #2 seconds have passed, so the hamburger is cooked
            print( f"[{hamburger['customer']}]: Finish cooking hamburger for customer.  Total {elapsed_time} seconds\n")
            del hamburger_queue[index]  #delete from list to mark as done (later items shift down, so the next one is checked on a later pass)

def run_shop():
    while customer_queue or hamburger_queue:        #Event loop
        curr_customer = get_next_customer()
        if curr_customer: start_cooking_hamburger(curr_customer)
        check_hamburger_status()

def main():
    print('Hamburger Shop')
    start = timeit.default_timer()
    run_shop()
    stop = timeit.default_timer()
    print(f"** Total runtime: {stop-start} seconds ***")

if __name__ == '__main__':
    main()

The output of the code is as follows:

Output running asynchronously: notice the runtime of 2 seconds compared to the 6 seconds in the synchronous method.

So there’s a few things happening here:

There’s a new list called hamburger_queue[] which is keeping track of each hamburger that is being cooked
The event loop is the while customer_queue or hamburger_queue within the run_shop() function
We have a new function called start_cooking_hamburger() which records that cooking has started. Why is this needed? In the sequential version we simply waited for each task to finish. Now, since we are doing something else while we wait, we need to remember a few things so we can come back to the task
We also have a new function called check_hamburger_status() which checks the status of each hamburger being cooked (i.e. item in hamburger_queue[]), and if it is cooked (i.e. 2 seconds have passed), then it is considered complete

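You may notice in the output that Customer 3 was in fact served before Customer 2. This is because the completion order is not guaranteed. One contributing factor in this hand-written loop is that check_hamburger_status() deletes from hamburger_queue while iterating over it, so the item after a deleted one is skipped until the next pass of the event loop.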
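For comparison, the same shop can be sketched with Python’s built-in asyncio event loop instead of a hand-written one. This is a minimal illustration, not a drop-in replacement for the code above: the cook time is shortened to 0.2 seconds so it runs quickly, and COOK_SECONDS is a name introduced here.

```python
import asyncio
import timeit
from typing import List

COOK_SECONDS = 0.2  # shortened from the 2 seconds used above so the sketch runs quickly

async def cook_hamburger(customer: str) -> str:
    print(f"[{customer}]: Start cooking hamburger for customer")
    await asyncio.sleep(COOK_SECONDS)  # hands control back to the event loop while "cooking"
    print(f"[{customer}]: Finish cooking hamburger for customer")
    return customer

async def run_shop(customers: List[str]) -> List[str]:
    # gather() runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(cook_hamburger(c) for c in customers))

start = timeit.default_timer()
served = asyncio.run(run_shop(["C1", "C2", "C3"]))
elapsed = timeit.default_timer() - start
print(f"Served {served} in {elapsed:.2f} seconds")
```

Here await asyncio.sleep() plays the role of the patty on the cook top: the coroutine pauses, and the event loop moves on to the next customer. The total runtime stays close to one cook time rather than three.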

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models such as Anthropic’s Claude and local models served by Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.
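To give a feel for what a tool definition looks like, here is a hand-written approximation for the Person model from the quick example, using only the standard library. The exact payload instructor generates may differ, and the description string is made up. The point is only the shape: a JSON Schema wrapped in an OpenAI-style function tool.

```python
import json

# JSON Schema describing the Person fields (written by hand for illustration)
person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "city": {"type": "string"},
    },
    "required": ["name", "age", "city"],
}

# OpenAI-style tool definition wrapping that schema
person_tool = {
    "type": "function",
    "function": {
        "name": "Person",
        "description": "Extract a Person from the text",  # illustrative wording
        "parameters": person_schema,
    },
}

# A tool call's arguments arrive as a JSON string, which is then parsed and checked
arguments = '{"name": "Alice", "age": 32, "city": "Melbourne"}'
parsed = json.loads(arguments)
missing = [key for key in person_schema["required"] if key not in parsed]
print(json.dumps(person_tool, indent=2))
print("missing fields:", missing)
```

Because the model is asked to fill in arguments for a declared function rather than to write free text, the response comes back in a separate structured field, which is what makes the parsing step dependable.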

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This makes it the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
if job.salary_min is not None and job.salary_max is not None:  # salary fields are Optional
    print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields will be None instead of hallucinated values. Always use Optional for fields that may not appear in the input — without it, the model will invent plausible-sounding values rather than leaving the field empty.

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $222.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $222.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the library version). The model then self-corrects. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
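The retry behavior described above can be sketched conceptually with the standard library. This is not instructor’s implementation: extract_with_retries, validate_rating, and fake_model are hypothetical names, and the fake model is scripted to fail once so the loop can be seen working without an API call.

```python
import json
from typing import Callable, List

def validate_rating(data: dict) -> dict:
    """Raise ValueError when the rating falls outside 1..5 (mirrors rating_in_range)."""
    if not 1 <= data["rating"] <= 5:
        raise ValueError(f"Rating must be between 1 and 5, got {data['rating']}")
    return data

def extract_with_retries(ask_model: Callable[[List[dict]], str],
                         messages: List[dict], max_retries: int = 3) -> dict:
    """Call the model, validate, and re-ask with the error appended until valid."""
    history = list(messages)
    for _ in range(max_retries):
        raw = ask_model(history)
        try:
            return validate_rating(json.loads(raw))
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass too
            history.append({"role": "assistant", "content": raw})
            history.append({"role": "user",
                            "content": f"Validation error: {err}. Please correct it."})
    raise RuntimeError(f"No valid output after {max_retries} attempts")

# Fake model (hypothetical): wrong on the first call, corrected once it sees the error
calls = []
def fake_model(history: List[dict]) -> str:
    calls.append(len(history))
    return '{"rating": 6}' if len(history) == 1 else '{"rating": 4}'

result = extract_with_retries(fake_model, [{"role": "user", "content": "Extract the rating"}])
print(result, "after", len(calls), "calls")
```

The design point is that the validation error itself becomes part of the conversation, so the second attempt is an informed correction rather than a blind retry.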

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
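                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,  # keep a legitimate 0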
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
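To make the generator idea concrete, here is a minimal sketch that streams postings through the pipeline one at a time. It uses only the standard library: read_postings() is a hypothetical stand-in for a queue or database reader, and fake_extract() stands in for the instructor call so the example runs without an API key.

```python
# streaming_pipeline_sketch.py
from typing import Iterable, Iterator

def read_postings(source: Iterable[str]) -> Iterator[str]:
    """Yield non-empty postings one at a time (hypothetical queue/DB reader)."""
    for raw in source:
        text = raw.strip()
        if text:
            yield text

def fake_extract(text: str) -> dict:
    """Stand-in for the instructor call; a real pipeline would return a JobPosting."""
    return {"title": text.split(" at ")[0], "is_remote": "remote" in text.lower()}

def run_pipeline(texts: Iterable[str]) -> Iterator[dict]:
    # Nothing is loaded up front: each posting is read, extracted, and yielded in turn
    for text in read_postings(texts):
        yield fake_extract(text)

queue = [
    "Senior Python Engineer at Nexaflow -- Remote-first.",
    "   ",  # blank entries are skipped
    "Staff ML Engineer at Quantra -- San Francisco CA.",
]
results = list(run_pipeline(queue))
print(results)
```

Because run_pipeline() is itself a generator, memory use stays flat no matter how many postings the source produces.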

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Use response_model=Iterable[YourModel] for streaming lists, or Partial[YourModel] for streaming partial updates to a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
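One simple way to split a document is by word count with a small overlap, so that an item straddling a boundary appears whole in at least one chunk. The sketch below is an assumption-laden illustration: chunk_words() is a hypothetical helper, and the 500-word size and 50-word overlap are arbitrary values you would tune for your model's context window.

```python
# chunking_sketch.py

def chunk_words(text: str, max_words: int = 500, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if max_words <= overlap:
        raise ValueError("max_words must be larger than overlap")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the document
    return chunks

document = " ".join(f"w{i}" for i in range(1200))  # synthetic 1200-word document
chunks = chunk_words(document, max_words=500, overlap=50)
print(len(chunks), [len(c.split()) for c in chunks])
```

Each chunk can then be passed to the same extraction call, and the results merged (and de-duplicated, since overlapping regions may yield the same item twice).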

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not — the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting) to enforce the schema, so the model cannot deviate from the structure. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant — JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.
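To see why hand-rolled JSON parsing is fragile, consider the kind of cleanup code it tends to require. This is an illustrative sketch using only the standard library; parse_llm_json() is a hypothetical helper and the sample strings are invented model outputs.

```python
# fragile_json_sketch.py
import json
import re

FENCE = "`" * 3  # a markdown code fence, built here to keep this listing readable

def parse_llm_json(raw: str) -> dict:
    """Best-effort parser: grab the span from the first { to the last } and parse it."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model output")
    return json.loads(match.group(0))

clean = '{"name": "Ada", "age": 36}'
fenced = f'Sure! Here is the extracted data:\n{FENCE}json\n{clean}\n{FENCE}'
missing = '{"name": "Ada"}'

print(parse_llm_json(clean))
print(parse_llm_json(fenced))
print("age" in parse_llm_json(missing))  # parsing succeeds, but the field is silently absent
```

The regex handles the preamble and the fences, but nothing here notices the missing age field or a wrong type. That schema-level checking is the part Pydantic validation adds.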

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Async Await Code Example in Python

In the previous section we created an asynchronous version manually. Here’s the same outcome but written with the async await syntax. As you’ll notice it is very similar to the original synchronous version:

import asyncio
import timeit

customer_queue = [ "C1", "C2", "C3"  ]

def get_next_customer():
    return customer_queue.pop(0)    #Get the first customer from list

async def cook_hamburger(customer):     
    start_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Start cooking hamburger for customer")
    await asyncio.sleep(2)   # Sleep but release control
    end_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Finish cooking hamburger for customer.  Total {end_customer_timer-start_customer_timer} seconds\n")

async def run_shop():
    cooking_queue = []

    while customer_queue:
        curr_customer = get_next_customer()
        cooking_queue.append(  cook_hamburger(curr_customer)  )   #this only creates a coroutine object; nothing runs yet

    #cooking_queue[] has all the coroutines waiting to be run
    await asyncio.gather( *cooking_queue )      #Run them all concurrently

def main():
    print('Hamburger Shop')
    start = timeit.default_timer()

    asyncio.run( run_shop() )           #Start the event loop

    stop = timeit.default_timer()
    print(f"** Total runtime: {stop-start} seconds ***")

if __name__ == '__main__':
    main()

Output as follows:

Let’s walk through the code:

Firstly, async/await is supported by the asyncio library, hence the import asyncio.
The async keyword precedes both def run_shop() and def cook_hamburger(customer). In addition, run_shop() is no longer called directly; instead it is started with asyncio.run( run_shop() ). So here’s what is happening:
- The asyncio.run() function starts the so-called event loop, which keeps running until the coroutine passed to it (and everything it awaits) has completed. You must pass it the result of calling a function defined with async def, which is why run_shop() has the async prefix.
- In async def run_shop(), the code loops while there are customers in the queue, and calls cook_hamburger(curr_customer) for each customer. Calling an async function directly does not run its body; it creates a coroutine object that can be run later. That is what async tells Python: when called directly, return a coroutine.
- At the end of run_shop() there’s a call to await asyncio.gather( *cooking_queue ). There are a few things going on here:
  - The await keyword indicates that you need to wait for the work to complete, but Python can do something else in the meantime.
  - The call to gather() schedules all the coroutines given to it as a group and runs them concurrently. The order in which they finish may vary, but the results are returned in the same order as the coroutines were passed in.
  - The *cooking_queue simply unpacks the list into separate arguments. So for example if cooking_queue == [ a, b, c ] then gather( *cooking_queue ) would be the same as gather( a, b, c ).
- When await asyncio.gather( *cooking_queue ) is called, the await keyword releases control to the event loop, which starts the pending cook_hamburger() coroutines that were added to the cooking_queue list.
- Within cook_hamburger() there is also an await asyncio.sleep(2). This waits for 2 seconds, but it does not force the whole program to wait; instead the await keyword releases Python to do something else in the meantime. This is similar to step 3 in Figure 2, where the chef/waiter puts the hamburger on the grill but then, rather than waiting for the 2 seconds, does something else (i.e. serves the next customer).
The asyncio.run() function was added in Python 3.7. In older versions of Python you may see the following, which does the same job as asyncio.run( run_shop() ):
- loop = asyncio.get_event_loop()
- loop.run_until_complete(run_shop())
- loop.close()
As you will notice, this is very similar to the synchronous code that covers Figure 1 above. This is the beauty of async/await.
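Two points from the walkthrough are easy to verify in a few lines: calling an async function only builds a coroutine object, and gather() returns results in the order the coroutines were passed, regardless of which finishes first. The sketch below uses a simplified, hypothetical cook() function with shortened sleep times.

```python
# coroutine_order_sketch.py
import asyncio

async def cook(customer: str, seconds: float) -> str:
    await asyncio.sleep(seconds)
    return f"burger for {customer}"

async def main() -> list[str]:
    coro = cook("C1", 0.2)
    # Calling cook() has not run any of its body yet; it only built a coroutine object
    print(asyncio.iscoroutine(coro))
    # C2 finishes first, but gather() keeps the order in which the coroutines were passed
    return await asyncio.gather(coro, cook("C2", 0.1))

results = asyncio.run(main())
print(results)
```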

So remember, whenever there’s an await then that means python pauses at that point for that task to complete but then also releases python to do something else. That’s how the performance improvement occurs. In this example, the runtime of this is 2 seconds instead of the sequential 6 seconds!

Async Asynchronous Calling Another Async Function Code Example

Suppose you also want to call another async function once your first async function has completed. How do you go about this? Remember the rule: if you want to run something asynchronously, you have to use the await keyword, and the function you’re calling has to be defined with async def ...

To continue with the restaurant theme, suppose that after the hamburger is cooked you ask an assistant to put the hamburger into a takeaway bag which takes 1 second. This is also another task that you need not ‘block’ and wait for it to complete. Hence, this action can be put into a function which is defined as an async. Here’s what the code can look like:

import asyncio
import timeit

customer_queue = [ "C1", "C2", "C3" ]

def get_next_customer():
    return customer_queue.pop(0)    #Get the first customer from list

async def cook_hamburger(customer):     
    start_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Start cooking hamburger for customer")
    await asyncio.sleep(2)   # Sleep but release control
    end_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Finish cooking hamburger for customer.  Total {end_customer_timer-start_customer_timer} seconds")
    await put_hamburger_in_takeaway_bag( customer )

async def put_hamburger_in_takeaway_bag( customer):
    start_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Start packing hamburger")
    await asyncio.sleep(1)   # It takes 1 second to pack the hamburger
    end_customer_timer = timeit.default_timer()
    print( f"[{customer}]: Finish packing hamburger.  Total {end_customer_timer-start_customer_timer} seconds\n")

async def run_shop():
    cooking_queue = []

    while customer_queue:
        curr_customer = get_next_customer()
        cooking_queue.append( cook_hamburger(curr_customer) )   #Collect one coroutine per customer
    await asyncio.gather( *cooking_queue )      #Run them all concurrently

def main():
    print('Hamburger Shop')
    start = timeit.default_timer()
    asyncio.run( run_shop() )           #Start the event loop 
    stop = timeit.default_timer()
    print(f"** Total runtime: {stop-start} seconds ***")

if __name__ == '__main__':
    main()

The output would be:

See how once the hamburger is cooked (e.g. [C1]: Finish cooking hamburger for customer. Total 2.000924572115764 seconds), the [C1]: Start packing hamburger step follows immediately, and the packing for each customer also runs asynchronously alongside the others.
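The ordering guarantee can be checked by recording events in a list instead of printing them. This is a reduced sketch of the shop example with shortened sleep times; the events list and the simplified cook() and pack() functions are illustrative names, not part of the code above.

```python
# chained_await_sketch.py
import asyncio

events: list[str] = []

async def pack(customer: str) -> None:
    events.append(f"{customer}: start packing")
    await asyncio.sleep(0.05)
    events.append(f"{customer}: finish packing")

async def cook(customer: str) -> None:
    events.append(f"{customer}: start cooking")
    await asyncio.sleep(0.1)
    events.append(f"{customer}: finish cooking")
    await pack(customer)  # runs only after this customer's cooking is done

async def run_shop() -> None:
    await asyncio.gather(cook("C1"), cook("C2"))

asyncio.run(run_shop())
c1_events = [e for e in events if e.startswith("C1")]
print(c1_events)
print(events[:2])
```

Within one customer the steps are strictly sequential, while the first two recorded events show both customers starting to cook before either one finishes.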

Async Await Real World Example With Web Crawler in Python

One difficulty in learning async/await is that many examples rely only on asyncio.sleep(), which is helpful for understanding the concept but not very helpful when you want to build something more useful. Let’s try a more complex example where you want to get some stock data from finance.yahoo.com and then, for that same stock, also get the first 3 news articles from news.google.com from the last 24 hours.

Now one thing you will realise is that await only works with awaitable objects, such as the coroutine returned by calling a function defined with async def. So you cannot use await with just any function. Why? Well, recall that when you use await you are expecting to be given something the event loop can pause and resume, rather than a value that has already been computed. Calling a regular function runs it to completion straight away and hands back a plain result, so there is nothing left for Python to schedule.

Let’s see the synchronous version of the code:

import asyncio, requests, timeit
from bs4 import BeautifulSoup
from pygooglenews import GoogleNews

stock_list = [ "TSLA", "AAPL"]

def get_stock_price_data(stock):
    print(f"-- getting stock data for {stock}")
    data = {"stock":stock, "price_open":0, "price_close":0 }
    stock_page = requests.get( 'https://finance.yahoo.com/quote/' + stock, headers={'Cache-Control': 'no-cache',  "Pragma": "no-cache"})

    soup = BeautifulSoup(stock_page.text, 'html.parser')
    #<fin-streamer active="" class="Fw(b) Fz(36px) Mb(-4px) D(ib)" data-field="regularMarketPrice" data-pricehint="2" data-symbol="TSLA" data-test="qsp-price" data-trend="none" value="759.63">759.63</fin-streamer>
    data['price_close'] = soup.find('fin-streamer', attrs={"data-symbol":stock, "data-field":"regularMarketPrice"} ).text

    #<td class="Ta(end) Fw(600) Lh(14px)" data-test="OPEN-value">723.25</td>
    data['price_open'] = soup.find( attrs={"data-test":"OPEN-value"}).text

    return data

def get_recent_news(stock):
    print(f"-- getting news data for {stock}")
    gn = GoogleNews()
    search = gn.search(f"stocks {stock}", when = '24h')
    news = search['entries'][0:3]
    return news

def print_stock_update(stock, data, news):
    print(f"Stock:{ stock }")
    price_change = 0
    if int(float(data['price_open'])) != 0: price_change = round( 100 * ( float( data['price_close'])/float(data['price_open'])-1), 2)
    print(f"Open Price:{data['price_open']} Close Price:{data['price_close']} Change:{price_change}% ")
    print("Latest News:")
    for news_item in news:        
        print( f"{news_item.published}:{news_item.source.title} - {news_item.title}" )
    print("\n")

def process_stocks():
    for stock in stock_list:
        data = get_stock_price_data( stock )
        news=[]
        news = get_recent_news( stock )
        print_stock_update(stock, data, news)

if __name__ == '__main__':
    start_timer = timeit.default_timer()
    process_stocks()
    end_timer = timeit.default_timer()

    print(f"** Total runtime: {end_timer-start_timer} seconds ***")

Output as follows:

So what’s happening here. Well, you are looping through two stocks TSLA and AAPL, and for each stock the following happens sequentially:

A call to data = get_stock_price_data( stock ) occurs in order to make a call to requests.get( 'https://finance.yahoo.com/quote/' + stock) to get the HTML page for the TSLA stock. Effectively, this page: https://finance.yahoo.com/quote/TSLA
Next we use BeautifulSoup() in order to find the HTML snippet that contains the stock price data for the opening price and the closing price:

After the call to yahoo is complete, then there’s a call to news = get_recent_news( stock ) which uses the module pygooglenews to get the latest google news. In fact we have used this function in our previous Twitter Bot article.
Once this is all done, that output is printed out with the call to print_stock_update(stock, data, news)

Clearly this could be done asynchronously: each stock in the loop is independent of the others, and the call to get the stock data is independent of the call to get the news data. However, one thing that has to happen sequentially is print_stock_update(stock, data, news), which has to wait for both async calls to complete.

One way to try is to simply await the website download with:

stock_page = await requests.get( 'https://finance.yahoo.com/quote/' + stock, headers={'Cache-Control': 'no-cache',  "Pragma": "no-cache"})

However, you will get the following error:

The reason, as you may have guessed, is that requests.get() is not defined with the async def... construct. It is a regular blocking function that returns a Response object rather than an awaitable, and hence cannot be used with await.
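The same failure can be reproduced without any network access. In this sketch, blocking_fetch() is a hypothetical stand-in for requests.get(): a plain function that returns a plain value. The exact wording of the error message varies between Python versions, so the example only captures and prints it.

```python
# await_non_awaitable_sketch.py
import asyncio

def blocking_fetch(url: str) -> str:
    """Hypothetical stand-in for requests.get(): an ordinary, non-async function."""
    return f"<html>{url}</html>"

async def main() -> str:
    try:
        # The call itself runs fine, but its return value is not awaitable
        await blocking_fetch("https://example.com")
    except TypeError as err:
        return str(err)
    return "no error"

message = asyncio.run(main())
print(message)
```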

What you can do however is to use another ‘get’ web page module called httpx. Its AsyncClient.get() method is defined with async def... and can be called much like requests. That same line would be re-written as:

import httpx
#....

async def get_stock_price_data(stock):
    print(f"-- stock data:getting stock data for {stock}")
    data = {"stock":stock, "price_open":0, "price_close":0 }

    #*** instead of requests.get('https://finance.yahoo.com/quote/' + stock) ****
    async with httpx.AsyncClient() as client:   #closes the connection pool when done
        stock_page = await client.get( 'https://finance.yahoo.com/quote/' + stock)

    soup = BeautifulSoup(stock_page.text, 'html.parser')
    #<fin-streamer active="" class="Fw(b) Fz(36px) Mb(-4px) D(ib)" data-field="regularMarketPrice" data-pricehint="2" data-symbol="TSLA" data-test="qsp-price" data-trend="none" value="759.63">759.63</fin-streamer>
    data['price_close'] = soup.find('fin-streamer', attrs={"data-symbol":stock, "data-field":"regularMarketPrice"} ).text

    #<td class="Ta(end) Fw(600) Lh(14px)" data-test="OPEN-value">723.25</td>
    data['price_open'] = soup.find( attrs={"data-test":"OPEN-value"}).text
    print(f"-- stock data:done {stock}")
    return data

Ok, that works well. But what about the GoogleNews() code? There is no async version of this function, so how can it be called asynchronously? Well, for this you can wrap it in a new thread. A ‘thread’ is a way to run a piece of code within the same process but alongside the main execution path. It warrants a whole separate article, but for now you can think of it as finding a separate space to execute this independently of the current execution path. However, to execute this in a separate thread, there’s a bit more involved.

The code looks like the following:

### Original Version
def get_recent_news(stock):
    print(f"-- stock news:getting stock data for {stock}")
    gn = GoogleNews()
    search = gn.search(f"stocks {stock}", when = '24h') #Slow code to run asynchronously
    news = search['entries'][0:3]
    print(f"-- stock news:done {stock}")
    return news

### Asynchronous Version
async def get_recent_news(stock):
    print(f"-- stock news:getting stock data for {stock}")
    gn = GoogleNews()
    #run_in_executor() only forwards positional arguments, so wrap the keyword call in a lambda
    search = await asyncio.get_event_loop().run_in_executor( None, lambda: gn.search(f"stocks {stock}", when = '24h') )
    news = search['entries'][0:3]
    print(f"-- stock news:done {stock}")
    return news

Here we are using the await keyword to call the gn.search() function, which is now being called through asyncio.get_event_loop().run_in_executor( .. ). We are asking the asyncio module for the event loop (that piece of code that continuously checks for tasks to be done) and then asking it to run the function in a separate thread; passing None as the first argument selects the default thread pool. run_in_executor() takes the function itself followed by its positional arguments, rather than the result of calling it. It does not forward keyword arguments, which is why the when = '24h' call is wrapped in a lambda here. You will also notice that the whole function can now be defined as async def get_recent_news(stock).
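The thread-based approach can be tried without pygooglenews. In the sketch below, slow_lookup() is a hypothetical blocking function standing in for gn.search(), with a shortened delay. It shows three equivalent ways to hand blocking work to a thread: positional arguments, a lambda for keyword arguments, and asyncio.to_thread(), which is available from Python 3.9.

```python
# run_in_executor_sketch.py
import asyncio
import time
import timeit

def slow_lookup(name: str, delay: float = 0.2) -> str:
    """Hypothetical blocking function standing in for gn.search()."""
    time.sleep(delay)  # blocks the calling thread, just like a slow network call
    return f"news for {name}"

async def main() -> list[str]:
    loop = asyncio.get_running_loop()
    return await asyncio.gather(
        loop.run_in_executor(None, slow_lookup, "TSLA"),                     # positional args only
        loop.run_in_executor(None, lambda: slow_lookup("AAPL", delay=0.2)),  # keyword arg via lambda
        asyncio.to_thread(slow_lookup, "MSFT", delay=0.2),                   # forwards keywords (3.9+)
    )

start = timeit.default_timer()
results = asyncio.run(main())
elapsed = timeit.default_timer() - start
print(results)
print(f"elapsed: {elapsed:.2f} seconds")
```

Run one after another, the three lookups would take about 0.6 seconds; because each runs in its own thread, the total should stay close to the duration of a single lookup.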

How To Mix Asynchronous And Synchronous Code With Await Async in Python

Now the final problem to be solved is how do we call the two functions of get_stock_price_data( stock ) and get_recent_news(stock) to be run asynchronously, but then wait for both to finish, and THEN run the print. This is where these steps should all be grouped under one function. This is the trick to mix asynchronous and synchronous code.

In order to run a group of tasks concurrently as a group you use asyncio.gather(). However, if you want to execute a synchronous function once ALL the tasks given to asyncio.gather() are complete, then you should wrap the gather() call and the synchronous call together in another async function, and pass that function to an outer asyncio.gather():

async def process_stock_batch(stock):
    (data, news) = await asyncio.gather( get_stock_price_data( stock ), get_recent_news(stock)  )
    print('-- print:request printing')
    print_stock_update(stock, data, news) 
    print('-- print:done')

async def process_stocks():
    run_stock_list = []
    for stock in stock_list:
        run_stock_list.append(   process_stock_batch(stock) )
    await asyncio.gather( *run_stock_list )

Before we solve it for the real world example, let’s show a simpler example. Suppose we had the following:

import asyncio, timeit

async def get_web_data_A(index):
    await asyncio.sleep(1)
    print(f"Get Web Data-A[{index}] - sleep 1 second")
        
async def get_web_data_B(index):
    await asyncio.sleep(1)
    print(f"Get Web Data-B[{index}] - sleep 1 second")

async def process(index, start_timer):
    await asyncio.gather( get_web_data_A(index), get_web_data_B(index) )
    print(f"Calculate [{index}] - Elapsed time:[{timeit.default_timer()-start_timer}]")

async def run_all():
    start_timer = timeit.default_timer()
    for index in range(0,2):
        await process(index, start_timer)

if __name__ == '__main__':
    asyncio.run( run_all() )

This has the following output:

What is encouraging with this code is that even though get_web_data_A() and get_web_data_B() both sleep for 1 second, they do so asynchronously, so each index still takes just a little over 1 second. This can be seen in the Calculate [0]... output. However, the problem is that the code still iterates over each index sequentially, meaning that index 0 is processed completely first, and only once that’s done is index 1 processed. What we want instead is to run all the slow get_web_data_A() and get_web_data_B() calls first, and then run the code to calculate afterwards. This is where you need to first create the coroutines for ALL the iterations, and then call gather() on all of them. See the following code:

import asyncio, timeit

async def get_web_data_A(index):
    await asyncio.sleep(1)
    print(f"Get Web Data-A[{index}] - sleep 1 second")
        
async def get_web_data_B(index):
    await asyncio.sleep(1)
    print(f"Get Web Data-B[{index}] - sleep 1 second")

async def process(index, start_timer):
    await asyncio.gather( get_web_data_A(index), get_web_data_B(index) )
    print(f"Calculate [{index}] - Elapsed time:[{timeit.default_timer()-start_timer}]")

async def run_all_2():
    start_timer = timeit.default_timer()
    task_queue = []
    for index in range(0,2):
        task_queue.append( process(index, start_timer) )
    await asyncio.gather( *task_queue )

if __name__ == '__main__':
    asyncio.run( run_all_2() )

Here, in the function async def run_all_2() when we loop, we do not call the blocking code await asyncio.gather... inside the for loop. Instead, we are adding all the tasks to call process(..) into a list called task_queue[], and then at the end of the for loop we are calling await asyncio.gather( *task_queue ) on all tasks in one go. Hence, the output is as follows:

You’ll notice that ALL the get_web_data_A() and get_web_data_B() are being called asynchronously, and then the calculate function is called on all the available data. Hence, the elapsed time for all the iterations is only 1 second, compared to the previous 2 seconds.

So what does this mean for our real world example for getting stock data from Yahoo and then calling Google News asynchronously, and then only printing the data once both are done? Well, the same principle applies. The code is as follows:

import asyncio, httpx, timeit
from bs4 import BeautifulSoup
from pygooglenews import GoogleNews

stock_list = [ "TSLA", "AAPL"]

async def get_stock_price_data(stock):
    print(f"-- stock data:getting stock data for {stock}")
    data = {"stock":stock, "price_open":0, "price_close":0 }

    async with httpx.AsyncClient() as client:   #closes the connection pool when done
        stock_page = await client.get( 'https://finance.yahoo.com/quote/' + stock)

    soup = BeautifulSoup(stock_page.text, 'html.parser')
    #<fin-streamer active="" class="Fw(b) Fz(36px) Mb(-4px) D(ib)" data-field="regularMarketPrice" data-pricehint="2" data-symbol="TSLA" data-test="qsp-price" data-trend="none" value="759.63">759.63</fin-streamer>
    data['price_close'] = soup.find('fin-streamer', attrs={"data-symbol":stock, "data-field":"regularMarketPrice"} ).text

    #<td class="Ta(end) Fw(600) Lh(14px)" data-test="OPEN-value">723.25</td>
    data['price_open'] = soup.find( attrs={"data-test":"OPEN-value"}).text
    print(f"-- stock data:done {stock}")
    return data

async def get_recent_news(stock):
    print(f"-- stock news:getting stock data for {stock}")
    gn = GoogleNews()
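    #run_in_executor() only forwards positional arguments, so wrap the keyword call in a lambda
    search = await asyncio.get_event_loop().run_in_executor( None, lambda: gn.search(f"stocks {stock}", when = '24h') )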
    news = search['entries'][0:3]
    print(f"-- stock news:done {stock}")
    return news

def print_stock_update(stock, data, news):
    print('-- print:starting print')
    print(f"Stock:{ stock }")
    price_change = 0
    if int(float(data['price_open'])) != 0: price_change = round( 100 * ( float( data['price_close'])/float(data['price_open'])-1), 2)
    print(f"Open Price:{data['price_open']} Close Price:{data['price_close']} Change:{price_change}% ")
    print("Latest News:")
    for news_item in news:        
        print( f"{news_item.published}:{news_item.source.title} - {news_item.title}" )

    print("\n")

async def process_stock_batch(stock):
    (data, news) = await asyncio.gather( get_stock_price_data( stock ), get_recent_news(stock)  )
    print('-- print:request printing')
    print_stock_update(stock, data, news) 
    print('-- print:done')

async def process_stocks():
    run_stock_list = []
    for stock in stock_list:
        run_stock_list.append(   process_stock_batch(stock) )
    await asyncio.gather( *run_stock_list )

if __name__ == '__main__':
    start_timer = timeit.default_timer()
    asyncio.run( process_stocks() )
    end_timer = timeit.default_timer()

    print(f"** Total runtime: {end_timer-start_timer} seconds ***")

The key bit of code is in async def process_stocks(), which now iterates over each of the stocks, creates a coroutine for each, and then calls await asyncio.gather( *run_stock_list ) on all the stocks in one go. Then, in process_stock_batch(stock), we have the asynchronous call (data, news) = await asyncio.gather( get_stock_price_data( stock ), get_recent_news(stock) ), followed by the synchronous call to print_stock_update(stock, data, news) once both pieces of web data have arrived.

Conclusion

The async and await keywords are an incredibly useful feature of Python. The concept takes a bit of getting used to, but once you’ve got the hang of it, it can noticeably improve the performance of your code by making use of idle time where you are waiting for a task to complete. Remember to be sure about the sequencing, and be mindful of whether you need a follow-up activity to run once a task is completed, or whether you can simply continue to execute.

This is not easy to grasp as a beginner, but follow the example code above, and if you get stuck feel free to reach out through our email list below.

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, adding custom validation rules, configuring retries and modes, working with non-OpenAI models such as Anthropic’s Claude and local models via Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.
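The validate-and-retry idea can be sketched without an API key. The example below is a simplified illustration of the concept, not instructor’s actual implementation: fake_llm is a hypothetical stand-in that returns canned replies, and validate_person is a plain-Python substitute for Pydantic validation so the sketch runs with the standard library only.

```python
import json

def validate_person(payload):
    """Stand-in for Pydantic validation: check types, raise ValueError on failure."""
    if not isinstance(payload.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(payload.get("age"), int):
        raise ValueError(f"age must be an integer, got {payload.get('age')!r}")
    return payload

# Hypothetical stand-in for an LLM: the first reply has a bad age,
# the second reply is the corrected one.
_canned_replies = iter([
    '{"name": "Alice", "age": "thirty-two"}',
    '{"name": "Alice", "age": 32}',
])

def fake_llm(messages):
    return next(_canned_replies)

def extract_with_retries(prompt, max_attempts=3):
    messages = [{"role": "user", "content": prompt}]
    for attempt in range(1, max_attempts + 1):
        raw = fake_llm(messages)
        try:
            # Parse and validate; malformed JSON also raises a ValueError subclass
            return validate_person(json.loads(raw)), attempt
        except ValueError as exc:
            # Feed the error back so the next attempt can self-correct
            messages.append({"role": "assistant", "content": raw})
            messages.append({"role": "user", "content": f"Fix this error: {exc}"})
    raise RuntimeError(f"No valid output after {max_attempts} attempts")

person, attempts = extract_with_retries("Alice is 32 years old.")
print(person, "after", attempts, "attempts")
```

The design point is that the validation error itself becomes part of the conversation, so the model sees exactly what was wrong with its previous answer.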

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This makes it the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields will be None instead of hallucinated values. Always use Optional for fields that may not appear in the input — without it, the model will invent plausible-sounding values rather than leaving the field empty.
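One consequence is worth handling in your own code: the Salary print line in the example above uses the :, format specifier, which raises a TypeError if either field is None. A small helper, sketched here under the hypothetical name format_salary, tolerates missing values.

```python
from typing import Optional

def format_salary(salary_min: Optional[int], salary_max: Optional[int]) -> str:
    """Render a salary range, tolerating missing (None) bounds."""
    if salary_min is None and salary_max is None:
        return "not listed"
    if salary_min is None:
        return f"up to ${salary_max:,}"
    if salary_max is None:
        return f"from ${salary_min:,}"
    return f"${salary_min:,} - ${salary_max:,}"

print(format_salary(140000, 175000))  # $140,000 - $175,000
print(format_salary(None, None))      # not listed
print(format_salary(90000, None))     # from $90,000
```

In the schema example you would call it as print(f"Salary: {format_salary(job.salary_min, job.salary_max)}").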

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $222.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $222.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the instructor version). The model then self-corrects. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
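            writer.writerow([
                j.title, j.company, j.location,
                # Compare against None so a legitimate 0 is not written as blank
                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,
                ", ".join(j.technologies),
                j.is_remote
            ])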

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
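A minimal sketch of the generator idea, assuming postings are stored one per line in a text file. The names iter_postings and process_stream are hypothetical, and the extractor is passed in as a plain callable so the sketch runs without an API key; in the pipeline above it would wrap the client.chat.completions.create call.

```python
import os
import tempfile
from typing import Callable, Iterable, Iterator

def iter_postings(path: str) -> Iterator[str]:
    """Yield one non-empty posting per line without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line

def process_stream(texts: Iterable[str], extract: Callable[[str], dict]) -> Iterator[dict]:
    """Apply the extractor lazily, one posting at a time."""
    for text in texts:
        yield extract(text)

# Demo with a throwaway file and a trivial stand-in extractor
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("Python Engineer at Nexaflow\n\nData Scientist at BioMetrics Ltd\n")

def fake_extract(text: str) -> dict:
    title, company = text.split(" at ")
    return {"title": title, "company": company}

records = list(process_stream(iter_postings(tmp.name), fake_extract))
os.remove(tmp.name)
print(records)
```

Because both functions are generators, only one posting is held in memory at a time, which is what makes the approach suitable for large inputs.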

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.
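One way to handle failures yourself is sketched below with a hypothetical safe_extract wrapper. It catches a generic Exception because the exact exception class raised after retries are exhausted depends on the instructor version; check the documentation and narrow the except clause accordingly. The extractor here is a stand-in so the example runs offline.

```python
from typing import Callable, List, Optional, Tuple

def safe_extract(extract: Callable[[str], dict], text: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (result, None) on success or (None, error message) on failure."""
    try:
        return extract(text), None
    except Exception as exc:  # assumption: narrow this to the library's exception type
        return None, f"{type(exc).__name__}: {exc}"

def fake_extract(text: str) -> dict:
    # Stand-in for a client.chat.completions.create(..., max_retries=1) call
    if "salary" not in text:
        raise ValueError("no salary found")
    return {"text": text}

texts = ["salary: 100k", "no pay info here"]
good: List[dict] = []
failed: List[Tuple[str, str]] = []
for t in texts:
    result, error = safe_extract(fake_extract, t)
    if error is None:
        good.append(result)
    else:
        failed.append((t, error))

print(len(good), "succeeded;", len(failed), "failed")
```

Collecting failures in a separate list lets you log them or re-run them later with a stronger model instead of paying for automatic retries on every call.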

Does instructor support streaming responses?

Yes. Pass stream=True together with response_model=Iterable[YourModel] to stream a list of objects, or with instructor.Partial[YourModel] to stream partial updates to a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
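A simple word-based chunker is sketched below. The function name chunk_words and the default sizes are assumptions for illustration; word counts only approximate token counts, so leave headroom below your model’s limit. The overlap keeps an item that straddles a chunk boundary visible in both chunks.

```python
from typing import List

def chunk_words(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most chunk_size words, with overlapping edges."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

doc = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(doc, chunk_size=5, overlap=1)
for c in chunks:
    print(c)
```

Each chunk can then be passed to the same extraction call, with the results merged and de-duplicated afterwards.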

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not — the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting) to enforce the schema, so the model cannot deviate from the structure. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant — JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Frequently Asked Questions

What is async/await in Python?

async def defines a coroutine function and await pauses execution until an asynchronous operation completes. This enables concurrent I/O operations without threading, using the asyncio event loop.

When should I use async/await instead of threading?

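Use async/await for I/O-bound tasks like network requests and database queries with many concurrent connections. Use threading for blocking I/O in libraries that do not support async. For CPU-bound work, use multiprocessing instead, because the global interpreter lock in standard CPython builds prevents threads from running Python bytecode in parallel.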

How do I run multiple async tasks concurrently?

Use asyncio.gather(task1(), task2()) to run multiple coroutines concurrently. Use asyncio.create_task() to schedule without immediately waiting.
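The difference between the two is easiest to see side by side. In this small sketch, fetch is a hypothetical coroutine that simulates I/O with asyncio.sleep.

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)  # simulated I/O wait
    return f"{name} done"

async def main():
    # gather: start both coroutines and wait for both; results keep argument order
    both = await asyncio.gather(fetch("a", 0.1), fetch("b", 0.05))

    # create_task: schedule now, keep working, await the result later
    task = asyncio.create_task(fetch("c", 0.05))
    other = "other work finished"  # runs before the task is awaited
    late = await task
    return both, other, late

both, other, late = asyncio.run(main())
print(both, other, late)
```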

What does ‘coroutine was never awaited’ mean?

You called an async function without await. Async functions return coroutine objects that must be awaited. Add await before the call or use asyncio.run() from synchronous code.

Can I mix synchronous and asynchronous code?

Yes. Use asyncio.run() to call async from sync. Use loop.run_in_executor() to run blocking functions inside async code without blocking the event loop.
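Both directions are shown in the short sketch below. The blocking_io function is a hypothetical stand-in for any blocking call, such as a library without async support.

```python
import asyncio
import time

def blocking_io(seconds):
    time.sleep(seconds)  # blocking call that would otherwise stall the event loop
    return f"slept {seconds}s"

async def main():
    loop = asyncio.get_running_loop()
    # Passing None selects the default thread pool executor
    return await asyncio.gather(
        loop.run_in_executor(None, blocking_io, 0.2),
        loop.run_in_executor(None, blocking_io, 0.2),
    )

# Synchronous code entering async code
results = asyncio.run(main())
print(results)
```

The two blocking calls run in separate worker threads, so the event loop stays free to service other coroutines while they sleep.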

How to Build a Twitter Bot with Python and Twitter API v2

by Pubs | APIs, Automation, Beginner

Beginner

Twitter bots can be super useful for automating interactions on social media, both to build and grow engagement and to automate routine tasks. There have been many changes to the Twitter developer account, and sometimes it’s unclear how to even create a tweet bot. This article walks through, step by step, how to create a Twitter bot with the latest Twitter API v2 and provides code you can copy and paste into your next project. We end with a more useful bot that automatically posts articles about Python.

In a nutshell, a Twitter bot is code that runs on your own compute. It can either be triggered by a Twitter webhook, which Twitter calls when a given event occurs (not covered here), or run periodically to read and send tweets (covered in this article). Either way, the two approaches have a lot in common. In this article we will walk through how to read tweets, and then how to send tweets containing Google News articles related to Python!

Step 1: Sign up for Developer program

If you haven’t already, sign in or sign up for a Twitter account at twitter.com. Make sure your Twitter account has an email address allocated to it (a Twitter account can be created with just a mobile phone number, so yours may not have one).

Next go to developer.twitter.com and sign up for the developer program (yes, you need to sign up for a second time). This enables you to create applications.

First you’ll need to answer some questions on the purpose of the developer account. You can choose “Make a Bot”.

Next you will need to agree to the terms and conditions, and then a verification email will be sent to your email address from your twitter account.

When you click on the email to verify your account, you can then enter your app name. This is an internal name and something that will make it easy for you to reference.

Once you click on keys, you will then be given a set of security token keys like below. Please copy them in a safe place as your python code will need to use them to access your specific bot. If you do lose your keys, or someone gets access to them for some reason, you can generate new keys from your developer.twitter.com console.

You will be given three keys:

API Key (think of this like a username)
API Key Secret (think of this like a password)
Bearer Token (used for read queries such as getting latest tweets)

The API Key and API Key Secret are the two you will use most. The Bearer Token is only needed for certain types of requests, such as the read-only queries in Step 2.

At the bottom of the screen you’ll see a “Skip to Dashboard”, when you click on that you’ll then see the overview of your API metrics.

Within this screen you can see the limits of the number of calls per month for example and how much you have already consumed.

Next, click on the project to generate the access tokens. With the keys created so far you can only read tweets; you cannot create them yet.

After clicking on the project, choose the “keys and tokens” tab, and at the bottom you can generate the “Access Tokens”. On this screen you can also regenerate the API Keys and Bearer Token you created earlier, in case your keys were compromised or you lost them.

Just like before, generate the keys and copy them.

By now, you have 5 security tokens:

API Key – also known as the Consumer Key (think of this like a username)
API Key Secret – also known as the Consumer Secret (think of this like a password)
Bearer Token (used for read queries such as getting latest tweets)
Access Token (‘username’ to allow you to create tweets)
Access Token Secret (‘password’ to allow you to create tweets)
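Rather than pasting the five tokens into your source code, one option is to read them from environment variables. The sketch below is an illustration only: the variable names (TWITTER_API_KEY and so on) and the load_twitter_credentials helper are hypothetical, not names required by Twitter or any library.

```python
import os

# Hypothetical environment variable names, one per token from the list above
REQUIRED = [
    "TWITTER_API_KEY",
    "TWITTER_API_KEY_SECRET",
    "TWITTER_BEARER_TOKEN",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
]

def load_twitter_credentials(env=os.environ):
    """Return the five tokens as a dict, or raise if any are missing or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}

# Demo with a fake environment so no real secrets are needed
fake_env = {name: f"dummy-{i}" for i, name in enumerate(REQUIRED)}
creds = load_twitter_credentials(fake_env)
print(sorted(creds))
```

Failing fast with a list of the missing names makes a misconfigured deployment obvious before any request is sent.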

Step 2: Test your twitter API query

Now that you have the API keys, you can do some tests. If you are using a linux based machine you can use the curl command to do a query. Otherwise, you can use a site such as https://reqbin.com/curl to do an online curl request.

Here’s a simple example that gets the most recent tweets. It uses the https://api.twitter.com/2/tweets/search/recent endpoint, which requires a query parameter. The query supports a range of operators (see the Twitter query documentation for the full list).

curl --request GET 'https://api.twitter.com/2/tweets/search/recent?query=from:pythonhowtocode' --header 'Authorization: Bearer <your bearer token from step 1>'

The output is as follows:

{
    "data": [{
        "id": "1523251860110405633",
        "text": "See our latest article on THE complete beginner guide on creating a #discord #bot in #python \n\nEasily add this to your #100DaysOfCode  #100daysofcodechallenge #100daysofpython \n\nhttps://t.co/4WKvDVh1g9"
    }],
    "meta": {
        "newest_id": "1523251860110405633",
        "oldest_id": "1523251860110405633",
        "result_count": 1
    }
}

Here’s a much more complex example. This includes the following parameters:

%23 – the URL-encoded form of #, used to search for hashtags. The example below searches for the hashtag #python (case insensitive)
%20 – the URL-encoded form of a space, which separates filters and combines them with an AND operation
-is:retweet – this excludes retweets. The ‘-‘ sign preceding the is negates the actual filter
-is:reply – this excludes replies. The ‘-‘ sign preceding the is negates the actual filter
max_results=20 – an integer that sets the maximum number of results returned, in this case 20
expansions=author_id – this includes each author’s internal Twitter id and username in an includes section at the bottom of the returned JSON
tweet.fields=public_metrics,created_at – returns the interaction metrics such as number of likes and number of retweets, as well as the time (in UTC) when the tweet was created
user.fields=created_at,location – this returns when the user account was created and the user’s self-reported location from their profile.

curl --request GET 'https://api.twitter.com/2/tweets/search/recent?query=%23python%20-is:retweet%20-is:reply&max_results=20&expansions=author_id&tweet.fields=public_metrics,created_at&user.fields=created_at,location' --header 'Authorization: Bearer <Your Bearer Token from Step 1>'
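If you would rather not escape the query by hand, the same URL can be assembled with Python’s standard library. This is a small sketch using urllib.parse.urlencode; the BASE_URL and params names are just illustrative.

```python
from urllib.parse import urlencode, quote

BASE_URL = "https://api.twitter.com/2/tweets/search/recent"

params = {
    "query": "#python -is:retweet -is:reply",
    "max_results": 20,
    "expansions": "author_id",
    "tweet.fields": "public_metrics,created_at",
    "user.fields": "created_at,location",
}

# quote_via=quote encodes spaces as %20 (the default would use '+');
# safe=":," leaves colons and commas readable, as in the curl example
url = BASE_URL + "?" + urlencode(params, quote_via=quote, safe=":,")
print(url)
```

The printed URL matches the one passed to curl above, with # encoded as %23 and each space as %20.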

The result looks like the following. Notice that the user details are in the includes section at the bottom, and you can link each tweet to its user through the author_id field.

{
    "data": [{
        "id": "1523688996676812800",
        "text": "NEED a #JOB?\nSign up now https://t.co/o7lVlsl75X\nFREE. NO MIDDLEMEN\n#Jobs #AI #DataAnalytics #MachineLearning #Python #JavaScript #WomenWhoCode #Programming #Coding #100DaysofCode #DEVCommunity #gamedev #gamedevelopment #indiedev #IndieGameDev #Mobile #gamers #RHOP #BTC #ETH #SOL https://t.co/kMYD2417jR",
        "author_id": "1332714745871421443",
        "public_metrics": {
            "retweet_count": 3,
            "reply_count": 0,
            "like_count": 0,
            "quote_count": 0
        },
        "created_at": "2022-05-09T15:39:00.000Z"
    },
....
  }],
    "includes": {
        "users": [{
            "name": "Job Preference",
            "id": "1332714745871421443",
            "username": "JobPreference",
            "created_at": "2020-11-28T15:56:01.000Z"
        }, 
....
}
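Linking each tweet to its author can be sketched in a few lines of Python. The response dictionary below is a trimmed copy of the output above (the tweet text is shortened), and users_by_id is just an illustrative name.

```python
# Trimmed sample of the JSON response shown above
response = {
    "data": [
        {"id": "1523688996676812800",
         "author_id": "1332714745871421443",
         "text": "NEED a #JOB? (text shortened)"},
    ],
    "includes": {
        "users": [
            {"id": "1332714745871421443",
             "name": "Job Preference",
             "username": "JobPreference"},
        ]
    },
}

# Index users by id so each tweet's author can be looked up directly
users_by_id = {u["id"]: u for u in response.get("includes", {}).get("users", [])}

lines = []
for tweet in response.get("data", []):
    author = users_by_id.get(tweet["author_id"], {})
    lines.append(f"@{author.get('username', 'unknown')}: {tweet['text']}")

print("\n".join(lines))
```

Using .get() with defaults keeps the loop from failing when a response has no includes section or an author is missing from it.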

Step 3: Reading tweets with python code

Building on the tests in Step 2, it takes only a small extra step to convert the query to Python code. We’ll first show how with the requests module, and afterwards a simpler way with the tweepy library. The requests version is a direct translation of the curl command. Here’s a structured version of the code, with the logic encapsulated in a class.

import requests, json
from  urllib.parse import quote
from pprint import pprint

class TwitterBot():
    URL_SEARCH_RECENT = 'https://api.twitter.com/2/tweets/search/recent'
    def __init__(self, bearer_key):
        self.bearer_key = bearer_key

    def search_recent(self, query, include_retweets=False, include_replies=False):
        url = self.URL_SEARCH_RECENT + "?query=" + quote(query)
        if not include_retweets: url += quote(' ')+'-is:retweet'
        if not include_replies: url += quote(' ')+'-is:reply'

        url += '&max_results=20&expansions=author_id&tweet.fields=public_metrics,created_at&user.fields=created_at,location' 
        
        headers = {'Authorization': 'Bearer ' + self.bearer_key }

        r = requests.get(url, headers = headers)
        r.encoding = r.apparent_encoding  # Use the detected encoding so Unicode characters decode correctly
        return json.loads(r.text)

#create an instance and pass in your Bearer Token
t = TwitterBot('<Insert your Bearer Token from Step 1>')
pprint( t.search_recent( '#python') )

The above code is fairly straightforward and does the following:

TwitterBot class – this class encapsulates the logic to send the API requests
TwitterBot.search_recent – this method takes the query string, escapes any special characters, and then calls requests.get() on the https://api.twitter.com/2/tweets/search/recent endpoint
pprint() – this simply prints the output in a more readable format

This is the output:

However, there is a simpler way which is to use tweepy.

pip install tweepy

Next you can use the tweepy module to search recent tweets:

import tweepy

client = tweepy.Client(bearer_token='<insert your token here from previous step>')

query = '#python -is:retweet -is:reply' #exclude retweets and replies with '-'
tweets = client.search_recent_tweets(   query=query, 
                                        tweet_fields=['public_metrics', 'context_annotations', 'created_at'], 
                                        user_fields=['username','created_at','location'],
                                        expansions=['entities.mentions.username','author_id'],
                                        max_results=10)
#The user details are in the 'includes' dictionary of the response
user_data = {}
for raw_user in tweets.includes['users']:
    user_data[ raw_user.id ] = raw_user

for index, tweet in enumerate(tweets.data):
    print(f"[{index}]::@{user_data[tweet.author_id]['username']}::{tweet.created_at}::{tweet.text.strip()}\n")
    print("------------------------------------------------------------------------------")

Output as follows:

Please note that after calling the API a few times, your count of consumed tweets will have increased and you may hit the limit. You can always visit the dashboard at https://developer.twitter.com/en/portal/dashboard to see how much of your quota has been consumed. Notice that the quota counts the number of tweets returned, not the number of API calls, so it can be used up pretty quickly.

Step 4: Sending out a tweet

So far we’ve only been reading tweets. In order to send a tweet you can use the create_tweet() function of tweepy.

client = tweepy.Client( consumer_key= "<API key from above - see step 1>",
                        consumer_secret= "<API Key secret - see step 1>",
                        access_token= "<Access Token - see step 1>",
                        access_token_secret= "<Access Token Secret - see step 1>")


# Replace the text with whatever you want to Tweet about
response = client.create_tweet(text='A little girl walks into a pet shop and asks for a bunny. The worker says” the fluffy white one or the fluffy brown one”? The girl then says, I don’t think my python really cares.')

print(response)

Output from Console:

Output from Twitter:

How to Send Automated Tweets About the Latest News

To make this a more useful bot than one that simply tweets static text, we’ll make it tweet about the latest Python news.

To search for news, you can use the Python library pygooglenews.

pip install pygooglenews

The library searches the Google News RSS feed and was developed by Artem Bugara. You can read his full article on how he developed the Google News library. You pass in a keyword and, optionally, a time horizon. Here’s an example that finds the latest Python articles from the last 12 hours.

from pygooglenews import GoogleNews
gn = GoogleNews()
search = gn.search('python programming', when = '12h')

for article in search['entries']:
    print(article.title)
    print(article.published)
    print(article.source.title)
    print('-'*80)  #string multiplier - show '-' 80 times

Here’s the output:

So, the idea is to have the Twitter bot post a random article related to Python programming. The gn.search() function returns a dictionary whose entries item holds the list of articles. We will simply pick a random one and construct the tweet from the article title and the link to the article.

import tweepy
from pygooglenews import GoogleNews
from random import randint

client = tweepy.Client( consumer_key= "<your consumer/API key - see step 1>",
                        consumer_secret= "<your consumer/API secret - see step 1>",
                        access_token= "<your access token key - see step 1>",
                        access_token_secret= "<your access token secret - see step 1>")

gn = GoogleNews()
search = gn.search('python programming', when = '24h')

#Find random article in last 24 hours using randint between index 0 and the last index
article = search['entries'][ randint( 0, len( search['entries'])-1 ) ]

#construct the tweet text
tweet_text =  f"In python news: {article.title}.  See full article: {article.link}.  #python #pythonprogramming" 

#Fire off the tweet!
response = client.create_tweet( text=tweet_text )  # create_tweet() only accepts keyword arguments
print(response)

Output from the console on the return result:

And, most importantly, here’s the tweet from our @pythonhowtocode! Twitter automatically pulled in the article image.

This has currently been scheduled as a daily background job!
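
One thing the script above does not check is length. A standard tweet is limited to 280 characters, and Twitter counts every link as 23 characters regardless of its real length. Here is a minimal sketch of a helper that trims the article title to fit, assuming plain Latin text (some characters, such as emoji, count double). The name build_tweet and the constants are illustrative and not part of tweepy:

```python
TWEET_LIMIT = 280   # maximum length of a standard tweet
URL_LENGTH = 23     # Twitter counts any wrapped link as 23 characters

def build_tweet(title, link, hashtags="#python #pythonprogramming"):
    """Build the tweet text, shortening the title if the tweet would be too long."""
    prefix = "In python news: "
    middle = ". See full article: "
    suffix = " " + hashtags
    # Characters left for the title once the fixed parts and the link are counted
    budget = TWEET_LIMIT - len(prefix) - len(middle) - URL_LENGTH - len(suffix)
    if len(title) > budget:
        title = title[:budget - 3].rstrip() + "..."
    return f"{prefix}{title}{middle}{link}{suffix}"

print(build_tweet("Example headline", "https://example.com/article"))
```

In the bot, the result would be passed on as client.create_tweet(text=build_tweet(article.title, article.link)).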

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models via LiteLLM, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in
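
The retry behaviour described above can be sketched without any LLM at all. In the toy version below, fake_model stands in for the API call and validate_person stands in for Pydantic. Both are hypothetical stand-ins, and this illustrates the idea rather than instructor's actual implementation:

```python
import json

def fake_model(messages):
    """Stand-in for an LLM call: returns a bad age first, a fixed one after feedback."""
    got_feedback = any(
        m["role"] == "user" and "Validation error" in m["content"] for m in messages
    )
    age = 32 if got_feedback else "thirty-two"
    return json.dumps({"name": "Alice", "age": age})

def validate_person(raw):
    """Parse the reply and check field types; raise ValueError with a readable message."""
    data = json.loads(raw)
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(data.get("age"), int):
        raise ValueError(f"age must be an integer, got {data.get('age')!r}")
    return data

def extract_with_retries(prompt, max_attempts=3):
    messages = [{"role": "user", "content": prompt}]
    for attempt in range(1, max_attempts + 1):
        reply = fake_model(messages)
        try:
            return validate_person(reply), attempt
        except ValueError as err:
            # Feed the error back so the next attempt can correct it
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user", "content": f"Validation error: {err}"})
    raise RuntimeError(f"No valid output after {max_attempts} attempts")

person, attempts = extract_with_retries("Alice is 32 years old.")
print(person, "after", attempts, "attempts")
```

The first reply fails the type check, the error text is appended to the conversation, and the second reply passes, so the loop returns on attempt 2.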

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This makes it the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
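# The salary fields are Optional, so guard against None before formatting
if job.salary_min is not None and job.salary_max is not None:
    print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")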
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that the salary fields may be absent. When the source text does not mention a salary, these fields can come back as None rather than a guessed value. Use Optional with a None default for any field that may not appear in the input. Without it, the field is required, and the model is pushed to invent a plausible-sounding value rather than leave it empty.

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $177.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $177.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.
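
Type validation alone does not catch numbers that disagree with each other, such as a stated total that does not match the line items. Here is a minimal sketch of that cross-check, using plain dataclasses in place of the Pydantic models and a hypothetical check_total helper (in a real schema, the same logic could live in a Pydantic validator):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float

@dataclass
class Invoice:
    vendor: str
    items: List[LineItem]
    total: float

def check_total(invoice, tolerance=0.01):
    """Return (matches, computed_total) comparing line items with the stated total."""
    computed = sum(item.quantity * item.unit_price for item in invoice.items)
    return abs(computed - invoice.total) <= tolerance, round(computed, 2)

invoice = Invoice(
    vendor="Example Vendor",
    items=[LineItem("Widget", 3, 45.00), LineItem("Gadget", 1, 12.00)],
    total=150.00,  # deliberately wrong: 3 * 45 + 12 = 147
)
ok, computed = check_total(invoice)
print(f"Stated: {invoice.total:.2f}, computed: {computed:.2f}, match: {ok}")
```

A mismatch can mean the model misread a line item, but it can also mean the source document itself is inconsistent, so it is usually worth flagging for review rather than silently correcting.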

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message, along the lines of “Rating must be between 1 and 5, got 6”, together with a request to correct the output. The model then gets a chance to self-correct. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.
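
To make the difference between the modes more concrete, here is roughly what parsing a markdown-wrapped reply involves. This is a hand-rolled sketch with a hypothetical extract_json_block helper, not the code instructor runs internally:

```python
import json
import re

FENCE = "`" * 3  # three backticks, the markdown code-fence marker

def extract_json_block(reply):
    """Pull the first fenced JSON block out of a reply, or parse the reply as-is."""
    pattern = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(reply)
    payload = match.group(1) if match else reply
    return json.loads(payload.strip())

# A reply with a chatty preamble and the JSON wrapped in a markdown fence
reply = (
    "Sure! Here is the data:\n"
    + FENCE + "json\n"
    + '{"name": "Tokyo", "country": "Japan"}\n'
    + FENCE
)
print(extract_json_block(reply))
```

With tool calling, none of this text wrangling is needed because the arguments arrive as a separate structured field, which is one reason it is the default.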

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
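                # Test for None explicitly: "or" would also blank out a real 0 (e.g. 0 years)
                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,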
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
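
The generator idea mentioned above might look like this. The input layout (postings separated by blank lines) and the names iter_postings and process are assumptions for illustration. The extract argument is whatever function turns one text into a structured record, such as a wrapper around the instructor call:

```python
import io
from typing import Callable, Iterable, Iterator, List

def iter_postings(stream: Iterable[str]) -> Iterator[str]:
    """Yield one posting at a time from a stream where blank lines separate postings."""
    buffer: List[str] = []
    for line in stream:
        if line.strip():
            buffer.append(line.strip())
        elif buffer:
            yield " ".join(buffer)
            buffer = []
    if buffer:  # the last posting may have no trailing blank line
        yield " ".join(buffer)

def process(postings: Iterable[str], extract: Callable[[str], dict]) -> Iterator[dict]:
    """Run the extraction lazily so only one posting is held in memory at a time."""
    for text in postings:
        yield extract(text)

# Demo with an in-memory "file" and a stub extractor instead of a real LLM call
sample = io.StringIO(
    "Senior Python Engineer at Nexaflow\nRemote-first\n\nStaff ML Engineer at Quantra\n"
)
for record in process(iter_postings(sample), lambda text: {"chars": len(text), "text": text}):
    print(record)
```

Because both functions are generators, the same code works whether the stream is a small test string, an open file, or a wrapper around a message queue.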

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Pass stream=True together with response_model=Iterable[YourModel] to stream a list of objects, or with response_model=Partial[YourModel] (Partial is imported from instructor) to stream partial updates to a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
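
A minimal sketch of the chunking step, assuming word count is an acceptable rough proxy for tokens. The chunk_words helper and the overlap between chunks are illustrative choices. The overlap reduces the chance that an item straddling a boundary is cut in half, at the cost of sometimes extracting the same item twice:

```python
from typing import List

def chunk_words(text: str, max_words: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into chunks of at most max_words words, overlapping by `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

document = " ".join(f"word{i}" for i in range(2500))
chunks = chunk_words(document, max_words=1000, overlap=100)
print(len(chunks), [len(c.split()) for c in chunks])  # 3 [1000, 1000, 700]
```

Each chunk can then go through the same extraction call, with the results merged and deduplicated afterwards.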

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not: the model adds markdown fences, writes a preamble sentence, or omits fields. By default instructor uses tool calling rather than prompting alone to request the schema, which makes deviations from the structure far less likely. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant. JSON prompting is fine for experiments, but instructor is the better tool for pipelines where data quality matters.

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Further Reading: For more details, see the Python HTTP client documentation.

Pro Tips for Building a Better Twitter Bot

1. Respect Rate Limits with Exponential Backoff

The Twitter API enforces strict rate limits. Instead of crashing when you hit one, implement exponential backoff to retry gracefully. Wrap your API calls in a retry function that doubles the wait time after each failed attempt, starting from 1 second up to a maximum of 64 seconds. This keeps your bot running reliably without getting your credentials revoked.

# rate_limit_handler.py
import time
import requests

def api_call_with_backoff(url, headers, max_retries=5):
    wait_time = 1
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
            wait_time = min(wait_time * 2, 64)
        else:
            response.raise_for_status()
    raise Exception("Max retries exceeded")

Output:

Rate limited. Waiting 1s...
Rate limited. Waiting 2s...
{'data': [{'id': '1234567890', 'text': 'Hello world'}]}

2. Never Hardcode API Keys

Store your API credentials in environment variables or a .env file, never in your source code. If you accidentally push hardcoded keys to a public GitHub repo, bots will find and abuse them within minutes. Use the python-dotenv library to load credentials from a .env file that you add to your .gitignore.

# secure_credentials.py
import os
from dotenv import load_dotenv

load_dotenv()

BEARER_TOKEN = os.getenv("TWITTER_BEARER_TOKEN")
API_KEY = os.getenv("TWITTER_API_KEY")
API_SECRET = os.getenv("TWITTER_API_SECRET")

if not BEARER_TOKEN:
    raise ValueError("TWITTER_BEARER_TOKEN not set in .env file")

3. Add Logging Instead of Print Statements

Replace print() calls with Python’s built-in logging module. Logging gives you timestamps, severity levels, and the ability to write to files — essential for debugging a bot that runs unattended. When your bot tweets something unexpected at 3 AM, logs are the only way to figure out what happened.

# bot_with_logging.py
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.FileHandler("bot.log"),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)
logger.info("Bot started successfully")
logger.warning("Approaching rate limit: 14/15 requests used")
logger.error("Failed to post tweet: 403 Forbidden")

Output:

2026-03-26 10:15:30 [INFO] Bot started successfully
2026-03-26 10:15:31 [WARNING] Approaching rate limit: 14/15 requests used
2026-03-26 10:15:32 [ERROR] Failed to post tweet: 403 Forbidden

4. Track Posted Content to Avoid Duplicates

Bots that post the same content repeatedly get flagged and suspended. Keep a simple record of what you have already tweeted using a JSON file or SQLite database. Before posting, check if the content has been posted before. This is especially important for news bots that might encounter the same story from multiple sources.
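
A minimal sketch of the JSON-file approach. The PostedLog class, the file name, and the sample headline are illustrative. Hashing a normalised version of the text means small differences in case or spacing still count as the same story:

```python
import hashlib
import json
import os
import tempfile

class PostedLog:
    """Remember fingerprints of posted tweets in a small JSON file."""

    def __init__(self, path):
        self.path = path
        self.seen = set()
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.seen = set(json.load(f))

    @staticmethod
    def fingerprint(text):
        # Normalise case and whitespace so trivial variations match
        normalised = " ".join(text.lower().split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def already_posted(self, text):
        return self.fingerprint(text) in self.seen

    def record(self, text):
        self.seen.add(self.fingerprint(text))
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(sorted(self.seen), f)

# Demo using a temporary file; a real bot would use a fixed path
path = os.path.join(tempfile.mkdtemp(), "posted.json")
log = PostedLog(path)
headline = "Python tops the language rankings"  # sample headline
if not log.already_posted(headline):
    log.record(headline)  # in the bot, post the tweet first, then record it
print(log.already_posted("python tops  the language RANKINGS"))  # True
print(PostedLog(path).already_posted(headline))  # True, the record survives a restart
```

For a news bot it is often more robust to fingerprint the article title or link rather than the full tweet text, since the same story can arrive with different wording around it.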

5. Use a Scheduler for Consistent Posting

Instead of running your bot in a loop with time.sleep(), use a proper scheduler like schedule or APScheduler. Schedulers handle timing more reliably, support cron-like expressions, and make it easy to run different tasks at different intervals. For production bots, consider using system-level scheduling with cron (Linux) or Task Scheduler (Windows).
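
At the heart of any daily scheduler is working out when the next run is due. Here is a small standard-library sketch with a hypothetical next_run helper. It uses naive datetimes and ignores daylight saving changes, which the schedule and APScheduler libraries (or cron) handle for you:

```python
from datetime import datetime, timedelta

def next_run(now: datetime, hour: int, minute: int = 0) -> datetime:
    """Return the next occurrence of hour:minute strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return candidate

now = datetime(2026, 3, 26, 10, 15, 30)
print(next_run(now, hour=9))   # 2026-03-27 09:00:00
print(next_run(now, hour=18))  # 2026-03-26 18:00:00
```

With cron, the equivalent of a 9:00 daily run is an entry such as 0 9 * * * followed by the command that starts your bot.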

Frequently Asked Questions

Can I still build a Twitter bot with the API?

Yes, but access has changed. The free tier of the X (formerly Twitter) API v2 allows basic posting. For reading tweets or higher volume, you need a paid plan. Check current pricing at developer.x.com.

What Python library should I use for the Twitter/X API?

Use tweepy for the most mature Python wrapper with v2 API support. It handles OAuth 2.0 authentication, rate limiting, and provides clean methods for posting, searching, and streaming.

How do I authenticate with the Twitter API v2?

Use OAuth 2.0 Bearer Token for read-only access or OAuth 1.0a for posting. Generate credentials in the X Developer Portal, then pass them to tweepy.Client().

What are the rate limits for the Twitter API?

Rate limits vary by endpoint and plan, and they change over time. The free tier has allowed 1,500 tweets per month, but confirm the current figure at developer.x.com. Always implement rate limit handling with tweepy’s wait_on_rate_limit=True.

What can a Twitter bot do?

Bots can auto-post content, reply to mentions, retweet by keyword, track hashtags, analyze sentiment, and provide automated responses. Always follow the X API terms of service.

Easy guide for data storage options in Python

by Pubs | Data Processing

Beginner

For most serious applications, you will need persistent storage (storage that still exists after your application stops running) of some sort. For new developers, it can be daunting to decide which option to go for. Is a simple flat file enough? When should you use a database? Which database should you use? With so many options available, choosing one is hard.

This is a starting guide that gives an overview of some of the many data storage options available and how to decide between them. One thing to keep in mind: if your application is planned to scale, or might scale over time, your underlying data will grow with it. A file is quick and easy to implement as storage, but as your data grows a relational database may serve you better, at the cost of a little more effort. Let’s look at this a bit deeper.

What are the possible ways to store data?

There are many methods of persistent storage that you can use. The simplest are saving data to a file or using Python's pickle mechanism, but there are many more. First, here is what each of the persistent storage options is:

File: Store the data in a text-based file in a format such as CSV (comma-separated values) or JSON
Python Pickle: Pickle is a mechanism for saving a Python data structure directly to a file, so that you can load it back the next time you run your program. It is provided by the standard library module “pickle”
Config files: Config files are similar to File and Python Pickle in that the data is stored in a file, but they are intended to be edited directly by a user
Database SQLite: A database that you can query with SQL, where the whole database is stored in a single file
Database Postgres (or other SQL-based database): A database service where a separate program manages the data, and you send it SQL queries to get data back efficiently. SQL-based databases are great for structured data, e.g. table-like or Excel-like data. For example, you would search for records by category fields
Key-value database (e.g. Redis, one of the most famous): A database where you look up a key and get back a value. The value can be a single value or a set of fields associated with the key. A common use is hash-based lookup: you search for a specific key and get all the fields associated with it, much like a dictionary in Python, with the benefit that it's in persistent storage
Graph database (e.g. Neo4j): A graph database stores data in a form built for navigating relationships. This is rather cumbersome in a relational database, where you need many intermediary tables, but becomes simple with a graph query language such as Neo4j's Cypher
Text search (e.g. Elasticsearch): A purpose-built database for text search that is extremely fast when searching for strings or long text
Time series database (e.g. InfluxDB): For data such as IoT readings, where each record is stored with a timestamp key and you need to query in time blocks, time series databases are ideal. They provide query operations to aggregate, search, and slice data by time
NoSQL document database (e.g. MongoDB, CouchDB): A database that also runs as a separate service but is designed for “unstructured” (non-table-like) data such as free text or JSON-like documents. You can search for records in a free-form way, such as by text strings
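To make the first few options concrete, here is a small sketch that saves the same dictionary three ways using only the standard library: as a JSON text file, as a pickle, and as a SQLite table. The file names and the sample data are made up for illustration.

```python
# storage_options.py
import json
import pickle
import sqlite3
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())  # throwaway folder for the demo
scores = {"alice": 91, "bob": 78}

# 1. Text file (JSON): readable and editable in any text editor
json_path = workdir / "scores.json"
json_path.write_text(json.dumps(scores), encoding="utf-8")
from_json = json.loads(json_path.read_text(encoding="utf-8"))

# 2. Pickle: stores the Python object as-is, but only Python can read it
pickle_path = workdir / "scores.pkl"
with open(pickle_path, "wb") as f:
    pickle.dump(scores, f)
with open(pickle_path, "rb") as f:
    from_pickle = pickle.load(f)

# 3. SQLite: a single file that you can query with SQL
conn = sqlite3.connect(workdir / "scores.db")
conn.execute("CREATE TABLE scores (name TEXT PRIMARY KEY, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", scores.items())
conn.commit()
high = conn.execute("SELECT name FROM scores WHERE score > 80").fetchall()
conn.close()

print(from_json == scores, from_pickle == scores, high)
# True True [('alice',)]
```

Note that only the SQLite version can answer a question like "who scored above 80" without loading everything into memory first. Also, only unpickle files that you created yourself, since loading a pickle can run arbitrary code.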

There is no one persistent storage mechanism that fits all. Which option works best for you depends on your purpose (or “use case”), as there are pros and cons for each.

Option	Setup	Editable outside Python	Volume	Read Speed	Write Speed	Inbuilt Redundancy
File	None – you can create a file in your Python code	Yes, for text-based formats	Small	Slow	Slow	No – manual
Python Pickle	None – you can create this in your Python code	No – only in Python	Small	Slow	Slow	No – manual
Config File	Optional – you can create a config file beforehand	Yes – you can use any text-based editor	Small	Slow	Slow	No – manual
Database SQLite	None – database created automatically	Yes – through the sqlite3 command-line tool or other SQLite clients	Small-Med	Slow-Med	Slow-Med	No – manual
Relational SQL Database	Separate installation of server	Yes, through the SQL console or other SQL clients	Large	Fast	Fast	Yes, requires extra setup
NoSQL Column Database	Separate installation of server	Yes, through external client	Very large	Very fast	Very fast	Yes, inbuilt
Key-Value Database	Separate installation of server	Yes, through external client	Very large	Very fast	Fast-Very fast	Yes, requires extra setup
Graph Database	Separate installation of server	Yes, through external client	Large	Med	Med	Yes, requires extra setup
Time Series Database	Separate installation of server	Yes, through external client	Very large	Very fast	Fast	Yes, requires extra setup
Text Search Database	Separate installation of server	Yes, through external client	Very large	Very fast	Fast	Yes, requires extra setup
NoSQL Document DB	Separate installation of server	Yes, through external client	Very large	Very fast	Fast	Yes, requires extra setup

A big disclaimer here: for some of the entries, the more accurate answer is “it depends”. For example, some relational databases have redundancy built in (such as Oracle RAC), while for others you set it up at the infrastructure level. To keep the guidance simple, I’ve made the table more prescriptive. If you would like to dive deeper, please don’t rely purely on the table above! Look into the documentation of the particular database product you are considering, or reach out to me and I’m happy to provide some advice.

Summary

There are also plenty of SaaS-based options for databases and persistent storage popping up, which is exciting. These newer SaaS options (for example Firebase, restdb.io, and anvil.works) are great in that they save you the heavy lifting, but there may be times when you still want to manage your own database. This may be because you want to keep your data yourself, or simply because you want to save costs: you may already have an environment on your own laptop, or be paying a fixed price for a virtual machine. In those cases, managing your own persistent storage may be more cost effective than paying for another SaaS. Still, don’t discount the SaaS options altogether, as they will at least take care of things like backups and security updates for you.

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models via LiteLLM, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.
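For contrast, here is a sketch of the kind of manual parsing that instructor replaces. It uses only the standard library; the parse_llm_json helper and the sample replies are hypothetical. The parser copes with fences and preambles, but it happily accepts a wrong type, which is exactly the gap schema validation closes.

```python
# manual_json_parsing.py
import json
import re


def parse_llm_json(reply: str) -> dict:
    """Best-effort parser: grab the first {...} span and parse it as JSON."""
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in reply")
    return json.loads(match.group(0))


fence = "`" * 3  # a markdown code fence, built here to keep this listing tidy
clean = '{"name": "Alice", "age": 32}'
fenced = fence + "json\n" + clean + "\n" + fence
chatty = 'Sure! Here is the extracted data:\n{"name": "Alice", "age": "thirty-two"}'

for reply in (clean, fenced, chatty):
    print(parse_llm_json(reply))
```

The third reply parses without error even though age is a string, so every caller would still need its own type checks.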

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This makes it the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields will be None instead of hallucinated values. Always use Optional for fields that may not appear in the input — without it, the model will invent plausible-sounding values rather than leaving the field empty.
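One practical consequence: a format spec such as {job.salary_min:,} raises a TypeError when the value is None, so display code should handle the missing case. The helper below is a small sketch of one way to do that; format_salary is a hypothetical name, not part of instructor or Pydantic.

```python
# format_optional_salary.py
from typing import Optional


def format_salary(salary_min: Optional[int], salary_max: Optional[int]) -> str:
    """Build a display string without assuming either bound is present."""
    if salary_min is None and salary_max is None:
        return "not listed"
    if salary_min is None:
        return f"up to ${salary_max:,}"
    if salary_max is None:
        return f"from ${salary_min:,}"
    return f"${salary_min:,} - ${salary_max:,}"


print(format_salary(140000, 175000))  # $140,000 - $175,000
print(format_salary(None, None))      # not listed
```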

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $222.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $222.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the instructor version). The model then self-corrects. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
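The loop behind this behavior can be sketched in plain Python. This is a conceptual illustration only, not instructor's actual implementation: fake_model is a hypothetical stand-in for an LLM call, and validate_rating stands in for a Pydantic validator.

```python
# retry_loop_sketch.py
import json


def validate_rating(data: dict) -> dict:
    """Stand-in for a Pydantic validator: rating must be an int from 1 to 5."""
    rating = data.get("rating")
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError(f"Rating must be between 1 and 5, got {rating!r}")
    return data


def fake_model(messages: list[dict]) -> str:
    """Hypothetical model: answers badly first, then corrects itself
    once the conversation contains an error message."""
    saw_error = any("Validation failed" in m["content"] for m in messages)
    return json.dumps({"rating": 4 if saw_error else 6})


def extract_with_retries(prompt: str, max_retries: int = 3) -> dict:
    """Call the model, validate the reply, and re-ask on failure."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_retries):
        reply = fake_model(messages)
        try:
            return validate_rating(json.loads(reply))
        except ValueError as err:
            # Feed the failed reply and the error back for the next attempt
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user", "content": f"Validation failed: {err}"})
    raise RuntimeError(f"No valid output after {max_retries} attempts")


print(extract_with_retries("Rate this review"))  # {'rating': 4}
```

The key design point is that the error text itself becomes part of the conversation, so a clear, specific validator message doubles as a correction hint for the model.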

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
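                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,  # keep a legitimate 0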
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern (extract once, validate automatically, retry on failure) stays the same regardless of scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database. The extraction logic stays almost identical; only the progress message, which calls len(texts), needs adjusting because a generator has no length.
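As a sketch of the generator idea, the function below streams posting texts out of a SQLite table in batches. The table name (postings), the column name (body), and the function name are assumptions made for this example; an in-memory database stands in for a real one.

```python
# posting_source.py
import sqlite3
from typing import Iterator


def iter_postings(conn: sqlite3.Connection, batch_size: int = 100) -> Iterator[str]:
    """Yield raw posting texts one at a time instead of loading them all."""
    cursor = conn.execute("SELECT body FROM postings ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for (body,) in rows:
            yield body


# Demo with an in-memory database standing in for a real one
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE postings (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO postings (body) VALUES (?)",
    [("Senior Python Engineer at Nexaflow",), ("Staff ML Engineer at Quantra",)],
)

for text in iter_postings(conn):
    print(text)
```

Because a generator has no length, any code that calls len() on the input (such as a progress message) would need to count items as it goes instead.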

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are uncommon for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Use response_model=Iterable[YourModel] for streaming lists, or Partial[YourModel] for streaming partial updates to a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
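A simple way to chunk is by word count with a small overlap, sketched below. The function name and default sizes are illustrative; word count is only a rough proxy for tokens, so leave headroom below your model's actual limit.

```python
# chunk_document.py
def chunk_words(text: str, max_words: int = 500, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks that overlap slightly,
    so text near a boundary appears in both neighbouring chunks."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks


document = " ".join(f"word{i}" for i in range(1200))
chunks = chunk_words(document)
print(len(chunks))              # 3
print(len(chunks[-1].split()))  # 300
```

Because chunks overlap, the same item can be extracted twice, so deduplicate the combined results before using them.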

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not: the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting alone) to enforce the schema, which makes it much harder for the model to deviate from the structure. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant: JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Further Reading: For more details, see the Python sqlite3 documentation.

Frequently Asked Questions

What are the main data storage options in Python?

Python supports flat files (text, CSV, JSON), databases (SQLite, PostgreSQL, MySQL), key-value stores (Redis, shelve), pickle serialization, and cloud storage. The best choice depends on data size, structure, and access patterns.

When should I use SQLite vs a full database?

Use SQLite for single-user apps, prototypes, and small-to-medium datasets. Switch to PostgreSQL or MySQL for concurrent multi-user access, complex queries at scale, or production-grade reliability.

How do I save Python objects to disk?

Use pickle for Python-specific serialization, json for interoperable data, shelve for dictionary-like persistent storage, or databases for structured data. For data analysis, pandas can save to CSV, Parquet, or HDF5.

Is JSON or CSV better for storing data?

JSON handles nested, hierarchical data well. CSV is simpler for tabular, flat data. Use JSON for API data and configuration; use CSV for datasets and spreadsheet-compatible exports.
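A short sketch of the difference, using made-up order data: JSON stores the nested record as-is, while CSV needs the nested fields flattened into columns first. The column names and the ";" separator for the list are choices made for this example.

```python
# json_vs_csv.py
import csv
import io
import json

# Nested data fits JSON naturally
order = {
    "id": 1,
    "customer": {"name": "Alice", "city": "Melbourne"},
    "items": ["pen", "ink"],
}
as_json = json.dumps(order)
print(json.loads(as_json)["customer"]["city"])  # Melbourne

# CSV wants flat rows, so nested fields must be flattened first
flat = {
    "id": order["id"],
    "customer_name": order["customer"]["name"],
    "customer_city": order["customer"]["city"],
    "items": ";".join(order["items"]),
}
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(flat))
writer.writeheader()
writer.writerow(flat)
print(buffer.getvalue())
```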

How do I choose between file storage and a database?

Use file storage for simple, single-user scenarios. Use a database when you need querying, indexing, concurrent access, or ACID transactions. SQLite bridges both worlds for simpler applications.
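A short sketch of what the database side buys you, using the built-in sqlite3 module with an in-memory database. The table name, columns, and rows are assumptions chosen for the example.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a file path to persist it
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, priority INTEGER)")
conn.execute("CREATE INDEX idx_notes_priority ON notes (priority)")

# "with conn" wraps the inserts in one transaction:
# commit on success, rollback if an exception is raised
with conn:
    conn.executemany(
        "INSERT INTO notes (title, priority) VALUES (?, ?)",
        [("Buy milk", 2), ("Fix login bug", 1), ("Plan holiday", 3)],
    )

# Querying with a parameterized WHERE clause and ORDER BY
urgent = conn.execute(
    "SELECT title FROM notes WHERE priority <= ? ORDER BY priority", (2,)
).fetchall()
print(urgent)  # [('Fix login bug',), ('Buy milk',)]
conn.close()
```

Querying, indexing, and transactions all come from the same small module, which is why SQLite is a common step up from flat files.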

7 Python Programming Myths

Beginner

Python has been one of the most widely used programming languages in recent years, particularly in machine learning, artificial intelligence, and data science. Programmers also choose it for large-scale applications that span several products and services. This is why well-known companies hire candidates with solid Python skills alongside other programming experience.

Despite all this, several myths about Python can discourage aspiring developers. Below are some of the Python programming myths you are likely to come across.

1. Python is Slow

While Python is admittedly slower than compiled languages such as Java and C++ for raw computation, it is fast enough for most applications. Many of its built-in functions and popular libraries are implemented in C, so the performance-critical work often runs at native speed. Python code is also quick to write, so the time saved during development frequently outweighs the extra run time.

Some years ago, CPUs and memory were costly. Today, better-performing hardware is affordable, which makes raw interpreter speed less of a constraint. Python also supports several programming paradigms, including imperative, object-oriented, and functional styles.
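One way to see where Python spends its time is to compare a hand-written loop with a built-in that CPython implements in C. The sketch below uses timeit from the standard library; the list size and repeat count are arbitrary assumptions, and the timings will differ from machine to machine, so no figures are quoted here.

```python
import timeit

numbers = list(range(100_000))

def manual_sum(values):
    """Add the values one at a time in pure Python."""
    total = 0
    for value in values:
        total += value
    return total

# Both approaches give the same answer
assert manual_sum(numbers) == sum(numbers)

# Time 20 runs of each approach
loop_time = timeit.timeit(lambda: manual_sum(numbers), number=20)
builtin_time = timeit.timeit(lambda: sum(numbers), number=20)
print(f"pure-Python loop: {loop_time:.4f}s")
print(f"built-in sum():   {builtin_time:.4f}s")
```

On CPython the built-in is typically the faster of the two, which is the general pattern: idiomatic Python leans on built-ins and libraries for the heavy lifting.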

Python is slow. Python is too pretty. Python doesn’t scale. All wrong.

2. Python is Not Compiled and Only Used for Scripting

Python is usually described as an interpreted language, but like Java it also has a compilation step: source code is first compiled to bytecode, which the Python virtual machine then executes. The compilation is automatic, so it is easy to overlook, and no separate compiler is required.
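The compilation step can be observed directly with the standard library. This is a small sketch using the built-in compile() function and the dis module; the one-line source string and the variable names in it are made up for the example.

```python
import dis

source = "total = price * quantity"

# compile() turns source text into a code object that holds bytecode
code = compile(source, "<example>", "exec")
print(type(code).__name__)   # code
print(code.co_names)         # names referenced by the bytecode

# dis prints the bytecode instructions the interpreter will execute
dis.dis(code)

# The code object is then run by the Python virtual machine
namespace = {"price": 4, "quantity": 3}
exec(code, namespace)
print(namespace["total"])    # 12
```

The exact instructions that dis prints vary between Python versions, which is one reason bytecode is treated as an implementation detail rather than a stable format.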

Python isn’t only a scripting language; it is a general-purpose language that can also be used for scripting. Unlike many simple scripting languages, Python ships with networking, regular expression, and exception handling features in its standard library. This makes it a reliable and trusted language for automating tasks and for building full applications.

3. Learning to Code with Python is Difficult and Time-consuming

Learning to program with Python is easy, as it doesn’t require any prior programming knowledge, although experienced programmers have an advantage because they can relate its concepts to what they already know. Python is a high-level language with a simple, uncluttered syntax.

Many statements in Python programs read almost like English because the language uses little punctuation and few special symbols. That said, learning to code with Python can take between three and six months, depending on your commitment. Besides, there are plenty of learning resources and a large, supportive community that is ready to help learners.

4. Python is Not Scalable

Contrary to what many people believe, Python can scale both horizontally and vertically. The confusion arises because scaling isn’t automatic: it requires engineering effort and involves several moving parts.

For instance, you need to make the most of the available memory, turn single-machine systems into distributed ones, and more. Nonetheless, with proper architecture, scaling Python won’t be a problem.

Python runs your bank, your search engine, and your spacecraft.

5. Coding with Python is Expensive

You are mistaken if you think Python programming is expensive. Python is an open-source language that can be downloaded for free from its official website. It was first released in 1991 and is managed by the Python Software Foundation, which makes it available to small and large-scale users under an open-source license.

Python itself is free to use, but third-party packages carry their own licenses. Most are open source, though some impose conditions: packages under the GNU General Public License, for example, place requirements on how derived work is redistributed, and a few commercial tools charge a fee.

6. Python has Support and Security Issues

Another common myth is that Python isn’t secure and that Python code can easily be hacked. In fact, Python has been used to build network security systems, and it is widely used to develop security testing and test automation tools.

Python also has a process for handling security issues: vulnerabilities can be reported privately to the Python Security Response Team, which handles reports confidentially. Security-conscious companies such as PayPal and eBay have also used Python.

7. Python Cannot Be Used for Big Projects

Just because Python is a simple language doesn’t mean it cannot be applied in big projects. Python has an extensive standard library and a large ecosystem of reusable packages, which let developers build on existing code instead of writing everything from scratch. This reduces the time and effort required to design and implement a program. It helps explain why tech giants, including Google, Facebook, YouTube, and Instagram, use this language.

Major websites / internet services written in Python

The Bottom Line

There is a lot to discover about Python and other programming languages in general. Don’t accept these unproven myths and misconceptions at face value. That said, if you have some Python programming basics, check out this course to learn UX/UI design and extend your skills to cover the front end as well.

Further Reading: For more details, see the Python FAQ.

Comparing Python to other Web Development Languages

by Pubs | Beginner, Management, Skill Level

Beginner

If you are new to computer programming, choosing a language to begin with is probably the toughest hurdle. There are thousands of programming languages, each with its own idiosyncrasies and complexities. On our site, we focus on Python, but there are other languages out there. Before you start your software development journey, it is important to choose a programming language that suits your interests and career goals. That said, below are some of the best and most in-demand coding languages and frameworks you should consider.

1. JavaScript

JavaScript is hard to avoid for modern web developers. A 2020 Stack Overflow survey found that JavaScript was the most commonly used programming language among developers for the eighth year in a row. More than 70% of study participants reported that they had used this language for more than one year.

Together with CSS and HTML, JavaScript is an important coding language for front-end website development. Most websites, including Facebook, Gmail, YouTube, and Twitter, depend on JavaScript to display dynamic content to users for their interactive website pages.

Even though JavaScript is primarily a front-end web development language that runs in browsers, it can be used on the server side to develop scalable network applications with the help of Node.js. Node.js runs on Windows, Linux, macOS, and SunOS.

JavaScript is popular amongst programming beginners because of its gentle learning curve. It is used throughout the web, thanks to its speed, and works well with other coding languages, which lets it be used in a variety of applications. The demand for JavaScript developers is also high, with a CareerFoundry study concluding that 72% of businesses need JavaScript developers.

Pros of learning JavaScript

Fast and can run immediately in browsers
Provides an enriched and better web interface
Highly versatile
It can be used in various applications
Has multiple add-ons
Easily integrates with other programming languages.

Cons of learning JavaScript

Lacks an equivalent or alternate method
Different web browsers can interpret code lines differently.

2. Python

Python is a general-purpose coding language that is also very learner-friendly; there are even Python classes for children. Despite being easy to learn, Python is a highly versatile and powerful language, making it suitable for both beginners and experts. This is why major companies, including Facebook and Google, use this language.

Python’s popularity is largely attributed to its extensive usage. It has applications in data science, scientific computing, data analytics, animation, database interfacing, web applications, machine learning, and data visualization. This versatility also explains the high demand for experts in this language.

Key features of Python include:

It has a unique selling point – simple, productive, elegant, and powerful in one package.
It influences other programming languages, such as Go and Julia
Best for back-end web development with first-class integration with other programming languages, such as C++ and C.
It offers many tools that can be applied in computational science, mathematics, statistics, and various libraries and frameworks, such as NumPy, Scikit-Learn, and Pandas.

Pros of learning Python

Works in various platforms
Improves developer productivity
Has a wide array of support frameworks and libraries
Powered by object-oriented programming

Cons of learning Python

Not ideal for mobile computing
Its database access layers are less mature than those of some other languages

Python won the language war. Just not the speed war.

3. Java

Java is another popular coding language commonly used in app and web development. Despite being an older language, Java is still in demand for building large, complex systems. Unfortunately, it is less beginner-friendly than Python. It is a platform-independent language and a popular choice for various organizations, including Google and Airbnb, for its stability.

Key features of Java include:

It is a multi-paradigm and feature-rich programming language
Very productive for developers
Moderate learning curve
It doesn’t have major changes and updates like Python and Scala
Has the best runtime

Pros of learning Java

Has a wide array of open-source libraries
Automated garbage collection
Allows for platform independence
Supports multithreading and distributed computing
Has multiple APIs that support completion of various tasks, such as database connection, networking, and XML parsing

Cons of learning Java

Expensive memory management
Slow compared to other coding languages, such as C and C++

4. C#

C# is an object-oriented programming language developed by Microsoft. It was initially designed as part of the .NET framework for developing Windows applications but is currently used in various applications. It is a general-purpose coding language used particularly in back-end development, game creation, mobile app development, and more. Although it originated on Windows, it can also be used on Android, Linux, and iOS.

The language has a wealth of libraries and frameworks that have accumulated over the last 20 years. Like Java, C# is platform-independent, thanks to its Common Language Runtime.

Pros of learning C#

Can work with shared codebases
Safer than C and C++
Uses similar syntax with C++ and other C-derived languages
Has rich data types and library
Has a fast compilation and execution

Cons of learning C#

Less flexible compared to C++
You should have good knowledge to solve errors

Same web app, two stacks. The stack matters less than the team.

5. PHP

PHP is another excellent programming language with many applications. While it faces stiff competition from other languages, such as Python and JavaScript, especially for web development, there is still a high demand for PHP professionals in the current job market. PHP is also a general-purpose and dynamic coding language that can be used to develop server-side applications.

Pros of learning PHP

Easy to learn and use
Has a wide ecosystem and community support
Has many frameworks
Supports object-oriented and functional paradigms
Supports various automation tools

Cons of PHP

Builds slow web pages
Lacks error and security handling features

6. Angular

Angular is the rewritten and improved successor to the original AngularJS framework developed by Google. Compared to other front-end tools, such as React, Angular has a steeper learning curve but offers a more complete, built-in solution for front-end development. Developers can also build complex and scalable applications using Angular, thanks to its rich functionality and its separation of visual design from business logic.

Key features of Angular include:

Features a model-view-controller (MVC) architecture that facilitates dynamic modeling
Uses HTML to build user interfaces that are simple and easy to understand
Uses plain old JavaScript objects (POJOs), which are self-sufficient and very functional
Has Angular filters, which filter data before being viewed

Pros of learning Angular

Requires minimal coding experience to use
Allows development of high-quality hybrid apps
Has quick app prototyping
Has enhanced testing ability

Cons of Angular

Angular apps are highly dynamic, which can diminish their performance
Complicated pages in apps can cause glitches
Difficult to learn

Python or JavaScript? Pick what your team can hire for.

7. React

Also called ReactJS, React is a JavaScript library developed by Facebook that enables programmers to build dynamic user interfaces. Sites built using React respond quickly, and developers can update multiple interface elements seamlessly. The library also helps businesses build and maintain customer loyalty by providing a great user experience.

Pros of learning React

Easy to learn and SEO friendly
Reuses various components, thus saves time
Has an open-source library
Supported by a strong online community
Has plenty of helpful development tools

Cons of React

Additional SEO hurdle
Has poor code documentation

The Bottom Line

As you choose your preferred web development language to learn, make sure you aren’t guided by hype and popularity contests. Even though the realm of computer programming keeps changing rapidly, the languages mentioned above can withstand these changes. Learning one or more of these languages will put you in a great position for many years to come. Make use of federal funding to pay for your online programming courses and bootcamps. Veterans can learn web development languages at a discount using GI Bill benefits.

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) so that each call returns a validated Pydantic model, or raises an error once its retries are exhausted. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models via LiteLLM, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.
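To make that fragility concrete, here is a sketch of the kind of hand-rolled parsing instructor replaces. It uses only the standard library; the helper name parse_llm_json and the sample replies are hypothetical, and the markdown fence is built from single backticks so the snippet can be quoted safely.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built indirectly

def parse_llm_json(reply: str) -> dict:
    """Try to pull one JSON object out of a free-form model reply."""
    # Case 1: the reply is already a clean JSON object
    try:
        data = json.loads(reply)
        if isinstance(data, dict):
            return data
    except json.JSONDecodeError:
        pass
    # Case 2: the JSON is wrapped in a markdown fence
    pattern = FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE
    fenced = re.search(pattern, reply, re.DOTALL)
    if fenced:
        return json.loads(fenced.group(1))
    # Case 3: the JSON is buried in prose; take first "{" to last "}"
    start, end = reply.find("{"), reply.rfind("}")
    if start != -1 and end > start:
        return json.loads(reply[start:end + 1])
    raise ValueError("No JSON object found in reply")

clean = '{"name": "Alice", "age": 32}'
fenced_reply = FENCE + 'json\n{"name": "Alice", "age": 32}\n' + FENCE
chatty = 'Sure! Here is the extracted data: {"name": "Alice", "age": 32}'
wrong_type = '{"name": "Alice", "age": "thirty-two"}'

for reply in (clean, fenced_reply, chatty, wrong_type):
    print(parse_llm_json(reply))
```

Three fallbacks are needed just to locate the JSON, and the last reply still parses successfully even though age is a string. Nothing here checks types, which is the gap that schema validation fills.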

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This gives you the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields can be None instead of hallucinated values. Use Optional for fields that may not appear in the input; without it, the field is required and the model may invent a plausible-sounding value rather than leave it empty.
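One consequence of optional fields is that downstream code has to handle None. The salary print line in the example above applies the `:,` format to both fields, which raises a TypeError when a value is None. A minimal sketch of a None-safe formatter follows; the helper name format_salary and its wording are assumptions, not part of instructor.

```python
from typing import Optional

def format_salary(salary_min: Optional[int], salary_max: Optional[int]) -> str:
    """Format a salary range, tolerating missing bounds."""
    if salary_min is None and salary_max is None:
        return "not specified"
    if salary_min is None:
        return f"up to ${salary_max:,}"
    if salary_max is None:
        return f"from ${salary_min:,}"
    return f"${salary_min:,} - ${salary_max:,}"

print(format_salary(140000, 175000))  # $140,000 - $175,000
print(format_salary(None, None))      # not specified
print(format_salary(90000, None))     # from $90,000
```

With this in place, the print call becomes print(f"Salary: {format_salary(job.salary_min, job.salary_max)}") and works whether or not the posting mentions pay.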

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $177.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $177.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.
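Type validation confirms that total is a float, but not that it agrees with the line items. Below is a minimal sketch of such a cross-check. It uses plain dataclasses as stand-ins for the Pydantic models so that it runs without an API call, and the function names line_items_total and total_matches are hypothetical. In a real schema the same logic could live in a Pydantic validator.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float

def line_items_total(items: List[LineItem]) -> float:
    """Sum quantity * unit_price across all line items."""
    return round(sum(item.quantity * item.unit_price for item in items), 2)

def total_matches(items: List[LineItem], stated_total: float,
                  tolerance: float = 0.01) -> bool:
    """Return True when the stated total agrees with the line items."""
    return abs(line_items_total(items) - stated_total) <= tolerance

# The three line items from the invoice text
items = [
    LineItem("Server instances", 3, 45.00),
    LineItem("SSL Certificate", 1, 12.00),
    LineItem("Domain registrations", 2, 15.00),
]
print(line_items_total(items))        # 177.0
print(total_matches(items, 177.00))   # True
print(total_matches(items, 180.00))   # False
```

A check like this catches both extraction mistakes and errors in the source document itself, neither of which a type check alone would flag.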

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message, along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” The model then gets a chance to self-correct. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
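The feedback loop itself is simple to picture. The sketch below is a conceptual illustration only, not instructor’s actual implementation: the function names, the max_attempts parameter, and the fake model are all assumptions, and a plain function stands in for Pydantic validation so the example runs offline.

```python
from typing import Callable, Dict, List

def validate_rating(data: Dict) -> Dict:
    """Stand-in for Pydantic validation: rating must be an int from 1 to 5."""
    rating = data.get("rating")
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError(f"Rating must be between 1 and 5, got {rating!r}")
    return data

def extract_with_retries(
    call_model: Callable[[List[Dict]], Dict],
    messages: List[Dict],
    max_attempts: int = 3,
) -> Dict:
    """Call the model, validate, and feed errors back until valid or out of attempts."""
    history = list(messages)
    last_error = None
    for _ in range(max_attempts):
        candidate = call_model(history)
        try:
            return validate_rating(candidate)
        except ValueError as exc:
            last_error = exc
            # Append the error so the next attempt can correct itself
            history.append({"role": "user", "content": f"Validation failed: {exc}"})
    raise RuntimeError(f"No valid output after {max_attempts} attempts: {last_error}")

# Fake model: returns an out-of-range rating first, then a corrected one
responses = iter([{"rating": 6}, {"rating": 4}])
attempts = []

def fake_model(history: List[Dict]) -> Dict:
    attempts.append(len(history))  # record how long the conversation is
    return next(responses)

result = extract_with_retries(fake_model, [{"role": "user", "content": "Rate this review"}])
print(result)    # {'rating': 4}
print(attempts)  # [1, 2]
```

The second call sees a two-message history because the validation error was appended after the first failure, which is the mechanism that lets the model correct itself.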

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
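                # Compare against None so a legitimate 0 is not written as blank
                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,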
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
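
The idea of swapping JOB_TEXTS for a generator can be sketched as follows. This is a minimal illustration, not part of the pipeline above: iter_postings, extract_all and fake_extract are hypothetical names, the blank-line separator is an assumed input format, and fake_extract stands in for the real instructor call so the sketch runs offline.

```python
from typing import Callable, Iterable, Iterator, List

def iter_postings(lines: Iterable[str]) -> Iterator[str]:
    """Yield one posting at a time; postings are separated by blank lines (assumed format)."""
    buffer: List[str] = []
    for line in lines:
        if line.strip():
            buffer.append(line.strip())
        elif buffer:
            yield "\n".join(buffer)
            buffer = []
    if buffer:  # the last posting may not be followed by a blank line
        yield "\n".join(buffer)

def extract_all(texts: Iterable[str], extract_one: Callable[[str], dict]) -> Iterator[dict]:
    """Lazily apply the extraction step to each posting as it arrives."""
    for text in texts:
        yield extract_one(text)

# Hypothetical stand-in for the instructor call, so no API key is needed here.
def fake_extract(text: str) -> dict:
    return {"title": text.split(" at ")[0], "chars": len(text)}

raw = [
    "Senior Python Engineer at Nexaflow -- Remote-first.",
    "$150k-$190k.",
    "",
    "Staff ML Engineer at Quantra -- San Francisco CA.",
]
results = list(extract_all(iter_postings(raw), fake_extract))
print([r["title"] for r in results])
```

Because both functions are generators, only one posting is held in memory at a time; the same iter_postings would work on an open file object or any other iterable of lines.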

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.
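
Handling failures in your own code might look roughly like the sketch below. It assumes an extract_one callable that makes a single attempt (in real use it would wrap the instructor call with max_retries=1); extract_with_fallback and fake_extract are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple

def extract_with_fallback(
    texts: List[str], extract_one: Callable[[str], dict]
) -> Tuple[List[dict], List[Tuple[str, str]]]:
    """Try each text once and collect failures instead of retrying."""
    succeeded, failed = [], []
    for text in texts:
        try:
            succeeded.append(extract_one(text))
        except Exception as exc:  # in real use, catch the library's specific error type
            failed.append((text, str(exc)))
    return succeeded, failed

# Hypothetical stand-in extractor: fails on empty input to simulate a validation error.
def fake_extract(text: str) -> dict:
    if not text.strip():
        raise ValueError("no content to extract")
    return {"length": len(text)}

ok, bad = extract_with_fallback(["Python dev at Acme", "   "], fake_extract)
print(ok)
print(bad)
```

The failed list can be logged, reviewed by hand, or re-run later with a stronger model, which keeps the cost of each input bounded to one call.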

Does instructor support streaming responses?

Yes. Use response_model=Iterable[YourModel] for streaming lists, or Partial[YourModel] for streaming partial updates to a single model. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
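
A simple way to split a document is by word count with a small overlap, so an item that straddles a boundary still appears whole in one chunk. The sketch below is one possible approach; chunk_words and its default sizes are illustrative choices, and words are only a rough proxy for tokens.

```python
from typing import List

def chunk_words(text: str, max_words: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most max_words words, overlapping by `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reached the end of the text
    return chunks

document = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(document, max_words=5, overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk would then be passed to the extraction call separately, and the per-chunk results merged (with de-duplication if the overlap causes an item to be extracted twice).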

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not — the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting) to enforce the schema, so the model cannot deviate from the structure. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant — JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.
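
To make the fragility concrete, here is a sketch of the defensive parsing that plain JSON prompting tends to require. parse_json_reply is a hypothetical helper written for this illustration; it handles fences and preambles but, as the last lines show, does nothing about missing fields or wrong types.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence marker

def parse_json_reply(reply: str) -> dict:
    """Best-effort extraction of a JSON object from a free-form model reply."""
    # Unwrap a fenced block if the model added one.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, reply, flags=re.DOTALL)
    if fenced:
        reply = fenced.group(1)
    # Drop any preamble or trailing chatter around the outermost braces.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in reply")
    return json.loads(reply[start:end + 1])

replies = [
    '{"name": "Alice", "age": 32}',
    'Sure! Here is the extracted data: {"name": "Alice", "age": 32}',
    FENCE + 'json\n{"name": "Alice", "age": 32}\n' + FENCE,
]
parsed = [parse_json_reply(r) for r in replies]
print(parsed)

# Parsing succeeds even though a field is missing; nothing validates the shape.
incomplete = parse_json_reply('{"name": "Alice"}')
print("age" in incomplete)
```

Schema validation and retries are the parts this helper lacks, and they are the parts instructor adds.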

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Further Reading: For more details, see the official Python tutorial.

Frequently Asked Questions

How does Python compare to JavaScript for web development?

Python excels in backend development with Django and Flask. JavaScript dominates the frontend and runs on the backend with Node.js. Python is preferred for data-heavy backends, while JavaScript enables full-stack development with a single language.

Is Python slower than other web languages?

Python is generally slower in raw execution speed compared to Go, Java, or Node.js. However, for most web apps the bottleneck is I/O, not CPU speed. Python’s developer productivity and rich ecosystem often outweigh the performance difference.

Can Python be used for frontend web development?

Python is primarily a backend language. Tools like Brython, Pyodide, and PyScript allow Python in the browser, but for production frontends JavaScript/TypeScript with React or Vue remains the standard.

What makes Python a good choice for web APIs?

Python offers mature API frameworks (Flask, FastAPI, Django REST Framework), excellent library support for data processing, simple syntax, and strong integration with databases, ML models, and third-party services.

Should I learn Python or JavaScript for web development?

Learn Python if you focus on data science, ML, or backend APIs. Learn JavaScript for full-stack web development. Many developers learn both. Python’s versatility across web, data, and automation makes it a strong choice.

Reading and writing text to files in Python

by Pubs | Beginner, Data Storage

Beginner

The simplest way to store data from Python is the humble file, which is often, but does not have to be, text based. You do not need any libraries: Python’s built-in functions can open and write to a file very easily.

There are many use cases for file storage, and it is usually the “go to” method when hacking together a quick solution or prototype. Files are also arguably a good solution for some production use cases.

Overview of storing data in files in Python

The typical use cases have the following in common:

Setup: No setup is required for files. You can even create the file from Python.
Volume: Small-ish file sizes (< 5-10 MB). You can go larger if your application is not doing heavy reads or writes and does not need fast responses (e.g. batch processing).
Record access: You do not need to search within the file to extract just a portion of the records. You load or save all the data in the file in one go.
Data writes: You can either append to the file or read and write all the data in the file at once.
Write reliability: You do not need multiple writes at the same time. Only one person is likely to be writing at any one time, and if several people did write at once, the consequences would not be serious for your application. There are ways to put a lock on a file to prevent conflicts, but if you need them you should double check whether a file is the right option for you.
Data formats: Your data may be structured and record based (such as comma-separated values, CSV, or tab delimited) or unstructured (e.g. a document of text or JSON format). You can also store binary data in a file, e.g. for images.
Editability: You may want or allow direct editing of the file by other applications or by people.
Redundancy: There is no built-in redundancy. If there is any failure (the data is corrupted, or the server holding the file fails), you are out of luck. You need to set up your own mechanisms (e.g. automatically replicating the file to another server).
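
As a small illustration of the data formats point, the same records can be saved as CSV or as JSON using only the standard library. This is a minimal sketch; the file names and the records list are made up for the example, and a temporary folder is used so nothing is left behind in your working directory.

```python
import csv
import json
import os
import tempfile

records = [
    {"country": "Japan", "code": "JP"},
    {"country": "Australia", "code": "AU"},
]

folder = tempfile.mkdtemp()
csv_path = os.path.join(folder, "countries.csv")
json_path = os.path.join(folder, "countries.json")

# CSV: one row per record, with a header row naming the columns.
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["country", "code"])
    writer.writeheader()
    writer.writerows(records)

# JSON: the whole list is written in one go.
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)

# Read both files back to confirm the round trip.
with open(csv_path, newline="", encoding="utf-8") as f:
    from_csv = list(csv.DictReader(f))
with open(json_path, encoding="utf-8") as f:
    from_json = json.load(f)

print(from_csv)
print(from_json)
```

Note that CSV returns every value as a string, while JSON preserves numbers, booleans and nesting.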

Code examples to read and write to a file

Here are examples of writing to and reading from a file. It is very easy and does not require any libraries. The one thing to be mindful of is the mode in which you open the file: read, write, or read and write.

Open a text file for (over)writing:

To write to a file, pass the ‘w’ mode to the open() function. Be aware that ‘w’ replaces the contents of an existing file. The available modes are:

‘r’ – Reading
‘w’ – Writing to a file (overwrites existing contents)
‘a’ – Append to end of file
‘r+’ – Read and write to the same file
‘x’ – Used to create and write to a new file (fails if the file already exists)

file = open('population.txt', 'w')
file.write('Japan\n')  # write() does not add a newline, so include one
file.write('United States\n')
file.write('Australia\n')
file.write('China')
file.close()  # file is released and closed

You will then have the following output file of population.txt:

Japan
United States
Australia
China
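
The write example above opens and closes the file by hand. A common alternative is the with statement, which closes the file automatically even if an error occurs part-way through. The sketch below writes to a temporary folder (an assumption made only to keep the example self-contained) and reads the file back.

```python
import os
import tempfile

countries = ["Japan", "United States", "Australia", "China"]
path = os.path.join(tempfile.mkdtemp(), "population.txt")

# The file is closed automatically when the with block exits, even on error.
with open(path, "w", encoding="utf-8") as file:
    for country in countries:
        file.write(country + "\n")  # write() does not add a newline by itself

with open(path, "r", encoding="utf-8") as file:
    contents = file.read()

print(contents)
print(file.closed)  # True: no explicit close() call was needed
```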

Open a text file fully for reading:

Using the same population.txt file created above –

file = open('population.txt', 'r')
data = file.read()  # read full contents of file into a single string
file.close()  # file is released and closed
print("*** file start ***")
print(data)
print("*** end file ***")

The output would be:

*** file start ***
Japan
United States
Australia
China
*** end file ***

To explain this a bit further: open() needs to be told how the file is to be opened, in this case with ‘r’ to indicate that it is for reading. The other modes are the same ones listed in the writing example above (‘w’, ‘a’, ‘r+’ and ‘x’).
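
The ‘a’ and ‘x’ modes are easy to try out side by side. The following sketch, which uses a temporary folder purely to stay self-contained, creates a file with ‘x’, appends to it with ‘a’, and shows that a second ‘x’ on the same path is refused.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "population.txt")

with open(path, "x", encoding="utf-8") as file:  # 'x' creates the file; fails if it exists
    file.write("Japan\n")

with open(path, "a", encoding="utf-8") as file:  # 'a' adds to the end, keeping existing content
    file.write("Australia\n")

try:
    open(path, "x")  # the file already exists, so this raises
    created_twice = True
except FileExistsError:
    created_twice = False

with open(path, "r", encoding="utf-8") as file:
    lines = file.read().splitlines()

print(lines)
print(created_twice)
```

Using ‘x’ instead of ‘w’ is a cheap safeguard when overwriting an existing file would mean losing data.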

Read a text file line by line:

file = open('population.txt', 'r')
data_list = file.readlines()  # read full contents of file into a list of rows
file.close()  # file is released and closed
print("*** file start ***")
counter = 0
for row in data_list:
  counter = counter + 1
  print(f"{counter}: {row.rstrip()}")  # rstrip() removes the trailing newline from each row
print("*** end file ***")

The output would be:

*** file start ***
1: Japan
2: United States
3: Australia
4: China
*** end file ***

The difference from the first example is that the data comes back as a list with one entry per line, so that you can process each row. Each entry keeps its trailing newline, so strip it before printing. Please note, you can simplify the above by using enumerate to avoid setting up a separate counter variable. E.g.

print("*** file start ***")
for index, row in enumerate(data_list):
  print(f"{index+1}: {row.rstrip()}")  # note that when using enumerate, the first index is 0
print("*** end file ***")
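
Two further simplifications are worth knowing: enumerate accepts a start argument, and a file object can be looped over directly without calling readlines() first. The sketch below recreates the sample file in a temporary folder (only so that it runs on its own) and numbers the rows.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "population.txt")
with open(path, "w", encoding="utf-8") as file:
    file.write("Japan\nUnited States\nAustralia\nChina\n")

numbered = []
with open(path, "r", encoding="utf-8") as file:
    # A file object yields one line at a time, so readlines() is optional.
    for index, row in enumerate(file, start=1):  # start=1 removes the need for index+1
        numbered.append(f"{index}: {row.rstrip()}")  # rstrip() drops the trailing newline

print("\n".join(numbered))
```

Looping over the file directly also avoids loading the whole file into memory, which matters once files grow beyond a few megabytes.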

Read a file in 3 lines. Write it in 4. Everything else is detail.

Summary of writing and reading to a file

Reading and writing to a file is a very straightforward native operation in Python. There are many other related operations, ranging from putting a lock on a file to prevent two processes writing to it at once, to checking file attributes such as access permissions and size. At the most basic level, though, the built-in open() function is enough to handle reading and writing for most of your needs.
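
Checking file attributes such as size and access needs nothing beyond the standard library. The following is a small sketch using pathlib and os.access; the file is created in a temporary folder just for the demonstration.

```python
import os
import tempfile
from pathlib import Path

path = Path(tempfile.mkdtemp()) / "population.txt"
path.write_bytes(b"Japan\nChina\n")  # 12 bytes, written as raw bytes

exists = path.exists()               # is the file there at all?
size_bytes = path.stat().st_size     # size on disk in bytes
readable = os.access(path, os.R_OK)  # may this process read it?
writable = os.access(path, os.W_OK)  # may this process write to it?

print(f"exists={exists} size={size_bytes} readable={readable} writable={writable}")
```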

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models such as Anthropic’s Claude and local models via Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.
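
To give a feel for what “sending the schema as a tool definition” means, here is a hand-written approximation of the payload for the Person model. It follows the general shape of an OpenAI-style tool definition, but it is only an illustration: the exact schema and description that instructor generates may differ, and the description text below is made up.

```python
import json

# Hand-written JSON Schema equivalent of the Person model from the quick example.
person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "city": {"type": "string"},
    },
    "required": ["name", "age", "city"],
}

# Approximate shape of a tool definition wrapping that schema.
person_tool = {
    "type": "function",
    "function": {
        "name": "Person",
        "description": "Extract a Person from the text",  # hypothetical wording
        "parameters": person_schema,
    },
}

# A tool-call response carries its arguments as a JSON string.
tool_call_arguments = '{"name": "Alice", "age": 32, "city": "Melbourne"}'
arguments = json.loads(tool_call_arguments)
missing = [key for key in person_schema["required"] if key not in arguments]

print(json.dumps(person_tool, indent=2))
print("missing fields:", missing)
```

In the real library, the parsed arguments are handed to Pydantic rather than checked by hand, which is where type coercion and custom validators come in.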

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.
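
The validate-and-retry cycle can be pictured with a few lines of plain Python. This is a conceptual sketch only, not instructor’s implementation: validate_person stands in for Pydantic validation, fake_model stands in for the LLM, and the feedback message wording is invented for the example.

```python
from typing import Callable, List

def validate_person(data: dict) -> dict:
    """Minimal stand-in for schema validation: age must be an integer."""
    if not isinstance(data.get("age"), int):
        raise ValueError(f"age must be an integer, got {data.get('age')!r}")
    return data

def extract_with_retries(
    ask_model: Callable[[List[str]], dict], prompt: str, max_retries: int = 3
) -> dict:
    """Ask, validate, and on failure ask again with the error added to the conversation."""
    messages = [prompt]
    for _ in range(max_retries):
        candidate = ask_model(messages)
        try:
            return validate_person(candidate)
        except ValueError as exc:
            messages.append(f"Validation failed: {exc}. Please fix and try again.")
    raise RuntimeError(f"no valid output after {max_retries} attempts")

# Fake model: returns a bad value first, then corrects itself once it sees feedback.
def fake_model(messages: List[str]) -> dict:
    if len(messages) == 1:
        return {"name": "Alice", "age": "thirty-two"}
    return {"name": "Alice", "age": 32}

person = extract_with_retries(fake_model, "Alice is thirty-two years old.")
print(person)
```

The key design point is that the error text becomes part of the next request, so the model is told exactly which field was wrong rather than simply being asked again.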

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This gives you the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields will be None instead of hallucinated values. Always use Optional for fields that may not appear in the input — without it, the model will invent plausible-sounding values rather than leaving the field empty.
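
One practical consequence: once a field can be None, code that formats it must allow for that. Applying a format such as {value:,} to None raises a TypeError, so a small helper is useful. The format_salary function below is a hypothetical helper for illustration, not part of instructor or Pydantic.

```python
from typing import Optional

def format_salary(salary_min: Optional[int], salary_max: Optional[int]) -> str:
    """Render a salary range, tolerating missing bounds."""
    if salary_min is None and salary_max is None:
        return "not specified"
    if salary_min is None:
        return f"up to ${salary_max:,}"
    if salary_max is None:
        return f"from ${salary_min:,}"
    return f"${salary_min:,} - ${salary_max:,}"

print(format_salary(140000, 175000))  # $140,000 - $175,000
print(format_salary(None, None))      # not specified
print(format_salary(90000, None))     # from $90,000
```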

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $222.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $222.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the library version). The model then has the chance to correct itself. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
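                # Compare against None so a legitimate 0 is not written as blank
                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                "" if j.required_years is None else j.required_years,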
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures tend to be uncommon, so the extra cost is usually small relative to the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Use response_model=Iterable[YourModel] for streaming lists, or Partial[YourModel] for streaming partial updates to a single model, and pass stream=True on the call. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not: the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting) to request the schema, so the response arrives as structured tool arguments rather than free text. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant: JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Further Reading: For more details, see the Python Input and Output tutorial.


Frequently Asked Questions

How do I read a text file in Python?

Use open('file.txt', 'r') with a with statement: with open('file.txt') as f: content = f.read(). This reads the entire file and automatically closes it. Use f.readlines() to get a list of lines instead.

What is the difference between read(), readline(), and readlines()?

read() returns the entire file as a single string. readline() reads one line at a time. readlines() returns a list of all lines. For large files, iterating with for line in f: is the most memory-efficient approach.
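To make the difference concrete, here is a small sketch that reads the same file four ways. The file is a throwaway temporary file created only for this illustration, and its contents are made up.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    path = tmp.name

with open(path, encoding="utf-8") as f:
    whole = f.read()        # the entire file as one string

with open(path, encoding="utf-8") as f:
    first = f.readline()    # only the first line, newline included

with open(path, encoding="utf-8") as f:
    lines = f.readlines()   # a list of all lines, newlines included

with open(path, encoding="utf-8") as f:
    # Iterating over the file object reads one line at a time
    stripped = [line.rstrip("\n") for line in f]

os.remove(path)
print(repr(whole), repr(first), lines, stripped)
```

Note that readline() and readlines() keep the trailing newline on each line, so you usually strip it before using the value.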

How do I write to a file in Python?

Use open('file.txt', 'w') to write (overwrites existing content) or 'a' to append. Write with f.write('text') or f.writelines(list_of_strings). Always use a with statement to ensure the file is properly closed.

What encoding should I use when reading text files?

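Use encoding='utf-8' for most modern text files. UTF-8 handles international characters and is the usual default on Linux and macOS. On Windows, Python may fall back to a locale encoding such as cp1252, so pass encoding explicitly rather than relying on the default. For legacy files, you may need 'latin-1' or 'cp1252'.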

How do I handle file not found errors in Python?

Use a try/except block catching FileNotFoundError. Alternatively, check if the file exists first with pathlib.Path('file.txt').exists() before attempting to read it.
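A minimal sketch of the try/except approach follows. The helper name read_text_or_default and the file name settings.txt are made up for this illustration; the file lives in a temporary directory so the example cleans up after itself.

```python
import tempfile
from pathlib import Path

def read_text_or_default(path, default=""):
    """Return the file's contents, or `default` if the file does not exist."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return default

with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "settings.txt"   # hypothetical file name
    before = read_text_or_default(target, default="(no file)")
    target.write_text("hello", encoding="utf-8")
    after = read_text_or_default(target, default="(no file)")

print(before)
print(after)
```

Catching the exception is generally safer than checking exists() first, because the file could be deleted between the check and the open() call.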

Storing settings data in Config File in Python

by Pubs | Data Storage, Intermediate

Intermediate

A config file is a flat file used to read and write settings that affect the behaviour of your application. Keeping individual settings in a human-editable file, and having your application read them at run time, lets you configure the application the way you need without changing its code.

Typically the user edits the config file in a simple text editor, then the application runs and reads it. If the config file changes, the application normally has to be restarted to pick up the new settings (depending on how the code is written).

Some of the considerations for using a config file as a “data store” include:

Setup: Files need no setup. Consider using one of the available Python config-management libraries to make config files easier to manipulate.
Volume: Small-ish file size (< 5-10 MB)
Record access: You do not need to search within the file to extract a portion of the records. You load or save all the data in the file in one go
Data writes: Applications don’t generally write to a config file, although they can. Instead, the config file is edited externally in a text editor
Data formats: Normally the data is structured and record-based (such as comma-separated values (CSV) or tab-delimited), or uses a more complex structure such as Windows-style .INI files or JSON
Editability: You generally want to allow direct editing of the file by users
Redundancy: There’s no built-in redundancy. If anything fails (the data is corrupted, or the server holding the file fails), you’re out of luck. You need to set up your own mechanisms (e.g. automatically replicating the file to another server)

Code examples to read and write a config file using configparser

Setting up a config file is not much harder than creating constants inside your application. Your main decision is which configuration file format to use, as there are quite a few to choose from. Here are some options and samples:

File type	Python library	Example config file
1. Simple tab-delimited text file	None	records_per_page 10 logo_icon /images/company_log.jpg
2. Properties file with key-value pairs	None	#webpage display records_per_page = 10 logo_icon = /images/company_log.jpg
3. INI file format	configparser	[database] #database related configuration files port = 22 forward = no name = db_test
4. JSON file format	json	{ "records_per_page": 10, "logo_icon": "/images/company_log.jpg" }

Example 1: Simple text file which is tab-delimited

You can see a full article on how to read a text file in our “Storing Data in Files in Python” article. The short version of opening a tab-delimited file is as follows:

Suppose you have a configuration file like the following, where each row has two fields separated by a tab:

config_data.txt

records_per_page    10
logo_icon   /images/company_log.jpg

You can load the data into a python dictionary like the following:

config = {}
# The with statement closes the file automatically, even if an error occurs
with open('config_data.txt', 'r') as file_handler:
    for rec in file_handler:
        config.update( [ tuple( rec.strip().split('\t') ) ] )
print(config)

The output will be as follows:

{'records_per_page': '10', 'logo_icon': '/images/company_log.jpg'}

The code deserves some explanation. First, the for loop reads the file line by line: on each iteration it reads one line into the variable rec, until the whole file has been read.

The following line is a little tricky, but the intent is to take the two columns of the tab-delimited file and create a dictionary key-value pair.

config.update( [ tuple( rec.strip().split('\t') ) ] )

It works as follows:

It first removes the newline character from the end of the line (through rec.strip())
The resulting string is then split on the tab character (denoted by '\t') using split()
The result is a two-field list, which is converted into a tuple
The tuple is then wrapped in a list using the [] brackets
Finally, the dictionary .update() method adds the key-value pair
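The same steps can be written out one at a time, which may be easier to follow than the one-liner. The sample line below is hard-coded for illustration rather than read from a file.

```python
rec = "records_per_page\t10\n"   # one raw line, as it would be read from the file

stripped = rec.strip()           # 'records_per_page\t10' (newline removed)
fields = stripped.split("\t")    # ['records_per_page', '10']
pair = tuple(fields)             # ('records_per_page', '10')

config = {}
config.update([pair])            # update() accepts a list of (key, value) tuples
print(config)

# An equivalent and arguably more readable form uses tuple unpacking:
key, value = rec.strip().split("\t")
config2 = {key: value}
print(config2)
```

Both forms produce the same dictionary. The unpacking version also fails loudly if a line does not contain exactly two fields.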

Example 2: A properties file with key value pair

If you have fairly simple configuration needs with just key-value pairs, then a properties-style file will work for you, where each line has the form <config name> = <config value>. It can be loaded as a text file and the key-value pairs read into a dictionary.

Imagine this was the config file: config_data.txt

#webpage display
records_per_page =10
logo_icon =/images/company_log.jpg

The following code could easily load this configuration:

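config = {}
with open('config_data.txt', 'r') as file_handler:
    for rec in file_handler:
        rec = rec.strip()
        # Skip blank lines and comment lines
        if not rec or rec.startswith('#'):
            continue
        # Split on the first '=' only, so values may themselves contain '='
        key, value = rec.split('=', 1)
        # Strip surrounding spaces so 'records_per_page =10' gives a clean key
        config[key.strip()] = value.strip()
print(config)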

Here the code ignores comment lines (lines that start with '#') and splits each remaining line on the '=' sign. The resulting key and value are stored in the dictionary config. Note that every value is loaded as a string.
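Because values read from a text file are strings, settings such as records_per_page usually need converting before use. The following is a minimal sketch of one way to do that. The CONVERTERS mapping is a hypothetical helper introduced for this illustration, and the raw dictionary stands in for the result of parsing the file.

```python
# Stand-in for the dictionary produced by parsing the properties file
raw = {"records_per_page": "10", "logo_icon": "/images/company_log.jpg"}

# Hypothetical mapping from setting name to the type the application expects.
# Settings not listed here are left as strings.
CONVERTERS = {"records_per_page": int}

config = {key: CONVERTERS.get(key, str)(value) for key, value in raw.items()}
print(config)
```

A bad value (for example records_per_page = ten) raises ValueError at load time, which is usually preferable to failing later in the application.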

Example 3: INI file format using configparser

You can see a full article on how the configparser library works in our earlier article. The short version is as follows.

Suppose you have a configuration file as follows:

test.ini

[DEFAULT]
name = development
host = 192.168.1.1
port = 31
username = admin
password = admin

[database]
name = production
host = 144.101.1.1

You can then read the file with the following simple code:

import configparser

config = configparser.ConfigParser()

# Read and parse the INI file
config.read('test.ini')
print( config['database']['name'] ) # This will output 'production'
print( config['database']['port'] ) # This will output '31'. As there is no port under
                                    # database, the default value will be used
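The fallback behaviour can be tried without creating a file by parsing the INI text from a string. This is a sketch under a few assumptions: the INI content is a shortened version of the example above, and the timeout key is hypothetical, used only to show an explicit fallback. Note that configparser only treats a section as the source of defaults when it is named DEFAULT in upper case, since section names are case-sensitive.

```python
import configparser

INI_TEXT = """
[DEFAULT]
port = 31
username = admin

[database]
name = production
host = 144.101.1.1
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)   # same parsing as config.read(), but from a string

name = config["database"]["name"]                           # defined in [database]
port = config["database"].getint("port")                    # inherited from [DEFAULT], converted to int
timeout = config["database"].get("timeout", fallback="30")  # hypothetical key, explicit fallback

print(config.sections())   # DEFAULT is not listed as a regular section
print(name, port, timeout)
```

The getint() method saves a manual int() conversion, since configparser otherwise returns every value as a string.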

Example 4: Reading Config values from a JSON file

With JSON being so popular, it is another option for keeping all your config data. It is also very easy to load.

Assume your config file is as follows: config_data.txt

{
  "records_per_page":10,
  "logo_icon": "/images/company_log.jpg"
}

Then the following code can be used to bring these into a dictionary:

import json
with open('config_data.txt', 'r') as file_handler:
    config = json.load(file_handler)  # parse the JSON document into a dictionary
print(config)

Where the output would be:

{'records_per_page': 10, 'logo_icon': '/images/company_log.jpg'}
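Although config files are mostly edited by hand, an application can also write settings back. The sketch below changes one value and saves the dictionary as JSON, then reloads it to confirm the round trip. The file name config_data.json and the temporary directory are assumptions made so the example does not touch real files.

```python
import json
import os
import tempfile

config = {"records_per_page": 10, "logo_icon": "/images/company_log.jpg"}
config["records_per_page"] = 25   # change a setting in memory

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "config_data.json")  # hypothetical file name

    # indent=2 keeps the file readable for people editing it by hand
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)

    with open(path, encoding="utf-8") as f:
        reloaded = json.load(f)

print(reloaded)
```

Unlike the text-based formats above, JSON preserves the difference between numbers and strings, so records_per_page comes back as an integer.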

Summary

A config file is a great option if you are looking to store settings for your applications. The settings are usually read at the start of the application and loaded into a dictionary, which then serves as a set of constants your application can use. This both avoids the need to hardcode settings and lets you change the behaviour of your application without touching the code.

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models such as Anthropic’s Claude and local models served by Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.
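To illustrate why hand-rolled parsing is fragile, here is a sketch of the kind of fallback parser people end up writing. It does not use instructor or call any model; the three reply strings are made-up examples of what a model might return, and parse_json_reply is a hypothetical helper.

```python
import json
import re

def parse_json_reply(reply: str) -> dict:
    """Best-effort extraction of a JSON object from a free-form model reply."""
    # Take everything from the first '{' to the last '}' and hope it is valid JSON.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))

fence = "`" * 3  # three backticks, as used for markdown code fences

clean = '{"name": "Alice", "age": 32}'
fenced = f"Sure! Here is the extracted data:\n{fence}json\n{clean}\n{fence}"
wrong_type = '{"name": "Alice", "age": "thirty-two"}'

print(parse_json_reply(clean))
print(parse_json_reply(fenced))
# This one parses without error, but age is a string rather than an integer:
print(parse_json_reply(wrong_type))
```

The regex copes with fences and preambles, but nothing here checks field names or types, so the third reply slips through silently. That gap is what schema validation is meant to close.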

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This gives you the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
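# The salary fields are Optional, so guard against None before formatting
if job.salary_min is not None and job.salary_max is not None:
    print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")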
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields should come back as None instead of hallucinated values. Always use Optional for fields that may not appear in the input. Without it, the model is pushed to invent plausible-sounding values rather than leave the field empty.

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $177.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $177.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message, along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the library version). The model then has a chance to self-correct. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
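The feedback loop can be sketched in plain Python to show the idea. This is not instructor’s actual implementation: validate_rating stands in for a Pydantic validator, fake_model stands in for the LLM call, and max_attempts here counts total attempts. All of these names are hypothetical.

```python
from typing import Callable, List

def validate_rating(data: dict) -> dict:
    """Stand-in for a Pydantic validator: raise ValueError on bad data."""
    rating = data.get("rating")
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError(f"Rating must be between 1 and 5, got {rating!r}")
    return data

def extract_with_retries(ask_model: Callable[[List[dict]], dict],
                         messages: List[dict], max_attempts: int = 3) -> dict:
    """Call the model, validate, and feed errors back until valid or out of attempts."""
    conversation = list(messages)
    last_error = None
    for _ in range(max_attempts):
        candidate = ask_model(conversation)
        try:
            return validate_rating(candidate)
        except ValueError as exc:
            last_error = exc
            # Append the validation error so the next attempt can correct it.
            conversation.append({"role": "user", "content": f"Validation failed: {exc}"})
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error

def fake_model(conversation: List[dict]) -> dict:
    """Fake LLM: wrong on the first call, correct once it has seen feedback."""
    saw_feedback = any("Validation failed" in m["content"] for m in conversation)
    return {"rating": 4} if saw_feedback else {"rating": 6}

result = extract_with_retries(fake_model, [{"role": "user", "content": "Extract the rating"}])
print(result)
```

The important design point is that the error text goes back into the conversation, so each new attempt is informed by the previous failure rather than being a blind retry.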

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
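                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                # Compare against None so a legitimate 0 is not written as blank
                "" if j.required_years is None else j.required_years,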
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern — extract once, validate automatically, retry on failure — stays the same regardless of the scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database, keeping the extraction logic identical.
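As a rough sketch of the generator idea, the code below streams postings one at a time instead of holding them all in a list. The helper names iter_postings and extract_stream are hypothetical, and fake_extract is a stand-in for the instructor call so that the example runs without an API key.

```python
from typing import Callable, Iterable, Iterator

def iter_postings(lines: Iterable[str]) -> Iterator[str]:
    """Yield one non-empty posting at a time instead of building a list."""
    for line in lines:
        text = line.strip()
        if text:
            yield text

def extract_stream(texts: Iterable[str],
                   extract_one: Callable[[str], dict]) -> Iterator[dict]:
    """Apply the per-posting extraction lazily, one item at a time."""
    for text in texts:
        yield extract_one(text)

def fake_extract(text: str) -> dict:
    """Stand-in for the LLM call: split 'Title at Company' on ' at '."""
    title, _, company = text.partition(" at ")
    return {"title": title, "company": company}

# Any iterable works as a source: a list, an open file, or a queue consumer.
source = ["Senior Python Engineer at Nexaflow\n", "\n", "Staff ML Engineer at Quantra\n"]
results = list(extract_stream(iter_postings(source), fake_extract))
print(results)
```

Because both functions are generators, nothing is extracted until the results are consumed, which keeps memory use flat however many postings the source produces.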

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures tend to be uncommon, so the extra cost is usually small relative to the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Use response_model=Iterable[YourModel] for streaming lists, or Partial[YourModel] for streaming partial updates to a single model, and pass stream=True on the call. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
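
A minimal chunking sketch, assuming word counts are an acceptable stand-in for tokens (they are only an approximation). The function name chunk_words and the default sizes are illustrative choices:

```python
def chunk_words(text: str, max_words: int = 800, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks that overlap slightly."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

document = " ".join(f"word{i}" for i in range(2000))
chunks = chunk_words(document, max_words=800, overlap=50)
print(len(chunks))
```

The overlap means an item that straddles a chunk boundary still appears whole in one of the chunks, at the cost of possibly extracting it twice, so results may need de-duplicating afterwards.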

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not — the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling (not prompting) to enforce the schema, so the model cannot deviate from the structure. It also runs Pydantic validation on the result and retries if types or constraints are violated. The difference in reliability for production use is significant — JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.
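
To make the failure modes concrete, here is a small standard-library sketch. The three replies are made-up samples of what a model might return for the same "return JSON" prompt:

```python
import json

FENCE = "`" * 3  # a markdown code fence

replies = [
    '{"name": "Alice", "age": 32}',
    f'{FENCE}json\n{{"name": "Alice", "age": 32}}\n{FENCE}',
    'Sure! Here is the extracted data: {"name": "Alice", "age": 32}',
]

def try_parse(reply: str):
    """Return the parsed dict, or None when the reply is not pure JSON."""
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return None

results = [try_parse(reply) for reply in replies]
for reply, parsed in zip(replies, results):
    print("ok  " if parsed is not None else "FAIL", repr(reply[:30]))
```

Only the first reply parses. You can patch each case with string stripping, but every new wording the model invents needs another patch.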

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Configs are constants you change without redeploying.

Further Reading: For more details, see the Python configparser documentation.

Frequently Asked Questions

What is the best way to store settings in Python?

For simple key-value settings, use INI files with ConfigParser. For nested data, use JSON or TOML. For environment-specific settings, use .env files with python-dotenv. The best choice depends on your complexity needs and whether non-developers will edit the settings.

How do I create a config file in Python?

Use ConfigParser to create INI files: instantiate the parser, add sections and key-value pairs with config['section'] = {'key': 'value'}, then write the file with config.write(f) inside a with open('config.ini', 'w') as f: block so the file is closed properly. For JSON, use json.dump().
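
A short sketch of those steps using only the standard library. The section and key names are examples, and the file is written to a temporary directory so nothing is left behind:

```python
import configparser
import tempfile
from pathlib import Path

config = configparser.ConfigParser()
config["database"] = {"host": "localhost", "port": "5432"}
config["app"] = {"debug": "false"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.ini"
    with path.open("w") as f:  # the with block closes the file for us
        config.write(f)

    # Read the file back and convert values to the types we need
    loaded = configparser.ConfigParser()
    loaded.read(path)
    host = loaded["database"]["host"]
    port = loaded.getint("database", "port")
    debug = loaded.getboolean("app", "debug")

print(host, port, debug)
```

INI values are always stored as strings, which is why the typed getters getint and getboolean are used when reading.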

Should I use environment variables or config files?

Use environment variables for sensitive data (API keys, passwords) and deployment-specific settings. Use config files for application-level settings that rarely change. Many projects combine both: a config file for defaults and environment variables for overrides and secrets.
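
One possible way to combine the two, sketched with the standard library. The DB_ prefix, the variable names, and the load_settings helper are illustrative assumptions:

```python
import configparser
import os

DEFAULTS = """
[database]
host = localhost
port = 5432
"""

def load_settings(environ=os.environ) -> dict:
    """Read defaults from INI text, then let environment variables override them."""
    parser = configparser.ConfigParser()
    parser.read_string(DEFAULTS)
    settings = dict(parser["database"])
    for key in settings:
        env_name = f"DB_{key.upper()}"  # e.g. host -> DB_HOST
        if env_name in environ:
            settings[key] = environ[env_name]
    return settings

# Pass a plain dict instead of os.environ to try it out
print(load_settings({"DB_HOST": "db.example.com"}))
```

Secrets such as passwords would have no default in the file at all and would be read from the environment only.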

How do I prevent config files from being committed to Git?

Add your config file names to .gitignore (e.g., config.ini, .env). Provide a config.example.ini template in the repository so other developers know what settings are needed without exposing actual values.

Can I use YAML for Python configuration files?

Yes. Install PyYAML with pip install pyyaml and use yaml.safe_load() to read YAML files. YAML supports nested structures, lists, and comments, making it more expressive than INI. However, it is not part of Python’s standard library.

Better organization of your projects with python imports

Beginner

Importing modules or packages (other languages call them libraries) is a fundamental part of Python and a large part of what makes the language so useful. As of this writing, the most popular Python package index, pypi.org, has over 300k packages to import. Imports are not only for external packages. They also become essential once your own project grows large: you need to split your code into manageable, logical chunks that can talk to each other. That is what this article is about.

What’s the difference between a python package vs module

First, some terminology. A module is a single Python file (with a .py extension) containing code that you can import. A package is a collection of modules: in your project, it is a directory of Python files that also contains an __init__.py file to signal that the directory is a package.

What happens when you import a python module

You do not need to do anything special to make a module: every Python file is a module by default and can be imported. When a file is imported, all of its top-level code is executed, so any statements outside functions and classes will run.

See the following example, in which main_file.py imports two modules, module1 and module2.

The code is as follows:

#module1.py
print("module1: I'm in module 1 root section")

def output_hw():
	print("module1: Hello world - output_hw 1")

#module2.py
import module1
print("module2: I'm in root section of module 2")

def output_hw():
	print("module2: Hello world - output_hw 2")

#main_file.py
print("main_file: starting code")
import module1
import module2

print("main_file: I'm in the root section ")

if __name__ == '__main__':
	print("main_file: ******* starting __main__ section")
	module1.output_hw()
	module2.output_hw()
	print("main_file: Main file done!")

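Output:

main_file: starting code
module1: I'm in module 1 root section
module2: I'm in root section of module 2
main_file: I'm in the root section
main_file: ******* starting __main__ section
module1: Hello world - output_hw 1
module2: Hello world - output_hw 2
main_file: Main file done!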

So what’s happening here:

main_file.py is executed first and imports module1, then module2.
As part of importing module1, Python executes all of its code, including the print statements in the root part of the module. The same happens for module2.
Control then returns to main_file, where it calls the functions in module1 and module2.
Note that both module1 and module2 define a function named output_hw(). This is perfectly fine, because each function lives in its own module's namespace.

One additional point: module2 also imports module1. However, the root-level statement print("module1: I'm in module 1 root section") did not run a second time. Why? Python executes a given module only once per process; later imports reuse the module that is already loaded.
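
The caching can be observed directly through sys.modules. The sketch below writes a throwaway module (the name demo_once is made up for this illustration) to a temporary directory, imports it twice, and counts how often its top-level print ran:

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_once.py").write_text('print("demo_once: root section")\n')
    sys.path.insert(0, tmp)
    sys.modules.pop("demo_once", None)  # start from a clean slate
    importlib.invalidate_caches()
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            import demo_once as first   # executes the module's top-level code
            import demo_once as second  # reuses the entry in sys.modules
    finally:
        sys.path.remove(tmp)

run_count = captured.getvalue().count("root section")
print(run_count, first is second, "demo_once" in sys.modules)
```

Both names refer to the same module object, and the print inside the module ran only once.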

Now let’s make a slight change: remove the references to module1 from main_file, and keep the import of module1 in module2.

The updated code looks like this:

#module1.py
print("module1: I'm in module 1 root section")

def output_hw():
	print("module1: Hello world - output_hw 1")

#module2.py
import module1
print("module2: I'm in root section of module 2")

def output_hw():
	print("module2: Hello world - output_hw 2")

#main_file.py
print("main_file: starting code")
# import module1
import module2


print("main_file: I'm in the root section ")

if __name__ == '__main__':
	print("main_file: ******* starting __main__ section")
	# module1.output_hw()
	module2.output_hw()
	print("main_file: Main file done!")

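Output:

main_file: starting code
module1: I'm in module 1 root section
module2: I'm in root section of module 2
main_file: I'm in the root section
main_file: ******* starting __main__ section
module2: Hello world - output_hw 2
main_file: Main file done!

Notice that module1 is still imported and executed, this time from module2. The first line after “main_file: starting code” is “module1: I’m in module 1 root section”, because the very first line of module2 imports module1.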

How do you make a package in your python project

Creating a package is fairly straightforward: move your module files into a directory and add a file called __init__.py to it.

This means your directory structure looks like this:

/
├── main_file.py
└── package1/
    ├── __init__.py
    ├── module1.py
    └── module2.py

The above example would now look like the following:

#__init__.py
import package1.module1
import package1.module2

#module1.py
print("module1: I'm in module 1 root section")

def output_hw():
	print("module1: Hello world - output_hw 1")

#module2.py
import package1.module1
print("module2: I'm in root section of module 2")

def output_hw():
	print("module2: Hello world - output_hw 2")

#main_file.py
print("main_file: starting code") 
import package1

print("main_file: I'm in the root section ")

if __name__ == '__main__':
	print("main_file: ******* starting __main__ section")
	package1.module1.output_hw()
	package1.module2.output_hw() 
	print("main_file: Main file done!")

So the __init__.py file imports module1 and module2. This matters because when main_file imports package1, it then has immediate access to module1 and module2. This is why package1.module1 and package1.module2 work.

Python does not pull a package's modules in automatically, and generally you would not want it to: importing them explicitly lets you avoid name clashes.

Can you avoid typing the “package1” prefix each time? Yes, by using the “from” keyword. See the next section.

Only Import a part of a module

You can also import just a class or a function from a module if you want to limit which names are accessible in your local code. The whole module is still executed, though, so this is mainly a way to make your code more readable. See the following example:

#module1.py
print("module1: I'm in module 1 root section")

def output_hw():
	print("module1: Hello world - output_hw 1")

#main_file.py
print("main_file: starting code") 
from module1 import output_hw

print("main_file: I'm in the root section ")

if __name__ == '__main__':
	print("main_file: ******* starting __main__ section")
	output_hw() 
	print("main_file: Main file done!")

Output:

main_file: starting code
module1: I'm in module 1 root section
main_file: I'm in the root section
main_file: ******* starting __main__ section
module1: Hello world - output_hw 1
main_file: Main file done!

As the output shows, although only the output_hw() function is imported, the statement “module1: I'm in module 1 root section” was still executed.

Note also that you do not need the module prefix in the code; you can refer to the function directly.

Going back to the package example, instead of the following:

import package1.module1

you can use the “from” keyword with a relative import, which looks in the current package:

from .module1 import *

There are a few things going on here. The '.' in front of module1 refers to the current package (the directory containing the importing file). To reach the parent package you can use two dots, so the line looks like this: from ..module1 import *. Relative imports like these only work inside a package. The second point is that import * brings every public name from the module into the current namespace.

Importing a module and applying an alias

If you want to make your code easier to read, or to avoid name clashes (recall from the start of the article that module1 and module2 both define a function named output_hw()), you can use the “as” keyword in the import statement to give an alternative name.

You can do the following:

#main_file.py
print("main_file: starting code") 
from module1 import output_hw as module1__output_hw

print("main_file: I'm in the root section ")

if __name__ == '__main__':
	print("main_file: ******* starting __main__ section")
	module1__output_hw() 
	print("main_file: Main file done!")

This can also be done with a module or package name, i.e.

import module1 as mod1

Importing modules outside your project folder

By default, modules can be imported from the directory containing the main script and from its sub-directories. So the following works:

/
├── main_file.py
├── package1/
│   ├── __init__.py
│   ├── module1.py
│   └── module2.py
└── package2/
    ├── __init__.py
    └── pkg2_mod_a.py

Then in module1, you can import from pkg2_mod_a with the following:

#module1.py
from package2.pkg2_mod_a import get_main_list

def output_hw():
	print("module1: List from pkg2 module A:" + str( get_main_list()) )

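Just remember that package2 should have an __init__.py file. It can be empty here, because from package2.pkg2_mod_a import ... loads the submodule directly; importing pkg2_mod_a inside package2/__init__.py is only needed if you want package2.pkg2_mod_a to be available after a plain import package2.

However, what if the code lives outside the directory of your main script? Suppose you had the following directory structure:

/
├── server_key.py
└── r1/
    ├── main_file.py
    └── package1/
        ├── __init__.py
        └── module1.py

If you tried to import server_key.py with a relative import from any file in the /r1/ project, you would get the error: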

ValueError: attempted relative import beyond top-level package

To resolve this, you can tell Python where to look. Python keeps the list of directories it searches for modules in sys.path. Hence, the solution is to add an entry for the parent directory, namely:

import sys
sys.path.append("..")

So the full code looks like the following:

#main_file.py
import sys
sys.path.append("..")
print("main_file: starting code")  
import package1

print("main_file: I'm in the root section ")

if __name__ == '__main__':
	print("main_file: ******* starting __main__ section")
	package1.module1.output_hw() 
	print("main_file: Main file done!")

#module1.py
from package2.pkg2_mod_a import get_main_list
from server_key import get_server_master_key

def output_hw():
	print("module1: List from pkg2 module A:" + str( get_main_list()) )
	print("module1: server key :" + get_server_master_key() )

#server_key.py
def get_server_master_key():
	return "AA33FF1255"

Output – The output is as follows:
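
One caveat with sys.path.append(".."): a relative entry is resolved against the current working directory, not against the script's location, so it only works when the script is started from inside /r1/. A sketch of a more robust variant follows; the helper name add_parent_to_path is made up for this example:

```python
import sys
from pathlib import Path

def add_parent_to_path(script_file: str) -> str:
    """Append the directory above the script's folder to sys.path (once)."""
    parent = str(Path(script_file).resolve().parent.parent)
    if parent not in sys.path:
        sys.path.append(parent)
    return parent

# In main_file.py you would call: add_parent_to_path(__file__)
added = add_parent_to_path("/r1/main_file.py")
print(added)
```

Because the path is derived from the script file itself, the import of server_key works regardless of which directory the script is launched from.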

How to import modules dynamically

All of the above applies when you know the exact name of the module to import. However, what if you don’t know the module name until runtime?

This is where you can use the __import__ and getattr functions.

First, getattr(). This function looks up an attribute of an object by name, where the name is given as a string; you can also supply a default to return if the attribute does not exist.

Second, __import__() imports a module whose name is given as a string.

When you combine the two, you first load the module with __import__ and then use getattr to fetch the function or class you want from it.

See the following example:

/r1/
  └── main_file.py
  └── package1/
      ├── __init__.py
      └──  module1.py

With the following code:

#module1.py

def output_hw():
	print("module1: take me to a funky town")

#main_file.py
if __name__ == '__main__':
	print("main_file: ******* starting __main__ section")
	
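	# A non-empty fromlist makes __import__ return the submodule itself
	# rather than the top-level package1 package
	module = __import__('package1.module1', fromlist=['output_hw'])
	func = getattr(module, 'output_hw', None)
	if func:
		func()
	print("main_file: Main file done!")

In the above code, we first load the module called “package1.module1”. The fromlist argument matters: without it, __import__ returns the top-level package (package1) rather than module1, and the getattr lookup would find nothing. Then getattr is called on the module with the function name passed as a string. You can also pass in a class name if you wish.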
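
The standard library also offers importlib.import_module, which the Python documentation recommends over calling __import__ directly. A minimal sketch; the helper name load_callable is made up, and standard-library names are used so it runs anywhere:

```python
import importlib

def load_callable(module_name: str, attr_name: str):
    """Import module_name at runtime and return the attribute attr_name, or None."""
    module = importlib.import_module(module_name)  # returns the submodule itself
    return getattr(module, attr_name, None)

join = load_callable("os.path", "join")
print(join("r1", "package1"))

# By contrast, __import__ without a fromlist returns the top-level package
top = __import__("os.path")
print(top.__name__)  # os
```

In the article's layout, the equivalent call would be load_callable("package1.module1", "output_hw").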

Conclusion

There are many ways to import files and to organize your projects into smaller chunks. The most difficult part is deciding which parts of your code go where.

How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models such as Anthropic's Claude and local models served by Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.
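
The validate-and-retry idea can be sketched without any LLM at all. The code below is not instructor's implementation, only an illustration of the loop: fake_model stands in for the API call, and every function name here is hypothetical.

```python
import json

def extract_with_retries(call_model, validate, prompt, max_retries=3):
    """Call the model, validate the reply, and feed errors back until it passes."""
    messages = [prompt]
    for attempt in range(1, max_retries + 1):
        reply = call_model(messages)
        try:
            return validate(json.loads(reply)), attempt
        except (ValueError, KeyError) as exc:
            # Append the error so the next attempt can correct itself
            messages.append(f"Validation failed: {exc}. Please fix and try again.")
    raise RuntimeError(f"no valid output after {max_retries} attempts")

def validate_person(data: dict) -> dict:
    """Require an integer age; int('thirty-two') raises ValueError."""
    return {"name": str(data["name"]), "age": int(data["age"])}

def fake_model(messages):
    """Stand-in for an LLM: wrong type first, corrected after feedback."""
    if len(messages) == 1:
        return '{"name": "Alice", "age": "thirty-two"}'
    return '{"name": "Alice", "age": 32}'

person, attempts = extract_with_retries(fake_model, validate_person, "Alice is 32.")
print(person, attempts)
```

The first reply fails validation, the error message is added to the conversation, and the second reply passes, so the call succeeds on attempt 2.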

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. This makes it the same interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
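if job.salary_min is not None and job.salary_max is not None:
    print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
else:
    print("Salary: not listed")  # the Optional salary fields may be None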
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields will be None instead of hallucinated values. Always use Optional for fields that may not appear in the input — without it, the model will invent plausible-sounding values rather than leaving the field empty.

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $222.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $222.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.
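
Type validation does not check arithmetic, so it can be worth adding a consistency check between the line items and the total. The sketch below uses a plain dataclass as a stand-in for the Pydantic LineItem above; the helper total_matches is a hypothetical name. In a Pydantic model, a check like this would typically live in a model-level validator.

```python
import math
from dataclasses import dataclass

@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float

def total_matches(items: list[LineItem], stated_total: float) -> bool:
    """Return True when the line items add up to the stated total (to the cent)."""
    computed = sum(item.quantity * item.unit_price for item in items)
    return math.isclose(computed, stated_total, abs_tol=0.005)

items = [
    LineItem("Server instances", 3, 45.00),
    LineItem("SSL Certificate", 1, 12.00),
    LineItem("Domain registrations", 2, 15.00),
]
print(total_matches(items, 222.00))  # False: the items sum to 177.00
print(total_matches(items, 177.00))  # True
```

Applied to the sample invoice above, the check would flag a mismatch: the three line items sum to $177.00, while the stated total is $222.00. The extraction itself is faithful to the text, which is exactly why a separate consistency rule is useful.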

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message: “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” The model then self-corrects. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
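                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                # "is None" keeps a legitimate 0 (e.g. "0-2 years") in the CSV
                "" if j.required_years is None else j.required_years,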
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern (extract once, validate automatically, retry on failure) stays the same regardless of scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database. The extraction call itself stays identical; only the progress message needs a change, because it calls len(texts) and a generator has no length.
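As a rough illustration of the generator idea, here is a minimal sketch that streams postings from a text file instead of a hard-coded list. The function name iter_postings and the file layout (postings separated by blank lines) are assumptions made for this example, not part of the pipeline above. It uses only the standard library:

```python
# postings_generator.py
from typing import Iterator

def iter_postings(path: str) -> Iterator[str]:
    """Yield one posting at a time from a text file, split on blank lines."""
    buffer = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                buffer.append(line.rstrip("\n"))
            elif buffer:
                # A blank line closes the current posting
                yield "\n".join(buffer)
                buffer = []
    if buffer:
        # The last posting may not be followed by a blank line
        yield "\n".join(buffer)
```

Because the generator yields one posting at a time, memory use stays flat no matter how large the file is. A loop such as for text in iter_postings("postings.txt") can feed each posting to the same extraction call used in extract_jobs.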

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.
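The "handle exceptions in your code" option can be sketched as follows. This is an illustration only: fake_extract is a hypothetical stand-in for a client call made with max_retries=1, and the broad except Exception should be narrowed to the library's own exception type in real code.

```python
# handle_failures.py
from typing import Callable, Iterable

def extract_all(texts: Iterable[str], extract_one: Callable[[str], dict]):
    """Run extract_one on each text and collect failures instead of stopping."""
    results, failures = [], []
    for text in texts:
        try:
            results.append(extract_one(text))
        except Exception as exc:  # narrow this in real code
            failures.append((text, str(exc)))
    return results, failures

def fake_extract(text: str) -> dict:
    """Hypothetical stand-in for an extraction call that may fail."""
    if "at" not in text.split():
        raise ValueError("could not find a company name")
    title, company = text.split(" at ", 1)
    return {"title": title, "company": company}

results, failures = extract_all(
    ["Senior Python Engineer at Nexaflow", "garbled text"], fake_extract
)
print(len(results), "extracted,", len(failures), "failed")
```

Keeping the failed inputs in a separate list lets you inspect them, fix the schema or prompt, and re-run only those items rather than paying for automatic retries on every call.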

Does instructor support streaming responses?

Yes. Use response_model=Iterable[YourModel] for streaming lists, or Partial[YourModel] for streaming partial updates to a single model, together with stream=True on the call. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not: the model adds markdown fences, writes a preamble sentence, or omits fields. By default instructor uses tool calling rather than a plain-text prompt to communicate the schema, which makes structural deviations far less likely. It also runs Pydantic validation on the result and retries if types or constraints are violated, so anything that still slips through is caught before it reaches your code. The difference in reliability for production use is significant: JSON prompting is fine for experiments, but instructor is the better fit for pipelines where data quality matters.

Is my data sent to OpenAI when I use instructor?

instructor is a thin wrapper around the OpenAI SDK — your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere — it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.

Further Reading: For more details, see the Python import system documentation.

Frequently Asked Questions

What is the difference between absolute and relative imports in Python?

Absolute imports use the full package path from the project root (e.g., from mypackage.module import func). Relative imports use dots to reference the current package (e.g., from .module import func). Absolute imports are generally preferred for clarity.

What does __init__.py do in a Python package?

The __init__.py file marks a directory as a Python package, allowing its modules to be imported. It can be empty or contain initialization code, define __all__ for controlling wildcard imports, or re-export symbols for a cleaner public API.

How do I fix ‘ModuleNotFoundError’ in Python?

Check that the module is installed (pip install), verify your PYTHONPATH includes the right directories, ensure __init__.py files exist in package directories, and confirm you are using the correct Python environment. Running from the project root often resolves path issues.
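A quick way to run the first of these checks from inside Python is sketched below. The helper name diagnose_import is made up for this example; importlib.util.find_spec and sys.executable are standard library features.

```python
# diagnose_import.py
import importlib.util
import sys

def diagnose_import(module_name: str) -> str:
    """Report whether the running interpreter can locate a top-level module."""
    # Use top-level names here: for dotted names, find_spec imports the parent
    # package first and raises ModuleNotFoundError if the parent is missing.
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return f"{module_name}: not found by {sys.executable}"
    return f"{module_name}: found at {spec.origin}"

print(diagnose_import("json"))
print(diagnose_import("no_such_module_xyz123"))
```

The interpreter path in the "not found" message is often the real clue: if it points to a different Python than the one you ran pip install with, the package is installed in the wrong environment.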

What is the best project structure for a Python application?

A common structure includes a top-level project directory containing a src/ folder with your package, a tests/ folder, setup.py or pyproject.toml, and a requirements.txt. This keeps source code, tests, and configuration clearly separated.

Should I use relative or absolute imports?

PEP 8 recommends absolute imports for most cases because they are more readable and less error-prone. Use relative imports only within a package when the internal structure is unlikely to change and the import path would be excessively long with absolute imports.

How To Install Selenium Web Driver For Python in Linux

by Pubs | Beginner, Web Scraping

Beginner

Selenium is a useful Python library for extracting web page data, especially from pages that load their content with JavaScript. Many of you may have tried to use Selenium but gotten stuck in the installation process. One key thing to remember is that Selenium runs an actual browser in the background (or foreground if you wish) to query a given website. So a key step is to install the driver if you haven’t done so already.

Step 1: Locate the right web driver

Since Selenium drives an actual browser, one of the first decisions you’ll need to make is which browser, and therefore which driver, to use. Generally it won’t matter, but the best browser to use is the one that works best for your target website. For example, if your target website works best under Firefox, then use that.

Browser	Supported OS	Maintained by	Download	Issue Tracker
Chromium/Chrome	Windows/macOS/Linux	Google	Downloads	Issues
Firefox	Windows/macOS/Linux	Mozilla	Downloads	Issues
Edge	Windows 10	Microsoft	Downloads	Issues
Internet Explorer	Windows	Selenium Project	Downloads	Issues
Opera	Windows/macOS/Linux	Opera	Downloads	Issues

So decide which one, and then go to the download page. For this example we will use Firefox. In the above table, the download link goes to this page: https://github.com/mozilla/geckodriver/releases

You can then click on the latest release:

You can then scroll down to the bottom of the page to see the driver list:

Right click on the .gz file, and then get the URL.

Step 2: Download the web driver

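Next, go to your Linux terminal and create a directory to store this file. The rest of this guide assumes the directory is named gdriver.

Next, go into that directory and use wget to download the file by pasting the link you copied above. The example uses the linux32 build; on a 64-bit system, copy the link to the linux64 build instead: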

wget https://github.com/mozilla/geckodriver/releases/download/v0.29.1/geckodriver-v0.29.1-linux32.tar.gz

Step 3: Extract the downloaded web driver

Next you should see the .gz file when you list the files:

You can then use gzip to decompress the file:

gzip -d geckodriver-v0.29.1-linux32.tar.gz

You can then finally untar the file to extract the driver:

tar -xvf geckodriver-v0.29.1-linux32.tar
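If you would rather do the decompress and untar steps from Python (for example inside a setup script), the standard library's tarfile module handles both in one call. The sketch below is an alternative to the gzip and tar commands, not a replacement for them; the function name extract_driver is an assumption for this example.

```python
# extract_driver.py
import tarfile
from pathlib import Path

def extract_driver(archive: str, dest: str) -> list:
    """Extract a .tar.gz archive into dest and return the names it contained."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    # "r:gz" reads a gzip-compressed tar, so no separate gzip step is needed
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        # Only extract archives you trust, such as the official release file
        tar.extractall(dest)
    return names
```

Calling extract_driver("geckodriver-v0.29.1-linux32.tar.gz", "gdriver") on the downloaded archive should leave the geckodriver file inside the gdriver directory.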

Step 4: Configure PATH

What you will be left with is a file called “geckodriver”. This is the driver file. You need to make it available through the PATH environment variable, because Selenium looks for the driver executable in the directories listed in PATH.

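I went to the parent directory and updated the PATH environment variable by taking the existing value ($PATH) and appending the full path of the gdriver folder. Using the full path rather than a relative one means the driver is still found after you change directories. Note that the change only lasts for the current shell session:

export PATH=$PATH:$(pwd)/gdriver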

If you do not do the above, you will get the error:

selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
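Before launching Selenium, you can check whether the driver is actually visible on PATH. The sketch below uses shutil.which from the standard library; the helper name find_driver and its extra_dir parameter are assumptions made for this example.

```python
# find_driver.py
import os
import shutil
from typing import Optional

def find_driver(name: str = "geckodriver",
                extra_dir: Optional[str] = None) -> Optional[str]:
    """Return the full path of the driver if it is on PATH, else None."""
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        # Append an absolute path so the lookup works from any directory
        search_path = os.pathsep.join([search_path, os.path.abspath(extra_dir)])
    return shutil.which(name, path=search_path)

location = find_driver()
print(location or "geckodriver is not on PATH yet")
```

If this prints a path, Selenium should be able to find the driver as well. If it prints the fallback message, revisit the export step above.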

Step 5: Test running the web driver

That’s it! Now if you run the following code, you should be able to load a web page with a Firefox browser running in the background:

# main.py
from selenium import webdriver
from selenium.webdriver import FirefoxOptions

opts = FirefoxOptions()
opts.add_argument("--headless")
browser = webdriver.Firefox(options=opts)


# URL of the page to scrape
URL = 'https://pythonhowtoprogram.com/'
# Load the page in the headless browser
browser.get(URL)

# Print the page title
print(browser.title)

# Close the browser so the Firefox process does not linger
browser.quit()

You will notice it takes a few seconds to run the first time. That is because a browser instance has to be loaded. Keep this in mind if you need faster performance, in which case you may want to use urllib or requests instead.
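For comparison, here is a minimal sketch of the browser-free route using only the standard library. It reads the page title the way the Selenium example does, but it only sees the HTML the server sends, so it will not work for content that JavaScript adds later. The names TitleParser, extract_title, and fetch_title are made up for this example.

```python
# title_without_browser.py
from html.parser import HTMLParser
from urllib.request import urlopen

class TitleParser(HTMLParser):
    """Collect the text of the first <title> element."""
    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Ignore later <title> tags (e.g. inside inline SVG)
        if tag == "title" and not self.title:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()

def fetch_title(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its title (needs network access)."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return extract_title(resp.read().decode(charset, errors="replace"))

print(extract_title("<html><head><title> Hello World </title></head></html>"))
```

A call such as fetch_title("https://pythonhowtoprogram.com/") skips the browser startup entirely, which is where most of the time saving comes from.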

Next Steps

Now that you know how to install a driver, you can work through the web scraping tutorials we have on offer. You can find them all in our web scraping section: https://pythonhowtoprogram.com/category/web-scraping/


How To Use Python instructor for Structured LLM Outputs

by Pubs | May 19, 2026 | APIs, Intermediate

Intermediate

You ask an LLM to extract a user’s name, age, and email from a paragraph of text. Sometimes it returns clean JSON. Sometimes it returns JSON wrapped in markdown fences. Sometimes it returns a paragraph explaining why it extracted those fields. If you have ever built a pipeline that breaks because the model decided today was a good day to add “Sure! Here is the extracted data:” before the JSON, you already understand why instructor exists.

The instructor library patches the OpenAI client (and any OpenAI-compatible API) to force the model to return a fully validated Pydantic model — every time. When validation fails, it retries automatically. You define exactly what fields you need, with their types and constraints, and instructor handles the conversation with the model until the output matches your schema. You need Python 3.9+, an OpenAI API key (or compatible endpoint), and pip install instructor.

This article walks through everything you need to get structured LLM outputs in production: installing and patching the client, defining Pydantic schemas, extracting nested objects, handling lists, using validation hooks, working with non-OpenAI models such as Anthropic’s Claude and local models served by Ollama, and building a real extraction pipeline. By the end you will have a reusable pattern for reliable structured data from any LLM.

Structured LLM Output: Quick Example

The fastest way to see instructor in action is to extract a structured object from a single sentence. Install the library and try this:

# quick_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int
    city: str

person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,
    messages=[{"role": "user", "content": "Alice is 32 years old and lives in Melbourne."}]
)

print(person.name)   # Alice
print(person.age)    # 32
print(person.city)   # Melbourne
print(type(person))  # <class '__main__.Person'>

Output:

Alice
32
Melbourne
<class '__main__.Person'>

The key line is instructor.from_openai(OpenAI()) — this patches the standard OpenAI client. After that, you pass response_model=Person to any chat.completions.create call, and instructor automatically: sends the Pydantic schema to the model as a tool definition, parses the model’s tool-call response, validates it against your schema, and retries if validation fails. The return value is a fully typed Pydantic object, not a string or dict.

That example covers the simplest case. The sections below show how to handle nested models, lists, validation rules, retry configuration, and real-world pipelines.

What Is instructor and Why Use It?

When you call an LLM without constraints, it returns free-form text. Parsing that text into structured data is fragile — you write regex, JSON parsers, and fallback handlers that break every time the model changes its wording. instructor solves this by using OpenAI’s function/tool calling feature under the hood: it converts your Pydantic model into a JSON Schema tool definition, forces the model to call that tool, and validates the returned arguments against your schema.

The result is LLM output that behaves like a typed function return value instead of a string you have to parse. If the model returns a field with the wrong type (for example, age as a string “thirty-two” instead of an integer), instructor sends the validation error back to the model and asks it to try again — up to a configurable number of retries.

Approach	Reliability	Type Safety	Auto-Retry
Parse raw LLM text	Fragile	None	Manual
Parse JSON from prompt	Moderate	Manual	Manual
OpenAI function calling	Good	Partial	None
instructor + Pydantic	High	Full	Built-in
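To make the "Parse JSON from prompt" row concrete, here is a sketch of the kind of hand-written fallback parser that approach tends to require. It is an illustration of the fragile pattern, not something instructor uses internally; the name parse_llm_json is made up for this example.

```python
# fragile_json_parsing.py
import json
import re
from typing import Any, Dict, Optional

FENCE = "`" * 3  # a markdown code fence, built this way to keep the listing readable
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)

def parse_llm_json(raw: str) -> Optional[Dict[str, Any]]:
    """Best-effort extraction of a JSON object from free-form model output."""
    # 1) Prefer the contents of a fenced block if the model added one
    match = FENCE_RE.search(raw)
    candidate = match.group(1) if match else raw
    # 2) Skip any preamble by cutting from the first "{" to the last "}"
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(candidate[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None

reply = "Sure! Here is the extracted data:\n" + FENCE + 'json\n{"name": "Alice", "age": "thirty-two"}\n' + FENCE
print(parse_llm_json(reply))
```

Even when the parse succeeds, nothing has checked the field types: the age above comes back as the string "thirty-two". That gap is the "Manual" entry in the Type Safety column.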

The library supports multiple backends: instructor.from_openai, instructor.from_anthropic, instructor.from_gemini, and any OpenAI-compatible endpoint via base_url. Your calling code keeps the same response_model interface regardless of which model you use.

Installation and Setup

Install instructor and the OpenAI SDK together. If you are using a different provider, you may also need their SDK:

# Terminal
pip install instructor openai pydantic

Set your API key as an environment variable so it never appears in your code:

# setup_env.py -- run once, or add to your shell profile
import os
# In practice, set this in your shell:
# export OPENAI_API_KEY="sk-..."
print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

Output:

OPENAI_API_KEY set: True

Patch the client once at startup and reuse it for all calls. Creating a new patched client for every request is wasteful:

# client_setup.py
import instructor
from openai import OpenAI

# Patch once at startup
client = instructor.from_openai(OpenAI())  # reads OPENAI_API_KEY from env

# The client now has response_model support on all completion calls
print(type(client))  # <class 'instructor.client.Instructor'>

Output:

<class 'instructor.client.Instructor'>

Defining Pydantic Schemas for Extraction

Your Pydantic model defines exactly what fields the LLM must return. Field descriptions improve accuracy significantly — the model uses them as instructions for what to put in each field. Use Field(description=...) to guide the extraction:

# schema_example.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="The exact job title as written in the posting")
    company: str = Field(description="Company name offering the position")
    location: str = Field(description="City and country, or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Minimum annual salary in USD if mentioned")
    salary_max: Optional[int] = Field(None, description="Maximum annual salary in USD if mentioned")
    is_remote: bool = Field(description="True if the role allows remote work")

text = """
Senior Python Developer at DataFlow Inc. -- Remote (US timezones preferred).
Salary range: $140,000 - $175,000 per year. Must have 5+ years Python experience.
"""

job = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=JobPosting,
    messages=[{"role": "user", "content": f"Extract the job details from: {text}"}]
)

print(f"Title: {job.title}")
print(f"Company: {job.company}")
print(f"Location: {job.location}")
print(f"Salary: ${job.salary_min:,} - ${job.salary_max:,}")
print(f"Remote: {job.is_remote}")

Output:

Title: Senior Python Developer
Company: DataFlow Inc.
Location: Remote (US timezones preferred)
Salary: $140,000 - $175,000
Remote: True

The Optional[int] type tells instructor (and the model) that salary fields may be absent. When the source text does not mention a salary, these fields should come back as None instead of hallucinated values. Always use Optional for fields that may not appear in the input. Without it, the model is pushed to invent a plausible-sounding value, because a required field cannot be left empty.
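One practical consequence of Optional fields: any code that formats them has to cope with None. An f-string such as f"{job.salary_min:,}" raises a TypeError when the value is None, so a posting without a salary would crash the print statement in the example. A small helper avoids that. The name format_salary and the wording of its fallbacks are assumptions for this sketch.

```python
# format_salary.py
from typing import Optional

def format_salary(low: Optional[int], high: Optional[int]) -> str:
    """Format a salary range, tolerating missing bounds."""
    if low is None and high is None:
        return "not listed"
    if low is None:
        return f"up to ${high:,}"
    if high is None:
        return f"from ${low:,}"
    return f"${low:,} - ${high:,}"

print(format_salary(140000, 175000))
print(format_salary(None, None))
```

With this in place, the salary line could be written as print(f"Salary: {format_salary(job.salary_min, job.salary_max)}") and would work for postings with or without salary information.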

Extracting Nested and List Objects

Real-world extraction often requires nested structures — for example, an invoice with multiple line items, or a resume with a list of work experiences. instructor handles nested Pydantic models and List types natively:

# nested_extraction.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

client = instructor.from_openai(OpenAI())

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    items: List[LineItem]
    total: float

invoice_text = """
Invoice #INV-2024-0891 from CloudHost Solutions
- 3x Server instances @ $45.00 each
- 1x SSL Certificate @ $12.00
- 2x Domain registrations @ $15.00 each
Total: $222.00
"""

result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Invoice,
    messages=[{"role": "user", "content": f"Extract invoice data: {invoice_text}"}]
)

print(f"Vendor: {result.vendor}")
print(f"Invoice #: {result.invoice_number}")
for item in result.items:
    print(f"  {item.quantity}x {item.description} @ ${item.unit_price:.2f}")
print(f"Total: ${result.total:.2f}")

Output:

Vendor: CloudHost Solutions
Invoice #: INV-2024-0891
  3x Server instances @ $45.00
  1x SSL Certificate @ $12.00
  2x Domain registrations @ $15.00
Total: $222.00

Nested models work because instructor converts the entire schema — including nested classes — into a JSON Schema definition that the model understands. The model fills in every field of every nested object, and Pydantic validates the whole structure recursively. If the items list is missing or a line item has an invalid type, instructor retries the extraction with the validation error as feedback.
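Type validation does not check that the numbers agree with each other, so it can be worth adding a cross-check after extraction. The sketch below uses plain dataclasses so it runs without an API call; the helper names items_total and total_matches are made up for this example. The same logic could also live in a Pydantic model-level validator.

```python
# invoice_crosscheck.py
from dataclasses import dataclass
from typing import List

@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float

def items_total(items: List[LineItem]) -> float:
    """Sum quantity * unit_price over all line items."""
    return round(sum(i.quantity * i.unit_price for i in items), 2)

def total_matches(items: List[LineItem], stated_total: float,
                  tolerance: float = 0.01) -> bool:
    """True if the stated total agrees with the line items."""
    return abs(items_total(items) - stated_total) <= tolerance

# The values from the sample invoice above
items = [
    LineItem("Server instances", 3, 45.00),
    LineItem("SSL Certificate", 1, 12.00),
    LineItem("Domain registrations", 2, 15.00),
]
print(items_total(items))           # sum of the three line items
print(total_matches(items, 222.00)) # compare with the stated total
```

For the sample invoice the line items add up to $177.00 while the stated total is $222.00, so the check reports a mismatch. The extraction is faithful to the text in this case; the check simply flags that the source document itself does not add up, which is exactly the kind of issue you want surfaced before the data goes further.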

Adding Custom Validation Rules

Pydantic’s field_validator lets you add business logic on top of type checking. instructor automatically feeds validation errors back to the model, so the model gets a second (or third) chance to return values that satisfy your rules:

# custom_validation.py
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator
from typing import List

client = instructor.from_openai(OpenAI())

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(description="Rating from 1 to 5")
    pros: List[str] = Field(description="List of positive aspects, at least one")
    cons: List[str] = Field(description="List of negative aspects, can be empty")
    summary: str = Field(description="One-sentence summary under 150 characters")

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: int) -> int:
        if not 1 <= v <= 5:
            raise ValueError(f"Rating must be between 1 and 5, got {v}")
        return v

    @field_validator("pros")
    @classmethod
    def at_least_one_pro(cls, v: List[str]) -> List[str]:
        if not v:
            raise ValueError("Must include at least one positive aspect")
        return v

    @field_validator("summary")
    @classmethod
    def summary_length(cls, v: str) -> str:
        if len(v) > 150:
            raise ValueError(f"Summary too long: {len(v)} chars (max 150)")
        return v

text = """
The new Python IDE is pretty solid. Boot time is fast, autocomplete works well.
The memory usage is high and the plugin store is still sparse. Overall a decent
choice for Python development. I'd give it 4 out of 5.
"""

review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=ProductReview,
    messages=[{"role": "user", "content": f"Extract review details: {text}"}]
)

print(f"Product: {review.product_name}")
print(f"Rating: {review.rating}/5")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

Output:

Product: Python IDE
Rating: 4/5
Pros: ['Fast boot time', 'Good autocomplete']
Cons: ['High memory usage', 'Sparse plugin store']
Summary: A solid Python IDE with fast performance but limited plugins and high memory usage.

When a validator raises ValueError, instructor captures the error message and sends it back to the model in a follow-up message along the lines of “Validation failed: Rating must be between 1 and 5, got 6. Please fix and try again.” (the exact wording depends on the instructor version). The model then gets a chance to self-correct. By default, instructor retries up to 3 times before raising an exception. You can configure this with max_retries=N on the completion call.
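The retry-with-feedback idea can be sketched without any API calls. The code below is not instructor's implementation; it is a simplified model of the loop, with fake_model as a hypothetical stand-in for the LLM and the feedback wording chosen for illustration.

```python
# retry_loop_sketch.py
from typing import Callable, Dict, List, Tuple

def validate_rating(data: dict) -> dict:
    """Stand-in for Pydantic validation of the rating field."""
    rating = data.get("rating")
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError(f"Rating must be between 1 and 5, got {rating}")
    return data

def extract_with_retries(call_model: Callable[[List[Dict[str, str]]], dict],
                         messages: List[Dict[str, str]],
                         max_retries: int = 3) -> Tuple[dict, int]:
    """Call the model, validate, and feed errors back until valid or out of tries."""
    messages = list(messages)  # do not mutate the caller's list
    last_error = None
    for attempt in range(1, max_retries + 1):
        data = call_model(messages)
        try:
            return validate_rating(data), attempt
        except ValueError as exc:
            last_error = exc
            # The validation error becomes part of the next prompt
            messages.append({
                "role": "user",
                "content": f"Validation failed: {exc}. Please fix and try again.",
            })
    raise RuntimeError(f"Gave up after {max_retries} attempts: {last_error}")

def fake_model(messages: List[Dict[str, str]]) -> dict:
    """Hypothetical LLM: wrong at first, corrected once it sees feedback."""
    saw_feedback = any("Validation failed" in m["content"] for m in messages)
    return {"rating": 4} if saw_feedback else {"rating": 6}

result, attempts = extract_with_retries(
    fake_model, [{"role": "user", "content": "Extract review details"}]
)
print(result, "after", attempts, "attempts")
```

The point of the sketch is the message list: each failed attempt adds the validator's own error text to the conversation, which is why clear, specific error messages in your validators tend to produce better corrections.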

Configuring Retries and Modes

instructor supports several extraction modes depending on what your model supports. The default mode uses OpenAI’s tool calling, but you can switch to JSON mode or other strategies:

# retry_config.py
import instructor
from instructor import Mode
from openai import OpenAI
from pydantic import BaseModel

# Default: tool calling (most reliable for OpenAI models)
client_tools = instructor.from_openai(OpenAI())

# JSON mode: model returns raw JSON instead of a tool call
client_json = instructor.from_openai(OpenAI(), mode=Mode.JSON)

# MD_JSON mode: model wraps JSON in markdown fences (useful for some fine-tunes)
client_md = instructor.from_openai(OpenAI(), mode=Mode.MD_JSON)

class City(BaseModel):
    name: str
    country: str
    population: int

# Control retries per-call
city = client_tools.chat.completions.create(
    model="gpt-4o-mini",
    response_model=City,
    max_retries=5,           # retry up to 5 times on validation failure
    messages=[{"role": "user", "content": "Tell me about Tokyo"}]
)

print(f"{city.name}, {city.country}: pop {city.population:,}")

Output:

Tokyo, Japan: pop 13,960,000

For most OpenAI models, the default tool-calling mode is most reliable. Use Mode.JSON for models that support JSON mode but not tool calling — for example, some fine-tuned models or older GPT versions. The max_retries parameter controls how many times instructor will re-prompt the model when validation fails. For production pipelines where data quality matters more than cost, set this to 3-5.

Using instructor with Non-OpenAI Models

If you are using Anthropic’s Claude, Google Gemini, or a local model via Ollama, instructor has provider-specific patches. For OpenAI-compatible endpoints (like local LLMs with an OpenAI-compatible API), you can pass a custom base_url:

# multi_provider.py
import instructor
from anthropic import Anthropic
from pydantic import BaseModel

# Anthropic Claude -- uses a different client class
anthropic_client = instructor.from_anthropic(Anthropic())

class Sentiment(BaseModel):
    label: str   # "positive", "negative", or "neutral"
    score: float # confidence from 0.0 to 1.0
    reason: str  # one-sentence explanation

result = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=256,
    response_model=Sentiment,
    messages=[{
        "role": "user",
        "content": "This new Python library is fantastic, saves me hours every week!"
    }]
)

print(f"Sentiment: {result.label} ({result.score:.0%})")
print(f"Reason: {result.reason}")

Output:

Sentiment: positive (96%)
Reason: The user expresses strong enthusiasm and quantifies time savings, indicating genuine satisfaction.

For local models via Ollama (which provides an OpenAI-compatible API on localhost:11434), create the client with a custom base URL:

# ollama_instructor.py
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Ollama runs an OpenAI-compatible server locally
ollama_client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON  # use JSON mode for local models
)

class Summary(BaseModel):
    headline: str
    key_points: list[str]

# Works the same as OpenAI -- just a different backend
# summary = ollama_client.chat.completions.create(
#     model="llama3.2",
#     response_model=Summary,
#     messages=[{"role": "user", "content": "Summarize Python's async/await model"}]
# )
print("Local model client ready -- uncomment to use with Ollama running")

Output:

Local model client ready -- uncomment to use with Ollama running

Real-Life Example: Job Posting Extraction Pipeline

Here is a complete pipeline that reads job postings from a list of texts, extracts structured data, filters by criteria, and exports to CSV — the kind of task that comes up in recruiting tools, market research, and job aggregators:

# job_extraction_pipeline.py
import instructor
import csv
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional, List

client = instructor.from_openai(OpenAI())

class JobPosting(BaseModel):
    title: str = Field(description="Job title exactly as written")
    company: str
    location: str = Field(description="City/country or 'Remote'")
    salary_min: Optional[int] = Field(None, description="Min annual salary USD")
    salary_max: Optional[int] = Field(None, description="Max annual salary USD")
    required_years: Optional[int] = Field(None, description="Years of experience required")
    technologies: List[str] = Field(description="List of technologies mentioned")
    is_remote: bool

# Sample job postings to process
JOB_TEXTS = [
    """Senior Python Engineer at Nexaflow -- Remote-first.
    $150k-$190k. 5+ years Python, FastAPI, PostgreSQL, AWS required.""",

    """Junior Data Scientist at BioMetrics Ltd (London, UK).
    GBP 45,000-55,000. 0-2 years exp, pandas, scikit-learn, matplotlib.""",

    """Staff ML Engineer at Quantra -- San Francisco CA.
    $220,000 - $280,000/yr. 8+ years, PyTorch, CUDA, distributed training.""",
]

def extract_jobs(texts: List[str]) -> List[JobPosting]:
    """Extract structured job data from raw posting texts."""
    jobs = []
    for i, text in enumerate(texts, 1):
        job = client.chat.completions.create(
            model="gpt-4o-mini",
            response_model=JobPosting,
            max_retries=3,
            messages=[{"role": "user", "content": f"Extract job details:\n\n{text}"}]
        )
        jobs.append(job)
        print(f"[{i}/{len(texts)}] Extracted: {job.title} at {job.company}")
    return jobs

def filter_remote(jobs: List[JobPosting]) -> List[JobPosting]:
    return [j for j in jobs if j.is_remote]

def export_csv(jobs: List[JobPosting], path: str) -> None:
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Title", "Company", "Location", "Salary Min", "Salary Max",
                         "Yrs Required", "Technologies", "Remote"])
        for j in jobs:
            writer.writerow([
                j.title, j.company, j.location,
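                "" if j.salary_min is None else j.salary_min,
                "" if j.salary_max is None else j.salary_max,
                # "is None" keeps a legitimate 0 (e.g. "0-2 years") in the CSV
                "" if j.required_years is None else j.required_years,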
                ", ".join(j.technologies),
                j.is_remote
            ])

if __name__ == "__main__":
    print("Extracting job postings...")
    jobs = extract_jobs(JOB_TEXTS)
    remote_jobs = filter_remote(jobs)
    print(f"\nTotal extracted: {len(jobs)}, Remote: {len(remote_jobs)}")
    export_csv(jobs, "jobs_extracted.csv")
    print("Saved to jobs_extracted.csv")

Output:

Extracting job postings...
[1/3] Extracted: Senior Python Engineer at Nexaflow
[2/3] Extracted: Junior Data Scientist at BioMetrics Ltd
[3/3] Extracted: Staff ML Engineer at Quantra

Total extracted: 3, Remote: 1
Saved to jobs_extracted.csv

This pipeline is easy to extend: add a database write step, connect it to a web scraper that feeds real job pages, or add more validation rules to the JobPosting model. The core pattern (extract once, validate automatically, retry on failure) stays the same regardless of scale. You can process thousands of postings by replacing JOB_TEXTS with a generator that reads from a queue or database. The extraction call itself stays identical; only the progress message needs a change, because it calls len(texts) and a generator has no length.

Frequently Asked Questions

Does instructor increase API costs because of retries?

Yes, each retry is an additional API call, so failed extractions cost more. In practice, with well-designed schemas and clear field descriptions, validation failures are rare — under 5% for most extraction tasks. The cost increase is usually worth the reliability gain. If cost is a concern, use max_retries=1 and handle exceptions in your code rather than retrying automatically.

Does instructor support streaming responses?

Yes. Use response_model=Iterable[YourModel] for streaming lists, or Partial[YourModel] for streaming partial updates to a single model, together with stream=True on the call. Streaming is useful for large extractions where you want to process results as they arrive rather than waiting for the full response. See the instructor documentation for the streaming API details.

What happens when the model cannot extract a field?

If the field is typed as Optional[X], the model will return None for missing information. If the field is required (non-Optional), the model will either hallucinate a value or fail validation, triggering a retry. For fields that may legitimately be absent in the source text, always use Optional with a None default. This is the most common mistake new users make.

Can I extract data from large documents?

Yes, but be aware of token limits. For documents larger than a few thousand words, split them into chunks and extract from each chunk separately. Use a List[YourModel] return type if a single document contains multiple items to extract (like a list of transactions in a bank statement). For very large documents, consider summarizing first with a regular completion call, then extracting from the summary.
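A minimal chunking sketch, assuming word count is an acceptable stand-in for token count (real tokenizers will differ) and that a small overlap between chunks is wanted so an item straddling a boundary appears whole in at least one chunk. The function name chunk_words and its default sizes are illustrative choices.

```python
from typing import List


def chunk_words(text: str, max_words: int = 500, overlap: int = 50) -> List[str]:
    """Split text into word-based chunks, repeating `overlap` words between neighbours."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the document
    return chunks


document = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_words(document, max_words=500, overlap=50)
print(len(chunks), [len(c.split()) for c in chunks])
```

Output:

3 [500, 500, 300]

Each chunk can then be passed to the extraction call on its own. Because of the overlap, the same item may be extracted twice, so de-duplicate the combined results on a stable key.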

How is this different from just prompting for JSON output?

Prompting for JSON works until it does not: the model adds markdown fences, writes a preamble sentence, or omits fields. instructor uses tool calling rather than prompt instructions alone to enforce the schema, which makes structural deviations far less likely. It also runs Pydantic validation on the result and retries if types or constraints are violated, so anything that still slips through is caught before it reaches your code. The difference in reliability for production use is significant: JSON prompting is fine for experiments, but instructor is the right tool for pipelines where data quality matters.
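The failure modes of plain JSON prompting are easy to reproduce offline. The sketch below feeds a few hypothetical model replies (invented here for illustration, not captured from a real API) through json.loads and a simple required-field check. It shows the problem only; it does not use instructor.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Hypothetical raw replies from a model that was asked to "return JSON only".
replies = {
    "clean": '{"title": "Staff ML Engineer", "remote": true}',
    "preamble": 'Here is the JSON you asked for: {"title": "Staff ML Engineer"}',
    "fenced": f'{FENCE}json\n{{"title": "Staff ML Engineer"}}\n{FENCE}',
    "missing_field": '{"title": "Staff ML Engineer"}',
}

REQUIRED = {"title", "remote"}  # illustrative required fields


def check(reply: str) -> str:
    """Return 'ok' or a short description of why the reply is unusable."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return "not valid JSON"
    missing = REQUIRED - set(data)
    return f"missing fields: {sorted(missing)}" if missing else "ok"


outcomes = {name: check(reply) for name, reply in replies.items()}
for name, outcome in outcomes.items():
    print(f"{name}: {outcome}")
```

Output:

clean: ok
preamble: not valid JSON
fenced: not valid JSON
missing_field: missing fields: ['remote']

Three of the four replies would need hand-written cleanup or a retry loop, which is the work a schema-enforcing library takes over.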

Is my data sent to OpenAI when I use instructor?

Only if you point it at OpenAI. instructor is a thin wrapper around the client you give it, so your data goes to whatever API endpoint you configure, subject to that provider’s data policy. If you are processing sensitive data, use a self-hosted model via Ollama or another local inference server, and point instructor at your local endpoint with a custom base_url. The library itself does not send data anywhere; it only wraps the client you provide.

Conclusion

The instructor library solves one of the most persistent frustrations in LLM application development: getting the model to return data in the shape your code expects, every time. We covered patching the OpenAI client, defining Pydantic schemas with field descriptions, extracting nested and list objects, adding custom validation rules, configuring retries and modes, and using instructor with non-OpenAI providers. The job extraction pipeline demonstrated how these pieces combine into a production-ready pattern.

The next step is to extend the real-life example: add a web scraper to pull live job postings, or connect the extracted data to a database. With instructor handling the model-to-schema translation, you can focus entirely on the business logic of what to extract and what to do with it.

Full documentation and more examples are at python.useinstructor.com. The library’s GitHub repository has a large collection of real-world examples including classification, knowledge graph extraction, and citation-backed answers.