
Introduction: Unlocking AI with Python

The OpenAI API brings powerful language models directly into your Python applications. Whether you’re building chatbots, automating content creation, analyzing text, or generating embeddings, the official OpenAI Python SDK makes integration straightforward and intuitive. In this guide, we’ll explore everything from basic chat completions to advanced features like streaming and function calling, complete with practical examples you can adapt immediately.

The modern AI landscape has democratized access to sophisticated language models. What once required significant ML expertise now takes just a few lines of Python. The OpenAI API currently powers applications used by millions of developers worldwide, and with the latest Python SDK (v1.0+), the experience is more elegant and Pythonic than ever. You’ll gain the skills to harness models like GPT-4o, GPT-4o-mini, and GPT-3.5-turbo in your projects.

By the end of this tutorial, you’ll understand how to initialize the client, handle authentication, construct effective prompts, stream responses for real-time interaction, invoke external tools through function calling, generate embeddings, and implement robust error handling. We’ll also examine a complete CLI chatbot implementation that demonstrates conversation history management.

Quick Example: Your First API Call

Let’s get straight to it. Here’s a minimal example that demonstrates the power of the OpenAI API. This script creates a single chat completion request and displays the model’s response. It assumes your OPENAI_API_KEY environment variable is set:

# quick_chat.py
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain recursion in 20 words."}]
)
print(response.choices[0].message.content)

Output:

Recursion is a function calling itself to solve smaller versions of the same problem until reaching a base case.

The OpenAI() client automatically reads your API key from the environment, constructs a message, sends it to the model, and returns a structured response. The choices array contains the model’s completions, and message.content is the actual text response.
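
Because the response is a nested object, it helps to pull text out defensively. Here is a small sketch of that access path; the `SimpleNamespace` stand-ins below only mimic the response shape for illustration, while a real response comes from `client.chat.completions.create()`:

```python
# extract_text.py -- defensive helper for reading a completion's text.
# The fake object mirrors the attribute path of a real ChatCompletion.
from types import SimpleNamespace

def first_text(response):
    """Return the text of the first choice, or '' if there is none."""
    if not response.choices:
        return ""
    return response.choices[0].message.content or ""

# Stand-in with the same shape as a real response object:
fake = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hello!"))]
)
print(first_text(fake))  # Hello!
```

The `or ""` guard matters because `message.content` can be `None` when the model responds with a tool call instead of text.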

What Is the OpenAI API?

The OpenAI API is a REST interface that gives you programmatic access to OpenAI’s language models. Rather than using the web interface, you call the API from your application. The official Python SDK wraps this REST API, handling authentication, request formatting, and response parsing automatically.

OpenAI offers multiple models optimized for different use cases:

| Model | Best For | Context Window | Relative Cost | Speed |
| --- | --- | --- | --- | --- |
| gpt-4o | Complex reasoning, multimodal, production | 128K tokens | Higher | Moderate |
| gpt-4o-mini | Fast, cost-effective, high volume | 128K tokens | Low | Fast |
| gpt-3.5-turbo | Legacy applications | 4K tokens | Very Low | Fastest |

For most new projects, we recommend gpt-4o-mini as your starting point. The API also supports embeddings, audio transcription, image generation, and fine-tuning.

[Image: Programmer relaying prompts to OpenAI and receiving completions. "Prompt in. Completion out. Magic in the middle."]

Installing the OpenAI Python SDK

The official OpenAI Python SDK is available on PyPI. We recommend installing within a virtual environment:

# install_openai.sh
$ python3 -m venv openai_env
$ source openai_env/bin/activate
$ pip install openai

Output:

Successfully installed openai-1.30.0

Verify the installation:

# verify_openai.py
import openai
print(f"OpenAI SDK version: {openai.__version__}")

Output:

OpenAI SDK version: 1.30.0

The SDK requires Python 3.7 or higher.

Setting Up Your API Key

Every request requires authentication via an API key. Create one at platform.openai.com/api-keys. Never hardcode your key in source code. Use environment variables instead:

# setup_env.sh
$ export OPENAI_API_KEY="sk-proj-your-actual-key-here"

The OpenAI() client automatically reads this environment variable:

# client_init.py
from openai import OpenAI

client = OpenAI()  # Reads OPENAI_API_KEY from environment
print("Client initialized successfully")

Output:

Client initialized successfully

[Image: Developer protecting API keys in a secure vault. "Keep your API key closer than your passwords."]

Chat Completions: The Core API

Chat completions are the foundation of most OpenAI applications. You send a list of messages and the model generates a completion:

# chat_basic.py
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What are three benefits of Python for data science?"}
    ],
    max_tokens=200,
    temperature=0.7
)
print(response.choices[0].message.content)

Output:

1. Rich Ecosystem: Libraries like pandas, NumPy, and scikit-learn provide comprehensive tools.
2. Ease of Learning: Python's readable syntax lets data scientists focus on algorithms.
3. Community and Integration: Strong community support and seamless production integration.

Key parameters: model selects which model handles the request, messages carries the conversation, max_tokens caps the response length, and temperature controls randomness (0.7 is a sensible default; lower values produce more deterministic output).

System Messages and Conversation Roles

System messages set the assistant’s behavior and personality. Every conversation should begin with one:

# system_messages.py
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful Python tutor. Keep responses under 150 words."},
        {"role": "user", "content": "What is a list comprehension?"}
    ],
    temperature=0.5
)
print(response.choices[0].message.content)

Output:

A list comprehension is a concise way to create lists in Python:
squares = [x ** 2 for x in range(5)]  # [0, 1, 4, 9, 16]

Messages have three roles: system (instructions), user (human input), and assistant (model responses). Store messages in a list to maintain multi-turn conversation context.

Streaming Responses

Streaming sends tokens as they’re generated, creating a real-time effect:

# streaming_response.py
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about Python."}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()

Output:

Code flows like rivers
Functions call within themselves
Logic pure and clean

The stream=True parameter returns a generator that yields chunks as they arrive — perfect for web UIs.
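
If you need the full reply after streaming it (to store in conversation history, for example), accumulate the deltas as you print them. The helper below works on any iterable of chunk-shaped objects; the `SimpleNamespace` chunks are stand-ins for illustration, while real chunks come from a `stream=True` call:

```python
# collect_stream.py -- print streamed deltas and keep the full text.
from types import SimpleNamespace

def collect(stream):
    """Print each delta as it arrives and return the assembled reply."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks carry no content (e.g. the final one)
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)

def fake_chunk(text):
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )

full = collect([fake_chunk("Code flows "), fake_chunk(None), fake_chunk("like rivers")])
```

The `if delta:` check mirrors the example above: delta content can be `None`, so skip those chunks when assembling the text.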

[Image: Tokens streaming in real-time from an API response. "Streaming: because waiting for the full response is so 2022."]

Function Calling and Tool Use

Function calling lets the model request your application invoke specific functions:

# function_calling.py
import json
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in New York?"}],
    tools=tools,
    tool_choice="auto"
)

if response.choices[0].message.tool_calls:
    call = response.choices[0].message.tool_calls[0]
    print(f"Function: {call.function.name}")
    print(f"Arguments: {call.function.arguments}")

Output:

Function: get_weather
Arguments: {"city": "New York", "unit": "fahrenheit"}

The model decides which function to invoke and structures arguments automatically. Your code executes the logic and sends results back.
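
The second half of the round trip, executing the function locally and packaging the result as a "tool" message, can be sketched like this. The `get_weather` stub and the `SimpleNamespace` tool call are illustrative stand-ins; in real code, `call` is `response.choices[0].message.tool_calls[0]`:

```python
# tool_dispatch.py -- execute a requested tool call and build the
# follow-up "tool" message to send back to the model.
import json
from types import SimpleNamespace

def get_weather(city, unit="celsius"):
    # Hypothetical stub: a real implementation would query a weather service.
    return {"city": city, "temp": 22, "unit": unit}

LOCAL_TOOLS = {"get_weather": get_weather}

def run_tool_call(call):
    """Run the function the model requested; return the 'tool' message."""
    fn = LOCAL_TOOLS[call.function.name]
    args = json.loads(call.function.arguments)  # arguments arrive as JSON text
    result = fn(**args)
    return {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}

# Stand-in mirroring the SDK's tool_call shape:
call = SimpleNamespace(
    id="call_123",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "New York"}'),
)
msg = run_tool_call(call)
print(msg["content"])  # {"city": "New York", "temp": 22, "unit": "celsius"}
```

Append this message to your conversation (after the assistant message containing the tool call) and make a second chat completion request; the model then uses the result to compose its final answer.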

Generating Embeddings

Embeddings are numerical representations of text for semantic search and similarity:

# embeddings_example.py
from openai import OpenAI

client = OpenAI()
texts = ["The cat sat on the mat.", "A feline rests on the rug.", "The dog ran through the park."]

response = client.embeddings.create(model="text-embedding-3-small", input=texts)
for i, item in enumerate(response.data):
    print(f"Text {i}: {len(item.embedding)} dimensions, first 3: {item.embedding[:3]}")

Output:

Text 0: 1536 dimensions, first 3: [-0.0234, 0.0891, -0.0123]
Text 1: 1536 dimensions, first 3: [-0.0245, 0.0885, -0.0115]
Text 2: 1536 dimensions, first 3: [0.0123, 0.0342, 0.0789]

Semantically similar texts produce similar embeddings. Store them in vector databases like ChromaDB for powerful search.
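
Similarity between two embeddings is usually measured with cosine similarity: the dot product of the vectors divided by the product of their lengths. The tiny 3-dimensional vectors below are illustrative stand-ins; real embeddings from text-embedding-3-small have 1536 dimensions:

```python
# cosine_similarity.py -- compare embedding vectors.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

cat = [-0.02, 0.09, -0.01]      # "The cat sat on the mat."
feline = [-0.02, 0.088, -0.011]  # "A feline rests on the rug."
dog = [0.01, 0.03, 0.08]         # "The dog ran through the park."

print(round(cosine_similarity(cat, feline), 3))  # near 1.0 -- very similar
print(round(cosine_similarity(cat, dog), 3))     # much lower
```

In practice you rarely compute this by hand; vector databases and libraries like NumPy do it for you, but the formula is worth knowing when debugging search quality.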

Error Handling and Rate Limits

Production applications must handle errors gracefully:

# error_handling.py
from openai import OpenAI, RateLimitError, APIError

client = OpenAI()
try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=10
    )
    print(response.choices[0].message.content)
except RateLimitError:
    print("Rate limit exceeded. Wait before retrying.")
except APIError as e:
    print(f"API error: {e}")

Output:

Hi there! How can I help you today?

Implement exponential backoff for rate limits — wait progressively longer between retries.
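
A minimal backoff sketch: the retry loop below is generic, with a fake `flaky()` function standing in for your real API call and `RuntimeError` standing in for `openai.RateLimitError`:

```python
# backoff_retry.py -- exponential backoff: double the wait after each failure.
import time

def with_backoff(fn, retries=4, base_delay=0.01):
    """Call fn(), retrying with exponentially growing delays on failure."""
    for attempt in range(retries):
        try:
            return fn()
        except RuntimeError:  # swap in openai.RateLimitError for real calls
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")  # fails twice, then succeeds
    return "ok"

result = with_backoff(flaky)
print(result)  # ok  (succeeded on the third attempt)
```

In production, use a longer `base_delay` (e.g. 1 second) and consider adding random jitter so that many clients retrying at once do not synchronize.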

[Image: Developer handling rate limit errors with grace and retry logic. "Rate limits: the universe's way of saying slow down."]

Real-Life Example: Interactive CLI Chatbot

Here’s a complete chatbot with conversation history:

# chatbot.py
from openai import OpenAI

class Chatbot:
    def __init__(self, system_prompt="You are a helpful assistant."):
        self.client = OpenAI()
        self.messages = [{"role": "system", "content": system_prompt}]

    def chat(self, user_input):
        self.messages.append({"role": "user", "content": user_input})
        try:
            response = self.client.chat.completions.create(
                model="gpt-4o-mini",
                messages=self.messages,
                temperature=0.7,
                max_tokens=500
            )
            reply = response.choices[0].message.content
            self.messages.append({"role": "assistant", "content": reply})
            return reply
        except Exception as e:
            return f"Error: {e}"

    def save_history(self, filename="chat_history.txt"):
        with open(filename, "w") as f:
            for msg in self.messages:
                f.write(f"{msg['role'].upper()}:\n{msg['content']}\n\n")

    def run(self):
        print("Chatbot ready. Type 'quit' to exit, 'save' to save history.\n")
        while True:
            user_input = input("You: ").strip()
            if not user_input:
                continue
            if user_input.lower() == "quit":
                break
            if user_input.lower() == "save":
                self.save_history()
                print("History saved.")
                continue
            print(f"Assistant: {self.chat(user_input)}\n")

if __name__ == "__main__":
    Chatbot("You are a knowledgeable Python expert.").run()

Usage:

$ python chatbot.py
Chatbot ready. Type 'quit' to exit, 'save' to save history.

You: What's the difference between lists and tuples?
Assistant: Lists are mutable, tuples are immutable...

You: save
History saved.

This demonstrates conversation history management, error handling, persistent storage, and an interactive loop.

Frequently Asked Questions

How much does the OpenAI API cost?

OpenAI uses pay-per-token pricing. gpt-4o-mini costs roughly $0.15 per million input tokens. Set hard spending limits in your account settings.
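
Back-of-envelope cost math is straightforward. The input rate below comes from the figure above; the output rate is an illustrative assumption, so check the current pricing page before relying on either:

```python
# cost_estimate.py -- rough per-request cost from token counts.
INPUT_RATE = 0.15 / 1_000_000   # $ per input token (gpt-4o-mini, per figure above)
OUTPUT_RATE = 0.60 / 1_000_000  # $ per output token (assumed for illustration)

def estimate_cost(input_tokens, output_tokens):
    """Dollar cost of one request at the rates above."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 1,000-token prompt with a 500-token reply:
print(f"${estimate_cost(1000, 500):.6f}")  # $0.000450
```

At these rates, even thousands of requests per day cost pennies, which is why gpt-4o-mini is recommended for high-volume work.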

What’s the difference between temperature and top_p?

temperature controls randomness directly (0 = deterministic, 2 = very random). top_p uses nucleus sampling. For most apps, adjust temperature and leave top_p at 1.0.

How long can a conversation be?

Limited by the context window: 128K tokens for gpt-4o/gpt-4o-mini. Monitor response.usage to track consumption.
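
When long conversations approach the context window, a common tactic is to trim the oldest turns while keeping the system message. The sketch below uses a rough 4-characters-per-token heuristic for illustration; exact counts require the tiktoken library:

```python
# trim_history.py -- keep a conversation under an approximate token budget.
def rough_tokens(text):
    """Crude estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def trim(messages, budget):
    """Drop the oldest non-system messages until under budget."""
    kept = list(messages)
    while sum(rough_tokens(m["content"]) for m in kept) > budget and len(kept) > 1:
        del kept[1]  # preserve the system message at index 0
    return kept

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "x" * 400},       # ~100 tokens
    {"role": "assistant", "content": "y" * 400},  # ~100 tokens
    {"role": "user", "content": "z" * 40},        # ~10 tokens
]
trimmed = trim(history, budget=110)
print(len(trimmed))  # 2: the system message plus the newest user turn
```

More sophisticated strategies summarize dropped turns instead of discarding them, but simple trimming is often enough for a chatbot.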

Can I fine-tune the models?

Yes, OpenAI supports fine-tuning for specific models. Start with prompt engineering first — it’s usually sufficient and cheaper.

How do I handle sensitive data?

Never send PII (SSNs, credit cards) to the API. Use data scrubbing and anonymization. Review OpenAI’s privacy policy for compliance.
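
A simple scrubbing pass can run before any text leaves your application. The regex patterns below are illustrative, not exhaustive; production systems need a proper PII-detection pipeline:

```python
# scrub_pii.py -- naive PII redaction before sending text to the API.
import re

PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),              # e.g. 123-45-6789
    "card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),         # 16-digit card numbers
}

def scrub(text):
    """Replace anything matching a known PII pattern with a labeled marker."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text

clean = scrub("My SSN is 123-45-6789 and card 4111-1111-1111-1111.")
print(clean)
```

Running the scrubber on every user message before appending it to the conversation history keeps the redaction consistent across the whole session.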

Conclusion

You now have a comprehensive foundation for building with the OpenAI API: chat completions, system messages, streaming, function calling, embeddings, and error handling. The Python SDK makes integration elegant. Start with a simple chatbot and extend from there.

Visit the official documentation at platform.openai.com/docs for advanced features like fine-tuning and batch processing.