Beginner
Introduction: Unlocking AI with Python
The OpenAI API brings powerful language models directly into your Python applications. Whether you’re building chatbots, automating content creation, analyzing text, or generating embeddings, the official OpenAI Python SDK makes integration straightforward and intuitive. In this guide, we’ll explore everything from basic chat completions to advanced features like streaming and function calling, complete with production-ready examples you can deploy immediately.
The modern AI landscape has democratized access to sophisticated language models. What once required significant ML expertise now takes just a few lines of Python. The OpenAI API currently powers applications used by millions of developers worldwide, and with the latest Python SDK (v1.0+), the experience is more elegant and Pythonic than ever. You’ll gain the skills to harness models like GPT-4o, GPT-4o-mini, and GPT-3.5-turbo in your projects.
By the end of this tutorial, you’ll understand how to initialize the client, handle authentication, construct effective prompts, stream responses for real-time interaction, invoke external tools through function calling, generate embeddings, and implement robust error handling. We’ll also examine a complete CLI chatbot implementation that demonstrates conversation history management.
Quick Example: Your First API Call
Let’s get straight to it. Here’s a minimal example that demonstrates the power of the OpenAI API. This script creates a single chat completion request and displays the model’s response. It assumes your OPENAI_API_KEY environment variable is set:
# quick_chat.py
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain recursion in 20 words."}]
)
print(response.choices[0].message.content)
Output:
Recursion is a function calling itself to solve smaller versions of the same problem until reaching a base case.
The OpenAI() client automatically reads your API key from the environment, constructs a message, sends it to the model, and returns a structured response. The choices array contains the model’s completions, and message.content is the actual text response.
What Is the OpenAI API?
The OpenAI API is a REST interface that gives you programmatic access to OpenAI’s language models. Rather than using the web interface, you call the API from your application. The official Python SDK wraps this REST API, handling authentication, request formatting, and response parsing automatically.
OpenAI offers multiple models optimized for different use cases:
| Model | Best For | Context Window | Relative Cost | Speed |
|---|---|---|---|---|
| gpt-4o | Complex reasoning, multimodal, production | 128K tokens | Higher | Moderate |
| gpt-4o-mini | Fast, cost-effective, high volume | 128K tokens | Low | Fast |
| gpt-3.5-turbo | Legacy applications | 16K tokens | Very Low | Fastest |
For most new projects, we recommend gpt-4o-mini as your starting point. The API also supports embeddings, audio transcription, image generation, and fine-tuning.

Installing the OpenAI Python SDK
The official OpenAI Python SDK is available on PyPI. We recommend installing within a virtual environment:
# install_openai.sh
$ python3 -m venv openai_env
$ source openai_env/bin/activate
$ pip install openai
Output:
Successfully installed openai-1.30.0
Verify the installation:
# verify_openai.py
import openai
print(f"OpenAI SDK version: {openai.__version__}")
Output:
OpenAI SDK version: 1.30.0
The SDK requires Python 3.7.1 or higher.
Setting Up Your API Key
Every request requires authentication via an API key. Create one at platform.openai.com/api-keys. Never hardcode your key in source code. Use environment variables instead:
# setup_env.sh
$ export OPENAI_API_KEY="sk-proj-your-actual-key-here"
The OpenAI() client automatically reads this environment variable:
# client_init.py
from openai import OpenAI
client = OpenAI() # Reads OPENAI_API_KEY from environment
print("Client initialized successfully")
Output:
Client initialized successfully

Chat Completions: The Core API
Chat completions are the foundation of most OpenAI applications. You send a list of messages and the model generates a completion:
# chat_basic.py
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What are three benefits of Python for data science?"}
    ],
    max_tokens=200,
    temperature=0.7
)
print(response.choices[0].message.content)
Output:
1. Rich Ecosystem: Libraries like pandas, NumPy, and scikit-learn provide comprehensive tools.
2. Ease of Learning: Python's readable syntax lets data scientists focus on algorithms.
3. Community and Integration: Strong community support and seamless production integration.
Key parameters: model specifies which model, messages is the conversation, max_tokens limits response length, and temperature controls randomness (0.7 is a good default).
System Messages and Conversation Roles
System messages set the assistant’s behavior and personality. Every conversation should begin with one:
# system_messages.py
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful Python tutor. Keep responses under 150 words."},
        {"role": "user", "content": "What is a list comprehension?"}
    ],
    temperature=0.5
)
print(response.choices[0].message.content)
Output:
A list comprehension is a concise way to create lists in Python:
squares = [x ** 2 for x in range(5)] # [0, 1, 4, 9, 16]
Messages have three roles: system (instructions), user (human input), and assistant (model responses). Store messages in a list to maintain multi-turn conversation context.
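As a minimal sketch of that bookkeeping (`record_turn` is an illustrative helper, not part of the SDK), maintaining context is just a matter of appending each exchange to the list you pass as `messages` on the next request:

```python
def record_turn(messages, user_input, assistant_reply):
    """Append one user/assistant exchange to the running history."""
    messages.append({"role": "user", "content": user_input})
    messages.append({"role": "assistant", "content": assistant_reply})
    return messages

history = [{"role": "system", "content": "You are a helpful Python tutor."}]
record_turn(history, "What is a list comprehension?", "A concise way to build lists.")
record_turn(history, "Show me one.", "squares = [x ** 2 for x in range(5)]")
# history now alternates: system, user, assistant, user, assistant
```

Pass `history` as the `messages` argument on each subsequent `create()` call so the model sees the whole conversation.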
Streaming Responses
Streaming sends tokens as they’re generated, creating a real-time effect:
# streaming_response.py
from openai import OpenAI
client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about Python."}],
    stream=True
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
Output:
Code flows like rivers
Functions call within themselves
Logic pure and clean
The stream=True parameter returns a generator that yields chunks as they arrive — perfect for web UIs.
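In practice you usually want the full text afterward too (for example, to append it to the conversation history). Here's a sketch of a helper that prints tokens as they arrive while accumulating them; `collect_stream` is an illustrative name, not an SDK function, and it works on any iterable of chunk-shaped objects:

```python
def collect_stream(stream) -> str:
    """Print streamed tokens as they arrive and return the full text."""
    parts = []
    for chunk in stream:
        # Some chunks (e.g. the final one) may carry no content
        if chunk.choices and chunk.choices[0].delta.content:
            text = chunk.choices[0].delta.content
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)
```

Call it with the generator returned by `create(..., stream=True)`; the returned string can then be stored as the assistant turn.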

Function Calling and Tool Use
Function calling lets the model request your application invoke specific functions:
# function_calling.py
import json
from openai import OpenAI
client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}]
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in New York?"}],
    tools=tools,
    tool_choice="auto"
)
if response.choices[0].message.tool_calls:
    call = response.choices[0].message.tool_calls[0]
    print(f"Function: {call.function.name}")
    print(f"Arguments: {call.function.arguments}")
Output:
Function: get_weather
Arguments: {"city": "New York", "unit": "fahrenheit"}
The model decides which function to invoke and structures arguments automatically. Your code executes the logic and sends results back.
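The second leg of that round trip can be sketched like this: parse the model's JSON arguments, run your own implementation, and append a `"tool"` message so the model can compose its final answer. Here `get_weather` is a stub standing in for your real lookup, and `build_tool_result` is an illustrative helper, not an SDK function:

```python
import json

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Stub for your real weather lookup."""
    return {"city": city, "unit": unit, "temp": 21}

def build_tool_result(messages, assistant_message, call):
    """Append the assistant's tool call and your function's result."""
    args = json.loads(call.function.arguments)
    result = get_weather(**args)
    messages.append(assistant_message)  # the message containing tool_calls
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result),
    })
    return messages
```

After appending both messages, call `client.chat.completions.create()` again with the extended list (and the same `tools`) to get the model's natural-language answer.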
Generating Embeddings
Embeddings are numerical representations of text for semantic search and similarity:
# embeddings_example.py
from openai import OpenAI
client = OpenAI()
texts = ["The cat sat on the mat.", "A feline rests on the rug.", "The dog ran through the park."]
response = client.embeddings.create(model="text-embedding-3-small", input=texts)
for i, item in enumerate(response.data):
    print(f"Text {i}: {len(item.embedding)} dimensions, first 3: {item.embedding[:3]}")
Output:
Text 0: 1536 dimensions, first 3: [-0.0234, 0.0891, -0.0123]
Text 1: 1536 dimensions, first 3: [-0.0245, 0.0885, -0.0115]
Text 2: 1536 dimensions, first 3: [0.0123, 0.0342, 0.0789]
Semantically similar texts produce similar embeddings. Store them in vector databases like ChromaDB for powerful search.
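To compare two embeddings directly, the standard measure is cosine similarity. A minimal standard-library sketch (production code would typically use NumPy):

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine similarity of two vectors; values near 1.0 mean very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Applied to the example above, the cat/feline pair should score noticeably higher than either sentence paired with the dog sentence.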
Error Handling and Rate Limits
Production applications must handle errors gracefully:
# error_handling.py
from openai import OpenAI, RateLimitError, APIStatusError, APIError
client = OpenAI()
try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=10
    )
    print(response.choices[0].message.content)
except RateLimitError:
    print("Rate limit exceeded. Wait before retrying.")
except APIStatusError as e:
    print(f"API error: {e.status_code} - {e.message}")
except APIError as e:
    print(f"API error: {e}")
Output:
Hi there! How can I help you today?
Implement exponential backoff for rate limits — wait progressively longer between retries.
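A generic sketch of that pattern (`retry_with_backoff` is an illustrative helper, not part of the SDK): pass the exception type to retry on, such as `openai.RateLimitError` in production:

```python
import random
import time

def retry_with_backoff(fn, retries=5, base_delay=1.0, retry_on=Exception):
    """Call fn(), retrying on retry_on with exponentially growing waits."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            # Wait ~1s, ~2s, ~4s, ... with random jitter to avoid
            # synchronized retries across workers
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Usage: `retry_with_backoff(lambda: client.chat.completions.create(...), retry_on=RateLimitError)`.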

Real-Life Example: Interactive CLI Chatbot
Here’s a complete chatbot with conversation history:
# chatbot.py
from openai import OpenAI

class Chatbot:
    def __init__(self, system_prompt="You are a helpful assistant."):
        self.client = OpenAI()
        self.messages = [{"role": "system", "content": system_prompt}]

    def chat(self, user_input):
        self.messages.append({"role": "user", "content": user_input})
        try:
            response = self.client.chat.completions.create(
                model="gpt-4o-mini",
                messages=self.messages,
                temperature=0.7,
                max_tokens=500
            )
            reply = response.choices[0].message.content
            self.messages.append({"role": "assistant", "content": reply})
            return reply
        except Exception as e:
            return f"Error: {e}"

    def save_history(self, filename="chat_history.txt"):
        with open(filename, "w") as f:
            for msg in self.messages:
                f.write(f"{msg['role'].upper()}:\n{msg['content']}\n\n")

    def run(self):
        print("Chatbot ready. Type 'quit' to exit, 'save' to save history.\n")
        while True:
            user_input = input("You: ").strip()
            if not user_input:
                continue
            if user_input.lower() == "quit":
                break
            if user_input.lower() == "save":
                self.save_history()
                print("History saved.")
                continue
            print(f"Assistant: {self.chat(user_input)}\n")

if __name__ == "__main__":
    Chatbot("You are a knowledgeable Python expert.").run()
Usage:
$ python chatbot.py
Chatbot ready. Type 'quit' to exit, 'save' to save history.
You: What's the difference between lists and tuples?
Assistant: Lists are mutable, tuples are immutable...
You: save
History saved.
This demonstrates conversation history management, error handling, persistent storage, and an interactive loop.
Frequently Asked Questions
How much does the OpenAI API cost?
OpenAI uses pay-per-token pricing. gpt-4o-mini costs roughly $0.15 per million input tokens. Set hard spending limits in your account settings.
What’s the difference between temperature and top_p?
temperature controls randomness directly (0 = deterministic, 2 = very random). top_p uses nucleus sampling. For most apps, adjust temperature and leave top_p at 1.0.
How long can a conversation be?
Limited by the context window: 128K tokens for gpt-4o/gpt-4o-mini. Monitor response.usage to track consumption.
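For long-running conversations, one common tactic is trimming the oldest turns while keeping the system message. A rough sketch (`trim_history` is an illustrative helper; the 4-characters-per-token ratio is a crude English-text heuristic, so use a real tokenizer such as tiktoken for accurate counts):

```python
def trim_history(messages, max_tokens=100_000):
    """Drop the oldest non-system turns until under a rough token budget."""
    def rough_tokens(msg):
        return len(msg["content"]) // 4 + 4  # ~4 chars/token plus overhead
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(map(rough_tokens, system + rest)) > max_tokens:
        rest.pop(0)  # drop the oldest turn first
    return system + rest
```

Call it on your message list before each request once conversations grow long.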
Can I fine-tune the models?
Yes, OpenAI supports fine-tuning for specific models. Start with prompt engineering first — it’s usually sufficient and cheaper.
How do I handle sensitive data?
Never send PII (SSNs, credit cards) to the API. Use data scrubbing and anonymization. Review OpenAI’s privacy policy for compliance.
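As a starting point, a minimal regex-based scrubber for two common patterns (US SSNs and 16-digit card numbers) might look like this. This is purely illustrative, not a compliance solution; real systems need far broader coverage and validation:

```python
import re

# Illustrative patterns only: US SSNs and 16-digit card numbers
PII_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:\d[ -]?){15}\d\b"), "[CARD]"),
]

def scrub(text: str) -> str:
    """Replace matched PII patterns with placeholders before sending text."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Run user input through `scrub()` (or a more thorough library) before it goes into your `messages` list.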
Conclusion
You now have a comprehensive foundation for building with the OpenAI API: chat completions, system messages, streaming, function calling, embeddings, and error handling. The Python SDK makes integration elegant. Start with a simple chatbot and extend from there.
Visit the official documentation at platform.openai.com/docs for advanced features like fine-tuning and batch processing.