Skill Level: Intermediate
Writing Python That Doesn’t Crash at 3 AM
Every Python developer has written code that works perfectly in testing and then explodes in production. The difference between amateur and professional Python code often comes down to one thing: how you handle exceptions. Proper exception handling isn’t about wrapping everything in try/except blocks. It’s about anticipating failure modes, recovering gracefully, and giving yourself enough information to fix problems quickly.
This guide covers everything from basic try/except mechanics to advanced patterns used in production systems. You’ll learn when to catch exceptions, when to let them propagate, how to create custom exception hierarchies, and patterns that will save you hours of debugging. We’re not just covering syntax; we’re covering the thinking behind robust error handling.
Here’s what we’ll work through: the exception hierarchy in Python, try/except/else/finally blocks, catching specific vs broad exceptions, raising and re-raising exceptions, custom exception classes, context managers for cleanup, logging strategies, and real-world patterns for API error handling. By the end, your code will fail gracefully instead of catastrophically.
Quick Example: The Right Way vs The Wrong Way
Before diving deep, here’s the difference between amateur and professional exception handling:
# Bad: Catches everything, hides bugs
try:
    result = process_data(user_input)
except:
    print("Something went wrong")

# Good: Specific, informative, recoverable
# (shown as it would appear inside a request handler function)
try:
    result = process_data(user_input)
except ValidationError as e:
    logger.warning(f"Invalid input from user: {e}")
    return {"error": str(e), "field": e.field_name}, 400
except DatabaseError as e:
    logger.error(f"Database failure processing request: {e}", exc_info=True)
    return {"error": "Service temporarily unavailable"}, 503
except Exception as e:
    logger.critical(f"Unexpected error: {e}", exc_info=True)
    raise
The first example catches every error, including ones that signal real bugs, and throws away the details. The second catches specific exceptions, logs useful context, returns appropriate HTTP status codes, and re-raises unexpected errors so they don’t go unnoticed. That’s the pattern we’re building toward.
Understanding Python’s Exception Hierarchy
Python’s exceptions form a class hierarchy rooted at BaseException. Understanding this hierarchy is critical because your except clauses catch the specified exception and all its subclasses.
# The key parts of the hierarchy
BaseException
├── SystemExit              # sys.exit() calls
├── KeyboardInterrupt       # Ctrl+C
├── GeneratorExit           # Generator cleanup
└── Exception               # All "normal" exceptions
    ├── StopIteration
    ├── ArithmeticError
    │   ├── ZeroDivisionError
    │   └── OverflowError
    ├── LookupError
    │   ├── IndexError
    │   └── KeyError
    ├── OSError
    │   ├── FileNotFoundError
    │   ├── PermissionError
    │   └── ConnectionError
    ├── ValueError
    ├── TypeError
    ├── AttributeError
    └── RuntimeError
This is why you should almost never write except BaseException or a bare except: clause. Both catch KeyboardInterrupt and SystemExit, preventing users from stopping your program with Ctrl+C. Make except Exception the broadest clause you write.
| Exception | When It Occurs | Common Cause |
|---|---|---|
| ValueError | Right type, wrong value | int("abc"), invalid arguments |
| TypeError | Wrong type entirely | len(42), "hello" + 5 |
| KeyError | Dict key missing | my_dict["nonexistent"] |
| IndexError | List index out of range | my_list[999] |
| AttributeError | Object lacks attribute | None.split() |
| FileNotFoundError | File doesn’t exist | open("missing.txt") |
| ConnectionError | Network failure | Connection refused or reset by the peer |
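To make the "catches the specified exception and all its subclasses" rule concrete, here is a minimal sketch using only built-in exception classes. The `classify` helper is a hypothetical name introduced for illustration, and the `except BaseException` clause appears only to show what falls outside `Exception`; it is not a recommendation.

```python
def classify(exc):
    """Raise exc and report which handler caught it (illustrative helper)."""
    try:
        raise exc
    except LookupError:
        # Matches KeyError and IndexError, both subclasses of LookupError
        return "LookupError handler"
    except Exception:
        # Matches every other "normal" exception
        return "Exception handler"
    except BaseException:
        # Only reached for SystemExit, KeyboardInterrupt, GeneratorExit, ...
        return "BaseException handler"


print(classify(KeyError("missing")))    # LookupError handler
print(classify(ValueError("bad")))      # Exception handler
print(classify(KeyboardInterrupt()))    # BaseException handler
print(issubclass(KeyboardInterrupt, Exception))  # False
```

The last line is the reason `except Exception` leaves Ctrl+C working: `KeyboardInterrupt` is simply not part of that branch of the tree.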
Mastering try/except/else/finally
Most developers know try/except. Fewer use else and finally correctly. Each block has a specific purpose, and using all four makes your intent crystal clear.
import json

def load_config(filepath):
    """Load and parse a JSON config file with proper error handling"""
    try:
        # Only put code that might raise the expected exception here
        with open(filepath, 'r') as f:
            raw_data = f.read()
    except FileNotFoundError:
        print(f"Config file not found: {filepath}")
        return get_default_config()
    except PermissionError:
        print(f"No permission to read: {filepath}")
        raise
    else:
        # Runs only if try block succeeded (no exception)
        # Put code here that depends on try's success
        # but shouldn't be protected by the except
        try:
            config = json.loads(raw_data)
        except json.JSONDecodeError as e:
            print(f"Invalid JSON in {filepath}: {e}")
            return get_default_config()
        return config
    finally:
        # Always runs, even if an exception was raised
        # Use for cleanup that must happen regardless
        print(f"Config loading attempt for {filepath} complete")

def get_default_config():
    return {"debug": False, "log_level": "INFO"}

# Usage
config = load_config("settings.json")
print(config)
Output (file exists with valid JSON):
Config loading attempt for settings.json complete
{'debug': True, 'log_level': 'DEBUG', 'max_retries': 3}
Output (file missing):
Config file not found: settings.json
Config loading attempt for settings.json complete
{'debug': False, 'log_level': 'INFO'}
The else block is often overlooked but serves an important purpose: it separates "code that might fail" from "code that should run only on success." This prevents accidentally catching exceptions you didn’t intend to handle.
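A smaller sketch of the same idea may help. The `parse_port` function and its port range check are assumptions made up for this illustration; the point is only that an exception raised inside else is not caught by the except clauses of the same try statement.

```python
def parse_port(text):
    """Return a valid TCP port, or None if text is not an integer."""
    try:
        port = int(text)  # the only line expected to raise ValueError
    except ValueError:
        return None
    else:
        # Runs only when int() succeeded. A ValueError raised here is
        # NOT swallowed by the except clause above; it propagates.
        if not 0 < port < 65536:
            raise ValueError(f"Port out of range: {port}")
        return port


print(parse_port("8080"))  # 8080
print(parse_port("http"))  # None
try:
    parse_port("70000")
except ValueError as e:
    print(e)               # Port out of range: 70000
```

Had the range check lived inside the try block, the out-of-range case would have been silently turned into None, indistinguishable from non-numeric input.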
Catching Specific Exceptions (And Why Order Matters)
Python evaluates except clauses top to bottom and executes the first matching one. Because exceptions form a hierarchy, order matters: put specific exceptions before general ones.
import requests
import json

def fetch_api_data(url):
    """Fetch data from an API with granular error handling"""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        data = response.json()
        return data
    except requests.exceptions.Timeout:
        # Most specific: request timed out
        print(f"Request to {url} timed out after 10 seconds")
        return None
    except requests.exceptions.ConnectionError:
        # Network-level failure
        print(f"Could not connect to {url}")
        return None
    except requests.exceptions.HTTPError as e:
        # Server returned error status code
        status = e.response.status_code
        if status == 404:
            print(f"Resource not found: {url}")
        elif status == 429:
            print(f"Rate limited. Retry after: {e.response.headers.get('Retry-After', 'unknown')}")
        elif status >= 500:
            print(f"Server error ({status}) at {url}")
        else:
            # Any other 4xx status would otherwise fail without a message
            print(f"HTTP error ({status}) at {url}")
        return None
    except json.JSONDecodeError:
        # Response wasn't valid JSON
        print(f"Invalid JSON response from {url}")
        return None
    except requests.exceptions.RequestException as e:
        # Catch-all for requests library (parent class)
        print(f"Unexpected request error: {e}")
        return None

# Test with various URLs
data = fetch_api_data("https://api.github.com/users/python")
print(f"Got data: {type(data)}")
Output:
Got data: <class 'dict'>
If you reversed the order and put RequestException first, it would catch Timeout, ConnectionError, and HTTPError before their specific handlers ever ran. Always go from most specific to most general.
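The effect of ordering can be reproduced without any network access. The sketch below swaps in the built-in pair FileNotFoundError and OSError (child and parent) in place of the requests classes; both function names are hypothetical.

```python
def describe(exc):
    """Specific handler first: each exception reaches the right clause."""
    try:
        raise exc
    except FileNotFoundError:
        return "missing file"
    except OSError:
        return "other OS error"


def describe_wrong_order(exc):
    """General handler first: the specific clause can never run."""
    try:
        raise exc
    except OSError:
        return "other OS error"
    except FileNotFoundError:  # unreachable, OSError already matched
        return "missing file"


print(describe(FileNotFoundError()))              # missing file
print(describe(PermissionError()))                # other OS error
print(describe_wrong_order(FileNotFoundError()))  # other OS error
```

Python does not warn about the unreachable clause in the second function, so this kind of mistake usually surfaces only as a misleading log message.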
Raising and Re-Raising Exceptions
Sometimes you need to raise exceptions yourself, or catch one and re-raise it after logging. Python gives you three patterns for this.
import logging

logger = logging.getLogger(__name__)

def validate_age(age):
    """Validate age with descriptive error messages"""
    if not isinstance(age, (int, float)):
        raise TypeError(f"Age must be a number, got {type(age).__name__}")
    if age < 0:
        raise ValueError(f"Age cannot be negative: {age}")
    if age > 150:
        raise ValueError(f"Age seems unrealistic: {age}")
    return int(age)

def process_user_registration(data):
    """Process registration with re-raising pattern"""
    try:
        age = validate_age(data.get('age'))
        # ... more processing
        return {"status": "success", "age": age}
    except (TypeError, ValueError) as e:
        # Log and re-raise: preserves original traceback
        logger.error(f"Validation failed for user data: {e}")
        raise  # Re-raises the SAME exception with original traceback
    except Exception as e:
        # Wrap in a new exception: chain for context
        logger.critical(f"Unexpected error during registration: {e}")
        raise RuntimeError("Registration system failure") from e

# Exception chaining with 'from'
def connect_to_database(config):
    """Demonstrate exception chaining"""
    try:
        # Simulating a connection attempt
        if not config.get('host'):
            raise KeyError('host')
    except KeyError as e:
        # 'from e' chains the original exception.
        # str(e) on a KeyError includes its own quotes, so use e.args[0]
        # to get the bare key name for the message.
        raise ConnectionError(
            f"Cannot connect: missing config key '{e.args[0]}'"
        ) from e

# Test it
try:
    validate_age(-5)
except ValueError as e:
    print(f"Caught: {e}")

try:
    validate_age("twenty")
except TypeError as e:
    print(f"Caught: {e}")

try:
    connect_to_database({})
except ConnectionError as e:
    print(f"Caught: {e}")
    print(f"Caused by: {e.__cause__}")
Output:
Caught: Age cannot be negative: -5
Caught: Age must be a number, got str
Caught: Cannot connect: missing config key 'host'
Caused by: 'host'
The raise without arguments re-raises the current exception with its original traceback intact. The raise X from Y syntax creates an exception chain, so the traceback shows both the new error and what caused it.
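What chaining records on the exception object can be inspected directly. The following sketch compares raising inside an except block without from, with from e, and with from None, using the standard `__cause__`, `__context__`, and `__suppress_context__` attributes. The function names and the `chains` dictionary are made up for the demonstration.

```python
def lookup():
    return {}["host"]  # always raises KeyError('host')


def implicit():
    try:
        lookup()
    except KeyError:
        raise RuntimeError("implicit")  # no 'from' clause


def explicit():
    try:
        lookup()
    except KeyError as e:
        raise RuntimeError("explicit") from e


def suppressed():
    try:
        lookup()
    except KeyError:
        raise RuntimeError("suppressed") from None


chains = {}
for func in (implicit, explicit, suppressed):
    try:
        func()
    except RuntimeError as err:
        chains[func.__name__] = (
            err.__cause__, err.__context__, err.__suppress_context__
        )

for name, (cause, context, hidden) in chains.items():
    print(f"{name}: cause={cause!r} context={context!r} hidden={hidden}")
```

Only `from e` sets `__cause__`. Without it, Python still remembers the original error in `__context__`, and `from None` keeps that context but hides it from the printed traceback.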
Creating Custom Exception Hierarchies
For any non-trivial application, create custom exceptions. They make your code self-documenting and let callers handle specific error conditions without parsing error messages.
class AppError(Exception):
    """Base exception for our application"""
    def __init__(self, message, code=None, details=None):
        super().__init__(message)
        self.code = code
        self.details = details or {}

class ValidationError(AppError):
    """Input validation failed"""
    def __init__(self, message, field=None, **kwargs):
        super().__init__(message, code="VALIDATION_ERROR", **kwargs)
        self.field = field

class NotFoundError(AppError):
    """Requested resource doesn't exist"""
    def __init__(self, resource_type, resource_id):
        message = f"{resource_type} with id '{resource_id}' not found"
        super().__init__(message, code="NOT_FOUND")
        self.resource_type = resource_type
        self.resource_id = resource_id

class AuthenticationError(AppError):
    """User authentication failed"""
    def __init__(self, message="Authentication required"):
        super().__init__(message, code="AUTH_ERROR")

class RateLimitError(AppError):
    """Too many requests"""
    def __init__(self, retry_after=60):
        message = f"Rate limit exceeded. Retry after {retry_after} seconds"
        super().__init__(message, code="RATE_LIMIT")
        self.retry_after = retry_after

# Using custom exceptions
def get_user(user_id):
    users = {"1": "Alice", "2": "Bob"}
    if not isinstance(user_id, str):
        raise ValidationError("User ID must be a string", field="user_id")
    if user_id not in users:
        raise NotFoundError("User", user_id)
    return users[user_id]

def handle_request(user_id):
    """Handler showing how custom exceptions simplify error responses"""
    try:
        user = get_user(user_id)
        return {"status": "success", "user": user}
    except ValidationError as e:
        return {"error": e.code, "message": str(e), "field": e.field}
    except NotFoundError as e:
        return {"error": e.code, "message": str(e)}
    except AppError as e:
        return {"error": e.code, "message": str(e)}

# Test
print(handle_request("1"))
print(handle_request("99"))
print(handle_request(42))
Output:
{'status': 'success', 'user': 'Alice'}
{'error': 'NOT_FOUND', 'message': "User with id '99' not found"}
{'error': 'VALIDATION_ERROR', 'message': 'User ID must be a string', 'field': 'user_id'}
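One possible variation, sketched here under assumptions of its own, is to store the HTTP status on each exception class so that a single except AppError clause can build every error response. The `status` class attribute, `to_response`, and `handle` are hypothetical names that do not appear in the hierarchy above, and the classes are redefined in simplified form to keep the example self-contained.

```python
class AppError(Exception):
    """Base class; `status` and `code` are hypothetical class attributes."""
    status = 500
    code = "APP_ERROR"


class ValidationError(AppError):
    status = 400
    code = "VALIDATION_ERROR"


class NotFoundError(AppError):
    status = 404
    code = "NOT_FOUND"


def to_response(exc):
    """Turn any AppError into a (body, status) pair."""
    return {"error": exc.code, "message": str(exc)}, exc.status


def handle(action):
    try:
        return {"status": "success", "result": action()}, 200
    except AppError as e:  # one clause covers every subclass
        return to_response(e)


def missing():
    raise NotFoundError("User with id '99' not found")


print(handle(lambda: "Alice"))
print(handle(missing))
```

The trade-off is that the exception classes now know about HTTP, which may or may not suit a codebase where the same errors are also raised from non-web code.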
Context Managers for Guaranteed Cleanup
Context managers (the with statement) are Python’s best tool for ensuring cleanup happens even when exceptions occur. They’re cleaner than try/finally for resource management.
import os
import sqlite3
from contextlib import contextmanager

@contextmanager
def database_connection(db_path):
    """Context manager for database connections with automatic rollback"""
    conn = sqlite3.connect(db_path)
    try:
        yield conn
        conn.commit()    # Only commits if no exception occurred
    except Exception:
        conn.rollback()  # Rollback on any error
        raise            # Re-raise so caller knows something failed
    finally:
        conn.close()     # Always close the connection

@contextmanager
def temporary_file(filepath, mode='w'):
    """Write to a temp file, then atomically rename on success"""
    temp_path = filepath + '.tmp'
    f = open(temp_path, mode)
    try:
        yield f
        f.close()
        os.replace(temp_path, filepath)  # Atomic rename
    except Exception:
        f.close()
        if os.path.exists(temp_path):
            os.remove(temp_path)  # Clean up temp file on failure
        raise

# Usage
with database_connection(":memory:") as conn:
    cursor = conn.cursor()
    cursor.execute("CREATE TABLE users (name TEXT, age INTEGER)")
    cursor.execute("INSERT INTO users VALUES ('Alice', 30)")
    cursor.execute("SELECT * FROM users")
    print(cursor.fetchall())
# The connection is guaranteed to be closed, committed on success
# or rolled back on failure
Output:
[('Alice', 30)]
The @contextmanager decorator from contextlib lets you write context managers as generator functions. Everything before yield is your setup, and everything after is your cleanup. The try/except/finally inside ensures proper handling regardless of what happens.
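For comparison, the same commit-or-rollback behaviour can be written as a class with `__enter__` and `__exit__`, which is what the decorator generates for you behind the scenes. This is a rough equivalent rather than a drop-in replacement: the `DatabaseConnection` class name is an assumption, and unlike the generator version it rolls back on any exception type, not only subclasses of Exception.

```python
import sqlite3


class DatabaseConnection:
    """Class-based sketch of a commit-or-rollback context manager."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.conn = None

    def __enter__(self):
        # Setup: corresponds to the code before `yield`
        self.conn = sqlite3.connect(self.db_path)
        return self.conn

    def __exit__(self, exc_type, exc_value, tb):
        # Cleanup: corresponds to the code after `yield`
        try:
            if exc_type is None:
                self.conn.commit()
            else:
                self.conn.rollback()
        finally:
            self.conn.close()
        return False  # falsy return value: do not suppress the exception


with DatabaseConnection(":memory:") as conn:
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    rows = conn.execute("SELECT x FROM t").fetchall()
print(rows)  # [(1,)]

try:
    with DatabaseConnection(":memory:") as conn:
        raise ValueError("boom")
except ValueError as e:
    propagated = str(e)
print(propagated)  # boom
```

The class form is more verbose but can be easier to extend when the manager needs extra state or methods.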
Logging Exceptions Effectively
Print statements aren’t sufficient for production. Use Python’s logging module with structured information that helps you diagnose issues quickly.
import logging
import sys

# Configure logging with useful format
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',  # drop the default milliseconds suffix
    handlers=[logging.StreamHandler(sys.stdout)]
)
logger = logging.getLogger("myapp")

def process_payment(user_id, amount):
    """Demonstrate different logging levels for exceptions"""
    try:
        if amount <= 0:
            raise ValueError(f"Invalid amount: {amount}")
        if amount > 10000:
            # Warning: not an error, but worth noting
            logger.warning(
                "Large payment detected",
                extra={"user_id": user_id, "amount": amount}
            )
        # Simulate processing
        if user_id == "blocked":
            raise PermissionError("User account is blocked")
        logger.info(f"Payment processed: user={user_id}, amount=${amount:.2f}")
        return True
    except ValueError as e:
        # Client error: log as warning, don't need full traceback
        logger.warning(f"Invalid payment request: {e}")
        return False
    except PermissionError as e:
        # Expected business logic error
        logger.error(f"Payment blocked: user={user_id}, reason={e}")
        return False
    except Exception:
        # Unexpected error: log as critical WITH traceback
        logger.critical(
            f"Payment system failure: user={user_id}, amount={amount}",
            exc_info=True  # This includes the full traceback
        )
        raise

# Test scenarios
process_payment("user123", 50.00)
process_payment("user456", -10)
process_payment("blocked", 100)
Output:
2026-04-08 10:30:00 - myapp - INFO - Payment processed: user=user123, amount=$50.00
2026-04-08 10:30:00 - myapp - WARNING - Invalid payment request: Invalid amount: -10
2026-04-08 10:30:00 - myapp - ERROR - Payment blocked: user=blocked, reason=User account is blocked
The key insight: use exc_info=True for unexpected exceptions where you need the full stack trace. For expected exceptions (validation errors, business logic), a simple message is sufficient. Don’t log full tracebacks for every caught exception, because the noise obscures real problems.
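The logging module also offers logger.exception(), which logs at ERROR level and attaches the traceback, so it behaves like logger.error(..., exc_info=True) when called from an except block. The sketch below captures log output in a StringIO buffer so the difference is visible; the logger name `payments_demo` and the `risky` function are assumptions for this demo only.

```python
import io
import logging

# Send this demo logger's output to an in-memory buffer
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))

logger = logging.getLogger("payments_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo isolated from the root logger


def risky():
    return 1 / 0


try:
    risky()
except ZeroDivisionError:
    # Unexpected failure: message plus full traceback
    logger.exception("Unexpected failure in risky()")

try:
    int("abc")
except ValueError as e:
    # Expected failure: message only, no traceback
    logger.warning("Bad input: %s", e)

output = stream.getvalue()
print(output)
```

Only the first record carries a traceback, which keeps expected validation failures down to one line each.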
Real-World Example: Resilient API Client with Retry Logic
Here’s a production-grade pattern combining everything we’ve covered: custom exceptions, context managers, logging, and retry logic for an API client.
import time
import logging
import json
from functools import wraps

logger = logging.getLogger(__name__)

class APIError(Exception):
    """Base API exception"""
    def __init__(self, message, status_code=None, response_body=None):
        super().__init__(message)
        self.status_code = status_code
        self.response_body = response_body

class RetryableError(APIError):
    """Error that should trigger a retry"""
    pass

class FatalError(APIError):
    """Error that should NOT be retried"""
    pass

def retry_on_failure(max_retries=3, base_delay=1, backoff_factor=2):
    """Decorator that retries on RetryableError with exponential backoff"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_exception = None
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except RetryableError as e:
                    last_exception = e
                    if attempt < max_retries:
                        delay = base_delay * (backoff_factor ** attempt)
                        logger.warning(
                            f"Attempt {attempt + 1}/{max_retries + 1} failed: {e}. "
                            f"Retrying in {delay}s..."
                        )
                        time.sleep(delay)
                    else:
                        logger.error(
                            f"All {max_retries + 1} attempts failed for {func.__name__}"
                        )
                except FatalError:
                    # Don't retry fatal errors
                    raise
            raise last_exception
        return wrapper
    return decorator
class APIClient:
    """Resilient API client with proper exception handling"""
    def __init__(self, base_url, api_key):
        self.base_url = base_url
        self.api_key = api_key
        self._session = None

    def _classify_error(self, status_code, response_body):
        """Classify HTTP errors as retryable or fatal (always raises)"""
        if status_code in (429, 502, 503, 504):
            raise RetryableError(
                f"Server returned {status_code}",
                status_code=status_code,
                response_body=response_body
            )
        elif status_code in (400, 401, 403, 404, 422):
            raise FatalError(
                f"Client error {status_code}: {response_body}",
                status_code=status_code,
                response_body=response_body
            )
        else:
            raise APIError(
                f"Unexpected status {status_code}",
                status_code=status_code,
                response_body=response_body
            )

    @retry_on_failure(max_retries=3, base_delay=1)
    def get(self, endpoint):
        """GET request with automatic retry on transient failures"""
        import urllib.request
        import urllib.error
        url = f"{self.base_url}/{endpoint}"
        req = urllib.request.Request(url)
        req.add_header('Authorization', f'Bearer {self.api_key}')
        try:
            with urllib.request.urlopen(req, timeout=10) as response:
                data = json.loads(response.read().decode())
                logger.info(f"GET {endpoint}: success")
                return data
        except urllib.error.HTTPError as e:
            # HTTPError is a subclass of URLError, so it must come first
            body = e.read().decode() if e.fp else ""
            self._classify_error(e.code, body)
        except urllib.error.URLError as e:
            raise RetryableError(f"Connection failed: {e.reason}")
        except json.JSONDecodeError as e:
            raise FatalError(f"Invalid JSON response: {e}")

    @retry_on_failure(max_retries=2, base_delay=2)
    def post(self, endpoint, data):
        """POST request with retry logic"""
        # Note: retrying a POST is only safe if the endpoint is idempotent
        import urllib.request
        import urllib.error
        url = f"{self.base_url}/{endpoint}"
        payload = json.dumps(data).encode('utf-8')
        req = urllib.request.Request(url, data=payload, method='POST')
        req.add_header('Authorization', f'Bearer {self.api_key}')
        req.add_header('Content-Type', 'application/json')
        try:
            with urllib.request.urlopen(req, timeout=30) as response:
                result = json.loads(response.read().decode())
                logger.info(f"POST {endpoint}: success")
                return result
        except urllib.error.HTTPError as e:
            body = e.read().decode() if e.fp else ""
            self._classify_error(e.code, body)
        except urllib.error.URLError as e:
            raise RetryableError(f"Connection failed: {e.reason}")
        except json.JSONDecodeError as e:
            # Same handling as get(): a malformed body will not fix itself
            raise FatalError(f"Invalid JSON response: {e}")
# Usage example
client = APIClient("https://api.example.com", "my-api-key")
try:
    users = client.get("users")
    print(f"Fetched {len(users)} users")
except RetryableError as e:
    print(f"Service unavailable after retries: {e}")
except FatalError as e:
    print(f"Request invalid: {e}")
except APIError as e:
    print(f"API error: {e}")
This pattern handles transient failures (network issues, rate limits, server errors) with automatic retry and exponential backoff. Fatal errors like 401 or 404 fail immediately since retrying won't help. The decorator separates retry logic from business logic, keeping your code clean.
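Retry logic is easier to trust once you can exercise it without real delays. The sketch below is a simplified stand-in for the decorator above, not the same code: it adds a hypothetical `sleep` parameter so a test can record the backoff delays instead of waiting, and it defines its own minimal `RetryableError`.

```python
import time
from functools import wraps


class RetryableError(Exception):
    """Stand-in for the RetryableError class used in this section."""


def retry(max_retries=3, base_delay=1, backoff_factor=2, sleep=time.sleep):
    """Retry on RetryableError; `sleep` is injectable for testing."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except RetryableError:
                    if attempt == max_retries:
                        raise  # out of attempts: propagate the last error
                    sleep(base_delay * backoff_factor ** attempt)
        return wrapper
    return decorator


delays = []           # records what would have been slept
calls = {"count": 0}


@retry(max_retries=3, sleep=delays.append)
def flaky():
    """Fails twice with a transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RetryableError("transient")
    return "ok"


print(flaky(), calls["count"], delays)  # ok 3 [1, 2]
```

With the defaults, the delay sequence is 1, 2, 4 seconds: `base_delay * backoff_factor ** attempt` for attempts 0, 1, and 2.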
Frequently Asked Questions
Should I use except Exception or except BaseException?
Almost always except Exception. The BaseException class includes SystemExit, KeyboardInterrupt, and GeneratorExit, which you almost never want to catch. Catching KeyboardInterrupt prevents Ctrl+C from working. Only use BaseException in top-level cleanup code where you truly need to catch everything.
When should I use EAFP vs LBYL style?
EAFP (Easier to Ask Forgiveness than Permission) means using try/except. LBYL (Look Before You Leap) means checking conditions first. Python idiomatically prefers EAFP. Use try/except when the check would be expensive or racy (like file existence checks). Use LBYL when the check is cheap and makes code clearer, like if key in dict.
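Both styles side by side, as a minimal sketch. The `settings` dictionary, the `read_text` helper, and the file path are illustrative assumptions.

```python
import os
import tempfile

settings = {"timeout": 30}

# LBYL: the membership check is cheap and reads clearly
if "timeout" in settings:
    timeout = settings["timeout"]
else:
    timeout = 10


def read_text(path):
    """EAFP: just try to open the file and handle the failure."""
    # Checking os.path.exists(path) first would leave a window in which
    # the file could be deleted between the check and the open() call.
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None


missing_path = os.path.join(
    tempfile.gettempdir(), "no-such-dir-eafp-demo", "missing.txt"
)
print(timeout)                  # 30
print(read_text(missing_path))  # None
```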
Is it bad to use bare except clauses?
Yes, except: without specifying an exception type catches everything including SystemExit and KeyboardInterrupt. It also makes debugging harder because you don't know what went wrong. Always specify at least except Exception, and prefer more specific exception types.
How do I handle exceptions in async code?
Async exception handling uses the same try/except syntax inside async functions. When gathering multiple coroutines, call asyncio.gather(*coros, return_exceptions=True) to collect exceptions as results instead of failing on the first one. On Python 3.11+, asyncio.TaskGroup cancels the remaining tasks when one fails and raises an ExceptionGroup containing the failures, which you handle with except*.
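A minimal sketch of the gather approach, assuming a made-up `fetch` coroutine that fails for odd inputs:

```python
import asyncio


async def fetch(n):
    """Hypothetical coroutine: succeeds for even n, fails for odd n."""
    await asyncio.sleep(0)
    if n % 2:
        raise ValueError(f"odd input: {n}")
    return n * 10


async def main():
    # With return_exceptions=True, raised exceptions come back as
    # ordinary items in the result list, in the same order as the inputs.
    results = await asyncio.gather(
        *(fetch(n) for n in range(4)), return_exceptions=True
    )
    ok = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    return ok, failed


ok, failed = asyncio.run(main())
print(ok)                        # [0, 20]
print([str(e) for e in failed])  # ['odd input: 1', 'odd input: 3']
```

Because failures arrive as plain values, nothing is raised unless you inspect the results, so the isinstance check is the part that must not be forgotten.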
What's the performance cost of try/except?
In CPython, entering a try block costs almost nothing when no exception occurs; since Python 3.11 the happy path is implemented as "zero-cost" exception handling. Actually raising and catching an exception, however, is noticeably more expensive than a simple if/else check, and the exact ratio depends on the Python version and the code involved. Don't use exceptions for normal control flow in hot loops (like detecting the end of a list with IndexError). Use them for truly exceptional conditions.
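Rather than relying on a fixed ratio, you can measure the difference on your own interpreter. The following timeit sketch compares a dictionary lookup in both styles for a key that exists (hit) and one that does not (miss). The function names and the iteration count are arbitrary choices, and the timings will vary by machine and Python version.

```python
import timeit

data = {"key": 1}


def lbyl_hit():
    return data["key"] if "key" in data else None


def eafp_hit():
    try:
        return data["key"]
    except KeyError:
        return None


def lbyl_miss():
    return data["absent"] if "absent" in data else None


def eafp_miss():
    try:
        return data["absent"]  # raises KeyError on every call
    except KeyError:
        return None


timings = {}
for func in (lbyl_hit, eafp_hit, lbyl_miss, eafp_miss):
    timings[func.__name__] = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {timings[func.__name__]:.4f}s")
```

Typically the two hit cases land close together while the miss case that raises is the slow one, which is the pattern the answer above describes.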
How do I test exception handling code?
Use pytest.raises as a context manager to verify exceptions are raised correctly. For testing retry logic, use unittest.mock.patch to simulate failures. Test both the happy path and each exception path. Verify that the right exception type, message, and attributes are set.
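The sketch below uses only the standard library so it runs without installing anything: unittest's assertRaises plays the role of pytest.raises, and a mock.Mock with side_effect simulates failures in place of mock.patch. The `validate_age` and `fetch_with_retry` functions are simplified stand-ins written for this example.

```python
import unittest
from unittest import mock


def validate_age(age):
    if age < 0:
        raise ValueError(f"Age cannot be negative: {age}")
    return age


def fetch_with_retry(fetch, attempts=3):
    """Call fetch() until it succeeds or attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except ConnectionError as e:
            last_error = e
    raise last_error


class ExceptionTests(unittest.TestCase):
    def test_negative_age_raises(self):
        # Check the type AND the message
        with self.assertRaises(ValueError) as ctx:
            validate_age(-5)
        self.assertIn("negative", str(ctx.exception))

    def test_retry_recovers(self):
        # First call raises, second call returns a value
        fetch = mock.Mock(side_effect=[ConnectionError("down"), "payload"])
        self.assertEqual(fetch_with_retry(fetch), "payload")
        self.assertEqual(fetch.call_count, 2)

    def test_retry_gives_up(self):
        # Every call raises, so the last error must propagate
        fetch = mock.Mock(side_effect=ConnectionError("down"))
        with self.assertRaises(ConnectionError):
            fetch_with_retry(fetch, attempts=2)
        self.assertEqual(fetch.call_count, 2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExceptionTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

With pytest installed, the first test would shrink to a plain function using `with pytest.raises(ValueError, match="negative"):`.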
Wrapping Up
Exception handling separates production-ready code from scripts that work on your laptop. The patterns we covered (specific exception catching, custom hierarchies, context managers, logging strategies, and retry decorators) form the foundation of resilient Python applications.
The key principles to remember: catch specific exceptions rather than broad ones, use else and finally blocks intentionally, create custom exception classes for your domain, log with appropriate severity levels, and design your error recovery strategy before writing the happy path code.
Start applying these patterns incrementally. Pick one codebase and replace bare except clauses with specific ones. Add custom exceptions to your next project. Build a retry decorator for your API calls. Each improvement makes your code more debuggable and your production systems more reliable.
Official Resources
- Python Tutorial: Errors and Exceptions
- Built-in Exceptions Reference
- contextlib Module Documentation
- Logging Module Documentation
- The try Statement Reference