Skill Level: Intermediate

Writing Python That Doesn’t Crash at 3 AM

Every Python developer has written code that works perfectly in testing and then explodes in production. The difference between amateur and professional Python code often comes down to one thing: how you handle exceptions. Proper exception handling isn’t about wrapping everything in try/except blocks; it’s about anticipating failure modes, recovering gracefully, and giving yourself enough information to fix problems quickly.

This guide covers everything from basic try/except mechanics to advanced patterns used in production systems. You’ll learn when to catch exceptions, when to let them propagate, how to create custom exception hierarchies, and patterns that will save you hours of debugging. We’re not just covering syntax; we’re covering the thinking behind robust error handling.

Here’s what we’ll work through: the exception hierarchy in Python, try/except/else/finally blocks, catching specific vs broad exceptions, raising and re-raising exceptions, custom exception classes, context managers for cleanup, logging strategies, and real-world patterns for API error handling. By the end, your code will fail gracefully instead of catastrophically.

Quick Example: The Right Way vs The Wrong Way

Before diving deep, here’s the difference between amateur and professional exception handling:

# Bad: Catches everything, hides bugs
try:
    result = process_data(user_input)
except:
    print("Something went wrong")

# Good: Specific, informative, recoverable
try:
    result = process_data(user_input)
except ValidationError as e:
    logger.warning(f"Invalid input from user: {e}")
    return {"error": str(e), "field": e.field}, 400
except DatabaseError as e:
    logger.error(f"Database failure processing request: {e}", exc_info=True)
    return {"error": "Service temporarily unavailable"}, 503
except Exception as e:
    logger.critical(f"Unexpected error: {e}", exc_info=True)
    raise

The first example swallows every error, including genuine bugs, behind a generic message that tells you nothing about what failed. The second catches specific exceptions, logs useful context, returns appropriate HTTP status codes, and re-raises unexpected errors so they don’t go unnoticed. That’s the pattern we’re building toward.

[Figure: Debug Dee examining an error message] Reading the actual error message is the most underrated debugging technique in programming.

Understanding Python’s Exception Hierarchy

Python’s exceptions form a class hierarchy rooted at BaseException. Understanding this hierarchy is critical because your except clauses catch the specified exception and all its subclasses.

# The key parts of the hierarchy
BaseException
├── SystemExit          # sys.exit() calls
├── KeyboardInterrupt   # Ctrl+C
├── GeneratorExit       # Generator cleanup
└── Exception           # All "normal" exceptions
    ├── StopIteration
    ├── ArithmeticError
    │   ├── ZeroDivisionError
    │   └── OverflowError
    ├── LookupError
    │   ├── IndexError
    │   └── KeyError
    ├── OSError
    │   ├── FileNotFoundError
    │   ├── PermissionError
    │   └── ConnectionError
    ├── ValueError
    ├── TypeError
    ├── AttributeError
    └── RuntimeError

This is why you should almost never write except BaseException or a bare except: clause. Both catch KeyboardInterrupt and SystemExit, which can prevent users from stopping your program with Ctrl+C and can stop sys.exit() from ending the process. In application code, Exception is the broadest class you should catch.
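You can check these relationships by asking Python directly. The sketch below uses only built-in exceptions; the helper name caught_by is made up for illustration.

```python
def caught_by(exc_type, handler_type):
    """Return True if `except handler_type:` would catch an exc_type instance."""
    return issubclass(exc_type, handler_type)

# An except clause matches the named class and every subclass of it.
print(caught_by(KeyError, LookupError))             # True
print(caught_by(FileNotFoundError, OSError))        # True

# KeyboardInterrupt and SystemExit deliberately sit outside Exception.
print(caught_by(KeyboardInterrupt, Exception))      # False
print(caught_by(SystemExit, Exception))             # False
print(caught_by(KeyboardInterrupt, BaseException))  # True

try:
    {}["missing"]
except LookupError as e:
    # The handler names the parent class, but the instance is still a KeyError.
    caught = type(e).__name__

print(caught)  # KeyError
```

Because the handler receives the original instance, catching a parent class never loses information about which subclass was raised.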

Exception         | When It Occurs          | Common Cause
------------------|-------------------------|------------------------------------------
ValueError        | Right type, wrong value | int("abc"), invalid arguments
TypeError         | Wrong type entirely     | len(42), "hello" + 5
KeyError          | Dict key missing        | my_dict["nonexistent"]
IndexError        | List index out of range | my_list[999]
AttributeError    | Object lacks attribute  | None.split()
FileNotFoundError | File doesn’t exist      | open("missing.txt")
ConnectionError   | Network failure         | Connection refused or reset by the peer

Mastering try/except/else/finally

Most developers know try/except. Fewer use else and finally correctly. Each block has a specific purpose, and using all four makes your intent crystal clear.

import json

def load_config(filepath):
    """Load and parse a JSON config file with proper error handling"""
    try:
        # Only put code that might raise the expected exception here
        with open(filepath, 'r') as f:
            raw_data = f.read()
    except FileNotFoundError:
        print(f"Config file not found: {filepath}")
        return get_default_config()
    except PermissionError:
        print(f"No permission to read: {filepath}")
        raise
    else:
        # Runs only if try block succeeded (no exception)
        # Put code here that depends on try's success
        # but shouldn't be protected by the except
        try:
            config = json.loads(raw_data)
        except json.JSONDecodeError as e:
            print(f"Invalid JSON in {filepath}: {e}")
            return get_default_config()
        return config
    finally:
        # Always runs, even if an exception was raised
        # Use for cleanup that must happen regardless
        print(f"Config loading attempt for {filepath} complete")


def get_default_config():
    return {"debug": False, "log_level": "INFO"}


# Usage
config = load_config("settings.json")
print(config)

Output (file exists with valid JSON):

Config loading attempt for settings.json complete
{'debug': True, 'log_level': 'DEBUG', 'max_retries': 3}

Output (file missing):

Config file not found: settings.json
Config loading attempt for settings.json complete
{'debug': False, 'log_level': 'INFO'}

The else block is often overlooked but serves an important purpose: it separates “code that might fail” from “code that should run only on success.” This prevents accidentally catching exceptions you didn’t intend to handle.
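To see what “accidentally catching” looks like in practice, here is a small sketch. The ALIASES table and the normalize helper are hypothetical and exist only to plant a bug: an unknown alias raises KeyError, the same exception type the handler uses to mean “setting missing.”

```python
ALIASES = {"http": 80}

def normalize(port):
    # Hypothetical helper with a bug: unknown string aliases raise KeyError.
    return ALIASES[port] if isinstance(port, str) else port

def port_wide_try(settings):
    try:
        port = settings["port"]
        return normalize(port)      # a KeyError raised here is swallowed too
    except KeyError:
        return 8080                 # intended only for a missing setting

def port_with_else(settings):
    try:
        port = settings["port"]     # only the lookup is protected
    except KeyError:
        return 8080
    else:
        return normalize(port)      # errors here propagate to the caller

print(port_wide_try({"port": "htpp"}))   # 8080: the typo is silently masked

try:
    port_with_else({"port": "htpp"})
except KeyError as e:
    surfaced = f"Surfaced bug: unknown alias {e}"
    print(surfaced)
```

Both versions return the default when the setting is missing. Only the else version lets the unrelated KeyError reach someone who can fix it.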

Catching Specific Exceptions (And Why Order Matters)

Python evaluates except clauses top to bottom and executes the first matching one. Since exceptions form a hierarchy, order matters: put specific exceptions before general ones.

import requests

def fetch_api_data(url):
    """Fetch data from an API with granular error handling"""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        data = response.json()
        return data

    except requests.exceptions.Timeout:
        # Most specific: request timed out
        print(f"Request to {url} timed out after 10 seconds")
        return None

    except requests.exceptions.ConnectionError:
        # Network-level failure
        print(f"Could not connect to {url}")
        return None

    except requests.exceptions.HTTPError as e:
        # Server returned error status code
        status = e.response.status_code
        if status == 404:
            print(f"Resource not found: {url}")
        elif status == 429:
            print(f"Rate limited. Retry after: {e.response.headers.get('Retry-After', 'unknown')}")
        elif status >= 500:
            print(f"Server error ({status}) at {url}")
        else:
            print(f"HTTP error ({status}) at {url}")
        return None

    except ValueError:
        # Response wasn't valid JSON. response.json() raises a ValueError
        # subclass; the exact class depends on the requests version.
        print(f"Invalid JSON response from {url}")
        return None

    except requests.exceptions.RequestException as e:
        # Catch-all for requests library (parent class)
        print(f"Unexpected request error: {e}")
        return None


# Test with various URLs
data = fetch_api_data("https://api.github.com/users/python")
print(f"Got data: {type(data)}")

Output:

Got data: <class 'dict'>

If you reversed the order and put RequestException first, it would catch Timeout, ConnectionError, and HTTPError before their specific handlers ever ran. Always go from most specific to most general.
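The first-match rule is easy to demonstrate with built-in exceptions, with no network needed. In this sketch the function names are illustrative; the second function puts the parent class first to show the shadowing.

```python
def which_handler(exc):
    """Raise exc and report which except clause handled it."""
    try:
        raise exc
    except FileNotFoundError:
        return "FileNotFoundError handler"
    except OSError:
        return "OSError handler"
    except Exception:
        return "Exception handler"

def which_handler_misordered(exc):
    """Same handlers, but the parent class comes first."""
    try:
        raise exc
    except OSError:
        return "OSError handler"
    except FileNotFoundError:
        return "FileNotFoundError handler"  # unreachable: OSError matches first

print(which_handler(FileNotFoundError()))             # FileNotFoundError handler
print(which_handler(PermissionError()))               # OSError handler
print(which_handler(ValueError()))                    # Exception handler
print(which_handler_misordered(FileNotFoundError()))  # OSError handler
```

Python does not warn about the unreachable clause, so the ordering mistake only shows up as a handler that never runs.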

[Figure: Sudo Sam drawing the exception hierarchy] Understanding the exception hierarchy saves you from catching errors you never meant to handle.

Raising and Re-Raising Exceptions

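Sometimes you need to raise exceptions yourself, or catch one and re-raise it after logging. Python gives you three patterns for this: raising a new exception, re-raising the current one with a bare raise, and chaining a new exception to its cause with raise ... from.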

import logging

logger = logging.getLogger(__name__)

def validate_age(age):
    """Validate age with descriptive error messages"""
    if not isinstance(age, (int, float)):
        raise TypeError(f"Age must be a number, got {type(age).__name__}")
    if age < 0:
        raise ValueError(f"Age cannot be negative: {age}")
    if age > 150:
        raise ValueError(f"Age seems unrealistic: {age}")
    return int(age)


def process_user_registration(data):
    """Process registration with re-raising pattern"""
    try:
        age = validate_age(data.get('age'))
        # ... more processing
        return {"status": "success", "age": age}

    except (TypeError, ValueError) as e:
        # Log and re-raise: preserves original traceback
        logger.error(f"Validation failed for user data: {e}")
        raise  # Re-raises the SAME exception with original traceback

    except Exception as e:
        # Wrap in a new exception: chain for context
        logger.critical(f"Unexpected error during registration: {e}")
        raise RuntimeError("Registration system failure") from e


# Exception chaining with 'from'
def connect_to_database(config):
    """Demonstrate exception chaining"""
    try:
        # Simulating a connection attempt
        if not config.get('host'):
            raise KeyError('host')
    except KeyError as e:
        # 'from e' chains the original exception.
        # str(KeyError) already includes quotes, so use e.args[0] for the bare key.
        raise ConnectionError(
            f"Cannot connect: missing config key '{e.args[0]}'"
        ) from e


# Test it
try:
    validate_age(-5)
except ValueError as e:
    print(f"Caught: {e}")

try:
    validate_age("twenty")
except TypeError as e:
    print(f"Caught: {e}")

try:
    connect_to_database({})
except ConnectionError as e:
    print(f"Caught: {e}")
    print(f"Caused by: {e.__cause__}")

Output:

Caught: Age cannot be negative: -5
Caught: Age must be a number, got str
Caught: Cannot connect: missing config key 'host'
Caused by: 'host'

The raise without arguments re-raises the current exception with its original traceback intact. The raise X from Y syntax creates an exception chain, so the traceback shows both the new error and what caused it.
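Chaining is recorded on the exception object itself, so you can inspect it. The sketch below compares explicit chaining, implicit chaining (raising inside an except block without from), and from None, which hides the original error in tracebacks. The function names are illustrative.

```python
def parse_explicit(text):
    try:
        return int(text)
    except ValueError as e:
        raise RuntimeError(f"bad port: {text!r}") from e      # explicit chain

def parse_implicit(text):
    try:
        return int(text)
    except ValueError:
        raise RuntimeError(f"bad port: {text!r}")             # implicit chain

def parse_quiet(text):
    try:
        return int(text)
    except ValueError:
        raise RuntimeError(f"bad port: {text!r}") from None   # hide the context

def chain_info(func):
    """Call func with bad input and report how the RuntimeError is chained."""
    try:
        func("eighty")
    except RuntimeError as e:
        return (
            type(e.__cause__).__name__,    # set only by 'from'
            type(e.__context__).__name__,  # set whenever raising inside except
            e.__suppress_context__,        # True hides __context__ in tracebacks
        )

print(chain_info(parse_explicit))  # ('ValueError', 'ValueError', True)
print(chain_info(parse_implicit))  # ('NoneType', 'ValueError', False)
print(chain_info(parse_quiet))     # ('NoneType', 'ValueError', True)
```

Use from e when the original error explains the new one, and from None only when the original is an implementation detail that would confuse the caller.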

Creating Custom Exception Hierarchies

For any non-trivial application, create custom exceptions. They make your code self-documenting and let callers handle specific error conditions without parsing error messages.

class AppError(Exception):
    """Base exception for our application"""
    def __init__(self, message, code=None, details=None):
        super().__init__(message)
        self.code = code
        self.details = details or {}


class ValidationError(AppError):
    """Input validation failed"""
    def __init__(self, message, field=None, **kwargs):
        super().__init__(message, code="VALIDATION_ERROR", **kwargs)
        self.field = field


class NotFoundError(AppError):
    """Requested resource doesn't exist"""
    def __init__(self, resource_type, resource_id):
        message = f"{resource_type} with id '{resource_id}' not found"
        super().__init__(message, code="NOT_FOUND")
        self.resource_type = resource_type
        self.resource_id = resource_id


class AuthenticationError(AppError):
    """User authentication failed"""
    def __init__(self, message="Authentication required"):
        super().__init__(message, code="AUTH_ERROR")


class RateLimitError(AppError):
    """Too many requests"""
    def __init__(self, retry_after=60):
        message = f"Rate limit exceeded. Retry after {retry_after} seconds"
        super().__init__(message, code="RATE_LIMIT")
        self.retry_after = retry_after


# Using custom exceptions
def get_user(user_id):
    users = {"1": "Alice", "2": "Bob"}
    if not isinstance(user_id, str):
        raise ValidationError("User ID must be a string", field="user_id")
    if user_id not in users:
        raise NotFoundError("User", user_id)
    return users[user_id]


def handle_request(user_id):
    """Handler showing how custom exceptions simplify error responses"""
    try:
        user = get_user(user_id)
        return {"status": "success", "user": user}
    except ValidationError as e:
        return {"error": e.code, "message": str(e), "field": e.field}
    except NotFoundError as e:
        return {"error": e.code, "message": str(e)}
    except AppError as e:
        return {"error": e.code, "message": str(e)}


# Test
print(handle_request("1"))
print(handle_request("99"))
print(handle_request(42))

Output:

{'status': 'success', 'user': 'Alice'}
{'error': 'NOT_FOUND', 'message': "User with id '99' not found"}
{'error': 'VALIDATION_ERROR', 'message': 'User ID must be a string', 'field': 'user_id'}

[Figure: Loop Larry tangled in exception blocks] Custom exception hierarchies turn “something broke” into “this specific thing broke for this specific reason.”
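A shared base class also lets a single handler translate every application error into a response. The sketch below is a variation, not the classes defined above: it stores status and code as class attributes, and the names to_response and find_user are hypothetical.

```python
class AppError(Exception):
    """Compact stand-in for an application base exception."""
    status = 500
    code = "APP_ERROR"

class ValidationError(AppError):
    status, code = 400, "VALIDATION_ERROR"

class NotFoundError(AppError):
    status, code = 404, "NOT_FOUND"

def to_response(func, *args):
    """Run func and translate any AppError into a (body, status) pair."""
    try:
        return {"status": "success", "result": func(*args)}, 200
    except AppError as e:
        # One handler covers every subclass; the class carries the details.
        return {"error": e.code, "message": str(e)}, e.status

def find_user(user_id):
    users = {"1": "Alice"}
    if not isinstance(user_id, str):
        raise ValidationError("User ID must be a string")
    if user_id not in users:
        raise NotFoundError(f"User with id '{user_id}' not found")
    return users[user_id]

print(to_response(find_user, "1"))
print(to_response(find_user, "99"))
print(to_response(find_user, 42))
```

Errors that are not AppError subclasses still propagate, which keeps genuine bugs visible instead of turning them into tidy error responses.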

Context Managers for Guaranteed Cleanup

Context managers (the with statement) are Python’s best tool for ensuring cleanup happens even when exceptions occur. They’re cleaner than try/finally for resource management.

import os
import sqlite3
from contextlib import contextmanager

@contextmanager
def database_connection(db_path):
    """Context manager for database connections with automatic rollback"""
    conn = sqlite3.connect(db_path)
    try:
        yield conn
        conn.commit()  # Only commits if no exception occurred
    except Exception:
        conn.rollback()  # Rollback on any error
        raise  # Re-raise so caller knows something failed
    finally:
        conn.close()  # Always close the connection


@contextmanager
def temporary_file(filepath, mode='w'):
    """Write to a temp file, then atomically rename on success"""
    temp_path = filepath + '.tmp'
    f = open(temp_path, mode)
    try:
        yield f
        f.close()
        os.replace(temp_path, filepath)  # Atomic rename
    except Exception:
        f.close()
        if os.path.exists(temp_path):
            os.remove(temp_path)  # Clean up temp file on failure
        raise


# Usage
with database_connection(":memory:") as conn:
    cursor = conn.cursor()
    cursor.execute("CREATE TABLE users (name TEXT, age INTEGER)")
    cursor.execute("INSERT INTO users VALUES ('Alice', 30)")
    cursor.execute("SELECT * FROM users")
    print(cursor.fetchall())

# The connection is guaranteed to be closed, committed on success
# or rolled back on failure

Output:

[('Alice', 30)]

The @contextmanager decorator from contextlib lets you write context managers as generator functions. Everything before yield is your setup, and everything after is your cleanup. The try/except/finally inside ensures proper handling regardless of what happens.
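The same protocol can be written as a class with __enter__ and __exit__, which makes the mechanics explicit. This is a minimal sketch; the Transaction class only records events in a list and stands in for a real resource.

```python
class Transaction:
    """Class-based counterpart to a @contextmanager generator (illustrative)."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("begin")      # setup: the part before 'yield'
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.log.append("commit")
        else:
            self.log.append(f"rollback ({exc_type.__name__})")
        self.log.append("close")      # runs on both paths, like 'finally'
        return False                  # falsy: do not suppress the exception

events = []

with Transaction(events):
    events.append("work")

try:
    with Transaction(events):
        raise ValueError("boom")
except ValueError:
    events.append("caller saw ValueError")

print(events)
```

Returning a truthy value from __exit__ would swallow the exception, so cleanup-only context managers should return False or None.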

Logging Exceptions Effectively

Print statements aren’t sufficient for production. Use Python’s logging module with structured information that helps you diagnose issues quickly.

import logging
import sys

# Configure logging with useful format
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler(sys.stdout)]
)
logger = logging.getLogger("myapp")


def process_payment(user_id, amount):
    """Demonstrate different logging levels for exceptions"""
    try:
        if amount <= 0:
            raise ValueError(f"Invalid amount: {amount}")

        if amount > 10000:
            # Warning: not an error, but worth noting
            logger.warning(
                "Large payment detected",
                extra={"user_id": user_id, "amount": amount}
            )

        # Simulate processing
        if user_id == "blocked":
            raise PermissionError("User account is blocked")

        logger.info(f"Payment processed: user={user_id}, amount=${amount:.2f}")
        return True

    except ValueError as e:
        # Client error: log as warning, don't need full traceback
        logger.warning(f"Invalid payment request: {e}")
        return False

    except PermissionError as e:
        # Expected business logic error
        logger.error(f"Payment blocked: user={user_id}, reason={e}")
        return False

    except Exception:
        # Unexpected error: log as critical WITH traceback
        logger.critical(
            f"Payment system failure: user={user_id}, amount={amount}",
            exc_info=True  # This includes the full traceback
        )
        raise


# Test scenarios
process_payment("user123", 50.00)
process_payment("user456", -10)
process_payment("blocked", 100)

Output:

2026-04-08 10:30:00 - myapp - INFO - Payment processed: user=user123, amount=$50.00
2026-04-08 10:30:00 - myapp - WARNING - Invalid payment request: Invalid amount: -10
2026-04-08 10:30:00 - myapp - ERROR - Payment blocked: user=blocked, reason=User account is blocked

The key insight: use exc_info=True for unexpected exceptions where you need the full stack trace. For expected exceptions (validation errors, business logic), a simple message is sufficient. Don’t log full tracebacks for every caught exception; doing so creates noise that obscures real problems.
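To see the difference in the log output itself, this sketch writes to an in-memory stream instead of the console. It uses logger.exception(), the standard shorthand for logging at ERROR level with exc_info=True; the logger name and format string are arbitrary choices.

```python
import io
import logging

stream = io.StringIO()
demo_logger = logging.getLogger("exc_info_demo")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False            # keep this demo out of the root logger
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
demo_logger.addHandler(handler)

try:
    1 / 0
except ZeroDivisionError as e:
    demo_logger.warning("Message only: %s", e)   # one line, no traceback
    demo_logger.exception("With traceback")      # ERROR level plus traceback

output = stream.getvalue()
print(output)
```

Only the second call appends the "Traceback (most recent call last):" section, so it is the one to reserve for failures you did not anticipate.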

[Figure: Pyro Pete setting up logging] Good logging turns a mystery crash into a five-minute fix.

Real-World Example: Resilient API Client with Retry Logic

Here’s a production-grade pattern combining everything we’ve covered: custom exceptions, context managers, logging, and retry logic for an API client.

import time
import logging
import json
from functools import wraps

logger = logging.getLogger(__name__)


class APIError(Exception):
    """Base API exception"""
    def __init__(self, message, status_code=None, response_body=None):
        super().__init__(message)
        self.status_code = status_code
        self.response_body = response_body


class RetryableError(APIError):
    """Error that should trigger a retry"""
    pass


class FatalError(APIError):
    """Error that should NOT be retried"""
    pass


def retry_on_failure(max_retries=3, base_delay=1, backoff_factor=2):
    """Decorator that retries on RetryableError with exponential backoff"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_exception = None

            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except RetryableError as e:
                    last_exception = e
                    if attempt < max_retries:
                        delay = base_delay * (backoff_factor ** attempt)
                        logger.warning(
                            f"Attempt {attempt + 1}/{max_retries + 1} failed: {e}. "
                            f"Retrying in {delay}s..."
                        )
                        time.sleep(delay)
                    else:
                        logger.error(
                            f"All {max_retries + 1} attempts failed for {func.__name__}"
                        )
                except FatalError:
                    # Don't retry fatal errors
                    raise

            raise last_exception

        return wrapper
    return decorator


class APIClient:
    """Resilient API client with proper exception handling"""

    def __init__(self, base_url, api_key):
        self.base_url = base_url
        self.api_key = api_key
        self._session = None

    def _classify_error(self, status_code, response_body):
        """Classify HTTP errors as retryable or fatal"""
        if status_code in (429, 502, 503, 504):
            raise RetryableError(
                f"Server returned {status_code}",
                status_code=status_code,
                response_body=response_body
            )
        elif status_code in (400, 401, 403, 404, 422):
            raise FatalError(
                f"Client error {status_code}: {response_body}",
                status_code=status_code,
                response_body=response_body
            )
        else:
            raise APIError(
                f"Unexpected status {status_code}",
                status_code=status_code,
                response_body=response_body
            )

    @retry_on_failure(max_retries=3, base_delay=1)
    def get(self, endpoint):
        """GET request with automatic retry on transient failures"""
        import urllib.request
        import urllib.error

        url = f"{self.base_url}/{endpoint}"
        req = urllib.request.Request(url)
        req.add_header('Authorization', f'Bearer {self.api_key}')

        try:
            with urllib.request.urlopen(req, timeout=10) as response:
                data = json.loads(response.read().decode())
                logger.info(f"GET {endpoint}: success")
                return data

        except urllib.error.HTTPError as e:
            # HTTPError subclasses URLError, so it must be handled first
            body = e.read().decode() if e.fp else ""
            self._classify_error(e.code, body)

        except urllib.error.URLError as e:
            raise RetryableError(f"Connection failed: {e.reason}") from e

        except json.JSONDecodeError as e:
            raise FatalError(f"Invalid JSON response: {e}") from e

    @retry_on_failure(max_retries=2, base_delay=2)
    def post(self, endpoint, data):
        """POST request with retry logic"""
        import urllib.request
        import urllib.error

        url = f"{self.base_url}/{endpoint}"
        payload = json.dumps(data).encode('utf-8')
        req = urllib.request.Request(url, data=payload, method='POST')
        req.add_header('Authorization', f'Bearer {self.api_key}')
        req.add_header('Content-Type', 'application/json')

        try:
            with urllib.request.urlopen(req, timeout=30) as response:
                result = json.loads(response.read().decode())
                logger.info(f"POST {endpoint}: success")
                return result

        except urllib.error.HTTPError as e:
            body = e.read().decode() if e.fp else ""
            self._classify_error(e.code, body)

        except urllib.error.URLError as e:
            raise RetryableError(f"Connection failed: {e.reason}") from e

        except json.JSONDecodeError as e:
            raise FatalError(f"Invalid JSON response: {e}") from e


# Usage example
client = APIClient("https://api.example.com", "my-api-key")

try:
    users = client.get("users")
    print(f"Fetched {len(users)} users")
except RetryableError as e:
    print(f"Service unavailable after retries: {e}")
except FatalError as e:
    print(f"Request invalid: {e}")
except APIError as e:
    print(f"API error: {e}")

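This pattern handles transient failures (network issues, rate limits, and 502/503/504 responses) with automatic retry and exponential backoff. Fatal errors like 401 or 404 fail immediately since retrying won't help. Any status outside both lists, such as 500, surfaces as a plain APIError without a retry. The decorator separates retry logic from business logic, keeping your code clean. One caution: retrying a POST is only safe when the endpoint is idempotent, because a request that timed out may still have been processed.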
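To check the backoff schedule without waiting for real sleeps, you can make the sleep function injectable. The sketch below is a compact variant of the decorator, not the one defined above: the sleep parameter, the stand-in RetryableError, and the flaky function are assumptions added for illustration.

```python
import time
from functools import wraps

class RetryableError(Exception):
    """Stand-in for the retryable exception class."""

def retry(max_retries=3, base_delay=1, backoff_factor=2, sleep=time.sleep):
    """Retry on RetryableError; `sleep` can be swapped out in tests."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except RetryableError:
                    if attempt == max_retries:
                        raise  # out of attempts: propagate the last error
                    sleep(base_delay * backoff_factor ** attempt)
        return wrapper
    return decorator

delays = []          # records requested delays instead of sleeping
calls = {"n": 0}

@retry(max_retries=3, base_delay=1, sleep=delays.append)
def flaky():
    """Fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetryableError("temporary failure")
    return "ok"

result = flaky()
print(result, calls["n"], delays)  # ok 3 [1, 2]
```

With base_delay=1 and backoff_factor=2 the waits are 1, 2, then 4 seconds; the third wait never happens here because the third attempt succeeds.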

[Figure: Cache Katie racing through retries] Exponential backoff is the polite way of saying “I will keep trying, but I will wait longer each time.”

Frequently Asked Questions

Should I use except Exception or except BaseException?

Almost always except Exception. The BaseException class includes SystemExit, KeyboardInterrupt, and GeneratorExit, which you almost never want to catch. Catching KeyboardInterrupt prevents Ctrl+C from working. Only use BaseException in top-level cleanup code where you truly need to catch everything.

When should I use EAFP vs LBYL style?

EAFP (Easier to Ask Forgiveness than Permission) means using try/except. LBYL (Look Before You Leap) means checking conditions first. Python idiomatically prefers EAFP. Use try/except when the check would be expensive or racy (like file existence checks). Use LBYL when the check is cheap and makes code clearer, like if key in dict.
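The two styles side by side, as a minimal sketch. The function names and the default timeout of 30 are illustrative; the missing path is created inside a fresh temporary directory so it is guaranteed not to exist.

```python
import os
import tempfile

def read_lbyl(path):
    """Look before you leap: check, then act."""
    if os.path.exists(path):
        # Race window: the file can disappear between the check and open().
        with open(path) as f:
            return f.read()
    return None

def read_eafp(path):
    """Ask forgiveness: act, then handle the failure."""
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return None

def get_timeout(config, default=30):
    """A cheap, race-free check where LBYL reads clearly."""
    if "timeout" in config:
        return config["timeout"]
    return default

missing = os.path.join(tempfile.mkdtemp(), "config.txt")
print(read_lbyl(missing), read_eafp(missing))  # None None
print(get_timeout({}), get_timeout({"timeout": 5}))  # 30 5
```

The EAFP version has no gap between checking and acting, which is why it is the safer choice for anything another process can change.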

Is it bad to use bare except clauses?

Yes, except: without specifying an exception type catches everything including SystemExit and KeyboardInterrupt. It also makes debugging harder because you don't know what went wrong. Always specify at least except Exception, and prefer more specific exception types.

How do I handle exceptions in async code?

Async exception handling uses the same try/except syntax inside async functions. When gathering multiple coroutines, pass return_exceptions=True to asyncio.gather() so that exceptions come back as items in the result list instead of the first one propagating out of the await. With asyncio.TaskGroup (Python 3.11+), a failing task cancels the remaining tasks, and the failures are raised together as an ExceptionGroup that you can handle with except*.
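Here is a minimal sketch of the gather pattern. The fetch coroutine is a stand-in that fails on request; no real network calls are made.

```python
import asyncio

async def fetch(name, fail=False):
    """Stand-in for a network call that may fail."""
    await asyncio.sleep(0)
    if fail:
        raise ConnectionError(f"{name} unreachable")
    return f"{name}: ok"

async def main():
    results = await asyncio.gather(
        fetch("a"),
        fetch("b", fail=True),
        fetch("c"),
        return_exceptions=True,   # exceptions are returned, not raised
    )
    # Results keep the order of the awaitables, so split them by type.
    successes = [r for r in results if not isinstance(r, BaseException)]
    failures = [r for r in results if isinstance(r, BaseException)]
    return successes, failures

successes, failures = asyncio.run(main())
print(successes)  # ['a: ok', 'c: ok']
print(failures)   # [ConnectionError('b unreachable')]
```

Without return_exceptions=True, the ConnectionError would propagate out of the await and the successful results would be lost to the caller.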

What's the performance cost of try/except?

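Entering a try block costs almost nothing when no exception occurs. Since CPython 3.11, exceptions are "zero-cost" on the happy path, and in earlier versions the setup was already very cheap. Actually raising and catching an exception is considerably more expensive than a simple if/else check, often by an order of magnitude or more, though the exact ratio depends on the Python version and the code involved. Don't use exceptions for normal control flow in hot loops (like ending iteration by catching IndexError). Use them for truly exceptional conditions.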
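Rather than trusting a fixed ratio, measure on your own interpreter. This sketch times a dictionary lookup both ways with timeit; the function names and the iteration count are arbitrary, and the timings will vary by machine and Python version.

```python
import timeit

data = {"present": 1}

def lbyl(key):
    if key in data:
        return data[key]
    return None

def eafp(key):
    try:
        return data[key]
    except KeyError:
        return None

def measure(func, key, number=100_000):
    """Seconds taken by `number` calls of func(key)."""
    return timeit.timeit(lambda: func(key), number=number)

# "hit" never raises; "miss" raises and catches KeyError on every EAFP call.
for label, key in (("hit", "present"), ("miss", "absent")):
    print(f"{label:4}  LBYL: {measure(lbyl, key):.4f}s  EAFP: {measure(eafp, key):.4f}s")
```

The gap between the hit and miss rows for EAFP is the cost of raising and catching, which is what matters when deciding whether an exception is rare enough to rely on.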

How do I test exception handling code?

Use pytest.raises as a context manager to verify exceptions are raised correctly. For testing retry logic, use unittest.mock.patch to simulate failures. Test both the happy path and each exception path. Verify that the right exception type, message, and attributes are set.
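The sketch below shows both ideas using only the standard library, so it runs without installing anything: assertRaises plays the role of pytest.raises, and a Mock with side_effect simulates failures. The fetch_with_retry helper is hypothetical, and validate_age is a trimmed copy of the earlier function.

```python
import unittest
from unittest import mock

def validate_age(age):
    if age < 0:
        raise ValueError(f"Age cannot be negative: {age}")
    return age

def fetch_with_retry(fetch, attempts=3):
    """Hypothetical helper: call fetch() until it succeeds or attempts run out."""
    for attempt in range(attempts):
        try:
            return fetch()
        except ConnectionError:
            if attempt == attempts - 1:
                raise

class ExceptionTests(unittest.TestCase):
    def test_negative_age_raises(self):
        with self.assertRaises(ValueError) as ctx:
            validate_age(-5)
        self.assertIn("negative", str(ctx.exception))  # check the message too

    def test_retry_recovers(self):
        # First call raises, second call returns a value.
        fetch = mock.Mock(side_effect=[ConnectionError("down"), "payload"])
        self.assertEqual(fetch_with_retry(fetch), "payload")
        self.assertEqual(fetch.call_count, 2)

    def test_retry_gives_up(self):
        fetch = mock.Mock(side_effect=ConnectionError("down"))
        with self.assertRaises(ConnectionError):
            fetch_with_retry(fetch, attempts=2)
        self.assertEqual(fetch.call_count, 2)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExceptionTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a pytest suite, the with self.assertRaises(...) blocks become with pytest.raises(...), and the Mock objects work unchanged.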

Wrapping Up

Exception handling separates production-ready code from scripts that work on your laptop. The patterns we covered (specific exception catching, custom hierarchies, context managers, logging strategies, and retry decorators) form the foundation of resilient Python applications.

The key principles to remember: catch specific exceptions rather than broad ones, use else and finally blocks intentionally, create custom exception classes for your domain, log with appropriate severity levels, and design your error recovery strategy before writing the happy path code.

Start applying these patterns incrementally. Pick one codebase and replace bare except clauses with specific ones. Add custom exceptions to your next project. Build a retry decorator for your API calls. Each improvement makes your code more debuggable and your production systems more reliable.
