Pubs | Python How To Program

How To Connect Python to PostgreSQL with psycopg

Intermediate

PostgreSQL is one of the most powerful open-source relational databases, and Python developers interact with it constantly — whether building web APIs, running data pipelines, or managing application state. If you have ever needed to store structured data beyond what SQLite can handle, PostgreSQL is usually the next step.

The good news is that psycopg (version 3, the modern successor to the venerable psycopg2) makes connecting Python to PostgreSQL straightforward and safe. It supports parameterized queries out of the box, offers connection pooling through its companion psycopg_pool package, and has native async support. You can install it with a single pip install "psycopg[binary]" command and be running queries in minutes.

In this article, we will cover everything you need to connect Python to PostgreSQL. We will start with a quick example showing a basic connection and query, then explain the psycopg library and why it is the recommended adapter. From there, we will walk through CRUD operations (Create, Read, Update, Delete), parameterized queries for security, connection pooling for performance, error handling patterns, and finish with a complete real-life project that builds a task manager backed by PostgreSQL.

Connecting Python to PostgreSQL: Quick Example

Here is a minimal working example that connects to a PostgreSQL database, creates a table, inserts a row, and reads it back. This gives you the core pattern you will use in every PostgreSQL project.

# quick_example.py
import psycopg

# Connect to PostgreSQL (adjust these for your setup)
conn_string = "host=localhost dbname=testdb user=postgres password=postgres"

with psycopg.connect(conn_string) as conn:
    with conn.cursor() as cur:
        # Create a simple table
        cur.execute("""
            CREATE TABLE IF NOT EXISTS greetings (
                id SERIAL PRIMARY KEY,
                message TEXT NOT NULL
            )
        """)
        # Insert a row
        cur.execute("INSERT INTO greetings (message) VALUES (%s)", ("Hello from Python!",))
        conn.commit()

        # Read it back
        cur.execute("SELECT id, message FROM greetings ORDER BY id DESC LIMIT 1")
        row = cur.fetchone()
        print(f"ID: {row[0]}, Message: {row[1]}")

Output:

ID: 1, Message: Hello from Python!

The key things to notice: we use psycopg.connect() with a connection string, wrap everything in with blocks for automatic cleanup, and use %s placeholders for parameterized queries (never string formatting). The conn.commit() call makes the insert permanent. Want to go deeper? Below we cover connection options, all four CRUD operations, pooling, and a complete project.

What Is psycopg and Why Use It?

psycopg is the most popular PostgreSQL adapter for Python. Version 3 (just called psycopg) is a complete rewrite of the classic psycopg2 that powered Django, Flask, and countless Python applications for over a decade. The new version brings a cleaner API, native async support, and better type handling while keeping the reliability developers trusted.

Here is how psycopg compares to other options for connecting Python to PostgreSQL:

Feature	psycopg (v3)	psycopg2	asyncpg
Python 3.7+ support	Yes	Yes	Yes
Async support	Built-in	No (needs wrappers)	Async only
Connection pooling	Separate package (psycopg_pool)	Built-in (psycopg2.pool)	Built-in
Parameterized queries	%s and named	%s and named	$1, $2 style
COPY support	Excellent	Good	Good
Active development	Yes (recommended)	Maintenance only	Yes
Django/Flask compatible	Yes	Yes	Limited

For most Python developers, psycopg (v3) is the right choice. It handles both sync and async workflows, is well documented, and is where the psycopg project directs new development. The rest of this article uses psycopg v3 exclusively.

Installing psycopg

The easiest way to install psycopg is with the binary package, which bundles the C library so you do not need PostgreSQL development headers installed:

# install_psycopg.sh
pip install "psycopg[binary]"

Output:

Successfully installed psycopg-3.1.18 psycopg-binary-3.1.18

If you prefer to compile the C extension locally (common in production Docker images), install it with pip install "psycopg[c]" and make sure a C compiler and the libpq development headers (libpq-dev on Debian and Ubuntu) are available. For development and tutorials, the binary option is the fastest path.

Connecting to PostgreSQL

psycopg offers several ways to specify your connection. The most common patterns are a connection string (DSN) and keyword arguments. Both produce identical results — choose whichever reads better in your codebase.

# connection_methods.py
import psycopg

# Method 1: Connection string (DSN)
conn1 = psycopg.connect("host=localhost dbname=myapp user=appuser password=secret")

# Method 2: Keyword arguments
conn2 = psycopg.connect(
    host="localhost",
    dbname="myapp",
    user="appuser",
    password="secret",
    port=5432
)

# Method 3: PostgreSQL URI format
conn3 = psycopg.connect("postgresql://appuser:secret@localhost:5432/myapp")

# Always use context managers for automatic cleanup
with psycopg.connect("host=localhost dbname=myapp user=appuser password=secret") as conn:
    print(f"Connected to: {conn.info.dbname}")
    print(f"Server version: {conn.info.server_version}")

conn1.close()
conn2.close()
conn3.close()

Output:

Connected to: myapp
Server version: 160001

The context manager pattern (with psycopg.connect(...) as conn) is strongly recommended. It automatically commits the transaction on success, rolls back on exception, and closes the connection when the block exits. This prevents connection leaks and orphaned transactions — two of the most common PostgreSQL headaches in production.

CRUD Operations with psycopg

CREATE: Inserting Data

Inserting data uses cursor.execute() with parameterized queries. Always use %s placeholders — never f-strings or string concatenation. Parameterized queries prevent SQL injection and handle type conversion automatically.

# insert_data.py
import psycopg

with psycopg.connect("host=localhost dbname=testdb user=postgres password=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS users (
                id SERIAL PRIMARY KEY,
                name TEXT NOT NULL,
                email TEXT UNIQUE NOT NULL,
                age INTEGER
            )
        """)

        # Insert a single row with parameterized query
        cur.execute(
            "INSERT INTO users (name, email, age) VALUES (%s, %s, %s)",
            ("Alice Chen", "alice@example.com", 28)
        )

        # Insert multiple rows efficiently with executemany
        new_users = [
            ("Bob Park", "bob@example.com", 34),
            ("Carol Smith", "carol@example.com", 22),
            ("Dave Wilson", "dave@example.com", 45),
        ]
        cur.executemany(
            "INSERT INTO users (name, email, age) VALUES (%s, %s, %s)",
            new_users
        )

        conn.commit()
        print(f"Inserted {1 + len(new_users)} users successfully")

Output:

Inserted 4 users successfully

The executemany() method is cleaner than looping with individual execute() calls, and psycopg optimizes it internally. For truly large batches (thousands of rows), look into cursor.copy() which uses PostgreSQL’s COPY protocol and is dramatically faster.

READ: Querying Data

Reading data involves executing a SELECT query and fetching results. psycopg gives you several fetch options depending on how much data you expect.

# read_data.py
import psycopg

with psycopg.connect("host=localhost dbname=testdb user=postgres password=postgres") as conn:
    with conn.cursor() as cur:
        # Fetch all rows
        cur.execute("SELECT id, name, email, age FROM users ORDER BY name")
        all_users = cur.fetchall()
        print("All users:")
        for user in all_users:
            print(f"  {user[0]}: {user[1]} ({user[2]}), age {user[3]}")

        # Fetch one row
        cur.execute("SELECT name, age FROM users WHERE email = %s", ("alice@example.com",))
        alice = cur.fetchone()
        print(f"\nFound: {alice[0]}, age {alice[1]}")

        # Use a row factory for named columns (much more readable)
        with conn.cursor(row_factory=psycopg.rows.dict_row) as dict_cur:
            dict_cur.execute(
                "SELECT name, email, age FROM users WHERE age > %s ORDER BY name", (25,)
            )
            older_users = dict_cur.fetchall()
        print("\nUsers over 25:")
        for u in older_users:
            print(f"  {u['name']}: {u['email']}, age {u['age']}")

Output:

All users:
  1: Alice Chen (alice@example.com), age 28
  2: Bob Park (bob@example.com), age 34
  3: Carol Smith (carol@example.com), age 22
  4: Dave Wilson (dave@example.com), age 45

Found: Alice Chen, age 28

Users over 25:
  Alice Chen: alice@example.com, age 28
  Bob Park: bob@example.com, age 34
  Dave Wilson: dave@example.com, age 45

The dict_row row factory is a game-changer for readability. Instead of accessing columns by index (row[0], row[1]), you use names (row['name'], row['email']). This makes your code self-documenting and resilient to column order changes.

UPDATE: Modifying Data

Updates follow the same parameterized pattern. The rowcount attribute tells you how many rows were affected.

# update_data.py
import psycopg

with psycopg.connect("host=localhost dbname=testdb user=postgres password=postgres") as conn:
    with conn.cursor() as cur:
        # Update a single user
        cur.execute(
            "UPDATE users SET age = %s WHERE email = %s",
            (29, "alice@example.com")
        )
        print(f"Updated {cur.rowcount} row(s)")

        # Update multiple rows with a condition
        cur.execute(
            "UPDATE users SET age = age + 1 WHERE age < %s",
            (30,)
        )
        print(f"Birthday bump: {cur.rowcount} user(s) aged up")

        conn.commit()

Output:

Updated 1 row(s)
Birthday bump: 2 user(s) aged up

Always check cur.rowcount after updates and deletes. If it returns 0 when you expected changes, your WHERE clause might be wrong -- and catching that early saves hours of debugging.

DELETE: Removing Data

Deletes work the same way. Be cautious with DELETE statements -- a missing WHERE clause deletes everything in the table.

# delete_data.py
import psycopg

with psycopg.connect("host=localhost dbname=testdb user=postgres password=postgres") as conn:
    with conn.cursor() as cur:
        # Delete a specific user
        cur.execute(
            "DELETE FROM users WHERE email = %s",
            ("dave@example.com",)
        )
        print(f"Deleted {cur.rowcount} user(s)")

        # Verify the deletion
        cur.execute("SELECT COUNT(*) FROM users")
        count = cur.fetchone()[0]
        print(f"Remaining users: {count}")

        conn.commit()

Output:

Deleted 1 user(s)
Remaining users: 3

Error Handling

Database operations fail in predictable ways -- duplicate keys, connection drops, malformed queries. psycopg raises specific exception types for each, so you can handle them precisely.

# error_handling.py
import psycopg
from psycopg import errors

conn_string = "host=localhost dbname=testdb user=postgres password=postgres"

try:
    with psycopg.connect(conn_string) as conn:
        with conn.cursor() as cur:
            # This will fail if email already exists (UNIQUE constraint)
            cur.execute(
                "INSERT INTO users (name, email, age) VALUES (%s, %s, %s)",
                ("Alice Chen", "alice@example.com", 28)
            )
            conn.commit()
except errors.UniqueViolation as e:
    print(f"Duplicate entry: {e.diag.message_detail}")
except errors.OperationalError as e:
    print(f"Connection problem: {e}")
except errors.ProgrammingError as e:
    print(f"SQL error: {e}")
except Exception as e:
    print(f"Unexpected error: {type(e).__name__}: {e}")

Output:

Duplicate entry: Key (email)=(alice@example.com) already exists.

The psycopg.errors module maps every PostgreSQL error code to a Python exception class. UniqueViolation, ForeignKeyViolation, CheckViolation -- they are all there. This lets you show users a friendly "email already taken" message instead of a raw database error.

Connection Pooling

Creating a new database connection for every request is slow (each connection involves a TCP handshake, authentication, and memory allocation on the server). Connection pooling solves this by maintaining a set of open connections that get reused across requests. In psycopg 3 the pool lives in a separate package, psycopg_pool, which you can install with pip install "psycopg[pool]".

# connection_pool.py
from psycopg_pool import ConnectionPool

# Create a pool with min 2, max 10 connections
pool = ConnectionPool(
    "host=localhost dbname=testdb user=postgres password=postgres",
    min_size=2,
    max_size=10
)

# Use connections from the pool
with pool.connection() as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT COUNT(*) FROM users")
        count = cur.fetchone()[0]
        print(f"User count: {count}")

# The connection is returned to the pool, not closed
with pool.connection() as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT name FROM users ORDER BY id LIMIT 1")
        name = cur.fetchone()[0]
        print(f"First user: {name}")

# Get pool stats
stats = pool.get_stats()
print(f"Pool size: {stats['pool_size']}, available: {stats['pool_available']}")

pool.close()

Output:

User count: 3
First user: Alice Chen
Pool size: 2, available: 2

In a web application (Flask, FastAPI, Django), you would create the pool once at startup and share it across all request handlers. This dramatically reduces latency since connections are reused instead of created fresh for every HTTP request. The max_size parameter prevents your application from overwhelming the database with too many simultaneous connections.

Working with Transactions

By default, psycopg starts a transaction with the first statement you execute and keeps it open until you commit or roll back. The connection context manager commits on success and rolls back on failure. But sometimes you need more control -- for example, when multiple operations must succeed or fail together.

# transactions.py
import psycopg

conn_string = "host=localhost dbname=testdb user=postgres password=postgres"

with psycopg.connect(conn_string) as conn:
    # Explicit transaction control
    try:
        with conn.transaction():
            with conn.cursor() as cur:
                # Both operations must succeed
                cur.execute(
                    "UPDATE users SET age = age - 1 WHERE name = %s",
                    ("Alice Chen",)
                )
                cur.execute(
                    "UPDATE users SET age = age + 1 WHERE name = %s",
                    ("Bob Park",)
                )
                print("Both updates committed together")
    except Exception as e:
        print(f"Transaction rolled back: {e}")

    # Nested transaction blocks become savepoints
    with conn.transaction():
        with conn.cursor() as cur:
            cur.execute("INSERT INTO users (name, email, age) VALUES (%s, %s, %s)",
                        ("Eve Brown", "eve@example.com", 31))
            try:
                with conn.transaction():
                    # Same email again: violates the UNIQUE constraint on purpose
                    cur.execute("INSERT INTO users (name, email, age) VALUES (%s, %s, %s)",
                                ("Eve Brown", "eve@example.com", 31))
            except psycopg.errors.UniqueViolation:
                print("Inner savepoint rolled back, outer transaction continues")

    # No conn.commit() here: the outer block commits when it exits,
    # and psycopg rejects an explicit commit() inside a transaction() block
    print("Eve inserted successfully")

Output:

Both updates committed together
Inner savepoint rolled back, outer transaction continues
Eve inserted successfully

The conn.transaction() context manager creates a savepoint when nested. This is incredibly useful for "try this, but if it fails, keep going" patterns -- common in data import pipelines where you want to skip bad rows without losing the entire batch.
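To see the skip-bad-rows pattern without a running PostgreSQL server, here is a minimal sketch that swaps in Python's built-in sqlite3 module and issues the SAVEPOINT statements by hand. It is a stand-in, not psycopg: the table, the sample emails, and the savepoint name row_import are made up for illustration, and with psycopg you would nest conn.transaction() blocks instead of writing this SQL yourself.

```python
import sqlite3

# isolation_level=None turns off sqlite3's implicit transactions,
# so every BEGIN / SAVEPOINT / COMMIT below is explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (email TEXT UNIQUE NOT NULL)")

# Hypothetical import batch with one duplicate row
batch = ["a@example.com", "b@example.com", "a@example.com", "c@example.com"]
skipped = []

conn.execute("BEGIN")
for email in batch:
    conn.execute("SAVEPOINT row_import")
    try:
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    except sqlite3.IntegrityError:
        # Undo only this row; the rows inserted earlier stay in the transaction
        conn.execute("ROLLBACK TO SAVEPOINT row_import")
        skipped.append(email)
    conn.execute("RELEASE SAVEPOINT row_import")
conn.execute("COMMIT")

imported = [row[0] for row in conn.execute("SELECT email FROM users ORDER BY email")]
print("Imported:", imported)
print("Skipped:", skipped)
conn.close()
```

One difference worth keeping in mind: in PostgreSQL a failed statement puts the whole transaction into an aborted state until you roll back, so the savepoint is what lets the batch continue. SQLite is more forgiving here, which is why this sketch only illustrates the shape of the pattern.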

Real-Life Example: Building a Task Manager CLI

Let us tie everything together with a complete task manager that stores tasks in PostgreSQL. This project uses connection pooling, parameterized queries, dict_row results, and all four CRUD operations.

# task_manager.py
import psycopg
from psycopg_pool import ConnectionPool
from psycopg import errors
from datetime import datetime

DB_URL = "host=localhost dbname=testdb user=postgres password=postgres"

def setup_database(pool):
    """Create the tasks table if it does not exist."""
    with pool.connection() as conn:
        with conn.cursor() as cur:
            cur.execute("""
                CREATE TABLE IF NOT EXISTS tasks (
                    id SERIAL PRIMARY KEY,
                    title TEXT NOT NULL,
                    description TEXT DEFAULT '',
                    status TEXT DEFAULT 'pending',
                    created_at TIMESTAMP DEFAULT NOW(),
                    completed_at TIMESTAMP
                )
            """)
            conn.commit()

def add_task(pool, title, description=""):
    """Add a new task and return its ID."""
    with pool.connection() as conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO tasks (title, description) VALUES (%s, %s) RETURNING id",
                (title, description)
            )
            task_id = cur.fetchone()[0]
            conn.commit()
            return task_id

def list_tasks(pool, status_filter=None):
    """List tasks, optionally filtered by status."""
    with pool.connection() as conn:
        with conn.cursor(row_factory=psycopg.rows.dict_row) as cur:
            if status_filter:
                cur.execute(
                    "SELECT id, title, status, created_at FROM tasks WHERE status = %s ORDER BY created_at",
                    (status_filter,)
                )
            else:
                cur.execute("SELECT id, title, status, created_at FROM tasks ORDER BY created_at")
            return cur.fetchall()

def complete_task(pool, task_id):
    """Mark a task as completed."""
    with pool.connection() as conn:
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE tasks SET status = %s, completed_at = %s WHERE id = %s",
                ("completed", datetime.now(), task_id)
            )
            conn.commit()
            return cur.rowcount > 0

def delete_task(pool, task_id):
    """Delete a task by ID."""
    with pool.connection() as conn:
        with conn.cursor() as cur:
            cur.execute("DELETE FROM tasks WHERE id = %s", (task_id,))
            conn.commit()
            return cur.rowcount > 0

# Demo usage
pool = ConnectionPool(DB_URL, min_size=2, max_size=5)
setup_database(pool)

# Add some tasks
id1 = add_task(pool, "Learn psycopg", "Complete the PostgreSQL tutorial")
id2 = add_task(pool, "Build REST API", "Create FastAPI endpoints for tasks")
id3 = add_task(pool, "Write tests", "Add pytest coverage for database layer")
print(f"Created tasks: {id1}, {id2}, {id3}")

# List all tasks
print("\nAll tasks:")
for task in list_tasks(pool):
    print(f"  [{task['status']}] #{task['id']}: {task['title']}")

# Complete a task
complete_task(pool, id1)
print(f"\nCompleted task #{id1}")

# List pending tasks only
print("\nPending tasks:")
for task in list_tasks(pool, "pending"):
    print(f"  #{task['id']}: {task['title']}")

# Delete a task
delete_task(pool, id3)
print(f"\nDeleted task #{id3}")

# Final count
print(f"\nTotal tasks remaining: {len(list_tasks(pool))}")

pool.close()

Output:

Created tasks: 1, 2, 3

All tasks:
  [pending] #1: Learn psycopg
  [pending] #2: Build REST API
  [pending] #3: Write tests

Completed task #1

Pending tasks:
  #2: Build REST API
  #3: Write tests

Deleted task #3

Total tasks remaining: 2

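This task manager pulls together the main ideas from the article: connecting through a pool, parameterized queries for safety, dict_row for readable results, RETURNING clauses for getting generated IDs, and explicit commits after each write. It leaves out error handling to stay short; in a real application you would wrap the calls in try/except blocks like the ones in the Error Handling section. You could extend this into a full web application by wrapping these functions in FastAPI or Flask endpoints.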

Frequently Asked Questions

Should I use psycopg2 or psycopg (v3)?

For new projects, use psycopg v3 (installed as pip install psycopg). It has native async support, a cleaner API, a companion connection pool package (psycopg_pool), and is actively developed. psycopg2 is in maintenance mode -- it still works, but new features and improvements only land in v3. The migration is usually straightforward since the core concepts (parameterized queries, cursors, context managers) are the same.

How do I prevent SQL injection with psycopg?

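Always use parameterized queries with %s placeholders: cur.execute("SELECT * FROM users WHERE id = %s", (user_id,)). Never use f-strings, string concatenation, or format() to build SQL. psycopg sends the values separately from the SQL text and handles type conversion automatically, so user input is never interpreted as SQL. Placeholders only work for values, though: if a table or column name has to be dynamic, build the query with the psycopg.sql module (sql.SQL and sql.Identifier) rather than string formatting.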
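If you want to watch an injection succeed and then fail, here is a small sketch you can run without a PostgreSQL server. It swaps in Python's built-in sqlite3 module, which uses ? placeholders instead of %s, so treat it as an illustration of the principle rather than psycopg code. The table and the malicious input are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Alice",), ("Bob",)])

# Hypothetical attacker input: closes the quote and adds a condition that is always true
malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes the query's logic
unsafe_sql = f"SELECT name FROM users WHERE name = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and compared literally
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(f"String formatting returned {len(unsafe_rows)} rows")  # 2: every user leaks
print(f"Placeholder returned {len(safe_rows)} rows")          # 0: no such name
conn.close()
```

The same contrast holds with psycopg: only the placeholder syntax changes.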

Can I use psycopg with async/await?

Yes. psycopg v3 has a built-in async module: from psycopg import AsyncConnection. Use await AsyncConnection.connect() and await cursor.execute(). It works with asyncio, FastAPI, and any other async framework. The async connection pool is AsyncConnectionPool from psycopg_pool.

How many connections should my pool have?

A good starting point is min_size=2, max_size=10 for small applications. The PostgreSQL wiki suggests a rule of thumb for the total number of active connections a database server handles well: connections = (core_count * 2) + effective_spindle_count, where core_count is the number of CPU cores on the database server. In practice, most web applications work well with 10-20 connections in the pool. Monitor your PostgreSQL pg_stat_activity view to see actual connection usage and tune from there.
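As a quick worked example, the formula is easy to turn into a helper. This sketch assumes the core count describes the database server rather than the machine running your Python code; the function name and the sample numbers are made up for illustration.

```python
def suggested_connections(core_count: int, effective_spindle_count: int = 0) -> int:
    """Apply the rule of thumb (core_count * 2) + effective_spindle_count."""
    if core_count < 1 or effective_spindle_count < 0:
        raise ValueError("core_count must be >= 1 and spindle count must be >= 0")
    return core_count * 2 + effective_spindle_count

# Hypothetical database server: 4 CPU cores and a single disk
print(suggested_connections(4, 1))  # 9

# Hypothetical 8-core server whose data fits in memory (spindle count 0)
print(suggested_connections(8))     # 16
```

Treat the result as a starting value for max_size across all application instances combined, then adjust based on what you observe in pg_stat_activity.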

How do I store database credentials securely?

Never hardcode credentials in your source code. Use environment variables (os.environ['DATABASE_URL']), a .env file loaded with python-dotenv, or a secrets manager (AWS Secrets Manager, HashiCorp Vault). PostgreSQL also supports a ~/.pgpass file for local development. For connection strings, the standard DATABASE_URL environment variable works with most frameworks and deployment platforms.
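Here is a minimal sketch of the environment-variable approach. The helper name get_database_url and the sample URL are made up for illustration; the only convention it relies on is the DATABASE_URL variable mentioned above.

```python
import os

def get_database_url(env=None):
    """Return the connection URL from DATABASE_URL, failing loudly if it is missing."""
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "").strip()
    if not url:
        raise RuntimeError("DATABASE_URL is not set; export it before starting the app")
    return url

# Demo with a stand-in mapping so the snippet runs anywhere;
# a real script would call get_database_url() and read os.environ.
fake_env = {"DATABASE_URL": "postgresql://appuser:secret@localhost:5432/myapp"}
db_url = get_database_url(fake_env)

# Print only the part after the credentials so the password never reaches the logs
print(db_url.split("@")[-1])  # localhost:5432/myapp
```

You would then pass the returned string straight to psycopg.connect() or ConnectionPool(), since both accept the postgresql:// URI format.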

Conclusion

You now have a solid foundation for connecting Python to PostgreSQL with psycopg. We covered the essential workflow: installing psycopg[binary], establishing connections with context managers, running all four CRUD operations with parameterized queries, handling database errors gracefully, and using connection pooling for production performance. The task manager project ties all these concepts into a practical, extensible application.

From here, try extending the task manager with features like priority levels, due dates, or full-text search using PostgreSQL's tsvector type. Psycopg handles all of these naturally since it passes your SQL through to PostgreSQL without limiting which features you can use.

For the complete API reference and advanced topics like COPY operations, async usage, and custom type adapters, check the official psycopg documentation at www.psycopg.org/psycopg3/docs/.

How To Use Command Line Arguments in Python

by Pubs | Beginner, Input Output

Intermediate

Why Command Line Arguments Matter

Imagine you’ve written a Python script that processes data files. Right now, the filename is hard-coded inside the script, you run it manually every morning, and you copy-paste the results into a spreadsheet. Tomorrow, you need to process a different file. What if your script could accept the filename, output format, and processing options directly from the terminal? Command line arguments turn a rigid script into a flexible tool that fits into automation pipelines, cron jobs, and CI/CD systems.

Good news: Python makes this straightforward. You already have everything you need in the standard library. The sys module gives you raw access to command line arguments via sys.argv, and for more complex tools, the argparse module handles parsing, validation, and automatic help text generation. Both are built-in — no external packages required.

In this article, you’ll learn how to capture and use command line arguments in your Python scripts. We’ll start with sys.argv for simple cases, explore why Python doesn’t have argc, then dive deep into argparse for professional-grade CLI tools. You’ll see how to add required and optional arguments, set defaults, enforce type conversion, create mutually exclusive groups, and build subcommands like git commit and git push. By the end, you’ll build a complete file-processing CLI tool and understand when to reach for third-party libraries like click and typer.

Command Line Arguments in Python: Quick Example

Here’s the fastest way to access command line arguments in Python:

# quick_example.py
import sys

# sys.argv is a list of strings
# sys.argv[0] is the script name
# sys.argv[1:] are the arguments passed to the script

if len(sys.argv) < 2:
    print("Usage: python quick_example.py <name>")
    sys.exit(1)

name = sys.argv[1]
print(f"Hello, {name}!")

Output:

$ python quick_example.py Alice
Hello, Alice!

$ python quick_example.py
Usage: python quick_example.py <name>

When you run a Python script from the terminal with arguments, those arguments end up in a list called sys.argv. The first element (index 0) is always the name of your script. The rest are whatever you typed after the script name. This is the foundation for all command line input in Python.

For simple scripts with one or two arguments, this is perfectly fine. But for tools with multiple arguments, flags, and options, you’ll want argparse. Let’s explore what’s actually happening under the hood first.

What Are Command Line Arguments and Why Use Them?

Command line arguments are values you pass to a program when you run it from the terminal. They’re the text that appears after your program name. For example:

python process_data.py input.csv output.json --verbose --format=json

In this case, process_data.py is the script name, input.csv and output.json are positional arguments, and --verbose and --format=json are optional flag arguments.

Command line arguments are essential because they let users control your script without editing code. They make your script reusable, testable, and compatible with automation tools. A script that only processes one hard-coded file is a toy. A script that accepts a file path as an argument is a real tool that others can use in pipelines and cron jobs.

Three types of command line arguments exist:

Type	Example	Purpose
Positional	`python script.py input.txt`	Required values passed in order (like function arguments)
Optional flags	`python script.py --verbose`	Boolean switches or named options (prefixed with `--` or `-`)
Subcommands	`git commit -m "msg"`	Different commands with their own arguments (like `git push` vs `git pull`)

Python’s sys.argv gives you raw access to everything. The argparse module wraps that complexity and handles validation, type conversion, help text, and error messages for you.
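As a rough sketch of how the three types look in code, the snippet below declares one of each with argparse and parses a sample command line passed as a list. The tool name, the convert subcommand, and its arguments are made up for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog="tool")

# Optional flag: a boolean switch that defaults to False
parser.add_argument("--verbose", action="store_true")

# Subcommands: each one gets its own parser and its own arguments
subparsers = parser.add_subparsers(dest="command", required=True)
convert = subparsers.add_parser("convert")

# Positional argument: required, identified by its position
convert.add_argument("input_file")
# Named option that carries a value
convert.add_argument("--format", default="json")

# Equivalent to typing: tool --verbose convert input.csv --format=json
args = parser.parse_args(["--verbose", "convert", "input.csv", "--format=json"])
print(args.verbose, args.command, args.input_file, args.format)
# True convert input.csv json
```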

Understanding sys.argv: The Foundation

sys.argv is a simple list. When Python runs your script, it automatically populates this list with everything typed on the command line. Let’s see what’s actually in it:

# inspect_argv.py
import sys

print("sys.argv contents:")
print(sys.argv)
print()
print("Script name (argv[0]):", sys.argv[0])
print("All arguments (argv[1:]):", sys.argv[1:])
print("Number of arguments:", len(sys.argv) - 1)

Output (when run with different arguments):

$ python inspect_argv.py hello world 42
sys.argv contents:
['inspect_argv.py', 'hello', 'world', '42']

Script name (argv[0]): inspect_argv.py
All arguments (argv[1:]): ['hello', 'world', '42']
Number of arguments: 3

This reveals something important: every element in sys.argv is a string. Even though you typed 42, it’s stored as the string '42'. If you need an integer, you must convert it yourself using int(). This is why argparse exists — it handles type conversion automatically.
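A minimal sketch of doing that conversion by hand is shown below. The helper name read_int_argument is made up, and the argument lists are simulated so the snippet runs without a terminal; in a real script you would pass sys.argv.

```python
def read_int_argument(argv, position=1):
    """Return argv[position] as an int, or None if it is missing or not a whole number."""
    if len(argv) <= position:
        return None
    try:
        return int(argv[position])
    except ValueError:
        return None

# Forgetting the conversion fails quietly: this repeats the string instead of doubling it
print("42" * 2)                                         # 4242

print(read_int_argument(["inspect_argv.py", "42"]))     # 42
print(read_int_argument(["inspect_argv.py", "hello"]))  # None
print(read_int_argument(["inspect_argv.py"]))           # None
```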

Why Python Doesn’t Have argc (And What To Use Instead)

You might know that C and C++ pass both argc (argument count) and argv (argument values) to main(). Python doesn’t have argc because sys.argv is a list, and lists have a built-in length. To get the argument count in Python, you simply use len(sys.argv), which, like argc, includes the script name itself.

Here’s the comparison:

Language	Get argument count	Get argument value
C	`argc`	`argv[0]`, `argv[1]`, …
JavaScript (Node.js)	`process.argv.length`	`process.argv[2]` onward (indexes 0 and 1 hold the node executable and the script path)
Python	`len(sys.argv)`	`sys.argv[0]`, `sys.argv[1]`, …

Since Python gives you a list directly, you get the count for free. This is more Pythonic — simpler, fewer moving parts.

Parsing sys.argv Manually for Simple Scripts

For a script with just one or two arguments, manual parsing is often clearer than adding argparse:

# backup.py
import sys
import shutil
from pathlib import Path

if len(sys.argv) < 2:
    print("Usage: python backup.py <source> [destination]")
    sys.exit(1)

source = sys.argv[1]
destination = sys.argv[2] if len(sys.argv) > 2 else f"{source}.backup"

source_path = Path(source)
if not source_path.exists():
    print(f"Error: {source} does not exist")
    sys.exit(1)

shutil.copy(source_path, destination)
print(f"Backed up {source} to {destination}")

Output:

$ python backup.py config.json
Backed up config.json to config.json.backup

$ python backup.py config.json config_v2.json
Backed up config.json to config_v2.json

$ python backup.py nonexistent.json
Error: nonexistent.json does not exist

This pattern works: check the length of sys.argv, extract arguments by index, validate them, and exit with an error code if anything is wrong. The downside is that you’re building your own help text, validation, and error messages. When your script grows to five or more arguments, argparse becomes worth the overhead.

Introducing argparse: The Standard Solution

The argparse module is Python’s built-in tool for building professional command line interfaces. It handles parsing, validation, type conversion, help text generation, and error messages. Here’s a minimal example:

# greet.py
import argparse

parser = argparse.ArgumentParser(description="Greet someone by name")
parser.add_argument("name", help="The name to greet")
parser.add_argument("--formal", action="store_true", help="Use formal greeting")

args = parser.parse_args()

if args.formal:
    print(f"Good day, {args.name}. How do you do?")
else:
    print(f"Hey {args.name}!")

Output:

$ python greet.py Alice
Hey Alice!

$ python greet.py Alice --formal
Good day, Alice. How do you do?

$ python greet.py --help
usage: greet.py [-h] [--formal] name

Greet someone by name

positional arguments:
  name        The name to greet

options:
  -h, --help  show this help message and exit
  --formal    Use formal greeting

Notice what just happened: you didn’t write any help text manually. argparse generated it from the description and help parameters you provided. It also validated that the required name argument was provided, parsed the --formal flag, and made the values accessible as attributes on the args object.

The structure is always the same: create a parser, add arguments to it, then call parse_args() to get back an object with the parsed values.
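One detail that makes this structure easy to test: parse_args() accepts an explicit list of strings and only falls back to sys.argv[1:] when you pass nothing. The sketch below reworks the greet example around that idea; the build_parser and greet function names are made up for illustration.

```python
import argparse

def build_parser():
    """Step 1 and 2: create the parser and add arguments to it."""
    parser = argparse.ArgumentParser(description="Greet someone by name")
    parser.add_argument("name", help="The name to greet")
    parser.add_argument("--formal", action="store_true", help="Use formal greeting")
    return parser

def greet(argv=None):
    """Step 3: parse argv (sys.argv[1:] when argv is None) and build the greeting."""
    args = build_parser().parse_args(argv)
    if args.formal:
        return f"Good day, {args.name}. How do you do?"
    return f"Hey {args.name}!"

# Passing lists directly means no terminal is needed to try the CLI
print(greet(["Alice"]))              # Hey Alice!
print(greet(["Alice", "--formal"]))  # Good day, Alice. How do you do?
```

In a script you would finish with print(greet()) under an if __name__ == "__main__": guard, and a test suite can call greet() with whatever argument lists it likes.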

Adding Positional Arguments

Positional arguments are required values that users pass in order. They’re like function parameters:

# rename_file.py
import argparse
import os

parser = argparse.ArgumentParser(description="Rename a file")
parser.add_argument("old_name", help="Current filename")
parser.add_argument("new_name", help="New filename")

args = parser.parse_args()

if not os.path.exists(args.old_name):
    print(f"Error: {args.old_name} not found")
    exit(1)

os.rename(args.old_name, args.new_name)
print(f"Renamed {args.old_name} to {args.new_name}")

Output:

$ echo "test" > original.txt
$ python rename_file.py original.txt renamed.txt
Renamed original.txt to renamed.txt

$ python rename_file.py nonexistent.txt backup.txt
Error: nonexistent.txt not found

$ python rename_file.py
usage: rename_file.py [-h] old_name new_name
rename_file.py: error: the following arguments are required: old_name, new_name

Positional arguments are mandatory by default. If the user doesn’t provide them, argparse exits with an error automatically. The order matters — the first argument becomes args.old_name, the second becomes args.new_name.

Adding Optional Arguments and Flags

Optional arguments are prefixed with -- (long form) or - (short form). They’re not required and can appear in any order:

# list_files.py
import argparse
import os

parser = argparse.ArgumentParser(description="List files with filtering")
parser.add_argument("directory", help="Directory to list")
parser.add_argument("--extension", "-e", help="Filter by file extension (e.g., .py)")
parser.add_argument("--verbose", "-v", action="store_true", help="Show file sizes")
parser.add_argument("--limit", type=int, default=None, help="Max number of files to show")

args = parser.parse_args()

if not os.path.isdir(args.directory):
    print(f"Error: {args.directory} is not a directory")
    exit(1)

files = os.listdir(args.directory)

if args.extension:
    files = [f for f in files if f.endswith(args.extension)]

if args.limit is not None:
    files = files[:args.limit]

for filename in files:
    if args.verbose:
        filepath = os.path.join(args.directory, filename)
        size = os.path.getsize(filepath)
        print(f"{filename} ({size} bytes)")
    else:
        print(filename)

Output:

$ python list_files.py . --extension .py
script1.py
script2.py

$ python list_files.py . -e .py -v
script1.py (248 bytes)
script2.py (512 bytes)

$ python list_files.py . -e .py --limit 1
script1.py

$ python list_files.py . --help
usage: list_files.py [-h] [--extension EXTENSION] [--verbose] [--limit LIMIT] directory

List files with filtering

positional arguments:
  directory             Directory to list

optional arguments:
  -h, --help            show this help message and exit
  --extension EXTENSION, -e EXTENSION
                        Filter by file extension (e.g., .py)
  --verbose, -v         Show file sizes
  --limit LIMIT         Max number of files to show

Key observations: --extension accepts a value (the extension string), --verbose is a boolean flag using action="store_true", and --limit has type=int for automatic conversion. The short forms -e and -v work alongside the long forms.

Sudo Sam holding a giant checklist clipboard — argparse generates help text, validates arguments, and converts types. You just define them.

Type Conversion and Default Values

One of argparse’s strengths is automatic type conversion. Specify a type parameter and argparse converts the string input for you:

# process_config.py
import argparse
import json

parser = argparse.ArgumentParser(description="Process configuration")
parser.add_argument("--workers", type=int, default=4, help="Number of worker threads")
parser.add_argument("--timeout", type=float, default=30.0, help="Timeout in seconds")
parser.add_argument("--enable-cache", action="store_true", help="Enable caching")
parser.add_argument("--tags", type=str, default="", help="Comma-separated tags")

args = parser.parse_args()

# All values are now the correct type
config = {
    "workers": args.workers,
    "timeout": args.timeout,
    "cache_enabled": args.enable_cache,
    "tags": [t.strip() for t in args.tags.split(",") if t.strip()]
}

print("Configuration:")
print(json.dumps(config, indent=2))

Output:

$ python process_config.py
Configuration:
{
  "workers": 4,
  "timeout": 30.0,
  "cache_enabled": false,
  "tags": []
}

$ python process_config.py --workers 8 --timeout 60.5 --enable-cache --tags "urgent,production"
Configuration:
{
  "workers": 8,
  "timeout": 60.5,
  "cache_enabled": true,
  "tags": [
    "urgent",
    "production"
  ]
}

$ python process_config.py --workers abc
usage: process_config.py [-h] [--workers WORKERS] [--timeout TIMEOUT] [--enable-cache] [--tags TAGS]
process_config.py: error: argument --workers: invalid int value: 'abc'

The type=int and type=float parameters tell argparse to convert strings to those types. If conversion fails, argparse exits with a clear error message. Default values are provided with the default parameter and are used when the argument isn’t provided on the command line.
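
The type parameter is not limited to built-in types: argparse accepts any callable that takes the raw string and returns the converted value. As a minimal sketch (the positive_int helper below is a hypothetical name, not part of process_config.py), a custom converter can reject out-of-range values with its own message by raising argparse.ArgumentTypeError:

```python
# positive_int_type.py
import argparse


def positive_int(text):
    """Convert text to an int and reject values below 1 (hypothetical helper)."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not an integer")
    if value < 1:
        raise argparse.ArgumentTypeError(f"{text!r} must be at least 1")
    return value


parser = argparse.ArgumentParser(description="Custom type sketch")
parser.add_argument("--workers", type=positive_int, default=4,
                    help="Number of worker threads (1 or more)")

# Passing a list to parse_args() avoids reading the real command line
args = parser.parse_args(["--workers", "8"])
print(args.workers)
```

With this converter, a value such as --workers 0 is reported through the same usage-and-error output that argparse produces for a failed int conversion.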

Restricting Values with Choices

The choices parameter restricts an argument to a fixed set of allowed values:

# deploy.py
import argparse

parser = argparse.ArgumentParser(description="Deploy application")
parser.add_argument("environment", choices=["dev", "staging", "prod"],
                    help="Deployment environment")
parser.add_argument("--log-level", choices=["debug", "info", "warning", "error"],
                    default="info", help="Logging level")

args = parser.parse_args()

print(f"Deploying to {args.environment} with log level {args.log_level}")

Output:

$ python deploy.py staging
Deploying to staging with log level info

$ python deploy.py --log-level debug staging
Deploying to staging with log level debug

$ python deploy.py testing
usage: deploy.py [-h] [--log-level {debug,info,warning,error}] {dev,staging,prod}
deploy.py: error: argument environment: invalid choice: 'testing' (choose from 'dev', 'staging', 'prod')

The choices parameter automatically validates input and displays allowed values in the help text. This prevents invalid configuration from reaching your code.

Making Optional Arguments Required

By default, arguments prefixed with -- are optional. You can make them required with required=True:

# download.py
import argparse

parser = argparse.ArgumentParser(description="Download a file")
parser.add_argument("--url", required=True, help="URL to download from")
parser.add_argument("--output", "-o", required=True, help="Output filename")
parser.add_argument("--timeout", type=int, default=30, help="Timeout in seconds")

args = parser.parse_args()

print(f"Downloading from {args.url} to {args.output} (timeout: {args.timeout}s)")

Output:

$ python download.py --url https://example.com/file.zip --output file.zip
Downloading from https://example.com/file.zip to file.zip (timeout: 30s)

$ python download.py --output file.zip
usage: download.py [-h] --url URL --output OUTPUT [--timeout TIMEOUT]
download.py: error: the following arguments are required: --url

This pattern is useful when you want semantic clarity — using --url=value is more explicit than a positional argument, but sometimes you still want to make it mandatory.

Loop Larry at a fork in the road deciding which path to take — Mutually exclusive groups: pick one path or the other, never both.

Mutually Exclusive Argument Groups

Sometimes arguments conflict with each other. You want users to provide either option A or option B, but not both. Use a mutually exclusive group:

# format_converter.py
import argparse

parser = argparse.ArgumentParser(description="Convert data format")
parser.add_argument("input_file", help="Input file to convert")

# Create a mutually exclusive group
output_group = parser.add_mutually_exclusive_group(required=True)
output_group.add_argument("--to-json", action="store_true", help="Convert to JSON")
output_group.add_argument("--to-csv", action="store_true", help="Convert to CSV")
output_group.add_argument("--to-xml", action="store_true", help="Convert to XML")

args = parser.parse_args()

format_name = "json" if args.to_json else "csv" if args.to_csv else "xml"
print(f"Converting {args.input_file} to {format_name}")

Output:

$ python format_converter.py data.txt --to-json
Converting data.txt to json

$ python format_converter.py data.txt --to-json --to-csv
usage: format_converter.py [-h] (--to-json | --to-csv | --to-xml) input_file
format_converter.py: error: argument --to-csv: not allowed with argument --to-json

$ python format_converter.py data.txt
usage: format_converter.py [-h] (--to-json | --to-csv | --to-xml) input_file
format_converter.py: error: one of the arguments --to-json --to-csv --to-xml is required

The add_mutually_exclusive_group(required=True) call creates a group where exactly one option must be chosen. With required=False (the default), at most one option may be given, and passing none is also accepted. argparse generates the conflict error messages automatically.
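
To see how a group behaves when it is not required, here is a small sketch. The --quiet and --verbose flags are illustrative names chosen for this example and are not part of format_converter.py:

```python
# optional_exclusive_group.py
import argparse

parser = argparse.ArgumentParser(description="Optional exclusive group sketch")
group = parser.add_mutually_exclusive_group()  # required defaults to False
group.add_argument("--quiet", action="store_true", help="Print less output")
group.add_argument("--verbose", action="store_true", help="Print more output")

# No flag given: both values stay False and no error is raised
neither = parser.parse_args([])
print(neither.quiet, neither.verbose)

# One flag given: accepted as usual
loud = parser.parse_args(["--verbose"])
print(loud.quiet, loud.verbose)
```

Passing both flags together still fails with the "not allowed with argument" error, because the group only relaxes the requirement that something be chosen.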

Building Subcommands (Like git commit, git push)

Complex tools like git use subcommands: git commit, git push, and git pull are all different commands with different arguments. argparse supports this with subparsers:

# git_like.py
import argparse

parser = argparse.ArgumentParser(description="Git-like tool")
subparsers = parser.add_subparsers(dest="command", help="Available commands")

# 'commit' subcommand
commit_parser = subparsers.add_parser("commit", help="Create a commit")
commit_parser.add_argument("message", help="Commit message")
commit_parser.add_argument("--author", help="Commit author")

# 'push' subcommand
push_parser = subparsers.add_parser("push", help="Push commits")
push_parser.add_argument("branch", help="Branch to push")
push_parser.add_argument("--remote", default="origin", help="Remote name")

# 'log' subcommand
log_parser = subparsers.add_parser("log", help="Show commit history")
log_parser.add_argument("--limit", type=int, default=10, help="Number of commits to show")

args = parser.parse_args()

if args.command == "commit":
    author = args.author if args.author else "Unknown"
    print(f"Committing: '{args.message}' by {author}")
elif args.command == "push":
    print(f"Pushing {args.branch} to {args.remote}")
elif args.command == "log":
    print(f"Showing last {args.limit} commits")
else:
    print("No command specified")

Output:

$ python git_like.py commit "Fix bug" --author Alice
Committing: 'Fix bug' by Alice

$ python git_like.py push main --remote upstream
Pushing main to upstream

$ python git_like.py log --limit 5
Showing last 5 commits

$ python git_like.py --help
usage: git_like.py [-h] {commit,push,log} ...

Git-like tool

positional arguments:
  {commit,push,log}  Available commands
    commit           Create a commit
    push             Push commits
    log              Show commit history

optional arguments:
  -h, --help         show this help message and exit

The add_subparsers() method returns an object whose add_parser() method creates one subparser per command. Each subparser has its own arguments and help text. The dest="command" argument stores the name of the chosen subcommand in args.command. This pattern scales to tools with dozens of commands.
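
As the number of subcommands grows, an if/elif chain gets long. One common alternative is to attach a handler function to each subparser with set_defaults() and call it after parsing. The sketch below assumes Python 3.7 or newer (for required=True on add_subparsers); do_commit and do_push are hypothetical handler names:

```python
# subcommand_dispatch.py
import argparse


def do_commit(args):
    return f"Committing: '{args.message}'"


def do_push(args):
    return f"Pushing {args.branch} to {args.remote}"


parser = argparse.ArgumentParser(description="Dispatch sketch")
# required=True makes argparse reject a call with no subcommand
subparsers = parser.add_subparsers(dest="command", required=True)

commit_parser = subparsers.add_parser("commit", help="Create a commit")
commit_parser.add_argument("message")
commit_parser.set_defaults(func=do_commit)  # handler for this subcommand

push_parser = subparsers.add_parser("push", help="Push commits")
push_parser.add_argument("branch")
push_parser.add_argument("--remote", default="origin")
push_parser.set_defaults(func=do_push)

# A list is passed here so the sketch does not read the real command line
args = parser.parse_args(["push", "main"])
print(args.func(args))
```

Each handler receives the same args object, so adding a new command means adding one function and one subparser, with no change to the dispatch line.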

Real-Life Example: A File Processing CLI Tool

Let’s build a realistic tool that accepts input and output files, processes them with various options, and validates everything:

# file_processor.py
import argparse
import sys
from pathlib import Path
import json

parser = argparse.ArgumentParser(
    description="Process text files with various transformations"
)

# Positional arguments
parser.add_argument("input_file", help="Input file to process")
parser.add_argument("output_file", help="Output file")

# Optional arguments
parser.add_argument("--transform", choices=["uppercase", "lowercase", "reverse"],
                    default="lowercase", help="Text transformation to apply")
parser.add_argument("--add-line-numbers", action="store_true",
                    help="Prepend line numbers")
parser.add_argument("--exclude-empty-lines", action="store_true",
                    help="Skip empty lines")
parser.add_argument("--max-lines", type=int, default=None,
                    help="Process only first N lines")
parser.add_argument("--encoding", default="utf-8",
                    help="File encoding")
parser.add_argument("--stats", action="store_true",
                    help="Print processing statistics")

args = parser.parse_args()

# Validate input file exists
input_path = Path(args.input_file)
if not input_path.exists():
    print(f"Error: Input file '{args.input_file}' not found", file=sys.stderr)
    sys.exit(1)

# Process the file
try:
    with open(input_path, "r", encoding=args.encoding) as f:
        lines = f.readlines()
except UnicodeDecodeError as e:
    print(f"Error: Could not decode file with {args.encoding} encoding", file=sys.stderr)
    sys.exit(1)

# Apply transformations
processed_lines = []
original_count = len(lines)
skipped_count = 0

for line_num, line in enumerate(lines, 1):
    # Check line limit
    if args.max_lines is not None and line_num > args.max_lines:
        break

    # Skip empty lines if requested
    if args.exclude_empty_lines and line.strip() == "":
        skipped_count += 1
        continue

    # Apply transformation
    content = line.rstrip("\n")
    if args.transform == "uppercase":
        content = content.upper()
    elif args.transform == "lowercase":
        content = content.lower()
    elif args.transform == "reverse":
        content = content[::-1]

    # Add line numbers if requested
    if args.add_line_numbers:
        content = f"{line_num}: {content}"

    processed_lines.append(content + "\n")

# Write output file
output_path = Path(args.output_file)
try:
    with open(output_path, "w", encoding=args.encoding) as f:
        f.writelines(processed_lines)
except IOError as e:
    print(f"Error: Could not write to '{args.output_file}': {e}", file=sys.stderr)
    sys.exit(1)

# Print statistics if requested
if args.stats:
    stats = {
        "input_file": args.input_file,
        "output_file": args.output_file,
        "original_lines": original_count,
        "processed_lines": len(processed_lines),
        "skipped_lines": skipped_count,
        "transformation": args.transform,
        "line_numbers_added": args.add_line_numbers,
        "encoding": args.encoding
    }
    print("\nProcessing Statistics:")
    print(json.dumps(stats, indent=2))
else:
    print(f"Processed {len(processed_lines)} lines, output written to {args.output_file}")

Output:

$ cat input.txt
Hello World
This is a test

Keep going

$ python file_processor.py input.txt output.txt --transform uppercase --add-line-numbers --stats
Processing Statistics:
{
  "input_file": "input.txt",
  "output_file": "output.txt",
  "original_lines": 4,
  "processed_lines": 4,
  "skipped_lines": 0,
  "transformation": "uppercase",
  "line_numbers_added": true,
  "encoding": "utf-8"
}

$ cat output.txt
1: HELLO WORLD
2: THIS IS A TEST
3:
4: KEEP GOING

$ python file_processor.py input.txt output.txt --transform lowercase --exclude-empty-lines --max-lines 2
Processed 2 lines, output written to output.txt

$ cat output.txt
hello world
this is a test

This example demonstrates several key patterns: input validation, defensive file I/O with error handling, type-safe argument conversion, and combining multiple options. The tool is flexible (users can apply transformations, filter lines, add statistics) while remaining simple to understand and extend.

Pyro Pete celebrating victory next to a glowing monitor — A CLI tool that accepts arguments is infinitely more useful than one with hard-coded paths.

Third-Party Alternatives: click and typer

For even more powerful CLI tools, the Python community has built two popular third-party libraries:

click is a decorator-based framework that makes building CLI tools elegant and expressive. It handles groups, commands, options, and context passing with minimal boilerplate. It’s widely used in professional tools like Flask and Black.

typer is the modern alternative, built on top of Click but with a focus on type hints and fewer decorators. If you’re comfortable with Python’s type annotation syntax, Typer feels more natural.

Here’s a quick comparison:

Feature	argparse	click	typer
Built-in	Yes	No (pip install)	No (pip install)
Syntax	Verbose, class-based	Decorator-based	Type hints
Subcommands	Good	Excellent	Excellent
Context/State	Manual	Built-in	Built-in
Auto-help text	Yes	Yes	Yes
Learning curve	Moderate	Low (for decorator style)	Low (for type hints)

For production scripts and tools that ship with your project, stick with argparse — no external dependencies. For internal tools, microservices, and CLIs meant for other developers, click and typer often reduce boilerplate and improve readability.

Frequently Asked Questions

How do I access sys.argv at any point in my code?

sys.argv is a global list that persists for the entire run of your script. You can import sys and access it anywhere. However, argparse is better because it parses arguments once, validates them, and gives you structured access. With argparse, you pass the args object to functions instead of having functions depend on sys.argv directly. This makes testing easier and your code more modular.

Can I make a positional argument optional?

Yes, use the nargs="?" parameter: parser.add_argument("name", nargs="?", default="World"). This makes the argument optional with a default value. If the user provides it, your code uses that value; if not, the default is used. The usage line marks the argument as optional only by wrapping it in square brackets, which is easy to miss, so an optional flag with -- is often clearer.
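
A short sketch of that behavior, using a hypothetical hello.py program name and passing lists to parse_args() so it runs without a real command line:

```python
# optional_positional.py
import argparse

parser = argparse.ArgumentParser(prog="hello.py")
parser.add_argument("name", nargs="?", default="World", help="Name to greet")

# Nothing supplied: the default is used
print(parser.parse_args([]).name)

# Value supplied: it replaces the default
print(parser.parse_args(["Alice"]).name)
```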

How do I handle a variable number of arguments?

Use nargs="*" (zero or more), nargs="+" (one or more), or nargs=3 (exactly three). For example: parser.add_argument("files", nargs="+", help="Files to process") requires at least one file and stores them as a list in args.files.
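
The sketch below combines two of these forms. The --point option and the process.py program name are illustrative assumptions, added only to show a fixed count next to a variable one:

```python
# variable_args.py
import argparse

parser = argparse.ArgumentParser(prog="process.py")
parser.add_argument("files", nargs="+", help="Files to process")
parser.add_argument("--point", nargs=3, type=float, metavar=("X", "Y", "Z"),
                    help="Exactly three coordinates")

args = parser.parse_args(["a.txt", "b.txt", "--point", "1", "2.5", "3"])
print(args.files)   # list of one or more strings
print(args.point)   # list of exactly three floats
```

Note that type is applied to each value separately, so args.point is a list of floats rather than a single converted value.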

How do I pass arguments with spaces or special characters?

Quote them on the command line: python script.py "hello world" --message "test message". The shell treats quoted strings as single arguments. Python receives them correctly in sys.argv or through argparse.

How do I test a script that uses argparse?

Mock sys.argv in your tests or call parse_args() with a list of strings instead of using the default (which reads sys.argv). Example: args = parser.parse_args(["input.txt", "--verbose"]). This lets you test different argument combinations without running the script from the command line.
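
One way to make this convenient is to build the parser inside a function, so tests can create a fresh parser and feed it any list. The following is a sketch with hypothetical names (build_parser, tool.py); it uses plain assert statements, but the same functions could be collected by a test runner such as pytest:

```python
# test_cli_sketch.py
import argparse


def build_parser():
    """Return the parser so tests can call parse_args() with any list."""
    parser = argparse.ArgumentParser(prog="tool.py")
    parser.add_argument("input_file")
    parser.add_argument("--verbose", action="store_true")
    return parser


def test_verbose_flag():
    args = build_parser().parse_args(["input.txt", "--verbose"])
    assert args.input_file == "input.txt"
    assert args.verbose is True


def test_missing_argument_exits():
    # argparse reports usage errors by raising SystemExit with code 2
    try:
        build_parser().parse_args([])
    except SystemExit as exc:
        assert exc.code == 2
    else:
        raise AssertionError("expected SystemExit")


test_verbose_flag()
test_missing_argument_exits()
print("all checks passed")
```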

Conclusion

Command line arguments transform your scripts from one-off tools into reusable, composable utilities. You’ve learned the fundamentals: sys.argv for raw access, the reasons Python doesn’t need argc, and why argparse is the standard library’s answer to building professional CLI tools. You’ve seen how to parse positional arguments and optional flags, convert types, restrict choices, create mutually exclusive groups, and build subcommands. The file-processing tool example shows how these patterns combine in real code.

Now take the real-life example and extend it. Add a --config flag that reads settings from a JSON file. Build a tool that accepts multiple input files and processes them in parallel. Create a command with subcommands like your own mini git. These exercises will solidify your understanding and show you the flexibility of command line argument handling.

For deeper details, consult the official argparse documentation and the sys.argv documentation.

How To Send Emails From Gmail Using Python 3

by Pubs | APIs, Automation, Intermediate

Intermediate

Introduction

Email is everywhere in modern software — from order confirmations to password resets to automated reports. If you’re building a Python application that needs to send messages to users, you don’t need to pay for a third-party email service right away. Gmail, which most developers already use, has a built-in SMTP (Simple Mail Transfer Protocol) server that you can connect to directly. This opens up a world of possibilities: send alerts when your scripts finish, notify team members of important events, or automate bulk communication — all without leaving Python.

The good news: you don’t need to understand SMTP inside and out to get started. Python’s smtplib library and the email module handle the complex parts, and Gmail provides clear documentation for developers. You’ll need to set up Gmail for programmatic access (it’s a one-time configuration), but after that, it takes just a few lines of Python to send your first email.

This article covers the complete journey: setting up Gmail for Python access, connecting via SMTP, sending plain text and HTML emails, attaching files, handling errors gracefully, and using secure authentication practices. We’ll start with a working example you can run in 30 seconds, then dive into each concept in detail. By the end, you’ll be able to send formatted emails with attachments, implement proper error handling, and understand the security best practices that separate a toy script from production-ready code.

How To Send Emails From Gmail: Quick Example

Before we dive into the details, here’s a working script that sends a simple email from Gmail. This is the absolute minimum to get a message sent:

# quick_gmail_send.py
import smtplib
from email.mime.text import MIMEText
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipient = "recipient@example.com"

message = MIMEText("This is the body of the email.")
message['Subject'] = "Hello from Python"
message['From'] = sender_email
message['To'] = recipient

with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    server.send_message(message)
    print("Email sent successfully!")

# Expected Output:
# Email sent successfully!

The script creates a MIMEText message (MIME stands for Multipurpose Internet Mail Extensions — it’s the standard email format), connects to Gmail’s SMTP server using SSL encryption on port 465, authenticates with your email and password, and sends. The with statement handles closing the connection automatically.

Three critical things are happening here: (1) we’re reading the email and password from environment variables, not hardcoding them into the script — this keeps your credentials safe; (2) we’re using port 465 with SMTP_SSL for secure, encrypted communication; and (3) we’re using the send_message method instead of the older sendmail, which is cleaner and handles headers automatically. The next sections explain each piece in depth.

What Is SMTP and Why Use Gmail?

SMTP is the protocol computers use to send email across the internet. When you hit “send” in your email client, it connects to an SMTP server, authenticates, and hands off your message. That server then relays the message over SMTP to the recipient’s mail server, and the recipient’s client later retrieves it with IMAP or POP3 (which is outside our scope).

Gmail’s SMTP server is smtp.gmail.com, reachable on port 465 (TLS from the moment the connection opens) or port 587 (a plain connection that is upgraded with STARTTLS). This article uses port 465 because the setup is simpler: the connection is encrypted from the start. You authenticate using your Gmail address and a special app password (more on that in the next section), and Gmail handles delivery for you.

The advantage: you get a reliable, professional email infrastructure without hosting your own mail server or paying for a service like SendGrid. The trade-off: Gmail has rate limits (you can send up to 500 emails per day for a typical account), and bulk email is better handled by a service built for that purpose. For automating scripts, notifications, and moderate-volume communication, Gmail is perfect.

Approach	Setup Complexity	Cost	Volume Limit	Use Case
Gmail SMTP	Low	Free	500/day	Notifications, automated alerts, low-volume
SendGrid / Mailgun	Medium	Pay-as-you-go	Higher limits	Production bulk email, webhooks, analytics
Gmail API + OAuth2	High	Free	500/day	Production apps, user consent, best practices
Self-hosted SMTP	Very High	Server costs	Unlimited (delivery dependent)	Enterprise, full control

For the purposes of this article, we’re focusing on SMTP — it’s direct, easy to understand, and enough for most use cases. If you’re building a production app that sends email on behalf of users, you’ll eventually want to move to the Gmail API with OAuth2 (we’ll touch on that at the end).

Loop Larry frustrated at desk with red padlock blocking computer — App passwords exist for a reason. Don’t use your main Gmail password in code.

Setting Up Gmail for Programmatic Access

Step 1: Enable Two-Factor Authentication

Gmail no longer allows you to use your regular password in third-party apps for security reasons. First, you need to enable Two-Factor Authentication (2FA) on your Gmail account — this is a one-time setup. Go to your Google Account security page, find “How you sign in to Google,” and enable 2-Step Verification. You’ll need a phone to receive a verification code. Once that’s done, you’re ready for the next step.

Step 2: Generate an App Password

After 2FA is enabled, Google will give you the option to create “App Passwords.” An App Password is a 16-character random password that grants access to your Gmail account without ever sharing your real password. Go back to the security page, find “App passwords” (it appears under “How you sign in to Google” once 2FA is on), select “Mail” and “Windows Computer” (or your device), and Google generates a unique password. Copy this password and save it somewhere safe — you’ll only see it once.

Why use an App Password instead of your real password? If your script is compromised (or worse, your source code is leaked on GitHub), an attacker holding the App Password can send mail from your account, but cannot use it to sign in to your Google Account and change your password. You can also revoke that one App Password without touching your main password. It’s a security boundary. Always use App Passwords for programmatic access.

Step 3: Store Credentials Securely

Now you have an app password. Never hardcode it in your script. If your script ends up on GitHub or in a log file, your credentials are exposed. Instead, store them in environment variables. Create a .env file in your project directory (and add .env to your .gitignore so it’s never committed):

# .env
GMAIL_EMAIL=your-email@gmail.com
GMAIL_PASSWORD=your-16-char-app-password

In your Python script, read these values using the os module or the python-dotenv library (which loads .env automatically). Here’s the secure pattern:

# secure_email_setup.py
import os
from dotenv import load_dotenv

load_dotenv()  # Loads GMAIL_EMAIL and GMAIL_PASSWORD from .env

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')

if not sender_email or not sender_password:
    raise ValueError("GMAIL_EMAIL and GMAIL_PASSWORD must be set in environment.")

print(f"Using email: {sender_email}")

# Expected Output:
# Using email: your-email@gmail.com

Install python-dotenv with pip install python-dotenv if it’s not already available. The load_dotenv() call reads your .env file and makes the variables available via os.getenv(). Checking that both values exist with the if not guard prevents confusing errors later if someone forgets to set up their .env file.

Connecting to Gmail’s SMTP Server

Python’s smtplib module is your gateway to sending email. Let’s break down the connection pattern:

# connect_to_gmail_smtp.py
import smtplib
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')

# Method 1: SMTP_SSL (port 465, encrypted from start)
with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    print("Connected and authenticated!")

# Expected Output:
# Connected and authenticated!

SMTP_SSL creates a secure connection to Gmail’s SMTP server on port 465. The connection is encrypted immediately, and the with statement ensures the connection closes automatically when done. The login() method authenticates using your email and app password. If the credentials are wrong, smtplib raises an SMTPAuthenticationError.

There’s also an alternative: plain SMTP on port 587, upgraded to an encrypted connection with starttls():

# connect_with_starttls.py
import smtplib
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')

# Method 2: SMTP with STARTTLS (port 587, upgrade to encryption)
with smtplib.SMTP("smtp.gmail.com", 587) as server:
    server.starttls()  # Upgrade to encrypted connection
    server.login(sender_email, sender_password)
    print("Connected and authenticated via STARTTLS!")

# Expected Output:
# Connected and authenticated via STARTTLS!

Both methods are secure. SMTP_SSL (port 465) is simpler and preferred; STARTTLS (port 587) starts with a plain connection then upgrades to encryption. For Gmail, use SMTP_SSL unless your network blocks port 465 (rare, but it happens). The rest of this article uses port 465.
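
Both SMTP_SSL and starttls() accept an optional context argument. Passing one built with ssl.create_default_context() makes certificate and hostname verification explicit instead of relying on smtplib’s defaults. The helper below is a sketch under that assumption; open_gmail_connection and its use_starttls parameter are hypothetical names, and the function is only defined here, not called, because calling it needs network access and real credentials:

```python
# connect_with_context.py
import smtplib
import ssl


def open_gmail_connection(email, app_password, use_starttls=False):
    """Return a logged-in connection; the caller must call quit() on it."""
    # The default context verifies the server certificate and hostname
    context = ssl.create_default_context()
    if use_starttls:
        server = smtplib.SMTP("smtp.gmail.com", 587)
    else:
        server = smtplib.SMTP_SSL("smtp.gmail.com", 465, context=context)
    try:
        if use_starttls:
            server.starttls(context=context)  # upgrade before logging in
        server.login(email, app_password)
    except Exception:
        server.close()  # do not leak the socket if setup fails
        raise
    return server
```

Because smtplib connections are context managers, the returned object can be used as with open_gmail_connection(sender_email, sender_password) as server: and it will be closed when the block ends.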

Pyro Pete excitedly operating an industrial control panel — Port 465? Port 587? Both work. Pick one and move on.

Sending Plain Text Emails

The simplest email is plain text. You create a MIMEText message, set the subject and recipients, and send. Here’s the complete flow:

# send_plain_text_email.py
import smtplib
from email.mime.text import MIMEText
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipient = "recipient@example.com"

# Create the email message
message = MIMEText("This is the body of a plain text email.")
message['Subject'] = "Hello from Python"
message['From'] = sender_email
message['To'] = recipient

# Send via Gmail SMTP
with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    server.send_message(message)
    print("Plain text email sent successfully!")

# Expected Output:
# Plain text email sent successfully!

The MIMEText() constructor takes the email body as a string. We then set the standard email headers: Subject, From, and To. These headers are visible to the recipient and email clients. The send_message() method (added in Python 3.2) is cleaner than the older sendmail() method because it extracts the sender and recipients from the message headers automatically.

You can send to multiple recipients by joining the addresses into a comma-separated To header; send_message() reads the recipients from that header:

# send_to_multiple_recipients.py
import smtplib
from email.mime.text import MIMEText
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipients = ["alice@example.com", "bob@example.com"]

message = MIMEText("Hello everyone!")
message['Subject'] = "Group notification"
message['From'] = sender_email
message['To'] = ", ".join(recipients)

with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    server.send_message(message)
    print(f"Email sent to {len(recipients)} recipients!")

# Expected Output:
# Email sent to 2 recipients!

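The ", ".join(recipients) line converts the list into a comma-separated string for the To header, making it readable in the recipient’s email client. Because no to_addrs argument is passed, send_message() takes the recipient addresses from the message headers and delivers to each one. To set the envelope recipients explicitly, pass the list yourself: server.send_message(message, to_addrs=recipients).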
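
When to_addrs is omitted, send_message() builds the recipient list from the To, Cc, and Bcc headers. The sketch below approximates that lookup with email.utils.getaddresses so you can inspect the result without contacting a server. The sender address and the Cc recipient are placeholders added for illustration:

```python
# recipients_from_headers.py
from email.mime.text import MIMEText
from email.utils import getaddresses

recipients = ["alice@example.com", "bob@example.com"]

message = MIMEText("Hello everyone!")
message["Subject"] = "Group notification"
message["From"] = "sender@example.com"  # placeholder address
message["To"] = ", ".join(recipients)
message["Cc"] = "carol@example.com"     # placeholder address

# Collect every address from the To and Cc headers, roughly as
# send_message() does when to_addrs is not given
header_values = message.get_all("To", []) + message.get_all("Cc", [])
envelope = [addr for _name, addr in getaddresses(header_values)]
print(envelope)
```

Printing the list before sending is a cheap way to confirm that a malformed header has not dropped or merged an address.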

Sending HTML-Formatted Emails

Plain text is fine for simple messages, but modern emails are formatted with HTML: colors, images, links, bold text, and multi-column layouts. The MIMEText constructor accepts a second argument, _subtype='html', which tells email clients to render the content as HTML instead of plain text.

# send_html_email.py
import smtplib
from email.mime.text import MIMEText
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipient = "recipient@example.com"

# HTML body
html_body = """
<html>
  <body>
    <h1 style="color: #0066cc;">Welcome!</h1>
    <p>This is an <strong>HTML email</strong> with <em>formatting</em>.</p>
    <a href="https://pythonhowtoprogram.com">Visit our site</a>
  </body>
</html>
"""

message = MIMEText(html_body, 'html')
message['Subject'] = "Formatted HTML Email"
message['From'] = sender_email
message['To'] = recipient

with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    server.send_message(message)
    print("HTML email sent successfully!")

# Expected Output:
# HTML email sent successfully!

The key difference: MIMEText(html_body, 'html') tells MIME that this is HTML content. Email clients that support HTML will render the formatted version; older clients fall back to plain text (the raw HTML appears, but at least the message is readable). Always make sure your HTML is valid and test in multiple email clients, as Gmail, Outlook, Apple Mail, and mobile clients each have slightly different HTML rendering engines.

For production emails, consider using a templating approach — write your HTML in a separate file and load it into the script:

# send_html_from_template.py
import smtplib
from email.mime.text import MIMEText
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipient = "recipient@example.com"

# Load HTML template
with open("email_template.html", "r") as f:
    html_body = f.read()

message = MIMEText(html_body, 'html')
message['Subject'] = "Email from template"
message['From'] = sender_email
message['To'] = recipient

with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    server.send_message(message)
    print("Templated email sent!")

# Expected Output:
# Templated email sent!

Keeping templates in separate files makes your code cleaner and easier to update without touching Python logic. For even more power, use a library like Jinja2 to insert variables into templates: pip install jinja2, then Template(html_body).render(user_name="Alice").
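If you would rather not add a dependency, the standard library's string.Template covers simple variable substitution. This is a rough sketch rather than a Jinja2 replacement: the inline template stands in for the contents of email_template.html, and the placeholder names (user_name, order_id) are made up for the example.

```python
import html
from string import Template
from email.mime.text import MIMEText

# Inline template standing in for the contents of email_template.html
html_template = Template(
    "<html><body><h1>Hello, $user_name!</h1>"
    "<p>Your order $order_id has shipped.</p></body></html>"
)

# Escape user-supplied values so they cannot inject markup into the email
html_body = html_template.substitute(
    user_name=html.escape("Alice"),
    order_id=html.escape("A-1001"),
)

message = MIMEText(html_body, "html")
message["Subject"] = "Your order has shipped"

print(html_body)
```

string.Template has no loops or conditionals, so a real templating library is still the better fit once the email layout depends on the data.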

API Alice painting on a canvas in an art studio — Style your emails, but remember: Outlook ignores 90% of your CSS.

Adding Attachments

Emails often carry files — invoices, PDFs, images, spreadsheets. To attach files, you need to use MIMEMultipart instead of just MIMEText. A multipart message can contain multiple components: text body, attachments, embedded images, etc.

# send_email_with_attachment.py
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email import encoders
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipient = "recipient@example.com"

# Create multipart message (can contain text + attachments)
message = MIMEMultipart()
message['Subject'] = "Email with PDF attachment"
message['From'] = sender_email
message['To'] = recipient

# Add text body
body = "Please find the report attached."
message.attach(MIMEText(body, 'plain'))

# Attach a file
filename = "report.pdf"
if os.path.exists(filename):
    with open(filename, 'rb') as attachment:
        part = MIMEBase('application', 'octet-stream')
        part.set_payload(attachment.read())
        encoders.encode_base64(part)
        part.add_header('Content-Disposition', 'attachment', filename=filename)
        message.attach(part)

# Send
with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    server.send_message(message)
    print(f"Email with {filename} sent successfully!")

# Expected Output:
# Email with report.pdf sent successfully!

This pattern uses MIMEBase for generic attachments and MIMEText for the body. The file is read in binary mode ('rb'), the bytes are base64-encoded (so they survive email transmission as text), and a Content-Disposition header tells email clients it’s an attachment with a filename. The os.path.exists() check ensures the file actually exists before trying to read it, a bit of defensive programming that prevents crashes on missing files.

For common file types, Python provides shortcuts:

# send_email_with_image_attachment.py
import smtplib
from email.mime.text import MIMEText
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipient = "recipient@example.com"

message = MIMEMultipart()
message['Subject'] = "Email with image"
message['From'] = sender_email
message['To'] = recipient

body = "Here's a photo:"
message.attach(MIMEText(body, 'plain'))

# Attach an image
image_file = "screenshot.png"
if os.path.exists(image_file):
    with open(image_file, 'rb') as img:
        part = MIMEImage(img.read())
        part.add_header('Content-Disposition', 'attachment', filename=image_file)
        message.attach(part)

with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender_email, sender_password)
    server.send_message(message)
    print("Email with image sent!")

# Expected Output:
# Email with image sent!

MIMEImage is simpler for images than MIMEBase — it handles the MIME type automatically. For PDFs, Word docs, and binary formats, use MIMEBase with 'application', 'octet-stream' (a generic binary type). For plain text files, you can use MIMEText directly without needing multipart.
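Rather than hardcoding 'application', 'octet-stream' for every file, you can let the standard library's mimetypes module guess a more specific type from the filename. Here is a small sketch of that idea; build_attachment is a hypothetical helper name, and the byte strings are stand-ins for real file contents.

```python
import mimetypes
from email import encoders
from email.mime.base import MIMEBase


def build_attachment(filename, data):
    """Wrap raw bytes in a MIME part, guessing the type from the filename."""
    guessed_type, _encoding = mimetypes.guess_type(filename)
    # Fall back to the generic binary type when the extension is unknown
    maintype, subtype = (guessed_type or "application/octet-stream").split("/", 1)

    part = MIMEBase(maintype, subtype)
    part.set_payload(data)
    encoders.encode_base64(part)
    part.add_header("Content-Disposition", "attachment", filename=filename)
    return part


pdf_part = build_attachment("report.pdf", b"%PDF-1.4 placeholder bytes")
unknown_part = build_attachment("data.unknownext", b"\x00\x01")

print(pdf_part.get_content_type())
print(unknown_part.get_content_type())

# Expected Output:
# application/pdf
# application/octet-stream
```

The returned part can be passed to message.attach() exactly like the MIMEBase part in the earlier example.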

Error Handling and Debugging

Email sending can fail for many reasons: wrong credentials, network issues, an invalid recipient address, rate limits, or a temporarily unavailable SMTP server. Good error handling makes debugging easier and prevents your scripts from failing silently.

# send_email_with_error_handling.py
import smtplib
from email.mime.text import MIMEText
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')
recipient = "recipient@example.com"

try:
    message = MIMEText("Test email body.")
    message['Subject'] = "Test"
    message['From'] = sender_email
    message['To'] = recipient

    with smtplib.SMTP_SSL("smtp.gmail.com", 465, timeout=10) as server:
        server.login(sender_email, sender_password)
        server.send_message(message)
        print("Email sent successfully!")

except smtplib.SMTPAuthenticationError:
    print("Error: Invalid email or password.")
except smtplib.SMTPException as e:
    print(f"SMTP error occurred: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

# Expected Output (on success):
# Email sent successfully!

The key exceptions to catch: SMTPAuthenticationError (wrong credentials), SMTPException (SMTP-level issues like invalid recipients or server errors), and generic Exception as a catch-all. The timeout=10 parameter tells Python to wait up to 10 seconds for a server response before giving up. Without a timeout, a hung connection can block your script forever.

Common exceptions and their causes:

Exception	Cause	Fix
`SMTPAuthenticationError`	Wrong email/password	Verify credentials in .env file. Regenerate app password.
`SMTPNotSupportedError`	SMTP command not supported	Check Gmail account type; some limits apply to newer accounts.
`socket.timeout`	Connection timeout	Check internet connection; increase timeout value.
`ConnectionRefusedError`	Can’t reach SMTP server	Verify SMTP server address; check firewall/network.
`SMTPSenderRefused`	Sender address rejected	Ensure sender email matches authenticated account.
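Some of these failures (timeouts, dropped connections) are often temporary, while others (bad credentials) will never succeed on retry. The sketch below shows one way to retry only the temporary ones with exponential backoff. The names send_with_retry and flaky_send are hypothetical, and the fake sender replaces a real SMTP call so the example runs without a network connection.

```python
import smtplib
import socket
import time


def send_with_retry(send_func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send_func(), retrying on errors that are usually temporary."""
    for attempt in range(1, attempts + 1):
        try:
            return send_func()
        except smtplib.SMTPAuthenticationError:
            # Retrying will not fix bad credentials
            raise
        except (smtplib.SMTPServerDisconnected, socket.timeout,
                TimeoutError, ConnectionError):
            if attempt == attempts:
                raise
            # Wait 1x, 2x, 4x ... the base delay before the next attempt
            sleep(base_delay * 2 ** (attempt - 1))


# Demonstration with a fake sender that fails twice, then succeeds
calls = {"count": 0}


def flaky_send():
    calls["count"] += 1
    if calls["count"] < 3:
        raise smtplib.SMTPServerDisconnected("connection dropped")
    return "sent"


delays = []  # record the delays instead of actually sleeping
result = send_with_retry(flaky_send, sleep=delays.append)
print(result, delays)

# Expected Output:
# sent [1.0, 2.0]
```

In a real script, send_func would be a small function that opens the SMTP_SSL connection, logs in, and calls send_message().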

For debugging, enable smtplib debug mode:

# debug_smtp_connection.py
import smtplib
from email.mime.text import MIMEText
import os

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')

try:
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.set_debuglevel(1)  # Print all SMTP commands and responses
        server.login(sender_email, sender_password)

        message = MIMEText("Test")
        message['Subject'] = "Test"
        message['From'] = sender_email
        message['To'] = "recipient@example.com"
        server.send_message(message)

except Exception as e:
    print(f"Error: {e}")

# Expected Output (with debug info):
# send: b'ehlo [your.ip.address]\r\n'
# reply: b'250-smtp.gmail.com at your service...'
# ... (many more debug lines)

The set_debuglevel(1) call prints every command sent to the server and every response received. This is invaluable for understanding what’s happening under the hood. Use it when your script fails unexpectedly.

Debug Dee examining a circuit board with magnifying glass — set_debuglevel(1) reveals everything the SMTP server is thinking. Useful at 3am.

Security Best Practices

Sending email is straightforward, but there are security pitfalls that can compromise your account or expose user data.

Never Hardcode Credentials

This is rule #1. If you commit credentials to GitHub, you’ve publicly leaked them, even if you delete them later (GitHub’s history is searchable). Always use environment variables or a secrets management system:

# bad_example.py (DO NOT DO THIS)
sender_email = "my-email@gmail.com"  # Exposed on GitHub!
sender_password = "xxxxxx"  # Exposed on GitHub!

# good_example.py
import os
sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')

For local development, use a .env file (remember to add it to .gitignore). For production (servers, CI/CD pipelines, cloud environments), use your platform’s native secrets: GitHub Secrets for Actions, AWS Secrets Manager for Lambda, Google Secret Manager for Cloud Functions, etc.

Use App Passwords, Not Your Real Password

Google App Passwords are specifically designed for third-party apps. An App Password can be revoked on its own without changing your main password, and it cannot be used to sign in through Google’s normal login page or change your account settings. It still grants access to your mail through protocols like SMTP and IMAP, so treat it as a secret. If you accidentally leaked your real Gmail password, an attacker could take over your entire account. Always use App Passwords for programmatic access.

Validate Recipient Addresses

User input for email addresses should be validated. A simple regex check catches obvious typos:

# validate_email_addresses.py
import re
import smtplib
from email.mime.text import MIMEText
import os

def is_valid_email(email):
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return re.match(pattern, email) is not None

recipients = ["alice@example.com", "bob@example", "charlie@domain.co.uk"]

valid_recipients = [e for e in recipients if is_valid_email(e)]
invalid_recipients = [e for e in recipients if not is_valid_email(e)]

print(f"Valid: {valid_recipients}")
print(f"Invalid: {invalid_recipients}")

# Expected Output:
# Valid: ['alice@example.com', 'charlie@domain.co.uk']
# Invalid: ['bob@example']

This regex is simple and covers most real email formats. It’s not bulletproof (the RFC 5322 standard for email addresses is insanely complex), but it catches common mistakes. For production systems, consider sending a confirmation email and only adding to your list after the user clicks a link in the confirmation.

Be Aware of Rate Limits

Gmail limits you to 500 emails per day for standard accounts (business/workspace accounts have higher limits). If you hit this limit, Gmail temporarily blocks further sends. For bulk email, you’ll need a specialized service like SendGrid or AWS SES. For monitoring, keep a log of sent emails:

# log_sent_emails.py
import smtplib
from email.mime.text import MIMEText
import os
import json
from datetime import datetime

sender_email = os.getenv('GMAIL_EMAIL')
sender_password = os.getenv('GMAIL_PASSWORD')

log_file = "email_log.json"
daily_count = 0

# Count today's emails
if os.path.exists(log_file):
    with open(log_file, 'r') as f:
        logs = json.load(f)
        today = datetime.now().strftime("%Y-%m-%d")
        daily_count = sum(1 for log in logs if log['date'] == today)

if daily_count >= 500:
    print("Error: Daily email limit reached.")
else:
    # Send email
    message = MIMEText("Test")
    message['Subject'] = "Test"
    message['From'] = sender_email
    message['To'] = "recipient@example.com"

    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(sender_email, sender_password)
        server.send_message(message)

    # Log the send
    log_entry = {
        'date': datetime.now().strftime("%Y-%m-%d"),
        'time': datetime.now().strftime("%H:%M:%S"),
        'to': "recipient@example.com"
    }

    logs = []
    if os.path.exists(log_file):
        with open(log_file, 'r') as f:
            logs = json.load(f)

    logs.append(log_entry)
    with open(log_file, 'w') as f:
        json.dump(logs, f, indent=2)

    print(f"Email sent. Daily count: {daily_count + 1}/500")

# Expected Output:
# Email sent. Daily count: 1/500

This script maintains a JSON log of sends and checks the count before sending. For production, a database is more robust, but a file works for simple scripts.

Real-Life Example: Automated Report Sender

Let’s combine all the concepts into a practical project: an automated script that generates a daily report and emails it to team members. This is a common pattern for data analysis, monitoring, and notifications.

Sudo Sam standing before a holographic dashboard with charts — Your daily report just landed in their inbox. No manual copy-paste required.

# daily_report_sender.py
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email import encoders
import os
from datetime import datetime
import json

class ReportSender:
    def __init__(self):
        self.sender_email = os.getenv('GMAIL_EMAIL')
        self.sender_password = os.getenv('GMAIL_PASSWORD')

        if not self.sender_email or not self.sender_password:
            raise ValueError("GMAIL_EMAIL and GMAIL_PASSWORD not set.")

    def generate_report(self):
        """Generate a sample daily report."""
        report_data = {
            'date': datetime.now().strftime("%Y-%m-%d"),
            'items_processed': 1250,
            'errors': 3,
            'success_rate': 99.76
        }
        return report_data

    def create_html_report(self, data):
        """Create HTML-formatted report."""
        html = f"""
        <html>
          <body style="font-family: Arial, sans-serif;">
            <h2>Daily Report - {data['date']}</h2>
            <table border="1" cellpadding="10">
              <tr>
                <td><strong>Items Processed</strong></td>
                <td>{data['items_processed']}</td>
              </tr>
              <tr>
                <td><strong>Errors</strong></td>
                <td>{data['errors']}</td>
              </tr>
              <tr>
                <td><strong>Success Rate</strong></td>
                <td>{data['success_rate']}%</td>
              </tr>
            </table>
            <p><em>Report generated by your Python automation script.</em></p>
          </body>
        </html>
        """
        return html

    def send_report(self, recipients, report_data):
        """Send the report to recipients."""
        try:
            message = MIMEMultipart('alternative')
            message['Subject'] = f"Daily Report - {report_data['date']}"
            message['From'] = self.sender_email
            message['To'] = ", ".join(recipients)

            # Create both plain text and HTML versions
            text_body = f"Daily Report: {report_data['items_processed']} items, {report_data['errors']} errors."
            html_body = self.create_html_report(report_data)

            message.attach(MIMEText(text_body, 'plain'))
            message.attach(MIMEText(html_body, 'html'))

            with smtplib.SMTP_SSL("smtp.gmail.com", 465, timeout=10) as server:
                server.login(self.sender_email, self.sender_password)
                server.send_message(message)

            return True, f"Report sent to {len(recipients)} recipients."

        except smtplib.SMTPAuthenticationError:
            return False, "Authentication failed. Check credentials."
        except smtplib.SMTPException as e:
            return False, f"SMTP error: {e}"
        except Exception as e:
            return False, f"Unexpected error: {e}"

# Main execution
if __name__ == "__main__":
    try:
        sender = ReportSender()
        report_data = sender.generate_report()
        recipients = ["alice@example.com", "bob@example.com"]

        success, message = sender.send_report(recipients, report_data)
        print(message)

    except ValueError as e:
        print(f"Setup error: {e}")

# Expected Output:
# Report sent to 2 recipients.

This example demonstrates several best practices: class-based organization separates concerns, the generate_report() method can be extended to pull real data, the create_html_report() method creates a professional-looking email, and error handling returns success/failure status. For production, you’d schedule this with cron (Unix/Linux), Task Scheduler (Windows), or a cloud scheduler (AWS EventBridge, Google Cloud Scheduler).

Alternative: Using the Gmail API with OAuth2

For production applications where your script sends email on behalf of users (not just from your own account), the Gmail API with OAuth2 is the right approach. It’s more complex than SMTP but offers better security, built-in analytics, and compliance with Google’s policies.

The difference: SMTP requires storing your password (or app password) in the script. The Gmail API uses OAuth2, where users grant permission through Google’s login flow, and you receive a token that expires. If the token is compromised, it only works for the specific permissions granted and only for a limited time.

Here’s the high-level flow: (1) Register your app in Google Cloud Console, (2) Configure OAuth2 credentials, (3) Direct users to Google’s login page where they grant permission, (4) Receive an access token, (5) Use the Gmail API (not SMTP) to send email on their behalf.

For detailed instructions, follow Google’s Gmail API sending guide. The google-auth-oauthlib and google-auth-httplib2 libraries handle the OAuth2 flow. SMTP is simpler for personal scripts and low-volume automation; the Gmail API is essential when you’re handling user accounts.
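One detail worth knowing before you read Google's guide: the Gmail API does not take a MIME object directly. The send endpoint expects the whole message as a base64url-encoded string in a "raw" field. The sketch below only builds that payload, using the same MIMEText class as the SMTP examples; the addresses are placeholders, and the OAuth2 flow and the actual API call are left to the official instructions.

```python
import base64
from email.mime.text import MIMEText

message = MIMEText("Hello from the Gmail API sketch.")
message["To"] = "recipient@example.com"
message["From"] = "sender@example.com"  # placeholder
message["Subject"] = "API test"

# Encode the complete message (headers + body) as base64url text
raw = base64.urlsafe_b64encode(message.as_bytes()).decode("ascii")
request_body = {"raw": raw}

# Round-trip check: decoding gives back the original message text
decoded = base64.urlsafe_b64decode(raw).decode("utf-8")
print("Subject: API test" in decoded)

# Expected Output:
# True
```

The request_body dictionary is what you would hand to the API client's send call once you have authorized credentials.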

Frequently Asked Questions

My app password isn’t working. What do I check first?

Most likely culprits: (1) Two-factor authentication isn’t enabled on your Gmail account yet — go to myaccount.google.com/security and enable it. (2) You copied the app password with extra spaces — the 16-character password is sensitive to trailing/leading whitespace. (3) You’re using your regular Gmail password instead of the app password — they’re different; always use the app password for scripts. (4) Your environment variables aren’t being loaded — verify print(os.getenv('GMAIL_PASSWORD')) returns the password, not None.

I hit Gmail’s 500-email limit. How do I recover?

The limit applies over a rolling 24-hour period rather than resetting at a fixed time of day, so sending is typically restored within 24 hours. If you regularly need to send more than 500 emails per day, you need a transactional email service: SendGrid (100/month free, then $20+/month), Mailgun, AWS SES, or similar. These services are designed for bulk email and have much higher limits (thousands per day).

My HTML email renders differently in Gmail vs Outlook. Why?

Email clients have inconsistent CSS and HTML support. Gmail strips `

Intermediate

Why Logging Matters in Python

You’re debugging a production issue, but your application is silent. You added a few print() statements weeks ago, the messages got buried in the terminal, and now you have no idea what’s happening. Or worse: your app is logging to console, but the logs disappear the moment the process restarts. You need a way to capture what your application is doing, when it’s doing it, at what severity level, and where it should be recorded.

This is where Python’s built-in logging module becomes essential. Unlike print() statements, which are crude and leave nothing behind once you delete them, the logging module is a professional-grade system designed for production applications. It ships with Python, requires no external dependencies, and provides granular control over message levels, formatting, and output destinations.

In this article, you’ll learn how to set up the logging module to output messages simultaneously to both your console (for immediate feedback during development) and to a file (for long-term record-keeping and debugging). We’ll cover logging levels, handlers, formatters, log rotation to prevent massive log files, and the patterns used in real multi-module projects. By the end, you’ll understand how to instrument your code with logging that developers trust.

How To Set Up Logging: Quick Example

Here’s a minimal example that outputs log messages to both console and file:

# quick_logging_example.py
import logging

# Create a logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

# File handler
file_handler = logging.FileHandler("app.log")
file_handler.setLevel(logging.DEBUG)

# Console handler
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)

# Formatter
formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
file_handler.setFormatter(formatter)
console_handler.setFormatter(formatter)

# Add handlers to logger
logger.addHandler(file_handler)
logger.addHandler(console_handler)

# Log some messages
logger.debug("Debug message (goes to file only)")
logger.info("Info message (goes to both)")
logger.warning("Warning message (goes to both)")
logger.error("Error message (goes to both)")
logger.critical("Critical message (goes to both)")

Output (to console):

2026-03-29 14:22:15,342 - __main__ - INFO - Info message (goes to both)
2026-03-29 14:22:15,343 - __main__ - WARNING - Warning message (goes to both)
2026-03-29 14:22:15,344 - __main__ - ERROR - Error message (goes to both)
2026-03-29 14:22:15,344 - __main__ - CRITICAL - Critical message (goes to both)

Output (written to app.log):

2026-03-29 14:22:15,341 - __main__ - DEBUG - Debug message (goes to file only)
2026-03-29 14:22:15,342 - __main__ - INFO - Info message (goes to both)
2026-03-29 14:22:15,343 - __main__ - WARNING - Warning message (goes to both)
2026-03-29 14:22:15,344 - __main__ - ERROR - Error message (goes to both)
2026-03-29 14:22:15,344 - __main__ - CRITICAL - Critical message (goes to both)

Notice the key pattern: we created a logger, attached two separate handlers (one for files, one for console), set different levels for each, and applied a formatter that includes timestamps and severity levels. This is the foundation for everything that follows. The sections below show you how to customize each piece.

Debug Dee examining floating log entries through magnifying glass — Good logs are how you debug code you wrote six months ago and forgot about.

What is Python Logging and Why Use It?

The logging module is Python’s standard library tool for recording events that happen during program execution. Unlike print statements, logging provides:

Severity levels: categorize messages by importance (DEBUG, INFO, WARNING, ERROR, CRITICAL)
Multiple outputs: send logs to files, console, email, syslog, or custom handlers simultaneously
Formatting control: include timestamps, function names, line numbers, and custom metadata
Filtering: selectively log messages based on logger name, level, or custom criteria
No side effects: unlike print, you can leave logging code in production without cluttering output

The alternative, using print() for debugging, breaks down immediately:

Aspect	print() Statements	logging Module
Disable in production	Must manually remove	Adjust level, keep code in place
Output destination	Always stdout	File, console, email, or custom
Timestamps	Manual string concatenation	Automatic, customizable format
Severity levels	None	DEBUG, INFO, WARNING, ERROR, CRITICAL
Performance	Always evaluates	Can be filtered; lazy evaluation
Multi-module coordination	No built-in support	Hierarchical logger names

The logging module is designed for exactly what you need: professional-grade event recording that stays in your code indefinitely.
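The "lazy evaluation" entry in the table deserves a quick illustration. When you pass values as %-style arguments, the logging module only formats the message if the level is enabled; an f-string is always built first. The sketch below makes the difference visible with a hypothetical ExpensiveReport class that counts how often it is converted to a string.

```python
import logging

logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)  # DEBUG messages are disabled
logger.addHandler(logging.NullHandler())


class ExpensiveReport:
    """Counts how many times it is converted to a string."""

    def __init__(self):
        self.render_count = 0

    def __str__(self):
        self.render_count += 1
        return "large report contents"


report = ExpensiveReport()

# %-style arguments: formatting is skipped because DEBUG is disabled
logger.debug("Report: %s", report)
lazy_count = report.render_count

# f-string: the string is built before logging ever sees it
logger.debug(f"Report: {report}")
eager_count = report.render_count

print(lazy_count, eager_count)

# Expected Output:
# 0 1
```

For cheap values the difference is negligible, but it adds up when the argument is costly to render and the message is usually filtered out.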

Understanding Logging Levels

Python’s logging module defines five standard severity levels, plus a catch-all NOTSET. Each level has a numeric value, and loggers will only record messages at or above their configured level:

Level	Numeric Value	When to Use	Example
DEBUG	10	Detailed diagnostic info for debugging	Variable values, function entry/exit, loop iterations
INFO	20	General informational messages	Application startup, config loaded, request received
WARNING	30	Something unexpected or potentially harmful	Deprecated API usage, missing optional config, retrying failed request
ERROR	40	A serious problem; some operation failed	File not found, API returned 500, database connection lost
CRITICAL	50	A very serious error; program may not continue	Out of memory, permissions denied, unrecoverable system error

When you set a logger’s level to INFO, it will log INFO, WARNING, ERROR, and CRITICAL messages, but not DEBUG messages. This is how you control verbosity.

# logging_levels_demo.py
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

# Add a console handler so we can see output
handler = logging.StreamHandler()
handler.setLevel(logging.WARNING)
formatter = logging.Formatter("%(levelname)s - %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)

# These will NOT appear (level is below WARNING)
logger.debug("This is a debug message")
logger.info("This is an info message")

# These WILL appear
logger.warning("This is a warning message")
logger.error("This is an error message")
logger.critical("This is a critical message")

Output:

WARNING - This is a warning message
ERROR - This is an error message
CRITICAL - This is a critical message

Notice: the logger itself has one level (DEBUG), but the console handler has a different level (WARNING). You can filter messages at multiple levels: first at the logger, then at each handler. This is crucial for sending different messages to different outputs (e.g., all DEBUG messages to a debug log file, only ERROR+ to a critical alert file).
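The two filtering stages can be observed directly by attaching two handlers with different levels to one logger. This sketch writes to in-memory streams instead of files so it leaves nothing on disk; the logger name and stream variables are illustrative.

```python
import io
import logging

logger = logging.getLogger("filter_demo")
logger.setLevel(logging.INFO)   # stage 1: the logger drops DEBUG
logger.propagate = False        # keep the demo isolated from the root logger

all_stream = io.StringIO()
alert_stream = io.StringIO()

all_handler = logging.StreamHandler(all_stream)
all_handler.setLevel(logging.DEBUG)    # accepts whatever the logger passes on

alert_handler = logging.StreamHandler(alert_stream)
alert_handler.setLevel(logging.ERROR)  # stage 2: only ERROR and above

for handler in (all_handler, alert_handler):
    handler.setFormatter(logging.Formatter("%(levelname)s"))
    logger.addHandler(handler)

logger.debug("dropped by the logger")
logger.info("reaches all_handler only")
logger.error("reaches both handlers")

print(all_stream.getvalue().split())
print(alert_stream.getvalue().split())

# Expected Output:
# ['INFO', 'ERROR']
# ['ERROR']
```

Note that the DEBUG message never reaches all_handler even though that handler would accept it: the logger's own level is checked first.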

Handlers and Formatters: Controlling Where and How Logs Go

A logger is just a container. The actual work happens in handlers and formatters:

Handler: an output destination. FileHandler writes to a file, StreamHandler writes to console, etc.
Formatter: defines how log messages are formatted, including which fields to include (timestamp, function name, etc.) and in what order

You create a handler, assign a formatter to it, set a level, and attach it to a logger. A single logger can have multiple handlers, each with different levels and formatters.

Creating a StreamHandler (Console Output):

# stream_handler_example.py
import logging

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Create a console handler
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)

# Format: timestamp, logger name, level, message
formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
console_handler.setFormatter(formatter)

logger.addHandler(console_handler)

logger.info("Application started")
logger.warning("This is a warning")
logger.error("An error occurred")

Output:

2026-03-29 14:25:30,123 - myapp - INFO - Application started
2026-03-29 14:25:30,124 - myapp - WARNING - This is a warning
2026-03-29 14:25:30,125 - myapp - ERROR - An error occurred

The %(asctime)s token automatically includes a timestamp. Other useful tokens include %(funcName)s (the function name), %(lineno)d (line number), and %(module)s (the module filename).

Creating a FileHandler (File Output):

# file_handler_example.py
import logging

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Create a file handler
file_handler = logging.FileHandler("app.log")
file_handler.setLevel(logging.DEBUG)

formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
file_handler.setFormatter(formatter)

logger.addHandler(file_handler)

logger.debug("Debug: application starting")
logger.info("Info: loading configuration")
logger.warning("Warning: deprecated API used")
logger.error("Error: failed to connect to database")

After running this, check your app.log file. All four messages will be there because the file handler’s level is DEBUG.

Output (written to app.log):

2026-03-29 14:27:01,456 - myapp - DEBUG - Debug: application starting
2026-03-29 14:27:01,457 - myapp - INFO - Info: loading configuration
2026-03-29 14:27:01,458 - myapp - WARNING - Warning: deprecated API used
2026-03-29 14:27:01,459 - myapp - ERROR - Error: failed to connect to database

Sudo Sam directing log traffic at an intersection — Handlers are traffic directors: DEBUG takes the file fork, ERROR takes the console.

Logging to Console and File Simultaneously

The most common pattern in production is to send all logs to a file (for permanent record) and only show WARNING+ messages on the console (for immediate visibility during operation). Here’s how:

# console_and_file_logging.py
import logging
import os

# Create a logger
logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Create log directory if it doesn't exist
log_dir = "logs"
if not os.path.exists(log_dir):
    os.makedirs(log_dir)

# File handler: captures all messages
file_handler = logging.FileHandler(os.path.join(log_dir, "app.log"))
file_handler.setLevel(logging.DEBUG)

# Console handler: shows only warnings and above
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.WARNING)

# Shared formatter for both handlers
formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"
)
file_handler.setFormatter(formatter)
console_handler.setFormatter(formatter)

# Attach handlers to logger
logger.addHandler(file_handler)
logger.addHandler(console_handler)

# Log messages at different levels
logger.debug("Starting application initialization")
logger.info("Configuration loaded successfully")
logger.info("Database connection established")
logger.warning("API response time is higher than usual")
logger.error("Failed to write to cache, continuing without cache")
logger.critical("Memory usage exceeded safe threshold")

Output (to console):

2026-03-29 14:30:12 - myapp - WARNING - API response time is higher than usual
2026-03-29 14:30:12 - myapp - ERROR - Failed to write to cache, continuing without cache
2026-03-29 14:30:12 - myapp - CRITICAL - Memory usage exceeded safe threshold

Output (written to logs/app.log):

2026-03-29 14:30:12 - myapp - DEBUG - Starting application initialization
2026-03-29 14:30:12 - myapp - INFO - Configuration loaded successfully
2026-03-29 14:30:12 - myapp - INFO - Database connection established
2026-03-29 14:30:12 - myapp - WARNING - API response time is higher than usual
2026-03-29 14:30:12 - myapp - ERROR - Failed to write to cache, continuing without cache
2026-03-29 14:30:12 - myapp - CRITICAL - Memory usage exceeded safe threshold

This pattern is powerful: you get a permanent record of everything (including debug messages developers need when troubleshooting), but the console stays clean during normal operation, only showing problems that need immediate attention. When a warning or error occurs, developers see it right away.
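One pitfall with this pattern: if the setup code runs more than once (for example, because a module is imported from several places), each run attaches another pair of handlers and every message is written multiple times. A common guard is to wrap the setup in a function that returns early when the logger already has handlers. The sketch below assumes a hypothetical setup_logging helper and writes to a temporary directory so it can be run safely.

```python
import logging
import os
import tempfile


def setup_logging(name, log_path):
    """Return a logger with one file handler and one console handler."""
    logger = logging.getLogger(name)
    if logger.handlers:
        # Already configured: avoid attaching duplicate handlers
        return logger

    logger.setLevel(logging.DEBUG)
    formatter = logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    )

    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(formatter)

    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.WARNING)
    console_handler.setFormatter(formatter)

    logger.addHandler(file_handler)
    logger.addHandler(console_handler)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "app.log")
first = setup_logging("setup_demo", log_path)
second = setup_logging("setup_demo", log_path)  # second call changes nothing

first.info("written once")

with open(log_path, encoding="utf-8") as f:
    line_count = len(f.readlines())

print(len(second.handlers), line_count)

# Expected Output:
# 2 1
```

Because logging.getLogger() returns the same object for the same name, first and second refer to one logger with exactly two handlers.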

Custom Log Formatting with Timestamps and Metadata

The formatter string controls what information appears in each log message. The most useful format tokens are:

Token	Meaning	Example
`%(asctime)s`	Timestamp (human-readable)	2026-03-29 14:30:12,456
`%(name)s`	Logger name	myapp.database
`%(levelname)s`	Severity level	INFO, WARNING, ERROR
`%(message)s`	The actual log message	Database query completed
`%(funcName)s`	Name of function that logged	connect_to_db
`%(filename)s`	Source filename	database.py
`%(lineno)d`	Line number in source	42
`%(module)s`	Module name	database
`%(process)d`	Process ID	12345
`%(thread)d`	Thread ID	140256789012345

Here are some practical format examples:

# formatting_examples.py
import logging

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Example 1: Detailed format with function and line number
handler1 = logging.StreamHandler()
formatter1 = logging.Formatter(
    "%(asctime)s [%(levelname)s] %(funcName)s:%(lineno)d - %(message)s"
)
handler1.setFormatter(formatter1)

# Example 2: Compact format (good for production)
handler2_formatter = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

# Example 3: Include module name (useful in multi-file projects)
handler3_formatter = (
    "[%(asctime)s] %(module)s - %(levelname)s - %(message)s"
)

# Example 4: ISO 8601-style timestamp (local time, no UTC offset)
handler4 = logging.StreamHandler()
formatter4 = logging.Formatter(
    "%(asctime)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S"
)
handler4.setFormatter(formatter4)

logger.addHandler(handler1)

def process_payment(user_id):
    logger.info(f"Processing payment for user {user_id}")
    logger.debug("Validating card information")
    logger.info("Payment submitted to processor")
    return True

process_payment(12345)

Output (Example 1 format):

2026-03-29 14:32:45,123 [INFO] process_payment:55 - Processing payment for user 12345
2026-03-29 14:32:45,124 [DEBUG] process_payment:56 - Validating card information
2026-03-29 14:32:45,125 [INFO] process_payment:57 - Payment submitted to processor

Controlling Log File Size with Log Rotation

If your application runs 24/7 and logs every request, your log files can grow huge in no time, wasting disk space. The solution is RotatingFileHandler, which automatically rotates log files and discards the oldest ones:

# log_rotation_example.py
import logging
import logging.handlers  # Important: the handler classes live in this submodule

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Use RotatingFileHandler instead of FileHandler.
# It starts a new file once app.log reaches about 1 MB
# and keeps up to 5 backup files.
rotating_handler = logging.handlers.RotatingFileHandler(
    filename="app.log",
    maxBytes=1048576,  # 1 MB
    backupCount=5  # Keep up to 5 backup files
)

# Use the same formatter as before
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
rotating_handler.setFormatter(formatter)

logger.addHandler(rotating_handler)

# Now you can log without worrying about file size
for i in range(10000):
    logger.info(f"Processing request {i}")

# app.log will be rotated to app.log.1, app.log.2, etc. as it grows past 1 MB.
# The oldest backups are deleted automatically, so total usage stays at roughly 6 MB (1 MB x 6 files).

Output (files next to the script once more than 6 MB has been logged; sizes approximate):

- app.log (up to 1 MB, currently being written)
- app.log.1 (about 1 MB)
- app.log.2 (about 1 MB)
- app.log.3 (about 1 MB)
- app.log.4 (about 1 MB)
- app.log.5 (about 1 MB)

The rotating handler automatically deletes the oldest file once the number of backups would exceed backupCount, which keeps disk usage bounded.

Sudo Sam pointing at a holographic hierarchical tree structure — One logging configuration. Dozens of modules. Hierarchical loggers handle the coordination.

Logging in Multi-File, Multi-Module Projects

For anything beyond a tiny script, use hierarchical logger names based on the module structure, such as myapp.database for the database module of the myapp package.
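To make the idea of hierarchical names concrete, here is a minimal sketch. The logger names (myapp, myapp.database, myapp.database.queries) are placeholders chosen for illustration; only the dotted naming scheme matters.

```python
import logging

# Hypothetical logger names, for illustration only
app_logger = logging.getLogger("myapp")
db_logger = logging.getLogger("myapp.database")
query_logger = logging.getLogger("myapp.database.queries")

# Dots in the name define the parent/child relationship
print(db_logger.parent.name)     # myapp
print(query_logger.parent.name)  # myapp.database

# A level set on a parent applies to children that do not set their own
app_logger.setLevel(logging.WARNING)
print(query_logger.getEffectiveLevel() == logging.WARNING)  # True
```

Because children inherit from their parents, configuring the top-level logger once is usually enough for a whole package.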


The detailed format is invaluable when debugging: you know exactly which function logged the message and on which line. For production systems receiving hundreds of requests per second, the compact format reduces file size while keeping essential information.
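If you need timestamps that carry an explicit UTC offset, one option is to override formatTime in a small Formatter subclass. The sketch below assumes you want UTC; the class name ISOFormatter is our own and not part of the logging module.

```python
import logging
from datetime import datetime, timezone


class ISOFormatter(logging.Formatter):
    """Hypothetical helper: render %(asctime)s as ISO 8601 in UTC."""

    def formatTime(self, record, datefmt=None):
        # record.created is a Unix timestamp (seconds since the epoch)
        moment = datetime.fromtimestamp(record.created, tz=timezone.utc)
        return moment.isoformat(timespec="seconds")


formatter = ISOFormatter("%(asctime)s - %(levelname)s - %(message)s")

# Build a record by hand with a fixed timestamp so the result is reproducible
record = logging.LogRecord(
    name="myapp", level=logging.INFO, pathname="example.py", lineno=1,
    msg="Payment submitted", args=None, exc_info=None,
)
record.created = 0.0
line = formatter.format(record)
print(line)  # 1970-01-01T00:00:00+00:00 - INFO - Payment submitted
```

In a real application you would attach this formatter to a handler with setFormatter, exactly as in the examples above.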

Cache Katie adjusting gears inside a giant clock mechanism — Every log message is a snapshot in time. Good formatting makes the snapshot useful.

Preventing Massive Log Files with RotatingFileHandler

If your application runs 24/7, a single FileHandler will eventually create a multi-gigabyte log file. The solution is RotatingFileHandler, which automatically archives old log files and starts a new one when the current file reaches a size limit.

# rotating_file_handler_example.py
import logging
from logging.handlers import RotatingFileHandler
import os

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Create log directory
log_dir = "logs"
if not os.path.exists(log_dir):
    os.makedirs(log_dir)

# RotatingFileHandler: max 1 MB per file, keep 5 backups
rotating_handler = RotatingFileHandler(
    filename=os.path.join(log_dir, "app.log"),
    maxBytes=1024 * 1024,  # 1 MB
    backupCount=5          # Keep app.log.1, app.log.2, ..., app.log.5
)
rotating_handler.setLevel(logging.DEBUG)

formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
rotating_handler.setFormatter(formatter)

logger.addHandler(rotating_handler)

# Simulate some logging activity
for i in range(100):
    logger.info(f"Processing item {i}: " + "X" * 100)  # Verbose message

When app.log reaches 1 MB, the handler automatically renames it to app.log.1, creates a fresh app.log, and continues logging. After 5 rotations, the oldest file is deleted. This keeps your disk usage bounded while preserving recent log history.
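You can watch this behavior without waiting for a megabyte of logs. The following sketch shrinks maxBytes to 200 bytes and writes into a temporary directory; the logger name rotation_demo and the tiny limits are assumptions made purely for the demonstration.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def demo_rotation():
    """Log enough to force several rollovers and return the files left on disk."""
    with tempfile.TemporaryDirectory() as log_dir:
        handler = RotatingFileHandler(
            filename=os.path.join(log_dir, "app.log"),
            maxBytes=200,    # tiny limit so rotation happens quickly
            backupCount=2,   # keep only app.log.1 and app.log.2
        )
        handler.setFormatter(logging.Formatter("%(message)s"))

        logger = logging.getLogger("rotation_demo")  # hypothetical logger name
        logger.setLevel(logging.INFO)
        logger.propagate = False
        logger.addHandler(handler)

        for i in range(50):
            logger.info("Processing item %d", i)

        # Close the handler before the temporary directory is removed
        logger.removeHandler(handler)
        handler.close()
        return sorted(os.listdir(log_dir))


files = demo_rotation()
print(files)  # ['app.log', 'app.log.1', 'app.log.2']
```

Even though the loop triggers more than two rollovers, only two backups survive, which is the bounded-disk-usage guarantee described above.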

For time-based rotation (e.g., “create a new log file each day”), use TimedRotatingFileHandler:

# timed_rotating_file_handler_example.py
import logging
from logging.handlers import TimedRotatingFileHandler
import os

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

log_dir = "logs"
if not os.path.exists(log_dir):
    os.makedirs(log_dir)

# Create a new log file every day at midnight
timed_handler = TimedRotatingFileHandler(
    filename=os.path.join(log_dir, "app.log"),
    when="midnight",       # Rotate at midnight
    interval=1,            # Every 1 day
    backupCount=7          # Keep 7 days of logs
)
timed_handler.setLevel(logging.DEBUG)

formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
timed_handler.setFormatter(formatter)

logger.addHandler(timed_handler)

logger.info("Daily log rotation is configured")

The `when` parameter accepts values like "midnight" (daily), "W0" (Monday), "H" (hourly), etc. This is the preferred approach for long-running services where you want to correlate logs with calendar time.

Debug Dee plucking data from a glowing spider web — Web scrapers talk a lot. Good logging turns chatter into intelligence.

Real-World Pattern: Logging in Multi-Module Projects

Most projects have multiple modules. The best practice is to:

Configure logging once in your main module (or in a centralized config module)
In each module, create a logger with logging.getLogger(__name__)
Log directly; there is no need to pass handlers around

Here’s how it works:

File: config.py (centralized logging setup)

# config.py
import logging
import logging.handlers
import os

def setup_logging():
    """Configure logging for the entire application."""
    # Root logger
    root_logger = logging.getLogger()
    root_logger.setLevel(logging.DEBUG)

    # Create logs directory
    log_dir = "logs"
    if not os.path.exists(log_dir):
        os.makedirs(log_dir)

    # File handler (all messages)
    file_handler = logging.handlers.RotatingFileHandler(
        filename=os.path.join(log_dir, "app.log"),
        maxBytes=5 * 1024 * 1024,  # 5 MB
        backupCount=10
    )
    file_handler.setLevel(logging.DEBUG)

    # Console handler (warnings only)
    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.WARNING)

    # Formatter
    formatter = logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    )
    file_handler.setFormatter(formatter)
    console_handler.setFormatter(formatter)

    # Attach handlers
    root_logger.addHandler(file_handler)
    root_logger.addHandler(console_handler)
					
How To Automate Repetitive Tasks with Python

by Pubs | Automation, Beginner

Beginner

Introduction

Every developer has experienced the monotony of repetitive tasks: renaming thousands of files, backing up project folders on schedule, generating weekly reports, or scanning for files that need processing. These are the moments when you wish a robot would just handle it while you focus on actual coding. The good news? Python makes this incredibly straightforward, and most of what you need is already in the standard library.

Python is well suited to automation. Standard-library modules like os, shutil, pathlib, and smtplib give you powerful tools to interact with the file system and send notifications, and third-party packages such as schedule and watchdog add task scheduling and file watching. You don’t need to learn complex shell scripts or invest in expensive automation software. A few lines of Python can save you hours of manual work.

In this guide, we’ll explore practical automation patterns starting with file operations and building toward a real-world automated backup system. By the end, you’ll have a toolkit for automating any repetitive task in your workflow.

Quick Example: Rename Files in Bulk

Before diving deep, let’s see automation in action. Imagine you have 500 image files named like IMG_0001.jpg, IMG_0002.jpg, and you want to prefix them with today’s date. Without automation, this takes hours. With Python, it takes seconds:

# bulk_rename.py
import os
import datetime

directory = "./photos"
prefix = datetime.date.today().strftime("%Y%m%d_")

for filename in os.listdir(directory):
    if filename.endswith(".jpg"):
        old_path = os.path.join(directory, filename)
        new_filename = prefix + filename
        new_path = os.path.join(directory, new_filename)
        os.rename(old_path, new_path)
        print(f"Renamed: {filename} -> {new_filename}")

Output:

Renamed: IMG_0001.jpg -> 20260329_IMG_0001.jpg
Renamed: IMG_0002.jpg -> 20260329_IMG_0002.jpg
Renamed: IMG_0003.jpg -> 20260329_IMG_0003.jpg

That script runs instantly and accomplishes what would take manual clicking for hours. This is the power of automation.

Python automation: where boredom goes to die.

Why Automate with Python?

You might be wondering: why Python instead of shell scripts, scheduled tasks, or other tools? The answer is clarity, portability, and power. Here’s how they compare:




Task Aspect	Manual Process	Shell Script	Python Script
Development Time	Hours per occurrence	30-60 minutes	15-30 minutes
Readability	N/A	Cryptic syntax	Human-readable code
Cross-Platform	N/A	Linux/Mac only	Windows, Mac, Linux
Debugging	N/A	Difficult	Easy with proper logging
Email Integration	Manual setup	Complex	Built-in libraries
Maintainability	N/A	Hard to modify	Easy to extend and modify




Python wins for most automation tasks because it balances simplicity with power. You can read Python code six months later and understand what it does, and you can add new features without rewriting everything.

Working with Files and Directories

Using os and pathlib Modules

Python provides two ways to work with file paths and directories: the older os module and the modern pathlib module. pathlib is more intuitive and handles cross-platform differences automatically, but os is still widely used. Let’s explore both:

# file_operations.py
import os
from pathlib import Path

# Using os module
print("Using os module:")
current_dir = os.getcwd()
print(f"Current directory: {current_dir}")

# List files
for item in os.listdir("."):
    if os.path.isfile(item):
        print(f"File: {item}")

# Using pathlib (modern approach)
print("\nUsing pathlib:")
current_path = Path(".")

for item in current_path.iterdir():
    if item.is_file():
        print(f"File: {item.name}")
        print(f"Size: {item.stat().st_size} bytes")
        print(f"Extension: {item.suffix}")

Output:

Using os module:
Current directory: /home/user/projects
File: script.py
File: data.csv

Using pathlib:
File: script.py
Size: 1245 bytes
Extension: .py
File: data.csv
Size: 5678 bytes
Extension: .csv

pathlib.Path is generally preferred because it’s more readable and handles path separators automatically (backslash on Windows, forward slash on Unix). However, both work fine depending on your preference and existing codebase.

Renaming and Organizing Files

One of the most common automation tasks is organizing files by type, date, or naming convention. The shutil module and os.rename() make this simple:

# organize_files.py
import os
import shutil
from pathlib import Path

download_dir = "./downloads"

# Create subdirectories if they don't exist
for category in ["Images", "Documents", "Archives", "Other"]:
    Path(download_dir, category).mkdir(exist_ok=True)

# Organize files by extension
for filename in os.listdir(download_dir):
    if filename.startswith("."):
        continue

    filepath = os.path.join(download_dir, filename)

    if not os.path.isfile(filepath):
        continue

    # Determine category based on extension
    ext = os.path.splitext(filename)[1].lower()

    if ext in [".jpg", ".png", ".gif", ".webp"]:
        category = "Images"
    elif ext in [".pdf", ".doc", ".docx", ".txt"]:
        category = "Documents"
    elif ext in [".zip", ".rar", ".7z"]:
        category = "Archives"
    else:
        category = "Other"

    # Move file to appropriate directory
    dest_path = os.path.join(download_dir, category, filename)
    shutil.move(filepath, dest_path)
    print(f"Moved {filename} to {category}/")

Output:

Moved vacation.jpg to Images/
Moved resume.pdf to Documents/
Moved backup.zip to Archives/
Moved config.txt to Documents/

This script is the foundation of smart file organization. In a real system, you’d add error handling, logging, and checks to avoid overwriting files. The Path.mkdir(exist_ok=True) call creates each directory if needed and does not raise an error when it already exists.
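One way to add the overwrite check mentioned above is to pick a free file name before moving. This is a minimal sketch, not the only approach; the helper names unique_destination and safe_move and the _1, _2 suffix scheme are our own assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def unique_destination(dest: Path) -> Path:
    """Return dest, or dest with _1, _2, ... appended if the name is taken."""
    if not dest.exists():
        return dest
    counter = 1
    while True:
        candidate = dest.with_name(f"{dest.stem}_{counter}{dest.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1


def safe_move(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir without overwriting an existing file."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = unique_destination(dest_dir / src.name)
    shutil.move(str(src), str(target))
    return target


# Demo in a temporary directory: Documents/resume.pdf already exists
with tempfile.TemporaryDirectory() as tmp:
    downloads = Path(tmp)
    (downloads / "Documents").mkdir()
    (downloads / "Documents" / "resume.pdf").write_text("old copy")
    (downloads / "resume.pdf").write_text("new copy")

    moved = safe_move(downloads / "resume.pdf", downloads / "Documents")
    moved_name = moved.name
    kept_old = (downloads / "Documents" / "resume.pdf").read_text()
    print(moved_name)  # resume_1.pdf
```

In organize_files.py you could call safe_move in place of the direct shutil.move call; the existing file stays untouched and the newcomer gets a numbered name.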

When your Downloads folder finally achieves organization.

Watching for File Changes with watchdog

Sometimes you need to react the moment a file appears or changes. The watchdog library monitors file system events in real-time. First, install it:

pip install watchdog

Now create a file watcher that triggers actions when new files appear:

# watch_folder.py
import time
from pathlib import Path
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class FileProcessor(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory:
            filename = Path(event.src_path).name
            print(f"New file detected: {filename}")
            print(f"Full path: {event.src_path}")

    def on_modified(self, event):
        if not event.is_directory:
            filename = Path(event.src_path).name
            print(f"File modified: {filename}")

# Watch the current directory
observer = Observer()
observer.schedule(FileProcessor(), path=".", recursive=False)
observer.start()

print("Watching for file changes. Press Ctrl+C to stop.")
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
    observer.join()

Output (after creating/modifying files):

Watching for file changes. Press Ctrl+C to stop.
New file detected: report.pdf
Full path: ./report.pdf
File modified: report.pdf

The watchdog library is perfect for implementing “drop a file to process it” workflows, such as converting documents, generating thumbnails, or triggering CI/CD pipelines.

Scheduling Tasks with the schedule Library

Many automation tasks need to run at specific times or intervals: daily backups, hourly data syncs, or weekly reports. The schedule library makes this elegant:

pip install schedule

Here’s how to create a task scheduler:

# task_scheduler.py
import schedule
import time
from datetime import datetime

def backup_database():
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(f"[{timestamp}] Running database backup...")
    # Actual backup logic here

def clean_temp_files():
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(f"[{timestamp}] Cleaning temporary files...")
    # Actual cleanup logic here

def generate_report():
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(f"[{timestamp}] Generating daily report...")
    # Actual report generation here

# Schedule tasks
schedule.every().day.at("02:00").do(backup_database)
schedule.every().hour.do(clean_temp_files)
schedule.every().monday.at("09:00").do(generate_report)

# Keep scheduler running
print("Scheduler started. Tasks will run according to schedule.")
while True:
    schedule.run_pending()
    time.sleep(60)  # Check every minute

Output (sample execution):

Scheduler started. Tasks will run according to schedule.
[2026-03-29 02:00:12] Running database backup...
[2026-03-29 03:00:05] Cleaning temporary files...
[2026-03-30 09:00:00] Generating daily report...

The schedule library is straightforward but doesn’t persist across system restarts. For production systems, consider using cron (Linux/Mac) or Task Scheduler (Windows) to run your Python script, or use a more robust library like APScheduler.

Sending Email Notifications with smtplib

Automating tasks is great, but you need to know when something fails or completes. Python’s built-in smtplib library sends email notifications:

# send_email.py
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

def send_notification(recipient, subject, body):
    sender_email = "automation@example.com"
    sender_password = "your_app_password_here"

    # Create message
    message = MIMEMultipart()
    message["From"] = sender_email
    message["To"] = recipient
    message["Subject"] = subject
    message.attach(MIMEText(body, "plain"))

    # Send email
    try:
        with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
            server.login(sender_email, sender_password)
            server.send_message(message)
            print(f"Email sent to {recipient}")
    except Exception as e:
        print(f"Error sending email: {e}")

# Usage
send_notification(
    "admin@example.com",
    "Backup Complete",
    "Daily backup completed successfully at 2026-03-29 02:15:30."
)

Output:

Email sent to admin@example.com

Important: Never hardcode passwords in scripts. Use environment variables or a configuration file outside version control. For Gmail, generate an “App Password” in your account settings rather than using your actual password.
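As a rough sketch of that advice, the snippet below reads the credentials from environment variables and builds the message with email.message.EmailMessage. The variable name EMAIL_SENDER and both helper functions are assumptions for illustration; the demo passes a plain dict instead of the real environment and does not send anything.

```python
import os
from email.message import EmailMessage


def load_smtp_credentials(env=os.environ):
    """Read sender address and password from (hypothetical) environment variables."""
    sender = env.get("EMAIL_SENDER")
    password = env.get("EMAIL_PASSWORD")
    if not sender or not password:
        raise RuntimeError("Set EMAIL_SENDER and EMAIL_PASSWORD before running")
    return sender, password


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email without sending it."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


# Demo with a plain dict standing in for the real environment
fake_env = {
    "EMAIL_SENDER": "automation@example.com",
    "EMAIL_PASSWORD": "not-a-real-secret",
}
sender, password = load_smtp_credentials(fake_env)
message = build_message(
    sender, "admin@example.com", "Backup Complete", "Daily backup finished."
)
print(message["Subject"])  # Backup Complete
```

The resulting message can be handed to server.send_message() exactly as in send_email.py, with the password coming from load_smtp_credentials() rather than from the source code.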

Working with CSV and Excel Files for Reports

Automated reporting is a huge time-saver. Python handles CSV files natively and can create Excel files with the openpyxl library:

# generate_report.py
import csv
from datetime import datetime
from pathlib import Path

# Sample data (from database or API in real scenario)
sales_data = [
    {"date": "2026-03-29", "product": "Widget A", "sales": 150},
    {"date": "2026-03-29", "product": "Widget B", "sales": 200},
    {"date": "2026-03-29", "product": "Widget C", "sales": 175},
]

# Generate CSV report
report_date = datetime.now().strftime("%Y%m%d")
report_filename = f"sales_report_{report_date}.csv"

with open(report_filename, "w", newline="") as csvfile:
    fieldnames = ["date", "product", "sales"]
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerows(sales_data)

print(f"Report generated: {report_filename}")

Output:

Report generated: sales_report_20260329.csv

File contents:

date,product,sales
2026-03-29,Widget A,150
2026-03-29,Widget B,200
2026-03-29,Widget C,175

For more complex reports with formatting, install openpyxl: pip install openpyxl. This lets you create Excel files with colors, formulas, and multiple sheets.

Running System Commands with subprocess

Sometimes you need to call external programs from Python. The subprocess module handles this safely:

# run_commands.py
import subprocess
import sys

# Run a simple command (sys.executable is the interpreter running this script)
result = subprocess.run([sys.executable, "--version"], capture_output=True, text=True)
print(f"Python version: {result.stdout.strip()}")

# Run a command and capture output (ls is available on Linux/Mac only)
result = subprocess.run(["ls", "-la"], capture_output=True, text=True)
print("Directory listing:")
print(result.stdout)

# Check if command succeeded: git status exits with 0 inside a repository,
# whether or not there are uncommitted changes
result = subprocess.run(["git", "status"], capture_output=True)
if result.returncode == 0:
    print("Inside a git repository")
else:
    print("Not a git repository or git error")

Output (Linux/Mac):

Python version: Python 3.10.6
Directory listing:
total 48
drwxr-xr-x 5 user user 4096 Mar 29 10:15 .
drwxr-xr-x 8 user user 4096 Mar 29 09:00 ..
-rw-r--r-- 1 user user 1245 Mar 29 10:12 script.py
Inside a git repository

Use capture_output=True to collect program output and text=True to get strings instead of bytes. Always check the return code to verify success.
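If you run many commands, it can help to centralize the return-code check. The sketch below uses check=True so that a non-zero exit raises CalledProcessError; the helper name run_command and the 30-second default timeout are assumptions.

```python
import subprocess
import sys


def run_command(args, timeout=30):
    """Run a command; return stripped stdout, or None if it fails or is missing."""
    try:
        result = subprocess.run(
            args, capture_output=True, text=True, check=True, timeout=timeout
        )
    except FileNotFoundError:
        print(f"Command not found: {args[0]}")
        return None
    except subprocess.CalledProcessError as e:
        print(f"Command failed with exit code {e.returncode}: {e.stderr.strip()}")
        return None
    except subprocess.TimeoutExpired:
        print(f"Command timed out after {timeout} seconds")
        return None
    return result.stdout.strip()


# sys.executable avoids guessing whether the interpreter is "python" or "python3"
version = run_command([sys.executable, "--version"])
failed = run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
print(version)
print(failed)  # None
```

Returning None on failure keeps the calling code short; if you prefer to stop the whole script on an error, let the exceptions propagate instead.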

Python calling system commands: the glue that holds automation together.

Real-Life Example: Automated Backup System

Now let’s build a complete backup system that creates timestamped ZIP archives on a schedule, emails you the result, and prunes old archives. This example combines several of the techniques we’ve covered:

# backup_system.py
import os
import shutil
import zipfile
import smtplib
import schedule
import time
from pathlib import Path
from datetime import datetime
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

class BackupManager:
    def __init__(self, source_dir, backup_dir, email_to):
        self.source_dir = source_dir
        self.backup_dir = backup_dir
        self.email_to = email_to
        Path(backup_dir).mkdir(exist_ok=True)

    def create_backup(self):
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        backup_filename = f"backup_{timestamp}.zip"
        backup_path = os.path.join(self.backup_dir, backup_filename)

        try:
            with zipfile.ZipFile(backup_path, "w", zipfile.ZIP_DEFLATED) as zipf:
                for root, dirs, files in os.walk(self.source_dir):
                    for file in files:
                        file_path = os.path.join(root, file)
                        arcname = os.path.relpath(file_path, self.source_dir)
                        zipf.write(file_path, arcname)

            file_size = os.path.getsize(backup_path) / (1024 * 1024)
            print(f"Backup created: {backup_filename} ({file_size:.2f} MB)")

            self.send_notification(
                "Backup Success",
                f"Backup created successfully: {backup_filename}\nSize: {file_size:.2f} MB"
            )

            # Cleanup old backups (keep last 7)
            self.cleanup_old_backups()

        except Exception as e:
            print(f"Backup failed: {e}")
            self.send_notification("Backup Failed", f"Error: {str(e)}")

    def cleanup_old_backups(self):
        backups = sorted(Path(self.backup_dir).glob("backup_*.zip"))
        if len(backups) > 7:
            for old_backup in backups[:-7]:
                old_backup.unlink()
                print(f"Deleted old backup: {old_backup.name}")

    def send_notification(self, subject, body):
        sender_email = "backup@example.com"
        # Read the app password from the environment instead of hardcoding it
        sender_password = os.getenv("EMAIL_PASSWORD")

        try:
            message = MIMEMultipart()
            message["From"] = sender_email
            message["To"] = self.email_to
            message["Subject"] = subject
            message.attach(MIMEText(body, "plain"))

            with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
                server.login(sender_email, sender_password)
                server.send_message(message)
        except Exception as e:
            print(f"Could not send email: {e}")

# Setup and run
if __name__ == "__main__":
    manager = BackupManager(
        source_dir="./important_files",
        backup_dir="./backups",
        email_to="admin@example.com"
    )

    # Schedule daily backups at 2 AM
    schedule.every().day.at("02:00").do(manager.create_backup)

    print("Backup system started. Waiting for scheduled time...")
    while True:
        schedule.run_pending()
        time.sleep(60)

Output (sample):

Backup system started. Waiting for scheduled time...
Backup created: backup_20260329_020015.zip (45.32 MB)
Deleted old backup: backup_20260322_020012.zip

This system handles the full lifecycle: creating backups, managing disk space, and notifying you of success or failure. In production, you’d run this as a background service using systemd (Linux), launchd (Mac), or Task Scheduler (Windows).

Frequently Asked Questions

How do I run a Python script in the background?

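Linux/Mac: Use nohup so the script keeps running after you log out: nohup python backup_system.py &. Alternatively, run it inside screen or tmux so you can detach from the session and reattach later. For recurring jobs, scheduling the script with cron is usually the better choice.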

Windows: Use Task Scheduler to run the script with python.exe, or with pythonw.exe if you do not want a console window to appear. Create a task that runs at startup or on a schedule.

Should I add error handling to automation scripts?

Absolutely. Always wrap file operations in try-except blocks. Log errors to a file so you can debug later. For critical tasks, send notifications on failure. Here’s a pattern:

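import logging

# Write errors to a log file so failures can be debugged later
logging.basicConfig(filename="automation.log", level=logging.ERROR)
logger = logging.getLogger(__name__)

try:
    # Your automation code (do_something is a placeholder)
    do_something()
except Exception as e:
    logger.error(f"Task failed: {e}")
    send_alert_email(f"Error: {e}")  # placeholder for your notification helper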

Is it safe to put passwords in automation scripts?

No. Use environment variables, config files outside version control, or credential managers. For email, use app-specific passwords instead of your real password. Never commit secrets to GitHub.

import os
password = os.getenv("EMAIL_PASSWORD")  # Load from environment
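One weakness of a bare os.getenv call is that it silently returns None when the variable is missing, so the failure only shows up later at login time. A minimal sketch of a fail-fast helper follows; the name require_env is hypothetical, and the value assigned to EMAIL_PASSWORD in the demo exists only so the snippet runs on its own.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo only: in a real deployment the variable is set outside the script
# (shell profile, service configuration, or a secrets manager).
os.environ["EMAIL_PASSWORD"] = "example-app-password"

password = require_env("EMAIL_PASSWORD")
print("Password loaded:", bool(password))
```

Calling the helper once at startup means a missing secret stops the script immediately instead of at 2 AM when the first backup runs.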

How do I write automation that works on Windows, Mac, and Linux?

Use pathlib.Path instead of string path concatenation; it handles separators automatically. Use subprocess carefully since some commands differ. Test on all platforms or use Docker for consistency.
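To illustrate the separator handling, here is a small sketch. The directory and file names are made up for the example, and PurePosixPath and PureWindowsPath are used only to show how the same path parts render on each platform; in real scripts you would normally use Path alone.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath


def backup_path(backup_dir, filename):
    """Join a directory and a file name using the current OS's separator."""
    return Path(backup_dir) / filename


# The same parts rendered with each platform's separator
parts = ("backups", "2026", "backup.zip")
posix_style = str(PurePosixPath(*parts))
windows_style = str(PureWindowsPath(*parts))

print(posix_style)    # backups/2026/backup.zip
print(windows_style)  # backups\2026\backup.zip

# Path picks the right flavor for whatever system the script runs on
print(backup_path("backups", "backup.zip").name)  # backup.zip
```

Because the / operator builds the path from parts, the script never needs to know which separator the host system uses.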

What if the user’s system doesn’t have the libraries I need?

Create a requirements.txt file listing dependencies, then users can install them with pip install -r requirements.txt. For standalone scripts, use PyInstaller to bundle Python and libraries into a single executable.

Conclusion

Python automation transforms tedious manual tasks into reliable, repeatable processes. You’ve learned to work with files and directories using os and pathlib, schedule tasks with the schedule library, send email notifications via smtplib, and build complete systems like automated backups. The key is starting simple: automate your most painful task first, then gradually expand your automation toolkit.

For deeper learning, explore the official documentation: os module, pathlib, shutil, and smtplib are all built-in. For external libraries, check schedule and watchdog on PyPI. The automation possibilities are endless once you see Python as your personal robot assistant.

Related Articles

Python File Handling: Reading, Writing, and Manipulating Files

Python Modules and Packages: Organizing Your Code

Python Error Handling: try, except, and finally

Task Aspect	Manual Process	Shell Script	Python Script
Development Time	Hours per occurrence	30-60 minutes	15-30 minutes
Readability	N/A	Cryptic syntax	Human-readable code
Cross-Platform	N/A	Linux/Mac only	Windows, Mac, Linux
Debugging	N/A	Difficult	Easy with proper logging
Email Integration	Manual setup	Complex	Built-in libraries
Maintainability	N/A	Hard to modify	Easy to extend and modify



				