
If you have ever built a Python app that needs to store files, images, or backups in the cloud, AWS S3 is one of the most reliable options available. Whether you are saving user uploads, archiving log files, or distributing static assets, S3 gives you virtually unlimited storage with simple, consistent access via the boto3 library. The challenge is knowing how to connect your Python code to S3 correctly — and that is exactly what this article covers.

The boto3 library is the official AWS SDK for Python. It handles authentication, request signing, and all the low-level HTTP details, so you can focus on your application logic. You will need an AWS account and an IAM user with S3 access, but once those are in place, boto3 makes S3 operations surprisingly straightforward.

In this article, you will learn how to configure boto3 credentials, create and list S3 buckets, upload and download files, manage object metadata, work with presigned URLs, and build a practical file sync script that mirrors a local folder to S3. By the end, you will have a working toolkit for integrating S3 into any Python project.

Using boto3 with S3: Quick Example

Before diving into setup, here is a minimal example that uploads a file to S3 and downloads it back — the two most common operations you will need in any project:

# s3_quick.py
import boto3

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-bucket-demo'

# Upload a file
s3.upload_file('hello.txt', bucket, 'hello.txt')
print("Uploaded hello.txt to S3")

# Download it back
s3.download_file(bucket, 'hello.txt', 'hello_downloaded.txt')
print("Downloaded hello.txt from S3")

Output:

Uploaded hello.txt to S3
Downloaded hello.txt from S3

The upload_file method takes three arguments: local file path, bucket name, and S3 object key (the file name in S3). The download_file method reverses this: bucket, S3 key, local destination. The rest of this article builds on these two core operations with more realistic scenarios.

What Is boto3 and Why Use It for S3?

boto3 is the official AWS SDK for Python, maintained by Amazon. It provides two levels of access: the low-level client interface, which maps almost directly to the AWS REST API, and the higher-level resource interface, which wraps common operations in convenient Python objects. For S3, both are widely used — clients give you full control; resources make simple tasks more readable.

Interface | Style | Best For | Example
client | Low-level, dict-based | Full control, presigned URLs, metadata | s3.put_object(...)
resource | Object-oriented | Simple CRUD, bucket/object iteration | bucket.upload_file(...)

For most projects, you will use the client interface — it gives you access to every S3 feature. The resource interface is convenient for quick scripts but does not support all operations. Throughout this article, we will use the client interface so the patterns work everywhere.
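
The resource interface is still worth seeing once so you can recognize it in other codebases. Here is a minimal sketch of the quick example rewritten with it, reusing the hello.txt file and demo bucket name from above:

# s3_resource_demo.py
import boto3

# The resource interface wraps buckets and objects in Python classes
s3 = boto3.resource('s3', region_name='us-east-1')
bucket = s3.Bucket('my-python-bucket-demo')

# Upload a file, then iterate over every object in the bucket
bucket.upload_file('hello.txt', 'hello.txt')
for obj in bucket.objects.all():
    print(obj.key, obj.size)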

Setting Up boto3 and AWS Credentials

Install boto3 with pip:

# terminal
pip install boto3

Output:

Successfully installed boto3-1.34.x botocore-1.34.x

Next, configure your AWS credentials. The recommended approach for development is the AWS credentials file. boto3 automatically reads from ~/.aws/credentials — you never need to hardcode keys in your code:

# ~/.aws/credentials  (create this file manually or run: aws configure)
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
region = us-east-1

For production, use IAM roles attached to your EC2 instance or Lambda function instead of hardcoded keys — boto3 picks these up automatically with no code changes. If you are testing locally without the credentials file, you can pass them directly to the client, but never commit keys to source control:

# s3_explicit_credentials.py
import boto3

# Only for local testing -- use IAM roles or credentials file in production
s3 = boto3.client(
    's3',
    region_name='us-east-1',
    aws_access_key_id='YOUR_KEY',
    aws_secret_access_key='YOUR_SECRET'
)
print("Client created:", s3.meta.endpoint_url or 'default AWS endpoint')

Output:

Client created: https://s3.amazonaws.com

Creating and Listing S3 Buckets

A bucket is a top-level container in S3 — like a root folder. Every object in S3 lives inside a bucket. Bucket names must be globally unique across all AWS accounts, lowercase, and 3 to 63 characters long. Here is how to create a bucket and verify it exists:

# s3_buckets.py
import boto3

s3 = boto3.client('s3', region_name='us-east-1')

# Create a new bucket (us-east-1 needs no CreateBucketConfiguration; see the note below)
bucket_name = 'my-python-demo-bucket-2026'

try:
    s3.create_bucket(Bucket=bucket_name)
    print(f"Created bucket: {bucket_name}")
except s3.exceptions.BucketAlreadyOwnedByYou:
    print(f"Bucket already exists and is yours: {bucket_name}")

# List all your buckets
response = s3.list_buckets()
print("\nYour S3 buckets:")
for bucket in response['Buckets']:
    print(f"  {bucket['Name']} (created: {bucket['CreationDate'].date()})")

Output:

Created bucket: my-python-demo-bucket-2026

Your S3 buckets:
  my-python-demo-bucket-2026 (created: 2026-05-02)

Note that for regions other than us-east-1, you must pass a CreateBucketConfiguration parameter specifying the region — otherwise boto3 raises an IllegalLocationConstraintException. For example: s3.create_bucket(Bucket=name, CreateBucketConfiguration={'LocationConstraint': 'eu-west-1'}).
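
If you want one code path that works in any region, a small helper can add the configuration only when needed. A sketch (the helper name is ours, not part of boto3):

# s3_create_bucket_any_region.py
import boto3

def create_bucket(name: str, region: str) -> None:
    """Create a bucket, adding LocationConstraint only outside us-east-1."""
    s3 = boto3.client('s3', region_name=region)
    if region == 'us-east-1':
        s3.create_bucket(Bucket=name)
    else:
        s3.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={'LocationConstraint': region}
        )

create_bucket('my-python-demo-bucket-2026', 'us-east-1')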

Uploading and Downloading Files

boto3 provides three upload methods, each suited to different situations. upload_file reads from disk, upload_fileobj reads from a file-like object, and put_object uploads raw bytes or strings. For most use cases, upload_file is the right choice because it automatically uses multipart upload for large files:

# s3_upload.py
import boto3
import os

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'

# Upload a local file
s3.upload_file(
    Filename='report.csv',      # local path
    Bucket=bucket,
    Key='reports/2026/report.csv',  # S3 path (key)
    ExtraArgs={'ContentType': 'text/csv'}
)
print("Uploaded report.csv")

# Upload from an in-memory bytes object
import io
data = io.BytesIO(b"name,score\nAlice,95\nBob,87\n")
s3.upload_fileobj(data, bucket, 'reports/2026/scores.csv')
print("Uploaded scores from memory")

# List objects in a prefix
response = s3.list_objects_v2(Bucket=bucket, Prefix='reports/')
for obj in response.get('Contents', []):
    print(f"  {obj['Key']} ({obj['Size']} bytes)")

Output:

Uploaded report.csv
Uploaded scores from memory
  reports/2026/report.csv (1024 bytes)
  reports/2026/scores.csv (28 bytes)
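
The third method, put_object, maps directly to the underlying PUT Object API call and takes the body as bytes or a file-like object, which is convenient for small generated payloads such as JSON. A minimal sketch, assuming the same bucket:

# s3_put_object.py
import boto3
import json

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'

# put_object takes the body directly -- no local file needed
payload = json.dumps({'status': 'ok', 'rows': 2}).encode('utf-8')
s3.put_object(
    Bucket=bucket,
    Key='reports/2026/summary.json',
    Body=payload,
    ContentType='application/json'
)
print("Uploaded summary.json via put_object")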

Downloading follows the same pattern with mirrored methods. Use download_file for disk, download_fileobj for streaming into memory, and get_object when you need the raw response body or metadata alongside the content:

# s3_download.py
import boto3
import io

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'

# Download to disk
s3.download_file(bucket, 'reports/2026/report.csv', 'local_report.csv')
print("Downloaded to local_report.csv")

# Download into memory (no temp file)
buffer = io.BytesIO()
s3.download_fileobj(bucket, 'reports/2026/scores.csv', buffer)
buffer.seek(0)
content = buffer.read().decode('utf-8')
print("In-memory content:", content[:40])

# Get object with metadata
response = s3.get_object(Bucket=bucket, Key='reports/2026/report.csv')
print("Content-Type:", response['ContentType'])
print("Last-Modified:", response['LastModified'])

Output:

Downloaded to local_report.csv
In-memory content: name,score
Alice,95
Bob,87

Content-Type: text/csv
Last-Modified: 2026-05-02 10:00:00+00:00

Deleting and Copying Objects

Deleting and copying objects are single API calls. Copying is particularly useful for server-side operations — moving objects between buckets, creating versioned backups, or changing storage classes without downloading and re-uploading:

# s3_delete_copy.py
import boto3

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'

# Copy an object within the same bucket
s3.copy_object(
    CopySource={'Bucket': bucket, 'Key': 'reports/2026/report.csv'},
    Bucket=bucket,
    Key='reports/archive/report_backup.csv'
)
print("Copied report.csv to archive/")

# Delete a single object
s3.delete_object(Bucket=bucket, Key='reports/2026/scores.csv')
print("Deleted scores.csv")

# Delete multiple objects at once (more efficient than one-by-one)
s3.delete_objects(
    Bucket=bucket,
    Delete={
        'Objects': [
            {'Key': 'reports/2026/report.csv'},
            {'Key': 'reports/archive/report_backup.csv'},
        ]
    }
)
print("Deleted 2 objects in batch")

Output:

Copied report.csv to archive/
Deleted scores.csv
Deleted 2 objects in batch

Always use delete_objects for batch deletes — it saves API calls and is significantly faster than calling delete_object in a loop when removing dozens or hundreds of files.
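
One caveat: delete_objects accepts at most 1,000 keys per request, so clearing out an entire prefix means paginating the listing and deleting page by page. A sketch of that pattern, using the same demo bucket and an example prefix:

# s3_delete_prefix.py
import boto3

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'
prefix = 'reports/archive/'

# Each page holds at most 1,000 keys, which matches the delete_objects limit
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    keys = [{'Key': obj['Key']} for obj in page.get('Contents', [])]
    if keys:
        s3.delete_objects(Bucket=bucket, Delete={'Objects': keys})
        print(f"Deleted {len(keys)} objects under {prefix}")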

Generating Presigned URLs

Presigned URLs allow you to grant temporary, time-limited access to private S3 objects without making them public. They are essential for scenarios like letting users download their own files, sharing reports with external stakeholders, or accepting uploads directly from browsers — all without exposing your AWS credentials:

# s3_presigned.py
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'

# Generate a presigned download URL (valid for 1 hour)
try:
    url = s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': bucket, 'Key': 'reports/2026/report.csv'},
        ExpiresIn=3600  # seconds
    )
    print("Download URL (expires in 1 hour):")
    print(url[:80] + "...")
except ClientError as e:
    print(f"Error generating URL: {e}")

# Generate a presigned POST URL for browser uploads
post = s3.generate_presigned_post(
    bucket,
    'uploads/user_file.pdf',
    Fields={'Content-Type': 'application/pdf'},
    Conditions=[
        {'Content-Type': 'application/pdf'},    # fields must also be covered by a condition
        ['content-length-range', 0, 10485760],  # max 10MB
    ],
    ExpiresIn=600  # 10 minutes
)
print("\nPresigned POST fields:", list(post['fields'].keys()))

Output:

Download URL (expires in 1 hour):
https://my-python-demo-bucket-2026.s3.amazonaws.com/reports/2026/report.csv?X-Amz-Algorithm=AWS4...

Presigned POST fields: ['key', 'AWSAccessKeyId', 'policy', 'signature', 'Content-Type']

The presigned POST URL is different from the GET URL — it returns a dictionary with a URL and fields that your frontend must include in a multipart form submission. This lets you accept direct browser-to-S3 uploads without routing files through your server, which saves bandwidth and infrastructure cost.
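
To make the field handling concrete, here is a sketch that submits the presigned POST from Python with the third-party requests library (pip install requests); a browser form sends the same structure, with the file part last:

# s3_presigned_post_upload.py
import boto3
import requests  # third-party: pip install requests

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'

post = s3.generate_presigned_post(
    bucket,
    'uploads/user_file.pdf',
    Fields={'Content-Type': 'application/pdf'},
    Conditions=[
        {'Content-Type': 'application/pdf'},
        ['content-length-range', 0, 10485760],
    ],
    ExpiresIn=600
)

# Every presigned field goes in the form data; the file itself goes last
with open('user_file.pdf', 'rb') as f:
    response = requests.post(
        post['url'],
        data=post['fields'],
        files={'file': ('user_file.pdf', f, 'application/pdf')}
    )
print("Upload status:", response.status_code)  # S3 returns 204 on success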

Real-Life Example: Local Folder to S3 Sync Script

This script syncs a local folder to an S3 bucket, uploading only new or changed files based on file size and last-modified time. It is the kind of utility you would use for automated backups or deploying static websites:

# s3_sync.py
import boto3
import os
from pathlib import Path
from botocore.exceptions import ClientError

def get_local_files(folder: str) -> dict:
    """Return {relative_path: file_size} for all files in folder."""
    result = {}
    base = Path(folder)
    for path in base.rglob('*'):
        if path.is_file():
            rel = str(path.relative_to(base)).replace('\\', '/')
            result[rel] = path.stat().st_size
    return result

def get_s3_files(s3_client, bucket: str, prefix: str) -> dict:
    """Return {key: size} for all objects under prefix."""
    result = {}
    paginator = s3_client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            key = obj['Key']
            if prefix:
                key = key[len(prefix):]
            result[key] = obj['Size']
    return result

def sync_folder_to_s3(local_folder: str, bucket: str, prefix: str = '') -> None:
    s3 = boto3.client('s3', region_name='us-east-1')
    local = get_local_files(local_folder)
    remote = get_s3_files(s3, bucket, prefix)

    uploaded = 0
    skipped = 0

    for rel_path, local_size in local.items():
        s3_key = (prefix + rel_path) if prefix else rel_path
        remote_size = remote.get(rel_path)

        if remote_size == local_size:
            skipped += 1
            continue  # File unchanged, skip

        local_full = os.path.join(local_folder, rel_path)
        s3.upload_file(local_full, bucket, s3_key)
        print(f"  Uploaded: {rel_path}")
        uploaded += 1

    print(f"\nSync complete: {uploaded} uploaded, {skipped} skipped")

# Run the sync
sync_folder_to_s3('./my_project', 'my-python-demo-bucket-2026', prefix='backups/')

Output:

  Uploaded: README.md
  Uploaded: src/main.py
  Uploaded: src/utils.py

Sync complete: 3 uploaded, 2 skipped

This script uses a paginator for the S3 listing, which is important — S3 list_objects_v2 returns at most 1000 objects per call, and the paginator handles the continuation tokens automatically. You can extend this script to add deletion of remote files that no longer exist locally, or to use ETag-based comparison for more accurate change detection.
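
As a sketch of the deletion extension, a helper like the one below (our own addition, not part of the script above) could reuse the dictionaries returned by get_local_files and get_s3_files to remove objects that no longer exist locally:

# s3_sync_delete.py -- sketch of a deletion extension for the sync script
def delete_remote_only(s3_client, bucket: str, prefix: str,
                       local: dict, remote: dict) -> int:
    """Delete S3 objects under prefix that have no matching local file."""
    stale = [{'Key': prefix + key} for key in remote if key not in local]
    # delete_objects accepts at most 1,000 keys per request
    for i in range(0, len(stale), 1000):
        s3_client.delete_objects(
            Bucket=bucket,
            Delete={'Objects': stale[i:i + 1000]}
        )
    return len(stale)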

Frequently Asked Questions

Why do I get a redirect error when creating buckets in non-us-east-1 regions?

S3 bucket creation in regions other than us-east-1 requires passing a CreateBucketConfiguration dict with the LocationConstraint set to your target region. Without it, you get an IllegalLocationConstraintException. The us-east-1 region is the exception — it does not accept the configuration parameter. To write region-agnostic code, conditionally add CreateBucketConfiguration only when the region is not us-east-1.

How does boto3 find credentials automatically?

boto3 follows a credential chain in this order: environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY), the ~/.aws/credentials file, IAM instance profiles (for EC2), and ECS task role credentials. In production on AWS, always use IAM roles instead of hardcoded keys — the SDK picks them up automatically with zero configuration changes in your code.
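
If you keep multiple profiles in the credentials file, you can also select one explicitly with a Session rather than relying on [default]. A short sketch (the profile name here is just an example):

# s3_named_profile.py
import boto3

# Pick a specific profile from ~/.aws/credentials
session = boto3.Session(profile_name='staging', region_name='us-east-1')
s3 = session.client('s3')
print("Buckets visible to this profile:", len(s3.list_buckets()['Buckets']))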

How do I upload large files efficiently?

boto3’s upload_file automatically uses multipart upload for files larger than 8MB (the default threshold). You can configure the threshold and part size via a TransferConfig object: config = boto3.s3.transfer.TransferConfig(multipart_threshold=8388608, max_concurrency=10), then pass it as s3.upload_file(..., Config=config). For very large files (GB+), increasing max_concurrency significantly improves throughput.
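
Here is a sketch of a tuned multipart upload that also wires in the progress callback mentioned in the conclusion; the file name is an example and the threshold and chunk size are the defaults made explicit:

# s3_multipart_upload.py
import os
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'
filename = 'big_dataset.zip'

# 8 MB threshold and parts, 10 parallel threads
config = TransferConfig(multipart_threshold=8 * 1024 * 1024,
                        multipart_chunksize=8 * 1024 * 1024,
                        max_concurrency=10)

total = os.path.getsize(filename)

def progress(bytes_transferred: int) -> None:
    # boto3 calls this repeatedly with the byte count of each chunk sent
    progress.seen = getattr(progress, 'seen', 0) + bytes_transferred
    print(f"\r{progress.seen / total:.0%} uploaded", end='')

s3.upload_file(filename, bucket, f'archives/{filename}',
               Config=config, Callback=progress)
print("\nUpload complete")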

How do I make an object publicly accessible?

Set the ACL to 'public-read' in ExtraArgs during upload: s3.upload_file(file, bucket, key, ExtraArgs={'ACL': 'public-read'}). However, newer AWS accounts have “Block Public Access” enabled by default at the account level. You may need to disable this in the S3 console under your bucket’s Permissions tab. For most use cases, presigned URLs are a better alternative to public objects because they are time-limited and don’t require changing bucket policies.

How can I reduce S3 storage costs?

Use lifecycle rules to automatically transition objects to cheaper storage classes (Standard-IA after 30 days, Glacier after 90 days) or delete them after a retention period. Set these up via the S3 console or with s3.put_bucket_lifecycle_configuration(). Also enable S3 Intelligent-Tiering for data with unpredictable access patterns — it automatically moves objects between access tiers with no retrieval fees.
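
As a sketch, the lifecycle rules described above could be set from Python like this (rule ID, prefix, and day counts are examples you would adjust):

# s3_lifecycle.py
import boto3

s3 = boto3.client('s3', region_name='us-east-1')
bucket = 'my-python-demo-bucket-2026'

# Transition reports to cheaper storage over time, then expire them
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'archive-reports',
            'Filter': {'Prefix': 'reports/'},
            'Status': 'Enabled',
            'Transitions': [
                {'Days': 30, 'StorageClass': 'STANDARD_IA'},
                {'Days': 90, 'StorageClass': 'GLACIER'},
            ],
            'Expiration': {'Days': 365},
        }]
    }
)
print("Lifecycle rules applied")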

Conclusion

You now have a complete toolkit for S3 operations in Python using boto3. We covered authentication via credentials files and IAM roles, bucket creation and listing, uploading and downloading files with all three method variants, batch delete and server-side copy, generating presigned URLs for secure temporary access, and a practical folder sync script that handles both new and changed files efficiently with pagination support.

The next step is to extend the sync script with deletion support (remove remote objects not present locally) and ETag-based change detection using MD5 checksums for more accurate diffing. You can also add progress callbacks to track upload progress for large files using boto3’s callback parameter in upload_file.

For the full boto3 S3 documentation including all available operations, storage classes, and IAM policy examples, see the official boto3 S3 reference.