How To Detect If A Date Is A Public Holiday In Python 3

Last Updated: June 01, 2026

Table of Contents

Checking For Public Holiday With Holidays Module
Checking For Holidays With API Call to Calendarific
Summary
Subscribe to our newsletter
Pro Tips for Working with Public Holidays in Python
Frequently Asked Questions
Related Articles

Beginner

Determining the current date is a public holiday can be tricky when holidays change and it of course changes from country to country. From system time of servers & machines running to timestamps for tracking the transactions and events in e-commerce platforms, the date and time play a major role. There are a variety of use cases related to manipulating date and time that can be solved using the inbuilt datetime module in Python3, such as

Finding if a given year is a leap year or an ordinary year
Finding the number of days between the two mentioned dates
Convert between different date or time formats

What if you were to check if a given date is a public holiday? There isn’t any specific formula or logic to determine that, do we? Holidays can be pre-defined or uncalled for.

Here, we will be exploring the two ways to detect if a date is a holiday or not.

Written by Pubs

Python developer and educator with 15+ years building production systems across data engineering, web APIs, and AI tooling. Founder of Python How To Program — 270+ in-depth tutorials covering the modern Python stack.

View all tutorials by Pubs →

Checking For Public Holiday With Holidays Module

Although Python3 doesn’t provide any modules to detect if a date is a holiday or not, there are some of the external modules that help in detecting this. One of those modules is Holidays.

In your terminal, type in the following to get the module installed.

sudo pip3 install holidays

Now that our module is ready, let’s understand a bit about what the module and what it is capable of. Have a look at the following code snippet.

'''
Snippet to check if a given date is a holiday
'''

from datetime import date                                           # Step 1
import holidays

us_holidays = holidays.UnitedStates()                               # Step 2

input_date = input("Enter the date as YYYY-MM-DD: ")                # Step 3

holiday_name = us_holidays .get(input_date)                         # Step 4

if holiday_name != None: 
    output = "{} is a US Holiday - It's {}".format(input_date, holiday_name)
else:
    output = "{} is not a US Holiday".format(input_date)
                                                                    # Step 5
print (output)

In the above snippet,

Step 1: Imports the required modules
Step 2: Initializes the us_holidays object, so that the corresponding get function can be invoked at step 3
Step 3: Gets dateinput from the user
Step 4: Invokes the get function of the holidays module. This returns the name of the holiday if the date is a holiday or returns None in case if it isn’t. This gets assigned to the variable – holiday_name.
Step 5: Based on the variable – holiday_name, using the if clause the string formatting is done. Can you make this if clause even leaner? Read this article to know about the One line if else statements.

Here’s what the output looks like.

Checking For Holidays With API Call to Calendarific

The above method is suitable for simple projects; however, it can never be used to provide an enterprise-grade solution. Let’s say, you are building a web application for a holiday and travel startup, building an enterprise-grade application requires an enterprise-grade solution. If you haven’t noticed, the holidays module is pretty simple and if you consider state-wise or newly announced holidays, then this solution doesn’t simply cut for a large-scale application.

Enterprise requirements such as these can be satisfied by using external APIs such as Calendarific which provides the API as a service for such applications to consume. They keep updating the holidays of states and countries constantly, and the applications may consume these APIs. Of course, enterprise solutions don’t always come free, but the developer account has a limit of 1000API requests per month.

Locate to https://calendarific.com/ on your favorite browser and follow the steps as shown in the following images to get yourself a free account and an API key for this exercise.

Step 1: Open Calendarific on your favorite browser

Understanding the Calendarific REST API

Before we could dive into using the API KEY, get yourself a REST API client – Insomnia or Postman. We are about to test our API key if we are able to retrieve the holiday information. Plugin the following URL by replacing [APIKEY] text with your API KEY received from above on your REST client.

https://calendarific.com/api/v2/holidays?api_key=[APIKEY]&country=us-ny&type=national&year=2020&month=1&day=1

In the above URL:

https://calendarific.com/api/v2 is the API Base URL
/holidays is the API route
api_key, country, type, year, month, day are URL Parameters
Each parameter has a value allocated to it with an = (equal sign)
Each parameter and value pair is split by an & (ampersand)

For the above API call, the following response will be received; the value corresponding to the code key under the meta tag as ‘200’ corresponds to a successful response.

{
  "meta": {
    "code": 200
  },
  "response": {
    "holidays": [
      {
        "name": "New Year's Day",
        "description": "New Year's Day is the first day of the Gregorian calendar, which is widely used in many countries such as the USA.",
        "country": {
          "id": "us",
          "name": "United States"
        },
        "date": {
          "iso": "2020-01-01",
          "datetime": {
            "year": 2020,
            "month": 1,
            "day": 1
          }
        },
        "type": [
          "National holiday"
        ],
        "locations": "All",
        "states": "All"
      }
    ]
  }
}

The REST API call has returned some useful info about the National holiday on the 1st of January. Let’s see if it’s able to detect for the 2nd of January. Plugin the following URL again by replacing the text [APIKEY] with your API Key.

https://calendarific.com/api/v2/holidays?api_key=[APIKEY]&country=us-ny&type=national&year=2020&month=1&day=2

The above URL should be returning a response similar to below.

{
   "meta": {
     "code": 200
   },
   "response": {
     "holidays": []
   }
 }

Indeed, the 2nd of January is not a public holiday and hence, the holidays list inside the response nested JSON key turns out to be an empty list.

Now we know that our API works very well, it is now time to incorporate Calendarific REST API into our Python code. We will be using the requests module in order to make this happen. Here’s how it is done.

'''
Snippet to check if a given date is a holiday using an external API - Calendarific
'''

import requests                                                     # Step 1

api_key   = '[APIKEY]'                                              # Step 2
base_url  = 'https://calendarific.com/api/v2'
api_route = '/holidays'

location  = input("Enter Country & State code - E.g.: us-ny: ")
date_inpt = input("Enter the date as YYYY-MM-DD: ")                 # Step 3
y, m, d = date_inpt.split('-')
 
full_url = '{}{}?api_key={}&country={}&type=national&year={}&month={}&day={}'\
                .format(base_url, api_route, api_key, location, str(int(y)), str(int(m)), str(int(d)))                                           # Step 4

response = requests.get(full_url).json()                            # Step 5

if response['response']['holidays'] != []:
    print ("{} is a holiday - {}".format(date_inpt, response['response']['holidays'][0]['name']))
else:                                                               # Step 6
    print ("{} is not a holiday".format(date_inpt))

In the above snippet,

Step 1: Import requests module – you will be needing this module to invoke the REST API.
Step 2: Replace ‘[APIKEY]’ with your own API key from Calendarific
Step 3: The user inputs the corresponding location and date for which the holiday needs to be detected
Step 4: String formatting in order to frame the URL
Step 5: Invoke the API and convert the response to a JSON; i.e.) a dictionary
Step 6: If clause checks for the presence of an empty list or with a returned response.

Here’s what the output looks like.

And there you have it, a working example for detecting if a given date is a holiday using an external API.

Summary

From an overall perspective, there could be multiple ways to solve a given problem, and here, we have portrayed two of those ways in detecting if a given date is a holiday or not. One is a straight forward out-of-the-box solution and the other one is an enterprise-ready solution, which one would you choose?

How To Automate Tasks with Python: A Practical Guide

by Pubs | Jun 14, 2026 | Automation

Last Updated: June 14, 2026

Table of Contents

Automate Tasks with Python: Quick Example
What Is Python Automation and When Should You Use It?
Automating File and Folder Operations
Automating Web Data Collection
Running Automated Jobs on a Schedule
Running System Commands with subprocess
Real-Life Example: Automated Downloads Folder Organizer
Frequently Asked Questions
Conclusion
Related Articles

Beginner to Intermediate

You have a folder full of downloaded files named document(1).pdf, screenshot_2024.png, and report_final_v3_FINAL.xlsx. Every week you spend 20 minutes sorting them by hand. Or maybe you copy data from a website into a spreadsheet every morning, or you run the same three terminal commands every time you start work. These are the tasks Python was built to eliminate. With a few dozen lines of code you can turn a painful weekly chore into something that runs itself while you drink coffee.

Python ships with a rich standard library for automation — pathlib for file operations, subprocess for running system commands, smtplib for sending email — and the broader ecosystem adds libraries like schedule for periodic jobs and requests for web data collection. No special setup is needed beyond a standard Python 3.8+ install. For scheduling, you will need to install schedule with pip, but everything else in this article is built in.

In this guide we will cover four practical automation categories: organizing files and folders with pathlib and shutil, collecting web data with requests and BeautifulSoup, scheduling jobs to run automatically with the schedule library, and running system commands with subprocess. Each section ends with working code you can adapt to your own situation. By the end you will have a toolkit of reusable automation patterns and a complete script that organizes a messy downloads folder automatically.

Written by Pubs

Python developer and educator with 15+ years building production systems across data engineering, web APIs, and AI tooling. Founder of Python How To Program.

Automate Tasks with Python: Quick Example

Here is a self-contained script that renames and moves files in a folder based on their extension — one of the most common automation tasks you will ever write:

# sort_downloads.py
from pathlib import Path
import shutil

FOLDER = Path.home() / "Downloads"
DESTINATIONS = {
    ".pdf":  "Documents",
    ".png":  "Images",
    ".jpg":  "Images",
    ".xlsx": "Spreadsheets",
    ".csv":  "Spreadsheets",
    ".zip":  "Archives",
}

for file in FOLDER.iterdir():
    if file.is_file() and file.suffix in DESTINATIONS:
        dest_folder = FOLDER / DESTINATIONS[file.suffix]
        dest_folder.mkdir(exist_ok=True)
        shutil.move(str(file), dest_folder / file.name)
        print(f"Moved: {file.name} -> {DESTINATIONS[file.suffix]}/")

Output (example):

Moved: invoice_march.pdf -> Documents/
Moved: screenshot_2024.png -> Images/
Moved: sales_report.xlsx -> Spreadsheets/

The script uses Path.home() to get the user’s home directory regardless of operating system, iterdir() to loop over every item in the folder, and shutil.move() to relocate the file. The mkdir(exist_ok=True) call creates the destination folder if it does not already exist — no crash if it is there, no error if it is not. We will build a more complete version in the real-life example section, including duplicate detection and logging.

What Is Python Automation and When Should You Use It?

Automation means writing code that performs a repetitive task so you do not have to. The rule of thumb is: if you have done something manually more than three times, it is worth automating. Python is the go-to language for automation because it has concise syntax, a massive standard library, and third-party packages that cover almost every automation use case out of the box.

The table below maps common repetitive tasks to the Python tools that handle them:

Task Type	Python Tool	When to Use
File/folder operations	`pathlib`, `shutil`	Renaming, moving, copying, deleting files
Reading/writing files	built-in `open()`, `csv`	Log parsing, report generation, data transformation
Web data collection	`requests`, `BeautifulSoup`	Pulling prices, headlines, tables from websites
Scheduled jobs	`schedule`, `APScheduler`	Running tasks daily, hourly, or on a cron-like schedule
System commands	`subprocess`	Running CLI tools, shell scripts, git, ffmpeg
Email sending	`smtplib`, `yagmail`	Automated reports, alerts, notifications

A good automation script is idempotent — running it twice produces the same result as running it once. It handles edge cases (missing files, network errors, duplicate names) without crashing. And it logs what it did so you can review the results later. Keep these principles in mind as we work through each section.

Python developer sorting files automatically with pathlib and shutil — pathlib.iterdir() — because sorting files by hand is how you waste a Tuesday.

Automating File and Folder Operations

The pathlib module (Python 3.4+) provides an object-oriented interface for working with file paths that is far more readable than the older os.path approach. Combined with shutil for copy/move operations, these two modules cover 90% of file automation tasks.

Finding and Filtering Files with pathlib

The glob() and rglob() methods on a Path object let you find files matching a pattern across an entire directory tree. glob() searches one level deep; rglob() (recursive glob) searches all subdirectories:

# find_files.py
from pathlib import Path

base = Path("/tmp/project")  # change to your actual folder

# Find all Python files in this folder only
py_files = list(base.glob("*.py"))
print("Python files (top-level):", [f.name for f in py_files])

# Find all log files anywhere in the tree
log_files = list(base.rglob("*.log"))
print("Log files (all depths):", [f.relative_to(base) for f in log_files])

# Find files larger than 1 MB
large_files = [f for f in base.rglob("*") if f.is_file() and f.stat().st_size > 1_000_000]
print("Files over 1MB:", [f.name for f in large_files])

Output (example):

Python files (top-level): ['app.py', 'utils.py', 'config.py']
Log files (all depths): [PosixPath('logs/app.log'), PosixPath('logs/error.log')]
Files over 1MB: ['dataset.csv', 'backup.tar.gz']

f.stat().st_size returns the file size in bytes. The expression f.relative_to(base) strips the base directory from the path so you see logs/app.log instead of the full absolute path. Both are useful for building reports of what your script found before it starts moving anything.

Renaming and Copying Files Safely

Before moving or renaming files in an automation script, always check whether the destination already exists. Blindly overwriting a file can cause data loss that is impossible to reverse:

# safe_copy.py
from pathlib import Path
import shutil

def safe_copy(src: Path, dest_dir: Path) -> Path:
    """Copy src into dest_dir, appending a counter if the name already exists."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name

    if dest.exists():
        counter = 1
        while dest.exists():
            stem = src.stem
            dest = dest_dir / f"{stem}_{counter}{src.suffix}"
            counter += 1

    shutil.copy2(str(src), dest)   # copy2 preserves metadata (timestamps)
    return dest

# Demo
src_file = Path("/tmp/report.pdf")
src_file.write_text("dummy content")   # create test file
result = safe_copy(src_file, Path("/tmp/archive"))
print(f"Copied to: {result}")

result2 = safe_copy(src_file, Path("/tmp/archive"))  # simulate duplicate
print(f"Duplicate handled: {result2}")

Output:

Copied to: /tmp/archive/report.pdf
Duplicate handled: /tmp/archive/report_1.pdf

The shutil.copy2() function copies the file content AND preserves the original modification time, which is important when you want archive copies to retain their original dates. The counter loop ensures you never silently overwrite an existing file — a critical safety net for any file automation script.

Developer inspecting duplicate files during automation — Duplicate detected. Counter incremented. Crisis averted.

Automating Web Data Collection

Web scraping lets your scripts pull data from websites automatically. The standard approach uses requests to download HTML and BeautifulSoup to parse it. Install both with pip install requests beautifulsoup4.

Fetching and Parsing a Web Page

We will use quotes.toscrape.com, a site built specifically for scraping practice. It serves reliable HTML with a stable structure, so this code will continue to work without modification.

Here is the HTML structure of each quote on that page, so you can see exactly what the selectors are targeting:

<!-- HTML structure of each quote on quotes.toscrape.com -->
<div class="quote">
    <span class="text">"The world as we have created it..."</span>
    <span>
        by <small class="author">Albert Einstein</small>
    </span>
    <div class="tags">
        <a class="tag" href="/tag/change/page/1/">change</a>
    </div>
</div>

Now the scraping code:

# scrape_quotes.py
import requests
from bs4 import BeautifulSoup

def scrape_quotes(url: str) -> list[dict]:
    response = requests.get(url, timeout=10)
    response.raise_for_status()   # raises HTTPError for 4xx/5xx responses

    soup = BeautifulSoup(response.text, "html.parser")
    results = []

    for quote_div in soup.select("div.quote"):
        text_elem = quote_div.select_one("span.text")
        author_elem = quote_div.select_one("small.author")
        tag_elems = quote_div.select("a.tag")

        # Defensive: check before accessing .text
        text   = text_elem.text.strip()   if text_elem   else "Unknown"
        author = author_elem.text.strip() if author_elem else "Unknown"
        tags   = [t.text for t in tag_elems]

        results.append({"quote": text, "author": author, "tags": tags})

    return results

quotes = scrape_quotes("https://quotes.toscrape.com")
for q in quotes[:3]:
    print(f'{q["author"]}: {q["quote"][:60]}...')
    print(f'  Tags: {", ".join(q["tags"])}')
    print()

Output:

Albert Einstein: "The world as we have created it is a process of...
  Tags: change, deep-thoughts, thinking, world

J.K. Rowling: "It is our choices, Harry, that show what we truly a...
  Tags: abilities, choices

Albert Einstein: "There are only two ways to live your life. One is...
  Tags: inspirational, life, live, miracle, miracles

response.raise_for_status() is a one-line safety net — it raises a requests.HTTPError if the server returns a 4xx or 5xx status code instead of silently continuing with bad data. The defensive checks (text_elem.text if text_elem else "Unknown") protect against pages where an element is missing, which happens constantly on real-world sites.

Saving Scraped Data to CSV

Collecting data is only half the job — you need to store it somewhere useful. Writing to CSV with Python’s built-in csv module keeps the output format-agnostic and readable in any spreadsheet application:

# save_to_csv.py
import csv
import requests
from bs4 import BeautifulSoup

def scrape_quotes(url):
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    results = []
    for div in soup.select("div.quote"):
        text   = div.select_one("span.text")
        author = div.select_one("small.author")
        results.append({
            "quote":  text.text.strip()   if text   else "",
            "author": author.text.strip() if author else "",
        })
    return results

quotes = scrape_quotes("https://quotes.toscrape.com")

output_file = "quotes.csv"
with open(output_file, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["author", "quote"])
    writer.writeheader()
    writer.writerows(quotes)

print(f"Saved {len(quotes)} quotes to {output_file}")

Output:

Saved 10 quotes to quotes.csv

Always pass encoding="utf-8" when writing CSV files — without it, non-ASCII characters (curly quotes, accented letters, em-dashes) will cause encoding errors or garbled output on Windows. The newline="" argument is also required by Python’s csv module to prevent extra blank lines on Windows.

Python web scraping with requests and BeautifulSoup — BeautifulSoup.select() — structured extraction from unstructured chaos.

Running Automated Jobs on a Schedule

Writing the automation code is only half the job. You also need it to run at the right time, without you having to remember to start it. The schedule library provides a clean Python API for defining when jobs should run — every minute, every day at 9am, every Monday, and so on. Install it with pip install schedule.

Basic Scheduling with the schedule Library

The schedule library works with a simple event loop: you register jobs with schedule.every(), then call schedule.run_pending() in a loop to check whether any jobs are due:

# basic_schedule.py
import schedule
import time
from datetime import datetime

def morning_report():
    print(f"[{datetime.now():%H:%M:%S}] Good morning! Running daily report...")
    # your actual report logic goes here

def hourly_check():
    print(f"[{datetime.now():%H:%M:%S}] Hourly check complete.")

# Schedule the jobs
schedule.every().day.at("09:00").do(morning_report)
schedule.every().hour.do(hourly_check)
schedule.every(30).minutes.do(lambda: print("30-min heartbeat"))

print("Scheduler running. Press Ctrl+C to stop.")
while True:
    schedule.run_pending()
    time.sleep(30)   # check every 30 seconds to save CPU

Output (example at 09:00:00):

Scheduler running. Press Ctrl+C to stop.
[09:00:00] Good morning! Running daily report...
[09:00:00] Hourly check complete.
[09:00:00] 30-min heartbeat
[09:30:00] 30-min heartbeat
[10:00:00] Hourly check complete.

The time.sleep(30) inside the loop is important — without a sleep, the loop burns 100% of one CPU core doing nothing. Sleeping 30 seconds means jobs can fire up to 30 seconds late (acceptable for most automation), while using negligible CPU. If you need second-level precision, use time.sleep(1) instead.

Handling Errors in Scheduled Jobs

When a job in a scheduled loop raises an unhandled exception, the whole process crashes and no more jobs run. Wrap your job functions with a try/except to log errors and keep the loop running:

# robust_schedule.py
import schedule
import time
import logging
from datetime import datetime

logging.basicConfig(
    filename="automation.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

def run_safely(job_func):
    """Decorator that catches exceptions and logs them without crashing the loop."""
    def wrapper():
        try:
            job_func()
            logging.info(f"{job_func.__name__} completed successfully")
        except Exception as exc:
            logging.error(f"{job_func.__name__} failed: {exc}", exc_info=True)
    return wrapper

def daily_scrape():
    # Simulating a job that sometimes fails
    import random
    if random.random() < 0.3:
        raise ConnectionError("Network unavailable")
    print("Scraped data successfully.")

schedule.every().day.at("08:00").do(run_safely(daily_scrape))

while True:
    schedule.run_pending()
    time.sleep(60)

The run_safely decorator wraps any function so that exceptions are caught, logged to a file, and the scheduler continues. The exc_info=True argument tells Python's logging module to include the full traceback in the log file -- essential for debugging failures that happen at 3am while you are asleep.

Python schedule library running automated jobs on a timer — schedule.run_pending() -- the world's most patient while loop.

Running System Commands with subprocess

Sometimes the right tool for a job is a command-line program, not a Python library. subprocess.run() lets your Python script launch any system command and capture its output, making it easy to orchestrate CLI tools like git, ffmpeg, or database utilities.

Basic subprocess Usage

# run_commands.py
import subprocess

# Run a command and capture its output
result = subprocess.run(
    ["python3", "--version"],
    capture_output=True,
    text=True,       # decode bytes to str automatically
    check=False,     # don't raise on non-zero exit code
)

print("Return code:", result.returncode)
print("Stdout:", result.stdout.strip())
print("Stderr:", result.stderr.strip())

# Run a command that lists files (works on macOS/Linux)
ls_result = subprocess.run(
    ["ls", "-la", "/tmp"],
    capture_output=True,
    text=True,
)
print("\nFirst 3 lines of /tmp listing:")
for line in ls_result.stdout.strip().split("\n")[:3]:
    print(" ", line)

Output:

Return code: 0
Stdout: Python 3.11.4
Stderr:

First 3 lines of /tmp listing:
  total 0
  drwxrwxrwt  20 root  wheel  640 Jun 12 09:31 .
  drwxr-xr-x  20 root  wheel  640 May 18 11:02 ..

Always use a list for the command argument (["ls", "-la", "/tmp"]) rather than a string ("ls -la /tmp"). The list form avoids shell injection vulnerabilities -- if any part of the command comes from user input, a string passed to shell=True can execute arbitrary shell commands. The list form is always safer.

Practical Example: Automating Git Operations

Here is a real-world use case -- a script that automatically stages, commits, and pushes changes in a git repository. This is useful for automating backup commits or syncing generated files:

# git_autocommit.py
import subprocess
from datetime import datetime
from pathlib import Path

def git_run(args: list[str], cwd: Path) -> subprocess.CompletedProcess:
    """Run a git command in the specified directory."""
    return subprocess.run(
        ["git"] + args,
        capture_output=True,
        text=True,
        cwd=cwd,
    )

def auto_commit(repo_path: Path, message: str = None) -> bool:
    """Stage all changes and commit if there is anything to commit."""
    # Check for changes
    status = git_run(["status", "--porcelain"], repo_path)
    if not status.stdout.strip():
        print("No changes to commit.")
        return False

    if message is None:
        message = f"Auto-commit {datetime.now():%Y-%m-%d %H:%M}"

    # Stage all changes
    git_run(["add", "-A"], repo_path)

    # Commit
    commit = git_run(["commit", "-m", message], repo_path)
    if commit.returncode == 0:
        print(f"Committed: {message}")
        return True
    else:
        print(f"Commit failed: {commit.stderr.strip()}")
        return False

# Usage (change to your actual repo path)
repo = Path("/tmp/my-project")
repo.mkdir(exist_ok=True)
auto_commit(repo, "Automated daily backup")

Output (when changes exist):

Committed: Automated daily backup

The helper function git_run() takes the git subcommand as a list and prepends "git", keeping the calling code clean. The cwd=cwd argument tells subprocess where to run the command -- without it, git would operate on whatever directory the script itself lives in, which is almost never what you want.

Real-Life Example: Automated Downloads Folder Organizer

We will now build a complete, production-ready script that watches your Downloads folder and organizes files into subfolders by type. It handles duplicates, logs every action, and can be scheduled to run automatically.

Python automation script organizing downloads folder with logging — Automation log: 47 files sorted. 3 duplicates handled. 0 manual clicks.

# downloads_organizer.py
import shutil
import logging
from pathlib import Path
from datetime import datetime

# --- Configuration ---
DOWNLOADS_DIR = Path.home() / "Downloads"
LOG_FILE = Path.home() / "downloads_organizer.log"

RULES = {
    "Documents": [".pdf", ".doc", ".docx", ".txt", ".rtf"],
    "Images":    [".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp", ".heic"],
    "Videos":    [".mp4", ".mov", ".mkv", ".avi", ".m4v"],
    "Audio":     [".mp3", ".m4a", ".flac", ".wav", ".aac"],
    "Archives":  [".zip", ".tar", ".gz", ".rar", ".7z"],
    "Code":      [".py", ".js", ".html", ".css", ".json", ".sh", ".ipynb"],
    "Data":      [".csv", ".xlsx", ".xls", ".tsv", ".parquet"],
}

# Build reverse lookup: extension -> folder name
EXT_MAP = {ext: folder for folder, exts in RULES.items() for ext in exts}

# --- Logging setup ---
logging.basicConfig(
    filename=LOG_FILE,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

def unique_dest(dest: Path) -> Path:
    """Append a counter to avoid overwriting existing files."""
    if not dest.exists():
        return dest
    counter = 1
    while True:
        candidate = dest.parent / f"{dest.stem}_{counter}{dest.suffix}"
        if not candidate.exists():
            return candidate
        counter += 1

def organize(dry_run: bool = False) -> dict:
    """Move files from Downloads into categorized subfolders."""
    stats = {"moved": 0, "skipped": 0, "unknown": 0}

    for item in DOWNLOADS_DIR.iterdir():
        if not item.is_file():
            continue

        folder_name = EXT_MAP.get(item.suffix.lower())
        if folder_name is None:
            logging.info(f"SKIP (unknown type): {item.name}")
            stats["unknown"] += 1
            continue

        dest_dir = DOWNLOADS_DIR / folder_name
        dest = unique_dest(dest_dir / item.name)

        if dry_run:
            print(f"[DRY RUN] Would move: {item.name} -> {folder_name}/")
        else:
            dest_dir.mkdir(exist_ok=True)
            shutil.move(str(item), dest)
            logging.info(f"MOVED: {item.name} -> {folder_name}/{dest.name}")
            stats["moved"] += 1

    return stats

if __name__ == "__main__":
    print(f"Organizing {DOWNLOADS_DIR} ...")
    result = organize(dry_run=False)
    summary = f"Done. Moved: {result['moved']}, Unknown: {result['unknown']}"
    print(summary)
    logging.info(summary)

Output (example):

Organizing /Users/alice/Downloads ...
Done. Moved: 12, Unknown: 3

The dry_run=True mode lets you preview what the script would do without actually moving anything -- run it first to confirm the output looks right. The unique_dest() function guarantees you never silently overwrite a file with the same name in the destination folder. You can schedule this script to run daily using the schedule library covered earlier, or on macOS/Linux you can add it to your crontab with crontab -e for OS-level scheduling without a Python process running continuously.

Frequently Asked Questions

What is the difference between shutil and pathlib for file operations?

pathlib is for path manipulation and simple operations: checking if a file exists, reading its metadata, renaming it within the same filesystem. shutil is for heavier operations: copying files (with or without metadata), moving files across filesystems, and deleting entire directory trees. In practice you often use both -- pathlib to build and check paths, shutil to actually move or copy the files. Use shutil.move() for moves (it handles cross-filesystem moves gracefully) and shutil.copy2() when you want to preserve file modification times.

Should I use the schedule library or cron for scheduled tasks?

It depends on your setup. The schedule library is pure Python and works identically on Windows, macOS, and Linux -- great for scripts you want to be portable or that run within an existing Python process. Cron (on Linux/macOS) and Task Scheduler (on Windows) are OS-level schedulers that are more reliable for long-running production tasks because they survive reboots automatically and do not require a Python process to stay running. For personal automation scripts on a development machine, schedule is simpler. For server deployments, lean on cron or a process manager like systemd.

When is it safe to use subprocess with shell=True?

Use shell=True only when the entire command is a string literal you control completely -- for example, a hardcoded one-liner like subprocess.run("ls -la /tmp | wc -l", shell=True). Never pass user input, command-line arguments, or any external data into a shell=True command string; doing so opens a shell injection vulnerability where an attacker can execute arbitrary commands. The list form (["ls", "-la", "/tmp"]) is safe with external data because each list element is passed directly to the OS without going through a shell interpreter.

How do I avoid getting blocked when scraping websites?

The main causes of blocks are too-fast request rates, missing request headers, and large-volume scraping. Add a delay between requests using time.sleep(random.uniform(1, 3)) -- randomized delays look more human than a fixed interval. Always set a User-Agent header in your requests session to identify your scraper politely: session.headers.update({"User-Agent": "MyBot/1.0 (research project)"}). Always check the site's robots.txt file before scraping -- if the path you want to scrape is listed as disallowed, respect it. For sites that require JavaScript to load content, switch to a tool like playwright or selenium instead of requests.

Why should automation scripts log to a file instead of just printing?

When a script runs unattended -- on a schedule overnight or as a background process -- there is no terminal to see the output. File logging means you can review what happened after the fact, including any errors. Python's built-in logging module automatically records timestamps, log levels (INFO, WARNING, ERROR), and full tracebacks on exceptions. Set up a RotatingFileHandler for long-running scripts to cap the log file size so it does not grow indefinitely: from logging.handlers import RotatingFileHandler, then RotatingFileHandler("script.log", maxBytes=1_000_000, backupCount=3) keeps the last 3MB of logs and discards older entries automatically.

How do I make an automation script safe to run multiple times?

Design for idempotency -- the script should produce the same result whether it runs once or ten times. For file organization scripts, this means checking if a file already exists in the destination before moving it (and using a counter suffix for duplicates, as shown in the real-life example). For database or API writes, check for existing records before inserting. For web scraping pipelines, track which pages or records have already been collected in a CSV or SQLite database, and skip them on subsequent runs. The general pattern is: check state first, act only if the desired state is not already present.

Conclusion

We have covered four practical automation categories in this guide: file and folder operations with pathlib and shutil, web data collection with requests and BeautifulSoup, scheduled jobs with the schedule library, and system command automation with subprocess. The real-life Downloads Organizer script ties these concepts together into a complete, production-ready tool with duplicate handling, configurable rules, dry-run mode, and file logging.

The best next step is to adapt the Downloads Organizer to your own situation -- add more file types, change the destination folders, or hook it into schedule to run automatically every morning. Once you have the pattern down, you will start seeing automation opportunities everywhere: renaming podcast downloads, archiving old project folders, pulling daily exchange rates from an API, or auto-committing generated reports to git.

For deeper reading, the official Python documentation covers pathlib, shutil, and subprocess in full detail. The schedule library docs have examples for every scheduling pattern you might need. And BeautifulSoup4's documentation is an excellent reference for parsing more complex HTML structures.

Continue Learning Python

Tutorials you might also find useful:

« Older Entries

Further Reading: For more details, see the Python datetime module documentation.

Pro Tips for Working with Public Holidays in Python

1. Cache Holiday Data to Avoid Repeated API Calls

If you are using the Calendarific API, cache the results locally instead of calling the API every time you check a date. Holiday lists for a given country and year rarely change. Save the API response to a JSON file and only refresh it when the year changes. This reduces API usage and makes your application faster.

# cache_holidays.py
import json
import os
from datetime import date

CACHE_FILE = "holidays_cache.json"

def get_cached_holidays(country, year):
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "r") as f:
            cache = json.load(f)
        key = f"{country}_{year}"
        if key in cache:
            print(f"Using cached holidays for {country} {year}")
            return cache[key]
    return None

def save_to_cache(country, year, holidays):
    cache = {}
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "r") as f:
            cache = json.load(f)
    cache[f"{country}_{year}"] = holidays
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f, indent=2)
    print(f"Cached {len(holidays)} holidays for {country} {year}")

Output:

Cached 11 holidays for US 2026
Using cached holidays for US 2026

2. Calculate Business Days Excluding Holidays

One of the most common real-world uses of holiday detection is calculating business days. Combine the holidays library with Python’s datetime to count only working days between two dates, excluding weekends and public holidays. This is essential for shipping estimates, SLA calculations, and payroll processing.

# business_days.py
import holidays
from datetime import date, timedelta

def business_days_between(start, end, country="US"):
    us_holidays = holidays.country_holidays(country)
    count = 0
    current = start
    while current <= end:
        if current.weekday() < 5 and current not in us_holidays:
            count += 1
        current += timedelta(days=1)
    return count

start = date(2026, 12, 20)
end = date(2026, 12, 31)
days = business_days_between(start, end)
print(f"Business days from {start} to {end}: {days}")

Output:

Business days from 2026-12-20 to 2026-12-31: 7

3. Handle Multiple Countries for International Apps

If your application serves users in different countries, check holidays for each user's country rather than assuming a single country. The holidays library supports 100+ countries. Store each user's country code and pass it when checking holidays. Remember that some countries have regional holidays too -- for example, different states in Australia or provinces in Canada have different public holidays.

4. Build a Holiday-Aware Scheduler

Many applications need to skip processing on holidays. Instead of checking manually every time, create a decorator that wraps scheduled tasks and automatically skips execution on public holidays. This is useful for automated reports, email campaigns, and batch processing jobs that should only run on business days.

# holiday_aware_scheduler.py
import holidays
from datetime import date
from functools import wraps

def skip_on_holidays(country="US"):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            today = date.today()
            if today in holidays.country_holidays(country):
                name = holidays.country_holidays(country).get(today)
                print(f"Skipping {func.__name__}: today is {name}")
                return None
            return func(*args, **kwargs)
        return wrapper
    return decorator

@skip_on_holidays("US")
def send_daily_report():
    print("Sending daily report...")
    return "Report sent"

result = send_daily_report()
print(f"Result: {result}")

Output (on a regular business day):

Sending daily report...
Result: Report sent

5. Display Upcoming Holidays for Better UX

Show your users which holidays are coming up so they can plan ahead. This is valuable for project management tools, delivery estimate pages, and HR applications. Sort the holiday list by date and filter for upcoming dates only to give users a clear view of the next few holidays.

Frequently Asked Questions

How do I check if a date is a public holiday in Python?

Use the holidays library: install it with pip install holidays, then check with date in holidays.country_holidays('US'). It returns True if the date is a recognized public holiday for that country.

What countries does the Python holidays library support?

The holidays library supports over 100 countries and their subdivisions. Major countries include the US, UK, Canada, Australia, Germany, France, India, and many more. Use holidays.list_supported_countries() to see the complete list.

Can I add custom holidays to the holidays library?

Yes. Create a custom holiday class inheriting from the country class, or use the append() method to add individual dates. You can also create entirely custom holiday calendars for company-specific or regional holidays.

How do I get the name of a holiday for a specific date?

Access the holiday name with holidays.country_holidays('US').get(date), which returns the holiday name as a string, or None if it is not a holiday. You can also iterate over the holidays object to list all holidays in a year.

Is the holidays library useful for business day calculations?

Yes. Combine it with numpy.busday_count() or pandas.bdate_range() to calculate working days excluding public holidays. This is useful for project management, payroll calculations, and delivery date estimation.

Continue Learning Python

Tutorials you might also find useful:

Post Views: 4,510

How To Detect If A Date Is A Public Holiday In Python 3

Checking For Public Holiday With Holidays Module

Checking For Holidays With API Call to Calendarific

Understanding the Calendarific REST API

Summary

Subscribe to our newsletter

How To Automate Tasks with Python: A Practical Guide

Automate Tasks with Python: Quick Example

What Is Python Automation and When Should You Use It?

Automating File and Folder Operations

Finding and Filtering Files with pathlib

Renaming and Copying Files Safely

Automating Web Data Collection

Fetching and Parsing a Web Page

Saving Scraped Data to CSV

Running Automated Jobs on a Schedule

Basic Scheduling with the schedule Library

Handling Errors in Scheduled Jobs

Running System Commands with subprocess

Basic subprocess Usage

Practical Example: Automating Git Operations

Real-Life Example: Automated Downloads Folder Organizer

Frequently Asked Questions

What is the difference between shutil and pathlib for file operations?

Should I use the schedule library or cron for scheduled tasks?

When is it safe to use subprocess with shell=True?

How do I avoid getting blocked when scraping websites?

Why should automation scripts log to a file instead of just printing?

How do I make an automation script safe to run multiple times?

Conclusion

Related Articles

Continue Learning Python

Pro Tips for Working with Public Holidays in Python

1. Cache Holiday Data to Avoid Repeated API Calls

2. Calculate Business Days Excluding Holidays

3. Handle Multiple Countries for International Apps

4. Build a Holiday-Aware Scheduler

5. Display Upcoming Holidays for Better UX

Frequently Asked Questions

How do I check if a date is a public holiday in Python?

What countries does the Python holidays library support?

Can I add custom holidays to the holidays library?

How do I get the name of a holiday for a specific date?

Is the holidays library useful for business day calculations?

Related Articles

Continue Learning Python

Submit a Comment Cancel reply