Last Updated: June 01, 2026
Beginner
The need to print when you program is of course one of the most important, and probably the very first things you ever did! This is your full guide on how to print for both python 2 and python 3).
The quickest and simplest scenario on how to print is to simply write the following:
print("Hello World")

However, there are many other variations of printing that comes up when you are coding in python. These could be printing json files, printing without a new line, printing to a log file, printing formatted text, and many more. Find below what you’re looking for in this one stop guide to printing!
Python developer and educator with 15+ years building production systems across data engineering, web APIs, and AI tooling. Founder of Python How To Program — 270+ in-depth tutorials covering the modern Python stack.
Printing without a new line
When you normally use the print(“abc”) construct it still adds a new line character. In order to print without a new line use the end parameter.
print("Hello World", end="")
Normally when printing:

See example with the print parameter:

Printing Text together
When printing items, there are often times you need to print text write next to each other, or you need concatenate text together. Concatenating text in Python is simple and can be done in several ways.
Note that in the below approach that for the method 2, there is no space between the text which is why method 3 helps to solve this problem.
text1 = 'shoe'
text2 = 'laces'
print("Method 1:", text1, text2)
print("Method 2:", text1 + text2)
print("Method 3:", text1 + ' ' + text2)
print("Method 4:", "%s %s" % ( text1, text2) )

Formatting numeric output when printing
When printing, often it’s needed to format the print output. Here’s a list of formatting scenarios.
Printing a number with a string
When printing a number, it is typically simple to do with the following statement:
counter = 5
print(counter)

The problem arises when you want to print text along the same line. You will typically get this error TypeError: unsupported operand type(s) for +: ‘int’ and ‘str’ . For example for the following:

The trick is that you can always concatenate two strings together. Hence, you simply need to convert the int (short for integer, or whole number) into a str (string).
counter = 5
print( str(counter) + ' apples' )

Padding zeros when printing numbers
When printing numbers, often you need to pad with zeros. There are multiple ways to do this, but one of the easier ways is to first convert the number to a string, and then use the zfill function of the string where you can specify how long the number should be.
#this prints the number 10 with up to 8 padded zeros
counter = 10
str(counter).zfill(8)
A more advanced example follows where we’re printing 7 numbers. Notice that for the last number where the number is more than 8 digits, that there are no padded zeros.
counter = 2
for x in range(1, 7):
print( str(counter).zfill(8) )
counter = counter * counter

Another method to pad zeros is the following method to use the format function where a zero is placed in front of the number of digits. Here the “08” refers to padding zeros for 8 digits
counter = 2
for x in range(1, 7):
print( format(counter, '08'))
counter = counter * counter

Printing with text alignment
The following can be used when you want to print a table of contents where the structure “{:>nn”.format(‘text to format’) is used. nn is the number of letters to pad.
'{:>15}'.format('text')
Without any alignment:

With alignment:

Printing complex data structures in readable format
One of the great things about python is that you can put together complex data structures fairly easily. This could be a dictionary where each dictionary item is a list. However, to print this out normally is quite difficult to read. This is where pretty print comes in. Suppose you have the following structure:

Within python, this is represented as a dictionary where the main items “furniture” and “appliances” have the sub-items. So if the data structure is “listitems”, then the data coudl be represented as follows:
listitems ={ 'furniture':[ 'desk', 'chair', 'sofa'], 'appliances':['tv', 'lamp', 'hifi']}
With this in mind, then printing of this data would be as follows:
listitems ={ 'furniture':[ 'desk', 'chair', 'sofa'], 'appliances':['tv', 'lamp', 'hifi']}
print(listitems)

This is where the import library pprint comes in. You can simply use this to print out the output in a more readable fashion. There are two important parameters though. You should use the indent parameter to specify how much space there is per element, and then width to ensure that limited items are put on a single two. If you put a width of 1 character, then that’ll ensure only show one element at most (so if a element has more than 1 character it’s ok, but you cannot include a second element in there as you’re already the 1 width limit).
import pprint;
listitems ={ 'furniture':[ 'desk', 'chair', 'sofa'], 'appliances':['tv', 'lamp', 'hifi']}
pprint.pprint(listitems, indent=1, width=1)

Printing time
Printing time is another important item that you tend to do often in case you want to monitor performance or perhaps to give an update that your long operation is still running.
Print the time
First lets simply print the current date and time
import datetime
print(datetime.datetime.now())

This date time can be easily formatted using the special function from the date object “strftime”. With strftime you can convert the format of the time quite easily to a specified format of hours, mins, seconds and date, with or without the timezone information
import datetime
currentTime = datetime.datetime.now()
print(currentTime.strftime("%Y-%m-%d"))
print(currentTime.strftime("%Y-%m-%d %H:%M:%S"))
print(currentTime.strftime("%Y-%m-%d %H:%M:%S %Z%z"))

As you can guess the Y=year, m = month, d = year, H = hour, M=minutes, S = seconds. Z = timezone. You’ll notice that for the 3rd print item the timezone is blank. We’ll address that in the next section.
Print the time in the correct timezone
However, if you are using a remote machine, or a virtual machine where your local timezone is not set, you may want to chose your own timezone. Also, if you are running services which are across different machines, it is important to make sure you use the right timezone. One simple way is to use universal time (UTC), or to simply to set a single timezone. You can then convert as required.
import datetime
import pytz #include the timezone module
currentTime = datetime.datetime.now( pytz.timezone('UTC') )
print(currentTime.strftime("%Y-%m-%d"))
print(currentTime.strftime("%Y-%m-%d %H:%M:%S"))
print(currentTime.strftime("%Y-%m-%d %H:%M:%S %Z%z"))

Please note in the above example that when the timezone format was shown it showed that the timezone was set to UTC+0000 unlike the previous example. This means that the timezone information was present there.
In the following code, we will first get the time in the UTC timezone, and then convert the time to Hong Kong timezone.
import datetime
import pytz
currentTime = datetime.datetime.now( pytz.timezone('UTC') )
print("Time 1 (UTC time):", currentTime.strftime("%Y-%m-%d %H:%M:%S %Z%z"))
now_local = currentTime.astimezone(pytz.timezone('Asia/Hong_Kong'))
print("Time 2a(HK time) :", now_local.strftime("%Y-%m-%d %H:%M:%S %Z%z"))
print("Time 2b(HK time) :", now_local.strftime("%Y-%m-%d %H:%M:%S "))

Please note that in the “Time 2a” output, you can see the Hong Kong time as 2am with the timezone indicator at the end of +8 hours. The final “Time 2b” is the same time without the timezone included.
Finally, you can get a list of all the timezones available with a quick check on the pytz module and checking “all_timezones”.
import pytz
for tz in pytz.all_timezones:
print(tz)

How to print an exception
Things will go wrong in your code all the time – especially things that you don’t expect. This is where exceptions come in where the try exception blocks fit in quite nicely. The tricky part is that you need to make sure you output what the exception is in order for you to understand what’s going on.
Firstly a quick example of where a try /except can be helpful. Suppose you had the following code where after the definition of the function, the function was called.
def badFunction():
print(a) #print an undefined function
badFunction()#call the functionprint("have a nice day")

In here, as the variable “a” was not defined, then the program terminated and the final line “have a nice day” was never printed.
This is where try/except blocks can come in where you can catch errors from uncertain actions. So you can wrap the “badfunction” in a try block. See following example:
def badFunction():
print(a)
#Try the unsafe code
try:
badFunction()
except NameError:
print("Variable x is not defined")
except:
print("Something else went wrong")
print("have a nice day")

Here, the program continued to run gracefully and it caught the exception with the error “Variable x is not defined”. The reason it was caught was due to “NameError” exception object being defined.
In this next example, we have put a different error. Now the variable is defined as a number but there will be an exception as the number will be concatenated to a string.
def badFunction():
a = 1
print(a + ' join str') #this will fail as joining a string with a number
#try the unsafe code
try:
badFunction()
except NameError:
print("Variable x is not defined")
except:
print("Something else went wrong")
print("have a nice day")

Here another exception was caught but with the generic message of “Something else went wrong”. This is where printing the actual exception is really important. This is where you can define the exception object and print out the error.
ef badFunction():
a = 1
print(a + ' join str')
try:
badFunction()
except NameError:
print("Variable x is not defined")
except Exception as e:
print(e) #print the exception object print("have a nice day")

Here you can see that the reason for the failure was included, and the program continued to run.
There’s a final improvement we can make which is to include where the problem occurred. This is really important where you have logging defined and you can see where the issue was caused.
import traceback
def badFunction():
a = 1
print(a + ' join str')
#run the unsafe code
try:
badFunction()
except NameError:
print("Variable x is not defined")
except Exception as e:
print(e)
traceback.print_tb(e.__traceback__) #show the call list
print("have a nice day")

Here you can see the error description “Unsupported operand type(s) for +”, and then also where the error occurred from the initial call on line 8 with the call to “badFunction()” and the actual offending line of line 5.
Many more printing on python
There’s many more ways to print outputs within python, however this was intended to be a simple resource for some of the common printing challenges that come up, how you can use them, and with a simple example to get you up to speed very quickly with usable code. More to come!
Subscribe to our newsletter
How To Automate Tasks with Python: A Practical Guide
Last Updated: June 14, 2026
- Automate Tasks with Python: Quick Example
- What Is Python Automation and When Should You Use It?
- Automating File and Folder Operations
- Automating Web Data Collection
- Running Automated Jobs on a Schedule
- Running System Commands with subprocess
- Real-Life Example: Automated Downloads Folder Organizer
- Frequently Asked Questions
- Conclusion
- Related Articles
Beginner to Intermediate
You have a folder full of downloaded files named document(1).pdf, screenshot_2024.png, and report_final_v3_FINAL.xlsx. Every week you spend 20 minutes sorting them by hand. Or maybe you copy data from a website into a spreadsheet every morning, or you run the same three terminal commands every time you start work. These are the tasks Python was built to eliminate. With a few dozen lines of code you can turn a painful weekly chore into something that runs itself while you drink coffee.
Python ships with a rich standard library for automation — pathlib for file operations, subprocess for running system commands, smtplib for sending email — and the broader ecosystem adds libraries like schedule for periodic jobs and requests for web data collection. No special setup is needed beyond a standard Python 3.8+ install. For scheduling, you will need to install schedule with pip, but everything else in this article is built in.
In this guide we will cover four practical automation categories: organizing files and folders with pathlib and shutil, collecting web data with requests and BeautifulSoup, scheduling jobs to run automatically with the schedule library, and running system commands with subprocess. Each section ends with working code you can adapt to your own situation. By the end you will have a toolkit of reusable automation patterns and a complete script that organizes a messy downloads folder automatically.
Python developer and educator with 15+ years building production systems across data engineering, web APIs, and AI tooling. Founder of Python How To Program.
Automate Tasks with Python: Quick Example
Here is a self-contained script that renames and moves files in a folder based on their extension — one of the most common automation tasks you will ever write:
# sort_downloads.py
from pathlib import Path
import shutil
FOLDER = Path.home() / "Downloads"
DESTINATIONS = {
".pdf": "Documents",
".png": "Images",
".jpg": "Images",
".xlsx": "Spreadsheets",
".csv": "Spreadsheets",
".zip": "Archives",
}
for file in FOLDER.iterdir():
if file.is_file() and file.suffix in DESTINATIONS:
dest_folder = FOLDER / DESTINATIONS[file.suffix]
dest_folder.mkdir(exist_ok=True)
shutil.move(str(file), dest_folder / file.name)
print(f"Moved: {file.name} -> {DESTINATIONS[file.suffix]}/")
Output (example):
Moved: invoice_march.pdf -> Documents/
Moved: screenshot_2024.png -> Images/
Moved: sales_report.xlsx -> Spreadsheets/
The script uses Path.home() to get the user’s home directory regardless of operating system, iterdir() to loop over every item in the folder, and shutil.move() to relocate the file. The mkdir(exist_ok=True) call creates the destination folder if it does not already exist — no crash if it is there, no error if it is not. We will build a more complete version in the real-life example section, including duplicate detection and logging.
What Is Python Automation and When Should You Use It?
Automation means writing code that performs a repetitive task so you do not have to. The rule of thumb is: if you have done something manually more than three times, it is worth automating. Python is the go-to language for automation because it has concise syntax, a massive standard library, and third-party packages that cover almost every automation use case out of the box.
The table below maps common repetitive tasks to the Python tools that handle them:
| Task Type | Python Tool | When to Use |
|---|---|---|
| File/folder operations | pathlib, shutil | Renaming, moving, copying, deleting files |
| Reading/writing files | built-in open(), csv | Log parsing, report generation, data transformation |
| Web data collection | requests, BeautifulSoup | Pulling prices, headlines, tables from websites |
| Scheduled jobs | schedule, APScheduler | Running tasks daily, hourly, or on a cron-like schedule |
| System commands | subprocess | Running CLI tools, shell scripts, git, ffmpeg |
| Email sending | smtplib, yagmail | Automated reports, alerts, notifications |
A good automation script is idempotent — running it twice produces the same result as running it once. It handles edge cases (missing files, network errors, duplicate names) without crashing. And it logs what it did so you can review the results later. Keep these principles in mind as we work through each section.
Automating File and Folder Operations
The pathlib module (Python 3.4+) provides an object-oriented interface for working with file paths that is far more readable than the older os.path approach. Combined with shutil for copy/move operations, these two modules cover 90% of file automation tasks.
Finding and Filtering Files with pathlib
The glob() and rglob() methods on a Path object let you find files matching a pattern across an entire directory tree. glob() searches one level deep; rglob() (recursive glob) searches all subdirectories:
# find_files.py
from pathlib import Path
base = Path("/tmp/project") # change to your actual folder
# Find all Python files in this folder only
py_files = list(base.glob("*.py"))
print("Python files (top-level):", [f.name for f in py_files])
# Find all log files anywhere in the tree
log_files = list(base.rglob("*.log"))
print("Log files (all depths):", [f.relative_to(base) for f in log_files])
# Find files larger than 1 MB
large_files = [f for f in base.rglob("*") if f.is_file() and f.stat().st_size > 1_000_000]
print("Files over 1MB:", [f.name for f in large_files])
Output (example):
Python files (top-level): ['app.py', 'utils.py', 'config.py']
Log files (all depths): [PosixPath('logs/app.log'), PosixPath('logs/error.log')]
Files over 1MB: ['dataset.csv', 'backup.tar.gz']
f.stat().st_size returns the file size in bytes. The expression f.relative_to(base) strips the base directory from the path so you see logs/app.log instead of the full absolute path. Both are useful for building reports of what your script found before it starts moving anything.
Renaming and Copying Files Safely
Before moving or renaming files in an automation script, always check whether the destination already exists. Blindly overwriting a file can cause data loss that is impossible to reverse:
# safe_copy.py
from pathlib import Path
import shutil
def safe_copy(src: Path, dest_dir: Path) -> Path:
"""Copy src into dest_dir, appending a counter if the name already exists."""
dest_dir.mkdir(parents=True, exist_ok=True)
dest = dest_dir / src.name
if dest.exists():
counter = 1
while dest.exists():
stem = src.stem
dest = dest_dir / f"{stem}_{counter}{src.suffix}"
counter += 1
shutil.copy2(str(src), dest) # copy2 preserves metadata (timestamps)
return dest
# Demo
src_file = Path("/tmp/report.pdf")
src_file.write_text("dummy content") # create test file
result = safe_copy(src_file, Path("/tmp/archive"))
print(f"Copied to: {result}")
result2 = safe_copy(src_file, Path("/tmp/archive")) # simulate duplicate
print(f"Duplicate handled: {result2}")
Output:
Copied to: /tmp/archive/report.pdf
Duplicate handled: /tmp/archive/report_1.pdf
The shutil.copy2() function copies the file content AND preserves the original modification time, which is important when you want archive copies to retain their original dates. The counter loop ensures you never silently overwrite an existing file — a critical safety net for any file automation script.
Automating Web Data Collection
Web scraping lets your scripts pull data from websites automatically. The standard approach uses requests to download HTML and BeautifulSoup to parse it. Install both with pip install requests beautifulsoup4.
Fetching and Parsing a Web Page
We will use quotes.toscrape.com, a site built specifically for scraping practice. It serves reliable HTML with a stable structure, so this code will continue to work without modification.
Here is the HTML structure of each quote on that page, so you can see exactly what the selectors are targeting:
<!-- HTML structure of each quote on quotes.toscrape.com -->
<div class="quote">
<span class="text">"The world as we have created it..."</span>
<span>
by <small class="author">Albert Einstein</small>
</span>
<div class="tags">
<a class="tag" href="/tag/change/page/1/">change</a>
</div>
</div>
Now the scraping code:
# scrape_quotes.py
import requests
from bs4 import BeautifulSoup
def scrape_quotes(url: str) -> list[dict]:
response = requests.get(url, timeout=10)
response.raise_for_status() # raises HTTPError for 4xx/5xx responses
soup = BeautifulSoup(response.text, "html.parser")
results = []
for quote_div in soup.select("div.quote"):
text_elem = quote_div.select_one("span.text")
author_elem = quote_div.select_one("small.author")
tag_elems = quote_div.select("a.tag")
# Defensive: check before accessing .text
text = text_elem.text.strip() if text_elem else "Unknown"
author = author_elem.text.strip() if author_elem else "Unknown"
tags = [t.text for t in tag_elems]
results.append({"quote": text, "author": author, "tags": tags})
return results
quotes = scrape_quotes("https://quotes.toscrape.com")
for q in quotes[:3]:
print(f'{q["author"]}: {q["quote"][:60]}...')
print(f' Tags: {", ".join(q["tags"])}')
print()
Output:
Albert Einstein: "The world as we have created it is a process of...
Tags: change, deep-thoughts, thinking, world
J.K. Rowling: "It is our choices, Harry, that show what we truly a...
Tags: abilities, choices
Albert Einstein: "There are only two ways to live your life. One is...
Tags: inspirational, life, live, miracle, miracles
response.raise_for_status() is a one-line safety net — it raises a requests.HTTPError if the server returns a 4xx or 5xx status code instead of silently continuing with bad data. The defensive checks (text_elem.text if text_elem else "Unknown") protect against pages where an element is missing, which happens constantly on real-world sites.
Saving Scraped Data to CSV
Collecting data is only half the job — you need to store it somewhere useful. Writing to CSV with Python’s built-in csv module keeps the output format-agnostic and readable in any spreadsheet application:
# save_to_csv.py
import csv
import requests
from bs4 import BeautifulSoup
def scrape_quotes(url):
resp = requests.get(url, timeout=10)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")
results = []
for div in soup.select("div.quote"):
text = div.select_one("span.text")
author = div.select_one("small.author")
results.append({
"quote": text.text.strip() if text else "",
"author": author.text.strip() if author else "",
})
return results
quotes = scrape_quotes("https://quotes.toscrape.com")
output_file = "quotes.csv"
with open(output_file, "w", newline="", encoding="utf-8") as f:
writer = csv.DictWriter(f, fieldnames=["author", "quote"])
writer.writeheader()
writer.writerows(quotes)
print(f"Saved {len(quotes)} quotes to {output_file}")
Output:
Saved 10 quotes to quotes.csv
Always pass encoding="utf-8" when writing CSV files — without it, non-ASCII characters (curly quotes, accented letters, em-dashes) will cause encoding errors or garbled output on Windows. The newline="" argument is also required by Python’s csv module to prevent extra blank lines on Windows.
Running Automated Jobs on a Schedule
Writing the automation code is only half the job. You also need it to run at the right time, without you having to remember to start it. The schedule library provides a clean Python API for defining when jobs should run — every minute, every day at 9am, every Monday, and so on. Install it with pip install schedule.
Basic Scheduling with the schedule Library
The schedule library works with a simple event loop: you register jobs with schedule.every(), then call schedule.run_pending() in a loop to check whether any jobs are due:
# basic_schedule.py
import schedule
import time
from datetime import datetime
def morning_report():
print(f"[{datetime.now():%H:%M:%S}] Good morning! Running daily report...")
# your actual report logic goes here
def hourly_check():
print(f"[{datetime.now():%H:%M:%S}] Hourly check complete.")
# Schedule the jobs
schedule.every().day.at("09:00").do(morning_report)
schedule.every().hour.do(hourly_check)
schedule.every(30).minutes.do(lambda: print("30-min heartbeat"))
print("Scheduler running. Press Ctrl+C to stop.")
while True:
schedule.run_pending()
time.sleep(30) # check every 30 seconds to save CPU
Output (example at 09:00:00):
Scheduler running. Press Ctrl+C to stop.
[09:00:00] Good morning! Running daily report...
[09:00:00] Hourly check complete.
[09:00:00] 30-min heartbeat
[09:30:00] 30-min heartbeat
[10:00:00] Hourly check complete.
The time.sleep(30) inside the loop is important — without a sleep, the loop burns 100% of one CPU core doing nothing. Sleeping 30 seconds means jobs can fire up to 30 seconds late (acceptable for most automation), while using negligible CPU. If you need second-level precision, use time.sleep(1) instead.
Handling Errors in Scheduled Jobs
When a job in a scheduled loop raises an unhandled exception, the whole process crashes and no more jobs run. Wrap your job functions with a try/except to log errors and keep the loop running:
# robust_schedule.py
import schedule
import time
import logging
from datetime import datetime
logging.basicConfig(
filename="automation.log",
level=logging.INFO,
format="%(asctime)s %(levelname)s %(message)s",
)
def run_safely(job_func):
"""Decorator that catches exceptions and logs them without crashing the loop."""
def wrapper():
try:
job_func()
logging.info(f"{job_func.__name__} completed successfully")
except Exception as exc:
logging.error(f"{job_func.__name__} failed: {exc}", exc_info=True)
return wrapper
def daily_scrape():
# Simulating a job that sometimes fails
import random
if random.random() < 0.3:
raise ConnectionError("Network unavailable")
print("Scraped data successfully.")
schedule.every().day.at("08:00").do(run_safely(daily_scrape))
while True:
schedule.run_pending()
time.sleep(60)
The run_safely decorator wraps any function so that exceptions are caught, logged to a file, and the scheduler continues. The exc_info=True argument tells Python's logging module to include the full traceback in the log file -- essential for debugging failures that happen at 3am while you are asleep.
Running System Commands with subprocess
Sometimes the right tool for a job is a command-line program, not a Python library. subprocess.run() lets your Python script launch any system command and capture its output, making it easy to orchestrate CLI tools like git, ffmpeg, or database utilities.
Basic subprocess Usage
# run_commands.py
import subprocess
# Run a command and capture its output
result = subprocess.run(
["python3", "--version"],
capture_output=True,
text=True, # decode bytes to str automatically
check=False, # don't raise on non-zero exit code
)
print("Return code:", result.returncode)
print("Stdout:", result.stdout.strip())
print("Stderr:", result.stderr.strip())
# Run a command that lists files (works on macOS/Linux)
ls_result = subprocess.run(
["ls", "-la", "/tmp"],
capture_output=True,
text=True,
)
print("\nFirst 3 lines of /tmp listing:")
for line in ls_result.stdout.strip().split("\n")[:3]:
print(" ", line)
Output:
Return code: 0
Stdout: Python 3.11.4
Stderr:
First 3 lines of /tmp listing:
total 0
drwxrwxrwt 20 root wheel 640 Jun 12 09:31 .
drwxr-xr-x 20 root wheel 640 May 18 11:02 ..
Always use a list for the command argument (["ls", "-la", "/tmp"]) rather than a string ("ls -la /tmp"). The list form avoids shell injection vulnerabilities -- if any part of the command comes from user input, a string passed to shell=True can execute arbitrary shell commands. The list form is always safer.
Practical Example: Automating Git Operations
Here is a real-world use case -- a script that automatically stages, commits, and pushes changes in a git repository. This is useful for automating backup commits or syncing generated files:
# git_autocommit.py
import subprocess
from datetime import datetime
from pathlib import Path
def git_run(args: list[str], cwd: Path) -> subprocess.CompletedProcess:
"""Run a git command in the specified directory."""
return subprocess.run(
["git"] + args,
capture_output=True,
text=True,
cwd=cwd,
)
def auto_commit(repo_path: Path, message: str = None) -> bool:
"""Stage all changes and commit if there is anything to commit."""
# Check for changes
status = git_run(["status", "--porcelain"], repo_path)
if not status.stdout.strip():
print("No changes to commit.")
return False
if message is None:
message = f"Auto-commit {datetime.now():%Y-%m-%d %H:%M}"
# Stage all changes
git_run(["add", "-A"], repo_path)
# Commit
commit = git_run(["commit", "-m", message], repo_path)
if commit.returncode == 0:
print(f"Committed: {message}")
return True
else:
print(f"Commit failed: {commit.stderr.strip()}")
return False
# Usage (change to your actual repo path)
repo = Path("/tmp/my-project")
repo.mkdir(exist_ok=True)
auto_commit(repo, "Automated daily backup")
Output (when changes exist):
Committed: Automated daily backup
The helper function git_run() takes the git subcommand as a list and prepends "git", keeping the calling code clean. The cwd=cwd argument tells subprocess where to run the command -- without it, git would operate on whatever directory the script itself lives in, which is almost never what you want.
Real-Life Example: Automated Downloads Folder Organizer
We will now build a complete, production-ready script that watches your Downloads folder and organizes files into subfolders by type. It handles duplicates, logs every action, and can be scheduled to run automatically.
# downloads_organizer.py
import shutil
import logging
from pathlib import Path
from datetime import datetime
# --- Configuration ---
DOWNLOADS_DIR = Path.home() / "Downloads"
LOG_FILE = Path.home() / "downloads_organizer.log"
RULES = {
"Documents": [".pdf", ".doc", ".docx", ".txt", ".rtf"],
"Images": [".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp", ".heic"],
"Videos": [".mp4", ".mov", ".mkv", ".avi", ".m4v"],
"Audio": [".mp3", ".m4a", ".flac", ".wav", ".aac"],
"Archives": [".zip", ".tar", ".gz", ".rar", ".7z"],
"Code": [".py", ".js", ".html", ".css", ".json", ".sh", ".ipynb"],
"Data": [".csv", ".xlsx", ".xls", ".tsv", ".parquet"],
}
# Build reverse lookup: extension -> folder name
EXT_MAP = {ext: folder for folder, exts in RULES.items() for ext in exts}
# --- Logging setup ---
logging.basicConfig(
filename=LOG_FILE,
level=logging.INFO,
format="%(asctime)s %(levelname)s %(message)s",
)
def unique_dest(dest: Path) -> Path:
"""Append a counter to avoid overwriting existing files."""
if not dest.exists():
return dest
counter = 1
while True:
candidate = dest.parent / f"{dest.stem}_{counter}{dest.suffix}"
if not candidate.exists():
return candidate
counter += 1
def organize(dry_run: bool = False) -> dict:
"""Move files from Downloads into categorized subfolders."""
stats = {"moved": 0, "skipped": 0, "unknown": 0}
for item in DOWNLOADS_DIR.iterdir():
if not item.is_file():
continue
folder_name = EXT_MAP.get(item.suffix.lower())
if folder_name is None:
logging.info(f"SKIP (unknown type): {item.name}")
stats["unknown"] += 1
continue
dest_dir = DOWNLOADS_DIR / folder_name
dest = unique_dest(dest_dir / item.name)
if dry_run:
print(f"[DRY RUN] Would move: {item.name} -> {folder_name}/")
else:
dest_dir.mkdir(exist_ok=True)
shutil.move(str(item), dest)
logging.info(f"MOVED: {item.name} -> {folder_name}/{dest.name}")
stats["moved"] += 1
return stats
if __name__ == "__main__":
print(f"Organizing {DOWNLOADS_DIR} ...")
result = organize(dry_run=False)
summary = f"Done. Moved: {result['moved']}, Unknown: {result['unknown']}"
print(summary)
logging.info(summary)
Output (example):
Organizing /Users/alice/Downloads ...
Done. Moved: 12, Unknown: 3
The dry_run=True mode lets you preview what the script would do without actually moving anything -- run it first to confirm the output looks right. The unique_dest() function guarantees you never silently overwrite a file with the same name in the destination folder. You can schedule this script to run daily using the schedule library covered earlier, or on macOS/Linux you can add it to your crontab with crontab -e for OS-level scheduling without a Python process running continuously.
Frequently Asked Questions
What is the difference between shutil and pathlib for file operations?
pathlib is for path manipulation and simple operations: checking if a file exists, reading its metadata, renaming it within the same filesystem. shutil is for heavier operations: copying files (with or without metadata), moving files across filesystems, and deleting entire directory trees. In practice you often use both -- pathlib to build and check paths, shutil to actually move or copy the files. Use shutil.move() for moves (it handles cross-filesystem moves gracefully) and shutil.copy2() when you want to preserve file modification times.
Should I use the schedule library or cron for scheduled tasks?
It depends on your setup. The schedule library is pure Python and works identically on Windows, macOS, and Linux -- great for scripts you want to be portable or that run within an existing Python process. Cron (on Linux/macOS) and Task Scheduler (on Windows) are OS-level schedulers that are more reliable for long-running production tasks because they survive reboots automatically and do not require a Python process to stay running. For personal automation scripts on a development machine, schedule is simpler. For server deployments, lean on cron or a process manager like systemd.
When is it safe to use subprocess with shell=True?
Use shell=True only when the entire command is a string literal you control completely -- for example, a hardcoded one-liner like subprocess.run("ls -la /tmp | wc -l", shell=True). Never pass user input, command-line arguments, or any external data into a shell=True command string; doing so opens a shell injection vulnerability where an attacker can execute arbitrary commands. The list form (["ls", "-la", "/tmp"]) is safe with external data because each list element is passed directly to the OS without going through a shell interpreter.
How do I avoid getting blocked when scraping websites?
The main causes of blocks are too-fast request rates, missing request headers, and large-volume scraping. Add a delay between requests using time.sleep(random.uniform(1, 3)) -- randomized delays look more human than a fixed interval. Always set a User-Agent header in your requests session to identify your scraper politely: session.headers.update({"User-Agent": "MyBot/1.0 (research project)"}). Always check the site's robots.txt file before scraping -- if the path you want to scrape is listed as disallowed, respect it. For sites that require JavaScript to load content, switch to a tool like playwright or selenium instead of requests.
Why should automation scripts log to a file instead of just printing?
When a script runs unattended -- on a schedule overnight or as a background process -- there is no terminal to see the output. File logging means you can review what happened after the fact, including any errors. Python's built-in logging module automatically records timestamps, log levels (INFO, WARNING, ERROR), and full tracebacks on exceptions. Set up a RotatingFileHandler for long-running scripts to cap the log file size so it does not grow indefinitely: from logging.handlers import RotatingFileHandler, then RotatingFileHandler("script.log", maxBytes=1_000_000, backupCount=3) keeps the last 3MB of logs and discards older entries automatically.
How do I make an automation script safe to run multiple times?
Design for idempotency -- the script should produce the same result whether it runs once or ten times. For file organization scripts, this means checking if a file already exists in the destination before moving it (and using a counter suffix for duplicates, as shown in the real-life example). For database or API writes, check for existing records before inserting. For web scraping pipelines, track which pages or records have already been collected in a CSV or SQLite database, and skip them on subsequent runs. The general pattern is: check state first, act only if the desired state is not already present.
Conclusion
We have covered four practical automation categories in this guide: file and folder operations with pathlib and shutil, web data collection with requests and BeautifulSoup, scheduled jobs with the schedule library, and system command automation with subprocess. The real-life Downloads Organizer script ties these concepts together into a complete, production-ready tool with duplicate handling, configurable rules, dry-run mode, and file logging.
The best next step is to adapt the Downloads Organizer to your own situation -- add more file types, change the destination folders, or hook it into schedule to run automatically every morning. Once you have the pattern down, you will start seeing automation opportunities everywhere: renaming podcast downloads, archiving old project folders, pulling daily exchange rates from an API, or auto-committing generated reports to git.
For deeper reading, the official Python documentation covers pathlib, shutil, and subprocess in full detail. The schedule library docs have examples for every scheduling pattern you might need. And BeautifulSoup4's documentation is an excellent reference for parsing more complex HTML structures.
Related Articles
Continue Learning Python
Tutorials you might also find useful:
Further Reading: For more details, see the Python print() function documentation.
Frequently Asked Questions
What does \n do in Python print statements?
The \n escape sequence creates a newline character, causing text after it to appear on the next line. For example, print('Hello\nWorld') outputs ‘Hello’ and ‘World’ on separate lines.
How do I print multiple lines without using \n?
You can use triple-quoted strings (''' or """) to write multi-line text directly, or call print() multiple times. The textwrap.dedent() function also helps format multi-line strings cleanly.
What is a format exception in Python?
A format exception (typically a ValueError) occurs when a format string and its arguments do not match. For example, using the wrong number of placeholders in str.format() or mismatched types in f-strings.
How do I use f-strings for text formatting in Python?
F-strings (formatted string literals) use the syntax f'text {variable}' and were introduced in Python 3.6. They allow you to embed expressions directly inside string literals for readable, efficient formatting.
What is the difference between print() and sys.stdout.write()?
print() adds a newline by default and accepts multiple arguments with separators. sys.stdout.write() writes raw text without any automatic newline, giving you more control over output formatting.
Related Articles
- How To Print String in Color in Python
- Python Type Hints for Better Code
- How To Read and Write JSON Files
Continue Learning Python
Tutorials you might also find useful:
- How To Use Python textwrap Module for Text Formatting
- How To Format Python Code with Black and isort
- How To Use Python difflib for Comparing Text and Sequences
- How To Use Python tabulate for Pretty-Printing Tables
- Simple Guide To Markov Chain Text Generator in Python 3
- Reading and writing text to files in Python