
Introduction: Why You Need to Know subprocess

Whether you’re automating a deployment pipeline, running system diagnostics, managing file compression, or orchestrating complex DevOps workflows, you’ll eventually need to run shell commands from Python. The subprocess module is your safe passage to that world. It’s one of the most powerful—and most misused—tools in Python’s standard library. Many developers reach for quick fixes like os.system() or hand-rolled shell strings that are vulnerable to injection, not realizing that subprocess exists to solve exactly this problem with security, control, and elegance.

Here’s the great news: you don’t need to pip install anything. The subprocess module is part of Python’s standard library, available in every Python 3.x installation. This means you can use it immediately in any environment, whether it’s your laptop, a Docker container, or a cloud function. It’s been battle-tested for decades and is the recommended way to spawn child processes from Python code.

In this article, we’ll walk through everything you need to know—from your first simple command all the way to advanced patterns like piping, timeouts, and deployment automation. By the end, you’ll understand not just how to use subprocess, but when to use each function, how to handle errors gracefully, and how to avoid the security pitfalls that catch even experienced developers.

Quick Example: Your First subprocess Command

Let’s get something working right away so you can see how simple it is:

# quick_example.py
import subprocess

result = subprocess.run(['ls', '-la'], capture_output=True, text=True)
print(result.stdout)
print(f"Return code: {result.returncode}")

Output:

total 48
drwxr-xr-x  5 user staff   160 Mar 14 10:42 .
drwxr-xr-x 12 user staff   384 Mar 14 09:15 ..
-rw-r--r--  1 user staff  1234 Mar 14 10:40 quick_example.py
-rw-r--r--  1 user staff  5678 Mar 14 10:35 deployment.py
Return code: 0

That’s it. You’ve just executed a shell command from Python. The subprocess.run() function took a list of arguments (the command and its flags), captured the output as text, and returned a result object with the stdout and return code. A return code of 0 means success. If the command had failed, you’d see a non-zero value. In the sections below, we’ll explore what each parameter does, how to handle errors, and when you need more advanced tools like Popen.

What is subprocess and Why Use It?

The subprocess module is Python’s official way to spawn and manage child processes. It replaces older, less safe approaches like os.system() and os.popen(). When you call a subprocess function, Python creates a new process that runs independently, typically executing a shell command or external program. This is essential when you need to integrate with system utilities, run compiled binaries, execute scripts in other languages, or automate tasks that are more natural to express as shell commands than Python code.

Why not just use os.system()? Because os.system() invokes the shell directly, making your code vulnerable to shell injection attacks if user input isn’t carefully sanitized. subprocess lets you pass arguments as a list, bypassing the shell entirely by default. This means you can safely use untrusted input without worrying about command injection vulnerabilities.
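
To make the contrast concrete, here’s a minimal sketch of the two styles side by side (the filename value is a hypothetical piece of hostile input, invented for this example):

# os_system_vs_subprocess.py (illustrative sketch)
import os
import subprocess

filename = "notes.txt; rm -rf ~"  # hypothetical hostile input

# os.system() hands the whole string to the shell, so the semicolon acts as a
# command separator and the destructive part would actually run:
# os.system(f"echo {filename}")   # DON'T do this with untrusted input

# subprocess.run() with a list passes the value as one literal argument;
# no shell ever parses it, so the semicolon has no special meaning.
subprocess.run(['echo', filename])  # prints the literal text, nothing is deleted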

Let’s compare the three main approaches:

| Feature | subprocess.run() | Popen | os.system() |
| --- | --- | --- | --- |
| Simple, one-shot commands | ✓ Perfect | Overkill | Unsafe |
| Capture output | ✓ Easy | Manual pipes | Workaround needed |
| Shell injection safe | ✓ By default | ✓ By default | Only with shlex |
| Advanced features (pipes, streaming) | Limited | ✓ Full control | N/A |
| Timeout support | ✓ Built-in | Manual logic | No |
| Return code access | ✓ Direct | ✓ Direct | ✓ Direct |

For 90% of your use cases, subprocess.run() is exactly what you need. It’s simple, secure, and handles the common scenarios. Popen is for situations where you need more control—like streaming large outputs or keeping a process alive. And os.system()? You can almost always skip it.

Running Your First Command with subprocess.run()

The subprocess.run() function is your primary tool. It takes a command as a list, executes it, waits for it to complete, and returns a CompletedProcess object with information about the result. Let’s look at the anatomy of this function with a practical example.

The first argument to subprocess.run() should be a list of strings representing the command and its arguments. Unless you deliberately opt into shell=True (covered later), don’t pass a single string—that’s a common mistake that leads to subtle bugs. Each element of the list is one token: the program name, then each flag, then each argument.

# run_commands.py
import subprocess

# Run a simple echo command
result = subprocess.run(['echo', 'Hello from subprocess'], text=True, capture_output=True)
print("STDOUT:", result.stdout)
print("Return code:", result.returncode)

# Run a command that prints the current working directory
result2 = subprocess.run(['pwd'], text=True, capture_output=True)
print("\nCurrent directory:", result2.stdout.strip())

Output:

STDOUT: Hello from subprocess
Return code: 0

Current directory: /home/developer/project

Notice the text=True parameter—this tells subprocess to return stdout and stderr as strings instead of bytes. Without it, you’d get bytes objects that you’d need to decode manually. The capture_output=True parameter captures both stdout and stderr into the result object. These two parameters together create the most convenient API for most situations. The returncode attribute tells you whether the command succeeded (0) or failed (non-zero).
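
If you’re curious what the bytes variant looks like, here’s a minimal sketch of the same call with and without text=True:

# bytes_vs_text.py (illustrative sketch)
import subprocess

raw = subprocess.run(['echo', 'hello'], capture_output=True)
print(raw.stdout)       # b'hello\n'  -- bytes, note the b prefix

decoded = subprocess.run(['echo', 'hello'], capture_output=True, text=True)
print(decoded.stdout)   # hello       -- a normal string you can split and search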

Capturing Command Output

Often you don’t just want to run a command—you want to see what it produced. The capture_output and text parameters work together to make this straightforward. When you set capture_output=True, subprocess collects the command’s output instead of letting it print to your console. The text=True parameter ensures that output comes back as a string you can work with easily.

Let’s capture the output of a real command and process it:

# capture_output.py
import subprocess

# Get the list of files and process the output
result = subprocess.run(['ls', '-l', '/tmp'], capture_output=True, text=True)

lines = result.stdout.split('\n')
print(f"Directory listing has {len(lines) - 1} entries")

# Print just the first few lines
for line in lines[:5]:
    if line.strip():
        print(line)

# Check if there were any errors
if result.stderr:
    print(f"Warnings: {result.stderr}")

Output:

Directory listing has 12 entries
total 48
drwxrwxrwt 12 root root 4096 Mar 14 10:42 .
drwxr-xr-x 15 root root 4096 Mar 14 09:15 ..
-rw-r--r--  1 user user 1024 Mar 14 09:10 temp_file.txt

Here we’ve captured the output from ls -l /tmp, split it into lines, and processed it as Python strings. Notice we also checked the stderr attribute—this contains any error messages the command produced. Many commands send warnings or non-critical messages to stderr while their main output goes to stdout. Separating them gives you the flexibility to handle each independently.

Handling Errors and Return Codes

Every command returns a status code that tells you whether it succeeded or failed. A return code of 0 means success; any non-zero value indicates an error. By default, subprocess.run() doesn’t raise an exception when a command fails—it just gives you the return code. This is actually a design feature because sometimes a non-zero exit code doesn’t mean failure in the user-facing sense. But usually, you want to know when something goes wrong.

Let’s see how to check return codes and decide what to do about failures:

# check_return_codes.py
import subprocess

# Try to run a command that will fail
result = subprocess.run(['grep', 'nonexistent_pattern', '/etc/hostname'],
                       capture_output=True, text=True)

print(f"Return code: {result.returncode}")
print(f"STDOUT: {result.stdout}")
print(f"STDERR: {result.stderr}")

# Manual error handling
if result.returncode != 0:
    print(f"Command failed with exit code {result.returncode}")

# For grep, non-zero means "pattern not found" which is normal
if result.returncode == 0:
    print(f"Pattern found: {result.stdout}")
else:
    print("Pattern not found (exit code 1 is normal for grep)")

Output:

Return code: 1
STDOUT:
STDERR:
Command failed with exit code 1
Pattern not found (exit code 1 is normal for grep)

This example shows that grep returns 1 when the pattern isn’t found. Your code needs to understand what each return code means for the specific command you’re running. For some commands (like grep), non-zero codes aren’t really errors. For others, they indicate genuine problems. The next section shows a more automatic way to handle this.

Using check=True for Automatic Error Handling

Most of the time, if a command fails (returns non-zero), you want your Python script to stop immediately rather than continue with bad data. The check=True parameter makes this automatic. When you set check=True, subprocess will raise a CalledProcessError exception if the command returns a non-zero exit code. This lets you use Python’s normal exception handling:

# error_handling.py
import subprocess

try:
    # This command will fail
    result = subprocess.run(['false'], check=True, text=True, capture_output=True)
except subprocess.CalledProcessError as error:
    print(f"Command failed with exit code {error.returncode}")
    print(f"Command was: {error.cmd}")

try:
    # This command succeeds
    result = subprocess.run(['echo', 'Success!'], check=True, text=True, capture_output=True)
    print(f"Output: {result.stdout.strip()}")
except subprocess.CalledProcessError as error:
    print(f"Unexpected failure: {error}")

Output:

Command failed with exit code 1
Command was: ['false']
Output: Success!

The check=True parameter transforms subprocess.run() into a fail-fast tool. If anything goes wrong, an exception is raised immediately, stopping your script. This is the safer default for most scripts. You can wrap it in a try-except block when you expect certain commands might fail and you want to handle that gracefully. This is much cleaner than manually checking result.returncode every single time.

Understanding Shell Mode and Security

By default, subprocess does not invoke the shell—it passes your command directly to the operating system. This is a security feature. However, some tasks genuinely need shell features like wildcards, environment variable expansion, or pipes. You can enable the shell with shell=True, but you must understand the security implications. If you ever use shell=True with user input, you’re vulnerable to shell injection attacks.

Let’s see the difference between shell and non-shell mode:

# shell_vs_noshell.py
import subprocess

# Without shell=True, the wildcard is literal
try:
    result = subprocess.run(['echo', '*.py'], capture_output=True, text=True)
    print("Without shell:")
    print(result.stdout)
except Exception as e:
    print(f"Error: {e}")

# With shell=True, the wildcard is expanded by the shell
result = subprocess.run('echo *.py', shell=True, capture_output=True, text=True)
print("\nWith shell=True:")
print(result.stdout)

Output:

Without shell:
*.py

With shell=True:
script.py utils.py main.py config.py

Notice the difference: without the shell, the wildcard is treated as a literal string. With the shell, the shell expands the wildcard to actual filenames. Now let’s see why this matters for security:

# shell_injection.py
import subprocess

# UNSAFE: Never do this with untrusted input
user_input = "test.txt; rm -rf /"
dangerous_command = f"echo {user_input}"
# If we ran this with shell=True, it would try to delete everything!
print(f"Dangerous command would be: {dangerous_command}")

# SAFE: Always use list form without shell=True when user input is involved
safe_result = subprocess.run(['echo', user_input], capture_output=True, text=True)
print(f"Safe output: {safe_result.stdout}")

Output:

Dangerous command would be: echo test.txt; rm -rf /
Safe output: test.txt; rm -rf /

When you use the list form without shell=True, the user input is treated as a literal argument to the command. The shell never sees the semicolon as a command separator. This is the safe way. Use shell=True only when you need shell features and all the input comes from you, not from users.
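
If you genuinely need shell features with a dynamic value, the standard library’s shlex.quote() can escape a single argument so the shell treats it as one literal word. The list form is still the safer default; this is a minimal sketch for the rare cases where shell=True is unavoidable (the pattern value is a hypothetical piece of untrusted input):

# quoting_for_shell.py (illustrative sketch)
import shlex
import subprocess

pattern = "*.py; rm -rf /"        # hypothetical untrusted input
quoted = shlex.quote(pattern)     # wraps the value in safe single quotes

# The quoted value reaches the shell as one literal word, so the semicolon
# is just text and no second command runs.
result = subprocess.run(f'echo {quoted}', shell=True, capture_output=True, text=True)
print(result.stdout)              # *.py; rm -rf /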

[Image: shell=False is your force field. Drop it and you’re dodging injection attacks bare-handed.]

Working with Popen for Advanced Control

The subprocess.run() function is convenient, but sometimes you need more control. Maybe you want to keep a process running and interact with it in real-time, or you need to handle stdout and stderr separately in advanced ways. This is where Popen comes in. Popen (short for “pipe open”) is the underlying class that run() actually uses. When you call subprocess.run(), Python creates a Popen object, waits for it to finish, and returns the result.

Here’s how to use Popen directly for situations where you need that extra control:

# popen_example.py
import subprocess

# Create a Popen object without waiting
process = subprocess.Popen(
    ['ping', '-c', '4', 'google.com'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)

print(f"Process started with PID: {process.pid}")

# Do other work while the process runs...
print("Doing other work while ping runs...")

# Wait for the process to finish
stdout_data, stderr_data = process.communicate()

print(f"Process finished with return code: {process.returncode}")
print(f"\nPing results (first 200 chars):\n{stdout_data[:200]}")

Output:

Process started with PID: 24531
Doing other work while ping runs...
Process finished with return code: 0

Ping results (first 200 chars):
PING google.com (172.217.16.46): 56 data bytes
64 bytes from 172.217.16.46: icmp_seq=0 ttl=119 time=25.432 ms
64 bytes from 172.217.16.46: icmp_seq=1 ttl=119 time=24.891 ms

The Popen constructor returns immediately without waiting for the process to finish. You can check the process’s status, read output as it becomes available, or even send input to the process. The communicate() method waits for the process to finish and returns both stdout and stderr. You can also use poll() to check if the process has finished without blocking, or wait() to block until it finishes. Popen gives you the raw power when subprocess.run() doesn’t provide enough flexibility.
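
When you want to watch a long-running process without blocking, a poll() loop is a common pattern. Here’s a minimal sketch, assuming a Unix-like system where ping accepts the -c flag:

# poll_example.py (illustrative sketch)
import subprocess
import time

process = subprocess.Popen(
    ['ping', '-c', '3', 'localhost'],
    stdout=subprocess.PIPE,
    text=True
)

# poll() returns None while the process is still running,
# and the exit code once it has finished
while process.poll() is None:
    print("Still running...")
    time.sleep(1)

# Reading here is fine because the output is small; for chatty commands,
# read as you go (or use communicate()) so the pipe buffer never fills up.
print(f"Finished with return code {process.returncode}")
print(process.stdout.read())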

Piping Commands Together

In the shell, you often pipe one command’s output to another command’s input using the pipe operator (|). Subprocess makes this possible too. You can connect the stdout of one process to the stdin of another, letting you build sophisticated command chains directly from Python. This is useful when you want the efficiency and power of composing Unix utilities without shelling out to a shell script.

Let’s pipe two commands together:

# piping_commands.py
import subprocess

# Pipe 'ls -la' output to 'grep' to find Python files
# This is equivalent to: ls -la | grep .py
list_process = subprocess.Popen(
    ['ls', '-la'],
    stdout=subprocess.PIPE,
    text=True
)

grep_process = subprocess.Popen(
    ['grep', '.py'],
    stdin=list_process.stdout,
    stdout=subprocess.PIPE,
    text=True
)

# Close our copy of the read end so ls can receive SIGPIPE if grep exits early
list_process.stdout.close()

# Get the piped output
output, _ = grep_process.communicate()
print("Python files found:")
print(output)

Output:

Python files found:
-rw-r--r--  1 user staff  2048 Mar 14 10:15 main.py
-rw-r--r--  1 user staff  1234 Mar 14 09:42 utils.py
-rw-r--r--  1 user staff   890 Mar 14 08:30 config.py

This example chains two commands: ls -la feeds its output to grep .py, which filters for lines containing “.py”. Notice that we close list_process.stdout after wiring it into grep. This drops the parent’s copy of the pipe so that ls can receive a SIGPIPE if grep exits before it finishes (the same pattern the official documentation recommends); grep itself sees EOF once ls finishes and closes its write end of the pipe. Piping is powerful when you need to compose multiple command-line tools, though for complex logic, you might consider doing the filtering in Python itself for clarity, as in the sketch below.
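
As a point of comparison, here’s a minimal sketch of getting the same result with the filtering done in Python instead of grep:

# filter_in_python.py (illustrative sketch)
import subprocess

result = subprocess.run(['ls', '-la'], capture_output=True, text=True)

# An ordinary list comprehension replaces the grep stage
python_files = [line for line in result.stdout.splitlines() if line.endswith('.py')]

print("Python files found:")
print('\n'.join(python_files))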

Setting Timeouts, Environment Variables, and Working Directories

Real-world scripts often need to control how long a command runs, set environment variables that the command needs, or run commands in specific directories. The subprocess module provides parameters for all of these. The timeout parameter lets you specify a maximum time (in seconds) that a command can run before being killed. The env parameter lets you pass a dictionary of environment variables. The cwd parameter sets the working directory for the command.

Here’s a practical example using all three:

# advanced_control.py
import subprocess
import os

# Set custom environment variables
custom_env = os.environ.copy()
custom_env['MY_VAR'] = 'hello from python'
custom_env['DEBUG'] = '1'

try:
    # Run a command with a timeout, in a specific directory, with custom env
    result = subprocess.run(
        ['bash', '-c', 'echo "MY_VAR is: $MY_VAR"; pwd; sleep 1'],
        cwd='/tmp',
        env=custom_env,
        timeout=5,
        capture_output=True,
        text=True,
        check=True
    )
    print("Command succeeded:")
    print(result.stdout)
except subprocess.TimeoutExpired:
    print("Command took too long and was killed")
except subprocess.CalledProcessError as e:
    print(f"Command failed: {e}")

# Example with a timeout that expires
try:
    subprocess.run(['sleep', '10'], timeout=2, check=True)
except subprocess.TimeoutExpired:
    print("\nThis command timed out (as expected - we told it to sleep 10 seconds with a 2 second timeout)")

Output:

Command succeeded:
MY_VAR is: hello from python
/tmp

This command timed out (as expected - we told it to sleep 10 seconds with a 2 second timeout)

Notice that we copy os.environ and then modify the copy—this preserves all the existing environment variables while adding our custom ones. Without this, the command wouldn’t have access to essential variables like PATH, which would prevent most commands from running. The timeout parameter is especially important for preventing your script from hanging if a command gets stuck. The cwd parameter is useful when you need to run commands in directories without using cd first.

[Image: When your subprocess decides to take an extended vacation, timeout brings the hammer down.]

Real-Life Example: Automated Deployment Script

Let’s bring everything together with a practical, real-world example: a deployment automation script. Imagine you’re deploying a Python web application to production. You need to pull the latest code, install dependencies, run tests, and restart the service—all safely and with proper error handling. Here’s how you’d do it with subprocess:

# deployment_script.py
import subprocess
import os
import sys
from pathlib import Path

class DeploymentManager:
    def __init__(self, project_dir, timeout=300):
        self.project_dir = Path(project_dir)
        self.timeout = timeout
        self.deploy_log = []

    def log_output(self, stage, output):
        """Store deployment output for later review"""
        self.deploy_log.append(f"[{stage}] {output}")
        print(f"[{stage}] {output}")

    def run_command(self, command, stage_name):
        """Run a command and handle errors gracefully"""
        try:
            self.log_output(stage_name, f"Running: {' '.join(command)}")
            result = subprocess.run(
                command,
                cwd=self.project_dir,
                capture_output=True,
                text=True,
                timeout=self.timeout,
                check=True
            )
            self.log_output(stage_name, f"Success. Output: {result.stdout[:100]}")
            return True
        except subprocess.TimeoutExpired:
            self.log_output(stage_name, "ERROR: Command timed out")
            return False
        except subprocess.CalledProcessError as e:
            self.log_output(stage_name, f"ERROR: {e.stderr}")
            return False

    def deploy(self):
        """Execute the full deployment pipeline"""
        print(f"Starting deployment from {self.project_dir}...\n")

        # Stage 1: Pull latest code
        if not self.run_command(['git', 'pull', 'origin', 'main'], 'GIT PULL'):
            print("Deployment failed: Could not pull code")
            return False

        # Stage 2: Install dependencies
        if not self.run_command(['pip', 'install', '-r', 'requirements.txt'], 'PIP INSTALL'):
            print("Deployment failed: Could not install dependencies")
            return False

        # Stage 3: Run tests
        if not self.run_command(['pytest', 'tests/', '-v'], 'PYTEST'):
            print("Deployment failed: Tests did not pass")
            return False

        # Stage 4: Check static files
        if not self.run_command(['python', 'manage.py', 'collectstatic', '--noinput'], 'STATIC FILES'):
            print("Deployment warning: Static files collection had issues")
            # Don't fail deployment for this

        # Stage 5: Restart the service
        if not self.run_command(['sudo', 'systemctl', 'restart', 'myapp'], 'SERVICE RESTART'):
            print("Deployment failed: Could not restart service")
            return False

        print("\nDeployment completed successfully!")
        return True

# Usage
if __name__ == '__main__':
    deployer = DeploymentManager('/var/www/myapp')
    success = deployer.deploy()

    # Save deployment log
    log_file = Path('/var/log/deployment.log')
    log_file.write_text('\n'.join(deployer.deploy_log))

    sys.exit(0 if success else 1)

Sample Output:

Starting deployment from /var/www/myapp...

[GIT PULL] Running: git pull origin main
[GIT PULL] Success. Output: Already up to date.
[PIP INSTALL] Running: pip install -r requirements.txt
[PIP INSTALL] Success. Output: Collecting django==4.2
...
[PYTEST] Running: pytest tests/ -v
[PYTEST] Success. Output: test_user_login PASSED
test_api_endpoints PASSED
...
[SERVICE RESTART] Running: sudo systemctl restart myapp
[SERVICE RESTART] Success. Output:

Deployment completed successfully!

This example demonstrates several real-world patterns: running multiple sequential commands with proper error handling, logging output for auditing, using different working directories, capturing output for reporting, and failing fast when critical steps fail while allowing non-critical steps to have warnings. The check=True parameter makes sure any command failure is caught immediately, preventing partial deployments. The timeout ensures that if a command hangs, the deployment doesn’t wait forever.

Frequently Asked Questions

Why does my output look like b’…’ instead of normal text?

You’re getting bytes instead of strings, which happens when you don’t set text=True. By default, subprocess returns bytes because not all command output is valid text. Set text=True to have subprocess decode the bytes automatically using your system’s default text encoding (often UTF-8). If you need a specific encoding, pass it explicitly, for example encoding='latin-1'. This is the most common gotcha when starting with subprocess.

Can I send input to a process interactively?

Yes, using Popen. Set stdin=subprocess.PIPE, then write to process.stdin. However, if you need true interactive control (like responding to prompts), consider using the pexpect library instead, which handles terminal interaction better. For simple stdin/stdout piping, subprocess works fine. Be careful with buffering—you may need to flush process.stdin after writing.
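
Here’s a minimal sketch of sending input through stdin, using cat as a stand-in for whatever program you actually need to feed:

# stdin_pipe.py (illustrative sketch)
import subprocess

process = subprocess.Popen(
    ['cat'],                  # cat simply echoes back whatever it reads
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True
)

# communicate() writes the input, closes stdin, and collects the output in one
# deadlock-free step; for piecemeal writes you would use
# process.stdin.write(...) followed by process.stdin.flush().
output, _ = process.communicate(input="line one\nline two\n")
print(output)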

How do I run a command in a different directory?

Use the cwd parameter: subprocess.run(['make', 'build'], cwd='/home/user/project'). This changes the working directory for that specific command without affecting your Python script’s working directory. This is safer than using os.chdir() because it doesn’t change the global state.

Why can’t my Python script find the command I’m trying to run?

Either the command isn’t in your PATH, or you need to use the full path. On Linux/Mac, try which command_name to find where it’s installed. On Windows, try where command_name. If the command exists but isn’t in PATH, use the full path: subprocess.run(['/usr/local/bin/mycommand']). If you inherit environment variables from os.environ (the default), PATH should be set correctly, but if you provide a custom env dictionary, make sure it includes PATH.
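
From Python, the standard library’s shutil.which() performs the same lookup as the shell’s which command, which makes a handy pre-flight check. A minimal sketch, using git as the example command:

# find_command.py (illustrative sketch)
import shutil
import subprocess

path = shutil.which('git')
if path is None:
    raise SystemExit("git is not on PATH; install it or pass the full path explicitly")

print(f"Found git at {path}")
subprocess.run([path, '--version'])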

What if a command produces huge amounts of output?

Using capture_output=True stores everything in memory, which can be problematic for very large outputs. Instead, use Popen and stream the output by reading from process.stdout in chunks or line by line (communicate() still buffers the whole output in memory, so it doesn’t help here). For very large outputs, consider writing directly to a file instead of capturing in Python: with open('/tmp/output.txt', 'w') as f: subprocess.run(['bigcommand'], stdout=f).
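
Here’s a minimal sketch of streaming output line by line instead of buffering it (bigcommand is a placeholder for whatever actually produces the large output):

# stream_output.py (illustrative sketch)
import subprocess

process = subprocess.Popen(
    ['bigcommand'],           # placeholder for a long-running, chatty command
    stdout=subprocess.PIPE,
    text=True
)

# Iterating over stdout yields one line at a time, so only a single line
# needs to live in memory at any moment.
for line in process.stdout:
    if 'ERROR' in line:
        print(line, end='')

process.wait()
print(f"Exited with code {process.returncode}")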

Does subprocess work on Windows?

Yes, but Windows has some differences. Command lists work the same way, but the commands themselves might be different. For example, use dir instead of ls, and type instead of cat. On Windows you can also pass a single string to subprocess.run() without setting shell=True, because the underlying CreateProcess call accepts a command-line string; the list form is still the recommended habit and keeps your code consistent across platforms. For cross-platform scripts, use libraries like pathlib and tools like shutil instead of shelling out when possible.

Conclusion: Master subprocess and Automate Safely

The subprocess module is one of Python’s most important tools for interacting with the system. Whether you’re deploying applications, automating routine tasks, or orchestrating complex workflows, subprocess gives you a safe, reliable way to run shell commands from Python code. The key principles are straightforward: use subprocess.run() for simple commands, pass arguments as a list to avoid shell injection, set check=True for fail-fast behavior, use capture_output=True and text=True for convenient output handling, and reach for Popen only when you need advanced control.

The security lessons matter deeply. Never use shell=True with untrusted input. Always separate your command and arguments into a list rather than constructing a command string. Remember that subprocess isn’t just about convenience—it’s about building reliable, secure systems that can interact with external tools without introducing vulnerabilities. The patterns we’ve covered here—timeouts, error handling, logging, and environment control—are the building blocks of production-grade automation.

As you build more complex scripts, you’ll discover edge cases and needs that require diving deeper into the documentation. The official Python documentation at docs.python.org/3/library/subprocess.html is comprehensive and authoritative. Start with the patterns in this article, experiment with your own scripts, and refer back to the docs as you encounter new challenges. With subprocess in your toolkit, you’re equipped to automate almost anything your system can do.

[Image: subprocess.run() — because sometimes Python needs to phone a friend and actually get a clean answer back.]