Intermediate
Some programs simply refuse to cooperate with normal Python automation. They prompt for passwords interactively, ask for confirmation before running a destructive command, or display a menu you have to navigate before they give you the data you need. You cannot pipe stdin to them or parse their output with subprocess alone because they are designed to talk to a human through a terminal. This is exactly where pexpect comes in — a Python library that lets your script pretend to be a human sitting at a keyboard, interacting with any terminal-based program as if it were typing commands and reading the screen.
The pexpect library works by spawning a child process attached to a pseudo-terminal (PTY) and then letting you define patterns to wait for and responses to send. It handles the low-level terminal I/O so you can focus on the conversation logic. It runs on any Unix-like system (Linux, macOS) and Python 3.4 or later. You install it with a single pip install pexpect command. There is no system daemon to configure and no special OS permissions needed for basic use.
In this article we will cover what pexpect is and how it differs from subprocess, how to spawn a child process and wait for output, how to send input and handle timeouts, how to capture output between interactions, how to handle multiple expected patterns with a list, and how to use the built-in pxssh module to automate SSH sessions. By the end you will be able to script any interactive terminal program and build reliable automation for tasks that previously required a human at the keyboard.
Python pexpect: Quick Example
Here is a complete working example that spawns the system python3 interpreter, sends a few commands, reads the output, and exits cleanly — all under pexpect’s control:
# quick_pexpect.py
import pexpect
# Spawn a Python 3 interpreter as a child process
child = pexpect.spawn('python3', encoding='utf-8', timeout=10)
# Wait for the Python REPL prompt
child.expect('>>>')
# Send a command and wait for the next prompt
child.sendline('print("Hello from pexpect!")')
child.expect('>>>')
# child.before holds everything printed since the last match:
# first the echoed command, then the line that the command printed
output = child.before.strip().splitlines()[-1]
print("Child said:", output)
# Exit the interpreter cleanly
child.sendline('exit()')
child.expect(pexpect.EOF)
print("Session complete.")
Output:
Child said: Hello from pexpect!
Session complete.
The key methods here are pexpect.spawn(), which starts the child process, child.expect(), which waits until a pattern appears in the output, and child.sendline(), which types text followed by a newline. Everything the child printed between the previous match and the current one is stored in child.before, including the terminal's echo of the command you just sent. The encoding='utf-8' argument makes all I/O return Python strings instead of bytes, which is almost always what you want.
The sections below cover each piece in depth — pattern matching, timeouts, capturing multi-line output, and real-world SSH automation.
What Is pexpect and When Should You Use It?
pexpect is a pure Python module that automates interactive terminal applications by controlling them through a pseudo-terminal (PTY). A PTY is a kernel-level device that makes a process believe it is talking to a real terminal. This matters because many command-line programs behave differently when they detect a terminal versus a pipe — they disable color output, stop prompting for input, or buffer output differently. By using a PTY, pexpect gets the same output a human would see at a real terminal prompt.
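The terminal-versus-pipe difference is easy to observe with nothing but the standard library. The sketch below is illustrative only and is not how pexpect is implemented internally. It runs a tiny probe program twice, once through an ordinary pipe and once through a pseudo-terminal created with pty.openpty(), and the names PROBE, probe_through_pipe, and probe_through_pty are made up for this example.

```python
import os
import pty
import subprocess
import sys

# A tiny probe program: it reports whether its stdout is a terminal.
PROBE = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]


def probe_through_pipe():
    """Run the probe with stdout connected to an ordinary pipe."""
    result = subprocess.run(PROBE, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def probe_through_pty():
    """Run the probe with its standard streams connected to a pseudo-terminal."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(PROBE, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)  # the child keeps its own copy of the slave end
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # Linux reports an I/O error once the child has closed the terminal
        if not data:
            break  # other platforms signal the same thing with an empty read
        chunks.append(data)
    proc.wait()
    os.close(master_fd)
    return b"".join(chunks).decode().strip()


print("through a pipe:", probe_through_pipe())
print("through a PTY: ", probe_through_pty())
```

The same program gives two different answers depending on what it is connected to, which is the behavior difference that pexpect is designed to work around.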
The library’s name combines “expect” (the classic Unix scripting tool that inspired it) with Python. The core idea is simple: wait for an expected string or pattern to appear in the output, then send a response. This “expect and respond” loop is what lets you drive interactive programs programmatically.
| Scenario | Use subprocess? | Use pexpect? |
|---|---|---|
| Run a command and capture output | Yes — simpler | Overkill |
| Program asks for a password interactively | No — it won’t prompt | Yes |
| Navigate a menu-driven CLI | No | Yes |
| Automate an SSH login and run commands | Fragile | Yes — use pxssh |
| Control a program that uses curses/ncurses | No | Yes |
| Run a script and pipe its stdout | Yes | Not needed |
Use pexpect when the program you need to control was designed for a human at a terminal. Use subprocess when the program is script-friendly and just produces output that you can capture with stdout/stderr. Getting this distinction right will save you hours of frustration.
Installing pexpect
pexpect is not part of the Python standard library, so you need to install it before you can use it. The install is a single command and brings in no heavy dependencies:
# install_pexpect.sh
pip install pexpect
Output:
Successfully installed pexpect-4.9.0 ptyprocess-0.7.0
The only dependency is ptyprocess, which handles the low-level PTY creation. You can verify the install worked correctly by importing pexpect in a Python shell and checking the version:
# verify_install.py
import pexpect
print(pexpect.__version__)
Output:
4.9.0
Note that pexpect.spawn() only runs on Unix-like systems: Linux, macOS, and WSL on Windows. On native Windows without WSL, pexpect cannot create PTYs. If you are on Windows, the wexpect or winpexpect packages are alternatives, but most teams use WSL or Docker for pexpect-based automation instead.
Spawning a Child Process with pexpect.spawn()
The entry point to every pexpect session is pexpect.spawn(). You pass it the command to run, along with options like encoding and timeout, and it returns a child process object that you use for all subsequent interaction:
# spawn_basics.py
import pexpect
# Spawn the 'bc' command-line calculator; -q suppresses the GNU bc banner
child = pexpect.spawn('bc -q', encoding='utf-8', timeout=5)
# bc prints no prompt, so there is nothing to wait for before sending input
child.sendline('2 + 2')
# The PTY echoes the input line, then bc prints the result on a line of its own
child.expect(r'\r\n(\d+)\r\n')
print("Result:", child.match.group(1))
child.sendline('quit')
child.expect(pexpect.EOF)
Output:
Result: 4
The timeout parameter controls how many seconds expect() will wait before raising a pexpect.TIMEOUT exception. Setting it at spawn time applies it globally, but you can also pass a different timeout to individual expect() calls. The encoding='utf-8' argument tells pexpect to decode bytes to strings automatically — without it, all I/O is in bytes and you need to write patterns as byte literals like b'>>>'. Always use the encoding argument for new code; it makes pattern writing much cleaner.
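The interplay between the spawn-level timeout and a per-call override can be sketched with a child that stays silent for a moment. This is a minimal example under assumed conditions: it uses /bin/sh with a two-second sleep as a stand-in for a slow program, and the variable names first_attempt and second_attempt are illustrative.

```python
import pexpect

# The child stays silent for two seconds, then prints a marker.
child = pexpect.spawn('/bin/sh', ['-c', 'sleep 2; echo ready'],
                      encoding='utf-8', timeout=1)

try:
    child.expect('ready')            # uses the 1-second default set in spawn()
    first_attempt = 'matched'
except pexpect.TIMEOUT:
    first_attempt = 'timed out'      # the child is still running at this point

child.expect('ready', timeout=5)     # per-call override for the slow step
second_attempt = child.after         # the text that matched the pattern

child.expect(pexpect.EOF)
child.close()
print(first_attempt, '/', second_attempt)
```

A TIMEOUT does not kill the child, so the script is free to wait again with a longer limit, which is often preferable to raising the global timeout for every call.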
Waiting for Output with child.expect()
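The expect() method is the heart of pexpect. It reads from the child process’s output until it finds a match for the pattern you give it, then returns. The pattern can be a string (which pexpect compiles as a regular expression, so escape characters such as $ or . when you mean them literally), a precompiled regex, or a list of patterns to match whichever appears first. For literal matching without regex rules, use expect_exact() instead: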
# expect_patterns.py
import pexpect
import re
child = pexpect.spawn('python3', encoding='utf-8', timeout=10)
# Wait for the >>> prompt using an exact string
child.expect('>>>')
# Send a command that might print a result OR raise an error
child.sendline('1 / 0')
# Match whichever appears first -- a result or an error
index = child.expect(['>>>', 'ZeroDivisionError', pexpect.TIMEOUT])
if index == 0:
    print("Got a result:", child.before.strip())
elif index == 1:
    print("Got an error:", child.before.strip())
else:
    print("Timed out waiting for output")
child.sendline('exit()')
child.expect(pexpect.EOF)
Output:
Got an error: Traceback (most recent call last):
File "<stdin>", line 1, in <module>
When you pass a list to expect(), the return value is the index of the pattern that matched. This is the standard way to branch based on what the child process says next — you handle each possible response differently. The two special sentinel values pexpect.EOF (the process closed its output) and pexpect.TIMEOUT (time ran out) can always be included in the list to avoid uncaught exceptions. After a successful match, child.before contains everything printed before the match, and child.after contains the matched text itself.
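As a small illustration of child.before, child.after, and the related child.match attribute, the sketch below matches a regex with a capture group and then a literal string. The shell command and its "Total: 3 files (12 KB)" line are invented for the example.

```python
import pexpect

# A stand-in child that prints one line containing a number we want to extract.
child = pexpect.spawn('/bin/sh', ['-c', 'echo "Total: 3 files (12 KB)"'],
                      encoding='utf-8', timeout=5)

index = child.expect([r'Total: (\d+) files', pexpect.EOF, pexpect.TIMEOUT])
if index == 0:
    matched_text = child.after                # the text the pattern matched
    file_count = int(child.match.group(1))    # the captured group, as a number
    # Parentheses are regex syntax, so match the rest literally with expect_exact()
    child.expect_exact(' (12 KB)')
else:
    matched_text = None
    file_count = None

child.expect(pexpect.EOF)
child.close()
print(matched_text, file_count)
```

Because child.match is a regular re match object, capture groups are usually a cleaner way to pull a value out of the output than slicing child.before by hand.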
Sending Input with sendline() and send()
pexpect gives you two ways to send text to a child process. sendline() appends a newline character after the text (simulating pressing Enter), while send() sends the raw text without a newline. You use send() when you need to type individual characters or when the program reads character-by-character rather than line-by-line:
# sending_input.py
import pexpect
# Automate the 'ftp' command to log in and list files
# Uses a public test FTP server with published demo credentials
child = pexpect.spawn('ftp ftp.dlptest.com', encoding='utf-8', timeout=15)
child.expect('Name.*:')
child.sendline('dlpuser') # username
child.expect('Password:')
child.sendline('rNrKYTX9zDd27W7') # public test password
child.expect('ftp>')
child.sendline('ls')
child.expect('ftp>')
print("Directory listing:")
print(child.before.strip())
child.sendline('quit')
child.expect(pexpect.EOF)
Output:
Directory listing:
-rw-r--r-- 1 0 0 0 Sep 01 00:00 .gitkeep
-rw-r--r-- 1 0 0 1024 Sep 01 00:00 test1.txt
-rw-r--r-- 1 0 0 2048 Sep 01 00:00 test2.txt
Notice how the password is sent with sendline() even though it is a sensitive value — pexpect does not mask passwords in transit because it is sending them directly to the child process’s PTY. For production automation of anything security-sensitive, ensure that your script itself is protected (not world-readable, stored with proper permissions, or using a secrets manager to supply the value at runtime rather than hardcoding it).
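The difference between send() and sendline() is easiest to see with a child that simply copies its input. The following sketch assumes the standard cat utility is available and uses sendeof(), which pexpect provides for sending the end-of-file keystroke. The variable name line_from_cat is illustrative.

```python
import pexpect

# cat with no arguments copies each completed input line back to its output.
child = pexpect.spawn('cat', encoding='utf-8', timeout=5)

child.send('hello')              # five keystrokes, no Enter yet
child.expect_exact('hello')      # this is only the terminal echoing the keys
child.send('\n')                 # now press Enter; sendline('') would also work
child.expect_exact('hello')      # this copy is cat printing the completed line
line_from_cat = child.after

child.sendeof()                  # Ctrl-D on an empty line: end of input for cat
child.expect(pexpect.EOF)
child.close()
print("cat exit status:", child.exitstatus)
```

Until the newline arrives, cat never sees the text, because the terminal holds the partial line. That is why send() alone appears to do nothing with line-oriented programs.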
Capturing Output Between Interactions
The output a child process produces between two expect() calls accumulates in child.before. This is the primary way you extract data from interactive programs. However, child.before can contain terminal control codes (color sequences, cursor movements) that you usually want to strip out:
# capture_output.py
import pexpect
import re
def strip_ansi(text):
    """Remove ANSI terminal escape sequences from a string."""
    ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')
    return ansi_escape.sub('', text)
child = pexpect.spawn('bash', encoding='utf-8', timeout=10)
child.expect(r'\$') # Wait for the shell prompt
# Run a command and capture its output
child.sendline('df -h /')
child.expect(r'\$') # Wait for the next prompt
raw_output = child.before
clean_output = strip_ansi(raw_output).strip()
print("Disk usage for /:")
# Skip the first line (the command echo) and print the rest
lines = [l for l in clean_output.splitlines() if l.strip()]
for line in lines[1:]:  # lines[0] is the echoed command
    print(line)
child.sendline('exit')
child.expect(pexpect.EOF)
Output:
Disk usage for /:
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 50G 18G 30G 38% /
The strip_ansi() helper is worth keeping in your toolkit. Many terminal programs emit ANSI escape codes even when you think you have a plain-text session, and those codes will corrupt your data extraction if you do not strip them. After cleaning, split on newlines and discard the first line, which is typically the command echo coming back from the PTY.
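The cleanup steps can be tried without spawning anything. The sketch below applies the same regex to a hard-coded sample string, invented for this example, that mimics what a colored shell might leave in child.before. It then splits the data row into named fields. The field names in the usage dictionary are illustrative.

```python
import re

ANSI_ESCAPE = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')

# Hypothetical raw capture: colored command echo, CRLF line endings, trailing blank line.
raw = ('\x1b[01;32mdf -h /\x1b[0m\r\n'
       'Filesystem      Size  Used Avail Use% Mounted on\r\n'
       '/dev/sda1        50G   18G   30G  38% /\r\n'
       '\r\n')

clean = ANSI_ESCAPE.sub('', raw)
# splitlines() treats \r\n as one line break; drop lines that are empty after stripping
lines = [line.strip() for line in clean.splitlines() if line.strip()]
echoed_command, header, data_row = lines

fields = data_row.split()
usage = {
    'filesystem': fields[0],
    'size': fields[1],
    'used': fields[2],
    'available': fields[3],
    'use_pct': fields[4],
    'mount': fields[5],
}
print(echoed_command)
print(usage)
```

Keeping the parsing separate from the pexpect session like this also makes it straightforward to unit-test against saved samples of real output.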
Handling Timeouts and EOF Gracefully
Robust pexpect scripts always handle both pexpect.TIMEOUT and pexpect.EOF explicitly. A timeout means the expected output never arrived; EOF means the process exited (normally or due to a crash). Letting these propagate as unhandled exceptions makes your automation abort with a long traceback instead of a clear message about which step failed, and it can leave the child process running:
# timeout_eof.py
import pexpect
def run_with_timeout(command, prompt_pattern, commands_to_send):
    """Run an interactive command, send inputs, handle errors gracefully."""
    child = None
    last_output = None
    log = open('session.log', 'w')  # log everything the child prints to a file
    try:
        child = pexpect.spawn(command, encoding='utf-8', timeout=10)
        child.logfile_read = log
        child.expect(prompt_pattern)
        for cmd in commands_to_send:
            child.sendline(cmd)
            result = child.expect([prompt_pattern, pexpect.EOF, pexpect.TIMEOUT])
            if result == 1:
                print(f"Process exited after: {cmd!r}")
                break
            if result == 2:
                print(f"Timeout waiting for prompt after: {cmd!r}")
                return None
            # child.before starts with the echoed command; keep only what follows it
            last_output = "\n".join(child.before.strip().splitlines()[1:])
        return last_output
    except pexpect.ExceptionPexpect as e:
        print(f"pexpect error: {e}")
        return None
    finally:
        # Runs on every path: stop the child if it is still alive, then close the log
        if child is not None:
            child.close(force=True)
        log.close()
result = run_with_timeout('python3', '>>>', ['2 ** 32', 'exit()'])
print("Output:", result)
Output:
Process exited after: 'exit()'
Output: 4294967296
The child.logfile_read attribute is one of pexpect’s most useful debugging features. Assign any writable file-like object to it and pexpect will write everything it reads from the child process into that file. When something goes wrong in a long automation script, the log file shows you exactly what the child said and where your expected pattern failed to match. Always add logging before going into production with a pexpect script; debugging without it is painful.
Simple Automation with pexpect.run()
For cases where you do not need to interact with the output — you just need to supply a series of fixed responses to fixed prompts — pexpect.run() is a one-liner alternative to the full spawn/expect/sendline loop. You pass it the command and an events dictionary mapping expected patterns to responses:
# pexpect_run.py
import pexpect
# Automate a script that asks two questions before running
# (simulating any interactive prompt you do not control)
output = pexpect.run(
    'python3 -c "'
    'name = input(\'Your name: \'); '
    'age = input(\'Your age: \'); '
    'print(f\'Hello {name}, you are {age} years old.\')"',
    events={
        'Your name: ': 'Alice\n',
        'Your age: ': '30\n',
    },
    encoding='utf-8',
    timeout=5,
)
print(output.strip())
Output:
Your name: Alice
Your age: 30
Hello Alice, you are 30 years old.
pexpect.run() returns the entire session output as a single string. It is much simpler than spawning and looping, but it offers far less control: branching is limited to what the events dictionary can express (its values can be callback functions as well as strings), and it gives you little room to react to a timeout or an unexpected prompt. Use it for simple, predictable scripts where you know exactly what prompts will appear and in what order. Use the full spawn() approach when the conversation can take multiple paths or when you need to extract specific values from the output.
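When the exit code matters as much as the text, pexpect.run() accepts a withexitstatus=True argument and then returns a pair instead of a single string. The sketch below uses a throwaway shell command, invented for this example, that prints one line and exits with status 3.

```python
import pexpect

# A stand-in command that prints a line and then fails with exit status 3.
output, status = pexpect.run(
    '/bin/sh -c "echo working; exit 3"',
    withexitstatus=True,   # return (output, exit_status) instead of just output
    encoding='utf-8',
    timeout=5,
)

print("output:", output.strip())
print("exit status:", status)
if status != 0:
    print("The command reported a failure.")
```

Checking the status this way is a cheap safeguard in otherwise fire-and-forget scripts, since the captured text alone may not reveal that the command failed.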
Automating SSH Sessions with pexpect.pxssh
pexpect ships with a higher-level module called pxssh specifically for SSH automation. It handles the login sequence, host-key verification prompts, and password entry for you, leaving you to just send commands and read output:
# pxssh_example.py
from pexpect import pxssh
def run_remote_commands(hostname, username, password, commands):
    """Log in to a remote host via SSH and run a list of commands."""
    results = {}
    s = pxssh.pxssh(timeout=30)
    try:
        s.login(hostname, username, password)
        print(f"Logged in to {hostname}")
        for cmd in commands:
            s.sendline(cmd)
            s.prompt()  # waits for the shell prompt
            output = s.before.decode() if isinstance(s.before, bytes) else s.before
            results[cmd] = output.strip()
        s.logout()
    except pxssh.ExceptionPxssh as e:
        print(f"SSH session failed: {e}")
    return results
# Example usage with a local test server (replace with your host details)
# results = run_remote_commands('192.168.1.10', 'admin', 'secret', ['uptime', 'free -h'])
# for cmd, output in results.items():
#     print(f"\n--- {cmd} ---")
#     print(output)
print("pxssh module imported successfully -- ready for SSH automation.")
Output:
pxssh module imported successfully -- ready for SSH automation.
The s.prompt() method is the pxssh equivalent of child.expect(prompt_pattern). During login(), pxssh resets the remote shell's prompt to a unique string of its own, and prompt() simply waits for that string. This is more reliable than writing your own prompt regex because default prompts vary between shells and users (bash, zsh, and sh all look different). After each s.prompt() call, s.before contains the command output. Always call s.logout() when done: it sends the exit command cleanly and closes the connection rather than dropping it.
Real-Life Example: Automated System Health Check Script
Here is a realistic automation script that connects to a local shell, runs four diagnostic commands, parses the output, and produces a structured health report. This pattern is common in DevOps scripts where you need to collect system state without writing a full Ansible playbook:
# system_health_check.py
import pexpect
import re
import json
from datetime import datetime
def strip_ansi(text):
    """Remove ANSI terminal escape codes."""
    return re.sub(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])', '', text)
def run_check(child, command, prompt_re):
    """Send a command and return its clean output, minus the echoed command."""
    child.sendline(command)
    child.expect(prompt_re)
    lines = strip_ansi(child.before).strip().splitlines()
    return "\n".join(lines[1:]).strip()  # lines[0] is the command echoed by the PTY
def get_system_health():
    """Collect system health metrics via an interactive bash session."""
    health = {
        "timestamp": datetime.now().isoformat(),
        "checks": {}
    }
    child = pexpect.spawn('bash', encoding='utf-8', timeout=15)
    child.expect(r'\$')
    # Override PS1 so the prompt is predictable. The quotes are split ("PROMPT""> ")
    # so that the echoed command does not itself contain the prompt text.
    child.sendline('export PS1="PROMPT""> "')
    child.expect('PROMPT> ')
    prompt_re = 'PROMPT> '
    # 1. Disk usage
    raw = run_check(child, 'df -h / | tail -1', prompt_re)
    parts = raw.split()
    if len(parts) >= 5:
        health["checks"]["disk"] = {
            "total": parts[1],
            "used": parts[2],
            "available": parts[3],
            "use_pct": parts[4],
        }
    # 2. Memory
    raw = run_check(child, "free -h | grep Mem", prompt_re)
    parts = raw.split()
    if len(parts) >= 4:
        health["checks"]["memory"] = {
            "total": parts[1],
            "used": parts[2],
            "free": parts[3],
        }
    # 3. Load average
    raw = run_check(child, 'cat /proc/loadavg', prompt_re)
    load_parts = raw.split()[:3]
    if load_parts:
        health["checks"]["load"] = {
            "1min": load_parts[0],
            "5min": load_parts[1] if len(load_parts) > 1 else "n/a",
            "15min": load_parts[2] if len(load_parts) > 2 else "n/a",
        }
    # 4. Uptime
    raw = run_check(child, 'uptime -p 2>/dev/null || uptime', prompt_re)
    health["checks"]["uptime"] = raw.strip()
    child.sendline('exit')
    child.expect(pexpect.EOF)
    return health
report = get_system_health()
print(json.dumps(report, indent=2))
Output:
{
  "timestamp": "2026-06-12T07:14:32.441820",
  "checks": {
    "disk": {
      "total": "50G",
      "used": "18G",
      "available": "30G",
      "use_pct": "38%"
    },
    "memory": {
      "total": "15G",
      "used": "4.2G",
      "free": "8.1G"
    },
    "load": {
      "1min": "0.12",
      "5min": "0.08",
      "15min": "0.05"
    },
    "uptime": "up 3 days, 7 hours, 22 minutes"
  }
}
The key technique here is overriding PS1 at the start of the session to give the shell a predictable, unique prompt string. This eliminates the most common source of pexpect failures: prompt regex patterns that match too broadly or not at all because the shell’s default prompt varies by system or includes dynamic content like the current directory. Setting PS1 to a fixed string at session start makes every subsequent child.expect(prompt_re) call reliable. You can extend this script to run against remote hosts by replacing the bash spawn with a pxssh session using the same command-running pattern.
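One caveat when overriding PS1 is that the PTY echoes the command that sets it, and that echo can contain the prompt text itself. The sketch below simulates the situation with plain strings and the re module rather than a live shell. The buffer contents are hand-written approximations of what a terminal would show, and splitting the quotes is one possible workaround among several.

```python
import re

PROMPT = r'PROMPT> '

# Simulated PTY output after sending: export PS1="PROMPT> "
# The terminal echoes the command first, then bash prints the new prompt.
buffer = 'export PS1="PROMPT> "\r\nPROMPT> '
naive = re.search(PROMPT, buffer)
leftover = buffer[naive.end():]
print("matched at", naive.start(), "leftover:", repr(leftover))
# The match landed inside the echo, so a stale prompt is still waiting in the
# buffer and every later expect() would return one command too early.

# Splitting the quotes changes the echoed text but not the value bash assigns.
buffer_split = 'export PS1="PROMPT""> "\r\nPROMPT> '
safer = re.search(PROMPT, buffer_split)
leftover_split = buffer_split[safer.end():]
print("matched at", safer.start(), "leftover:", repr(leftover_split))
```

With the split form, the only place the pattern can match is the real prompt, so each expect() call lines up with the command that preceded it.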
Frequently Asked Questions
Does pexpect work on Windows?
Not natively. pexpect requires a Unix PTY subsystem, which is not available in standard Windows. On Windows you have three options: use Windows Subsystem for Linux (WSL), run your script in a Docker container, or use the wexpect package (a Windows-specific fork). Most teams in practice use WSL or Docker because they match the Linux environment where the target programs usually run anyway. On macOS and Linux, pexpect works out of the box.
Why not just use subprocess with stdin=PIPE?
subprocess with stdin=PIPE works great for programs that read all their input before producing output — but many interactive programs check whether they are attached to a real terminal and change their behavior if they are not. They may disable their prompts, buffer output differently, or refuse to accept input at all. pexpect avoids this by using a real PTY, so the child process believes it is talking to a human terminal. If subprocess works for your use case, use it — it is simpler. Reach for pexpect when subprocess just does not cooperate.
What timeout value should I use?
Start with a timeout 3x to 5x longer than the slowest operation you expect. For local commands, 5-10 seconds is usually enough. For network operations like SSH or FTP, 15-30 seconds gives headroom for slow connections. For long-running jobs (database exports, compilation), set the timeout to the maximum acceptable wall-clock time and log carefully so you can distinguish a legitimate timeout from a stuck process. You can set a global timeout in pexpect.spawn() and override it for specific expect() calls where you know an operation might take longer.
Is it safe to automate password entry with pexpect?
pexpect itself is safe for password automation — it sends passwords directly to the child process’s PTY and does not log them unless you explicitly set child.logfile. The risk is in how you supply the password to your script. Never hardcode passwords as string literals in source code. Instead, read them from environment variables (os.environ['MY_PASSWORD']), a secrets manager (AWS Secrets Manager, HashiCorp Vault), or a local keyring. If you are automating SSH, using key-based authentication and ssh-agent is always preferable to password automation.
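A small helper along these lines keeps the secret out of the source file. It is a sketch only: the function name get_password, the FTP_PASSWORD variable, and the optional interactive fallback are assumptions made for illustration, not part of pexpect.

```python
import getpass
import os


def get_password(var_name, environ=None, allow_prompt=False):
    """Return a secret from the environment, optionally prompting as a fallback."""
    env = os.environ if environ is None else environ
    value = env.get(var_name)
    if value:
        return value
    if allow_prompt:
        # getpass reads from the terminal without echoing what is typed
        return getpass.getpass(f"{var_name}: ")
    raise RuntimeError(f"Set the {var_name} environment variable before running this script")


# Demonstration with a fake environment so no real secret is needed.
demo_env = {"FTP_PASSWORD": "example-only"}
password = get_password("FTP_PASSWORD", environ=demo_env)
print("password length:", len(password))
```

In a real script the returned value would be passed straight to child.sendline(), and the variable would be set by the deployment environment or a secrets manager rather than typed into the code.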
Why is child.before full of escape codes and garbled characters?
This is normal — the PTY causes the child process to emit ANSI terminal escape sequences for colors, cursor positioning, and other formatting. Use the strip_ansi() helper shown earlier in this article to remove them. You can also try setting the terminal type to a minimal value before spawning: child = pexpect.spawn('your_command', env={**os.environ, 'TERM': 'dumb'}) which disables color output in many programs. The TERM=dumb trick works for bash, Python, and most well-behaved CLI tools.
How do I debug a pexpect script that is not matching?
Set child.logfile_read = sys.stdout to see exactly what the child process is sending to your script in real time. The most common cause of a failed match is that the actual output contains ANSI escape codes or extra whitespace that your pattern does not account for. Print repr(child.before) after a timeout to see the raw bytes (or string) that did arrive, and adjust your pattern to match what you see. Using regex patterns with re.compile(r'pattern', re.MULTILINE) gives you more flexibility than exact string matching when prompts vary slightly.
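The repr() technique can be demonstrated with a child that deliberately wraps its prompt in bold escape codes. In this sketch the shell command is a stand-in for a real program, and raw_buffer is an illustrative name. The pattern looks reasonable yet never matches, and printing the raw buffer shows why.

```python
import pexpect

# A stand-in child whose prompt text is wrapped in bold on/off escape codes.
child = pexpect.spawn('/bin/sh', ['-c', r'printf "\033[1mReady\033[0m > "; sleep 3'],
                      encoding='utf-8', timeout=1)

# On screen this looks like "Ready > ", but the raw stream has codes in between.
index = child.expect(['Ready > ', pexpect.TIMEOUT])
raw_buffer = child.before   # after a TIMEOUT this holds whatever did arrive

if index == 1:
    print("No match. Raw buffer was:", repr(raw_buffer))

child.close(force=True)
```

Seeing the escape sequence sitting between "Ready" and " > " makes the fix obvious: either strip the codes first or loosen the pattern, for example by matching only the word before them.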
Conclusion
pexpect gives you a clean Python API for automating any interactive terminal program — from local shells and calculators to remote SSH sessions and FTP servers. The core workflow is always the same: spawn the child process, use expect() to wait for the right moment, use sendline() to respond, and capture output from child.before. The key practices that separate reliable automation from fragile scripts are setting a predictable PS1 prompt, always handling TIMEOUT and EOF in your expect lists, stripping ANSI escape codes before parsing output, and using logfile_read during development to see exactly what the child is saying.
The real-life health check example in this article is a good starting point for your own scripts. Try extending it with pxssh to run the same checks across multiple remote hosts and aggregate the results into a single JSON report. From there, wrapping the checks in a retry loop with exponential backoff gives you a production-grade monitoring script in under 100 lines of Python.
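A retry loop with exponential backoff could look roughly like the sketch below. The helper name retry_with_backoff, its parameters, and the flaky_check stand-in are all hypothetical. In practice you would pass in a function that runs the health check and probably catch only the pexpect exceptions you expect instead of every Exception.

```python
import time


def retry_with_backoff(action, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...


# Demonstration with a stand-in check that fails twice before succeeding.
calls = {"count": 0}


def flaky_check():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("host not reachable yet")
    return {"status": "ok"}


waits = []
report = retry_with_backoff(flaky_check, sleep=waits.append)  # record waits instead of sleeping
print(report, waits)
```

Passing the sleep function as a parameter is a design choice that keeps the helper testable: the demonstration records the delays in a list rather than actually pausing.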
For the full pexpect API reference, including advanced features like interact() (which hands control back to the human keyboard temporarily) and expect_exact() (faster plain-string matching without regex overhead), see the official pexpect documentation.