You have a Python script that makes 50 API calls or processes 100 image files, and it runs painfully slowly because it does each task one at a time. Every developer hits this wall. The fix is parallelism, but Python’s threading and multiprocessing modules can be verbose and error-prone. The concurrent.futures module is the modern answer — a clean, high-level interface that makes parallel execution almost as simple as a regular function call.
The concurrent.futures module ships with Python 3.2+ and requires zero installation. It gives you two executor classes: ThreadPoolExecutor for I/O-bound tasks (network requests, file operations) and ProcessPoolExecutor for CPU-bound tasks (image processing, number crunching). Both share the same API, so switching between them is usually a one-line change.
In this guide, we’ll cover how both executors work, when to choose threads vs processes, how to submit tasks and collect results with map() and submit(), how to handle errors gracefully, and how to process results as they complete with as_completed(). By the end, you’ll be able to turn any slow sequential loop into a fast parallel pipeline.
concurrent.futures: Quick Example
Here is a minimal example that downloads 5 URLs in parallel using a thread pool, replacing a sequential loop that would take several times longer:
```python
# quick_concurrent.py
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URLS = [
    "https://httpbin.org/delay/1",
    "https://httpbin.org/get",
    "https://httpbin.org/ip",
    "https://httpbin.org/uuid",
    "https://httpbin.org/user-agent",
]

def fetch(url):
    with urllib.request.urlopen(url, timeout=10) as response:
        return url, response.status

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(fetch, URLS))

for url, status in results:
    print(f"{status} -- {url}")
```
Output:
200 -- https://httpbin.org/delay/1
200 -- https://httpbin.org/get
200 -- https://httpbin.org/ip
200 -- https://httpbin.org/uuid
200 -- https://httpbin.org/user-agent
The with ThreadPoolExecutor(max_workers=5) as executor: block creates a pool of 5 threads and shuts them down cleanly when the block exits. executor.map(fetch, URLS) dispatches all 5 calls in parallel and returns results in the same order as the input. Sequentially, the run would take the sum of all five response times; in parallel it takes roughly as long as the slowest request, about 1 second here, dominated by the /delay/1 endpoint.
What Is concurrent.futures and When Should You Use It?
The concurrent.futures module provides a unified interface for running callables asynchronously. Under the hood, it manages worker threads or processes for you — no manual threading.Thread creation, no Queue wiring, no join() calls. You describe what to run, and the executor handles the rest.
The key question is which executor to use. Python's Global Interpreter Lock (GIL) means only one thread can execute Python bytecode at a time, so threads give no speedup for pure computation. However, the GIL is released during I/O operations, so threads do speed up I/O-bound work dramatically. Processes each have their own interpreter and GIL and run on separate CPU cores, making them the right choice for CPU-bound work.
| Task Type | Examples | Best Executor | Why |
|---|---|---|---|
| I/O-bound | HTTP requests, file reads, DB queries | ThreadPoolExecutor | GIL released during I/O; threads are lightweight |
| CPU-bound | Image processing, parsing, math | ProcessPoolExecutor | True parallelism across CPU cores; bypasses GIL |
| Mixed | Download + process | Both in pipeline | Thread pool to download, process pool to compute |
If you are unsure, start with ThreadPoolExecutor. It is simpler (no pickling overhead) and works well for most real-world tasks that involve any I/O at all.
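Because both executors share one API, trying the other is a one-line change. Here is a minimal sketch of that swap; the file name and the `run` helper are illustrative, not from the original article:

```python
# executor_swap.py -- hypothetical sketch: the two executor classes
# share the same interface, so switching is a one-line change.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def square(n):
    return n * n

def run(executor_cls, data):
    # Works identically with either executor class.
    with executor_cls(max_workers=4) as executor:
        return list(executor.map(square, data))

if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(run(ThreadPoolExecutor, data))   # [1, 4, 9, 16]
    # The one-line change for CPU-bound work:
    print(run(ProcessPoolExecutor, data))  # [1, 4, 9, 16]
```

Note that `square` must be a module-level function for the process-pool variant to work, for the pickling reasons covered later in this guide.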
ThreadPoolExecutor: Running I/O Tasks in Parallel
The ThreadPoolExecutor is the workhorse for network-heavy Python code. Create one with max_workers to control how many threads run simultaneously. A good starting number for HTTP requests is 10-20; going higher risks hitting server rate limits or exhausting local ports.
Using executor.map() for Uniform Tasks
executor.map(fn, iterable) is the easiest pattern. It mirrors Python’s built-in map() but runs the function in parallel. Results are returned in the same order as the input, even if some tasks finish earlier.
```python
# thread_map.py
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def fetch_length(url):
    """Return the byte length of a URL's response body."""
    try:
        with urllib.request.urlopen(url, timeout=10) as r:
            return url, len(r.read())
    except Exception as e:
        return url, f"ERROR: {e}"

urls = [
    "https://httpbin.org/get",
    "https://httpbin.org/headers",
    "https://httpbin.org/ip",
    "https://httpbin.org/uuid",
    "https://httpbin.org/anything",
]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(fetch_length, urls))
elapsed = time.perf_counter() - start

for url, length in results:
    print(f"{length:>8} {url}")
print(f"\nCompleted in {elapsed:.2f}s")
```
Output:
312 https://httpbin.org/get
183 https://httpbin.org/headers
45 https://httpbin.org/ip
53 https://httpbin.org/uuid
401 https://httpbin.org/anything
Completed in 0.61s
Running these 5 requests sequentially would take 2-4 seconds depending on network latency. In parallel, they all run at once and finish in the time it takes the slowest one to respond. The try/except inside fetch_length is important — if any URL fails and raises an exception inside executor.map(), the exception re-raises when you iterate the results.
Using executor.submit() for Flexible Futures
executor.submit(fn, *args) gives you more control. It returns a Future object immediately — a handle to a computation that may not have finished yet. You can collect futures and inspect them later, which is useful when tasks have different arguments or you want to process results as they arrive.
```python
# thread_submit.py
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed

def check_url(url, timeout=5):
    """Return (url, status_code) or (url, error_message)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as r:
            return url, r.status
    except urllib.error.HTTPError as e:
        # urlopen raises HTTPError for 4xx/5xx; the status code is still available
        return url, e.code
    except Exception as e:
        return url, f"FAILED: {type(e).__name__}"

urls = [
    "https://httpbin.org/status/200",
    "https://httpbin.org/status/404",
    "https://httpbin.org/status/500",
    "https://httpbin.org/delay/0",
    "https://httpbin.org/get",
]

with ThreadPoolExecutor(max_workers=5) as executor:
    # Submit all tasks and keep a dict mapping Future -> url
    future_to_url = {executor.submit(check_url, url): url for url in urls}

    # Process results as each Future completes (not in submission order)
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            _, status = future.result()
            print(f"[{status}] {url}")
        except Exception as exc:
            print(f"[EXCEPTION] {url}: {exc}")
```
Output (order varies by completion time):
[200] https://httpbin.org/get
[200] https://httpbin.org/delay/0
[404] https://httpbin.org/status/404
[500] https://httpbin.org/status/500
[200] https://httpbin.org/status/200
as_completed(future_to_url) yields futures in the order they finish, not the order they were submitted. This is ideal for displaying progress or handling results the moment they are ready. The future.result() call either returns the return value of your function or re-raises any exception that occurred inside the worker.
ProcessPoolExecutor: True Parallel CPU Work
For CPU-bound tasks, threads provide no speedup because the GIL prevents true parallel execution. ProcessPoolExecutor spawns separate Python interpreter processes, each with its own GIL and memory space, enabling genuine multi-core parallelism.
```python
# process_pool.py
import math
import time
from concurrent.futures import ProcessPoolExecutor

def is_prime(n):
    """CPU-intensive primality test."""
    if n < 2:
        return n, False
    if n == 2:
        return n, True
    if n % 2 == 0:
        return n, False
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return n, False
    return n, True

# Large numbers that require real computation to check
numbers = [
    999_999_937,
    999_999_929,
    999_999_893,
    999_999_883,
    999_999_877,
    999_999_613,
    999_999_541,
    999_999_527,
]

# The __main__ guard is required wherever the spawn start method is used
# (Windows and macOS): without it, each subprocess re-imports the module
# and Python raises a RuntimeError.
if __name__ == "__main__":
    start = time.perf_counter()
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(is_prime, numbers))
    elapsed = time.perf_counter() - start

    for n, prime in results:
        status = "PRIME" if prime else "composite"
        print(f"{n:>15,} {status}")
    print(f"\nChecked {len(numbers)} numbers in {elapsed:.2f}s")
```
Output:
999,999,937 PRIME
999,999,929 PRIME
999,999,893 PRIME
999,999,883 PRIME
999,999,877 PRIME
999,999,613 PRIME
999,999,541 PRIME
999,999,527 PRIME
Checked 8 numbers in 0.38s
Without a process pool, checking 8 large primes sequentially might take 1-2 seconds on a single core. With ProcessPoolExecutor, all 8 run on separate cores simultaneously. Note that code that creates processes must sit inside an if __name__ == '__main__': guard wherever the spawn start method is used, which is the default on Windows and, since Python 3.8, on macOS. Without the guard, each subprocess re-imports the module, tries to spawn its own children, and Python aborts with a RuntimeError. On Linux, where fork is the default, the guard is not strictly required but is still good practice.
The Pickling Constraint
Everything passed to a ProcessPoolExecutor must be picklable — Python’s serialization format used to send data between processes. This means functions defined at the module level (not inside other functions or as lambdas), and arguments that are built-in types, dataclasses, or picklable objects. This is the main gotcha that catches developers switching from ThreadPoolExecutor.
```python
# pickling_gotcha.py
from concurrent.futures import ProcessPoolExecutor

# This works fine -- module-level function
def double(x):
    return x * 2

# This will FAIL -- lambda is not picklable
transform = lambda x: x * 3

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        # OK:
        results = list(executor.map(double, [1, 2, 3, 4, 5]))
        print("double results:", results)

        # This raises PicklingError:
        # results = list(executor.map(transform, [1, 2, 3]))  # DO NOT DO THIS
```
Output:
double results: [2, 4, 6, 8, 10]
Timeouts and Cancellation
Production code must handle slow or hanging tasks. Both executors support deadlines: future.result(timeout=N) raises TimeoutError if that future has not finished within N seconds, and as_completed(futures, timeout=N) raises TimeoutError if any future is still pending N seconds after the call. Neither cancels the task itself, which keeps running in the background, but your main thread can move on.
```python
# timeout_example.py
import urllib.request
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed

def slow_fetch(url, delay_seconds=3):
    """Fetch a URL that deliberately delays the response."""
    full_url = f"https://httpbin.org/delay/{delay_seconds}"
    try:
        with urllib.request.urlopen(full_url, timeout=10) as r:
            return url, r.status
    except Exception as e:
        return url, f"ERROR: {e}"

tasks = [
    ("fast", 0),
    ("medium", 2),
    ("slow", 5),
]

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {
        executor.submit(slow_fetch, name, delay): name
        for name, delay in tasks
    }
    try:
        # 3-second deadline for the whole batch
        for future in as_completed(futures, timeout=3):
            name = futures[future]
            _, status = future.result()
            print(f"[OK] {name}: {status}")
    except TimeoutError:
        for future, name in futures.items():
            if not future.done():
                print(f"[TIMEOUT] {name}: took too long")
```
Output:
[OK] fast: 200
[OK] medium: 200
[TIMEOUT] slow: took too long
The "slow" task requested a 5-second delay, but as_completed(futures, timeout=3) gives up after 3 seconds, raises TimeoutError, and we report every future that has not finished. Note that calling future.result(timeout=N) on a future yielded by as_completed can never time out, because as_completed only yields futures that are already done; put the deadline on as_completed itself, as here, or call result(timeout=N) on futures directly. The underlying thread is still running after the timeout, and the with block will still wait for it on exit. It is your responsibility to design workers that can be abandoned safely. For true cancellation, consider using asyncio with task cancellation support instead.
Real-Life Example: Parallel Website Health Checker
Let’s build a practical tool that checks a list of URLs in parallel, reports status codes and response times, and flags any that fail or respond too slowly.
```python
# url_health_checker.py
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass

@dataclass
class CheckResult:
    url: str
    status: int
    elapsed_ms: float
    error: str = ""

def check_url(url):
    """Check a URL and return a CheckResult."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=8) as response:
            elapsed = (time.perf_counter() - start) * 1000
            return CheckResult(url=url, status=response.status, elapsed_ms=round(elapsed, 1))
    except urllib.error.HTTPError as e:
        elapsed = (time.perf_counter() - start) * 1000
        return CheckResult(url=url, status=e.code, elapsed_ms=round(elapsed, 1))
    except Exception as e:
        elapsed = (time.perf_counter() - start) * 1000
        return CheckResult(url=url, status=0, elapsed_ms=round(elapsed, 1), error=str(e))

def run_health_check(urls, max_workers=10, slow_threshold_ms=2000):
    """Run parallel health checks and print a report."""
    print(f"Checking {len(urls)} URLs with {max_workers} workers...\n")
    print(f"{'Status':<8} {'Time (ms)':>10} {'URL'}")
    print("-" * 60)

    results = []
    start_total = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {executor.submit(check_url, url): url for url in urls}
        for future in as_completed(future_to_url):
            result = future.result()
            results.append(result)
            flag = " [SLOW]" if result.elapsed_ms > slow_threshold_ms else ""
            flag = " [ERROR]" if result.error else flag
            flag = " [DOWN]" if result.status >= 500 else flag
            print(f"{result.status:<8} {result.elapsed_ms:>10.1f} {result.url}{flag}")

    total_elapsed = time.perf_counter() - start_total
    ok = sum(1 for r in results if 200 <= r.status < 300)
    print(f"\nDone in {total_elapsed:.2f}s | OK: {ok}/{len(urls)}")
    return results

if __name__ == "__main__":
    urls_to_check = [
        "https://httpbin.org/get",
        "https://httpbin.org/status/200",
        "https://httpbin.org/status/404",
        "https://httpbin.org/delay/1",
        "https://httpbin.org/ip",
        "https://httpbin.org/uuid",
    ]
    run_health_check(urls_to_check, max_workers=6)
```
Output:
Checking 6 URLs with 6 workers...
Status Time (ms) URL
------------------------------------------------------------
200 312.4 https://httpbin.org/get
200 198.7 https://httpbin.org/ip
200 201.1 https://httpbin.org/uuid
200 203.9 https://httpbin.org/status/200
404 195.3 https://httpbin.org/status/404
200 1203.8 https://httpbin.org/delay/1
Done in 1.21s | OK: 5/6
This checker runs all 6 requests in parallel, prints each result as it arrives (thanks to as_completed), and provides a summary. The @dataclass makes the result clean and typed. You can extend it by adding CSV export, retry logic for 5xx errors, or a configurable slow_threshold_ms alert.
Frequently Asked Questions
How many workers should I use?
For ThreadPoolExecutor with I/O-bound tasks, a common rule is 10-50 workers depending on the task. The default (when max_workers is omitted) is min(32, os.cpu_count() + 4) in Python 3.8+. For ProcessPoolExecutor with CPU-bound tasks, use os.cpu_count() or leave it as default (which matches CPU count). Too many workers adds overhead and can trigger rate limiting on the server side.
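You can inspect what a default pool actually created. A quick sketch, with the caveat that `_max_workers` is a private CPython attribute used here purely for illustration:

```python
# worker_defaults.py -- sketch: inspecting default pool sizes.
# Assumes CPython; _max_workers is an internal attribute and may change.
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

print("CPU count:", os.cpu_count())

with ThreadPoolExecutor() as tpe:
    # Python 3.8-3.12 default: min(32, os.cpu_count() + 4)
    print("Default thread workers:", tpe._max_workers)

with ProcessPoolExecutor() as ppe:
    # Default matches the number of CPUs
    print("Default process workers:", ppe._max_workers)
```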
When should I use map() vs submit()?
Use executor.map(fn, iterable) when all tasks are the same function with a single iterable argument and you want results in order. Use executor.submit(fn, *args) when tasks have different arguments, you need the results as they complete (via as_completed), or you want to inspect Future objects individually. For most batch processing, map() is simpler; for monitoring progress or mixed tasks, use submit().
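The contrast can be sketched in a few lines. Note that map() also accepts multiple iterables, zipping them into per-call arguments; the file name here is illustrative:

```python
# map_vs_submit.py -- hypothetical sketch contrasting the two patterns.
from concurrent.futures import ThreadPoolExecutor, as_completed

def add(a, b):
    return a + b

with ThreadPoolExecutor(max_workers=3) as executor:
    # map(): one function applied across iterables, results in input order.
    ordered = list(executor.map(add, [1, 2, 3], [10, 20, 30]))

    # submit(): arbitrary per-task arguments, results as they complete.
    futures = [executor.submit(add, a, b) for a, b in [(1, 1), (2, 2), (3, 3)]]
    completed = sorted(f.result() for f in as_completed(futures))

print(ordered)    # [11, 22, 33]
print(completed)  # [2, 4, 6]
```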
How do exceptions work inside workers?
Any exception raised inside a worker function is captured and stored in the Future. It is re-raised when you call future.result() or when iterating executor.map() results. With map(), the exception is raised at the point you access the failing result in the iterator -- so wrap the iteration in a try/except. With submit() and as_completed(), wrap each future.result() call individually so one failing task does not stop the rest.
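A minimal sketch of both behaviors, using a deliberately failing function of my own invention:

```python
# exception_demo.py -- sketch: where worker exceptions surface.
from concurrent.futures import ThreadPoolExecutor

def risky(n):
    if n == 3:
        raise ValueError(f"bad input: {n}")
    return n * 10

with ThreadPoolExecutor(max_workers=2) as executor:
    # With map(), the exception is re-raised while iterating,
    # at the position of the failing item.
    results = []
    try:
        for value in executor.map(risky, [1, 2, 3, 4]):
            results.append(value)
    except ValueError as e:
        print(f"map() raised: {e}")

    # With submit(), the exception is re-raised by future.result().
    future = executor.submit(risky, 3)
    try:
        future.result()
    except ValueError as e:
        print(f"result() raised: {e}")
```

Because map() raises at the failing position, `results` holds only the values gathered before the failure ([10, 20] here), which is why per-future handling with submit() is the safer pattern when partial results matter.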
When should I use concurrent.futures vs asyncio?
Use concurrent.futures when you have synchronous (blocking) functions you want to run in parallel without rewriting them. It works with any existing code. Use asyncio when you are writing new I/O-heavy code from scratch and want maximum concurrency with minimal thread overhead -- asyncio can handle thousands of concurrent connections in a single thread. You can also combine them: loop.run_in_executor() (or the higher-level asyncio.to_thread() in Python 3.9+) lets you run blocking code in a thread pool from inside an async function.
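A small sketch of that bridge, with time.sleep standing in for any blocking call:

```python
# asyncio_bridge.py -- sketch: awaiting blocking functions from async
# code via a thread pool.
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_io(n):
    time.sleep(0.1)  # stands in for a blocking network or disk call
    return n * 2

async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Schedule the blocking calls on the pool and await them together.
        tasks = [loop.run_in_executor(pool, blocking_io, n) for n in range(4)]
        return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(results)  # [0, 2, 4, 6]
```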
What does the with block do for executors?
Using with ThreadPoolExecutor() as executor: calls executor.shutdown(wait=True) when the block exits. This waits for all submitted futures to complete before proceeding. If you create an executor without the context manager, you must call executor.shutdown() manually or risk leaving threads/processes running after your script ends. The context manager is the safer and recommended pattern.
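The manual equivalent looks like this; a sketch only, since the with block is the recommended form:

```python
# manual_shutdown.py -- sketch: what the with block does for you.
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
future = executor.submit(pow, 2, 10)

# Without a with block you must shut down explicitly;
# wait=True blocks until all pending futures have finished.
executor.shutdown(wait=True)
print(future.result())  # 1024
```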
Conclusion
The concurrent.futures module gives you clean, high-level parallelism with minimal code. You have learned when to use ThreadPoolExecutor for I/O-bound tasks and ProcessPoolExecutor for CPU-bound work, how executor.map() delivers ordered results effortlessly, and how executor.submit() with as_completed() lets you handle results the moment they arrive. You also know how to handle timeouts, exceptions, and the pickling constraint that affects process pools.
The health checker example is a real starting point -- extend it to check your own URLs, write results to a CSV, or send Slack alerts when a site goes down. The pattern scales from 5 URLs to 5,000 with a single max_workers change. For the full API reference, see the Python concurrent.futures documentation.