Beginner

You kick off a script to process 10,000 records, or download a batch of files, or train a model — and then you wait. The terminal is blank. Is it working? Did it hang? How long until it finishes? The only answer you have is “I don’t know” and the only way to check is to add a print statement, kill the script, and run it again. This is a solved problem. The solution is tqdm.

tqdm is a Python library that wraps any iterable and displays a progress bar in the terminal automatically. It shows you the current iteration count, the percentage complete, the elapsed time, and the estimated time remaining — all in a single line that updates in place without scrolling. It works on Python loops, pandas DataFrames, file downloads, and async code. Install it with pip install tqdm.

In this article we’ll cover the basics of wrapping loops with tqdm, customizing the bar appearance, using tqdm.pandas() for DataFrame operations, building nested progress bars, and manually updating a bar for non-iterable tasks, plus answers to common questions about notebooks, generators, and multiprocessing. By the end you’ll never ship a long-running script without a progress bar again.

tqdm Quick Example

Adding a progress bar to an existing loop takes exactly one line of code — wrap the iterable with tqdm():

# quick_tqdm.py
import time
from tqdm import tqdm

items = range(50)

for item in tqdm(items, desc="Processing"):
    time.sleep(0.05)  # simulate work

Output (updates in place in the terminal):

Processing: 100%|####################| 50/50 [00:02<00:00, 19.6it/s]

The bar shows the description (“Processing”), percentage, item count, elapsed time, and throughput in items per second. No configuration needed. The key thing happening here is that tqdm(items) wraps the original iterable transparently — the loop body receives the same values it would have without tqdm. All the display logic is handled automatically.

What Is tqdm and When Should You Use It?

tqdm takes its name from the Arabic word “taqaddum” (تقدّم), meaning “progress”; the project also reads it as an abbreviation of the Spanish phrase “te quiero demasiado” (“I love you so much”). Today it’s one of the most downloaded Python packages, used everywhere from Jupyter notebooks to production ML pipelines, with roots in scientific computing where long-running data pipelines and model training jobs made progress feedback essential.

The core idea is deceptively simple: tqdm wraps an iterable and intercepts each iteration to update a counter and render the bar. Because it only hooks into the iterator protocol, it works with any Python iterable — lists, generators, file objects, database cursors, anything you can loop over.
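Because the hook is the iterator protocol, even a generator works, though tqdm cannot call len() on one, so pass total explicitly to get a percentage and ETA. A minimal sketch (record_stream is a made-up stand-in for any generator):

```python
from tqdm import tqdm

def record_stream(n):
    """A generator: tqdm cannot determine its length up front."""
    for i in range(n):
        yield {"id": i}

# Without total=, the bar shows only a count and speed;
# with total=, it shows percentage and ETA as well.
processed = [rec["id"] for rec in tqdm(record_stream(100), total=100, desc="Records")]
```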

| Use case | tqdm approach | What you get |
| --- | --- | --- |
| Simple loop | tqdm(iterable) | Progress bar with count and ETA |
| pandas apply | tqdm.pandas() + .progress_apply() | Row-by-row progress in DataFrames |
| Unknown total | tqdm(total=None) | Spinner-style counter (no percentage) |
| Manual tasks | bar.update(n) | Increment by any amount, any time |
| Nested loops | tqdm(leave=False) | Inner bar clears when done, outer persists |

The right time to add tqdm is any time a user (including you) will wait more than 2-3 seconds for a script to complete. Feedback while you wait is not cosmetic — it’s a signal that the program is alive and working correctly.

Wrapping a loop with tqdm
One import, one wrapper, zero excuse for blank terminals.

Basic tqdm Usage

Wrapping Any Iterable

The simplest use is wrapping a list or range directly. The desc parameter adds a label, and unit changes the “it/s” suffix to something more meaningful for your context:

# basic_loop.py
import time
from tqdm import tqdm

# Process a list of URLs (simulated)
urls = [f"https://api.example.com/item/{i}" for i in range(30)]

results = []
for url in tqdm(urls, desc="Fetching", unit="req"):
    time.sleep(0.1)  # simulate network call
    results.append({"url": url, "status": 200})

print(f"\nFetched {len(results)} items")

Output:

Fetching: 100%|####################| 30/30 [00:03<00:00,  9.8req/s]

Fetched 30 items

Using unit="req" changes the display from “it/s” to “req/s”, which makes the bar much more readable when you’re processing HTTP requests. Choose a unit that matches what you’re iterating over: “file”, “row”, “batch”, “img” — whatever makes the bar self-explanatory at a glance.
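When the natural unit is bytes, tqdm can also scale the numbers for you: unit_scale=True formats the running count with k/M prefixes, and unit_divisor=1024 switches from powers of 1000 to powers of 1024. A sketch using simulated chunk sizes and the manual update() API covered later in this article:

```python
from tqdm import tqdm

# 32 simulated chunks of 64 kB each
chunks = [64_000] * 32

with tqdm(total=sum(chunks), desc="Transferring", unit="B",
          unit_scale=True, unit_divisor=1024) as bar:
    for chunk in chunks:
        bar.update(chunk)  # the byte count renders with M/k prefixes, not raw digits
```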

Showing Live Metrics with postfix

The set_postfix() method lets you display live metrics next to the bar. This is invaluable for training loops where you want to show the current loss, accuracy, or any metric alongside the progress:

# postfix_example.py
import time
import random
from tqdm import tqdm

epochs = 10
losses = []

with tqdm(range(epochs), desc="Training", unit="epoch") as bar:
    for epoch in bar:
        time.sleep(0.3)  # simulate training step
        loss = round(1.0 / (epoch + 1) + random.uniform(0, 0.1), 4)
        acc = round(0.5 + epoch * 0.05 + random.uniform(0, 0.02), 4)
        losses.append(loss)
        bar.set_postfix(loss=loss, acc=acc)

print(f"\nFinal loss: {losses[-1]:.4f}")

Output (last frame shown):

Training: 100%|####################| 10/10 [00:03<00:00,  3.3epoch/s, loss=0.1012, acc=0.9743]

Final loss: 0.1012

Using tqdm as a context manager (the with tqdm(...) as bar: pattern) ensures the bar is properly closed even if the loop raises an exception. The set_postfix() call accepts any keyword arguments and displays them as key=value pairs at the end of the bar line. This pattern is ubiquitous in ML training scripts for a good reason — you can see at a glance if a training run is converging.
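A related convenience: tqdm ships trange(n), a shorthand for tqdm(range(n)) that accepts the same keyword arguments, which is a natural fit for epoch loops like the one above:

```python
from tqdm import trange

total = 0
for epoch in trange(5, desc="Training", unit="epoch"):
    total += epoch  # stand-in for a real training step
```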

Nested Progress Bars

When processing multiple files, each with multiple items, you often want an outer bar for files and an inner bar for items. The leave=False parameter on the inner bar makes it disappear when done, so only the outer bar persists in the terminal:

# nested_bars.py
import time
from tqdm import tqdm

datasets = ["train.csv", "test.csv", "validation.csv"]
rows_per_file = [1000, 250, 250]

for dataset, n_rows in tqdm(zip(datasets, rows_per_file), desc="Files", total=len(datasets)):
    for row in tqdm(range(n_rows), desc=f"  {dataset}", leave=False, unit="row"):
        time.sleep(0.001)  # simulate row processing

print("\nAll datasets processed.")

Output (inner bar clears after each file):

  train.csv: 100%|####################| 1000/1000 [00:01<00:00, 998.3row/s]
Files:  33%|######                | 1/3 [00:01<00:02,  1.02s/it]

The inner bar is indented with spaces in the description to visually show the hierarchy in the terminal. When leave=False is set, the inner bar erases itself when the inner loop completes, leaving the outer bar as the only persistent display. This prevents terminal clutter when processing hundreds of files with thousands of rows each.
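If you would rather keep both bars on screen at once instead of letting the inner one clear, the position parameter pins each bar to its own terminal line (0 is the topmost). A sketch, assuming a terminal that supports cursor movement:

```python
from tqdm import tqdm

outer = tqdm(total=3, desc="Files", position=0)
for _ in range(3):
    # Each inner bar redraws on line 1, below the outer bar on line 0
    inner = tqdm(total=100, desc="Rows", position=1, leave=False)
    for _ in range(100):
        inner.update(1)
    inner.close()
    outer.update(1)
outer.close()
```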

Nested progress bars
leave=False on the inner bar. Otherwise your terminal looks like a stack trace.

tqdm with pandas

Pandas apply() and map() operations can be slow on large DataFrames, and by default they give no feedback at all. tqdm.pandas() patches pandas to add progress tracking to these operations with a single setup call:

# tqdm_pandas.py
import time
import pandas as pd
from tqdm import tqdm

# One-time setup: patch pandas
tqdm.pandas(desc="Processing rows")

# Create sample DataFrame
df = pd.DataFrame({
    "name": [f"user_{i}" for i in range(500)],
    "score": range(500),
})

def slow_transform(row):
    """Simulate a slow per-row operation."""
    time.sleep(0.005)
    return row["name"].upper() + f"_{row['score'] * 2}"

# Use progress_apply instead of apply
df["result"] = df.progress_apply(slow_transform, axis=1)

print(df[["name", "result"]].head(3))

Output:

Processing rows: 100%|####################| 500/500 [00:02<00:00, 196.4it/s]

        name         result
0     user_0    USER_0_0
1     user_1    USER_1_2
2     user_2    USER_2_4

After calling tqdm.pandas(), replace any .apply() call with .progress_apply() and any .map() with .progress_map(). The bar automatically knows the total row count from the DataFrame shape. This is one of the most practical tqdm features because pandas operations on large DataFrames are a common source of “is it still running?” anxiety.
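progress_map works the same way on a Series. A minimal sketch:

```python
import pandas as pd
from tqdm import tqdm

tqdm.pandas(desc="Mapping")

s = pd.Series(range(5))
doubled = s.progress_map(lambda x: x * 2)  # .map() with a progress bar
```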

Manual Progress Bar Updates

Sometimes you can’t wrap a simple loop — maybe you’re processing variable-size batches, waiting for callbacks, or tracking progress through a recursive operation. In these cases you can manage the bar manually using tqdm(total=n) and bar.update():

# manual_bar.py
import time
from tqdm import tqdm

# Simulate downloading files of different sizes (in MB)
files = {"report.pdf": 5, "data.zip": 42, "model.pkl": 128, "logs.tar": 18}
total_mb = sum(files.values())

with tqdm(total=total_mb, desc="Downloading", unit="MB") as bar:
    for filename, size_mb in files.items():
        bar.set_description(f"Downloading {filename}")
        # Simulate chunked download
        chunk_size = max(1, size_mb // 10)
        downloaded = 0
        while downloaded < size_mb:
            time.sleep(0.05)
            chunk = min(chunk_size, size_mb - downloaded)
            downloaded += chunk
            bar.update(chunk)  # advance by actual bytes downloaded

print("\nAll downloads complete.")

Output (last frame):

Downloading model.pkl: 100%|####################| 193/193 [00:02<00:00, 83.9MB/s]

All downloads complete.

The bar.update(n) call advances the bar by n units, not by 1. This is critical when you want the bar to reflect meaningful units like bytes or megabytes rather than iteration counts. You can call update() as often as needed -- including from inside nested loops, callbacks, or threads. Just make sure the bar's total matches the sum of all your update() calls.
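The same pattern covers reading a real file in chunks: take the total from the file size and advance by the bytes actually read. A sketch, where payload.bin is a throwaway file created just for the demo:

```python
import os
from tqdm import tqdm

# Create a throwaway 256 kB file to stand in for a download source
path = "payload.bin"
with open(path, "wb") as f:
    f.write(os.urandom(256_000))

read = 0
with open(path, "rb") as src, tqdm(total=os.path.getsize(path),
                                   desc="Reading", unit="B",
                                   unit_scale=True) as bar:
    while True:
        chunk = src.read(64_000)
        if not chunk:
            break
        read += len(chunk)
        bar.update(len(chunk))  # advance by bytes actually read

os.remove(path)
```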

File download with tqdm
bar.update(chunk) -- because 42MB is not 42 iterations.

Real-Life Example: Batch File Processor with tqdm

Here's a practical script that processes a directory of text files -- counting words, detecting language, and writing a report -- with a nested progress bar showing both file-level and line-level progress:

# batch_processor.py
"""Process a directory of text files and generate a word count report."""
import os
import time
import random
from pathlib import Path
from collections import Counter
from tqdm import tqdm


def simulate_process_line(line: str) -> dict:
    """Simulate per-line processing (e.g. NLP tagging)."""
    time.sleep(0.002)
    words = line.split()
    return {"word_count": len(words), "char_count": len(line)}


def process_file(filepath: Path) -> dict:
    """Process a single file and return stats."""
    # Simulate file with random lines
    lines = [f"This is line {i} of the file with some content." for i in range(random.randint(20, 60))]
    total_words = 0
    total_chars = 0

    for line in tqdm(lines, desc=f"  {filepath.name}", leave=False, unit="line"):
        result = simulate_process_line(line)
        total_words += result["word_count"]
        total_chars += result["char_count"]

    return {
        "file": filepath.name,
        "lines": len(lines),
        "words": total_words,
        "chars": total_chars,
    }


def main():
    # Simulate a list of files
    files = [Path(f"document_{i:02d}.txt") for i in range(8)]
    results = []

    print(f"Processing {len(files)} files...\n")

    for filepath in tqdm(files, desc="Files", unit="file"):
        stats = process_file(filepath)
        results.append(stats)

    # Print summary report
    print("\n=== Report ===")
    total_words = sum(r["words"] for r in results)
    for r in results:
        print(f"{r['file']}: {r['lines']} lines, {r['words']} words")
    print(f"\nTotal words across all files: {total_words}")


if __name__ == "__main__":
    main()

Output (final state):

Processing 8 files...

Files: 100%|####################| 8/8 [00:01<00:00,  6.8file/s]

=== Report ===
document_00.txt: 45 lines, 450 words
document_01.txt: 33 lines, 330 words
...
Total words across all files: 3190

This pattern -- an outer bar for files, an inner bar with leave=False for per-file work -- scales well to thousands of files. The inner bar gives real-time feedback on large files while the outer bar tracks overall progress. Extend this by replacing simulate_process_line() with your actual NLP, parsing, or transformation logic.

Frequently Asked Questions

Why isn't the progress bar showing in my Jupyter notebook?

In Jupyter, use from tqdm.notebook import tqdm instead of from tqdm import tqdm. The notebook version renders an HTML progress bar widget instead of a text bar, which handles Jupyter's output buffering correctly. The API is identical -- you can use the same tqdm(iterable, desc=...) pattern. The tqdm.auto import automatically chooses the right version based on the environment: from tqdm.auto import tqdm works in both terminal scripts and notebooks.

Does tqdm work with generators?

Yes, but without a total parameter the bar can't show a percentage -- it shows a counter and throughput only. If you know the total ahead of time, pass it explicitly: tqdm(my_generator, total=1000). If you genuinely don't know the total, tqdm will still show iteration count and speed, which is much more informative than a blank terminal. For generators from database queries, consider getting the count first with a COUNT(*) query and passing it as total.

How do I disable tqdm in production or tests?

Pass disable=True to any tqdm call, or use an environment variable check: tqdm(iterable, disable=os.getenv("CI") == "true"). This pattern lets CI pipelines run without progress bar output cluttering the logs, while development runs still get the full bar. You can also use tqdm(iterable, disable=not sys.stdout.isatty()) to automatically disable the bar when output is being piped or redirected.

Does tqdm work with multiprocessing?

Yes, but you need to use tqdm.contrib.concurrent or manage the bar manually. For simple parallel work, from tqdm.contrib.concurrent import process_map is a drop-in replacement for multiprocessing.Pool.map that adds a progress bar automatically. For more complex cases, use a tqdm instance with a multiprocessing.Queue to collect updates from worker processes and call bar.update() from the main process.

Can I print to stdout while tqdm is running?

Use tqdm.write("message") instead of print() inside a tqdm loop. Regular print() interferes with the bar's in-place update mechanism and leaves garbled output. tqdm.write() temporarily clears the bar, prints the message, and re-draws the bar below it -- so both your messages and the progress bar display cleanly.
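A minimal sketch of the pattern:

```python
from tqdm import tqdm

flagged = []
for i in tqdm(range(50), desc="Scanning"):
    if i % 25 == 0:
        tqdm.write(f"checkpoint at item {i}")  # prints above the bar, then redraws it
        flagged.append(i)
```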

Conclusion

tqdm turns a blank waiting terminal into an informative, reassuring feedback loop. We covered wrapping iterables with tqdm(), showing live metrics with set_postfix(), using tqdm.pandas() for DataFrame operations, building nested bars with leave=False, and manually controlling a bar with bar.update(n). These patterns cover 95% of the use cases you'll encounter in real scripts.

The best next step is to go through your existing scripts and add tqdm to any loop that takes more than a second. The one-line wrapper -- for item in tqdm(items): -- is all you need to start. From there, try tqdm.auto if you work across notebook and terminal environments, and explore tqdm.contrib.concurrent when you add multiprocessing to your pipelines.

For the full API reference, see the official tqdm documentation.