Every Python developer eventually builds the same small utilities: a function to chunk a list into fixed-size batches, a dict that remembers its insertion order and lets you access values by index, a memoization decorator that actually handles edge cases, a string formatter that gracefully handles missing keys. These snippets get copy-pasted from project to project, slightly different each time, and break in unpredictable ways.
boltons is a library of high-quality, production-tested utilities that solve exactly these recurring problems. It covers iterables, data structures, functional programming helpers, file utilities, text processing, caching, and more — all implemented with far more care than the typical StackOverflow snippet. The library is modular: you can import just the pieces you need without pulling in the whole thing.
This article covers the most useful parts of boltons across five categories: iterable utilities, ordered data structures, functional programming tools, text helpers, and caching. Each section includes runnable examples and real-world use cases. By the end you will have a new toolkit of reliable utilities to reach for before writing your own.
Chunking a List: Quick Example
One of the most commonly reinvented wheels in Python is splitting a list into fixed-size chunks. boltons.iterutils.chunked() does it cleanly in one line:
# quick_boltons.py
from boltons.iterutils import chunked
items = list(range(1, 13))
batches = chunked(items, 4)
for batch in batches:
    print(batch)
Output:
[1, 2, 3, 4]
[5, 6, 7, 8]
[9, 10, 11, 12]
chunked() splits an iterable into lists of at most size elements. The last chunk may be shorter if the iterable’s length is not a multiple of size. It returns a list of lists — simple, correct, and no edge-case bugs. This is the boltons philosophy in miniature: a utility so common it should exist, implemented well enough that you never need to think about it again.
What Is boltons and What Problems Does It Solve?
boltons was created by Mahmoud Hashemi as a collection of utilities that feel like they belong in the Python standard library but either predate it or are too specialized to justify inclusion. Unlike most utility libraries, boltons imposes strict quality standards: every module is documented, tested, and designed to have no dependencies beyond the standard library.
| Module | What it provides |
|---|---|
| iterutils | chunking, flattening, unique filtering, windowed iteration |
| dictutils | multi-value ordered dicts (OMD), inverted dicts, subdict views |
| funcutils | improved wraps, partial helpers, function metadata inspection |
| strutils | camelCase/snake_case conversion, slugification, human-readable sizes |
| cacheutils | LRU cache, LRI cache, cached decorators for memoization |
| fileutils | atomic file writing, temp directory management |
| timeutils | human-readable durations, parse_timedelta |
Install with pip install boltons. The library is pure Python and has no dependencies, which makes it safe to add to any project without risk of version conflicts.
Iterable Utilities
Flattening Nested Lists
Nested lists appear constantly when aggregating results from async jobs, parsing JSON with variable depth, or working with hierarchical data. boltons.iterutils.flatten() handles arbitrary nesting depth and mixed types safely.
# flatten_demo.py
from boltons.iterutils import flatten, flatten_iter
nested = [1, [2, 3], [4, [5, 6]], 7]
print('Flat list:', flatten(nested))
# flatten_iter returns a generator for memory efficiency
print('Flat iter:', list(flatten_iter([['a', 'b'], ['c', ['d', 'e']]])))
# remap: transform nested data structures
from boltons.iterutils import remap
data = {'users': [{'name': 'alice', 'score': None}, {'name': 'bob', 'score': 42}]}
def drop_none(path, key, value):
    return value is not None  # Return False to drop this key
clean = remap(data, visit=drop_none)
print('Cleaned:', clean)
Output:
Flat list: [1, 2, 3, 4, 5, 6, 7]
Flat iter: ['a', 'b', 'c', 'd', 'e']
Cleaned: {'users': [{'name': 'alice'}, {'name': 'bob', 'score': 42}]}
remap() is one of boltons' most powerful utilities -- it traverses any nested Python structure (dicts, lists, tuples) and applies a visitor function to every key-value pair. The visitor returns True to keep the item as-is, False to drop it, or a (new_key, new_value) tuple to replace it. It is the cleanest way to strip None values from an API response or sanitize deep config structures.
Sliding Windows and Unique Filtering
windowed() yields overlapping windows of a fixed size — essential for signal processing, time-series analysis, or any algorithm that needs to look at N consecutive items.
# windowed_unique.py
from boltons.iterutils import windowed, unique
temps = [22.1, 22.3, 22.8, 23.5, 23.1, 22.9]
print('3-point windows:')
for window in windowed(temps, 3):
    avg = sum(window) / len(window)
    print(f' {window} -> avg {avg:.2f}')
# unique: deduplicate while preserving order
tags = ['python', 'async', 'python', 'web', 'async', 'api']
print('Unique tags:', unique(tags))
Output:
3-point windows:
(22.1, 22.3, 22.8) -> avg 22.40
(22.3, 22.8, 23.5) -> avg 22.87
(22.8, 23.5, 23.1) -> avg 23.13
(23.5, 23.1, 22.9) -> avg 23.17
Unique tags: ['python', 'async', 'web', 'api']
unique() is the order-preserving dedup that set() is not -- it keeps the first occurrence of each item and discards duplicates while maintaining the original sequence. This is critical when order matters (tags, categories, pipeline stages) and you cannot afford the arbitrary reordering that set() introduces.
Dict Utilities
OrderedMultiDict: Multiple Values Per Key
HTTP query strings and HTML form data can have multiple values for the same key (?tag=python&tag=async). Python’s regular dict drops duplicates. boltons.dictutils.OrderedMultiDict preserves all of them.
# multi_dict.py
from boltons.dictutils import OrderedMultiDict
# Simulate parsed query string with repeated keys
params = OrderedMultiDict([
    ('tag', 'python'),
    ('tag', 'async'),
    ('tag', 'web'),
    ('page', '1'),
    ('sort', 'date'),
])
print('All tag values:', params.getlist('tag'))
print('Most recent tag:', params['tag'])
print('Page:', params['page'])
print('All keys:', list(params.keys(multi=True)))
Output:
All tag values: ['python', 'async', 'web']
Most recent tag: web
Page: 1
All keys: ['tag', 'tag', 'tag', 'page', 'sort']
getlist(key) returns all values for a key as a list, while params[key] returns the most recently added value -- a deliberate design choice that keeps the plain dict API usable. This structure is used internally by HTTP frameworks like Werkzeug -- boltons provides a standalone version for when you need it outside a web framework context.
Functional Utilities
Memoization with boltons.cacheutils
boltons has no standalone memoize decorator; the idiomatic tool is boltons.cacheutils.cached, which turns any dict-like object into the backing store for a memoization decorator. It handles keyword arguments and None return values correctly -- common failure modes of simple hand-rolled memoization implementations.
# memoize_demo.py
from boltons.cacheutils import cached
import time
@cached({})  # a plain dict as the backing cache gives unbounded memoization
def slow_fibonacci(n):
    if n <= 1:
        return n
    return slow_fibonacci(n - 1) + slow_fibonacci(n - 2)
start = time.perf_counter()
result = slow_fibonacci(35)
elapsed = time.perf_counter() - start
print(f'fib(35) = {result} ({elapsed:.4f}s)')
# Call again -- instantly from cache
start = time.perf_counter()
result = slow_fibonacci(35)
elapsed = time.perf_counter() - start
print(f'fib(35) cached = {result} ({elapsed:.6f}s)')
Output:
fib(35) = 9227465 (0.0021s)
fib(35) cached = 9227465 (0.000003s)
Note that a plain dict as the backing cache is unbounded -- it stores every unique argument combination forever. For production use with large argument spaces, pass one of the bounded caches from cacheutils instead (covered below).
Caching with cacheutils
boltons.cacheutils provides two ready-to-use caches -- LRU (evicts the least-recently used entry) and LRI (evicts the least-recently inserted entry) -- plus ThresholdCounter, a space-bounded approximate counter. The LRU and LRI use an internal lock, making them safe to share across threads.
# cache_demo.py
from boltons.cacheutils import LRU
# LRU cache with max 3 entries
cache = LRU(max_size=3)
def fetch_user(user_id):
    # Simulate a DB query
    if user_id in cache:
        print(f' Cache hit: {user_id}')
        return cache[user_id]
    print(f' Cache miss: {user_id} (querying DB)')
    result = {'id': user_id, 'name': f'user_{user_id}'}
    cache[user_id] = result
    return result
for uid in [1, 2, 3, 1, 4, 2]:  # inserting uid=4 evicts the least-recently-used entry
    user = fetch_user(uid)
    print(f' Got: {user["name"]}')
print(f'\nCache size: {len(cache)}, keys: {list(cache.keys())}')
Output:
 Cache miss: 1 (querying DB)
 Got: user_1
 Cache miss: 2 (querying DB)
 Got: user_2
 Cache miss: 3 (querying DB)
 Got: user_3
 Cache hit: 1
 Got: user_1
 Cache miss: 4 (querying DB)
 Got: user_4
 Cache miss: 2 (querying DB)
 Got: user_2
Cache size: 3, keys: [1, 4, 2]
When uid=4 is fetched and the cache is full, the least-recently-used entry is evicted -- that is uid=2, because the earlier hit on uid=1 refreshed its recency. Re-fetching uid=2 is therefore a miss, and inserting it evicts uid=3 in turn. The LRU object behaves like a dict -- you can check if key in cache and set with cache[key] = value, making it a transparent drop-in for most caching patterns.
Text Utilities
# strutils_demo.py
from boltons.strutils import slugify, camel2under, under2camel, bytes2human
# URL-friendly slugs (the default delimiter is '_'; pass delim='-' for hyphens)
title = 'Python asyncio -- Advanced Patterns & Best Practices'
print('Slug:', slugify(title, delim='-'))
# Case conversion
class_name = 'MyAwesomeDataParser'
print('Snake case:', camel2under(class_name))
field_name = 'created_at_timestamp'
print('Camel case:', under2camel(field_name))
# Human-readable byte sizes (second argument = number of decimal digits)
print(bytes2human(1024, 1))
print(bytes2human(1_500_000, 1))
print(bytes2human(2_400_000_000, 1))
Output:
Slug: python-asyncio-advanced-patterns-best-practices
Snake case: my_awesome_data_parser
Camel case: CreatedAtTimestamp
1.0K
1.4M
2.2G
slugify() handles Unicode normalization, punctuation stripping, and delimiter joining in one call -- the same kind of normalization Django applies when generating URL slugs. The case conversion utilities are useful when bridging JSON APIs (camelCase) with Python code (snake_case). bytes2human() saves you from writing a unit-conversion lookup table every time you need to display file sizes.
Real-Life Example: API Response Cleaner and Cache
# api_cache.py
from boltons.iterutils import remap
from boltons.cacheutils import LRU
from boltons.strutils import slugify
# Simulate fetched user profiles with messy data
RAW_PROFILES = {
    1: {'name': 'Alice Wan', 'bio': None, 'tags': ['python', 'async', 'python'], 'score': 95},
    2: {'name': 'Bob Smith', 'bio': 'Engineer', 'tags': ['web', 'api', None], 'score': None},
    3: {'name': 'Carol Jones', 'bio': None, 'tags': ['data', 'ml'], 'score': 88},
}
profile_cache = LRU(max_size=10)
def clean_profile(raw):
    # Remove None values at any depth
    cleaned = remap(raw, visit=lambda p, k, v: v is not None)
    # Deduplicate tags while preserving order
    if 'tags' in cleaned:
        seen = set()
        cleaned['tags'] = [t for t in cleaned['tags'] if not (t in seen or seen.add(t))]
    # Add a URL-safe slug from the name
    if 'name' in cleaned:
        cleaned['slug'] = slugify(cleaned['name'], delim='-')
    return cleaned
def get_profile(user_id):
    if user_id in profile_cache:
        return profile_cache[user_id]
    raw = RAW_PROFILES.get(user_id)
    if raw is None:
        return None
    cleaned = clean_profile(raw)
    profile_cache[user_id] = cleaned
    return cleaned
for uid in [1, 2, 3, 1]:
    profile = get_profile(uid)
    print(f'User {uid}: {profile}')
print(f'\nCache contains {len(profile_cache)} entries')
Output:
User 1: {'name': 'Alice Wan', 'tags': ['python', 'async'], 'score': 95, 'slug': 'alice-wan'}
User 2: {'name': 'Bob Smith', 'bio': 'Engineer', 'tags': ['web', 'api'], 'slug': 'bob-smith'}
User 3: {'name': 'Carol Jones', 'tags': ['data', 'ml'], 'score': 88, 'slug': 'carol-jones'}
User 1: {'name': 'Alice Wan', 'tags': ['python', 'async'], 'score': 95, 'slug': 'alice-wan'}
The pipeline uses three boltons utilities: remap() to strip None values deep in the nested structure, manual order-preserving dedup for tags, and LRU to avoid re-cleaning the same profile on repeat fetches. The slugify() call adds a URL-friendly identifier derived from the display name. In a FastAPI context you would call get_profile() from your endpoint and serve the cleaned result directly to clients.
Frequently Asked Questions
Why use boltons when the standard library has functools and itertools?
boltons extends, not replaces, the standard library. functools.lru_cache is excellent, but it hides its cache inside the decorator; boltons' cached() lets you supply a cache object you can inspect, share, and bound however you like. itertools has no windowed(), no order-preserving unique(), and no nested remap(). boltons fills these gaps with the same quality bar as the standard library while adding useful extras.
Does boltons have any dependencies?
No. boltons is pure Python with zero third-party dependencies. This makes it safe to include in any project -- no version conflicts, no C extensions to compile, no transitive dependency chain to audit. It supports Python 2.7+ through Python 3.x, though new code should target Python 3.
Can remap() handle infinite recursion or circular references?
No. remap() does not handle circular references and will raise a RecursionError on deeply nested or circular structures. If your data might have circular references (e.g., ORM model instances), convert to a plain dict first using your ORM's serializer before passing to remap().
Is the boltons LRU cache thread-safe?
Yes. boltons.cacheutils.LRU uses an internal lock and is safe for use in multithreaded code. This differs from a simple dict-based cache that can cause race conditions during concurrent writes. For asyncio code use a separate locking pattern (e.g., asyncio.Lock) since thread locks do not integrate with the event loop.
Can I import only specific modules from boltons?
Yes. boltons is fully modular -- each module (iterutils, cacheutils, etc.) can be imported independently. You can even vendor a single module by copying the relevant .py file into your project. This is by design: the library is built to be a collection of independent, copy-pasteable utilities.
Conclusion
You have covered boltons' most useful modules: iterutils for chunking, flattening, windowed iteration, and order-preserving unique filtering; dictutils for multi-value dicts; cacheutils for memoization and LRU caching; and strutils for slugification, case conversion, and human-readable sizes. The API cache example shows how combining a few boltons utilities creates a clean, maintainable data pipeline.
The full library contains many more modules worth exploring: tableutils for tabular data, fileutils for atomic file writes, timeutils for duration parsing, and statsutils for descriptive statistics. Browse the official boltons documentation to see what else might replace a piece of custom utility code in your projects.