You build a cache that stores hundreds of large objects in memory, and your Python process gradually eats more and more RAM until the OS kills it. The root cause is often simple: normal dictionary references keep objects alive even after every other part of your code has stopped using them. Python’s weakref module solves this with weak references — pointers that do not prevent garbage collection.
The weakref module is built into Python's standard library. You do not need to install anything. Weak references work with any class that supports the __weakref__ slot, which includes all user-defined classes by default (the exception is a class that defines __slots__ without listing '__weakref__').
In this article you will learn how weak references differ from strong references, how to use weakref.ref(), WeakValueDictionary, WeakKeyDictionary, and finalize() for cleanup callbacks, and when to use each tool in real code.
Python weakref: Quick Example
Here is a minimal demonstration of how a weak reference lets an object be garbage-collected while a normal reference would keep it alive:
# weakref_quick.py
import weakref
import gc

class DataChunk:
    def __init__(self, name):
        self.name = name

chunk = DataChunk("chunk-A")
weak = weakref.ref(chunk)       # weak reference -- does NOT prevent GC
print(f"Before del: {weak()}")  # dereference the weak ref
del chunk                       # drop the only strong reference
gc.collect()                    # force garbage collection
print(f"After del: {weak()}")   # None -- object was collected
Output:
Before del: <__main__.DataChunk object at 0x7f1a2b3c4d00>
After del: None
Calling weak() returns the object if it is still alive, or None if it has been garbage-collected. This is the core pattern: always check for None before using the result of a weak reference call.
What Are Weak References and Why Use Them?
In CPython, every normal (strong) reference to an object increments its reference count. An object is reclaimed as soon as that count reaches zero, and a separate cyclic garbage collector handles groups of objects that only refer to each other. A weak reference does not increment the reference count: it is an observer, not an owner.
Think of it like a library card. Owning a book (strong reference) keeps it on your shelf. Having a library card for that book (weak reference) lets you find it if someone else still has it checked out, but the book can be returned to circulation without asking your permission first.
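The claim that a weak reference does not count as an owner can be checked directly. The following is a minimal sketch, assuming CPython (where sys.getrefcount is meaningful); the Payload class is a hypothetical stand-in.

```python
import sys
import weakref


class Payload:
    """Hypothetical class used only for this illustration."""


obj = Payload()
before = sys.getrefcount(obj)        # baseline count for one strong reference

weak = weakref.ref(obj)              # weak reference: count should not change
after_weak = sys.getrefcount(obj)

strong = obj                         # second strong reference: count goes up
after_strong = sys.getrefcount(obj)

print(after_weak - before)           # 0
print(after_strong - before)         # 1
print(weakref.getweakrefcount(obj))  # 1
```

Only the differences are printed, because the absolute value reported by sys.getrefcount includes temporary references made by the call itself.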
| Reference Type | Keeps Object Alive? | Use Case |
|---|---|---|
| Strong (normal) | Yes | Most code, where you own the object |
| Weak (weakref.ref) | No | Caches, observers, callbacks |
| WeakValueDictionary | No (values) | Object registries and caches |
| WeakKeyDictionary | No (keys) | Per-object metadata storage |
Weak references are most valuable in three scenarios: caching computed results, implementing the observer pattern without the subject keeping its observers alive, and storing per-object metadata without preventing cleanup.
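The observer scenario might look like the sketch below. It uses weakref.WeakSet, a standard-library container this article does not otherwise cover; the Publisher and Listener classes and their method names are hypothetical.

```python
import gc
import weakref


class Listener:
    """Hypothetical observer; records the events it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def notify(self, event):
        self.received.append(event)


class Publisher:
    """Hypothetical subject that holds its observers weakly."""

    def __init__(self):
        self._listeners = weakref.WeakSet()

    def subscribe(self, listener):
        self._listeners.add(listener)

    def publish(self, event):
        # Copy to a list first so strong references exist while iterating.
        for listener in list(self._listeners):
            listener.notify(event)

    def listener_count(self):
        return len(self._listeners)


publisher = Publisher()
a = Listener("a")
b = Listener("b")
publisher.subscribe(a)
publisher.subscribe(b)

publisher.publish("started")
del b                              # the publisher does not keep b alive
gc.collect()
publisher.publish("stopped")

print(a.received)                  # ['started', 'stopped']
print(publisher.listener_count())  # 1
```

Because the set holds its members weakly, a listener that forgets to unsubscribe is simply dropped once nothing else refers to it.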
Using weakref.ref()
The basic weakref.ref(obj) call creates a callable weak reference. Call it with no arguments to get the current referent, or None if the object is gone. Always check for None before proceeding.
# weakref_ref.py
import weakref
import gc

class Connection:
    def __init__(self, host):
        self.host = host

    def query(self):
        return f"SELECT * FROM table ON {self.host}"

conn = Connection("db.example.internal")
ref = weakref.ref(conn)

# Safe dereference pattern
def use_connection(ref):
    obj = ref()  # may return None
    if obj is None:
        print("Connection was garbage-collected, reconnecting...")
        return
    print(obj.query())

use_connection(ref)  # Object still alive
del conn
gc.collect()
use_connection(ref)  # Object gone
Output:
SELECT * FROM table ON db.example.internal
Connection was garbage-collected, reconnecting...
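The if obj is None check is mandatory. Skipping it, for example by writing ref().query() directly, raises AttributeError on None once the object is gone. Assigning the result to a local variable first also matters: obj is a strong reference, so the object cannot disappear between the check and the use. This defensive pattern mirrors how you handle optional values in any language.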
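weakref.ref() also accepts an optional second argument, a callback that is invoked when the referent is collected. A minimal sketch, assuming a hypothetical Session class and an events list used only to record what happened:

```python
import gc
import weakref


class Session:
    """Hypothetical class for this illustration."""

    def __init__(self, user):
        self.user = user


events = []


def on_collected(dead_ref):
    # The callback receives the weak reference object itself;
    # by this point dead_ref() already returns None.
    events.append(dead_ref() is None)


session = Session("alice")
ref = weakref.ref(session, on_collected)

live = ref()                 # strong reference for the duration of use
if live is not None:
    print(live.user)         # alice
del live, session            # drop every strong reference
gc.collect()

print(ref() is None)         # True
print(events)                # [True]
```

The callback only fires if the weak reference object itself is still alive, so keep ref around for as long as you want the notification.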
Object Caches with WeakValueDictionary
WeakValueDictionary is a dictionary where the values are weak references. When the last strong reference to a value is dropped, the entry is automatically removed from the dictionary. This makes it ideal for caches where you want objects to live as long as something else is using them, but not a moment longer.
# weak_value_dict.py
import weakref
import gc

class ImageBuffer:
    def __init__(self, name, data):
        self.name = name
        self.data = data

    def __repr__(self):
        return f"ImageBuffer({self.name!r})"

cache = weakref.WeakValueDictionary()

# Load images
img_a = ImageBuffer("hero.png", b"binary-data-a")
img_b = ImageBuffer("background.png", b"binary-data-b")
cache["hero"] = img_a
cache["background"] = img_b
print(f"Cache size: {len(cache)}")
print(f"hero: {cache.get('hero')}")

# Drop strong reference to img_a
del img_a
gc.collect()
print(f"After del img_a -- cache size: {len(cache)}")
print(f"hero: {cache.get('hero')}")
print(f"background: {cache.get('background')}")
Output:
Cache size: 2
hero: ImageBuffer('hero.png')
After del img_a -- cache size: 1
hero: None
background: ImageBuffer('background.png')
The cache.get() method returns None for missing keys rather than raising KeyError, which is exactly the pattern you want for cache lookups. If the item is missing, fetch it fresh and store the new strong reference — the cache entry will persist as long as you hold that reference.
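A common pitfall follows from this: an object stored in the cache with no other owner does not stay there. The sketch below contrasts that with a get-or-load helper; Thumbnail and get_or_load are hypothetical names, and the immediate disappearance of the orphan entry assumes CPython's reference counting.

```python
import gc
import weakref


class Thumbnail:
    """Hypothetical value type for this illustration."""

    def __init__(self, name):
        self.name = name


cache = weakref.WeakValueDictionary()

# Pitfall: nothing else refers to this object, so the entry does not last.
cache["orphan"] = Thumbnail("orphan.png")
gc.collect()
print("orphan" in cache)                 # False


# Working pattern: the caller keeps the strong reference that is returned.
def get_or_load(name):
    item = cache.get(name)
    if item is None:
        item = Thumbnail(name)           # stand-in for an expensive load
        cache[name] = item
    return item


kept = get_or_load("logo.png")
print(get_or_load("logo.png") is kept)   # True
print(len(cache))                        # 1
```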
Per-Object Metadata with WeakKeyDictionary
WeakKeyDictionary stores weak references as keys. When the key object is garbage-collected, the entry disappears. Use this to attach metadata to objects without modifying their class and without preventing their cleanup.
# weak_key_dict.py
import weakref
import gc

class Widget:
    def __init__(self, name):
        self.name = name

# Attach timestamps externally without modifying Widget
timestamps = weakref.WeakKeyDictionary()
btn = Widget("submit-button")
lbl = Widget("title-label")
timestamps[btn] = "2026-04-22T09:14:37"
timestamps[lbl] = "2026-04-22T09:14:38"
print(f"btn timestamp: {timestamps.get(btn)}")

del btn
gc.collect()
print(f"Entries remaining: {len(timestamps)}")
print(f"lbl timestamp: {timestamps.get(lbl)}")
Output:
btn timestamp: 2026-04-22T09:14:37
Entries remaining: 1
lbl timestamp: 2026-04-22T09:14:38
This pattern is used in frameworks to store per-object data (event listeners, profiling data, debug annotations) without coupling that data to the object’s class. When the object dies, the metadata vanishes with it automatically.
Cleanup Callbacks with weakref.finalize()
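weakref.finalize(obj, callback, *args, **kwargs) registers a function to call when obj is garbage-collected. Unlike __del__, a finalizer can be attached to any weak-referenceable object, runs at most once, and by default also runs at interpreter exit if the object is still alive. The callback and its arguments must not hold a strong reference to obj, or the object will never be collected.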
# weakref_finalize.py
import weakref
import gc

class TempFile:
    def __init__(self, path):
        self.path = path
        print(f"TempFile opened: {path}")

def cleanup(path):
    print(f"Cleanup: would delete {path}")

tf = TempFile("/tmp/report_20260422.csv")
weakref.finalize(tf, cleanup, tf.path)
print("Dropping reference...")
del tf
gc.collect()
print("Done.")
Output:
TempFile opened: /tmp/report_20260422.csv
Dropping reference...
Cleanup: would delete /tmp/report_20260422.csv
Done.
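The callback receives the arguments you passed at registration time, not the object itself. That is deliberate: passing the object (or one of its bound methods) would create a strong reference that keeps it alive forever, so pass only the plain data the cleanup needs. This is the right way to register teardown logic for objects whose lifetime you do not fully control.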
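The object returned by weakref.finalize() can also be inspected, called early, or cancelled. The sketch below exercises its alive attribute, direct call, and detach() method; Resource, release, and log are hypothetical names for illustration.

```python
import gc
import weakref


class Resource:
    """Hypothetical resource holder for this illustration."""

    def __init__(self, label):
        self.label = label


log = []


def release(label):
    log.append(f"released {label}")


res = Resource("scratch-1")
# Pass plain data (the label), never res itself or a bound method of res.
fin = weakref.finalize(res, release, res.label)
print(fin.alive)        # True

fin()                   # run the cleanup early, by hand
print(fin.alive)        # False

del res
gc.collect()            # the callback does not run a second time
print(log)              # ['released scratch-1']

other = Resource("scratch-2")
fin2 = weakref.finalize(other, release, other.label)
fin2.detach()           # cancel: release() will not be called for `other`
del other
gc.collect()
print(log)              # ['released scratch-1']
```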
Real-Life Example: Self-Evicting Image Cache
Here is a practical image loader that uses WeakValueDictionary as a first-level cache. Images stay cached as long as any part of the application holds a reference. When the caller drops its reference, the cache entry disappears automatically — no explicit eviction needed.
# image_cache.py
import weakref
import gc

class Image:
    def __init__(self, path, data):
        self.path = path
        self.data = data
        self.size = len(data)

    def __repr__(self):
        return f"Image({self.path!r}, {self.size} bytes)"

class ImageLoader:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary()
        self._load_count = 0

    def load(self, path):
        img = self._cache.get(path)
        if img is not None:
            print(f" [cache hit] {path}")
            return img
        self._load_count += 1
        fake_data = b"x" * (1024 * (self._load_count * 10))
        img = Image(path, fake_data)
        self._cache[path] = img
        print(f" [cache miss] {path} -- loaded {img.size} bytes")
        return img

loader = ImageLoader()
print("-- First load --")
hero_ref = loader.load("hero.png")
bg_ref = loader.load("background.png")
print("-- Second load (cached) --")
hero_ref2 = loader.load("hero.png")
print("-- Drop hero refs --")
del hero_ref, hero_ref2
gc.collect()
print("-- Reload after GC --")
hero_ref3 = loader.load("hero.png")  # cache miss -- object was collected
Output:
-- First load --
[cache miss] hero.png -- loaded 10240 bytes
[cache miss] background.png -- loaded 20480 bytes
-- Second load (cached) --
[cache hit] hero.png
-- Drop hero refs --
-- Reload after GC --
[cache miss] hero.png -- loaded 30720 bytes
Notice that the cache hit works perfectly while the caller holds a reference. Once the caller drops its reference and GC runs, the entry evicts itself. This is far simpler than implementing manual eviction with timestamps or reference counting.
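A purely weak cache forgets an item the moment its last user lets go. If recently used items should survive a little longer, one option is to pair the weak dictionary with a small bounded collection of strong references. The sketch below is an assumption-laden illustration, not a true LRU: RecentCache, Blob, and keep_recent are hypothetical names, and the deque simply remembers the last few accesses.

```python
import gc
import weakref
from collections import deque


class Blob:
    """Hypothetical cached value."""

    def __init__(self, key):
        self.key = key


class RecentCache:
    """Weak lookup table plus strong references to the N latest accesses."""

    def __init__(self, keep_recent=2):
        self._weak = weakref.WeakValueDictionary()
        self._recent = deque(maxlen=keep_recent)   # oldest entries fall off
        self.loads = 0

    def get(self, key):
        item = self._weak.get(key)
        if item is None:
            self.loads += 1
            item = Blob(key)                       # stand-in for a real load
            self._weak[key] = item
        self._recent.append(item)                  # keep recent items alive
        return item


cache = RecentCache(keep_recent=2)
cache.get("a")          # caller discards the result
cache.get("b")
gc.collect()
cache.get("a")          # still cached: held by the recent-items deque
print(cache.loads)      # 2

cache.get("c")
cache.get("d")          # "a" and "b" have now fallen out of the deque
gc.collect()
cache.get("b")          # must be loaded again
print(cache.loads)      # 5
```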
Frequently Asked Questions
When should I use weakref instead of a regular dict?
Use WeakValueDictionary when the dictionary is a secondary reference — when some other owner controls the object’s lifetime and the dict is just a lookup shortcut. Common cases: object registries, caches keyed by ID, and plugin systems where plugins register themselves. If your dict IS the owner, use a regular dict.
Why does weakref raise TypeError for built-in types?
Built-in types like int, str, tuple, and list do not support the __weakref__ slot, so you cannot create weak references to their instances. You can work around this by wrapping the value in a user-defined class. Subclassing also works for list and dict, but in CPython it does not work for int or tuple. If you use __slots__ in your own class, add '__weakref__' to the slots list explicitly, or weak references to your class will also fail.
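These rules are easy to probe. The sketch below assumes CPython; Boxed, SlottedNoWeak, SlottedWithWeak, and can_weakref are hypothetical names introduced for the demonstration.

```python
import weakref


class Boxed:
    """Hypothetical wrapper that makes a plain value weak-referenceable."""

    def __init__(self, value):
        self.value = value


class SlottedNoWeak:
    __slots__ = ("value",)


class SlottedWithWeak:
    __slots__ = ("value", "__weakref__")


def can_weakref(obj):
    """Return True if weakref.ref(obj) succeeds, False on TypeError."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


print(can_weakref(42))                  # False
print(can_weakref([1, 2, 3]))           # False
print(can_weakref(Boxed(42)))           # True
print(can_weakref(SlottedNoWeak()))     # False
print(can_weakref(SlottedWithWeak()))   # True
```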
Do I need to call gc.collect() for weak references to work?
No. In CPython, objects are collected immediately when their reference count drops to zero — no GC cycle needed. gc.collect() only matters for objects in reference cycles (e.g., A refers to B which refers back to A). The examples in this article call gc.collect() explicitly to make the behavior deterministic in demos. In production code, weak references work without manual GC calls.
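The reference-cycle case can be made visible with a two-object loop. This is a sketch assuming CPython; Node is a hypothetical class, and the automatic collector is switched off briefly so the demo is deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical class used to build a two-object cycle."""

    def __init__(self):
        self.partner = None


gc.disable()                 # keep the automatic cycle collector out of the demo
try:
    a, b = Node(), Node()
    a.partner, b.partner = b, a      # a -> b -> a
    watcher = weakref.ref(a)

    del a, b
    alive_after_del = watcher() is not None      # the cycle keeps both alive

    gc.collect()
    alive_after_collect = watcher() is not None
finally:
    gc.enable()

print(alive_after_del)       # True
print(alive_after_collect)   # False
```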
Is __del__ the same as weakref.finalize()?
__del__ is a finalizer method defined on the class. It has well-known problems: before Python 3.4 it could prevent objects in reference cycles from being collected, and it can still resurrect objects if it stores a new reference to self. weakref.finalize() is safer and more flexible: it does not prevent GC, it can be attached to objects you did not write, and it guarantees the callback runs at most once.
Can weak references cause memory leaks?
Not directly — that is the whole point. However, if you accidentally hold a strong reference somewhere (e.g., a local variable, a list element, a closure), the object stays alive and the weak reference never returns None. Memory leaks with weak references are usually caused by unintended strong references elsewhere in the code, not by the weak references themselves. Use gc.get_referrers(obj) to find what is keeping an object alive.
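A debugging session along these lines might look like the sketch below. Report and the forgotten list are hypothetical; gc.get_referrers only reports containers tracked by the collector, so treat its output as a hint rather than a complete answer.

```python
import gc
import weakref


class Report:
    """Hypothetical class for this illustration."""


report = Report()
watcher = weakref.ref(report)
forgotten = [report]          # the "accidental" strong reference

del report
gc.collect()
print(watcher() is None)      # False: something still owns the object

# Ask the collector which containers refer to the object.
referrers = gc.get_referrers(watcher())
print(any(r is forgotten for r in referrers))   # True

forgotten.clear()             # remove the stray reference
del referrers
gc.collect()
print(watcher() is None)      # True
```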
Conclusion
The weakref module gives you four tools for memory-efficient design: weakref.ref() for simple non-owning references, WeakValueDictionary for caches that self-evict, WeakKeyDictionary for external per-object metadata, and finalize() for reliable cleanup callbacks. Together they solve the class of bugs where dictionaries and registries prevent objects from being garbage-collected.
Extend the image cache example by adding a disk fallback: check cache.get(path), return the in-memory object if it is still alive, reload it from disk if not, and let the WeakValueDictionary handle eviction automatically.
For the complete API reference, see the Python weakref module documentation.