Last Updated: June 01, 2026
For some of the web apps you develop in Python, you will want to run them in the cloud so that they are available 24/7. For smaller applications, a free Python hosting service saves you from worrying about monthly charges. These applications might be a website written in Flask or another web framework, or a Python script that runs in the background and handles your automation. This is where hosting services that have a free plan and are still easy to set up come in.
To find a hosting platform that fits your needs, consider a few things:
- Ease of access to upload projects
- What type of support they provide
- What specifications the virtual server environment offers
One such new platform is called deta.sh. Deta is a free hosting service for deploying Python web applications or other types of Python applications that run in the background.
The deta service, as of mid-2022, is still in the development stage and is expected to keep a permanent free Python hosting tier so that online Python applications can be set up and deployed quickly and easily. Deta is relatively new and is intended to compete with pythonanywhere, heroku, and similar services for running Python on web servers. The service lets you host a Python script online without fuss directly from the command line, much like checking code in to github. Although it is new, it has the potential to become one of the best free Python hosting options for getting your Python code online.
The platform provides mini virtual environments (called ‘micros’) where you can host your Python scripts. These can be grouped into workspaces called ‘projects’ so that you can manage your environments more easily. You access and upload your code from the command line, authenticated with an Access Token.

We will go step by step through how to run your Python code online. In this article, we use deta to host a simple Flask-based web page so that Python acts as your web server.
Python developer and educator with 15+ years building production systems across data engineering, web APIs, and AI tooling. Founder of Python How To Program — 270+ in-depth tutorials covering the modern Python stack.
Signing up for Deta.sh
Deta.sh is effectively a cloud Python hosting service that sits on top of AWS and lets you deploy your Python code into a virtual machine (called a deta micro), store files (called deta drive), and store data (called deta base). Unlike AWS or other hosting services, you can quickly host and run your script without the hassle of setting up servers, security configurations, and so on.
The Deta.sh team offers the service for free so that developers can monetize their solutions, with deta.sh sharing some of that revenue. To date, there are no paid Deta.sh plans for Python hosting and no stated intention to introduce them, so you should be able to keep running Python code online at no cost.
To begin with, head over to the website https://deta.sh to first create an account.

Once you have submitted, go to your email and click on the verify link.

After you click on sign-in, enter the same username and password, and you will be taken to the default page where you can click “See My Key”.

Click on the “See My Key” to see your secret password. You will only be able to see it once and will not be able to see it ever again.
This is what the project key will look like:

You need both the key and the project id.
Think of the key as a password and the “Project ID” as a username. When you want to access your deta.sh space to upload programs or make changes, you will need your project key.
If you lose your project key, you will not be able to recover it. However, you can create a new one with the Settings->Create Key option.

One thing I’d like to call out is the Project ID. This is the ID of this particular space.

If you have multiple programs that access deta.sh, it is best to give each one a separate project key. If one of your keys is compromised, you can simply replace that key without affecting your other applications.
Setting Up Your Remote Access For Deta.sh
We will first set up the deta.sh command line interface so that you can communicate with your deta.sh space in the cloud.
You can do this with either one of:
Mac / Linux:
curl -fsSL https://get.deta.dev/cli.sh | sh
Windows:
iwr https://get.deta.dev/cli.ps1 -useb | iex
Once that is done, a hidden folder called $HOME/.deta is created (in the case of Mac / Linux). The deta command line application lives in this directory.
You can type deta --help to check that the command line tool was installed correctly.

Next, you will need to create an access token so that you can connect to your deta.sh account. Go to your deta.sh home page (e.g. https://web.deta.sh/) and then go back to the main projects page.

Next, click on the Create Access token under settings

Once you click create, an Access Token is generated so that you don’t need to log in each time.

Copy this Access Token and then, create a file called tokens in the $HOME/.deta/ directory. Steps for Mac/Linux are:
cd $HOME/.deta
nano tokens
You can then add the following json inside the tokens file:
{
"deta_access_token": "<your access token created above>"
}
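If you would rather not edit the file by hand, the same JSON can be written from Python. This is only a sketch using the standard library: the helper name write_deta_token is hypothetical, and the directory and key name are taken from the steps above.

```python
import json
from pathlib import Path


def write_deta_token(access_token: str, deta_dir: Path) -> Path:
    """Write the access token to <deta_dir>/tokens as JSON and return the file path."""
    deta_dir.mkdir(parents=True, exist_ok=True)
    tokens_path = deta_dir / "tokens"
    payload = {"deta_access_token": access_token}
    tokens_path.write_text(json.dumps(payload, indent=2))
    return tokens_path


if __name__ == "__main__":
    # Placeholder value: paste the Access Token created in the dashboard.
    path = write_deta_token("<your access token created above>", Path.home() / ".deta")
    print(f"Wrote {path}")
```

Treat the tokens file like a password file and keep it out of version control.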
Finally, install the deta Python library, which is used to access the deta components from your code.
pip install deta
Have a Free Python Hosting Flask on Deta.sh
To create an environment to host your Python code, you need to create something called a “micro“. This is almost like a mini virtual server with 128 MB of memory, but it does not run all the time. A micro wakes up, executes your code, and then goes back to sleep. Deta.sh is not designed for long-running applications with heavy computations (use one of the public cloud providers for that!). Also, each micro has its own python online cloud private access.
To begin with, you can use the command deta new --python <micro name>. The <micro name> is the label for your micro; this article uses flask_test.

The above command will create a directory called flask_test with a python script called main.py

The default code in the main.py is:
def app(event):
    return "Hello, world!"
At the same time, this code will be uploaded to deta.sh. If you go to the dashboard page https://web.deta.sh/ you will see a sub-menu under the Micro menu. You may need to refresh your browser if you had it open.

You will notice that there’s also a URL for this deta micro which is the end point where your application output can be accessed. Think of this simply as the console output.

If you encountered any errors, in the command line, you can type deta logs to get an output of any errors from the logs.
To make a more useful application, we can create a Flask application that shows a more functional webpage. To do this, you need to tell deta.sh to install the flask library. You cannot run pip install on the micro; instead, you list your dependencies in a requirements.txt file.
First, add flask to a requirements.txt file in your local directory. Your file should simply look like this:
#requirements.txt
flask
Then in your main.py code file, you add the following, again this is in your local directory
from flask import Flask

app = Flask(__name__)

@app.route('/', methods=["GET"])
def hello_world():
    return "Hello Flask World"

# def app(event):
#     return "Hello, world!"
In order to now upload the changes to your micro, you will need to run the command deta deploy. This will upload the files requirements.txt and updates to main.py into your micro.
deta deploy
When executed, this should upload the code and install the libraries:

Managing Flask Forms On Free Python Hosting
Now that we have a simple static web page, we can build a more complex example with a form that can be submitted. Using the OpenWeatherMap API, we can show the weather for a given location.
To get the weather data, we need the pyowm library, so add it to requirements.txt. The datetime module used to format the timestamp is part of the Python standard library and does not need to be listed.
#requirements.txt
flask
pyowm
Then for the code, the following can be updated in the main.py:
from flask import Flask, request
import datetime
import pyowm

app = Flask(__name__)

@app.route('/', methods=["GET"])
def get_location():
    # Serve a minimal form that posts the location to /weather
    return """<html>
<body>
<form action="weather" method="POST">
<input name="location" type="text">
<input type="submit" value="submit">
</form>
</body>
</html>"""

@app.route('/weather', methods=["POST", "GET"])
def get_weather():
    api_key = '<your open weather map API key>'
    location = request.form['location']
    owm = pyowm.OWM(api_key).weather_manager()
    # Look up the submitted location rather than a hard-coded city
    weather_data = owm.weather_at_place(location).weather
    ref_time = datetime.datetime.fromtimestamp(weather_data.ref_time).strftime('%Y-%m-%d %H:%M')
    weather_str = f"<h1>Weather Report for: {location}</h1>"
    weather_str += "<ul>"
    weather_str += f"<li><b>Time:</b> {ref_time} </li>"
    weather_str += f"<li><b>Overview:</b> {weather_data.detailed_status} </li>"
    weather_str += f"<li><b>Wind Speed:</b> {weather_data.wind()} </li>"
    weather_str += f"<li><b>Humidity:</b> {weather_data.humidity} </li>"
    weather_str += f"<li><b>Temperature:</b> {weather_data.temperature('fahrenheit')} </li>"
    weather_str += f"<li><b>Rain:</b> {weather_data.rain} </li>"
    weather_str += "</ul>"
    return weather_str

# def app(event):
#     return "Hello, world!"
Then to upload the code into deta.sh, you can use the command deploy:
deta deploy
Once deployed, you can then go to the website – this is the endpoint that was automatically generated by deta.sh above.

Opening the endpoint shows the form generated by get_location(). Once the form is submitted, a call is made to OpenWeatherMap and the weather report is displayed.

When the form is posted to the /weather URL, the function get_weather() is called to process it. The submitted value can be accessed through request.form['location']. The code works by first serving a form through the function get_location(), which generates a very simple form in HTML:
<html>
<body>
<form action="weather" method="POST">
<input name="location" type="text">
<input type="submit" value="submit">
</form>
</body>
</html>
When the submit button is pressed, the form calls the /weather URL with the field location. Once called, then the python function def get_weather() is called upon which a call to OpenWeatherMap.org is made to get the weather data for the given location.
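The handler above places request.form['location'] directly into the HTML response. As a precaution, user input is usually escaped before it is interpolated into markup. The following is a minimal sketch using only the standard library; the helper name build_report_header and the fallback text are hypothetical and not part of the original code.

```python
from html import escape


def build_report_header(location: str) -> str:
    """Return the report heading with the user-supplied location HTML-escaped."""
    cleaned = location.strip()
    if not cleaned:
        # Assumed fallback text for an empty submission
        cleaned = "unknown location"
    return f"<h1>Weather Report for: {escape(cleaned)}</h1>"


if __name__ == "__main__":
    print(build_report_header("Bangalore"))
    print(build_report_header("<script>alert(1)</script>"))
```

Inside get_weather(), the heading line could call this helper with request.form['location'] instead of formatting the raw value.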
Conclusion
This is just the tip of the iceberg of what you can do with deta. You can also run scheduled jobs, run a NoSQL database, and use file storage. Contact us if you’d like us to cover these areas too.
How To Use Python cachetools for In-Memory Caching
Intermediate
Your API call takes 300 milliseconds. You call it 50 times a minute, and the data changes only every few hours. That is 50 redundant round-trips, 50 sets of network overhead, and 50 chances for a transient failure to surface to your users. Caching the result for 10 minutes would cut 99% of those calls while keeping the data fresh enough for any reasonable use case. Python’s standard library offers functools.lru_cache, but it has no expiry time, supports only the LRU eviction policy, and gives you no way to inspect the cache or invalidate a single entry. cachetools fills all these gaps.
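The savings figure can be checked with a little arithmetic. The sketch below assumes a single cache key and exactly one real API call per TTL window; the function name cache_savings is made up for this illustration.

```python
def cache_savings(calls_per_minute: int, ttl_minutes: int) -> float:
    """Fraction of calls served from cache if one call per TTL window reaches the API."""
    calls_per_window = calls_per_minute * ttl_minutes
    if calls_per_window <= 0:
        raise ValueError("calls_per_minute and ttl_minutes must be positive")
    return (calls_per_window - 1) / calls_per_window


saved = cache_savings(calls_per_minute=50, ttl_minutes=10)
print(f"{saved:.1%} of calls avoided")  # 499 of 500 calls -> 99.8%
```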
cachetools is a Python library that provides several ready-made cache classes: LRU (Least Recently Used), TTL (Time to Live), LFU (Least Frequently Used), and more. Each cache is a dictionary-like object with a configurable maximum size and an eviction policy that removes entries when the cache is full. Install it with pip install cachetools. There are no mandatory dependencies — cachetools is pure Python and works in any environment.
This article covers the four most useful cache types (LRU, TTL, LFU, RR), using the @cached and @cachedmethod decorators for automatic function memoization, handling thread safety in concurrent applications, cache invalidation strategies, and a real-world example of caching API responses in a Flask application. By the end you will be able to add intelligent caching to any Python function in under five lines of code.
cachetools Quick Example
The fastest way to cache a function’s results is the @cached decorator with an LRU cache:
# quick_cache.py
import time
from cachetools import cached, LRUCache

@cached(cache=LRUCache(maxsize=128))
def get_user_data(user_id: int) -> dict:
    """Simulate a slow database or API call."""
    time.sleep(0.5)  # Simulate 500ms latency
    return {"id": user_id, "name": f"User {user_id}", "score": user_id * 10}

# First call: slow (cache miss)
start = time.perf_counter()
result = get_user_data(42)
print(f"First call: {(time.perf_counter()-start)*1000:.0f}ms -- {result}")

# Second call: instant (cache hit)
start = time.perf_counter()
result = get_user_data(42)
print(f"Second call: {(time.perf_counter()-start)*1000:.0f}ms -- {result}")

# Different argument: slow again (cache miss)
start = time.perf_counter()
result = get_user_data(99)
print(f"Third call: {(time.perf_counter()-start)*1000:.0f}ms -- {result}")
Output:
First call: 502ms -- {'id': 42, 'name': 'User 42', 'score': 420}
Second call: 0ms -- {'id': 42, 'name': 'User 42', 'score': 420}
Third call: 501ms -- {'id': 99, 'name': 'User 99', 'score': 990}
The second call returns in under 1 millisecond because the result is already stored in the LRU cache. The third call is slow again because user_id=99 is a different cache key from user_id=42. The cache key is derived from the function arguments by default — you can customize it with the key parameter. The maxsize=128 means the cache holds up to 128 distinct argument combinations; older entries are evicted when the limit is reached.
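To make the idea of a cache key concrete, here is a simplified standard-library sketch of how a key can be derived from call arguments. It is an illustration only, not the key function cachetools uses internally; make_key, memoize, and area are hypothetical names.

```python
import functools

_KWARGS_MARK = object()  # separates positional from keyword arguments in the key


def make_key(*args, **kwargs):
    """Build a hashable key from the call arguments."""
    key = args
    if kwargs:
        key += (_KWARGS_MARK,) + tuple(sorted(kwargs.items()))
    return key


def memoize(func):
    """Cache results in a plain dict, keyed by make_key (no size limit, no expiry)."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = make_key(*args, **kwargs)
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    wrapper.store = store
    return wrapper


calls = []  # records every real (uncached) invocation


@memoize
def area(width, height=1):
    calls.append((width, height))
    return width * height
```

With this scheme, area(2, height=3) and area(2, 3) produce different keys even though they compute the same value, so passing arguments in a consistent style improves the hit rate.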
What Is cachetools and Which Cache Type Should You Use?
cachetools provides several cache implementations, each with a different eviction policy. The eviction policy determines which entry gets removed when the cache is full and a new item needs to be stored. Choosing the right eviction policy depends on your access patterns.
| Cache | Eviction policy | Best for | Has TTL? |
|---|---|---|---|
| LRUCache | Least Recently Used | General purpose, most common pattern | No |
| TTLCache | Time To Live + LRU | API responses, config data that expires | Yes |
| LFUCache | Least Frequently Used | Non-uniform access, popular items differ | No |
| RRCache | Random Replacement | Uniform access, simple eviction | No |
| MRUCache | Most Recently Used | Sequential scans, most recent is least useful (deprecated in newer cachetools releases) | No |
LRUCache is the right default for most applications — it keeps the items you accessed most recently and evicts items that have not been used for a while, which aligns with real access patterns (hot items get accessed repeatedly, cold items do not). TTLCache adds an expiry time, which is essential when caching data that changes over time, such as API responses or database records. Use LFUCache when a small subset of items is accessed far more often than others — it prioritizes keeping the most popular items regardless of recency.
LRUCache — Least Recently Used
LRUCache keeps the N most recently accessed entries. When the cache is full and a new entry arrives, the least recently used entry is evicted. This is the right choice for function results that are expensive to compute and accessed repeatedly with the same arguments.
# lru_cache_example.py
from cachetools import LRUCache, cached

# Create a cache that holds at most 3 entries
cache = LRUCache(maxsize=3)

# info=True (cachetools 5.3+) enables cache_info() on the decorated function
@cached(cache=cache, info=True)
def compute_fibonacci(n: int) -> int:
    """Compute Fibonacci number -- expensive for large n."""
    if n < 2:
        return n
    return compute_fibonacci(n - 1) + compute_fibonacci(n - 2)

# Fill the cache: fib(10), fib(11), fib(12) are cached
print(compute_fibonacci(10))  # 55
print(compute_fibonacci(11))  # 89
print(compute_fibonacci(12))  # 144
print(f"Cache size: {len(cache)}")  # 3
print(f"Cache keys: {list(cache)}")  # [(10,), (11,), (12,)]

# Access a new value -- fib(13) evicts the LRU entry (fib(10))
print(compute_fibonacci(13))
print(f"Cache keys after: {list(cache)}")  # [(11,), (12,), (13,)]

# Inspect cache statistics
print(f"Cache info: {compute_fibonacci.cache_info()}")
Output:
55
89
144
Cache size: 3
Cache keys: [(10,), (11,), (12,)]
233
Cache keys after: [(11,), (12,), (13,)]
Cache info: CacheInfo(hits=X, misses=X, maxsize=3, currsize=3)
The cache is a regular dictionary-like object you can inspect, clear, and manipulate directly. cache.clear() invalidates all entries at once, and del cache[(10,)] invalidates a specific entry. This manual control is something functools.lru_cache does not easily support.
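The eviction rule itself is small enough to sketch with the standard library. The class below is a teaching illustration built on collections.OrderedDict, not how cachetools implements LRUCache; the name TinyLRU is hypothetical.

```python
from collections import OrderedDict


class TinyLRU:
    """Minimal least-recently-used cache for illustration."""

    def __init__(self, maxsize: int):
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self.maxsize = maxsize
        self._data = OrderedDict()  # oldest (least recently used) entry first

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # a read marks the entry as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self.maxsize:
            self._data.popitem(last=False)  # evict the least recently used entry
        self._data[key] = value

    def keys(self):
        return list(self._data)
```

Reading a key before inserting a new one changes which entry is evicted, which is the behaviour that separates LRU from simple first-in-first-out eviction.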
TTLCache -- Time To Live
TTLCache combines an LRU eviction policy with an expiry time. Every entry in the cache is considered stale after ttl seconds and will be evicted automatically on the next access or when iterating the cache. This makes TTLCache ideal for caching external data that changes over time.
# ttl_cache_example.py
import time
from cachetools import TTLCache, cached

# Cache holds up to 100 entries, each expires after 10 seconds
ttl_cache = TTLCache(maxsize=100, ttl=10)

@cached(cache=ttl_cache)
def fetch_exchange_rate(currency: str) -> float:
    """Simulate fetching a live exchange rate."""
    print(f" [API call] Fetching rate for {currency}...")
    # In production, this would call a real API
    rates = {"USD": 1.0, "EUR": 0.92, "GBP": 0.79, "JPY": 149.5}
    return rates.get(currency, 1.0)

print("First fetch (cache miss):")
rate = fetch_exchange_rate("EUR")
print(f"EUR rate: {rate}")
print("\nSecond fetch (cache hit -- no API call):")
rate = fetch_exchange_rate("EUR")
print(f"EUR rate: {rate}")

# Simulate cache expiry by using a very short TTL
short_cache = TTLCache(maxsize=10, ttl=1)

@cached(cache=short_cache)
def get_timestamp(key: str) -> float:
    return time.time()

t1 = get_timestamp("a")
time.sleep(0.5)
t2 = get_timestamp("a")  # Hit -- same cached value
time.sleep(0.6)
t3 = get_timestamp("a")  # Miss -- TTL expired, new value
print("\nTimestamp test:")
print(f"t1={t1:.3f}, t2={t2:.3f} (same -- cache hit)")
print(f"t3={t3:.3f} (different -- TTL expired, new call)")
print(f"t3 > t1: {t3 > t1}")
Output:
First fetch (cache miss):
[API call] Fetching rate for EUR...
EUR rate: 0.92
Second fetch (cache hit -- no API call):
EUR rate: 0.92
Timestamp test:
t1=1717300000.123, t2=1717300000.123 (same -- cache hit)
t3=1717300001.234 (different -- TTL expired, new call)
t3 > t1: True
The TTL is measured from when the entry is first inserted, not from when it is last accessed. An entry inserted at t=0 with ttl=60 expires at t=60 regardless of how many times it was read in between. If you need sliding expiry (where an access resets the timer), you must manage that manually by deleting and re-inserting the entry on each access.
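Sliding expiry can be sketched in a few lines. The class below is a standard-library illustration of the idea (every successful read pushes the deadline forward); it is not a cachetools class, it has no size limit, and the name SlidingTTLCache and its timer argument are assumptions made for the example.

```python
import time


class SlidingTTLCache:
    """Cache whose entries expire ttl seconds after their most recent access."""

    def __init__(self, ttl: float, timer=time.monotonic):
        self.ttl = ttl
        self.timer = timer  # injectable clock, which makes the class easy to test
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._data[key] = (value, self.timer() + self.ttl)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.timer() >= expires_at:
            del self._data[key]  # stale: drop the entry and report a miss
            return default
        self.set(key, value)  # a successful read restarts the timer
        return value
```

The trade-off is that a frequently read entry never expires, so sliding expiry suits session-like data better than data that must be refreshed on a schedule.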
Thread Safety with cachetools
cachetools cache objects are NOT thread-safe by default. If multiple threads read from and write to the cache concurrently, you can get race conditions and corrupt cache state. The @cached decorator accepts a lock parameter to serialize access to the cache:
# thread_safe_cache.py
import threading
import time
from cachetools import TTLCache, cached

# Thread-safe cache using a threading lock
cache = TTLCache(maxsize=200, ttl=60)
lock = threading.RLock()

@cached(cache=cache, lock=lock)
def get_config(config_key: str) -> str:
    """Simulate a slow config read."""
    time.sleep(0.1)
    configs = {
        "db_host": "postgres.internal:5432",
        "redis_url": "redis://cache.internal:6379",
        "feature_flags": "new_ui=true,dark_mode=false",
    }
    return configs.get(config_key, "")

# Simulate 10 concurrent threads all requesting the same config key
results = []
errors = []

def worker(key: str):
    try:
        value = get_config(key)
        results.append(value)
    except Exception as e:
        errors.append(str(e))

threads = [threading.Thread(target=worker, args=("db_host",)) for _ in range(10)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"10 threads completed in {elapsed:.2f}s")
print(f"All results identical: {len(set(results)) == 1}")
print(f"Result: {results[0]}")
print(f"Errors: {errors}")
print(f"Cache size: {len(cache)}")
Output:
10 threads completed in 0.10s
All results identical: True
Result: postgres.internal:5432
Errors: []
Cache size: 1
With lock=threading.RLock(), every read and write of the cache is wrapped in the lock, so concurrent threads cannot corrupt the cache's internal state. Note that cachetools calls the wrapped function outside the lock: when several threads miss on "db_host" at the same moment, each of them may run the slow call, and the results are then stored one after another. The ten threads still finish in roughly 100ms here because their sleep calls run concurrently, not because nine of them were served from the cache. Once the first result is stored, later calls are cache hits. Always pass a lock when your cached function will be called from multiple threads -- this includes web frameworks like Flask and Django where request handlers run concurrently.
Caching Instance Methods with @cachedmethod
The @cached decorator creates a single shared cache for all calls to a function. For class methods, you often want each instance to have its own cache. Use @cachedmethod with a cache accessor function:
# cachedmethod_example.py
import threading
import time
from cachetools import TTLCache, cachedmethod

class WeatherService:
    """Fetches weather data with per-instance TTL caching."""

    def __init__(self, api_key: str, cache_ttl: int = 300):
        self.api_key = api_key
        self._cache = TTLCache(maxsize=50, ttl=cache_ttl)
        self._lock = threading.RLock()

    @cachedmethod(cache=lambda self: self._cache, lock=lambda self: self._lock)
    def get_weather(self, city: str) -> dict:
        """Simulate API call -- 200ms delay."""
        time.sleep(0.2)
        # In production: return requests.get(f"https://api.weather.com/{city}").json()
        return {
            "city": city,
            "temp_c": len(city) * 3,  # predictable fake data
            "humidity": 65,
            "description": "Partly cloudy",
        }

    def clear_cache(self):
        self._cache.clear()

    @property
    def cache_size(self):
        return len(self._cache)

# Each instance gets its own cache
svc1 = WeatherService(api_key="key-abc", cache_ttl=300)
svc2 = WeatherService(api_key="key-xyz", cache_ttl=60)

start = time.perf_counter()
print(svc1.get_weather("Sydney"))
print(f"First call: {(time.perf_counter()-start)*1000:.0f}ms")

start = time.perf_counter()
print(svc1.get_weather("Sydney"))  # cache hit in svc1
print(f"Second call (svc1): {(time.perf_counter()-start)*1000:.0f}ms")

start = time.perf_counter()
print(svc2.get_weather("Sydney"))  # cache miss in svc2 (separate instance)
print(f"First call (svc2): {(time.perf_counter()-start)*1000:.0f}ms")

print(f"\nsvc1 cache size: {svc1.cache_size}")
print(f"svc2 cache size: {svc2.cache_size}")
Output:
{'city': 'Sydney', 'temp_c': 18, 'humidity': 65, 'description': 'Partly cloudy'}
First call: 202ms
{'city': 'Sydney', 'temp_c': 18, 'humidity': 65, 'description': 'Partly cloudy'}
Second call (svc1): 0ms
{'city': 'Sydney', 'temp_c': 18, 'humidity': 65, 'description': 'Partly cloudy'}
First call (svc2): 201ms
svc1 cache size: 1
svc2 cache size: 1
The lambda self: self._cache accessor tells @cachedmethod which cache object to use for each instance. Because each instance stores its own self._cache, the two WeatherService objects have independent caches with independent TTLs. This pattern is especially useful when different instances connect to different backends or have different freshness requirements.
Real-Life Example: Caching API Responses in a Flask App
The following Flask application caches responses from a public REST API, demonstrating TTL caching, cache inspection, and manual invalidation:
# flask_cached_api.py
import json as _json
import threading
import time
import urllib.request

from flask import Flask, jsonify
from cachetools import TTLCache, cached

app = Flask(__name__)

# Cache up to 200 responses, each valid for 5 minutes
response_cache = TTLCache(maxsize=200, ttl=300)
cache_lock = threading.RLock()

def _fetch_post(post_id: int) -> dict:
    """Fetch a post from JSONPlaceholder (real public API)."""
    url = f"https://jsonplaceholder.typicode.com/posts/{post_id}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return _json.loads(resp.read())

@cached(cache=response_cache, lock=cache_lock)
def get_post_cached(post_id: int) -> dict:
    """Return post data, hitting cache if available."""
    return _fetch_post(post_id)

# The <int:post_id> converter passes the URL segment to the view as an int
@app.route("/posts/<int:post_id>")
def post_detail(post_id: int):
    start = time.perf_counter()
    data = get_post_cached(post_id)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return jsonify({
        "data": data,
        "cache_size": len(response_cache),
        "elapsed_ms": round(elapsed_ms, 1),
    })

@app.route("/posts/<int:post_id>/invalidate", methods=["DELETE"])
def invalidate_post(post_id: int):
    key = (post_id,)  # matches the default key built from the positional argument
    with cache_lock:
        if key in response_cache:
            del response_cache[key]
            return jsonify({"invalidated": True, "post_id": post_id})
    return jsonify({"invalidated": False, "reason": "not in cache"})

@app.route("/cache/stats")
def cache_stats():
    return jsonify({
        "size": len(response_cache),
        "maxsize": response_cache.maxsize,
        "ttl_seconds": response_cache.ttl,
    })

if __name__ == "__main__":
    app.run(debug=True, port=5000)
Testing the API:
# Terminal -- run the Flask app first, then:
# First request (cache miss -- ~120ms API round-trip)
curl http://localhost:5000/posts/1
# {"data":{"userId":1,"id":1,"title":"sunt aut facere..."},"cache_size":1,"elapsed_ms":124.3}
# Second request (cache hit -- under 1ms)
curl http://localhost:5000/posts/1
# {"data":{"userId":1,"id":1,"title":"sunt aut facere..."},"cache_size":1,"elapsed_ms":0.1}
# Check cache stats
curl http://localhost:5000/cache/stats
# {"maxsize":200,"size":1,"ttl_seconds":300}
# Invalidate post 1 manually
curl -X DELETE http://localhost:5000/posts/1/invalidate
# {"invalidated":true,"post_id":1}
# Next request fetches fresh data
curl http://localhost:5000/posts/1
# {"data":{...},"cache_size":1,"elapsed_ms":118.7}
This pattern -- TTLCache + threading lock + manual invalidation endpoint -- covers many real API caching scenarios. The TTL handles the common case (data goes stale and we eventually want fresh data), the lock keeps the shared cache consistent when concurrent requests read and write it (concurrent misses for the same post ID may still each call the API, because the lock is not held during the fetch), and the DELETE endpoint handles the uncommon case (we know the data changed and want to force a refresh immediately).
Frequently Asked Questions
What is the difference between cachetools and functools.lru_cache?
The main differences are expiry, eviction policy, and flexibility. functools.lru_cache is built in to Python, never expires its entries, supports only LRU eviction, and hides its storage behind the decorated function, offering little beyond cache_info() and cache_clear(). cachetools adds TTL-based expiry, several eviction policies beyond LRU, and a cache object you can inspect and manipulate independently of the cached function. If you need expiry or you need to invalidate specific cache entries, use cachetools. For simple memoization with no expiry, functools.lru_cache is simpler.
How do I cache functions with unhashable arguments?
cachetools cache keys must be hashable by default. If your function takes a list, dict, or other unhashable type, use a custom key function. For example, to cache a function that takes a list: @cached(cache=LRUCache(128), key=lambda lst: tuple(sorted(lst))). The key function converts the unhashable argument to a hashable representation. cachetools also provides cachetools.keys.hashkey and cachetools.keys.typedkey for common scenarios.
Does cachetools work with async functions?
The standard @cached decorator does not work correctly with async def functions: it would cache the coroutine object returned by the call rather than its result, and a coroutine can only be awaited once. For async code, either cache the awaited result yourself (await the function, then store the value in a cachetools cache) or use an async-aware caching library such as asyncache, which provides decorators for coroutine functions that work with cachetools cache classes. The cachetools cache objects themselves can be used from async code; if several tasks may update a cache across await points, guard those sections with asyncio.Lock.
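Caching the awaited result by hand can look like the sketch below. It uses only the standard library and a plain dict as the cache; async_cached is a hypothetical decorator name, it handles positional arguments only, and concurrent misses for the same key may still run the coroutine more than once.

```python
import asyncio
import functools


def async_cached(cache: dict):
    """Cache awaited results (not coroutine objects) of an async function."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args):
            if args not in cache:
                # Await first, then store the finished value
                cache[args] = await func(*args)
            return cache[args]

        return wrapper

    return decorator


if __name__ == "__main__":
    results = {}

    @async_cached(results)
    async def slow_double(x: int) -> int:
        await asyncio.sleep(0.1)  # stand-in for a slow network call
        return x * 2

    async def main():
        print(await slow_double(21))  # slow: runs the coroutine
        print(await slow_double(21))  # fast: served from the dict

    asyncio.run(main())
```

Any mapping with dict-style access can be passed as the cache argument, so a cache object with a size limit or expiry could be swapped in for the plain dict.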
How do I handle cache stampede (thundering herd)?
Cache stampede happens when many concurrent requests arrive for an expired cache entry simultaneously -- they all miss the cache, all call the underlying function at the same time, and all receive the result within milliseconds of each other, flooding your backend. The lock parameter shown in this article protects the cache object itself but is not held while the underlying function runs, so on its own it does not stop concurrent misses from each calling the backend. To suppress a stampede, make sure only one caller computes a given key at a time, for example with a per-key lock or an in-progress flag that lets other callers wait for the result. For applications with very high concurrency, also consider probabilistic early expiry (compute a new value when the TTL is 80% elapsed with some probability).
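One way to guarantee that a slow function runs at most once per key, even when many threads miss at the same moment, is to hold a per-key lock while computing. The class below is a hand-rolled standard-library sketch, not a cachetools feature; SingleFlightCache and get_or_compute are hypothetical names, and the sketch has no expiry or size limit.

```python
import threading


class SingleFlightCache:
    """Cache in which only one thread computes the value for a given key."""

    def __init__(self):
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()  # protects the two dicts above

    def _lock_for(self, key):
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get_or_compute(self, key, compute):
        with self._guard:
            if key in self._values:
                return self._values[key]
        # Only one thread per key gets past this lock at a time
        with self._lock_for(key):
            with self._guard:
                if key in self._values:  # filled in while we were waiting
                    return self._values[key]
            value = compute()  # slow call runs outside the global guard
            with self._guard:
                self._values[key] = value
            return value
```

Threads asking for different keys do not block each other, because the global guard is only held for dictionary lookups and never during the slow computation.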
When should I use Redis instead of cachetools?
Use cachetools when the cached data only needs to survive within a single process. Use Redis when you need to share the cache across multiple processes or machines (such as multiple web server workers), when the cache must survive process restarts, or when the cached data is large enough that storing it in-process would consume too much RAM. A common production architecture uses both: cachetools for a fast per-process L1 cache with a short TTL, and Redis for a shared L2 cache with a longer TTL.
Conclusion
cachetools gives you LRU, TTL, LFU, and random-replacement caches as plain Python objects, plus the @cached and @cachedmethod decorators for zero-friction function memoization. You have seen how to apply LRUCache for general memoization, TTLCache for time-expiring API responses, thread-safe caching with lock=threading.RLock(), per-instance caching with @cachedmethod, and a real Flask application with manual cache invalidation.
The best next step is to profile your application and find the three slowest function calls that are called repeatedly with the same arguments. Wrap each one with a TTLCache and an appropriate TTL -- 60 seconds for data that changes often, 300 seconds for data that changes rarely. Measure the before-and-after response times and cache hit rates. For most applications, this change alone produces a measurable improvement in throughput and latency.
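Hit rate is simply hits divided by total lookups. The sketch below demonstrates the calculation with the standard library's functools.lru_cache, whose cache_info() result exposes hits and misses fields; the helper name hit_rate is hypothetical, and the same helper should work with any statistics object that has those two attributes.

```python
from functools import lru_cache


def hit_rate(info) -> float:
    """Fraction of lookups served from cache, given an object with hits and misses."""
    total = info.hits + info.misses
    return info.hits / total if total else 0.0


@lru_cache(maxsize=128)
def square(n: int) -> int:
    return n * n


for n in (2, 2, 3, 2):
    square(n)  # two misses (2 and 3) and two hits (the repeated 2s)

print(f"hit rate: {hit_rate(square.cache_info()):.0%}")  # hit rate: 50%
```

A low hit rate usually means the TTL is too short, the cache is too small, or the arguments vary too much for caching to help.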
For the full cachetools API reference including MRU, RR caches, and custom key functions, see the official cachetools documentation.
Further Reading: For more details, see the Python virtual environments documentation.
Frequently Asked Questions
Is Deta still free for hosting Python apps?
Deta Space offers a free tier for personal use. The original Deta.sh Micros service has evolved. For free Python hosting alternatives, consider Railway, Render, PythonAnywhere, or Google Cloud Run’s free tier.
What are the best free Python hosting alternatives?
PythonAnywhere offers a free tier for web apps. Render provides free static sites and web services. Railway has a free trial. Google Cloud Run and AWS Lambda have generous free tiers for serverless deployments.
How do I deploy a Python Flask app for free?
Use Render (connect GitHub repo), PythonAnywhere (upload directly), or Railway (deploy from GitHub). Each provides different advantages for hobby and small-scale projects.
What should I consider when choosing Python hosting?
Consider free tier limits, sleep/cold-start behavior, database availability, custom domain support, deployment method, Python version support, and scaling options.
Can I host a Python bot or script for free?
Yes. PythonAnywhere allows always-on tasks. Google Cloud Functions and AWS Lambda handle event-driven scripts. For Discord/Telegram bots, Railway and Render offer free tiers suitable for small bots.