Cache invalidation — laranevans.com

A cached value is a copy of data that changes at its source. Cache invalidation is how a stale copy stops being served. The hard part is timing: hold a value too long and reads go stale; drop it too early and you lose the hit ratio the cache exists to provide. Every invalidation strategy picks a point on that trade-off between freshness and hit ratio.

TTL bounds staleness by time

The simplest strategy attaches a time to live (TTL) to each entry and treats the entry as expired once its age reaches the TTL. HTTP caching formalizes this as freshness: a stored response stays fresh while its age stays under its freshness lifetime, which the max-age directive can set explicitly (RFC 9111). TTL trades precision for simplicity. You may serve stale data in the window between a source change and the next expiry, and in exchange you track no per-key change events. A short TTL shrinks the stale window but also cuts the hit ratio, since more reads land after expiry. The TTL is the dial between those two.
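A minimal sketch of the TTL rule, assuming an in-memory dictionary and a monotonic clock. The TTLCache class and its clock parameter are hypothetical names chosen for illustration, not part of any library; the clock is injectable only so the expiry boundary can be exercised without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Illustrative in-memory cache that expires entries by age."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (value, time the value was stored)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (value, self._clock())

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # miss: never stored or already evicted
        value, stored_at = entry
        age = self._clock() - stored_at
        if age >= self._ttl:
            # The entry is fresh only while its age is under the TTL.
            del self._entries[key]
            return None
        return value
```

A caller treats a None result as a miss and reloads from the source. Between a source change and the next expiry, get keeps returning the old value, which is the stale window the TTL bounds.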

Explicit invalidation targets the change

When staleness carries a real cost, you invalidate on the write instead of waiting for a timer. Cache-aside does this by deleting the key after updating the database, so the next read reloads it (Microsoft's cache-aside reference). The order matters: update the store first, then invalidate the cache. If you invalidate first, a concurrent read can miss, reload the old value before the store changes, and leave a stale entry behind (Microsoft). Explicit invalidation is precise and event-driven, but it costs you a path from every write to every cache that holds the key, and that path gets harder to maintain as caches multiply across machines.
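The write ordering can be sketched as follows. This is an illustration under simplifying assumptions: two plain dictionaries named database and cache stand in for the real store and the real cache, and the read and write helpers are hypothetical names. It runs in one thread, so it shows the ordering only and does not demonstrate the concurrent race.

```python
from typing import Dict, Optional

# Hypothetical stand-ins: a real system would use a database client
# and a cache client here.
database: Dict[str, str] = {}
cache: Dict[str, str] = {}


def read(key: str) -> Optional[str]:
    """Cache-aside read: try the cache, fall back to the store, repopulate."""
    if key in cache:
        return cache[key]
    value = database.get(key)
    if value is not None:
        cache[key] = value
    return value


def write(key: str, value: str) -> None:
    """Cache-aside write: update the store first, then invalidate."""
    database[key] = value   # 1. the store now holds the new value
    cache.pop(key, None)    # 2. drop the cached copy so the next read reloads
```

Swapping the two steps in write would open the window described above: a read that lands between the delete and the store update would put the old value back into the cache.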

A hot key's expiry triggers a stampede

Expiry has a failure mode that surfaces under load. When a popular key expires, every concurrent request misses at the same moment, and every one of them queries the database. This is a cache stampede, also called a thundering herd, and a single hot key turns into thousands of duplicate queries in the window before the cache refills. Three mitigations attack it from different sides:

Single-flight. Let one request recompute the value while the others wait for the refill. Facebook's memcache does this with leases: a lease is a token a client receives on a miss and presents when it sets the value, so one client repopulates a hot key while the rest wait (Nishtala et al.).
Stale-while-revalidate. Serve the expired value right away and refresh it in the background, so no request blocks on the recompute. HTTP defines this as the stale-while-revalidate directive (RFC 5861).
Probabilistic early expiration. Have each read recompute the value slightly before the TTL, with a probability that climbs as expiry nears, so refreshes spread across time instead of firing together. Vattani, Chierichetti, and Lowenstein give an optimal form of this scheme (Optimal Probabilistic Cache Stampede Prevention).
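A minimal sketch of the probabilistic early expiration check, following the form usually quoted from the Vattani et al. paper: refresh when now - delta * beta * ln(rand) >= expiry, where delta is how long the recompute takes and beta tunes how eagerly to refresh (1.0 is the commonly cited default). The function name and parameters are assumptions for illustration, and the now and rand arguments are injectable only to make the behavior reproducible.

```python
import math
import random
import time
from typing import Callable, Optional


def should_refresh_early(
    expiry: float,
    delta: float,
    beta: float = 1.0,
    now: Optional[float] = None,
    rand: Callable[[], float] = random.random,
) -> bool:
    """Return True if this read should recompute the value before expiry.

    expiry: absolute time at which the cached value expires
    delta:  observed duration of one recompute, in the same time unit
    beta:   values above 1.0 refresh earlier, below 1.0 refresh later
    """
    current = time.time() if now is None else now
    # random.random() lies in [0, 1), so 1 - rand() lies in (0, 1] and the
    # logarithm is defined and non-positive. The gap is therefore >= 0.
    gap = -delta * beta * math.log(1.0 - rand())
    return current + gap >= expiry
```

Far from expiry the random gap rarely bridges the distance, so almost no read refreshes. As expiry nears, a growing share of reads cross the threshold, and a slower recompute (larger delta) starts the process earlier.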

Adding random jitter to TTLs handles the related case where many keys loaded together expire together. A spread of a few percent desynchronizes their expiry.
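One way to apply jitter, sketched under the assumption of a symmetric uniform spread around the base TTL. The jittered_ttl helper and its 5 percent default are illustrative choices, not a recommendation from the sources cited here.

```python
import random
from typing import Callable


def jittered_ttl(
    base_ttl: float,
    spread: float = 0.05,
    rand: Callable[[], float] = random.random,
) -> float:
    """Return base_ttl scaled by a random factor in [1 - spread, 1 + spread)."""
    # Map rand() from [0, 1) onto [-spread, +spread).
    offset = (2.0 * rand() - 1.0) * spread
    return base_ttl * (1.0 + offset)
```

With a 300 second base TTL and the default spread, keys loaded in the same batch expire somewhere between roughly 285 and 315 seconds later instead of all at once.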

Choosing an invalidation strategy

Start with a TTL sized to the staleness your readers tolerate, since it needs no write-path wiring and bounds how wrong a read gets. Add explicit invalidation on top when specific data has to reflect a write quickly, like a price or a permission. Reach for stampede protection once a single key gets hot enough that its expiry spikes the database. The freshness you promise is a consistency choice, the same one PACELC frames for replicated data: serving a stale cached read trades consistency for latency, on purpose.