Cache
Intro
Caching means saving the result of a specific request and reusing it, instead of reprocessing the request every time.
Ultimate goal: the cache should contain the most often used entries.
Replacement strategies: cache is fast but expensive, so it is usually limited in size. When the cache is full, some entries must be discarded to make room for new ones.
Incoherency: when data differs between
- Cache vs main store
- Multiple caches
We must maintain consistency between cache and store, and among multiple caches.
A cache entry is created/saved in a key-value map: input (key) <-> output (value)
Reading: hit or miss
Each time a client sends a request to the cache, the cache first checks whether the entry exists by searching by key:
- If hit: the cache returns the cached value
- If miss:
  - The cache reads the data from the underlying store
  - Saves that data into the cache as a key-value entry
  - Returns the data to the client
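A minimal sketch of this read path in Python; the `Cache` class and the dict-based store are made up here for illustration:

```python
class Cache:
    """Minimal read-path sketch: a key-value map in front of a slower store."""

    def __init__(self, store):
        self.store = store   # backing store (here: any dict-like object)
        self.entries = {}    # the cache itself: key -> value

    def read(self, key):
        if key in self.entries:       # hit: return the cached value
            return self.entries[key]
        value = self.store[key]       # miss: read from the underlying store
        self.entries[key] = value     # save it into the cache (key-value)
        return value                  # and return it to the client

# Usage: the first read misses and loads from the store, the second one hits.
store = {"user:1": "Alice"}
cache = Cache(store)
print(cache.read("user:1"))  # miss -> loaded from store
print(cache.read("user:1"))  # hit  -> served from cache
```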
Writing/Creating policies
For write operations, data written to the cache must be written to the backing store as well; when and how this happens is defined by the writing policy (see Wikipedia’s cache writing policies).
Write-through
Write is done synchronously both to the cache and to the backing store
- Read operation:
  - On read-hit: just return the cache entry value
  - On read-miss:
    - Read data from store
    - Find a possible slot (empty or least recently used)
    - Write to the cache entry
    - Return the cache entry value
- Write operation:
  - On write-hit: overwrite the cache entry + write to store
  - On write-miss: write to store only
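A sketch of write-through in Python, assuming a fixed capacity and a least-recently-used slot choice; the `WriteThroughCache` name and details are illustrative, not a standard API:

```python
from collections import OrderedDict

class WriteThroughCache:
    """Write-through sketch: every write goes synchronously to cache and store."""

    def __init__(self, store, capacity=2):
        self.store = store
        self.capacity = capacity
        self.entries = OrderedDict()  # ordered so the least recently used entry is first

    def read(self, key):
        if key in self.entries:                 # read-hit: return cache entry value
            self.entries.move_to_end(key)
            return self.entries[key]
        value = self.store[key]                 # read-miss: read data from store
        if len(self.entries) >= self.capacity:  # find a slot: evict the LRU entry
            self.entries.popitem(last=False)
        self.entries[key] = value               # write to cache entry
        return value                            # return cache entry value

    def write(self, key, value):
        if key in self.entries:                 # write-hit: overwrite cache entry
            self.entries[key] = value
            self.entries.move_to_end(key)
        self.store[key] = value                 # write to store in both cases
```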
Write-back (write-behind)
Some concepts:
- A cache entry is “dirty” if its value isn’t written to store yet.
- Lazy-write: we write a cached value to the store only when that cached value is about to be overwritten.
- In a write operation, on a write-miss, we have 2 options:
  - Create a cache entry: called “write allocate” (aka fetch on write)
  - Don’t create a cache entry: called “no-write allocate” (aka write-no-allocate or write around)
- Risk of data loss: if the cache is shut down ungracefully, dirty data is never written to the store
On a cache miss, when creating a cache entry:
- If there is an empty slot: use that slot
- If no empty slot:
  - Find a possible existing entry to replace (many strategies)
  - If the existing entry is dirty, write its value to store
  - Mark the entry as “not-dirty” (since the data is now the same in cache and store)
  - Set the new key-value on that entry
Read operation
- On read-hit: just return the cache entry value
- On read-miss:
  - Read data from store and create a cache entry using the strategy above
  - Return the cache entry value
Write operation
- On write-hit: overwrite the cache entry and mark it dirty (meaning not written to store yet)
- On write-miss:
  - For strategy “write allocate” (update the cache on write):
    - Create a cache entry using the strategy above
    - Mark the cache entry as dirty
  - For strategy “no-write allocate” (don’t update the cache on write):
    - Just write to store
    - Don’t create a cache entry
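Putting the pieces together, a write-back sketch in Python with a dirty flag per entry and the “write allocate” strategy; all names here are illustrative, and the entry-creation procedure above appears as `_allocate`:

```python
from collections import OrderedDict

class WriteBackCache:
    """Write-back sketch: writes go to the cache; the store is updated lazily."""

    def __init__(self, store, capacity=2):
        self.store = store
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> (value, dirty); LRU entry is first

    def _allocate(self, key, value, dirty):
        """Create a cache entry, lazily writing the evicted dirty one if full."""
        if key not in self.entries and len(self.entries) >= self.capacity:
            old_key, (old_value, old_dirty) = self.entries.popitem(last=False)
            if old_dirty:                        # lazy-write: flush the dirty victim
                self.store[old_key] = old_value
        self.entries[key] = (value, dirty)
        self.entries.move_to_end(key)

    def read(self, key):
        if key in self.entries:                  # read-hit
            self.entries.move_to_end(key)
            return self.entries[key][0]
        value = self.store[key]                  # read-miss: read from store
        self._allocate(key, value, dirty=False)  # entry matches store -> not dirty
        return value

    def write(self, key, value):
        # Both write-hit and write-miss (with "write allocate") end up as a
        # dirty cache entry; nothing is written to the store here.
        self._allocate(key, value, dirty=True)

    def flush(self):
        """Write every dirty entry back to the store (e.g. before shutdown)."""
        for key, (value, dirty) in self.entries.items():
            if dirty:
                self.store[key] = value
                self.entries[key] = (value, False)
```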
Write-through vs write-back
- Write-through: keeps data consistent between cache and store
- Write-back:
  - Store is only updated when a cache entry is overwritten
  - Dirty entries can be lost if the cache shuts down unexpectedly
  - Extra work to track whether a cache entry is dirty or not
- In read-miss:
  - Write-through: 1 operation: load from store
  - Write-back: up to 2 operations: write the evicted dirty entry to store + load from store
- For write-through, every write already goes directly to store, so creating a cache entry on write-miss brings no benefit. Hence the no-write-allocate strategy is preferred.
- For write-back, writes go to the cache instead of the store, so creating a cache entry on write-miss pays off on the next write-hit.
Cache coherence
There are cases where some modules change data in the store directly, so the cached data becomes out of date. Solutions:
- The store must trigger a cache update whenever there is a change
- Or the modules must trigger a cache update when they change the store directly
For a system with multiple caches, if one cache is changed, the other caches become out of date. Some cache coherency protocol must be applied. Goals:
- Write Propagation: changes in any cache must be propagated to the other caches
- Transaction Serialization:
  - All writes to the same memory location are performed in some sequential order
  - All reads to the same memory location must be seen by all processors in the same order
Mechanism: Snooping
Aka Bus-based.
Every cache has a monitor.
It watches the memory locations it has cached, to detect write operations to that memory.
When a write operation happens in one cache, that cache broadcasts a notification on the bus.
Then the other cache controllers have 2 options:
- Write-invalidate:
  - The cache controller marks its cache entry as invalid.
  - So the next read of that entry triggers a re-read from memory into the cache.
- Write-update:
  - The cache controller immediately re-reads from memory and updates its cache entry.
  - The next read doesn’t need to re-read from memory.
This is basically broadcasting write-operation notifications to all cache monitors.
It is fast, but requires more bandwidth as the system grows.
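A toy sketch of write-invalidate snooping in Python: a shared bus broadcasts each write, and every other cache drops its copy of that key. This is a simplification for illustration, not a real protocol like MESI:

```python
class Bus:
    """Toy snooping bus: broadcasts every write to all attached cache monitors."""

    def __init__(self):
        self.caches = []

    def broadcast_write(self, writer, key):
        for cache in self.caches:
            if cache is not writer:
                cache.snoop(key)    # every other monitor sees the write

class SnoopingCache:
    def __init__(self, bus, memory):
        self.bus = bus
        self.memory = memory
        self.entries = {}
        bus.caches.append(self)

    def snoop(self, key):
        # Write-invalidate: drop our copy; the next read re-reads from memory.
        self.entries.pop(key, None)

    def read(self, key):
        if key not in self.entries:
            self.entries[key] = self.memory[key]
        return self.entries[key]

    def write(self, key, value):
        self.entries[key] = value
        self.memory[key] = value                # write-through here, for simplicity
        self.bus.broadcast_write(self, key)     # notify the other caches via the bus

# Usage: cache_b's stale copy is invalidated when cache_a writes.
memory = {"x": 1}
bus = Bus()
cache_a, cache_b = SnoopingCache(bus, memory), SnoopingCache(bus, memory)
cache_b.read("x")         # cache_b now holds x = 1
cache_a.write("x", 2)     # broadcast invalidates cache_b's entry
print(cache_b.read("x"))  # re-reads memory -> 2
```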
Mechanism: Directory-based
Instead of using bus, use a directory to track:
- The status of all cache blocks
- Which nodes are sharing that block at that time
Cache nodes don’t broadcast. Instead, each node queries the directory:
- To find out which parties also have that cache entry
- Send requests only to those parties
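A sketch of the directory idea in Python: instead of broadcasting, the directory records which nodes share each key, so invalidations go only to those nodes. Again a simplification of real directory protocols:

```python
class Directory:
    """Tracks which cache nodes currently hold a copy of each key."""

    def __init__(self):
        self.sharers = {}  # key -> set of nodes caching that key

    def note_read(self, node, key):
        self.sharers.setdefault(key, set()).add(node)

    def note_write(self, writer, key):
        # Send invalidations only to the nodes that actually share the key.
        for node in self.sharers.get(key, set()) - {writer}:
            node.invalidate(key)
        self.sharers[key] = {writer}

class Node:
    def __init__(self, directory, memory):
        self.directory = directory
        self.memory = memory
        self.entries = {}

    def invalidate(self, key):
        self.entries.pop(key, None)

    def read(self, key):
        if key not in self.entries:
            self.entries[key] = self.memory[key]
            self.directory.note_read(self, key)  # register as a sharer
        return self.entries[key]

    def write(self, key, value):
        self.entries[key] = value
        self.memory[key] = value
        self.directory.note_write(self, key)     # targeted, not broadcast

# Usage: only the sharing node gets an invalidation.
memory = {"x": 1}
directory = Directory()
n1, n2 = Node(directory, memory), Node(directory, memory)
n2.read("x")          # directory records n2 as a sharer of "x"
n1.write("x", 2)      # invalidation is sent only to n2
print(n2.read("x"))   # re-reads memory -> 2
```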
Replacement policies
The cache is expensive, so it’s usually limited in size. When full, the cache needs a strategy to decide which entries to discard, so that the most often used entries stay.
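For example, least-recently-used (LRU) is a common replacement policy. A sketch in Python, using an OrderedDict that keeps entries in recency order:

```python
from collections import OrderedDict

class LRUCache:
    """LRU replacement: evict the entry that was unused for the longest time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the LRU entry
        self.entries[key] = value
```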
Applications
Hardware cache
- CPU
- MMU, TLB
- GPU
- DSP
Disk cache
Web cache
- Client cache
  - Web browser caches for static resources
- Server cache
  - Proxy cache at ISP
  - CDN
  - Web server cache (for content from app servers)
- DNS cache (IP <-> Domain)
- Search engine cached web pages
- Database cache
  - Query cache: hash of query <-> result set (see the sketch after this list)
    - Issue: changes to an element (table, column) invalidate all related cache entries
  - Object cache
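For instance, the query cache mentioned above can be sketched like this: hash the query text as the key, cache the result set, and invalidate every entry touching a table when that table changes. The `tables` bookkeeping is an assumption for illustration:

```python
import hashlib

class QueryCache:
    """Query-cache sketch: hash of the query text -> result set, with coarse
    invalidation when any referenced table changes."""

    def __init__(self):
        self.results = {}  # query hash -> result set
        self.tables = {}   # query hash -> tables the query touches (assumed known)

    @staticmethod
    def _key(query):
        return hashlib.sha256(query.encode()).hexdigest()

    def get(self, query):
        return self.results.get(self._key(query))

    def put(self, query, tables, result_set):
        key = self._key(query)
        self.results[key] = result_set
        self.tables[key] = set(tables)

    def invalidate_table(self, table):
        # A change to any element of a table invalidates all related entries.
        stale = [k for k, t in self.tables.items() if table in t]
        for key in stale:
            del self.results[key]
            del self.tables[key]

# Usage: changing the users table drops every cached query that touches it.
qc = QueryCache()
qc.put("SELECT name FROM users", ["users"], [("Alice",), ("Bob",)])
qc.invalidate_table("users")
print(qc.get("SELECT name FROM users"))  # -> None
```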