Design a distributed caching layer for a high-traffic read-heavy application
Reported in Booking.com European engineering loops. A senior-level system design question that combines sharding, eviction, consistency, and observability.
Interview scenario
Often asked in Booking.com loops at European offices (London, Berlin, Amsterdam, Paris, Stockholm, Dublin, and remote EU). Prepare a clear spoken answer plus key trade-offs.
Model answer
How to frame this at Booking.com: Connect your answer to measurable impact, clarity of thought, and trade-offs the team cares about. Below is a strong baseline response you can adapt with your own project examples.
Requirements: sub-millisecond reads, horizontal scalability, TTL support, tolerance of node failures, and an optional near-cache on the application servers.
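Architecture: use client-side consistent hashing to shard keys across a Redis or Memcached cluster, so that adding or removing a node remaps only a small fraction of keys; replicate each shard for high availability; optionally add a size-bounded local cache (for example Caffeine) as an L1 tier on each app server, invalidated via pub/sub.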
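To make the sharding step concrete, here is a minimal sketch of a consistent hash ring with virtual nodes. The class name `HashRing`, the `vnodes` parameter, and the node labels are hypothetical names chosen for illustration, not the API of any Redis or Memcached client. The sketch assumes SHA-256 as the hash function, though any well-distributed hash would do.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative, not a library API)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (point, node) tuples
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _point(value):
        # Map a string to a 64-bit position on the ring.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # Each physical node owns many points, which evens out the key distribution.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._point(f"{node}#{i}"), node))

    def remove_node(self, node):
        # Dropping a node removes only its own points; other keys keep their owner.
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("hash ring has no nodes")
        # First point at or after the key's position, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (self._point(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.get_node("hotel:12345")  # one of the three node labels
```

When a node is removed, only the keys it owned move to a neighbour on the ring. That is the property worth stating aloud in the interview, because plain modulo hashing would remap almost every key.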
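Patterns: cache-aside with TTL jitter, so that keys written together do not all expire at the same moment; single-flight on a miss, so that only one caller per key reloads from the database while the others wait (this is what prevents a thundering herd); brief negative caching of absent keys, so that repeated lookups for missing data do not reach the DB.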
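The three patterns can be combined in one small read path. The following is a sketch under stated assumptions: an in-process dictionary stands in for Redis or Memcached, `loader` is a hypothetical callable that returns `None` when the database has no row, and the class name `CacheAside` and its default TTLs are illustrative. Single-flight is approximated with striped locks, so different keys that share a stripe also serialize their reloads.

```python
import random
import threading
import time

_ABSENT = object()  # cached marker meaning "the database has no such key"
_MISS = object()    # internal marker meaning "no fresh cache entry"


class CacheAside:
    def __init__(self, loader, ttl=300.0, jitter=0.1, negative_ttl=5.0, stripes=64):
        self._loader = loader            # hypothetical DB read: key -> value or None
        self._ttl = ttl
        self._jitter = jitter            # +/- fraction applied to the TTL
        self._negative_ttl = negative_ttl
        self._store = {}                 # key -> (value, expires_at); stand-in for Redis
        self._locks = [threading.Lock() for _ in range(stripes)]

    def _jittered(self, ttl):
        # Spread expiry times so keys loaded together do not expire together.
        return ttl * (1.0 + random.uniform(-self._jitter, self._jitter))

    def _lookup(self, key):
        entry = self._store.get(key)
        if entry is None or entry[1] <= time.monotonic():
            return _MISS
        return entry[0]

    def get(self, key):
        value = self._lookup(key)
        if value is _MISS:
            # Single-flight: one loader call per key; other callers block here.
            with self._locks[hash(key) % len(self._locks)]:
                value = self._lookup(key)  # another caller may have filled it
                if value is _MISS:
                    loaded = self._loader(key)
                    if loaded is None:
                        # Negative caching: remember the absence briefly.
                        value, ttl = _ABSENT, self._negative_ttl
                    else:
                        value, ttl = loaded, self._jittered(self._ttl)
                    self._store[key] = (value, time.monotonic() + ttl)
        return None if value is _ABSENT else value
```

A caller would construct it as `CacheAside(load_from_db)`, where `load_from_db` is whatever function reads the source of truth. With a shared remote cache the same idea usually needs a distributed lock or a per-process single-flight in front of the cache client, which this sketch does not attempt.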
Consistency: accept eventual consistency; put version stamps in values so that stale writes can be detected and rejected; use write-through for critical keys. Observability: monitor hit ratio, p99 latency, evictions, and hot keys. For hot spots, consider read replicas or key splitting.
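Version stamps and key splitting are easier to discuss with a small example in hand. This sketch assumes a plain dictionary as the cache and uses hypothetical names (`Versioned`, `put_if_newer`, `split_keys`, `write_hot`, `read_hot`). Against a real Redis deployment the compare-and-write step would have to be atomic, for instance via a Lua script or WATCH/MULTI, which the dictionary version does not model.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    version: int      # monotonically increasing stamp from the source of truth
    payload: object


def put_if_newer(store, key, incoming):
    """Write `incoming` only if it is newer than the cached value. Returns True if written."""
    current = store.get(key)
    if current is not None and current.version >= incoming.version:
        return False  # stale or duplicate write: keep what we have
    store[key] = incoming
    return True


def split_keys(key, copies):
    # Suffixed copies hash to different shards, spreading reads of one hot key.
    return [f"{key}#{i}" for i in range(copies)]


def write_hot(store, key, value, copies=4):
    # Writes fan out to every copy; the cost is `copies` writes per update.
    for k in split_keys(key, copies):
        put_if_newer(store, k, value)


def read_hot(store, key, copies=4):
    # Each read picks one copy at random, so load is shared across shards.
    return store.get(random.choice(split_keys(key, copies)))
```

The trade-off to mention is that key splitting multiplies write cost and memory by the number of copies, so it suits keys that are read far more often than they change.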