Design a centralized metrics and logging pipeline

Level: Senior (5–8 years)
Category: System design
Difficulty: Medium

Reported in Booking.com European engineering loops. This is an observability architecture question asked for platform roles.

Role: SRE
Location: Munich, Germany

Interview scenario

Context for Booking.com candidates:

Collect logs and metrics from thousands of services with searchable dashboards and alerting.

How to frame this at Booking.com: Connect your answer to measurable impact, clarity of thought, and trade-offs the team cares about. Below is a strong baseline response you can adapt with your own project examples.

Agents on each host collect telemetry and push to ingestion gateways with backpressure controls. Use separate streams for logs, metrics, and traces because storage and query patterns differ significantly.
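To make the backpressure idea concrete, here is a minimal sketch of a host-side agent buffer. The class name `BoundedBuffer`, the drop-oldest policy, and the `send` callback that returns `False` when the gateway refuses a batch are all assumptions for illustration; a real agent might instead block producers or spill to disk.

```python
from collections import deque
from typing import Callable, List


class BoundedBuffer:
    """Hypothetical agent-side buffer that sheds the oldest records when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: deque = deque()
        self.dropped = 0  # exposed so the agent can report its own data loss

    def offer(self, record: dict) -> bool:
        """Add a record; return False if an older record had to be dropped."""
        if len(self.items) >= self.capacity:
            self.items.popleft()
            self.dropped += 1
            self.items.append(record)
            return False
        self.items.append(record)
        return True

    def flush(self, send: Callable[[List[dict]], bool], batch_size: int) -> int:
        """Send up to batch_size records; keep them if the gateway refuses."""
        count = min(batch_size, len(self.items))
        batch = [self.items[i] for i in range(count)]
        if not batch:
            return 0
        if not send(batch):
            return 0  # gateway signalled backpressure; retry on the next tick
        for _ in range(count):
            self.items.popleft()
        return count
```

In an interview, the point to draw out is that the buffer is bounded: memory on the host stays predictable during an incident storm, and the `dropped` counter makes the loss visible rather than silent.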

Metrics flow into time-series storage with retention tiers; logs flow into indexed document storage or an object-backed data lake with a hot-warm-cold tiering strategy. Add schema standards so that every record carries the service name, environment, region, and correlation ID.
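The schema standard and the hot-warm-cold split can be sketched in a few lines. The field names in `REQUIRED_FIELDS` mirror the four attributes listed above; the function names and the 7-day and 30-day tier boundaries are assumptions chosen only for illustration.

```python
# Field names follow the schema standard described above; spelling is assumed.
REQUIRED_FIELDS = ("service", "environment", "region", "correlation_id")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def storage_tier(age_days: float, hot_days: int = 7, warm_days: int = 30) -> str:
    """Pick a storage tier from record age; boundaries are illustrative."""
    if age_days < hot_days:
        return "hot"
    if age_days < warm_days:
        return "warm"
    return "cold"
```

A gateway could reject or quarantine records for which `missing_fields` returns a non-empty list, which keeps dashboards and cross-service searches consistent.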

Reliability requires buffering, retry queues, and sampling controls so the pipeline survives incident storms. Explain how SLO-driven alerts replace noisy static thresholds and improve on-call signal quality.
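One common way to express an SLO-driven alert is a multi-window burn-rate check, sketched below. The function names, the (errors, total) window tuples, and the default threshold of 14.4 (a value often quoted for a one-hour window against a 30-day SLO) are assumptions, not something the scenario prescribes.

```python
from typing import Tuple


def burn_rate(errors: int, total: int, slo_target: float) -> float:
    """Ratio of the observed error rate to the error budget implied by the SLO."""
    if total == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return (errors / total) / error_budget


def should_page(
    long_window: Tuple[int, int],
    short_window: Tuple[int, int],
    slo_target: float,
    threshold: float = 14.4,  # assumed default; tune per SLO period
) -> bool:
    """Page only if both windows burn budget faster than the threshold."""
    long_burn = burn_rate(long_window[0], long_window[1], slo_target)
    short_burn = burn_rate(short_window[0], short_window[1], slo_target)
    return long_burn >= threshold and short_burn >= threshold
```

Requiring both windows to exceed the threshold is what cuts noise: the long window shows the problem is significant, and the short window shows it is still happening.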
