T79 Dec 24, 2025 3 min read

Reliability

How consistently a system delivers correct results over time, including its ability to handle failures and meet targets.

Definition

Reliability is how consistently the system does the right thing over time.

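In practice, you make reliability concrete with service level indicators (SLIs), which measure what users actually experience, and service level objectives (SLOs), which set the target each SLI should meet.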
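As a rough illustration, an availability SLI is often computed as the fraction of good events out of all events, and the SLO is the target that fraction should meet. The sketch below assumes a request-based SLI. The function name `availability_sli`, the request counts, and the 99.9% target are hypothetical and chosen only for illustration.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully (a hypothetical request-based SLI)."""
    if total_requests == 0:
        # Assumption: with no traffic, nothing failed, so report full availability.
        return 1.0
    return good_requests / total_requests


SLO_TARGET = 0.999  # assumed target: 99.9% of requests succeed

# Made-up counts for one measurement window.
sli = availability_sli(good_requests=99_950, total_requests=100_000)
meets_slo = sli >= SLO_TARGET
print(f"SLI = {sli:.4%}, meets SLO: {meets_slo}")
```

Other SLIs, such as latency under a threshold, can follow the same good-over-total shape; only the definition of a "good" event changes.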

Why it matters

Reliable systems fail in smaller, more predictable ways.

Unreliable systems fail by surprise.

How to make it concrete

  • Pick an SLI that matches user experience.
  • Set an SLO that matches business reality.
  • Use the error budget (the amount of unreliability the SLO permits) to decide when to ship faster and when to slow down.
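The three steps above can be sketched in a few lines. This is a minimal example, assuming a request-based SLI and a fixed measurement window. The function names `error_budget_remaining` and `release_posture`, the 25% caution threshold, and the sample counts are all hypothetical, not a standard policy.

```python
def error_budget_remaining(slo_target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent; negative means overspent."""
    allowed_failures = (1.0 - slo_target) * total
    actual_failures = total - good
    if allowed_failures == 0:
        # No budget exists (100% target or no traffic). Assumption: treat zero
        # failures as an untouched budget and any failure as fully overspent.
        return 1.0 if actual_failures == 0 else float("-inf")
    return 1.0 - actual_failures / allowed_failures


def release_posture(remaining: float, caution_threshold: float = 0.25) -> str:
    """Map remaining budget to a release decision (thresholds are illustrative)."""
    if remaining <= 0:
        return "freeze"      # budget spent: prioritize reliability work
    if remaining < caution_threshold:
        return "slow down"   # budget nearly spent: ship more carefully
    return "ship"            # plenty of budget left: keep shipping


# Made-up window: a 99.9% SLO, 1,000,000 requests, 600 failures.
remaining = error_budget_remaining(slo_target=0.999, good=999_400, total=1_000_000)
posture = release_posture(remaining)
print(f"budget remaining: {remaining:.0%}, posture: {posture}")
```

Here the SLO permits roughly 1,000 failed requests in the window, so 600 failures leave about 40% of the budget. Where to draw the "slow down" line is a judgment call for each team.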