2025-08-09 · 2 min read

SLIs vs. SLOs: Measuring What Matters to Your Users

SRE, Observability, Reliability

Reliability is arguably the most important feature of any product: if users cannot reach a service, nothing else it offers matters. But how do you measure it? Service Level Indicators (SLIs) and Service Level Objectives (SLOs) are the foundation of Site Reliability Engineering (SRE).

SLI: The Indicator

An SLI is a quantitative measure of some aspect of the level of service provided.

Example: The percentage of successful HTTP requests over a 5-minute window.
Tip: Focus on user-facing metrics like latency, availability, and error rate.
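
The example above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `Request` record and the `availability_sli` function are hypothetical names, and counting 2xx/3xx responses as "successful" is an assumption that each service should adjust to its own definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one served HTTP request."""
    timestamp: datetime
    status: int


def availability_sli(
    requests: Iterable[Request],
    window_end: datetime,
    window: timedelta = timedelta(minutes=5),
) -> Optional[float]:
    """Percentage of successful requests in (window_end - window, window_end]."""
    window_start = window_end - window
    in_window = [r for r in requests if window_start < r.timestamp <= window_end]
    if not in_window:
        return None  # no traffic: the SLI is undefined rather than 100%
    # Assumption: 2xx and 3xx responses count as success.
    good = sum(1 for r in in_window if 200 <= r.status < 400)
    return 100.0 * good / len(in_window)
```

Returning `None` for an empty window is a deliberate choice in this sketch: reporting 100% when nobody could reach the service would hide exactly the outage the SLI is meant to reveal.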

SLO: The Objective

An SLO is a target value or range of values for a service level that is measured by an SLI.

Example: 99.9% of HTTP requests should be successful over a rolling 30-day window.
Tip: An SLO should be "good enough" for the user, not perfect. 100% is rarely the right target.
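
As a rough sketch of how the rolling 30-day example could be evaluated, assume the service already exports one `(good, total)` request count per day. That input shape and the function names below are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

DailyCount = Tuple[int, int]  # (good requests, total requests) for one day


def rolling_success_ratio(daily_counts: List[DailyCount], window_days: int = 30) -> float:
    """Success ratio over the most recent `window_days` entries (oldest first)."""
    recent = daily_counts[-window_days:]
    good = sum(g for g, _ in recent)
    total = sum(t for _, t in recent)
    if total == 0:
        raise ValueError("no requests in the window; the ratio is undefined")
    return good / total


def slo_met(daily_counts: List[DailyCount], target: float = 0.999, window_days: int = 30) -> bool:
    """True when the rolling success ratio is at or above the SLO target."""
    return rolling_success_ratio(daily_counts, window_days) >= target
```

Because only the last 30 entries are considered, a bad day stops counting against the objective once it ages out of the window.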

Error Budgets: The Magic Sauce

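The difference between 100% and your SLO is your Error Budget. With a 99.9% SLO, for example, the budget is 0.1% of requests, or about 43 minutes of full downtime over 30 days.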

If you have budget left, you can ship new features quickly.
If you exhaust your budget, you stop new releases and focus on reliability.
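
The budget arithmetic and the release rule above can be written down as a small sketch. The function names are hypothetical, and the SLO is assumed to be expressed as a fraction (0.999 rather than 99.9).

```python
from datetime import timedelta


def allowed_downtime(slo: float, window: timedelta) -> timedelta:
    """Time-based budget: the share of the window that may be 'bad'."""
    return window * (1.0 - slo)


def remaining_budget(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the request-based budget still unspent (negative if overspent)."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    if total_requests == 0:
        return 1.0  # nothing served yet, so nothing spent
    budget = (1.0 - slo) * total_requests  # failures the SLO tolerates
    return 1.0 - failed_requests / budget


def can_release(remaining: float) -> bool:
    """Ship while budget is left; stop releases once it is exhausted."""
    return remaining > 0.0
```

Whether the budget is tracked in requests or in minutes is a design choice. A request-based budget weights busy periods more heavily, while a time-based one treats every minute the same.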

How to pick a good SLI (quick checklist)

A strong SLI is:

User-focused: what users feel (latency, errors, availability).
Actionable: when it drops, you can narrow it to a service, endpoint, region, or dependency.
Stable: avoid raw infrastructure signals; normalize by requests and use rolling windows.
Measurable end-to-end: instrument it so you don’t have blind spots during incidents.

Infrastructure metrics (CPU, memory) are great diagnostics, but they are rarely good contracts.

A simple example: Checkout API

SLI (availability): % of 2xx/3xx responses on /checkout.
SLI (latency): p95 response time on /checkout.
SLO: 99.9% success and p95 < 300 ms over 30 days.

Then translate the error budget into decisions: “if remaining budget < 20%, slow down releases and prioritize stability work”.
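
A minimal sketch of the Checkout example follows, under stated assumptions: `CheckoutRequest` and the function names are hypothetical, the p95 uses the nearest-rank method (one of several common percentile definitions), and the 20% threshold is the one quoted above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CheckoutRequest:
    """Hypothetical record of one request to /checkout."""
    status: int
    latency_ms: float


def success_ratio(requests: Sequence[CheckoutRequest]) -> float:
    """Availability SLI: share of 2xx/3xx responses."""
    good = sum(1 for r in requests if 200 <= r.status < 400)
    return good / len(requests)


def p95_latency_ms(requests: Sequence[CheckoutRequest]) -> float:
    """Latency SLI: nearest-rank 95th percentile of response time."""
    ordered = sorted(r.latency_ms for r in requests)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def release_decision(
    requests: Sequence[CheckoutRequest],
    slo: float = 0.999,
    threshold: float = 0.20,
) -> str:
    """Turn the remaining availability budget into a release policy."""
    if not requests:
        raise ValueError("no requests observed; the budget cannot be evaluated")
    failed = sum(1 for r in requests if not 200 <= r.status < 400)
    budget = (1.0 - slo) * len(requests)  # failures the SLO tolerates
    remaining = 1.0 - failed / budget
    if remaining < threshold:
        return "slow down releases"
    return "ship normally"
```

In practice the `requests` sequence would cover the full 30-day window, and the p95 value would be compared against the 300 ms target alongside the availability check.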

Common mistakes

SLOs everywhere: start with critical user journeys (login, checkout, search).
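Unrealistic targets: 99.99% leaves only about 4 minutes of downtime per 30 days and is expensive to sustain; align ambition with product value.
Fixed calendar windows: the budget resets at each boundary, which can hide a recent bad stretch; rolling windows keep trends and regressions visible.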

Conclusion

SLIs and SLOs turn reliability from a vague goal into a measurable contract. They align product and engineering teams by providing a shared language for balancing innovation speed with system stability.
