Alex Ewerlöf Notes

SRE

Definition of good in SLI

How to use the definition of "good" in the service level formula to focus the optimization?

Alex Ewerlöf
Aug 8, 2023

In a recent article, we discussed the service level indicator formula:

\(SLI = \frac {\text{good}} {\text{valid}} \times 100\)
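As a quick worked example of the formula, here is a minimal sketch in Python. The function name `sli_percent`, the sample numbers, and the choice to raise an error when valid is zero are assumptions made for illustration; they are not taken from the article.

```python
def sli_percent(good: int, valid: int) -> float:
    """Return the SLI as a percentage: good divided by valid, times 100."""
    if valid <= 0:
        # Assumption: with nothing valid, the ratio is treated as undefined.
        raise ValueError("valid must be positive")
    if not 0 <= good <= valid:
        # good is counted out of valid, so it cannot exceed it.
        raise ValueError("good must be between 0 and valid")
    return 100 * good / valid


# Hypothetical numbers: 990 good events (or time slots) out of 1000 valid ones.
example = sli_percent(good=990, valid=1000)
print(example)
```

The same function works for both time-based and event-based SLIs, because only the meaning of the two counts changes.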

Another article discussed valid. This article covers the definition of good.

Depending on the type of SLI:

  • Time-based SLI: good specifies a good time slot

  • Event-based SLI: good specifies a good event

There are four basic ways to declare good.

1. Upper bound

This is by far the most common type of declaration for good: the value of a metric is considered good if it is below an upper threshold that denotes “good enough”.

For example, if our SLI measures latency, a sufficiently fast request might be one that completes in under 200ms, or under 1000ms. The threshold needs to be connected to something the consumer cares about and to how the consumer perceives reliability.
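The upper bound can be sketched as a simple predicate over latency samples. This is an illustration only: the sample latencies, the 200ms threshold, and the name `is_good_upper` are hypothetical, and whether the comparison is strict or inclusive is a choice you would make for your own SLI.

```python
# Hypothetical latency samples in milliseconds, one per valid request.
latencies_ms = [120, 180, 250, 90, 1100, 199, 200, 310]

# Assumed "good enough" threshold; pick one that reflects what consumers perceive.
UPPER_BOUND_MS = 200


def is_good_upper(value: float, upper: float) -> bool:
    """A value is good when it is strictly below the upper threshold."""
    return value < upper


good = sum(1 for v in latencies_ms if is_good_upper(v, UPPER_BOUND_MS))
valid = len(latencies_ms)
upper_sli = 100 * good / valid
print(f"{good} good out of {valid} valid requests")
```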

2. Lower bound

Conceptually this is the mirror image of the upper bound: the value of a metric is considered good if it is above a lower threshold that denotes “good enough”.

For example, for an expensive worker that consumes a queue (think Midjourney prompts on an expensive GPU), the utilization of those machines should be high.
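A minimal sketch of a lower bound, treated here as a time-based SLI over per-minute utilization samples. The samples, the 0.80 threshold, and the name `is_good_lower` are assumptions for illustration.

```python
# Hypothetical GPU utilization per minute, as a fraction between 0 and 1.
utilization_per_minute = [0.92, 0.85, 0.40, 0.97, 0.10, 0.88]

# Assumed lower threshold: a minute is good when utilization is above it.
LOWER_BOUND = 0.80


def is_good_lower(value: float, lower: float) -> bool:
    """A value is good when it is strictly above the lower threshold."""
    return value > lower


good_minutes = sum(1 for u in utilization_per_minute if is_good_lower(u, LOWER_BOUND))
valid_minutes = len(utilization_per_minute)
lower_sli = 100 * good_minutes / valid_minutes
print(f"{good_minutes} good minutes out of {valid_minutes}")
```

Note that the bound applies to the metric value that decides whether a minute counts as good, not to the SLI percentage itself. A higher SLI still means more good minutes.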

3. Range bound

A combination of the upper bound and lower bound. If the metric value is within a range, it’s considered good.
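A range bound can be sketched by combining the two checks. The article gives no concrete example for this type, so the metric (a server room temperature in degrees Celsius), the 18 to 27 range, and the inclusive comparison are all hypothetical choices made for illustration.

```python
# Hypothetical metric samples: server room temperature in degrees Celsius.
temperatures_c = [17.5, 18.0, 22.3, 27.0, 27.1, 30.0]

# Assumed range; both ends are treated as inclusive in this sketch.
LOWER_BOUND_C = 18.0
UPPER_BOUND_C = 27.0


def is_good_range(value: float, lower: float, upper: float) -> bool:
    """A value is good when it lies within [lower, upper]."""
    return lower <= value <= upper


good_samples = sum(
    1 for t in temperatures_c if is_good_range(t, LOWER_BOUND_C, UPPER_BOUND_C)
)
valid_samples = len(temperatures_c)
range_sli = 100 * good_samples / valid_samples
print(f"{good_samples} good samples out of {valid_samples}")
```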

4. No bound

In this case, good is defined as a subset of:

  • Time (for time-based SLIs). For example, if our goal is to improve the website uptime:

    • good time: minutes where the site can be pinged

    • valid time: all the minutes in the compliance period (e.g. a month)

  • Events (for event-based SLIs). For example, if our goal is to improve the product purchase flow:

    • good events can be the number of orders processed with a settled payment

    • valid events can be the number of orders placed via the website and apps
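The event-based example above can be sketched as two filters, where good is a subset of valid and no threshold is involved. The order records, field names, and channel and payment values are hypothetical.

```python
# Hypothetical order records; the field names are assumptions for illustration.
orders = [
    {"id": 1, "channel": "web", "payment": "settled"},
    {"id": 2, "channel": "app", "payment": "failed"},
    {"id": 3, "channel": "web", "payment": "settled"},
    {"id": 4, "channel": "phone", "payment": "settled"},  # not placed via website or apps
    {"id": 5, "channel": "app", "payment": "pending"},
]

# valid events: orders placed via the website and apps.
valid_orders = [o for o in orders if o["channel"] in {"web", "app"}]

# good events: valid orders that were processed with a settled payment.
good_orders = [o for o in valid_orders if o["payment"] == "settled"]

purchase_sli = 100 * len(good_orders) / len(valid_orders)
print(f"{len(good_orders)} good out of {len(valid_orders)} valid orders")
```

Filtering good from the valid list, rather than from all orders, keeps the phone order out of both counts.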

Conclusion

Depending on the type of SLI, good either specifies good events or good time periods. See this other article for more information:

Time based vs Event based SLIs


The definition of good is also related to valid, so make sure to check that article as well:

SLI: Valid vs Total

4 Comments
Jens Rantil
Sep 17

> For example, an expensive worker that consumes some queue (think Midjourney prompts on an expensive GPU), the utilization on those machines should be high.

Hm, I'm not entirely sure this is a good example of an SLI; I don't think the end-customer cares about "GPU utilization"... :-P

Jens Rantil
Sep 17

Actually, I don't understand the "2. lower bound". If the definition of SLI is good/total, then a lower bound means that you want _few_ good events. That doesn't make sense to me. I have always worked with upper thresholds for SLIs since I want to be above a certain ratio of "good".

© 2023 Alex Ewerlöf