<2020-06-22 Mon> tech k8s

Gotchas in k8s QoS class calculations

Pods in Kubernetes are assigned to one of three QoS classes:

  1. Guaranteed: The strongest class; the pod gets a guaranteed resource reservation.
  2. Burstable: A minimum amount of resources is reserved, with some slack to burst above it.
  3. Best Effort: No guarantees around resource availability.

The primary source for understanding this is the reference page and the links there.

What is not immediately clear, though, is how the QoS class is calculated given:

  1. Multiple containers in a pod.
  2. Different resource types, i.e. CPU and memory.

A first visual representation of the classes is as follows (x marks a value that is set, - one that is not):

  Request  Limit  QoS class           Source
  x        x      Guaranteed          Ref
  x        -      Burstable           Ref
  -        x      Guaranteed          Ref (k8s autofills request to limit)
  -        -      Best Effort         Obvious
  x        > x    Burstable           Ref
  x        < x    Deployment Failure  Deployment check flags this and exits
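The rows of the table above can be sketched as a small function for a single resource. This is only an illustration of the table, not code taken from Kubernetes; the function name `qos_for_resource` and the use of plain numbers for quantities are assumptions made for the example.

```python
from typing import Optional


def qos_for_resource(request: Optional[float], limit: Optional[float]) -> str:
    """Classify one resource's (request, limit) pair following the table above.

    None stands for "-" (not set). Hypothetical helper, for illustration only.
    """
    if request is None and limit is None:
        return "Best Effort"
    if limit is None:
        # Request without a limit: a floor is reserved, nothing caps the top.
        return "Burstable"
    if request is None:
        # Limit without a request: the request is filled in from the limit.
        request = limit
    if request > limit:
        # Corresponds to the "Deployment Failure" row.
        raise ValueError("request must not exceed limit")
    return "Guaranteed" if request == limit else "Burstable"
```

Called as `qos_for_resource(None, 2)`, this follows the third row of the table and returns "Guaranteed".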

The kicker is that we are not done here. The table above is only valid if both CPU and memory are specified.

For example, what do you think happens with Mem(x, x) and CPU(-, -), where each pair is (request, limit)? We would guess Guaranteed or at least Burstable, but empirically it turns out to be Best Effort.

Empirically, with Kubernetes version 1.17, the following matrix holds, where Res1 is one of CPU/memory, Res2 is the other, and G, B, and BE stand for Guaranteed, Burstable, and Best Effort.

  Res1  Res2  QoS Class
  G     G     G
  B     B     B
  BE    BE    BE
  G     BE    BE
  G     B     B
  BE    B     B
BE B B

The first three rows make sense: if both resources are specified at the same QoS, the pod's QoS follows naturally. The next two rows also make some sense: if one of the resources is Guaranteed and the other is in one of the weaker classes, the weaker class is taken. The last row says that if one of them is Burstable and the other Best Effort, the stronger class, Burstable, is taken as the final QoS. This is a bit unintuitive.
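To make the asymmetry easier to see, the matrix can be written as a lookup keyed on the unordered pair of per-resource classes. This encodes only the observations reported above for 1.17; it is not the rule Kubernetes itself implements, and the names `OBSERVED` and `observed_pod_qos` are made up for this sketch.

```python
# Unordered pair of per-resource classes -> pod class, copied row by row
# from the matrix above (observations on Kubernetes 1.17, not an official rule).
OBSERVED = {
    frozenset(["G"]): "G",
    frozenset(["B"]): "B",
    frozenset(["BE"]): "BE",
    frozenset(["G", "BE"]): "BE",
    frozenset(["G", "B"]): "B",
    frozenset(["BE", "B"]): "B",
}


def observed_pod_qos(res1: str, res2: str) -> str:
    """Look up the pod class for two per-resource classes; order is irrelevant."""
    return OBSERVED[frozenset([res1, res2])]
```

Here `observed_pod_qos("G", "BE")` gives "BE" (the weaker class wins), while `observed_pod_qos("B", "BE")` gives "B" (the stronger class wins), so neither a plain minimum nor a plain maximum reproduces the whole table.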

I did not continue this analysis with multiple containers, but the same confusion follows. To be fair, the main reference does call out some of these in the text (not all of these combinations), though it is not the clearest of representations.

One of the morals I take away is this: if you want a pod to be Guaranteed, specify requests and limits for both CPU and memory on all the containers (including that tailer or debugger). It's the only way to be sure.
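A quick way to act on that moral is to scan a pod spec before deploying it. The sketch below works on a pod spec already loaded into a Python dict; the function name `missing_for_guaranteed` is hypothetical. Quantities are compared as plain strings, which is a simplification (it would treat "500m" and "0.5" as different), and init containers are included on the assumption that they count as well.

```python
def missing_for_guaranteed(pod_spec: dict) -> list:
    """List 'container/resource' entries that would stop a pod being Guaranteed."""
    problems = []
    containers = (pod_spec.get("containers") or []) + (
        pod_spec.get("initContainers") or []  # assumption: these count too
    )
    for container in containers:
        resources = container.get("resources") or {}
        limits = resources.get("limits") or {}
        requests = resources.get("requests") or {}
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # An omitted request is filled in from the limit, so only a
            # missing limit or a differing request is flagged.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                problems.append(f"{container.get('name', '?')}/{name}")
    return problems


pod = {
    "containers": [
        {
            "name": "app",
            "resources": {
                "requests": {"cpu": "500m", "memory": "128Mi"},
                "limits": {"cpu": "500m", "memory": "128Mi"},
            },
        },
        # The forgotten sidecar: memory limit only, nothing for CPU.
        {"name": "tailer", "resources": {"limits": {"memory": "64Mi"}}},
    ]
}
print(missing_for_guaranteed(pod))  # ['tailer/cpu']
```

An empty list means every container sets both resources with matching request and limit; anything else names the container and resource to fix.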