On Cgroup-based CPU Limiting for Kubernetes Pods
Cgroups can be used to restrict the resources available to a process, such as CPU time; after all, that is their whole point. But how do you actually do it? Especially with existing Kubernetes pods?
You can use cgcreate and cgexec to create cgroups and to run processes inside them, but let us assume the cgroups already exist, because the processes we want to control are already running inside containers, in pods managed by Kubernetes.
The cgroup CPU controller (cgroup v1) is mounted at /sys/fs/cgroup/cpu. Within it you can find the cgroup hierarchy. Some of these entries are the cgroups created automatically by Kubernetes, and the directory names reveal their relation to the pods - for example, all Burstable pods sit under a common top-level cgroup, with each pod getting its own sub-hierarchy underneath.
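For instance, on a node where the kubelet uses the cgroupfs driver, the layout looks roughly like this (the pod UID and container ID below are made up; the systemd driver uses different, .slice-style names):

```
# top-level cgroup holding all Burstable pods
$ ls /sys/fs/cgroup/cpu/kubepods/burstable/
cgroup.procs  cpu.cfs_period_us  cpu.cfs_quota_us  cpu.shares  ...
pod1b2d4f06-9a3c-4e7b-8f21-0123456789ab/    # one sub-tree per pod, named after the pod UID

# inside a pod's cgroup: one sub-cgroup per container
$ ls /sys/fs/cgroup/cpu/kubepods/burstable/pod1b2d4f06-9a3c-4e7b-8f21-0123456789ab/
3f9c2a51b8e4.../  cgroup.procs  cpu.cfs_period_us  cpu.cfs_quota_us  cpu.shares
```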
At the leaf of the hierarchy is a file cpu.cfs_quota_us, which holds a single number: the CPU time, in microseconds (the us in the name), that the cgroup may use per scheduling period. The period lives in the sibling file cpu.cfs_period_us and defaults to 100000 us, so with the default period a quota of 100000 means 1 core, 50000 means half a core, and 200000 means 2 cores.
Simply writing the desired value into this file controls the amount of CPU allowed to the cgroup. The change is instantaneous and fine-grained.
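Putting it together, throttling a running Burstable pod to half a core is just a write to its cpu.cfs_quota_us. A sketch, assuming the cgroupfs driver and a pod named mypod (both the name and the path layout are illustrative):

```
# look up the pod's UID, which names its cgroup directory
POD_UID=$(kubectl get pod mypod -o jsonpath='{.metadata.uid}')

# on the node running the pod: cap the whole pod at half a core
echo 50000 | sudo tee /sys/fs/cgroup/cpu/kubepods/burstable/pod${POD_UID}/cpu.cfs_quota_us

# lift the cap again (-1 means no quota)
echo -1 | sudo tee /sys/fs/cgroup/cpu/kubepods/burstable/pod${POD_UID}/cpu.cfs_quota_us
```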
Based on this concept, I have written a kubectl plugin: https://github.com/ChanderG/kubectl-vscale
The idea is to very easily control the amount of CPU resources allotted to a pod. This differs from the normal Pod scaling mechanisms since it does not restart anything. Obviously, this is not meant for production use, since k8s no longer has an accurate picture of the node's total available resources.
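One way to see the throttling take effect is to watch the cgroup's cpu.stat: the nr_throttled and throttled_time counters start climbing as soon as the workload hits the new quota. The numbers below are illustrative, and the path again assumes the cgroupfs driver:

```
$ cat /sys/fs/cgroup/cpu/kubepods/burstable/pod${POD_UID}/cpu.stat
nr_periods 1843
nr_throttled 112
throttled_time 5312048032
```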