Kubernetes Resource Management

One9twO
Sep 22, 2023

Below is a test I performed to understand how Kubernetes allocates resources.

Test Plan

  • In a K8S sandbox, add a few deployments with varying requests and limits
  • Generate artificially high CPU and memory usage in the pods

Expected outcome

  • K8S does not allow over-allocation (i.e. the total of all pods' requests must not exceed the nodes' physical capacity)
  • When no spare resources are available, apps get OOM-killed if they try to go beyond their requests, even if their limits are not hit

Test Environment

I am subscribed to acloudguru.com on the Personal Basic Plan, which does not include sandbox access, so I usually use the hands-on labs for testing. For this test, I’m using the lab: Upgrading the Kubernetes Cluster Using kubeadm.

Kubernetes Controller and 2 workers: v1.17.8

Physical capacity (each node): 2 CPU cores, 4 GB RAM
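A quick way to confirm per-node capacity and how much of it is still allocatable is shown below. This is my own check, not part of the lab notes; the node name is taken from the outputs later in this post.

# Capacity is the raw hardware; Allocatable is what remains for pods after system reservations
$ kubectl describe node ip-10-0-1-102 | grep -A 6 Allocatable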

Test 1 (over procurement)

Create a deployment whose requests fit within the nodes' physical capacity:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-deployment
  labels:
    app: ubuntu
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ubuntu
  template:
    metadata:
      labels:
        app: ubuntu
    spec:
      containers:
      - name: app
        image: ubuntu:latest
        command: ["/bin/sleep", "100d"]
        resources:
          requests:
            memory: "2G"
            cpu: "1"

After applying, both replicas are scheduled and running:

$ kubectl get pods -o wide
NAME                                 READY   STATUS    RESTARTS   AGE   IP                NODE            NOMINATED NODE   READINESS GATES
ubuntu-deployment-5d684b4cbc-tnw5v   1/1     Running   0          37s   192.168.111.3     ip-10-0-1-103   <none>           <none>
ubuntu-deployment-5d684b4cbc-znkdh   1/1     Running   0          37s   192.168.163.197   ip-10-0-1-102   <none>           <none>

Next, change the replicas from 2 to 3, i.e. request more resources than the cluster can physically provide. After applying, the 3rd pod is always stuck in Pending, as the output below shows.
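For reference, this scale-up can also be done without editing the manifest (a sketch, assuming the deployment name above):

$ kubectl scale deployment ubuntu-deployment --replicas=3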

$ kubectl get pods -o wide
NAME                                 READY   STATUS    RESTARTS   AGE   IP                NODE            NOMINATED NODE   READINESS GATES
ubuntu-deployment-5d684b4cbc-9vkm4   0/1     Pending   0          7s    <none>            <none>          <none>           <none>
ubuntu-deployment-5d684b4cbc-tnw5v   1/1     Running   0          73s   192.168.111.3     ip-10-0-1-103   <none>           <none>
ubuntu-deployment-5d684b4cbc-znkdh   1/1     Running   0          73s   192.168.163.197   ip-10-0-1-102   <none>           <none>

A check for why it is pending (`kubectl describe pod ubuntu-deployment-5d684b4cbc-9vkm4`):

 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 Insufficient cpu.
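The tainted node is the control plane; a kubeadm cluster normally marks it NoSchedule, which can be confirmed with the command below. The other two nodes simply do not have a spare CPU core left.

$ kubectl describe nodes | grep Taints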

So this proves that over-procuring resources is prevented at scheduling time: the extra pod is created but never scheduled.

But what happens if the physical capacity drops in an existing cluster? I.e. I did not over-procure, but half of my cluster becomes unavailable.

To test this, I scaled the deployment back to 2 replicas, then powered off one of the worker nodes (worker2). After a few minutes, the pod that lived on worker2 showed Terminating, and a replacement pod tried to start on worker1, but since no spare resources are available there, it is stuck in Pending.
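The failover can be watched from the control plane with standard commands (my own addition; exact timings depend on the node eviction timeouts):

# worker2 eventually turns NotReady, and the replacement pod stays Pending
$ kubectl get nodes
$ kubectl get pods -o wide --watch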

Test 2 (limits)

First, create a deployment where requests are below the physical capacity, but set limits above it.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-deployment
  labels:
    app: ubuntu
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ubuntu
  template:
    metadata:
      labels:
        app: ubuntu
    spec:
      containers:
      - name: app
        image: ubuntu:latest
        command: ["/bin/sleep", "100d"]
        resources:
          requests:
            memory: "1G"
            cpu: "100m"
          limits:
            memory: "6G"
            cpu: "4"

Then log on to one of the pods and generate a high memory load.
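A shell in the pod can be opened with kubectl exec (the pod name is from this run's new ReplicaSet):

$ kubectl exec -it ubuntu-deployment-d4cdbbc86-8vvx8 -- /bin/bash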

root@ubuntu-deployment-d4cdbbc86-8vvx8:/# head -c 5G /dev/zero | tail
Killed

The rogue process was killed: since /dev/zero produces no newlines, tail has to buffer the entire 5 GB stream in memory, which exceeds the node's 4 GB of RAM. kubectl describe node <the node> shows a system OOM event too:

Events:
  Type     Reason     Age   From                     Message
  ----     ------     ---   ----                     -------
  Warning  SystemOOM  34s   kubelet, ip-10-0-1-102   System OOM encountered, victim process: tail, pid: 19925

Conclusion

  • Over-procuring resources is prevented at scheduling time: extra pods are created but stay Pending
  • When a cluster has a dip in available capacity, running pods may be replaced by pods stuck in Pending until resources become available again
  • When a process stays within its limits but exceeds the physical memory of the node, it is OOM-killed by the system
