Below is a test I performed to understand how Kubernetes allocates resources.
Test Plan
- In a K8S sandbox, add a few deployments with varying `requests` and `limits`
- Generate a fake high CPU load/high memory usage in the pods
Expected outcome
- K8S does not allow over allocation (i.e. total resources requested in `requests` must not exceed the total physical limit)
- When no spare resources are available, apps get OOM-killed if they try to go beyond `requests`, even if `limits` are not hit
Test Environment
I am subscribed to acloudguru.com on the Personal Basic Plan, which does not include sandbox access, so I usually use the hands-on labs for testing. For this test, I'm using this lab: Upgrading the Kubernetes Cluster Using kubeadm.
- Kubernetes v1.17.8: 1 controller and 2 workers
- Physical limits (all nodes): 2 CPU cores, 4 GB RAM
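The per-node figures can be confirmed from the cluster itself, e.g. with a custom-columns query against the nodes' allocatable resources (allocatable may be slightly below the raw capacity, depending on kubelet reservations):
$ kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEM:.status.allocatable.memory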
Test 1 (over procurement)
Create a deployment with the following resource `requests`:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-deployment
  labels:
    app: ubuntu
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ubuntu
  template:
    metadata:
      labels:
        app: ubuntu
    spec:
      containers:
      - name: app
        image: ubuntu:latest
        command: ["/bin/sleep", "100d"]
        resources:
          requests:
            memory: "2G"
            cpu: "1"
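Assuming the manifest is saved as ubuntu-deployment.yaml (filename for illustration only), it can be applied with:
$ kubectl apply -f ubuntu-deployment.yaml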
After applying, both pods are scheduled and Running:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ubuntu-deployment-5d684b4cbc-tnw5v 1/1 Running 0 37s 192.168.111.3 ip-10-0-1-103 <none> <none>
ubuntu-deployment-5d684b4cbc-znkdh 1/1 Running 0 37s 192.168.163.197 ip-10-0-1-102 <none> <none>
Change the replicas from 2 to 3 (i.e. request more resources than the physical limit). After applying, the 3rd pod stays in the Pending state:
~$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ubuntu-deployment-5d684b4cbc-9vkm4 0/1 Pending 0 7s <none> <none> <none> <none>
ubuntu-deployment-5d684b4cbc-tnw5v 1/1 Running 0 73s 192.168.111.3 ip-10-0-1-103 <none> <none>
ubuntu-deployment-5d684b4cbc-znkdh 1/1 Running 0 73s 192.168.163.197 ip-10-0-1-102 <none> <none>
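For reference, the same replica change can also be made without editing the manifest, e.g.:
$ kubectl scale deployment ubuntu-deployment --replicas=3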
A check for why it is pending (`kubectl describe pod ubuntu-deployment-5d684b4cbc-9vkm4`):
0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 Insufficient cpu.
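To see how much of each node's CPU is already claimed by `requests`, the "Allocated resources" section of the node description can be inspected, e.g.:
$ kubectl describe node ip-10-0-1-102 | grep -A 8 "Allocated resources"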
So this proves that over procuring resources is prevented at creation.
But what happens if the physical capacity drops in an existing cluster? I.e. I did not over procure, but half of my cluster becomes unavailable.
To test this, I modified the deployment back to 2 replicas, then powered off one of the worker nodes (worker2). After a few minutes, the pod that lived on worker2 showed `Terminating`, and a new pod tried to start on worker1, but since no resources were available, it got stuck as `Pending`.
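While worker2 is down, the transitions can be observed with standard commands, e.g.:
$ kubectl get nodes                  # worker2 should eventually show NotReady
$ kubectl get pods -o wide --watch   # follow pod status changes (Ctrl-C to stop)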
Test 2 (limits)
First, create a deployment where `requests` are below the physical limit, but set `limits` above the physical limit.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-deployment
  labels:
    app: ubuntu
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ubuntu
  template:
    metadata:
      labels:
        app: ubuntu
    spec:
      containers:
      - name: app
        image: ubuntu:latest
        command: ["/bin/sleep", "100d"]
        resources:
          requests:
            memory: "1G"
            cpu: "100m"
          limits:
            memory: "6G"
            cpu: "4"
Then log on to one of the pods and generate a high memory load.
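Here, "log on" means opening a shell inside one of the containers, e.g.:
$ kubectl exec -it ubuntu-deployment-d4cdbbc86-8vvx8 -- /bin/bash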
root@ubuntu-deployment-d4cdbbc86-8vvx8:/# head -c 5G /dev/zero | tail
Killed
The rogue process was killed, and `kubectl describe node <the node>` shows a system OOM too:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning SystemOOM 34s kubelet, ip-10-0-1-102 System OOM encountered, victim process: tail, pid: 19925
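Whether the container itself was restarted (as opposed to only the tail process being killed, as the node event above indicates) can be checked via the pod's restart count, e.g.:
$ kubectl get pod ubuntu-deployment-d4cdbbc86-8vvx8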
Conclusion
- over procuring resources is prevented at pod creation
- when a cluster has a dip in available resources, pods may change from Running to Pending until resources become available again
- when a process consumes less than `limits` but more than the physical memory of the node it runs on, the process gets killed by the system OOM killer