Always Set Resource Requests and Limits on Every Kubernetes Pod

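Kubernetes uses resource requests to schedule Pods and limits to cap runtime consumption. Omitting requests lets the scheduler overpack nodes; omitting limits lets a single container consume a node's spare CPU and memory. Both problems tend to surface only under load, when they are hardest to debug.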

Requests vs Limits

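Request: Reserved capacity the scheduler uses to find a suitable node. A Pod won't be placed unless a node has at least this much unreserved capacity, meaning its allocatable resources minus the requests of the Pods already scheduled there. Actual usage on the node is not considered.
Limit: Runtime cap, enforced per container. A container that exceeds its memory limit is terminated (OOMKilled). A container that exceeds its CPU limit is throttled: it keeps running, but latency spikes.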

Finding the Right Values

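Base the values on observed usage over a representative load period. kubectl top pods gives a quick snapshot; for history, use the Prometheus metrics container_cpu_usage_seconds_total (a counter, so take its rate) and container_memory_working_set_bytes. A reasonable starting heuristic: set requests near P50 usage and limits near P99, with extra headroom on the memory limit, because exceeding it kills the container rather than slowing it down.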

resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
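
To get the P50 and P99 figures from history rather than by eye, the heuristic above could be precomputed with Prometheus recording rules. The following is a minimal sketch, not a definitive setup: the group name, the recorded series names, and the 7-day window are assumptions made for illustration, and the container!="" filter assumes the label scheme of the cAdvisor metrics exposed by the kubelet.

```yaml
# Hypothetical Prometheus rule file; group and record names are illustrative.
groups:
  - name: pod-resource-sizing
    rules:
      # CPU usage in cores: per-second rate of the CPU-seconds counter.
      - record: container:cpu_usage_cores:rate5m
        expr: rate(container_cpu_usage_seconds_total{container!=""}[5m])
      # Median CPU over the assumed 7-day window: candidate CPU request.
      - record: container:cpu_usage_cores:p50_7d
        expr: quantile_over_time(0.5, container:cpu_usage_cores:rate5m[7d])
      # 99th percentile CPU over the same window: candidate CPU limit.
      - record: container:cpu_usage_cores:p99_7d
        expr: quantile_over_time(0.99, container:cpu_usage_cores:rate5m[7d])
      # Median working set in bytes: candidate memory request.
      - record: container:memory_working_set_bytes:p50_7d
        expr: quantile_over_time(0.5, container_memory_working_set_bytes{container!=""}[7d])
      # 99th percentile working set in bytes: floor for the memory limit.
      - record: container:memory_working_set_bytes:p99_7d
        expr: quantile_over_time(0.99, container_memory_working_set_bytes{container!=""}[7d])
```

The window should cover at least one full traffic cycle for the workload; a week is only a guess that suits services with a weekly pattern.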

QoS Classes

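When every container in a Pod sets CPU and memory requests equal to its limits, the Pod gets the Guaranteed QoS class, which makes it the last candidate for eviction. Under node pressure, the kubelet first evicts BestEffort Pods and Burstable Pods whose usage exceeds their requests. For latency-sensitive services, prefer Guaranteed QoS.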
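
As an illustration, here is a minimal sketch of a Pod that should be classified as Guaranteed, because its only container sets CPU and memory requests equal to its limits. The Pod name, container name, image, and resource values are placeholders, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: latency-sensitive-api              # hypothetical name
spec:
  containers:
    - name: api                            # hypothetical container name
      image: registry.example.com/api:1.0  # placeholder image
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "500m"       # equal to the CPU request
          memory: "512Mi"   # equal to the memory request
```

Once the Pod is created, the assigned class can be read from its status.qosClass field, for example with kubectl get pod latency-sensitive-api -o jsonpath='{.status.qosClass}'.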

Enforce Cluster-Wide

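Use a LimitRange to set default requests and limits for containers that omit them, and a ResourceQuota to cap the namespace's total consumption. Both objects are namespace-scoped, so cluster-wide enforcement means applying them in every namespace. Without both, one misconfigured Deployment can consume the spare capacity of its nodes and starve the workloads running alongside it.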
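
A minimal sketch of the two objects for a single namespace follows. The namespace team-a, the object names, and all quantities are assumptions chosen for illustration; real values depend on node sizes and on how many workloads share the namespace.

```yaml
# Defaults applied to containers that omit requests or limits.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # hypothetical name
  namespace: team-a          # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:        # used when a container sets no request
        cpu: "250m"
        memory: "256Mi"
      default:               # used when a container sets no limit
        cpu: "500m"
        memory: "512Mi"
---
# Cap on the summed requests and limits of all Pods in the namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota        # hypothetical name
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```

The two complement each other: once a quota covers requests or limits, Pods that omit those values are rejected at admission unless a LimitRange supplies defaults.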
