Cloud Tip — Mar 2026

Always Set Resource Requests and Limits on Every Kubernetes Pod

The EaseCloud Team · 1 min read

Kubernetes uses resource requests to schedule Pods and limits to cap runtime consumption. Omitting either creates operational problems that are nearly impossible to debug under load.

Requests vs Limits

  • Request — Reserved capacity the scheduler uses to find a suitable node. A Pod won't be placed unless a node has at least this much free.
  • Limit — Runtime cap. Exceed memory: OOMKilled. Exceed CPU: throttled (no kill, but latency spikes).

Finding the Right Values

Use kubectl top pods, or the Prometheus metrics container_cpu_usage_seconds_total and container_memory_working_set_bytes, sampled over a representative load period. A reasonable starting heuristic: set requests near observed P50 usage and limits near P99.

resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

QoS Classes

When requests equal limits for every container (both CPU and memory), the Pod gets Guaranteed QoS, the class evicted last under node pressure. Kubernetes evicts BestEffort Pods first, then Burstable, and Guaranteed only as a last resort. For latency-sensitive services, use Guaranteed QoS.
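As a sketch (the Pod name and image are placeholders), a Pod whose containers set requests identical to limits qualifies for Guaranteed QoS:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: latency-critical            # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      resources:
        requests:                   # requests == limits for every
          cpu: "500m"               # container and every resource,
          memory: "512Mi"           # so the Pod is Guaranteed
        limits:
          cpu: "500m"
          memory: "512Mi"
```

You can confirm the assigned class with kubectl get pod latency-critical -o jsonpath='{.status.qosClass}'.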

Enforce Cluster-Wide

Use a LimitRange to set namespace defaults and a ResourceQuota to cap total consumption. Without both, one misconfigured Deployment can exhaust all node resources and starve every other workload in the namespace.
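A minimal sketch of both objects for one namespace; the namespace name and the numbers are illustrative, not recommendations:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-a                 # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:               # applied when a container omits requests
        cpu: "100m"
        memory: "128Mi"
      default:                      # applied when a container omits limits
        cpu: "500m"
        memory: "512Mi"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "10"              # total CPU all Pods may request
    requests.memory: "20Gi"
    limits.cpu: "20"                # total CPU limit across the namespace
    limits.memory: "40Gi"
```

With a ResourceQuota in place, every Pod in the namespace must declare requests and limits for the quota'd resources, so the LimitRange defaults also keep unannotated Deployments from being rejected.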

Ready to apply this in production?

Our engineers have deployed this pattern across 100+ client environments. We'll scope it for your stack and get it right.

Talk to an Expert