Troubleshooting Project Quotas and Resource Limits

This guide helps L1 analysts diagnose issues where applications cannot scale or new resources cannot be created due to Namespace-level restrictions.

1. Resource Quota Exhaustion

Resource Quotas limit the total amount of CPU, memory, storage, and object counts (such as pods and services) that a project can consume.

Common Symptoms

Forbidden Errors: "Error from server (Forbidden): pods 'xyz' is forbidden: exceeded quota."
Scaling Failures: A deployment is scaled up, but the new pods never appear, not even in Pending state, because the quota rejects each pod at creation time instead of leaving it unscheduled.
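
The rejection behind a silent scaling failure is normally recorded as a FailedCreate event on the ReplicaSet, not on any pod. The following is a minimal sketch for surfacing those events. The helper name quota_denials is hypothetical, and the sketch assumes the event message contains the phrase "exceeded quota", as in the error quoted above.

```shell
# Hypothetical helper: list FailedCreate events in a namespace and keep
# only the ones whose message mentions a quota rejection.
quota_denials() {
  ns="$1"
  oc get events -n "$ns" \
    --field-selector reason=FailedCreate \
    --sort-by=.lastTimestamp |
    grep -i "exceeded quota"
}

# Usage (replace my-project with the affected namespace):
#   quota_denials my-project
```

The function exits non-zero when no matching event is found, which suggests the scaling failure has a different cause.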

Resolution Steps

Check Quota Status: Identify which specific resource (CPU, Memory, Pods, Services) has reached its limit.

oc get quota -n <namespace>

Describe Quota Details: Compare the "Used" and "Hard" columns. A resource whose Used value has reached its Hard value is exhausted.

oc describe quota <quota_name> -n <namespace>
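
When a namespace has several quotas, scanning the Used and Hard columns by eye is error-prone. The sketch below filters the describe output down to resources that are exactly at their limit. The helper name exhausted_resources is hypothetical, and the sketch assumes the usual three-column "Resource / Used / Hard" table layout.

```shell
# Hypothetical helper: read `oc describe quota` output on stdin and print
# every resource whose Used column equals its Hard column.
exhausted_resources() {
  awk '
    $1 == "Resource" { in_table = 1; next }   # header row starts the table
    $1 ~ /^-+$/      { next }                  # skip the dashed separator row
    in_table && NF == 3 && $2 == $3 { print $1 }
  '
}

# Usage (replace my-project with the affected namespace):
#   oc describe quota -n my-project | exhausted_resources
```

This is only a heuristic: a request can also be rejected while Used is still below Hard if the new pod would push the total over the limit, and equal values written in different units (for example 1024Mi and 1Gi) are not matched.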

2. LimitRange Enforcement

LimitRanges set the minimum and maximum resource requests/limits for individual containers within a specific namespace. They also define default values for containers that do not specify their own resource requirements.

Common Symptoms

Validation Errors: Attempts to create or scale a Pod fail with messages like: "Pod is forbidden: maximum memory usage per Container is 512Mi, but limit is 1Gi."
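Missing Requests: Pod creation is blocked because the namespace requires explicit resource definitions: "Pod is forbidden: CPU request for container is required." This typically happens when a quota tracks CPU or memory and no LimitRange default fills in the missing value.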
Unexpected Defaults: Pods are running with different resource values than expected because the LimitRange applied its default limit or default request to containers that did not set their own.
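
To confirm an "Unexpected Defaults" report, it helps to look at the resource values actually stored on the running pod and not those in the Deployment manifest. A minimal sketch, using a hypothetical helper name pod_resources:

```shell
# Hypothetical helper: print each container name in a pod next to the
# resources stanza that the API server stored for it.
pod_resources() {
  ns="$1"; pod="$2"
  oc get pod "$pod" -n "$ns" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'
}

# Usage (names are placeholders):
#   pod_resources my-project web-5d9c7b6f4-abcde
```

If the values shown differ from the manifest, compare them with the default entries in the namespace LimitRange.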

Resolution Steps

View LimitRange Rules: Identify the constraints (min, max, and default) enforced within the namespace.

# List all LimitRanges in the namespace
oc get limitrange -n <namespace>

# View the specific constraints of a LimitRange
oc describe limitrange <limitrange_name> -n <namespace>

Action: Advise the application owner or developer to adjust the resources.requests and resources.limits sections in their Deployment/Pod YAML to comply with the boundaries identified above.
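
When advising the owner, it can be useful to show what the corrected manifest would look like without changing anything on the cluster. The sketch below renders a Deployment with new values through a client-side dry run. The helper name preview_resources and the example values are hypothetical.

```shell
# Hypothetical helper: render a Deployment with new requests/limits without
# saving anything to the cluster (client-side dry run).
preview_resources() {
  ns="$1"; deploy="$2"; requests="$3"; limits="$4"
  oc set resources "deployment/$deploy" -n "$ns" \
    --requests="$requests" --limits="$limits" \
    --dry-run=client -o yaml
}

# Usage (values are illustrative; choose ones inside the LimitRange min/max):
#   preview_resources my-project web cpu=100m,memory=128Mi cpu=500m,memory=512Mi
```

The dry run only prints the resulting YAML. It does not validate the values against the LimitRange, which is enforced when pods are created.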

3. Application Rollbacks (Safe Recovery)

When a new deployment version fails due to misconfiguration, image issues, or application crashes, L1 support can perform a safe rollback to the last known working state to minimize downtime.

Resolution Steps

View Deployment History: Check the list of previous revisions to identify the stable version of the workload.

# List all previous revisions for a deployment
oc rollout history deployment/<deployment_name> -n <namespace>

# View details of a specific revision (e.g., revision 2)
oc rollout history deployment/<deployment_name> -n <namespace> --revision=2

Perform Rollback: Immediately revert the deployment to the previous version to restore service.

# Roll back to the immediately preceding revision
oc rollout undo deployment/<deployment_name> -n <namespace>

# Roll back to a specific stable revision
oc rollout undo deployment/<deployment_name> -n <namespace> --to-revision=2
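
A rollback only restores service once the reverted pods are actually running, so it is worth waiting for the rollout to finish before closing the incident. A minimal sketch, using a hypothetical helper name verify_rollout and an arbitrary 120-second timeout:

```shell
# Hypothetical helper: wait for a Deployment rollout to finish and report
# the result. The 120s timeout is an arbitrary illustrative value.
verify_rollout() {
  ns="$1"; deploy="$2"
  if oc rollout status "deployment/$deploy" -n "$ns" --timeout=120s; then
    echo "rollout of $deploy is complete"
  else
    echo "rollout of $deploy did not complete; escalate" >&2
    return 1
  fi
}

# Usage (names are placeholders):
#   verify_rollout my-project web
```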

4. Project Request Failures

This scenario occurs when a user is unable to create a new Project (Namespace) via the OpenShift Web Console or the CLI, often resulting in "Access Denied" or timeout errors.

Resolution Steps

Check Self-Provisioner Status: Verify whether the user (or one of their groups) holds the self-provisioner role needed to request new projects.

oc adm policy who-can create projectrequests
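
The who-can output can be long on a busy cluster. A narrower check is to list the subjects of the cluster role binding that grants self-provisioning. The sketch below assumes the binding still has its default name, self-provisioners; the helper name self_provisioner_subjects is hypothetical.

```shell
# Hypothetical helper: list the kind and name of every subject bound to the
# self-provisioner role through the default "self-provisioners" binding.
self_provisioner_subjects() {
  oc get clusterrolebinding self-provisioners \
    -o jsonpath='{range .subjects[*]}{.kind}{"\t"}{.name}{"\n"}{end}'
}

# Usage:
#   self_provisioner_subjects
```

If the list is empty or the binding is missing, self-provisioning has probably been disabled by a cluster administrator and the request belongs with L2.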

Check Cluster Load and Templates: If project creation is significantly delayed or failing with internal errors, verify that the global project-request template exists in the openshift-config namespace.

oc get template -n openshift-config
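
Listing templates shows what exists, but not which one the cluster is configured to use. The sketch below reads the configured name from the cluster Project configuration (spec.projectRequestTemplate.name) and then looks that template up. The helper name project_request_template is hypothetical.

```shell
# Hypothetical helper: find the project request template named in the
# cluster Project config and confirm it exists in openshift-config.
project_request_template() {
  name="$(oc get project.config.openshift.io/cluster \
    -o jsonpath='{.spec.projectRequestTemplate.name}')"
  if [ -z "$name" ]; then
    echo "no custom project request template configured; the cluster default applies"
    return 0
  fi
  oc get template "$name" -n openshift-config
}

# Usage:
#   project_request_template
```

A configured name with no matching template in openshift-config is a likely cause of project creation errors and should be escalated.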

Escalation Criteria: If a project requires a permanent increase in Quota limits to accommodate new workloads, gather the current usage statistics (oc describe quota -n <namespace>) and escalate the request to the Project Owner or L2 Cluster Admins for formal approval and implementation.
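
Gathering the statistics for an escalation can be scripted so that every ticket carries the same evidence. A minimal sketch follows; the helper name collect_quota_evidence and the report file name are hypothetical, and including the LimitRange details is an assumption about what the approver will want to see.

```shell
# Hypothetical helper: write the quota and LimitRange details for a
# namespace to a text file and print the file name.
collect_quota_evidence() {
  ns="$1"
  report="quota-report-$ns.txt"
  {
    echo "== ResourceQuota usage for $ns =="
    oc describe quota -n "$ns"
    echo "== LimitRanges for $ns =="
    oc describe limitrange -n "$ns"
  } > "$report"
  echo "$report"
}

# Usage (replace my-project with the affected namespace):
#   collect_quota_evidence my-project
```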