Automatic Right-Sizing Troubleshooting

Cloud service provider relevance: EKS, AKS, and GKE.

VPA not reporting message appears at the top of the right-sizing page

This message typically indicates that the VPA updater and admission controller pods are not reporting their health status. Ocean depends on these components to inject right-sizing recommendations when pods launch. If they fail, workloads cannot be optimized, and any workload with an attached rule will move to Limited status.

The impact includes:

  • No new rules can be attached.
  • Existing rules cannot be applied to the recommendations.
  • Optimization is paused until VPA health is restored.

What You Should Do

  1. Check VPA Pod Health

    • For the Spot Ocean VPA, Spot automatically checks the health of VPA pods.

    • For the Native VPA Project, health checks work only if the deployment name hasn't been changed.

    • Run kubectl get pods -n <VPA_NAMESPACE>

      Look for updater and admission-controller pods.

      If no VPA components are listed, VPA is not installed. Continue with Step 3, sub-step 2.

  2. Verify Pod Status

    • Pods should be in the Running state.

    • If any pod is in the CrashLoopBackOff or Pending state, describe the pod by running this command:

      kubectl describe pod <pod-name> -n <VPA_NAMESPACE>

      Check for errors like missing permissions or resource limits.

  3. Restart or Redeploy VPA

    1. If pods are unhealthy, restart the deployment with this command:

      kubectl rollout restart deployment <deployment-name> -n <VPA_NAMESPACE>

    2. If VPA is missing:

      • VPA Missing Due to a Disabled Helm Parameter: The VPA pods may be missing because the Ocean controller was deployed with the disableAutoRightSizing parameter set to true in the Helm chart. This setting prevents automatic creation of VPA resources in the cluster.

        What You Should Do:

        1. Check the Helm chart parameter with this command:

          helm get values <release-name> -n <namespace>

          Look for:

          disableAutoRightSizing: true

        2. Fix the parameter:

          1. Change disableAutoRightSizing to false in your Helm values file.

          2. Redeploy the Ocean controller with the updated values file.

          3. Verify VPA deployment with this command:

            kubectl get deployments -n <VPA_NAMESPACE>

  4. Confirm Health

    • After applying the fix, verify that all VPA pods are running and that metrics are flowing. Check the recommender logs with this command:

      kubectl logs <recommender-pod> -n <VPA_NAMESPACE>

      Look for resource recommendation logs.
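
The checks in the steps above can be partly scripted. The following is a minimal sketch, not part of the Ocean tooling: the function names unhealthy_pods and auto_rightsizing_disabled are hypothetical, and the sketch assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...) and that disableAutoRightSizing appears in the user-supplied values returned by helm get values.

```shell
# Sketch only: function names are hypothetical helpers, not Ocean or kubectl features.

# Reads `kubectl get pods --no-headers` output on stdin and prints
# "<pod-name> <status>" for every pod whose STATUS column is not Running.
unhealthy_pods() {
  awk 'NF >= 3 && $3 != "Running" { print $1, $3 }'
}

# Reads `helm get values` output on stdin and succeeds (exit 0) only if
# a line sets disableAutoRightSizing to true, at any indentation level.
auto_rightsizing_disabled() {
  grep -Eq '^[[:space:]]*disableAutoRightSizing:[[:space:]]*true[[:space:]]*$'
}

# Example wiring (requires access to the cluster; placeholders as in the steps above):
#   kubectl get pods -n "$VPA_NAMESPACE" --no-headers | unhealthy_pods
#   helm get values "$RELEASE_NAME" -n "$NAMESPACE" | auto_rightsizing_disabled \
#     && echo "auto right-sizing is disabled in the Helm values"
```

If unhealthy_pods prints nothing, every listed pod reports Running. Any pod it does print is a candidate for the kubectl describe pod command in Step 2.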

Security Group Not Correctly Configured

If the security group is not configured correctly, your pods may not launch with the resource values defined in the VPA.

To avoid this issue, ensure that the inbound rules for your node group's security group allow traffic on TCP port 443, which is used by the Spot webhook listener. This lets the Kubernetes API server reach the webhook. Also allow inbound traffic on TCP port 8000 for the internal health check and metrics endpoints, which are required for webhook readiness.

See "Create a security group for your Amazon EC2 instance" in the Amazon Elastic Compute Cloud documentation.
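
For an EKS cluster, the two inbound rules described above could be added with the AWS CLI. The sketch below only prints the commands so that they can be reviewed before running. The function name print_ingress_rules is hypothetical, and using a single source security group (for example, the cluster control-plane security group) for both ports is an assumption; adjust the source to match your network layout.

```shell
# Sketch only: print_ingress_rules is a hypothetical helper.
# $1 = node group security group ID, $2 = source security group ID (assumed
# to be the same for both ports; change this if your setup differs).
# Prints, but does not run, one AWS CLI call per required TCP port.
print_ingress_rules() {
  for port in 443 8000; do
    printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port %s --source-group %s\n' \
      "$1" "$port" "$2"
  done
}

# Example (the IDs are placeholders):
#   print_ingress_rules sg-0123456789abcdef0 sg-0fedcba9876543210
```

Review the printed commands, then run them with credentials that are allowed to modify the security group.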