Symptom: the autoscaler works (it can scale up), but for some reason it doesn’t scale down after the load goes away.
I spent some time debugging and it turns out it’s not really a bug per se. It’s more a case of unlucky pod placement on my Kubernetes cluster.
I first added --v=4 to get more verbose logging in cluster-autoscaler and watched kubectl logs -f cluster-autoscaler-xxx. I noticed this line in the logs:
<node-name> cannot be removed: non-deamons set, non-mirrored, kube-system pod present: tiller-deploy-aydsfy
This node is in fact underutilized, but a kube-system pod that is neither part of a DaemonSet nor a mirror pod is running on it, which is why it can’t be removed.
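With --v=4 the logs get noisy, so it can help to filter for just these messages. A minimal sketch, assuming the message always contains the text "cannot be removed" as in the line above; the function name ca_blockers is made up for illustration:

```shell
# Hypothetical helper: keep only the "cannot be removed" lines
# from cluster-autoscaler logs read on stdin.
ca_blockers() {
  # -F matches the phrase as a fixed string, not a regex.
  grep -F 'cannot be removed'
}

# Usage (pod name is the placeholder from above; add -n <namespace>
# if the autoscaler does not run in your current namespace):
#   kubectl logs -f cluster-autoscaler-xxx | ca_blockers
```

Each matching line names the node and the pod that is pinning it, which is usually all you need to decide what to move.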
tiller-deploy is a Deployment that comes with the Helm package manager.
So it seems I just have to migrate the pod to another node and it’s gonna be fine.
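One way to do that migration, sketched under assumptions: cordon the node so nothing new is scheduled on it, then delete the pod and let its Deployment recreate it elsewhere. The function name move_pod_off_node is made up for illustration, and this only helps if another node has room for the pod; otherwise the replacement stays Pending.

```shell
# Hypothetical helper: push a Deployment-managed pod off a node.
# Arguments: <node-name> <pod-name> [namespace, defaults to kube-system]
move_pod_off_node() {
  node="$1"
  pod="$2"
  ns="${3:-kube-system}"

  # Mark the node unschedulable so the replacement pod cannot land on it,
  # then delete the pod; its Deployment creates a new one on another node.
  kubectl cordon "$node" &&
    kubectl -n "$ns" delete pod "$pod"
}

# Usage, with the names from the log line above:
#   move_pod_off_node <node-name> tiller-deploy-aydsfy
```

If the autoscaler ends up not removing the node, kubectl uncordon <node-name> makes it schedulable again.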
You can also read more about how cluster-autoscaler works on GitHub.