I can't seem to find an answer to this, but what is the relationship between an HPA and a ReplicaSet? From what I know, we define a Deployment object which specifies replicas; that creates the RS, and the RS is responsible for supervising our pods and scaling them up and down.
Where does the HPA fit into this picture? Does it wrap over the Deployment object? I'm a bit confused as you define the number of replicas in the manifest for the Deployment object.
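When we create a Deployment, it creates a ReplicaSet and the number of pods that we gave in replicas. The Deployment controls the RS, and the RS controls the pods. The HPA is another abstraction on top: it tells the Deployment how many replicas to run, and the Deployment, through its RS, makes sure the pods match that count. The replicas value in the manifest is only the starting count; once an HPA targets the Deployment, the HPA keeps adjusting it.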
As per the k8s doc: The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can't be scaled, for example, DaemonSets.
A brief high-level overview: it's all about controllers. Each of these k8s objects has a controller. When a Deployment object is created, its controller creates the RS, and the RS controller creates and supervises the pods, so the Deployment controls the RS and the RS controls the pods. The HPA controller, on the other hand, periodically compares the observed metric (for example CPU utilization) with the target you configured. When the two differ enough, it updates the replica count on the Deployment, and the Deployment and RS then add or remove pods to match.
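To make the HPA's decision concrete, here is a minimal Python sketch of the replica calculation described in the k8s docs: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance of 1.0 (0.1 by default) and clamped to the HPA's min and max. The function and parameter names are my own for illustration, not a Kubernetes API, and the real controller also accounts for things like unready pods, missing metrics and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Hypothetical helper mirroring the documented HPA formula."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The HPA never goes outside its configured minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90% CPU against a 50% target: scale up.
print(desired_replicas(4, 90, 50, min_replicas=2, max_replicas=10))
# 4 pods averaging 20% CPU against a 50% target: scale down.
print(desired_replicas(4, 20, 50, min_replicas=2, max_replicas=10))
```

The number this produces is what the HPA writes to the Deployment's replica count. It never creates or deletes pods itself; that is still the job of the Deployment and its RS.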
Read more from k8s doc