I have an Auto Scaling group with two target tracking scaling policies: one based on SQS backlog and the other on CPU utilization. I initially set the capacities to minimum 1, desired 1, maximum 4. During scale-out, the desired capacity changed to 2 automatically. All the messages in the queues have now been processed, and both policies are requesting the ASG to scale in. However, the desired capacity is still at 2 and is not coming down. Is there some logic I am missing, or do I need to lower it manually?
- Can you please show us your Scale-In Policies? – John Rotenstein, Jun 17, 2021 at 7:50
- @JohnRotenstein Hi, I have uploaded the image with the two target tracking policies. One is based on a custom metric and the other on CPU utilization. – DC_Valluru, Jun 17, 2021 at 7:57
- You seem to have two Target Tracking policies -- one related to Average CPU of EC2 instance(s), and the other related to what appears to be a custom metric? It isn't clear what would happen if the two policies conflict with each other. I would recommend choosing only one Target Tracking metric. – John Rotenstein, Jun 17, 2021 at 8:02
- As per the AWS documentation, two target tracking policies do not conflict: AWS compares them and provides the larger number of instances. The scale-in alarms for both policies are now active, and the desired capacity is still not coming down to its original value. – DC_Valluru, Jun 17, 2021 at 8:05
- Well, perhaps one policy is wanting 2 instances and the other policy is wanting 1 instance. From what you say, it would provide "the maximum number of instances", which would be 2. – John Rotenstein, Jun 17, 2021 at 8:07
1 Answer
I had the same issue and found that it comes down to how scaling policies work. For example, if we set a scaling policy based on CPU utilization, such as scaling out when it goes above 30%, AWS automatically raises the desired capacity, but only up to the maximum we specify. When CPU utilization drops, the desired capacity can fall back as far as the minimum capacity.
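To make that concrete, here is a minimal sketch in plain Python of how the desired capacity could be resolved when several policies each ask for a capacity. It is a simplified model and not AWS's actual implementation: `GroupLimits` and `resolve_desired` are hypothetical names, and the "largest request wins" rule is taken from the behaviour described in the comments above.

```python
from dataclasses import dataclass


@dataclass
class GroupLimits:
    """Minimum and maximum size of a (hypothetical) Auto Scaling group."""
    minimum: int
    maximum: int


def clamp(value: int, limits: GroupLimits) -> int:
    """Keep a requested capacity inside the group's min/max bounds."""
    return max(limits.minimum, min(limits.maximum, value))


def resolve_desired(policy_requests: list[int], limits: GroupLimits) -> int:
    """Pick the largest capacity any policy asks for, then clamp it.

    Assumption: with several policies, the one asking for the most
    capacity wins, so a scale-in only shows up once every policy
    asks for fewer instances.
    """
    if not policy_requests:
        raise ValueError("at least one policy request is required")
    return clamp(max(policy_requests), limits)


limits = GroupLimits(minimum=1, maximum=4)
print(resolve_desired([2, 1], limits))  # 2: one policy still wants 2 instances
print(resolve_desired([1, 1], limits))  # 1: both policies agree to scale in
print(resolve_desired([7, 1], limits))  # 4: capped at the maximum
```

In this model the group from the question stays at 2 for as long as either policy still computes a capacity of 2, even if the other one is already asking for 1.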
In essence, the desired capacity is a pointer that moves between the specified minimum and maximum values. It only moves on its own when scaling policies are defined. If no scaling policies are set, the desired capacity stays where it is, so we have to adjust it manually if we want it to change.
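If you do need to adjust it manually, a rough sketch with boto3 could look like the following. The helper name `set_desired_within_limits` and the group name `my-asg` are placeholders of mine; the client is passed in so the function can be tried without AWS credentials. It relies on the `describe_auto_scaling_groups` and `set_desired_capacity` calls of the boto3 `autoscaling` client.

```python
def set_desired_within_limits(client, group_name: str, desired: int) -> int:
    """Manually set the desired capacity, refusing values outside min/max.

    `client` is expected to behave like boto3.client("autoscaling").
    """
    response = client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    groups = response["AutoScalingGroups"]
    if not groups:
        raise ValueError(f"Auto Scaling group not found: {group_name}")

    group = groups[0]
    if not group["MinSize"] <= desired <= group["MaxSize"]:
        raise ValueError(
            f"desired={desired} is outside "
            f"[{group['MinSize']}, {group['MaxSize']}]"
        )

    client.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=desired,
        HonorCooldown=False,  # apply immediately instead of waiting for cooldown
    )
    return desired


# Usage (needs boto3 and AWS credentials; "my-asg" is a placeholder):
#   import boto3
#   set_desired_within_limits(boto3.client("autoscaling"), "my-asg", 1)
```

Keep in mind that if scaling policies are still attached, they may move the desired capacity again after a manual change.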
For more detail, see this related question: What "desired instances" is needed for? AWS Amazon Webservices AutoScaling group