I want to have a scale-in policy where nodes are scaled-in when the CPU is under 30%.

How can I ensure that the individual instance selected for scale-in is the one with the lowest CPU, or at least one with CPU under 30%?

1 Answer

By default, Auto Scaling first identifies which Availability Zone has the most instances (or, if they are equal, it picks a random AZ). Then, it uses the Termination Policy to determine which instance in that AZ to terminate during scale-in.
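The first step described above can be illustrated with a small simulation. This is only a sketch of the Availability Zone choice, not Auto Scaling's actual implementation; the function name `pick_az_for_scale_in` and the `instance_azs` mapping are hypothetical, and the later Termination Policy step is not modelled.

```python
import random
from collections import Counter


def pick_az_for_scale_in(instance_azs, rng=random):
    """Return the AZ holding the most instances; ties are broken at random.

    instance_azs: hypothetical mapping of instance ID -> Availability Zone.
    rng: any object with a choice() method (defaults to the random module).
    """
    if not instance_azs:
        raise ValueError("the group has no instances")
    counts = Counter(instance_azs.values())
    most = max(counts.values())
    # Sort the tied AZs so the candidate list is deterministic before choosing.
    tied = sorted(az for az, n in counts.items() if n == most)
    return rng.choice(tied)
```

For example, with two instances in `us-east-1b` and one in `us-east-1a`, the function returns `us-east-1b`.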

However, Auto Scaling does not select an instance based upon how 'busy' that particular instance is. After all, that instance might be more busy, or less busy, a few seconds later.

Fortunately, if the group sits behind an Elastic Load Balancer with connection draining enabled, Auto Scaling waits for in-flight requests to complete (up to the draining timeout) before the instance is terminated. Therefore, in theory, it doesn't matter that the instance is temporarily busy.

If you have long-running tasks on an instance that you don't want interrupted, you can configure Auto Scaling Lifecycle Hooks to move instances into a Terminating:Wait state. The instances will not receive any new traffic. Your application can then signal when the long-running task has completed (e.g. copying log files to S3, or finishing video rendering), and Auto Scaling will then terminate the instance.
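As a rough sketch of the "signal" step, the application could call `complete_lifecycle_action` on a boto3 Auto Scaling client once its work is done. The helper name `release_instance` and the `task_done` callback are hypothetical, and the client is passed in (for example `boto3.client("autoscaling")`) rather than created here.

```python
def release_instance(client, group_name, hook_name, instance_id, task_done):
    """Tell Auto Scaling it may proceed with termination once the task is done.

    client: a boto3 "autoscaling" client (assumed to be created by the caller).
    task_done: hypothetical zero-argument callable returning True when the
               long-running work on this instance has finished.
    """
    if not task_done():
        # Work is still in progress; leave the instance in Terminating:Wait.
        return False
    client.complete_lifecycle_action(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=group_name,
        InstanceId=instance_id,
        LifecycleActionResult="CONTINUE",
    )
    return True
```

The instance only stays in the wait state until the hook's timeout expires, so a task that may run longer would need to extend it (boto3 offers `record_lifecycle_action_heartbeat` for that).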

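Finally, if you want more fine-grained control over which instance will be terminated, you (or your application) can specifically Detach EC2 Instances From Your Auto Scaling Group via the management console or a detach-instances API call, optionally decrementing the desired capacity so that no replacement is launched. A detached instance keeps running, so you then terminate it yourself.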
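To tie this back to the question, here is a minimal sketch of how an application might pick the least-busy instance itself and detach it. It assumes you have already collected a recent average CPU figure per instance (for example from the CloudWatch `CPUUtilization` metric); the names `pick_scale_in_candidate`, `detach_candidate` and `cpu_by_instance` are hypothetical, and only `detach_instances` is a real boto3 call.

```python
from typing import Mapping, Optional


def pick_scale_in_candidate(cpu_by_instance: Mapping[str, float],
                            threshold: float = 30.0) -> Optional[str]:
    """Return the ID of the least-busy instance if its CPU is under threshold."""
    if not cpu_by_instance:
        return None
    instance_id = min(cpu_by_instance, key=cpu_by_instance.get)
    return instance_id if cpu_by_instance[instance_id] < threshold else None


def detach_candidate(group_name, cpu_by_instance, threshold=30.0, client=None):
    """Detach the least-busy instance from the group, or do nothing."""
    candidate = pick_scale_in_candidate(cpu_by_instance, threshold)
    if candidate is None:
        return None
    if client is None:
        import boto3  # imported lazily so the selection logic has no AWS dependency
        client = boto3.client("autoscaling")
    client.detach_instances(
        InstanceIds=[candidate],
        AutoScalingGroupName=group_name,
        # Shrink the group so Auto Scaling does not launch a replacement.
        ShouldDecrementDesiredCapacity=True,
    )
    # The detached instance is still running; terminating it is a separate step.
    return candidate
```

Because CPU readings fluctuate, this reflects only a snapshot, which is the same caveat that applies to any "least busy" selection.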
