One key aspect of the system that is not specified is the relative frequency of tasks. Are tasks short and frequent, or long and unusual?
If they are short and frequent, consider the rate of change of summed duration. Summed duration is what defines the amount of work required of the system, and thus how much of its capacity the messages are taking up. However, that isn't directly what causes issues since, as you mentioned, the system can scale its capacity. Your issue is that the system's scaling isn't particularly elastic, and you need to detect when the rate of increase in demand is higher than that scaling rate. This means that what you are looking for is message types where the amount of capacity consumed has changed suddenly, hence the rate of change of summed duration.
A simple way to measure this is to sum the durations per message type over fixed windows (say 1/10th of your scaling interval), and then calculate the average difference between subsequent windows.
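As a rough sketch of that windowed measurement, something like the following could work. It assumes each message is available as a `(timestamp_seconds, message_type, duration_seconds)` tuple; that layout and the function names are made up for illustration, not taken from your system.

```python
from collections import defaultdict


def summed_duration_per_window(messages, window_seconds):
    """Sum durations per message type in fixed-size time windows.

    `messages` is assumed to be an iterable of
    (timestamp_seconds, message_type, duration_seconds) tuples.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for timestamp, message_type, duration in messages:
        window_index = int(timestamp // window_seconds)
        totals[message_type][window_index] += duration
    return totals


def average_window_difference(window_totals, first_window, last_window):
    """Mean difference in summed duration between consecutive windows."""
    # Windows with no messages of this type count as zero work.
    series = [window_totals.get(i, 0.0)
              for i in range(first_window, last_window + 1)]
    if len(series) < 2:
        return 0.0
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    return sum(diffs) / len(diffs)


def rate_of_change_by_type(messages, window_seconds):
    """Map each message type to its average per-window change in summed duration."""
    messages = list(messages)
    if not messages:
        return {}
    totals = summed_duration_per_window(messages, window_seconds)
    indices = [int(timestamp // window_seconds) for timestamp, _, _ in messages]
    first_window, last_window = min(indices), max(indices)
    return {
        message_type: average_window_difference(windows, first_window, last_window)
        for message_type, windows in totals.items()
    }
```

You would call `rate_of_change_by_type` over a recent look-back period and flag the types with the largest positive values. Note that the mean of consecutive differences works out to (last window - first window) / (number of windows - 1), so keeping the look-back short keeps it sensitive to sudden changes.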
However, if the tasks are long and unusual, the issue is that it's less meaningful to say a type of task is responsible, as opposed to the overall system being responsible. You need enough of a population of messages of any given type to establish an expected behavior. It might be worth grouping your messages in this case and only trying to identify the group.
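One possible way to do that grouping is sketched below. It assumes the same hypothetical `(timestamp_seconds, message_type, duration_seconds)` tuples, a `group_of` function that you would supply to map a type to its group, and a `min_population` threshold that you would have to tune; none of these come from your system.

```python
from collections import Counter


def relabel_sparse_types(messages, group_of, min_population):
    """Replace rare message types with their group label.

    Types seen at least `min_population` times keep their own label;
    rarer types are folded into whatever `group_of(message_type)` returns.
    """
    messages = list(messages)
    counts = Counter(message_type for _, message_type, _ in messages)
    relabelled = []
    for timestamp, message_type, duration in messages:
        if counts[message_type] >= min_population:
            label = message_type
        else:
            label = group_of(message_type)
        relabelled.append((timestamp, label, duration))
    return relabelled
```

The relabelled messages can then go through whatever per-type analysis you already use, so that a culprit is only reported at the group level when the individual type is too rare to judge.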