Consider a statistical test $\delta$ whose power function is $\pi(\theta|\delta)=\text{Pr}(H_0 \text{ is rejected}|\theta)$. The size of $\delta$ is then defined as
$$\alpha(\delta)=\sup_{\theta\in\Theta_0}\pi(\theta|\delta)$$
I understand $\alpha(\delta)$ as a summary statistic of $\pi(\theta|\delta)$ over $\Theta_0$. It gives an overview of how likely $\delta$ is to make a type-I error, given that the null hypothesis $H_0$ is true.
However, I don't understand why the supremum is preferred over other summaries, e.g. the mean or the median, in the definition of $\alpha(\delta)$. For example, we could take
$$\alpha(\delta) = E_{\theta\in\Theta_0}[\pi(\theta|\delta)]$$
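To make the comparison concrete, here is a minimal sketch under assumptions of my own that are not part of the definitions above: a single observation $X\sim N(\theta,1)$, the null region $\Theta_0=\{\theta\le 0\}$ truncated to $[-3,0]$ so that an average is defined, a test that rejects when $X>1.645$, and a uniform weighting over that interval for the mean (the expectation above does not specify a weighting, so this choice is arbitrary).

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power(theta, c=1.645):
    """Pr(reject H0 | theta) for the assumed test: reject when X > c, X ~ N(theta, 1)."""
    return 1.0 - normal_cdf(c - theta)

# Assumed, truncated null region [-3, 0], discretized on an even grid.
thetas = [-3.0 + 3.0 * i / 300 for i in range(301)]
powers = [power(t) for t in thetas]

# Supremum-based size: worst-case type-I error rate over the grid.
size_sup = max(powers)

# Mean-based alternative: average type-I error rate under a uniform weighting
# (the uniform weighting is an assumption made only for this illustration).
size_mean = sum(powers) / len(powers)

print(f"sup  over Theta_0: {size_sup:.4f}")
print(f"mean over Theta_0: {size_mean:.4f}")
```

In this sketch the supremum is attained at the boundary $\theta=0$ and is roughly $0.05$, while the mean is smaller and changes if the truncation point or the weighting changes.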
Is there any motivation behind using $\sup_{\theta\in\Theta_0}$ for $\alpha(\delta)$?