Why is the standard error of an estimator $\hat \theta$ defined as $$se = \sqrt{Var(\hat \theta)}$$ and not $$se = \sqrt{MSE(\hat \theta)} = \sqrt{Bias^2(\hat \theta) + Var(\hat \theta)}\,?$$
That is, the standard error should be the square root of the mean squared error. Of course, if the estimator is unbiased, there is no difference. But in every case I can think of where we use the standard error, if the estimator is biased, that bias ought to be part of the error.
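As a concrete illustration (assuming a normal sample and the MLE of the variance, so the bias and variance below are the standard expressions):
$$\hat \sigma^2_{MLE} = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2, \qquad Bias(\hat \sigma^2_{MLE}) = -\frac{\sigma^2}{n}, \qquad Var(\hat \sigma^2_{MLE}) = \frac{2(n-1)}{n^2}\sigma^4,$$
so that $$se = \sigma^2 \sqrt{\frac{2(n-1)}{n^2}} \quad \text{while} \quad \sqrt{MSE(\hat \sigma^2_{MLE})} = \sigma^2 \sqrt{\frac{2n-1}{n^2}},$$
which differ for every finite $n$.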
For example, consider performing the Wald test. We can always come up with an estimator of $\sigma^2$ with arbitrarily low variance if we are willing to increase the bias. For example, given $\hat \sigma^2$, the estimator $$\hat \sigma_1^2 = (1-t)\hat \sigma^2 + tk$$ for constants $t, k$ has variance $(1-t)^2 Var(\hat \sigma^2)$, which shrinks to zero as $t \to 1$. If we use this estimator to perform the Wald test, we can get whatever $\alpha$ we desire simply by lowering the se, without actually improving the test.
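To make this concrete, here is a minimal simulation sketch of my own: a one-sample Wald test of $H_0: \mu = 0$ at nominal level $0.05$, comparing the usual sample variance with the shrunk estimator above. The values of $n$, $t$ and $k$ are arbitrary illustrative choices.

```python
# Minimal simulation sketch: one-sample Wald test of H0: mu = 0 at nominal
# alpha = 0.05, comparing the usual sample variance with the shrunk estimator
# (1 - t) * s2 + t * k. The constants n, t, k are arbitrary illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_sim, alpha = 100, 20_000, 0.05
t, k = 0.9, 0.1                                       # heavy shrinkage toward a small k
z_crit = stats.norm.ppf(1 - alpha / 2)

x = rng.normal(loc=0.0, scale=1.0, size=(n_sim, n))   # data generated under H0
xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)                            # usual variance estimate
s2_shrunk = (1 - t) * s2 + t * k                      # biased, low-variance estimate

w_plain = xbar / np.sqrt(s2 / n)                      # Wald statistic with s2
w_shrunk = xbar / np.sqrt(s2_shrunk / n)              # Wald statistic with shrunk s2

print("rejection rate, usual s2: ", np.mean(np.abs(w_plain) > z_crit))   # near 0.05
print("rejection rate, shrunk s2:", np.mean(np.abs(w_shrunk) > z_crit))  # far above 0.05
```

Under the null, the first rejection rate stays near the nominal $5\%$, while the second is far above it, so the extra "significance" is bought entirely by the bias.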
This problem would be solved if the definition of the se included the bias, which would also be more consistent with the words "standard error". Why don't we do that?
Update: Relevance for Hypothesis Testing
Terminology aside, there is a practical question here: in cases where our estimator is indeed biased, should we use the standard error or the root-MSE definition above in hypothesis testing? There are cases where this will make a difference in the test result.
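For instance (the numbers below are purely hypothetical, chosen only to show that the decision can flip):

```python
# Hypothetical numbers only, to show the two definitions can flip a Wald decision.
import math

theta_hat, theta_0 = 2.0, 0.0        # estimate and null value (made up)
var_hat, bias_hat = 1.0, 0.5         # estimated variance and bias (made up)
z_crit = 1.96                        # two-sided 5% critical value

se   = math.sqrt(var_hat)                    # sqrt(Var): se = 1.00
rmse = math.sqrt(bias_hat**2 + var_hat)      # sqrt(Bias^2 + Var): rmse ~ 1.12

w_se   = (theta_hat - theta_0) / se          # 2.00 -> reject H0
w_rmse = (theta_hat - theta_0) / rmse        # 1.79 -> fail to reject H0

print(abs(w_se) > z_crit, abs(w_rmse) > z_crit)   # True, False
```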