If a distribution belongs to a certain class, then the distribution with the largest entropy in that class is typically referred to as the least-informative distribution. To me, this is highly confusing. We have the following definition of self-information (or information content):
$$I(X) = -\log P(X)$$
More often than not, this is referred to as surprise, but I prefer the term information (after all, it's called information theory not surprise theory).
Entropy, then, is the expected surprise, or expected information, of a random variable:
$$H(X) = \mathbb{E}[I(X)] = \mathbb{E}[-\log P(X)]$$
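To make the two definitions concrete, here is a minimal sketch in Python (function names are my own; natural log, so values are in nats) that computes the self-information of individual outcomes and the entropy of a simple distribution:

```python
import math

def self_information(p):
    """I(x) = -log p(x): the information content (or "surprise")
    of an outcome with probability p, in nats."""
    return -math.log(p)

def entropy(dist):
    """H(X) = E[I(X)] = -sum_x p(x) log p(x).
    Terms with p(x) = 0 contribute 0 (by the limit p log p -> 0)."""
    return sum(p * self_information(p) for p in dist if p > 0)

print(self_information(1.0))   # a sure event carries no information
print(self_information(0.5))   # a fair coin flip: log 2 nats
print(entropy([0.5, 0.5]))     # fair coin: log 2 nats on average
```

Note that entropy is a property of the whole distribution (an average of self-information over outcomes), while self-information is a property of a single outcome.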
So the distribution with the maximum entropy in a given class is the one that is, on average, "the most surprising". But given our setup, that should mean it is also the distribution with the "most information" on average, by the definition of self-information we've just given.
EDIT: In the Wikipedia article on this topic, there is the following passage:
> Consider a discrete probability distribution among $m$ mutually exclusive propositions. The most informative distribution would occur when one of the propositions was known to be true. In that case, the information entropy would be equal to zero.
My understanding is that, in information theory, a sure event has zero information content. So, when we say that the maximum entropy distribution is the least informative, are we using a definition of information that is the exact opposite of the definition set out in information theory?
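The Wikipedia passage is easy to verify numerically. The sketch below (my own illustration, not from the article) compares a few distributions over $m = 4$ mutually exclusive propositions: the degenerate one, where one proposition is known to be true, has zero entropy, while the uniform one attains the maximum, $\log 4$:

```python
import math

def entropy(dist):
    """H = E[log(1/p)] = -sum_x p(x) log p(x), in nats;
    p(x) = 0 terms contribute 0."""
    return sum(p * math.log(1 / p) for p in dist if p > 0)

# Candidate distributions over m = 4 mutually exclusive propositions
candidates = {
    "degenerate": [1.0, 0.0, 0.0, 0.0],   # one proposition known true
    "skewed":     [0.7, 0.1, 0.1, 0.1],
    "mild":       [0.4, 0.3, 0.2, 0.1],
    "uniform":    [0.25, 0.25, 0.25, 0.25],
}
for name, dist in candidates.items():
    print(f"{name:10s} H = {entropy(dist):.4f} nats")
# degenerate gives H = 0; uniform attains the maximum, log(4) nats
```

So in the article's usage, "most informative" describes the degenerate distribution, which is exactly the one whose realized outcome has zero self-information.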
All this boils down to the following question: when we say that "a maximum entropy distribution is the least informative distribution in a given class", are we in fact using the opposite of the definition of information set out in information theory? Or is that definition actually compatible with this statement, and I am missing something?