5

I mean this in a pragmatic sense, in terms of brute-force difficulty.

For example, if I have a random key accumulating $N$ bits of entropy, I’d expect to brute-force it in the order of $2^N$ trials. This works well with unbiased random bits (the key is $N$ bits long), but apparently not if the random bits are biased.

Let’s say my bits have $P\{\text{bit}=0\} = 0.25$ and $P\{\text{bit}=1\} = 0.75$. Then the per-bit entropy is about $0.81$, and a 49-bit key’s Shannon entropy is about 40 bits. How can I brute-force it in $2^{40}$ trials, even on average?
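(For concreteness, the arithmetic above can be checked with a few lines of Python; this snippet is only an illustration of the numbers, not part of the setup.)

```python
# Sanity check of the figures above: per-bit Shannon entropy of a
# 0.25/0.75-biased bit, and the total for a 49-bit key.
from math import log2

p0 = 0.25                                        # P{bit = 0}
h = -(p0 * log2(p0) + (1 - p0) * log2(1 - p0))
print(h)        # ~0.811 bits per bit
print(49 * h)   # ~39.75 bits for the whole 49-bit key
```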

The first idea is to generate random trial keys with the same bias and see if they match. However, it is not difficult to see that this approach gives a different estimate of the average number of trials before success.

  • You proceed with the traditional brute-force method but put lower priority on members of the key space that stray from the bias, ranking priority by deviation from the bias. Note that $2^{40}$ is correct on average for an average key. – Commented Jan 14 at 22:06

4 Answers

3

Interesting question. Shannon entropy is not always representative of brute force strength, but it is representative in this case.

TL;DR: Generate all strings whose ratio of 0's to 1's is approximately 1/3, try those, and then try other strings "near" them. This is because such counts have the maximal probability of occurrence under the binomial distribution with $\Pr[\text{bit}=1]=3/4=:p$. When picking sequences near these, flipping a $0$ to a $1$ carries a smaller probability penalty than the opposite, so try those first. Scroll to the end for a description of the attack.

Edited to add reference: See the Asymptotic Equipartition Property, which states in its simplest and weakest form that for any finite alphabet (here $\{0,1\}$) the set defined in my answer has probability at least $1-\epsilon$ for $n$ large enough. Cover and Thomas, Elements of Information Theory, 2nd edition, Theorem 3.1.2 is a good place to start; there is a summary on page 64, and the book can be found online. There are also stronger versions where the convergence is not just in probability but almost sure, proved under additional assumptions (e.g., bounded moments, which apply here) and which work for conditional distributions as well; see the keyword "Method of Types".

Background: Note that a better general measure of the strength of a key against brute-force attacks is the so-called guessing entropy, or Rényi entropy $H_{1/2}$ of order $1/2$. See this question and this one for some details: in summary, for certain (not necessarily fractal) types of distributions (see this paper for details) the Shannon entropy is $H$ but the key can be brute-forced in far fewer than $2^H$ guesses on average. One would want to use reasonable key distributions, where $p_{max}$ is upper bounded (equivalently, the min-entropy is lower bounded). In an excellent comment below, one key value is given probability $1/2$; that is not a good distribution for a cryptographer to draw from, but the point that high Shannon entropy is not enough remains valid.

Detailed Answer: But for your specific case of independent biased bits this phenomenon is not present, and you are on the right track by trying random keys with the same biased distribution. Given the distribution $P=(1/4,3/4)$, almost all the sequences (keys) that occur with appreciable probability under this distribution are from the typical set, a concept from Shannon-entropy-based information theory. These are the sequences whose empirical distribution of $0$'s and $1$'s is "close enough" to the ratio $(1/4,3/4)$.

Let your distribution be denoted Bernoulli$(1/4)$ or $B(0.25)$, for $n=49$ independent bits. The definition of the typical set of $n$ i.i.d. variables, for some $\epsilon > 0$, is basically $$A_{\epsilon}^{(n)} = \{(x_1, \ldots, x_n): 2^{-n[H(B(0.25)) + \epsilon]} \leq p(x_1, \ldots, x_n) \leq 2^{-n[H(B(0.25)) - \epsilon]}\},$$ and Shannon tells us that with overwhelming probability a sequence from this biased source has probability in the interval given by the inequalities, i.e., belongs to the typical set. While $n=49$ is not that large, you already get probability of about 0.834 if you sum the probabilities of sequences with 7 to 15 zeros (equivalently, 34 to 42 ones): $$ \sum_{k=7}^{15} \binom{49}{k} (1/4)^k(3/4)^{49-k}\approx 0.834. $$

Here $H(B(0.25))$ is the entropy of a Bernoulli variable with parameter $0.25$, which is $\approx 0.81$. So $2^{-n(0.81 - \epsilon)} \approx 0.57^{n}$; taking $n=49$ gives $0.57^{49}\approx 1.1\times 10^{-12}$. This is the probability $p(x_1, \ldots, x_n)$ of a typical sequence, and thus the number of typical sequences is approximately its reciprocal, which is approximately $2^{39.7}$.
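As a quick numerical sanity check (my own sketch in Python, not part of the original derivation), these figures can be reproduced directly:

```python
# Reproduce the typical-set figures for n = 49 biased bits with P(bit=0) = 1/4.
from math import comb, log2

n, p0 = 49, 0.25
H = -(p0 * log2(p0) + (1 - p0) * log2(1 - p0))   # ~0.811 bits per bit

# Total probability of sequences with 7..15 zeros (the "typical" zero-counts).
mass = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(7, 16))
print(mass)                     # ~0.834

# Probability of a single typical sequence, and the implied number of them.
p_typ = 2 ** (-n * H)
print(p_typ)                    # ~1.1e-12
print(log2(1 / p_typ))          # ~39.7, i.e. about 2^39.7 typical sequences
```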

Attack (all computations approximate):

So you first collect all the strings with about a quarter of the coordinates (say 12 or 13) equal to zero and the rest equal to one; call this set $S_0$ and try those first, since those counts have the highest probability of occurring. Then you could try strings at Hamming distance 1 from those, etc. Let's compute how many sequences there are in $S_0$: there are $$ \binom{49}{12}+\binom{49}{13}\approx 3.54 \times 10^{11}\approx 2^{38.36} $$ strings in this set $S_0$. This is off by a multiplicative factor of $2^{38.36-39.7}=2^{-1.34}$ from covering the typical set, which contains approximately $2^{39.7}$ strings. Just going one Hamming distance beyond will cover almost all sequences in the typical set.

If this deterministic attack is too expensive, just randomly generate strings with the same Bernoulli distribution; on average you will discover the key with not much worse performance.
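A minimal sketch of this enumeration order, under the assumption that zero-counts are simply ranked by their binomial probability mass and exhausted in that order (my own illustration with a toy key length, since for $n=49$ the sets are far too large to enumerate casually):

```python
# Enumerate candidate keys by zero-count, most probable counts first,
# as in the attack sketched above (illustration only).
from itertools import combinations
from math import comb
import random

def candidates(n, p0=0.25):
    # Rank the possible zero-counts k by their binomial mass under P(bit=0)=p0.
    counts = sorted(range(n + 1),
                    key=lambda k: comb(n, k) * p0**k * (1 - p0)**(n - k),
                    reverse=True)
    for k in counts:
        for zeros in combinations(range(n), k):
            key = [1] * n
            for i in zeros:
                key[i] = 0
            yield tuple(key)

# Toy demonstration with n = 8 bits: count guesses needed for a biased key.
n = 8
secret = tuple(0 if random.random() < 0.25 else 1 for _ in range(n))
for trials, guess in enumerate(candidates(n), start=1):
    if guess == secret:
        print(f"found after {trials} of {2**n} candidates")
        break
```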

  • It is worth mentioning that "for certain fractal type distributions (see this paper for details) the Shannon entropy is $H$ but the key can be brute forced in much smaller than $2^H$ guesses on average" doesn't require weird/"fractal" distributions to demonstrate. Let $K$ either sample the all-zero key with probability 1/2, or sample a uniformly random key on $k$ bits otherwise. So $K = (1/2)\cdot 0^k + (1/2) \cdot \mathsf{Unif}([2^k])$. It is straightforward to show that $H(K) \geq (1/2) H(0^k) + (1/2) H(\mathsf{Unif}([2^k])) = (1/2) H(\mathsf{Unif}([2^k]))$ is still relatively high, despite being very easy to break for "most" keys. – Commented Jan 14 at 23:18
  • @MarkSchultz-Wu, good comment; my "fractal" was too strong a term, I meant something like your example to be included. – Commented Jan 15 at 1:19
  • @kodlu: Can you refer me to a proof of your claim that, with overwhelming probability, all sequences from this biased source belong to the interval defined in terms of the Shannon entropy? – Commented Jan 15 at 10:36
2

If I have a random key accumulating $N$ bits of entropy, I’d expect to brute-force it in the order of $2^N$ trials.

That's the right order of magnitude. For a uniform distribution, the expected number of failed trials is $(2^N-1)/2$. For a non-uniform distribution, that can be less by a small factor, or more by a much larger factor. I assume $N$ is the Shannon entropy in bits.


The method leading to the least expected number of failed trials is to make guesses from most to least probable key. For a key of $k=4$ symbols with 1 having probability $0\le p\le 1/2$ and 0 probability $1-p$, a suitable order for the $2^k$ candidates is

0000 0001 0010 0100 1000 0011 0101 0110 1001 1010 1100 0111 1011 1101 1110 1111 

The expected number of failed trials before hitting the right key is $$\begin{array}{lll} &(1+2+3+4)\,p(1-p)^3\\ \quad+&(5+6+7+8+9+10)\,p^2(1-p)^2\\ \quad+&(11+12+13+14)\,p^3(1-p)\\ \quad+&15\,p^4\\ =&10p+15p^2-10p^3\end{array}$$ Note: to extend to $0\le p\le 1$, change $p$ to $\min(p,1-p)$.
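For anyone who wants to reproduce these numbers, here is a small sketch (my own check, assuming $0\le p\le 1/2$) that computes the exact expected number of failed trials for modest $k$ by enumerating keys from most to least probable; for $k=4$ it agrees with $10p+15p^2-10p^3$:

```python
# Expected number of failed trials when guessing k-bit keys from most to
# least probable, with P(bit=1) = p <= 1/2 (brute enumeration, small k only).
from itertools import product

def expected_failed_trials(k, p):
    # For p <= 1/2, sorting keys by Hamming weight (fewest ones first)
    # is a most-probable-first guessing order.
    keys = sorted(product((0, 1), repeat=k), key=sum)
    return sum(i * p**sum(key) * (1 - p)**(k - sum(key))
               for i, key in enumerate(keys))   # i failed trials before guess i+1

p = 0.25
print(expected_failed_trials(4, p))      # 3.28125
print(10*p + 15*p**2 - 10*p**3)          # 3.28125, matching the closed form
```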

Plotting that expected number of failed attempts against the entropy-based approximation $(2^N-1)/2$ with Shannon entropy (in bits) $N=-k(p\,\log_2(p)+(1-p)\log_2(1-p))$: [plot: expected number of failed trials vs $p$, $k=4$]

Thus for $k=4$ the entropy-based approximation is fine, and always errs by excess. But it turns out that this changes for more bits, e.g. for $k=32$: [plot: expected number of failed trials vs $p$, $k=32$]

Plotting the ratio actual/expected for various $k$ (with log scales): [plot: ratio actual/expected for various $k$]

As is apparent in these examples, the actual expected number of failed trials can get arbitrarily higher than the Shannon entropy estimate suggests (e.g. for $k=128$, $p=1/21$, it's more than $120{,}000{,}000$ times higher). But whatever the distribution of the keys, it never gets much lower. As pointed out by kodlu, answering my reformulation in more mathematical terms on Math-SE, a lower bound of essentially $1/2$ is given in J.L. Massey's "Guessing and entropy", in Proceedings of ISIT 1994.

0

If you use ASCII to represent text, then picking a random ASCII character gives you 7 bits of entropy. Picking random characters from an English text gives you much less entropy, because some characters occur much more often than others. If you take 14 consecutive characters from an English text, the entropy is lower again because the letters are correlated.

Even if those 14 characters had only 40 bits of entropy, you'd need about $128^{14}$ uniformly random guesses of ASCII characters to find them, that's $2^{98}$. So you don't make random guesses but guesses that look like English text. If the entropy is lower because of the characteristics of the text, then you need guesses matching those characteristics.

-1

This question illustrates a common misconception about entropy. Re. para 2: if the key has $N$ bits of entropy, that's it. It has $N$ bits of entropy. The key symbols may be biased, correlated or quasi-sequential. The key may be 1,000 bytes long, but if the measured entropy is $H_{\infty}$, it’s $H_{\infty}$. That’s it: $H_{\infty}(\text{key}) \ne | \text{key}|$.

Para 3 is subsumed into this argument/measurement. And thus you end up with $2^N$ trials, whilst remembering that $N$ is the entropy, not the key length.

P.S. We use Min entropy, not Shannon entropy in cryptography because we're mean.
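To make the Shannon-versus-min-entropy distinction concrete for the asker's parameters (my own illustrative sketch; min-entropy reflects only the probability of the single most likely key, here the all-ones string):

```python
# Shannon entropy vs. min-entropy for 49 independent bits with P(bit=1) = 0.75.
from math import log2

H_shannon = -(0.25 * log2(0.25) + 0.75 * log2(0.75))   # ~0.811 bits per bit
H_min = -log2(0.75)                                     # ~0.415 bits per bit

print(49 * H_shannon)   # ~39.7 bits
print(49 * H_min)       # ~20.3 bits: the conservative figure used in practice
```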

  • There should be a link between the entropy and the brute-force cost, at least in the natural case of a string of i.i.d. biased bits. If you know that there is no such link, you'd better explain why, rather than calling this a misconception. – Commented Jan 15 at 10:29
