3
$\begingroup$

Supposing that one has reliably random data of uniform distribution to use as an input, how can an integer in the range $[0, k-1]$ be selected at random without bias and in constant-time with respect to the size of $k$?

$\endgroup$
4
  • 1
    $\begingroup$ Can we have a little bit more? Uniformly distributed whats? $\endgroup$ Commented May 22 at 19:57
  • $\begingroup$ You want this algorithm: arxiv.org/abs/1805.10941. Notes by someone trying to understand the algorithm: sts10.github.io/2020/10/10/…. Note that this method uses rejection sampling, so it's not strictly constant-time, but it does not leak timing data. $\endgroup$ Commented May 22 at 23:14
  • $\begingroup$ @swineone How does it thwart timing attacks? $\endgroup$ Commented May 23 at 11:36
  • $\begingroup$ See discussion in a cryptographic context here: cic.iacr.org/p/1/2/14/pdf $\endgroup$ Commented May 24 at 3:21

2 Answers

12
$\begingroup$

I'll assume that, by 'random data of uniform distribution', you mean a source of uniformly and independently distributed random bits. The argument below can be modified to work if the source isn't binary; however, binary is the usual case.

Obviously, unless $k$ happens to be a power of two, any constant-time algorithm (that is, one with an upper bound on the number of random bits it can sample) can't be precisely uniform - there will always be a bias.

However, in practical terms, this bias can be made sufficiently small for any practical purpose.

One approach is to sample $\lceil \log_2 k \rceil + 128$ random bits, interpret that as an integer $z$ and compute $z \bmod k$; that's your random value. This can be done in constant time, and the maximum bias between possible values is $< 2^{-128}$; to detect such a bias, you would need to examine $2^{256}$ such values, and we'll never generate that many. Hence, this output is effectively uniformly distributed, even if it is theoretically not.
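A minimal sketch of this approach, using Python's `secrets` module as the source of uniform random bits (note that Python's big-integer arithmetic is not itself constant-time; this illustrates the sampling strategy, not a hardened implementation):

```python
import secrets

def rand_below(k: int) -> int:
    """Sample an integer in [0, k-1] with bias below 2^-128.

    Draws bit_length(k) + 128 uniform random bits (at least
    ceil(log2 k) + 128) and reduces mod k.  The number of bits drawn
    depends only on k, not on the value sampled, so there is no
    rejection loop.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    nbits = k.bit_length() + 128      # >= ceil(log2 k) + 128
    z = secrets.randbits(nbits)       # uniform in [0, 2^nbits)
    return z % k
```

The slack of 128 bits is what keeps the bias between any two outputs below $2^{-128}$, per the analysis above.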

$\endgroup$
8
  • $\begingroup$ Where does that +128 come from? $\endgroup$ Commented May 22 at 19:53
  • 3
    $\begingroup$ @bdegnan: that's to reduce any potential bias to undetectable. With $k$ extra bits, the number of outputs needed to detect the bias is around $2^{2k}$; with $k=128$, that is an unfeasibly huge amount. With $k=64$, it's almost always sufficient; I decided to go with massive overkill when writing this $\endgroup$ Commented May 22 at 19:57
  • $\begingroup$ ah, see. I appreciate the explanation. $\endgroup$ Commented May 22 at 23:32
  • 3
    $\begingroup$ @Melab: the proof is quite simple: let us call the bound on sampled random bits $a$. Any deterministic algorithm whose output relies on a maximum $a$ random bits would necessarily have output probability for any specific output be an integer multiple of $2^{-a}$. For the output of the algorithm to be precisely uniform, then each possible output must occur with probability $k^{-1}$. If $k$ is not a power of two, then $k^{-1}$ is not an integer multiple of $2^{-a}$, for any $a$, hence the probability of that output must be either higher or lower than precisely uniform. $\endgroup$ Commented May 23 at 11:38
  • 1
    $\begingroup$ I never understood why even competent people like you often use in this situation the more complicated modular reduction mod $k$ instead of simply multiplying by $k$ and cutting off the lowest 128 bits. $\endgroup$ Commented May 24 at 7:48
1
$\begingroup$

One possible interpretation is that we have a source that produces a fixed-length uniform random bit string (say, a 256-bit string), and we want to generate a random number in the range $[0,k)$. We could even have $k \geq 2^{256}$, since at this point we only expect computational indistinguishability.

Construction based on a PRG: Let $s$ be a 256-bit uniform random secret and $G$ a PRG that outputs bitstrings of length $l$; we sample a random number in range as $r = int(G(s)) \mod k$. The security of this scheme against a computationally limited attacker can be summarized as follows:

  1. $G(s)$ can be replaced with a random bitstring $x$ by the security of $G$
  2. $X = int(x)$ is uniformly distributed in $[0, 2^l)$. If $Y$ is the uniform distribution on $[0, k)$, let $Z = X \mod k$; then the statistical distance $\Delta(Y; Z)$ is upper bounded by $k / 2^l$ (for $k \leq 2^l$). The bound is loose, and something tighter can be computed, but it is good enough for the sake of choosing secure instantiations.

In other words, the probability that a computationally bounded attacker distinguishes $Z$ from $Y$ is upper bounded by $\Pr[\text{Break } G] + \Delta(Y; Z)$. Assuming that $k \leq 2^n$, the statistical distance is at most $1/2^{l-n}$. So if we want to keep the statistical distance to a negligible amount, we would need, for example, $l = n + 128$.

Sample several times: The construction above can be generalized so that the same seed generates several random numbers. For this, we replace $G$ by a PRF $F$ that outputs bitstrings of length $l$. To generate the next value in range $[0, k)$, we use a nonce $i$ and compute $int(F(seed, i)) \mod k$. The justification of security is essentially the same as in the PRG case.
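A sketch of this PRF-based variant, instantiating $F$ with HMAC-SHA-256 (an illustrative choice of PRF, not one fixed by the construction; here $l = 256$, so the bound above is negligible whenever $k$ fits in, say, 128 bits):

```python
import hashlib
import hmac

def prf_sample(seed: bytes, nonce: int, k: int) -> int:
    """Sample the nonce-th value in [0, k-1] from a single 256-bit seed.

    F(seed, nonce) = HMAC-SHA-256(seed, nonce), giving l = 256 output
    bits; reducing mod k then keeps the statistical distance from
    uniform at most k / 2^256.
    """
    if not (0 < k <= 1 << 128):
        raise ValueError("k out of range for a 256-bit PRF output")
    block = hmac.new(seed, nonce.to_bytes(16, "big"), hashlib.sha256).digest()
    return int.from_bytes(block, "big") % k
```

Because $F$ is deterministic, the same `(seed, nonce)` pair always yields the same value, so each nonce must be used only once per seed.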

$\endgroup$
2
  • $\begingroup$ I don't understand a bit of this. $\endgroup$ Commented May 23 at 11:28
  • 1
    $\begingroup$ @Melab, is there anything in particular that I can help clarify? In summary, this is essentially @ poncho's answer with a twist. Namely, we approximate sampling a long random string by sampling a short random string, stretching it, and then doing what @ poncho said. $\endgroup$ Commented May 23 at 11:36
