19

Yesterday i had this interview question, which I couldn't fully answer:

Given a function f() = 0 or 1 with a perfect 1:1 distribution, create a function f(n) = 0, 1, 2, ..., n-1 each with probability 1/n

I could come up with a solution for if n is a natural power of 2, ie use f() to generate the bits of a binary number of k=ln_2 n. But this obviously wouldn't work for, say, n=5 as this would generate f(5) = 5,6,7 which we do not want.

Does anyone know a solution?

3

4 Answers 4

26

You can build a rng for the smallest power of two greater than n as you described. Then whenever this algorithm generates a number larger than n-1, throw that number away and try again. This is called the method of rejection.

Addition

The algorithm is

Let m = 2^k >= n where k is is as small as possible. do Let r = random number in 0 .. m-1 generated by k coin flips while r >= n return r 

The probability that this loop stops with at most i iterations is bounded by 1 - (1/2)^i. This goes to 1 very rapidly: The loop is still running after 30 iterations with probability less than one-billionth.

You can decrease the expected number of iterations with a slightly modified algorithm:

Choose p >= 1 Let m = 2^k >= p n where k is is as small as possible. do Let r = random number in 0 .. m-1 generated by k coin flips while r >= p n return floor(r / p) 

For example if we are trying to generate 0 .. 4 (n = 5) with the simpler algorithm, we would reject 5, 6 and 7, which is 3/8 of the results. With p = 3 (for example), pn = 15, we'd have m = 16 and would reject only 15, or 1/16 of the results. The price is needing four coin flips rather than 3 and a division op. You can continue to increase p and add coin flips to decrease rejections as far as you wish.

Sign up to request clarification or add additional context in comments.

16 Comments

How is this going to produce the range 0, 1, 2, ..., n - 2, n - 1?
A useful method, but note that it can in theory take an infinitely long time to run. The probability of needing at least m+1 "rolls" (each of k coin-flips) is (2^k - n)^m/2^k, which goes down quickly with increasing m, but never reaches zero.
@j_random_hacker It's not too difficult to show that any method that gives a uniform result must discard some results. If you generate z random bits (for any z), you've got 2^z possible outcomes in total. If n isn't a factor of 2^z, there is no uniform way of partitioning these 2^z outcomes into n different results. The best you can hope for is to assign almost all of them, and leave at most n-1 unassigned; these will have to be rejected.
@j_random_hacker It's impossible, I'm afraid... Suppose it was bounded by R. That means that there's an algorithm that requires at most R random bits, and produces something uniform from 0 to n-1. But as in the previous comment, there are 2^R possible inputs to the function (regardless of what it uses and what it ignores), and they are uniformly distributed. You can't uniformly partition these into n results unless n is a factor of 2^R.
|
2

Another interesting solution can be derived through a Markov Chain Monte Carlo technique, the Metropolis-Hastings algorithm. This would be significantly more efficient if a large number of samples were required but it would only approach the uniform distribution in the limit.

 initialize: x[0] arbitrarily for i=1,2,...,N if (f() == 1) x[i] = (x[i-1]++) % n else x[i] = (x[i-1]-- + n) % n 

For large N the vector x will contain uniformly distributed numbers between 0 and n. Additionally, by adding in an accept/reject step we can simulate from an arbitrary distribution, but you would need to simulate uniform random numbers on [0,1] as a sub-procedure.

Comments

0
def gen(a, b): min_possible = a max_possible = b while True: floor_min_possible = floor(min_possible) floor_max_possible = floor(max_possible) if max_possible.is_integer(): floor_max_possible -= 1 if floor_max_possible == floor_min_possible: return floor_max_possible mid = (min_possible + max_possible)/2 if coin_flip(): min_possible = mid else: max_possible = mid 

Comments

-3
My #RandomNumberGenerator #RNG /w any f(x) that gives rand ints from 1 to x, we can get rand ints from 1 to k, for any k: get ints p & q, so p^q is smallest possible, while p is a factor of x, & p^q >= k; Lbl A i=0 & s=1; while i < q { s+= ((f(x) mod p) - 1) * p^i; i++; } if s > k, goto A, else return s //** about notation/terms: rand = random int = integer mod is (from) modulo arithmetic Lbl is a “Label”, from the Basic language, & serves as a coordinates for executing code. After the while loop, if s > k, then “goto A” means return to the point of code where it says “Lbl A”, & resume. If you return to Lbl A & process the code again, it resets the values of i to 0 & s to 1. i is an iterator for powers of p, & s is a sum. "s+= foo" means "let s now equal what it used to be + foo". "i++" means "let i now equal what it used to be + 1". f(x) returns random integers from 1 to x. **// I figured out/invented/solved it on my own, around 2008. The method is discussed as common knowledge here. Does anyone know since when the random number generator rejection method has been common knowledge? RSVP. 

5 Comments

Lbl is a “Label”, from the Basic language, & serves as a coordinates for executing code. After the while loop, if s > k, then “goto A” means return to the point of code where it says “Lbl A”, & resume. If you return to Lbl A & process the code again, it resets the values of i to 0 & s to 1. i is an iterator for powers of p, & s is a sum. s+= foo means let s now equal what it used to be + foo. f(x) returns random integers between 1 & x. Does that help? Cheers.
add explanation to your answer not in comments.
This is the basic RNG version that can upgrade. I'll soon begin working on the upgraded code. This basic RNG works /w only 1 factorization p of x, however it's possible to use multiple factors to lower calculation time. e.g. use dice, or f(6), to generate rand ints from 1 to 12: 1. use f(6) to generate a result from 1 to 6. 2. use {(f(6) mod 2)-1} to generate a result from 0 to 1. 3. return {(the result from step 1) + (6 * the result from step 2)}. Comparatively, the basic RNG may use p=2 & q=4
e.g. use dice, or f(6), to generate rand ints from 1 to 17 (i.e. “use f(6) to generate rand ints from 1 to 18, accepting only returns from 1 to 17”) 1. use f(6) to get a RNG int from 1 to 6. 2. use {(f(6) mod 3) - 1} to RNG an int from 0 to 2. 3. if {(the result from step 1) + (6 * the result from step 2)} > 17, restart /w step 1, else return that value.
Additional comments, including an algorithm for using coin flips to generate (12 sided die, 'd12') random integers from 1 to 12, exceeds stackoverflow comment character count limits, so are linked plus.google.com/u/0/113366929351507812798/posts/h2dJSEYQeuU

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.