
Given the sequence

1/1, 1/2, 1/3, ..., 1/n

How can I calculate at what point I will no longer be able to distinguish, with precision E, between two consecutive elements 1/i and 1/(i+1) if I use numpy.float16? I.e., what is 'i'?

What about the other numpy float types?

What is the smallest E? And how do I calculate 'i' for it?

For example, if E = 0.01 I can distinguish between 1/9 and 1/10, but not between 1/10 and 1/11, because:

    1/9  = 0.111
    1/10 = 0.100
    1/11 = 0.091

    0.111 - 0.100 = 0.011 >= E
    0.100 - 0.091 = 0.009 <  E

    i = 10
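
A quick numeric check of the example above (a sketch; E and the i = 10 cutoff are taken from the question):

    # Check whether consecutive pairs 1/i and 1/(i+1) differ by at least E = 0.01.
    E = 0.01
    for i in (9, 10):
        diff = 1 / i - 1 / (i + 1)
        print(i, round(diff, 3), diff >= E)
    # 9 0.011 True   -> 1/9 and 1/10 are distinguishable
    # 10 0.009 False -> 1/10 and 1/11 are not, hence i = 10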

In a more abstract way: given f(i), what is the maximum 'i' representable in np.floatXX?


Interestingly, the precision in practice is worse than calculated. Here is the place where the logic breaks:

    for i in range(int(1e3), int(12e6)):
        if not np.floatXX(1/i) > np.floatXX(1/(i+1)):
            print(i)
            break

    # float32: 11864338
    # float16: 1464

    Had to post a second answer because I finally understood the exact point of your question. It's derivable from what I originally wrote, but definitely not trivial. Commented Sep 16, 2021 at 5:30

2 Answers


Instead of adding 1 to the denominator, double it: you can safely assume that the underlying representation is binary. Here is one simple method:

    import numpy as np

    one = np.float64(1.0)
    two = np.float64(2.0)
    n = one
    bits = 0
    while one + n != one:
        bits += 1
        n /= two

You start with bits = 0 because otherwise you would also be counting the step that takes you past the resolution limit.

In the end, you get bits = 53, which is the number of significand bits in an IEEE-754 64-bit floating point number (52 stored bits plus one implicit leading bit).

That means that for any number, which is encoded in what is effectively binary scientific notation, the ULP (unit of least precision) is approximately n * 2**-53, where n is the number rounded down to its highest bit, i.e. to a power of two. You won't be able to resolve smaller relative changes in a float.
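
For comparison, numpy exposes this gap directly: np.spacing(x) returns the distance from x to the nearest adjacent representable number. A quick check (a sketch, shown for float64 and float32):

    import numpy as np

    # One ULP at 1.0 for each type.
    print(np.spacing(np.float64(1.0)))  # 2.220446049250313e-16 == 2**-52
    print(np.spacing(np.float32(1.0)))  # 1.1920929e-07 == 2**-23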

Bonus: Running the above code for the other floating point types gives:

    float16 (half):                  11 bits
    float32 (single):                24 bits
    float64 (double):                53 bits
    float96 (sometimes longdouble):  80 bits
    float128 (when available):      113 bits
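
To reproduce that table, the loop can be wrapped in a function parameterized by the numpy type (a sketch; float96/float128 are left out because their availability is platform-dependent). As a cross-check, np.finfo reports the stored mantissa width, which is one less than the count here because of the implicit leading bit:

    import numpy as np

    def mantissa_bits(ftype):
        """Count significand bits by halving n until 1 + n rounds back to 1."""
        one, two = ftype(1.0), ftype(2.0)
        n = one
        bits = 0
        while one + n != one:
            bits += 1
            n /= two
        return bits

    for ftype in (np.float16, np.float32, np.float64):
        print(ftype.__name__, mantissa_bits(ftype), np.finfo(ftype).nmant + 1)
    # float16 11 11
    # float32 24 24
    # float64 53 53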

You can modify the code above to work for any target number:

    target = np.float16(0.0004883)
    one = np.float16(1.0)
    two = np.float16(2.0)
    n = two**(np.floor(np.log2(target)) - one)
    bits = 0
    while target + n != target:
        bits += 1
        n /= two

The result (the ULP) is given by n * two, since the loop stops only after you lose resolution. This is the same reason we start with bits = 0. In this case:

    >>> n * two
    5e-07

You can entirely short-circuit the computation if you know the number of bits in the mantissa up front. So for float16, where bits = 11, you can do

    >>> two**(np.floor(np.log2(target)) - np.float16(bits))
    5e-07


Comments

so for f16, n=0.0004883, bits=11 so max-i is ??
@sten. Updated with an example.
@sten. 2.4e-07. If you plug all the correct types in
thanks but i'm confused is E <=> n and i <=> 2^bits ?
@sten. Bits is the number of bits of precision you can store in the mantissa. n is the largest number that is a power of 2 and can not be resolved as a difference from the target number, i.e., the one bit of n is 12 binary places below the topmost bit of the target. n * two is the smallest number that can be resolved. It will cause a change of exactly one in the ULP of the target.

My other answer provides the theory behind what you are actually asking, but needs some non-trivial interpretation. Here is the missing step:

Given some integer i, you can write

    1 / i - 1 / (i + 1) = (i + 1 - i) / (i * (i + 1))
                        = 1 / (i * (i + 1))
                        = 1 / (i**2 + i)
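
As a quick exact sanity check of this identity (a sketch using Python's fractions module):

    from fractions import Fraction

    # 1/i - 1/(i+1) == 1/(i**2 + i) holds exactly in rational arithmetic.
    for i in (3, 10, 4090):
        assert Fraction(1, i) - Fraction(1, i + 1) == Fraction(1, i**2 + i)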

To find an i such that 1 / (i**2 + i) falls below the ULP of 1 / i in some binary representation, you can use my other answer pretty directly.

The ULP of 1 / i is given by

    ulp = 2**(floor(log2(1 / i)) - (bits + 1))

You can try to find an i such that

    1 / (i**2 + i) < 2**(floor(log2(1 / i)) - (bits + 1))
    1 / (i**2 + i) < 2**floor(log2(1 / i)) / 2**(bits + 1)
    2**(bits + 1) < (i**2 + i) * 2**floor(log2(1 / i))

It's not trivial as written because of the floor operation, and Wolfram Alpha runs out of time. Since I am cheap and don't want to buy Mathematica, let's just approximate:

    2**(bits + 1) < (i**2 + i) * 2**floor(log2(1 / i))
    2**(bits + 1) < (i**2 + i) / i        # approximating 2**floor(log2(1 / i)) by 1 / i
    2**(bits + 1) < i + 1

You may be off by one or so, but you should see that around i = 2**(bits + 1) - 1, the difference stops being resolvable. And indeed, for the 11-bit mantissa of float16, we see:

    >>> np.float16(1 / (2**12 - 1)) - np.float16(1 / (2**12))
    0.0

The actual number is a tiny bit less here (remember the approximation where we took away the floor):

    >>> np.float16(1 / (2**12 - 5)) - np.float16(1 / (2**12 - 4))
    0.0
    >>> np.float16(1 / (2**12 - 6)) - np.float16(1 / (2**12 - 5))
    2.4e-07

As you noted in your comments, i is

    >>> 2**12 - 6
    4090

You can compute the exact values for all the other floating point types in a similar manner. But that is indeed left as an exercise for the reader.
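
As a brute-force cross-check in the style of the loop from the question (a sketch; note that it finds the first indistinguishable pair, which, as the comments below show, is not the same point as the bound derived above):

    import numpy as np

    def first_collision(ftype):
        """Find the first i where 1/i and 1/(i+1) round to the same ftype value."""
        i = 2
        while ftype(1 / i) != ftype(1 / (i + 1)):
            i += 1
        return i

    print(first_collision(np.float16))  # 1464, matching the question
    # float32 needs ~12 million iterations and yields 11864338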

4 Comments

“The ULP of 1 / i is given by ulp = 2**(floor(log2(1 / i)) - (bits + 1))” is incorrect. Take the case of i=1, for which 1/i = 1, and its float16 ULP is 2^−10. From your other answer, bits is 11 for float16. Then 2**(floor(log2(1 / i)) - (bits + 1)) = 2**(floor(log2(1)) - (11 + 1)) = 2**(floor(0) - 12) = 2**(0-12) = 2**-12.
The criterion given, 1 / (i**2 + i) < 2**(floor(log2(1 / i)) - (bits + 1)), even when corrected to 1 / (i**2 + i) < 2**(floor(log2(1 / i)) - (bits - 1)), determines when the difference between two successive terms falls below the ULP of the earlier term. However, that is not generally where the first failure to distinguish successive terms appears. Two terms that are less than an ULP apart are still distinguishable if they do not round to the same representable number.
Observe the points reported by the OP, 11864338 and 1464, both have significands near sqrt(2), 1.414339… and 1.4296875. I suspect that is not merely a coincidence.
Indeed, testing significand widths of 11 and up shows the significands of i for the last 1/i and 1/(i+1) that are distinguishable are 1.42969, 1.43896, 1.42358, 1.41626, 1.42108, 1.41928, 1.41809, 1.41733, 1.41549, 1.41473, 1.41518, 1.41499, 1.41453, 1.41434, 1.41445, 1.41441, 1.41429, 1.41422, 1.41427, 1.41425, 1.41424,… However, it drops below sqrt(2) at 37 bits and keeps going.
