Calculating square root using only integer math in python

Question

I'm working on a microcontroller that does not support floating point math. Integer math only. As such, there is no sqrt() function and I can't import any math modules. The MCU is running a subset of python that supports eight Python data types: None, integer, Boolean, string, function, tuple, byte list, and iterator. Also, the MCU can't do floor division (//).

My problem is that I need to calculate the magnitude of 3 signed integers.

mag = sqrt(x**2+y**2+z**2)

FWIW, the values can only be in the range of +/-1024 and I just need a close approximation. Does anyone have a pattern for solving this problem?

What - precisely - does "close" mean? How important is speed? Can you use a table of pre-computed constants, and if so how many bytes of pre-computed data can you afford? What does the MCU support - Integer addition? Right shift? Integer multiplication? And on what size integers? — Tim Peters
– Tim Peters, Commented Jul 4, 2016 at 4:48
Do you really need the magnitude, or does using the magnitude-squared directly work? — o11c
– o11c, Commented Jul 4, 2016 at 5:03
Does it support division at all? The floor division operator is a recent addition to distinguish from real division, a pointless distinction if only integers are supported. But division is frequently expensive in the first place; it's not unlikely a successive approximation using shifts, additions and multiplications is cheaper, which might in turn benefit from operations like ctz or clrsb to find the most significant 1 bit. I'm rather curious what Python-like system you're using. PyMite perhaps? — Yann Vernier
– Yann Vernier, Commented Jul 4, 2016 at 5:27

Tim Peters · Accepted Answer · 2016-07-04 19:29:40Z

Note that the largest possible sum is 3*1024**2, so the largest possible square root is 1773 (floor - or 1774 rounded).

So you could simply take 0 as a starting guess, and repeatedly add 1 until the square exceeds the sum. That can't take more than about 1770 iterations.

Of course that's probably too slow. A straightforward binary search can cut that to 11 iterations, and doesn't require division (I'm assuming the MCU can shift right by 1 bit, which is the same as floor-division by 2).

EDIT

Here's some code, for a binary search returning the floor of the true square root:

def isqrt(n): if n <= 1: return n lo = 0 hi = n >> 1 while lo <= hi: mid = (lo + hi) >> 1 sq = mid * mid if sq == n: return mid elif sq < n: lo = mid + 1 result = mid else: hi = mid - 1 return result

To check, run:

from math import sqrt assert all(isqrt(i) == int(sqrt(i)) for i in range(3*1024**2 + 1))

That checks all possible inputs given what you said - and since binary search is notoriously tricky to get right in all cases, it's good to check every case! It doesn't take long on a "real" machine ;-)

PROBABLY IMPORTANT

To guard against possible overflow, and speed it significantly, change the initialization of lo and hi to this:

 hi = 1 while hi * hi <= n: hi <<= 1 lo = hi >> 1

Then the runtime becomes proportional to the number of bits in the result, greatly speeding smaller results. Indeed, for sloppy enough definitions of "close", you could stop right there.

FOR POSTERITY ;-)

Looks like the OP doesn't actually need square roots at all. But for someone who may, and can't afford division, here's a simplified version of the code, also removing multiplications from the initialization. Note: I'm not using .bit_length() because lots of deployed Python versions don't support that.

def isqrt(n): if n <= 1: return n hi, hisq = 2, 4 while hisq <= n: hi <<= 1 hisq <<= 2 lo = hi >> 1 while hi - lo > 1: mid = (lo + hi) >> 1 if mid * mid <= n: lo = mid else: hi = mid assert lo + 1 == hi assert lo**2 <= n < hi**2 return lo from math import sqrt assert all(isqrt(i) == int(sqrt(i)) for i in range(3*1024**2 + 1))

this suffers the same as mine ... except it always rounds up instead of down ... (ie 10.01 -> 11) but +1 as this is plenty fast
In the absence of a definition for "close" ... ;-) But a binary search can be coded to compute the floor of the true sqrt, or the ceiling, or to pick the i such that i**2 is closest to the input n. For example, if it "naturally" produces the floor, then to get the ceiling instead just add if i*i != n: i += 1 at the end.
@TimPeters, I was wrong about the range. The adxl345 stores the accel data in 2 eight bit registers for each axis. I have to shift the msbyte<<8 then | the lsbyte. The MCU only supports signed 16 bit integers, (+/-32768)
Thanks for taking the time to help me. I'll mark the answer as correct.
@cce1911, if you missed the comment on a deleted answer: checking whether sqrt(n) > t is the same as checking whether n > t*t. So you don't need square roots at all. If your threshold is t, just compare x**2 + y**2 + z**2 to t**2.

Copperfield · Accepted Answer · 2016-07-04 05:08:04Z

there is a algorithm to calculate it, but it use floor division, without that this is what come to my mind

def isqrt_linel(n): x = 0 while (x+1)**2 <= n: x+=1 return x

by the way, the algorithm that I know use the Newton method:

def isqrt(n): #https://en.wikipedia.org/wiki/Integer_square_root #https://gist.github.com/bnlucas/5879594 if n>=0: if n == 0: return 0 a, b = divmod(n.bit_length(), 2) x = 2 ** (a + b) while True: y = (x + n // x) >> 1 if y >= x: return x x = y else: raise ValueError("negative number")

Collectives™ on Stack Overflow

Calculating square root using only integer math in python

2 Answers 2

EDIT

PROBABLY IMPORTANT

FOR POSTERITY ;-)

5 Comments

Comments

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

EDIT

PROBABLY IMPORTANT

FOR POSTERITY ;-)

5 Comments

Comments

Related