2

I'm working on a microcontroller that does not support floating point math. Integer math only. As such, there is no sqrt() function and I can't import any math modules. The MCU is running a subset of python that supports eight Python data types: None, integer, Boolean, string, function, tuple, byte list, and iterator. Also, the MCU can't do floor division (//).

My problem is that I need to calculate the magnitude of 3 signed integers.

mag = sqrt(x**2+y**2+z**2) 

FWIW, the values can only be in the range of +/-1024 and I just need a close approximation. Does anyone have a pattern for solving this problem?

7
  • so you need the closest int to the sqrt(x**2+y**2+z**2)? Commented Jul 4, 2016 at 4:48
  • 3
    What - precisely - does "close" mean? How important is speed? Can you use a table of pre-computed constants, and if so how many bytes of pre-computed data can you afford? What does the MCU support - Integer addition? Right shift? Integer multiplication? And on what size integers? Commented Jul 4, 2016 at 4:48
  • 5
    Do you really need the magnitude, or does using the magnitude-squared directly work? Commented Jul 4, 2016 at 5:03
  • ahhh good question @o11c ... Commented Jul 4, 2016 at 5:04
  • 1
    Does it support division at all? The floor division operator is a recent addition to distinguish from real division, a pointless distinction if only integers are supported. But division is frequently expensive in the first place; it's not unlikely a successive approximation using shifts, additions and multiplications is cheaper, which might in turn benefit from operations like ctz or clrsb to find the most significant 1 bit. I'm rather curious what Python-like system you're using. PyMite perhaps? Commented Jul 4, 2016 at 5:27

2 Answers 2

3

Note that the largest possible sum is 3*1024**2, so the largest possible square root is 1773 (floor - or 1774 rounded).

So you could simply take 0 as a starting guess, and repeatedly add 1 until the square exceeds the sum. That can't take more than about 1770 iterations.

Of course that's probably too slow. A straightforward binary search can cut that to 11 iterations, and doesn't require division (I'm assuming the MCU can shift right by 1 bit, which is the same as floor-division by 2).

EDIT

Here's some code, for a binary search returning the floor of the true square root:

def isqrt(n): if n <= 1: return n lo = 0 hi = n >> 1 while lo <= hi: mid = (lo + hi) >> 1 sq = mid * mid if sq == n: return mid elif sq < n: lo = mid + 1 result = mid else: hi = mid - 1 return result 

To check, run:

from math import sqrt assert all(isqrt(i) == int(sqrt(i)) for i in range(3*1024**2 + 1)) 

That checks all possible inputs given what you said - and since binary search is notoriously tricky to get right in all cases, it's good to check every case! It doesn't take long on a "real" machine ;-)

PROBABLY IMPORTANT

To guard against possible overflow, and speed it significantly, change the initialization of lo and hi to this:

 hi = 1 while hi * hi <= n: hi <<= 1 lo = hi >> 1 

Then the runtime becomes proportional to the number of bits in the result, greatly speeding smaller results. Indeed, for sloppy enough definitions of "close", you could stop right there.

FOR POSTERITY ;-)

Looks like the OP doesn't actually need square roots at all. But for someone who may, and can't afford division, here's a simplified version of the code, also removing multiplications from the initialization. Note: I'm not using .bit_length() because lots of deployed Python versions don't support that.

def isqrt(n): if n <= 1: return n hi, hisq = 2, 4 while hisq <= n: hi <<= 1 hisq <<= 2 lo = hi >> 1 while hi - lo > 1: mid = (lo + hi) >> 1 if mid * mid <= n: lo = mid else: hi = mid assert lo + 1 == hi assert lo**2 <= n < hi**2 return lo from math import sqrt assert all(isqrt(i) == int(sqrt(i)) for i in range(3*1024**2 + 1)) 
Sign up to request clarification or add additional context in comments.

5 Comments

this suffers the same as mine ... except it always rounds up instead of down ... (ie 10.01 -> 11) but +1 as this is plenty fast
In the absence of a definition for "close" ... ;-) But a binary search can be coded to compute the floor of the true sqrt, or the ceiling, or to pick the i such that i**2 is closest to the input n. For example, if it "naturally" produces the floor, then to get the ceiling instead just add if i*i != n: i += 1 at the end.
@TimPeters, I was wrong about the range. The adxl345 stores the accel data in 2 eight bit registers for each axis. I have to shift the msbyte<<8 then | the lsbyte. The MCU only supports signed 16 bit integers, (+/-32768)
Thanks for taking the time to help me. I'll mark the answer as correct.
@cce1911, if you missed the comment on a deleted answer: checking whether sqrt(n) > t is the same as checking whether n > t*t. So you don't need square roots at all. If your threshold is t, just compare x**2 + y**2 + z**2 to t**2.
1

there is a algorithm to calculate it, but it use floor division, without that this is what come to my mind

def isqrt_linel(n): x = 0 while (x+1)**2 <= n: x+=1 return x 

by the way, the algorithm that I know use the Newton method:

def isqrt(n): #https://en.wikipedia.org/wiki/Integer_square_root #https://gist.github.com/bnlucas/5879594 if n>=0: if n == 0: return 0 a, b = divmod(n.bit_length(), 2) x = 2 ** (a + b) while True: y = (x + n // x) >> 1 if y >= x: return x x = y else: raise ValueError("negative number") 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.