5

I am currently trying to figure out how to multiply two numbers in fixed point representation.

Say my number representation is as follows:

[SIGN][2^0].[2^-1][2^-2]..[2^-14] 

In my case, the number 10.01000000000000 = -0.25.

How would I for example do 0.25x0.25 or -0.25x0.25 etc?

Hope you can help!

6
  • On a piece of paper or are you using a programming language of some form? If so you should add it to the tags. Commented Dec 23, 2012 at 3:19
  • a calculator seems appropriate ... Commented Dec 23, 2012 at 3:21
  • Haha, I'm fairly new to the website so I dont know how to accept questions posted!? Commented Dec 23, 2012 at 16:33
  • You click the checkbox outline next to the up/down buttons next to your favorite answer. It will turn green, and you'll get 2 points when you do if you did not write the answer yourself. Commented Jan 30, 2013 at 20:16
  • use array of unsigned integer types instead of separate bits representation and use build in arithmetics, take a look here stackoverflow.com/q/18465326/2521214 or search for bigint multiplication. Its way much faster than bit approach Commented Oct 14, 2013 at 10:04

3 Answers 3

5

You should use 2's complement representation instead of a seperate sign bit. It's much easier to do maths on that, no special handling is required. The range is also improved because there's no wasted bit pattern for negative 0. To multiply, just do as normal fixed-point multiplication. The normal Q2.14 format will store value x/214 for the bit pattern of x, therefore if we have A and B then

\frac{A}{2^{14}}\times\frac{B}{2^{14}} = \frac{A \times B}{2^{14}}\times \frac{1}{2^{14}}

So you just need to multiply A and B directly then divide the product by 214 to get the result back into the form x/214 like this

AxB = ((int32_t)A*B) >> 14; 

A rounding step is needed to get the nearest value. You can find the way to do it in Q number format#Math operations. The simplest way to round to nearest is just add back the bit that was last shifted out (i.e. the first fractional bit) like this

AxB = (int32_t)A*B; AxB = (AxB >> 14) + ((AxB >> 13) & 1); 

You might also want to read these

With 2 bits you can represent the integer range of [-2, 1]. So using Q2.14 format, -0.25 would be stored as 11.11000000000000. Using 1 sign bit you can only represent -1, 0, 1, and it makes calculations more complex because you need to split the sign bit then combine it back at the end.

Sign up to request clarification or add additional context in comments.

3 Comments

well but if you have big-enough bit count (so you do not have big enough integer type variable at your disposal and need to write arithmetics your self) then separate sign is much better option.
@Spektre: using a seperate sign would only increase complexity. Addition, subtraction, multiplication and division all would need the programmer to take care about sign. Whereas 2's complement they'll only need one straitforward version for both signed and unsigned numbers
Try to do it on numbers (128*32)bit wide and then you will start to see the benefits. If you use already present data types arithmetics then you are correct. btw. few ifs in bignum +,- operations are not as much bad thing as the horriblnes of *,/,%,log,pow on 2's complement
3

Multiply into a larger sized variable, and then right shift by the number of bits of fixed point precision.

1 Comment

You didn't even specify a programming language so I don't know what kind of example to give.
2

Here's a simple example in C:

int a = 0.25 * (1 << 16); int b = -0.25 * (1 << 16); int c = (a * b) >> 16; printf("%.2f * %.2f = %.2f\n", a / 65536.0, b / 65536.0 , c / 65536.0); 

You basically multiply everything by a constant to bring the fractional parts up into the integer range, then multiply the two factors, then (optionally) divide by one of the constants to return the product to the standard range for use in future calculations. It's like multiplying prices expressed in fractional dollars by 100 and then working in cents (i.e. $1.95 * 100 cents/dollar = 195 cents).

Be careful not to overflow the range of the variable you are multiplying into. Your constant might need to be smaller to avoid overflow, like using 1 << 8 instead of 1 << 16 in the example above.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.