
I have a particular case of floating-point addition. As you know, for given floating-point numbers \$x, y\$ one of the steps of the addition involves the fixed-point sum:

$$ s = 1.m_x + (-1)^{s_x \oplus s_y} 2^{-\delta}1.m_y, $$

I'm assuming \$|x| > |y|\$; \$s_x\$ and \$s_y\$ are the input signs and \$\delta\$ is the exponent difference. In all my references, the utility of the guard, round, and sticky bits is explained only for the case where a subtraction is performed, which may require a left shift as the normalization step. I now have this special case to analyze:

$$ t = 1.m_x + 2^{-\gamma} 1.m_y + 2^{-\delta} 1.m_z $$

where $$ 0 \leq \gamma \leq \delta. $$

It is easy to see that the normalization step in such a case always implies a right shift, never a left shift: all three terms are positive and the leading one is at least 1, so no cancellation can occur; the sum stays in \$[1, 6)\$ and at most needs a right shift of one or two positions. I want to perform a correctly rounded operation, so I would like to understand how to implement it. As a starting point, I went back to the computation of \$s\$ and propose to analyze this special case:

$$ s = 1.m_x + 2^{-\delta}1.m_y. $$

Are all three special bits still necessary in this case, or are just two of them enough (guard and sticky, maybe)?

Once that question is answered, how can I work out how many bits I need to look at when I perform the sum of the three numbers?
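To make the setup concrete, here is the kind of bit-level model I have in mind for the two-operand case, as a minimal Python sketch: it assumes round-to-nearest-even, positive operands, 23 fraction bits, and keeps only a guard bit plus a sticky bit after alignment. The names (`P`, `align`, `add_and_round`) and the widths are just illustrative, not a claim about how the hardware should be organized.

```python
# Minimal sketch, not a reference implementation: simulate
#   s = 1.m_x + 2^-delta * 1.m_y   (both operands positive)
# with P fraction bits, keeping only a guard bit and a sticky bit,
# and rounding the result to nearest-even.

P = 23  # fraction bits, as in binary32 (an arbitrary choice for the example)

def align(sig_y, delta, extra=1):
    """Shift sig_y right by delta, keeping `extra` bits below the ulp of x
    and OR-ing everything shifted out below them into a sticky flag."""
    wide = sig_y << extra
    kept = wide >> delta
    sticky = int((wide & ((1 << delta) - 1)) != 0)
    return kept, sticky

def add_and_round(mx, my, delta, extra=1):
    """Return (significand, exponent_increment) of 1.mx + 2^-delta * 1.my,
    rounded to nearest-even with P fraction bits."""
    x = ((1 << P) | mx) << extra               # 1.mx widened by `extra` bits
    y, sticky = align((1 << P) | my, delta, extra)
    s = x + y                                  # same-sign add: no cancellation
    exp_inc = 0
    if s >> (P + 1 + extra):                   # carry out of the leading bit:
        sticky |= s & 1                        # the bit shifted out joins the sticky
        s >>= 1                                # normalize with a single right shift
        exp_inc = 1
    tail = s & ((1 << extra) - 1)              # the bit(s) below the result ulp
    s >>= extra
    half = 1 << (extra - 1)
    if tail > half or (tail == half and (sticky or (s & 1))):
        s += 1                                 # round up (ties to even)
        if s >> (P + 1):                       # rounding overflowed to 2.0
            s >>= 1
            exp_inc += 1
    return s, exp_inc

# Tiny check: 1.11...1 + 2^-1 * 1.11...1 rounds to (1.5 - 2^-23) * 2,
# i.e. significand 0xBFFFFF with an exponent increment of 1.
print(add_and_round((1 << P) - 1, (1 << P) - 1, 1))   # -> (12582911, 1)
```

In this sketch I only ever keep one extra bit plus the sticky, which is exactly the assumption I would like to confirm (or see refuted), and I do not see yet how it extends to the three-operand sum \$t\$.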

  • I'm voting to close this question as off-topic because this is computer science. (Commented Sep 8, 2016 at 12:17)
  • It's hardware design... it's not computer science. (Commented Sep 8, 2016 at 12:19)
