Timeline for Fixed point scaling; float -> Q31 -> float
Current License: CC BY-SA 4.0
19 events
| when toggle format | what | by | license | comment | |
|---|---|---|---|---|---|
| Dec 19, 2022 at 8:58 | comment | added | Zitrax | @robertbristow-johnson, I don't expect it to fit in a Q31, that's why I had the downscale/upscale before and after the Q31 use. But Hilmar explained the issue with that in the currently accepted answer. | |
| Dec 18, 2022 at 18:22 | answer | added | Dan Boschen | timeline score: 3 | |
| Dec 18, 2022 at 15:04 | comment | added | robert bristow-johnson | But Q1.31 has a range of -1 to +0.99999. How are you going to fit 60.0 into that? You don't want Q1.31 to contain a number with quantitative value of 60. If it's signed, you need at least 7 bits left of the binary point. | |
| Dec 18, 2022 at 9:37 | comment | added | Zitrax | @robertbristow-johnson The shifting is to get back to a Q31 number, i.e. Q31*Q31->Q31, I think its the standard way of doing it. Its indeed done temporarily int64. | |
| Dec 18, 2022 at 0:50 | vote | accept | Zitrax | ||
| Dec 17, 2022 at 23:37 | answer | added | robert bristow-johnson | timeline score: 1 | |
| Dec 17, 2022 at 22:57 | comment | added | robert bristow-johnson | Multiply in Q31: 2576980*4294967 >> 31 = 5152 ...... Why are you shifting right 31 bits? And you know that right-shifting loses precision, right? There is truncation. Are you rounding in that truncation? Also, have you considered that your intermediate value from multiplication might be too big for a 32-bit word? You might need to do that in int64_t. | |
| Dec 17, 2022 at 21:00 | history | tweeted | twitter.com/StackSignals/status/1604219899605286916 | ||
| Dec 17, 2022 at 20:33 | history | became hot network question | |||
| Dec 17, 2022 at 19:28 | answer | added | TimWescott | timeline score: 7 | |
| Dec 17, 2022 at 19:22 | comment | added | robert bristow-johnson | "Q31" (which I would call "Q1.31") is not the only way to do 32-bit fixed point. If you have a range up to ±64.0 (or -64.0 to +63.999999999), then the fixed-point format that you want is Q7.25. 25 bits to the right of the binary point. This means that, in a language like C, your integer representation is $2^{25}$ times larger than the value it represents. | |
| Dec 17, 2022 at 18:57 | history | edited | Zitrax | CC BY-SA 4.0 | correct mistake |
| Dec 17, 2022 at 18:51 | comment | added | Zitrax | @TimWescott I updated with all the steps and the final result which is far from 60. Either my Downscale or Upscale functions are wrong or it's not possible in general to take it back to the original range and expect matching results? | |
| Dec 17, 2022 at 18:49 | history | edited | Zitrax | CC BY-SA 4.0 | added 585 characters in body |
| Dec 17, 2022 at 16:56 | comment | added | TimWescott | It would be helpful if you would edit your question with this information. | |
| Dec 17, 2022 at 15:42 | comment | added | TimWescott | "I do not end up with 60" -- what do you end up with? if you end up with $60 \pm \epsilon$, and $\epsilon$ is significantly smaller than the errors caused by noise in your input data, then you're close enough. If you end up with 2 or 200, then you have problems. | |
| Dec 17, 2022 at 14:52 | answer | added | Hilmar | timeline score: 9 | |
| S Dec 17, 2022 at 12:33 | review | First questions | |||
| Dec 17, 2022 at 12:37 | |||||
| S Dec 17, 2022 at 12:33 | history | asked | Zitrax | CC BY-SA 4.0 |