
I have a list of lists of floats, like this:

u = [[1.2, 1.534, 23.5, ...], [0.2, 11.5, 3.3223, ...], ...] 

Using Python to calculate a new list (height and width are the list's dimensions; u2 is a list of lists of floats initialised to 0.0):

for time in xrange(start, stop):
    for i in xrange(1, height-1):
        for j in xrange(1, width-1):
            u2[i][j] = u[i][j-1] + u[i-1][j] - time * (u[i][j+1] / u[i+1][j])
    u = deepcopy(u2)

As expected, this produces a new list of lists of floats.

However, transferring this to Numpy, with a simple:

un = array(u) 

Then using the same kind of loop (u2 being an array of zeroes this time):

for time in xrange(start, stop):
    for i in xrange(1, height-1):
        for j in xrange(1, width-1):
            u2[i][j] = un[i][j-1] + un[i-1][j] - time * (un[i][j+1] / un[i+1][j])
    un = u2

... produces the same results as the pure Python implementation as long as height, width and the time range are all small, but increasingly different results as these variables are set higher and higher.

  • Is there a way to prevent this build-up of float-inaccuracy?

(This is not real code, just me fiddling around to understand how numbers are treated in Python and Numpy, so any suggestions regarding vectorization or other Numpy-efficiency stuff are off-topic.)
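For what it's worth, Python floats and NumPy's default float64 are both IEEE-754 doubles, so a single arithmetic step gives bit-identical results in both. A minimal check with made-up values (Python 3 syntax here, unlike the xrange code above):

```python
import numpy as np

a, b = 1.534, 23.5  # arbitrary sample values

# Same shape of expression as the loop body, once in pure Python...
py = a + b - 3 * (a / b)
# ...and once entirely in NumPy float64 scalars.
npy = np.float64(a) + np.float64(b) - np.float64(3) * (np.float64(a) / np.float64(b))

print(py == npy)  # True: both are IEEE-754 double precision
```

So a divergence that grows with the loop bounds points at the code, not at the number types themselves.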

  • Possible duplicate of Floating point math in python / numpy not reproducible across machines Commented Oct 26, 2015 at 9:26
  • Thanks. But is the difference between floats in Numpy and pure Python equal to the difference of floats on other platforms? What is the best practice to avoid a build-up of inaccuracy between results from calculations in pure Python vs results from calculations in Numpy? Commented Oct 26, 2015 at 10:12
  • 1
    can you please specify python/numpy version and u2.dtype ? Commented Oct 26, 2015 at 10:25
  • 2.7.10 and numpy 1.10.1. u2.dtype is float64. Commented Oct 26, 2015 at 11:17
  • @MathiasEttinger I'm not convinced that this is a duplicate - that question is about differences in floating point results between the same code run on different machines, whereas the OP is asking about two specific Python and numpy implementations that are (presumably?) being executed on the same machine. Commented Oct 26, 2015 at 11:58

1 Answer


At first glance the problem seems to be un = u2. This creates a reference to u2 rather than a copy, so from the second time step onward un and u2 are the same array, and every write to u2[i][j] also modifies un. This gives different results from the pure Python version, since the new value at u2[i][j] depends on un[i][j-1] and un[i-1][j], which have already been updated within the same time step.

Try un = u2.copy() to force a copy instead.
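To see why the assignment matters, here is a minimal sketch with a made-up 3x3 array:

```python
import numpy as np

u2 = np.zeros((3, 3))

un = u2          # plain assignment: un is another name for the SAME array
u2[1, 1] = 5.0
print(un[1, 1])  # 5.0 -- the "old" array sees the new value

un = u2.copy()   # independent snapshot of u2's current contents
u2[1, 1] = 7.0
print(un[1, 1])  # 5.0 -- the copy is unaffected by later writes
```

The pure Python version avoids this by construction, because deepcopy(u2) always produces an independent list of lists.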


6 Comments

Still different. The inaccuracies exist even when using a vectorized calculation.
Can you turn your code into a reproducible example?
Will post when I'm back in front of my computer. Also, is using u2 = u2[i][j] + .... in the loop body alright (or as bad as un = u2 after the loop)?
I assume you mean u2[i][j] = u2[i][j-1] + ...? Yes, that would be "as bad" as assigning un = u2 outside the loop. In your Python code, for each iteration over time the updated values in u2 depend only on the "untainted" values from the previous time step that are stored separately in u. In your numpy code you effectively have a single array that is constantly being updated inside your inner loops.
Like ali_m said, you must use un = u2.copy() to get the same calculation. Moreover, u[i][j+1] / u[i+1][j] puts nan values in your data after the second loop. But np.nan == np.nan evaluates to False, so un and u seem to differ even with the same data. It's difficult to say more without concrete data and a measure of the inaccuracy.
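The NaN pitfall mentioned in the comment above can be seen directly (the equal_nan keyword of np.array_equal needs NumPy >= 1.19):

```python
import numpy as np

print(np.nan == np.nan)  # False: NaN compares unequal to everything, itself included

a = np.array([1.0, np.nan])
b = a.copy()
print((a == b).all())                        # False, solely because of the NaN
print(np.array_equal(a, b, equal_nan=True))  # True once NaNs are treated as equal
```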
