
What am I missing? I tried appending .round(3) to the end of the API call, but it doesn't work, and it also doesn't work as a separate call. The data type of all columns is numpy.float32.

>>> summary_data = api._get_data(units=list(units.id), downsample=downsample,
...                              table='summary_tb', db=db).astype(np.float32)
>>> summary_data.head()
     id  asset_id  cycle   hs     alt     Mach        TRA          T2
0  10.0       1.0    1.0  1.0  3081.0  0.37945  70.399887  522.302124
1  20.0       1.0    1.0  1.0  3153.0  0.38449  70.575668  522.428162
2  30.0       1.0    1.0  1.0  3229.0  0.39079  70.575668  522.645020
3  40.0       1.0    1.0  1.0  3305.0  0.39438  70.575668  522.651184
4  50.0       1.0    1.0  1.0  3393.0  0.39690  70.663559  522.530090
>>> summary_data = summary_data.round(3)
>>> summary_data.head()
     id  asset_id  cycle   hs     alt   Mach        TRA          T2
0  10.0       1.0    1.0  1.0  3081.0  0.379  70.400002  522.302002
1  20.0       1.0    1.0  1.0  3153.0  0.384  70.575996  522.427979
2  30.0       1.0    1.0  1.0  3229.0  0.391  70.575996  522.645020
3  40.0       1.0    1.0  1.0  3305.0  0.394  70.575996  522.651001
4  50.0       1.0    1.0  1.0  3393.0  0.397  70.664001  522.530029
>>> print(type(summary_data))
pandas.core.frame.DataFrame
>>> print([type(summary_data[col][0]) for col in summary_data.columns])
[numpy.float32, numpy.float32, numpy.float32, numpy.float32, numpy.float32, numpy.float32, numpy.float32, numpy.float32]

It does in fact look like some form of rounding is taking place, but something weird is happening. Thanks in advance.

EDIT

The point of this is to use 32-bit floating-point numbers, not 64-bit. I have since used pd.set_option('precision', 3), but according to the documentation this only affects the display, not the underlying value. As mentioned in a comment below, I am trying to minimize the number of atomic operations. Calculations on 70.575996 vs 70.576 are more expensive, and this is the issue I am trying to tackle. Thanks in advance.
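For context, a minimal sketch (using only numpy, not the API from the question) of why .round(3) appears to "not work": the decimal 70.576 has no exact binary representation, and the nearest float32 sits further from it than the nearest float64 does, so the drift becomes visible around the 6th decimal digit:

```python
import numpy as np

# 70.576 has no exact binary representation; float64 carries
# ~15-16 significant decimal digits, float32 only ~7.
x64 = np.float64(70.576)
x32 = np.float32(70.576)

print(f"{x64:.10f}")  # float64 still displays as 70.576 at this precision
print(f"{x32:.10f}")  # float32 shows drift at the 6th decimal place

# round(3) itself works -- the rounded result is simply snapped
# back to the nearest representable float32 on storage:
r = np.round(np.float32(70.575668), 3)
print(r.dtype, f"{r:.6f}")
```

In other words, there is no float32 whose value is exactly 70.576, so no amount of rounding can store one.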


1 Answer


Hmm, this might be a floating-point issue. You could change the dtype to float instead of np.float32:

>>> summary_data.astype(float).round(3)
     id  asset_id  cycle   hs     alt   Mach     TRA       T2
0  10.0       1.0    1.0  1.0  3081.0  0.379  70.400  522.302
1  20.0       1.0    1.0  1.0  3153.0  0.384  70.576  522.428
2  30.0       1.0    1.0  1.0  3229.0  0.391  70.576  522.645
3  40.0       1.0    1.0  1.0  3305.0  0.394  70.576  522.651
4  50.0       1.0    1.0  1.0  3393.0  0.397  70.664  522.530

If you change it back to np.float32 afterwards, it re-exhibits the issue:

>>> summary_data.astype(float).round(3).astype(np.float32)
     id  asset_id  cycle   hs     alt   Mach        TRA          T2
0  10.0       1.0    1.0  1.0  3081.0  0.379  70.400002  522.302002
1  20.0       1.0    1.0  1.0  3153.0  0.384  70.575996  522.427979
2  30.0       1.0    1.0  1.0  3229.0  0.391  70.575996  522.645020
3  40.0       1.0    1.0  1.0  3305.0  0.394  70.575996  522.651001
4  50.0       1.0    1.0  1.0  3393.0  0.397  70.664001  522.530029

4 Comments

Thanks for your answer. From what I know, float is a C double type and uses 64 bytes of memory, and I'm trying to use 32 bytes. The second link provided above by the moderator "seems" to solve my problem, but I have further questions regarding computational expense, which is what I am trying to reduce. Does 70.575996 result in more atomic operations than 70.576? Yes, it does. But according to pandas, pd.set_option('precision', 3) only affects the display, not the underlying value. So my question, it would seem, is still valid, in a sense.
Okay, sounds interesting. If your question is unique, then clarify that in the question body, and we can probably get it reopened. (By the way, Henry Ecker is not a moderator - he's just a normal user who has the gold Python badge, which allows him to close questions with the Python tag as duplicates of others with that tag, without needing other users' approval, because he's assumed to be experienced enough :)
bits not bytes ^^
Looks like the question was opened back up, thanks! I'm coming back to this now in my project and remembered why I needed it. Any updates?
