Detecting unexpected type conversion in Python

Question

I have a complex piece of Python code that uses 32-bit numerical values (to save memory and bandwidth). But I later discovered that many of these 32-bit numbers were implicitly converted to 64-bit in some high-level functions. For example, the sum function, by default, can turn a 32-bit array into a 64-bit number.

    In [152]: x32
    Out[152]:
    array([  0.      ,   1.010101,   2.020202,   3.030303,   4.040404,
             5.050505,   6.060606,   7.070707,   8.080808,   9.090909,
            10.10101 ,  11.111111,  12.121212,  13.131313,  14.141414,
            15.151515,  16.161615,  17.171717,  18.181818,  19.19192 ,
            20.20202 ,  21.212122,  22.222221,  23.232323,  24.242424,
            25.252525,  26.262627,  27.272728,  28.282827,  29.292929,
            30.30303 ,  31.313131,  32.32323 ,  33.333332,  34.343433,
            35.353535,  36.363636,  37.373737,  38.38384 ,  39.39394 ,
            40.40404 ,  41.414143,  42.424244,  43.434345,  44.444443,
            45.454544,  46.464645,  47.474747,  48.484848,  49.49495 ,
            50.50505 ,  51.515152,  52.525253,  53.535355,  54.545456,
            55.555557,  56.565655,  57.575756,  58.585857,  59.59596 ,
            60.60606 ,  61.61616 ,  62.626263,  63.636364,  64.64646 ,
            65.65656 ,  66.666664,  67.676765,  68.68687 ,  69.69697 ,
            70.70707 ,  71.71717 ,  72.72727 ,  73.73737 ,  74.747475,
            75.757576,  76.76768 ,  77.77778 ,  78.78788 ,  79.79798 ,
            80.80808 ,  81.818184,  82.828285,  83.83839 ,  84.84849 ,
            85.85859 ,  86.86869 ,  87.878784,  88.888885,  89.89899 ,
            90.90909 ,  91.91919 ,  92.92929 ,  93.93939 ,  94.94949 ,
            95.959595,  96.969696,  97.9798  ,  98.9899  , 100.      ],
          dtype=float32)

    In [153]: sum(x32)
    Out[153]: 4999.999972701073

    In [154]: type(sum(x32))
    Out[154]: numpy.float64

In this case, sum(x32) is presumably 64-bit because of sum's default accumulator, 0, as shown here:

    In [156]: type(sum(x32, start=np.float32(0)))
    Out[156]: numpy.float32
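For comparison, here is a minimal sketch of a few ways to keep the accumulator 32-bit. It assumes x32 can be rebuilt with np.linspace (a guess at how the array above was made), and the integer case is included only to show that np.sum can widen as well. What the plain sum(x32) returns can depend on the NumPy version, so that case is left out.

```python
import numpy as np

# Assumed stand-in for the x32 array shown above.
x32 = np.linspace(0, 100, 100, dtype=np.float32)

# Built-in sum with a float32 start value keeps the accumulator 32-bit.
total_builtin = sum(x32, start=np.float32(0))

# ndarray.sum / np.sum keep a float input's dtype by default;
# dtype= makes the accumulator type explicit.
total_method = x32.sum()
total_explicit = np.sum(x32, dtype=np.float32)

# Integer inputs narrower than the platform integer are widened by np.sum
# unless dtype= is given, so this is another place to watch.
i32 = np.arange(10, dtype=np.int32)
total_int_default = i32.sum()
total_int_explicit = i32.sum(dtype=np.int32)

print(type(total_builtin), total_method.dtype, total_explicit.dtype)
print(total_int_default.dtype, total_int_explicit.dtype)
```

Passing dtype= explicitly is the more defensive choice, at the cost of a less precise (and, for integers, overflow-prone) accumulator.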

Above, I use the sum function as an example to show that type conversion is everywhere when I use 32-bit inputs. I have changed the sum part to avoid this implicit type conversion. But I would like to know whether there are any other unexpected 32-bit -> 64-bit conversions inside my library calls. Is there a general programming-language solution for monitoring any possible type conversion? For example, can I run my Python code with some special debugging tool so that any type conversion from 32-bit to 64-bit triggers an alarm or is logged?

You could use np.sum instead, keeping in mind that numpy will not report overflow and will not give the right answer if you exceed int32.
– jwal, Commented Nov 21, 2022 at 8:51
Thanks. I used the sum function as an example to show that type conversion is everywhere when I use 32-bit inputs. So the question is: how can we monitor these type conversions systematically?
– zell, Commented Nov 21, 2022 at 10:15
Why not save your initial dtype, carry out the operation, save the resulting dtype and then assert that they are equal? You could make a decorator to assist too. Not sure of the best way to decorate stackoverflow.com/questions/22600365/…
– jtlz2, Commented Nov 21, 2022 at 10:42

jtlz2 · Accepted Answer · 2022-11-22 09:01:51Z

I think you are nearly there to be honest.

    original_dtype = x32.dtype
    new_dtype = sum(x32, start=np.float32(0)).dtype
    assert new_dtype == original_dtype, f"dtypes differ, {new_dtype=} != {original_dtype=}"

To use this method globally, you can write something like:

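    def type_checker_func(func, input_array, *args, **kwargs):
        # Remember the input dtype so it can be compared with the result's dtype.
        dtype_orig = input_array.dtype
        # Forward keyword arguments too, so that start=... reaches func.
        result = func(input_array, *args, **kwargs)
        dtype_new = result.dtype
        if dtype_new != dtype_orig:
            print(f"dtypes differ, {dtype_new=} != {dtype_orig=}")
        return result

    my_answer = type_checker_func(sum, x32, start=np.float32(0))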

But I am not sure how you would best handle multiple return values (consider np.histogram), all sorts of args, etc. etc.
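One possible way to cope with multiple return values and arbitrary arguments is to walk the arguments and the result recursively and compare item sizes rather than exact dtypes. The following is only a sketch under that assumption: warn_on_widening, _dtypes and widen_pair are hypothetical names, and comparing itemsize will also flag things you may not care about (for example integer counts returned by np.histogram).

```python
import functools
import warnings

import numpy as np


def _dtypes(obj):
    """Yield dtypes of NumPy arrays/scalars in obj, looking inside tuples, lists and dicts."""
    if isinstance(obj, (np.ndarray, np.generic)):
        yield obj.dtype
    elif isinstance(obj, (tuple, list)):
        for item in obj:
            yield from _dtypes(item)
    elif isinstance(obj, dict):
        for item in obj.values():
            yield from _dtypes(item)


def warn_on_widening(func):
    """Warn when any returned array/scalar is wider than the widest NumPy input."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        in_sizes = [dt.itemsize for dt in _dtypes((args, kwargs))]
        result = func(*args, **kwargs)
        if in_sizes:
            widest_in = max(in_sizes)
            wider = [str(dt) for dt in _dtypes(result) if dt.itemsize > widest_in]
            if wider:
                name = getattr(func, "__name__", repr(func))
                warnings.warn(
                    f"{name}: result dtypes {wider} are wider than the inputs",
                    stacklevel=2,
                )
        return result
    return wrapper


@warn_on_widening
def widen_pair(a):
    # Hypothetical offender that returns two values, one of them widened.
    return a.astype(np.float64), a


x32 = np.linspace(0, 100, 100, dtype=np.float32)  # assumed stand-in for x32
first, second = widen_pair(x32)                   # emits one warning

# Existing functions can be wrapped without editing them.
checked_histogram = warn_on_widening(np.histogram)
counts, edges = checked_histogram(x32, bins=10)
```

Using warnings instead of print means the checks can be turned into hard errors during testing with the standard warnings filters.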

I am also not sure how to invoke the type_checker_func globally / implicitly (if only for numpy fns).
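One idea I have not tried on a real code base is to let the interpreter call the check for you via sys.setprofile, which fires a "return" event with the return value for every Python-level function. The sketch below rests on several assumptions: log_widening, _itemsizes and _argument_values are hypothetical helpers, only NumPy arrays and scalars are recognised (not TensorFlow tensors), arguments are read from the frame's locals at return time (so rebound arguments can mislead it), and functions implemented in C are invisible except through the Python function that called them. Expect it to be slow.

```python
import contextlib
import inspect
import sys

import numpy as np


def _itemsizes(obj, depth=0):
    """Yield itemsizes of NumPy arrays/scalars in obj, descending a few container levels."""
    if isinstance(obj, (np.ndarray, np.generic)):
        yield obj.dtype.itemsize
    elif depth < 3 and isinstance(obj, (tuple, list)):
        for item in obj:
            yield from _itemsizes(item, depth + 1)
    elif depth < 3 and isinstance(obj, dict):
        for item in obj.values():
            yield from _itemsizes(item, depth + 1)


def _argument_values(frame):
    """Return the current values of the frame's parameters (including *args/**kwargs)."""
    code = frame.f_code
    n = code.co_argcount + code.co_kwonlyargcount
    n += bool(code.co_flags & inspect.CO_VARARGS)
    n += bool(code.co_flags & inspect.CO_VARKEYWORDS)
    local_vars = frame.f_locals
    return [local_vars[name] for name in code.co_varnames[:n] if name in local_vars]


@contextlib.contextmanager
def log_widening():
    """Collect (function name, file, line) for Python functions returning wider dtypes."""
    log = []

    def profiler(frame, event, arg):
        if event != "return":
            return
        out_sizes = list(_itemsizes(arg))
        if not out_sizes:
            return
        in_sizes = list(_itemsizes(_argument_values(frame)))
        if in_sizes and max(out_sizes) > max(in_sizes):
            code = frame.f_code
            log.append((code.co_name, code.co_filename, frame.f_lineno))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        yield log
    finally:
        sys.setprofile(previous)  # restore whatever profiler was active before


def widen(a):
    # Hypothetical library function that silently widens its input.
    return a.astype(np.float64)


x32 = np.linspace(0, 100, 100, dtype=np.float32)  # assumed stand-in for x32
with log_widening() as log:
    widen(x32)

for name, filename, lineno in log:
    print(f"{name} ({filename}:{lineno}) returned a wider dtype than its arguments")
```

Note that sys.setprofile is per-thread, so code running in worker threads would need threading.setprofile as well.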

Update: I posted a github question asking about doing this for every function call using line_profiler - see https://github.com/pyutils/line_profiler/issues/188 - fingers crossed.

Thanks. The problem is how to invoke the type_checker_func globally and implicitly for all functions. My code consists of high-level tensorflow code. It does not look easy to force the use of type_checker_func everywhere internally.
@zell I think you could try hacking pypi.org/project/line-profiler or pypi.org/project/memory-profiler [no longer maintained in the case of the latter :( ]. Maybe post some issues on their respective github pages and see if they can assist?
@zell I opened a line_profiler github issue - see here github.com/pyutils/line_profiler/issues/188
