Why do I get bad timing results when I put "%timeit" inside a function with IPython/Jupyter?

Question

I am learning how to use the %timeit magic command in IPython (specifically, in a Jupyter notebook with Python 3). If I try to time the NumPy sorting function for arrays of various sizes:

    n = 10
    for i in range(n):
        arr = np.random.rand(2**(i+10))
        %timeit -n 2 np.sort(arr)

Then I get a sequence of roughly increasing times, like I would expect.

If I try to pack this code into a function, however, I do not get the output I expect: all of the times are about the same!

    def my_func(n):
        for i in range(n):
            arr = np.random.rand(2**(i+10))
            %timeit -n 10 np.sort(arr)

    my_func(10)

Please see the Jupyter notebook showing the results here.

Can anyone explain either what I am doing wrong, or what I am misunderstanding?

%timeit is special syntax that Jupyter supports; it's not actually valid Python code. So I would expect that Jupyter parses this separately and it has an effect on the whole executed command. Try using the timeit module directly instead.
– poke, Commented Sep 3, 2017 at 17:04

In the second case, you're just repeatedly sorting the global arr you made in the first cell, not the local arr. Change the local variable name to something else and you'll see what %timeit is complaining about.
– pvg, Commented Sep 3, 2017 at 17:30

MSeifert · Accepted Answer · 2017-09-03 17:42:59Z

%timeit isn't supposed to work correctly inside functions (currently). If you start a fresh notebook (or restart yours) and only use:

    import numpy as np

    def my_func(n):
        for i in range(n):
            arr = np.random.rand(2**(i+10))
            %timeit -n 10 np.sort(arr)

    my_func(10)

It will throw a NameError:

NameError: name 'arr' is not defined

That's because %timeit only inspects global variables, not local ones, so it ignores the arr = np.random.rand(2**(i+10)) assigned inside your function.
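The same lookup behaviour can be reproduced in plain Python, outside IPython. This is only an analogy, not IPython's actual implementation: it assumes the timed statement is handed over as a string and run against the global namespace, which is what the standard timeit module does when given globals=globals(). The names my_func and seen are made up for the illustration.

```python
import timeit

arr = [1, 2, 3]  # global arr, like the one left over from an earlier cell


def my_func():
    arr = list(range(1000))  # local arr, shadows the global inside this function
    seen = []
    # The statement is a string, so it is compiled separately and its names are
    # resolved in the namespace passed via `globals`; the local arr is invisible.
    namespace = {**globals(), "seen": seen}
    timeit.timeit("seen.append(len(arr))", globals=namespace, number=1)
    # First value: length of the arr the timed statement saw.
    # Second value: length of the local arr.
    return seen[0], len(arr)


result = my_func()  # (3, 1000): the timed statement used the global arr
```

If the global arr did not exist, the timed statement would fail with a NameError, just like the fresh-notebook case above.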

If you use this code it will be obvious:

    import numpy as np

    arr = np.array([1, 2, 3])

    def my_func(n):
        for i in range(n):
            arr = np.random.rand(2**(i+10))
            %timeit -n 2 -r 1 print(arr)

    my_func(10)

which prints:

    [1 2 3]
    [1 2 3]
    3.44 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    670 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    2.04 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    451 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    906 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    1.01 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    767 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    890 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    1.28 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)
    [1 2 3]
    [1 2 3]
    919 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)

So in your case it always found the last arr left over from your non-function runs, which was global. That also explains why the times for the function were roughly identical: it sorted the same arr every time.
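One possible workaround, along the lines of the suggestion in the comments to use the timeit module directly, is sketched below. It is not part of the original answer. The helper name time_sort and its parameters (make_array, sort, number) are hypothetical; the only library call it relies on is timeit.timeit from the standard library. Passing a callable instead of a string lets the timed code close over the local arr.

```python
import timeit


def time_sort(make_array, sort, n, number=10):
    """Time sort(arr) for arrays of 2**10, 2**11, ... elements (n sizes in total).

    make_array(size) builds the input and sort(arr) is the call being timed;
    both are supplied by the caller (hypothetical parameters for this sketch).
    Returns a list of (size, seconds_per_call) tuples.
    """
    results = []
    for i in range(n):
        size = 2 ** (i + 10)
        arr = make_array(size)  # local variable, never placed in the global namespace
        # A lambda closes over the local arr, so no global name lookup is involved.
        # The extra function call adds a small, roughly constant overhead per loop.
        total = timeit.timeit(lambda: sort(arr), number=number)
        results.append((size, total / number))
    return results
```

With NumPy imported as np, as in the question, this could be called as time_sort(np.random.rand, np.sort, 10). The reported times should then grow with the array size, because each measurement uses the array created in that iteration.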
