```python
# Iterative factorial function to safely compute very large factorials
# without hitting Python's recursion limit. Recursive functions have a
# maximum depth, usually 1000, and factorial(2000) would exceed this,
# causing RecursionError. Using iteration, we multiply numbers from
# 1 to n in a loop, avoiding recursion and stack overflow issues.
def factorial_iterative(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

# Compute factorial of a small number
print("Factorial of 5:", factorial_iterative(5))  # Output: 120

# Compute factorial of a large number
print("Factorial of 2000 has", len(str(factorial_iterative(2000))), "digits")  # Avoid printing the huge number
```
  • Describe the problem in the question's body, not in the title, and not in comments in the code. Commented Sep 21 at 16:20
  • What does your question body have to do with your title? You're not even using recursion. Commented Sep 21 at 19:55

2 Answers


how is the limit decided?

You decide the limit, whatever makes sense for your use case, by calling sys.setrecursionlimit(n).
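A minimal sketch of adjusting the limit (the value 5000 is an arbitrary choice for illustration):

```python
import sys

print(sys.getrecursionlimit())   # CPython's default is 1000

sys.setrecursionlimit(5000)      # raise it for a deeper algorithm
print(sys.getrecursionlimit())   # → 5000
```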

When we RTFD we see that

The highest possible limit is platform-dependent. A user may need to set the limit higher when they have a program that requires deep recursion and a platform that supports a higher limit.

The default setting strikes a good balance between

  • supporting most recursive use cases that are "reasonable", and
  • failing "quickly" with a good diagnostic when the app failed to properly handle the base case

Recall the work of Church, Turing, and Gödel. The halting problem tells us that our interpreter cannot, in general, predict whether a python function will finish within the next K steps. But if we've already done a thousand recursive calls and have not yet bottomed out with a base case, then it's a fair heuristic to say that the next million recursive calls are unlikely to make a difference. So just give up by reporting "infinite recursion detected".
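A quick sketch of that heuristic in action, using a deliberately buggy function (hypothetical, not from the question):

```python
import sys

def broken(n):
    return broken(n + 1)   # bug: no base case, never bottoms out

try:
    broken(0)
except RecursionError:
    # The interpreter gives up near the configured limit instead of
    # exhausting memory.
    print("infinite recursion detected near depth", sys.getrecursionlimit())
```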


Why does python limit recursion depth?

Here are two moderately common buggy situations that programmers tend to write:

```python
while True:
    i += 1   # notice that we don't conditionally `break` out
```

```python
def foo():
    foo()   # notice there's no decreasing argument and no conditional
```

Both loop forever and accomplish nothing. The iterative example is mostly harmless.

OTOH the infinite recursion example is allocating another stack frame on each call, so eventually we exhaust virtual memory, malloc() fails, and the interpreter bails out. But on an interactive machine we're likely to see pager thrashing against the swap partition, which can make using things like a windowed GUI unpleasantly slow. And just before the interpreter exits, the low memory condition can cause fork() / malloc() calls from crond and other processes to fail.

Some small platforms which python runs on don't implement virtual memory, and provide only a fixed amount of storage for stack space. Debugging tends to be much simpler when the interpreter avoids bumping up against that system limit.


Unlike the Scheme language, python doesn't do TCO. And GvR likes it that way.

BTW the 3.14 interpreter dispatches opcodes with an interesting technique, but the release notes explain:

This is not to be confused with tail call optimization of Python functions, which is currently not implemented in CPython.
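A tiny demonstration of that fact (my example): the call below is in tail position, yet CPython still consumes a stack frame per call, so under the default limit it raises RecursionError instead of running in constant stack space.

```python
def countdown(n):
    if n == 0:
        return "done"
    return countdown(n - 1)   # tail call, but not optimized away

try:
    countdown(10**6)
except RecursionError:
    print("no tail-call optimization here")
```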


I dove into @jsbueno's remarks and ran some tests on my Apple M4 MacBook Pro.

```
3.14.0rc2   30,000,000   1.274513 sec;  23,538,402.4 count/sec
3.13.7      30,000,000   1.282322 sec;  23,395,061.2 count/sec
3.12.8      30,000,000   1.268134 sec;  23,656,808.2 count/sec
3.11.11     30,000,000   1.246084 sec;  24,075,419.4 count/sec
3.10.16         42,780   0.008580 sec;   4,986,031.8 count/sec
3.9.23          72,287   0.013965 sec;   5,176,249.8 count/sec
3.8.20          72,287   0.016485 sec;   4,384,959.7 count/sec
```

The first line says that a modern interpreter can recursively count to 30 M ("infinite") in one and a quarter seconds, a rate of roughly 24 M integer increments per second. The iterative rate is closer to 75 M inc/sec.
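The exact benchmark script isn't shown here; a plausible reconstruction of the recursive counter (my sketch, not the author's code) looks like this, with small numbers so it runs safely on any interpreter:

```python
import sys

# Hypothetical reconstruction of the recursive counting benchmark:
# each call adds one and recurses until it reaches the ceiling.
sys.setrecursionlimit(20_000)

def count(i, ceil):
    if i < ceil:
        return count(i + 1, ceil)
    return i

print(count(0, 10_000))   # → 10000
```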

I found 30 M convenient because during my edit-debug cycle it runs in about a second; 300 M works fine, in 15 seconds. Interestingly, the rate then goes down to about 20 M inc/sec, as the stack operations are forced to make more L3 cache accesses. The largest count I tried was 1 B, which completes in 160 sec at a rate of 6 M inc/sec. Note that above 1 B the increment operation becomes slightly more expensive, as we allocate another 4 bytes to store the count.

Note that hitting CTRL-C asks for a stack trace, which is expensive to reconstruct, leading to a very noticeable delay. I found it convenient to use CTRL-\ instead, as control-backslash exits immediately without showing a backtrace. It's the difference between sending the INT versus the QUIT signal.

If I ask interpreter 3.8 to count to 73 K or greater, it will blow the C stack and die with signal 11 segfault. Same result with 3.9, and then in 3.10 it looks like we're pushing another field or two onto the stack, so the feasible limit is reduced to ~ 43 K.


Naïve use of numba's @jit leads to the compiler optimizing away the loop and immediately returning n. If I ask it to do two things, like increment and store the result into the first element of a module-level list, then I see iterative rates of 3.4 B inc/sec. If I entirely replace i with that initial list element then the generated code is less efficient and we see 630 M inc/sec. In any event we can only count to 2 B, as we're using int32_t beneath the hood.

In the recursive case, @jit lets us count as high as 174 K before blowing the C stack and seg faulting. The rate is 130 M inc/sec.


Unlike uint64_t or IEEE floats, python integers faithfully implement a mathematician's notion of the number line. Python will exactly record any integer you care to name, much like Scheme's BigNums will.
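A small illustration of that exactness (my example): a 64-bit float has only 53 bits of mantissa and silently drops low-order bits, while a python int keeps every digit.

```python
big = 10**18 + 1
print(big - 10**18)        # exact integer arithmetic keeps the +1 → 1
print(float(big) - 1e18)   # the float representation has lost it → 0.0
```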

It is straightforward to see what int values cause python to allocate additional storage. We start out being able to represent signed integers up to 2**30, and keep allocating storage for another 30 bits each time the current allocation is on the verge of overflow (wraparound).

```python
import sys

n = 1
for i in range(91):
    print(sys.getsizeof(n), f"{i:6d}\t{n:,d}")
    n *= 2
```
```
28      0   1
28      1   2
28      2   4
28      3   8
28      4   16
28      5   32
...
28     29   536,870,912
32     30   1,073,741,824
...
32     59   576,460,752,303,423,488
36     60   1,152,921,504,606,846,976
...
36     89   618,970,019,642,690,137,449,562,112
40     90   1,237,940,039,285,380,274,899,124,224
```

1 Comment

It may be noted that in the case of the second "common buggy situation" above, python 3.12 prints an erroneous message in the stack trace: "[Previous line repeated 997 more times]" even after you have successfully increased the recursion depth. This message is correct only for the default recursion depth of 1000. I recommend catching the RecursionError exception with a try/except structure to avoid any confusion over whether an increased recursion depth was exceeded.

Python, pre 3.11, will create an object - a real, solid, concrete, Python-accessible object called a "frame" - whenever you call a function. These are not free: they carry a lot of context, embedded dictionaries, etc. There are optimizations in place to reuse these frame objects when a function returns and another call is started, but recursion will always require new frames.

The cost for these, I believe, has been the main driver for the historic recursion limit (set at 1000 by default, and able to be changed with a sys.setrecursionlimit call). This default is still at 1000, and with Python up to version 3.10, you could raise it to about 10_000 - depending on your system specs and OS limitations, before it would crash the runtime.

From Python 3.11 on, the historic default is still 1000, but a concrete frame object is no longer created when a Python function is called (the frame is lazily created only when one tries to introspect the running context). This reduced the memory (and possibly other system resource) overheads of function calling in Python by at least two orders of magnitude.
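A small sketch of that lazy behavior (my example): a plain call allocates no concrete frame object up front, but introspecting the running context, e.g. via sys._getframe(), materializes one on demand.

```python
import sys

def probe():
    # Asking for the current frame forces CPython (3.11+) to
    # materialize the frame object it would otherwise skip creating.
    frame = sys._getframe()
    return frame.f_code.co_name

print(probe())   # → probe
```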

So, if you pick a recent Python, you can just do import sys; sys.setrecursionlimit(100_000_000) and go for a ride. Historically, most algorithms in real-world scenarios seldom need 1000 levels of recursion, and Python coders have learned over decades to use iterative approaches when possible anyway - so the default of 1000 did not change.

So here are two interactive sessions showing some settings and timings - first with Python 3.9, then with Python 3.13 - which showcase this (and a really, really dumb recursive call). IPython is the enhanced REPL which offers the %time magic to print the running time.

```
Python 3.9.21 (main, Feb 10 2025, 00:00:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.18.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import sys

In [2]: sys.setrecursionlimit(100_000_000)

In [3]: def count(i, ceil):
   ...:     if i < ceil:
   ...:         return count(i + 1, ceil)
   ...:     return i
   ...:

In [4]: %time count(0, 1_000)
CPU times: user 25 µs, sys: 1.94 ms, total: 1.97 ms
Wall time: 2.02 ms
Out[4]: 1000

In [5]: %time count(0, 15_000)
CPU times: user 15 ms, sys: 16.5 ms, total: 31.5 ms
Wall time: 32.8 ms
Out[5]: 15000

In [6]: %time count(0, 150_000)
Segmentation fault (core dumped)
```
```
(env313t) gwidion@fedora:~/tmp01$ ipython
Python 3.13.2+ experimental free-threading build (heads/3.13:fc1c9f8, Feb 17 2025, 17:58:08) [GCC 14.2.1 20250110 (Red Hat 14.2.1-7)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.27.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import sys

In [2]: sys.setrecursionlimit(100_000_000)

In [3]: def count(i, ceil):
   ...:     if i < ceil:
   ...:         return count(i + 1, ceil)
   ...:     return i
   ...:

In [5]: %time count(0, 10)
CPU times: user 22 μs, sys: 2 μs, total: 24 μs
Wall time: 34.8 μs
Out[5]: 10
...
In [11]: %time count(0, 1_000)
CPU times: user 0 ns, sys: 768 μs, total: 768 μs
Wall time: 728 μs
Out[11]: 1000

In [12]: %time count(0, 15_000)
CPU times: user 9.08 ms, sys: 2.97 ms, total: 12.1 ms
Wall time: 11.4 ms
Out[12]: 15000

In [13]: %time count(0, 150_000)
CPU times: user 80.9 ms, sys: 27.4 ms, total: 108 ms
Wall time: 110 ms
Out[13]: 150000

In [14]: %time count(0, 1_500_000)
CPU times: user 645 ms, sys: 385 ms, total: 1.03 s
Wall time: 1.05 s
Out[14]: 1500000

In [15]: %time count(0, 15_000_000)
CPU times: user 5.59 s, sys: 3 s, total: 8.59 s
Wall time: 8.77 s
Out[15]: 15000000
```

So, while there is still a tremendous overhead for such a simple run compared with what a compiled-to-native-code, non-dynamic language would do, the internal objects and data structures created have been heavily optimized nowadays.

To make this other efficiency distinction clear: C or Rust would run each of these calls in maybe fewer than 20 machine ops at full CPU clock (~3 GHz), so counting to millions and billions in those languages by "adding one" makes sense. In Python, the object being added is a Python object: the interpreter has to look up the addition operator each time, create a new int object, and so on. If one is doing raw number-crunching in Python, it is necessary to use things like numpy/pandas/polars/dask, which wrap the whole chain of simple operations in native code. When the algorithm in question involves higher-level data structures (string processing, network I/O, and the like), the inner methods of Python are already optimized to the same magnitude the equivalent C code would be, and this efficiency gap gets smaller.
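A rough standard-library illustration of the same point (my example, not this answer's benchmark): letting a builtin run the loop in C beats an explicit Python-level loop, for the same reason numpy-style vectorization pays off.

```python
import timeit

n = 100_000
# Explicit Python-level loop: one bytecode dispatch and one int object
# per iteration.
loop_t = timeit.timeit("t = 0\nfor i in range(n): t += 1",
                       globals={"n": n}, number=10)
# Same amount of adding, but the loop runs inside the C implementation
# of sum().
sum_t = timeit.timeit("sum(range(n))", globals={"n": n}, number=10)
print(f"python loop: {loop_t:.4f}s  builtin sum: {sum_t:.4f}s")
```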

3 Comments

How do you know that memory overheads of function calling in Python were reduced by at least 2 orders of magnitude? Do you have more information about that?
Are you saying that because of your experiments with your count() calls? In 3.9 that used C stack space, which is probably just a few megabytes. In 3.13 it doesn't and is instead limited by your whole memory, which is probably gigabytes. That's why you can go so much deeper there. Not because of massively reduced overheads.
What I call "reduced overheads" includes not using system resources that are otherwise limited (in this case, the C stack) - although I should be more explicit about that in the text, of course.
