2

I'm trying to read a massive sexp from file into memory, and it seems to be working out fine for smaller inputs, but on more deeply nested ones sbcl conks out with stack exhaustion. There seems to be a hard recursion limit (at 1000 functions deep) that sbcl simply cannot surpass (strangely, even when its stack size is increased). Example (code is here): make check-c works, but make check-cpp exhausts the stack as below:

INFO: Control stack guard page unprotected Control stack guard page temporarily disabled: proceed with caution Unhandled SB-KERNEL::CONTROL-STACK-EXHAUSTED in thread #<SB-THREAD:THREAD "main thread" RUNNING {10034E6DE3}>: Control stack exhausted (no more space for function call frames). This is probably due to heavily nested or infinitely recursive function calls, or a tail call that SBCL cannot or has not optimized away. PROCEED WITH CAUTION. Backtrace for: #<SB-THREAD:THREAD "main thread" RUNNING {10034E6DE3}> 0: ((LAMBDA NIL :IN SB-DEBUG::FUNCALL-WITH-DEBUG-IO-SYNTAX)) 1: (SB-IMPL::CALL-WITH-SANE-IO-SYNTAX #<CLOSURE (LAMBDA NIL :IN SB-DEBUG::FUNCALL-WITH-DEBUG-IO-SYNTAX) {100FC9006B}>) 2: (SB-IMPL::%WITH-STANDARD-IO-SYNTAX #<CLOSURE (LAMBDA NIL :IN SB-DEBUG::FUNCALL-WITH-DEBUG-IO-SYNTAX) {100FC9003B}>) ... 

Why am I using recursion, then? Actually, I'm not, but unfortunately the builtin (read) uses recursion, and that's where the stack overflow is occurring. The other option (which I've started working on) is to write an iterative version of read which relies upon the more limited syntax that I'm feeding into it from a separate program to avoid the complexity of re-implementing read (my (currently broken) attempts at that are in the lisp branch of the above repository).

However, I'd prefer a more canonical solution. Are there alternatives to the builtin read that can parse deeply nested structures by avoiding recursion?

EDIT: This appears to be an insurmountable issue with sbcl itself, not the input data. For a quick example, try running:

(for i in $(seq 1 2000); do echo -n "(" done; echo -n "2"; for i in $(seq 1 2000); do echo -n ")" done; echo) > file 

And then in sbcl:

(with-open-file (file "file" :direction :input) (read file)) 

The same failure occurs.

EDIT: Asked around on #sbcl, and apparently the control stack size really applies only to new threads, and that the stack size for the main thread is affected by a lot of other factors as well. So I tried putting the read in a separate thread. Still didn't work. Checkout this repo and run make check if you're interested.

10
  • Recursive functions are the natural way to parse recursive data formats, such as S-expressions. And because of the way that Lisp read macros work, it's practically required. Commented May 7, 2015 at 23:27
  • True, it'll be more difficult to do it iteratively, but since there doesn't appear to be a way to bypass the stack limit for recursive reads, I don't see a way to parse it correctly in a recursive fashion. The parser I'm looking for could possibly be tail-recursive, though. Commented May 7, 2015 at 23:47
  • You certainly can write a custom parser like that, especially if your input is not as general as read is required to handle. But there's nothing built-in that works that way. Commented May 7, 2015 at 23:49
  • Probably nothing builtin, yeah. I was looking for some possibly little-known external library, since parsing deeply nested sexps seems like an application that someone else would have tackled already (applications to logging structures, serialization, etc). Come to think of it, if I do write a reader like that I might as well put it into a separate library. Commented May 7, 2015 at 23:54
  • 1
    So..... why SBCL instead of a Common Lisp that actually works for this purpose???? Your simple example is handled easily by both GNU CLISP and Embeddable Common Lisp, for example. common-lisp.net/~dlw/LispSurvey.html Also, you might try searching for other posts with similar problems that might have useful answers, such as this one: stackoverflow.com/a/19338030/816536 Commented May 8, 2015 at 2:31

1 Answer 1

1

I don't know what you did (because you didn't show it exactly), but when I start sbcl as follows your example works fine for me:

sbcl --control-stack-size 100 

Of course I recommended GNU CLISP and Embedded Common Lisp as they also work A-OK for your example.

I'll add a reference to this answer for future readers: https://stackoverflow.com/a/9002973/816536

I'll also mention that compiling the code with appropriate optimization options may be necessary in many CL implementations to benefit from tail-call optimization.

Sign up to request clarification or add additional context in comments.

3 Comments

Right, the example at the bottom of the question was trivial. As stated at the top of the question, the testing I'm doing is based in the linked github repo. I've added some more explanation at the bottom in case it wasn't clear. I tried CLISP, and it didn't die like sbcl, but it also didn't manage to complete within a half hour (it wasn't hanging, I checked), which is not ideal. I've had to change my entire strategy because even if read doesn't blow up the stack, it's still recursive, and therefore extremely slow on large inputs.
Well now, there you go! If you ask the interpreter to do effectively unbounded recursion with too much data then you effectively invoke "undefined behaviour", which as they say can mean anything, including making daemons fly out of your nose. If you can't do it with a predictable preset stack limit, or a tail-recursive algorithm, then you'll have to do it iteratively and still keep an eye on your heap use. Better yet, fix your data representation!
Yeah, the only reason I asked the question was because I thought it was strange that (read) was (non tail-)recursive when it's such a basic function and recursion is less efficient. I ended up "fixing the data representation" of the AST I was parsing by avoiding bouncing it to text at all and working directly in C++ without involving lisp at all.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.