Running seq (GNU coreutils 8.21) without a format string is dramatically faster than running it with any format string I tried:

$ time seq 1e8 > /dev/null
seq 1e8 > /dev/null  0.68s user 0.02s system 99% cpu 0.703 total
$ time seq -f '%1.f' 1e8 > /dev/null
seq -f '%1.f' 1e8 > /dev/null  53.82s user 0.03s system 99% cpu 53.875 total

What is going on here? Is it possible to reproduce the performance when explicitly providing a format string?

1 Answer

This is quite the observation, but it actually makes sense. Here is the source code: http://code.metager.de/source/xref/gnu/coreutils/src/seq.c

First, note the seq_fast function and the comment just before the point where it is called:

/* If the following hold:
   - no format string, [FIXME: relax this, eventually]
   - integer start (or no start)
   - integer end
   - increment == 1 or not specified [FIXME: relax this, eventually]
   then use the much more efficient integer-only code.  */

So seq switches to a faster, integer-only algorithm when all of these conditions are met. Indeed, if we specify an increment other than 1, we get the same slow behavior, because print_numbers is used instead of seq_fast:

$ time seq 1e9 > /dev/null
seq 1e9 > /dev/null  4.68s user 0.09s system 99% cpu 4.770 total
$ time seq 1 7 1e9 > /dev/null
seq 1 7 1e9 > /dev/null  56.78s user 0.02s system 99% cpu 56.801 total
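To confirm that the gap comes from the code path and not from the output, one can check that the plain and explicit-format invocations print exactly the same bytes. This is only a minimal sketch: the helper name same_output and the range size are made up for illustration, and it assumes a seq that accepts -f '%1.f' as in the question.

```shell
# same_output N: succeed if `seq N` and `seq -f '%1.f' N` print identical bytes.
# (Hypothetical helper for illustration; not part of coreutils.)
same_output() {
    plain=$(seq "$1" | cksum)
    formatted=$(seq -f '%1.f' "$1" | cksum)
    [ "$plain" = "$formatted" ]
}

# A small range keeps the slower, formatted run short.
if same_output 100000; then
    echo "identical output"
else
    echo "outputs differ"
fi
```

If the checksums match, the two commands do the same job, and the timing difference can only come from how each number is produced.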

As for why formatting takes so much longer (about a minute for 1e8 numbers rather than 1e9), note that 53 s / 10^8 numbers ≈ 530 nanoseconds per number. So on average, the formatting code (which has to run on every single number before it is printed) adds about 530 nanoseconds per printed number. Considering all the branching and complex logic involved in formatting, this makes sense too.
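The per-number cost can be worked out for each of the timings quoted in this thread. The sketch below uses the "user" times reported above; the function name per_number_ns is made up for illustration, and the count for seq 1 7 1e9 assumes it prints 142857143 numbers (1, 8, 15, ... up to 1e9).

```shell
# per_number_ns SECONDS COUNT: average nanoseconds spent per printed number.
# (Hypothetical helper for illustration.)
per_number_ns() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.1f\n", s / n * 1e9 }'
}

per_number_ns 0.68  100000000   # seq 1e8            (fast path)
per_number_ns 53.82 100000000   # seq -f '%1.f' 1e8  (explicit format)
per_number_ns 56.78 142857143   # seq 1 7 1e9        (increment of 7)
```

This gives roughly 7 ns per number on the fast path, against roughly 540 ns with a format string and roughly 400 ns with an increment of 7. The two slow cases land within the same order of magnitude, which fits with both of them leaving the integer-only code.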
