Skip to main content
missing quotes
Source Link
Stéphane Chazelas
  • 586.4k
  • 96
  • 1.1k
  • 1.7k

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X"+$X" /path/to/file | head -n $"$((Y-X+1))" 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$bbs="$b" seek=$xseek="$x" count=$count="$((y-x))" </path/to/file 

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X /path/to/file | head -n $((Y-X+1)) 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$b seek=$x count=$((y-x)) </path/to/file 

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n "+$X" /path/to/file | head -n "$((Y-X+1))" 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs="$b" seek="$x" count="$((y-x))" </path/to/file 
replaced http://unix.stackexchange.com/ with https://unix.stackexchange.com/
Source Link

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X /path/to/file | head -n $((Y-X+1)) 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkyplgorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$b seek=$x count=$((y-x)) </path/to/file 

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X /path/to/file | head -n $((Y-X+1)) 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$b seek=$x count=$((y-x)) </path/to/file 

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X /path/to/file | head -n $((Y-X+1)) 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$b seek=$x count=$((y-x)) </path/to/file 
added benchmarks for tail|head vs sed
Source Link
Gilles 'SO- stop being evil'
  • 865.9k
  • 205
  • 1.8k
  • 2.3k

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X /path/to/file | head -n $((Y-X+1)) 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

I expectThe sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by tailseq |100000000 head>/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and sed to have similar performancethe machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$b seek=$x count=$((y-x)) </path/to/file 

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X /path/to/file | head -n $((Y-X+1)) 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

I expect tail | head and sed to have similar performance.

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$b seek=$x count=$((y-x)) </path/to/file 

If you want lines X to Y inclusive (starting the numbering at 1), use

tail -n +$X /path/to/file | head -n $((Y-X+1)) 

tail will read and discard the first X-1 lines (there's no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won't have read more than a buffer size's worth (typically a few kilobytes) of lines from the input file.

Alternatively, as gorkypl suggested, use sed:

sed -n -e "$X,$Y p" -e "$Y q" /path/to/file 

The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a, the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.

real user sys command 0.47 0.32 0.12 </tmp/a tail -n +50000001 | head -n 10 #GNU 0.86 0.64 0.21 </tmp/a tail -n +50000001 | head -n 10 #BusyBox 3.57 3.41 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #GNU 11.91 11.68 0.14 sed -n -e '50000000,50000010 p' -e '50000010q' /tmp/a #BusyBox 1.04 0.60 0.46 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #GNU 7.12 6.58 0.55 </tmp/a tail -n +50000001 | head -n 40000001 >/dev/null #BusyBox 9.95 9.54 0.28 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #GNU 23.76 23.13 0.31 sed -n -e '50000000,90000000 p' -e '90000000q' /tmp/a >/dev/null #BusyBox 

If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:

dd bs=$b seek=$x count=$((y-x)) </path/to/file 
Source Link
Gilles 'SO- stop being evil'
  • 865.9k
  • 205
  • 1.8k
  • 2.3k
Loading