I suggest the sed solution, but for the sake of completeness, here is an awk equivalent:

awk 'NR >= 57890000 && NR <= 57890010' /path/to/file 

To stop reading the file after the last requested line:

awk 'NR < 57890000 { next } { print } NR == 57890010 { exit }' /path/to/file 
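
For comparison, here is roughly what the sed solution referred to above and the tail | head combination (the fastest approach in the benchmark below) look like for the same line range. This is a sketch rather than part of the original answer; /path/to/file is a placeholder:

# Print lines 57890000-57890010, then quit so sed does not read the rest of the file
sed -n '57890000,57890010p;57890010q' /path/to/file

# Skip to line 57890000, then take the next 11 lines (the range is inclusive);
# when head exits, the pipe closes and tail stops reading early as well
tail -n +57890000 /path/to/file | head -n 11

Both stop reading once the last requested line has been produced, and they avoid awk's per-record processing overhead, which is why they come out ahead in the timings below.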

Speed test (here on macOS, YMMV on other systems):

  • 100,000,000-line file generated by seq 100000000 > test.in
  • Reading lines 50,000,000-50,000,010
  • Tests in no particular order
  • real time, in seconds, as reported by bash's builtin time
  run 1   run 2   run 3   command
  4.373   4.418   4.395   tail -n+50000000 test.in | head -n10
  5.210   5.179   6.181   sed -n '50000000,50000010p;57890010q' test.in
  5.525   5.475   5.488   head -n50000010 test.in | tail -n10
  8.497   8.352   8.438   sed -n '50000000,50000010p' test.in
 22.826  23.154  23.195   tail -n50000001 test.in | head -n10
 25.694  25.908  27.638   ed -s test.in <<<"50000000,50000010p"
 31.348  28.140  30.574   awk 'NR<57890000{next}1;NR==57890010{exit}' test.in
 51.359  50.919  51.127   awk 'NR >= 57890000 && NR <= 57890010' test.in

These are by no means precise benchmarks, but the difference is clear and repeatable enough* to give a good sense of the relative speed of each of these commands.

*: Except between sed -n p;q and head|tail, which seem to be essentially the same.
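
If you want to repeat the comparison on your own machine, a minimal harness along these lines should do it. This sketch is not the original test script: the bench helper is my addition, only a few of the commands from the table are listed, and the sed quit address has been aligned with the printed range:

#!/bin/bash
# Generate the 100,000,000-line test file (takes a while and several hundred MB of disk).
seq 100000000 > test.in

# Run a command three times under bash's builtin time; the command's output is
# discarded so only the time spent reading the file matters.
bench() {
  for run in 1 2 3; do
    time eval "$1" > /dev/null
  done
}

bench "tail -n+50000000 test.in | head -n10"
bench "head -n50000010 test.in | tail -n10"
bench "sed -n '50000000,50000010p;50000010q' test.in"
bench "awk 'NR >= 50000000 && NR <= 50000010' test.in"

The remaining entries from the table follow the same pattern; only the command string passed to bench changes.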
