For example, if I have:

    1st line (keep)
    2nd line (keep)
    3rd line (keep)
    4rth lines (delete)
    5th (del)
    6th (keep)
    7nth (keep)
    8th lines (keep)
    9th (del)
    10th (del)
    11th (keep)
    12th (keep)
    13th (keep)
    14th (del)
    15th (del)
    etc....
Try:
    awk '(NR-1)%5<3' file

For example:

    $ awk '(NR-1)%5<3' file
    1st line (keep)
    2nd line (keep)
    3rd line (keep)
    6th (keep)
    7nth (keep)
    8th lines (keep)
    11th (keep)
    12th (keep)
    13th (keep)

The condition (NR-1)%5<3 tells awk to print any line for which it is true. In awk, NR is the line number, with the first line counting as 1. For every five lines in the file, the condition is true for the first three.
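To see why the condition selects the first three lines of each group, it can help to print the value of (NR-1)%5 next to each line number. This is only an illustrative sketch: show_modulus is a hypothetical helper name, and seq stands in for a real file.

```sh
# show_modulus is a hypothetical name used only for this illustration.
# For each input line it prints the line number, the value of (NR-1)%5,
# and whether the condition (NR-1)%5<3 would keep or delete the line.
show_modulus() {
    awk '{ r = (NR-1) % 5; print NR, r, (r < 3 ? "keep" : "delete") }'
}

seq 10 | show_modulus
```

The middle column cycles through 0 1 2 3 4, and only the values 0, 1 and 2 satisfy the condition.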
The same approach works with other numbers, for example keeping the first four of every six lines:

    awk '(NR-1)%6<4' file

A simple command is:
    awk '{if((NR-1) % 5<=2){print $0}}' file

It prints only the first 3 lines in each sequence of 5 lines. (NR-1)%5 cycles through the values 0 1 2 3 4, and the value is less than or equal to 2 only for the first 3 lines of each group, so only those lines are printed.
I have a file with the contents:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15

The output is:

    1
    2
    3
    6
    7
    8
    11
    12
    13

Or, as suggested in the comments, you can use:
    awk '(NR - 1) % 5 <= 2' file

Basically, you want something like 'Fizz-Buzz' in awk:
    awk '{ if (i++%5 < 3) print $0;}'

To show this works:

    for x in 1 2 3 4 5 6 7 8 9 10 ; do echo $x; done | awk '{ if (i++%5 < 3) print $0;}'

When your file is named 'mybigfile.csv':

    awk '{ if (i++%5 < 3) print $0;}' < mybigfile.csv > mybigfile-123.csv

A generic solution for masking out a particular pattern of lines from a file:
    #!/bin/sh

    # The pattern is given on the command line.
    pattern=$1

    # The period is simply the length of the pattern.
    period=${#pattern}

    # Use bc to convert the binary pattern to an integer.
    mask=$( printf 'ibase=2; %s\n' "$pattern" | bc )

    awk -v mask="$mask" -v period="$period" '
        BEGIN { p = lshift(1, period-1) }
        and(rshift(p, (FNR-1) % period), mask)'

This relies on awk implementing the non-standard functions and() (bitwise AND), rshift() and lshift() (bitwise right and left shift), which GNU awk and some BSD implementations of awk do, but mawk does not.
This takes a pattern, which is a binary number representing both the cyclic period and which lines within each period should be kept or masked out. A 1 means "keep" and a 0 means "delete".
For example, the pattern of lines to apply in your question is 11100, which means "for each set of five lines, keep the first three and delete the others".
Using 01001000 would delete all but the 2nd and 5th lines in every 8 lines.
The awk program could also be written without the BEGIN block as

    and(lshift(1, (period-1) - (FNR-1) % period), mask)

Left-shifting 1 by (period-1) - (FNR-1) % period positions is the same as calculating 2 to that power, but I'm using lshift() since awk does its arithmetic using floating point operations rather than exact integer arithmetic.
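The bit test can be traced by hand with plain shell arithmetic, which offers the same shift and AND operators. The following is a sketch for illustration only, not part of the script: decision_for_line is a hypothetical name, and the binary conversion is done with a small loop here instead of bc.

```sh
# decision_for_line is a hypothetical helper for illustration.
# Usage: decision_for_line LINE_NUMBER BINARY_PATTERN
# Prints "keep" or "delete" for that line number.
decision_for_line() {
    line=$1
    pattern=$2
    period=${#pattern}

    # Convert the binary pattern to an integer, one digit at a time.
    mask=0
    rest=$pattern
    while [ -n "$rest" ]; do
        digit=${rest%"${rest#?}"}   # first character of $rest
        rest=${rest#?}              # drop that character
        mask=$(( mask * 2 + digit ))
    done

    # Same test as the awk expression: shift 1 to the bit that
    # corresponds to this line, then AND it with the mask.
    bit=$(( (1 << (period - 1 - (line - 1) % period)) & mask ))
    if [ "$bit" -ne 0 ]; then echo keep; else echo delete; fi
}

for n in 1 2 3 4 5 6; do
    printf '%s %s\n' "$n" "$(decision_for_line "$n" 11100)"
done
```

With 11100 the mask is 28, so lines 1 to 3 of each group test the bits 16, 8 and 4 (all set), while lines 4 and 5 test the bits 2 and 1 (both clear).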
Since the code relies on the binary representation of the pattern, very long patterns may not work well.
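For long patterns, or for awk implementations without the bitwise functions, one possible alternative is to index into the pattern as a string instead of converting it to a number. This is a different technique from the script above, sketched under the assumption that the pattern contains only the characters 0 and 1; mask_lines is a hypothetical name.

```sh
# mask_lines is a hypothetical helper for illustration.
# Usage: mask_lines BINARY_PATTERN < input
# Keeps a line when the corresponding character of the pattern is "1".
mask_lines() {
    awk -v pattern="$1" '
        BEGIN { period = length(pattern) }
        # (FNR-1) % period + 1 is the 1-based position within the period.
        substr(pattern, (FNR-1) % period + 1, 1) == "1"'
}

seq 15 | mask_lines 11100
```

Because the pattern is never turned into an integer, its length is not limited by numeric precision, and only standard awk functions are used.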
Testing:
Removing the lines you want to remove:
    $ sh script.sh 11100 <file
    1st line (keep)
    2nd line (keep)
    3rd line (keep)
    6th (keep)
    7nth (keep)
    8th lines (keep)
    11th (keep)
    12th (keep)
    13th (keep)

Inverting the pattern:

    $ sh script.sh 00011 <file
    4rth lines (delete)
    5th (del)
    9th (del)
    10th (del)
    14th (del)
    15th (del)

This can be solved using GNU sed:
    sed '4~5,5~5d' file

Note that this uses a GNU-specific extension to the sed standard, and thus doesn't work with e.g. BSD sed on macOS. However, GNU sed can be installed on macOS using brew, after which it can be used as gsed. On Linux, GNU sed is the default.
This prints every line that does not fall in the fourth through fifth line of every five lines. For a clearer example, sed '3~10,6~10d' will select lines 1, 2, 7, 8, 9, 10 of every group of 10 lines by deleting lines 3 through 6.
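Where GNU sed is not available, the same selection can be approximated in awk by computing each line's position within its group. This is a sketch of an alternative, not the method of this answer; delete_range_in_group and its three parameters are hypothetical names, and it assumes first <= last <= step.

```sh
# delete_range_in_group is a hypothetical helper for illustration.
# Usage: delete_range_in_group FIRST LAST STEP < input
# Deletes lines FIRST through LAST of every group of STEP lines.
delete_range_in_group() {
    awk -v first="$1" -v last="$2" -v step="$3" '
        { pos = (NR-1) % step + 1 }    # 1-based position within the group
        pos < first || pos > last'
}

# Intended to select the same lines as: sed '3~10,6~10d'
seq 20 | delete_range_in_group 3 6 10
```

Calling it as delete_range_in_group 4 5 5 corresponds to the sed '4~5,5~5d' command from the question's case.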
The top-voted answer suggests using awk '(NR-1)%5<3'. On my machine, on a file containing the numbers 1 till 2 million, this takes about 0.6 seconds, while the sed solution in this answer takes about 0.35 seconds. This is reasonable, since sed is in general a simpler tool, and can thus work faster than the more complicated, but more full-featured, awk.
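I tried the command below and it worked fine:

    # i is the first line of each group of five; j is the third.
    for ((i=1; i<=20; i++)); do
        j=$((i+2))
        sed -n "${i},${j}p" filename
        # Skip the two deleted lines; the loop's i++ then moves to the next group.
        i=$((j+2))
    done

Output:

    1st line (keep)
    2nd line (keep)
    3rd line (keep)
    6th (keep)
    7nth (keep)
    8th lines (keep)
    11th (keep)
    12th (keep)
    13th (keep)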
Print lines 1, 2, 3 out of each 5 lines, for example:

    seq 15 | awk 'BEGIN { a[1] a[2] a[3] }; NR % 5 in a'

and

    seq 15 | sed -n 'p;n;p;n;p;n;n'

The sed version above might be faster than the awk one for large files.
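The sed one-liner is easier to follow when written one command per line. The sketch below is the same sequence of commands with comments added; keep_three_skip_two is a hypothetical function name used only for illustration.

```sh
# keep_three_skip_two is a hypothetical wrapper for illustration.
# With -n nothing is printed automatically, so only the explicit
# p commands produce output.
keep_three_skip_two() {
    sed -n '
        # Print line 1 of the group, then load and print lines 2 and 3.
        p
        n
        p
        n
        p
        # Load lines 4 and 5 without printing them.
        # The next cycle then starts at line 6.
        n
        n'
}

seq 15 | keep_three_skip_two
```

Each cycle of the script consumes exactly five input lines, which is what makes the pattern repeat.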