I have a tab-separated file whose last fifteen fields consist of zeros and ones. I need to print the lines that do not contain more than five consecutive zeros or more than five consecutive ones within those fifteen fields, which are split into groups of five fields.
File:
abadenguísimo abadenguísimo adjective n/a n/a singular n/a masculine 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
abalaustradísimo abalaustradísimo adjective n/a n/a singular n/a masculine 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
abiertísimas abiertísimo adjective n/a n/a plural n/a feminine 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
abellacadísimo abellacadísimo adjective n/a n/a singular n/a masculine 1 0 1 1 1 0 0 1 0 0 1 0 0 0 0
cansonísimos cansonísimo adjective n/a n/a plural n/a masculine 0 1 1 1 0 0 0 0 1 0 0 0 0 0 1

Output:
abellacadísimo abellacadísimo adjective n/a n/a singular n/a masculine 1 0 1 1 1 0 0 1 0 0 1 0 0 0 0
cansonísimos cansonísimo adjective n/a n/a plural n/a masculine 0 1 1 1 0 0 0 0 1 0 0 0 0 0 1

I tried this:
    BEGIN { FS = "\t" }
    {
        a=0;
        b=0;
        c=0;
        num[A]="";
        num[B]="";
        num[C]="";
        for (i = 9; i <= 13; i++)
            num[A]=num[A]""$i;
        for (j = 14; j <= 18; j++)
            num[B]=num[B]""$j;
        for (k = 19; k <= 23; k++)
            num[C]=num[C]""$k;
        if ((num[A] != "00000") && (num[A] != "11111")) {
            a=1;
        }
        if (num[B] != "00000") {
            b=1;
        }
        if (num[C] != "00000") {
            c=1;
        }
        if ((a == 1) || (b == 1) || (c == 1)) {
            print;
        }
    }

I think I've finally found a solution, but I don't know why the code above doesn't work for me.
    BEGIN {
        FS = "\t"
        cont=0;
    }
    {
        a=0;
        b=0;
        c=0;
        sum1=$9+$10+$11+$12+$13;
        sum2=$14+$15+$16+$17+$18;
        sum3=$19+$20+$21+$22+$23;
        if (( sum1 > 0 ) && ( sum1 < 5 )) {
            a=1;
        }
        if ( sum2 > 0 ) {
            b=1;
        }
        if ( sum3 > 0 ) {
            c=1;
        }
        if ((a == 1) || (b == 1) || (c == 1)) {
            cont++;
            print;
        }
    }
    END {
        print "Total: "NR;
        print "OK: "cont;
    }
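
On why the first attempt prints every line: in awk, an unquoted A, B or C used as an array subscript is a variable, not a string. None of them is ever assigned, so each one evaluates to the empty string, and num[A], num[B] and num[C] are all the same element, num[""]. The three loops therefore append all fifteen digits to a single string, which can never equal "00000" or "11111", so every condition is true. Below is a minimal sketch of one possible fix: it quotes the subscripts and otherwise keeps the same conditions and field positions (9 to 23) as the first attempt. The shell function name filter_groups is hypothetical and is only there so the program can be piped into.

```shell
# Hypothetical wrapper name; reads tab-separated records on stdin.
filter_groups() {
    awk '
    BEGIN { FS = "\t" }
    {
        # Quoted subscripts: "A", "B" and "C" are three distinct keys.
        # Unquoted A, B, C would be empty variables that all index num[""].
        num["A"] = num["B"] = num["C"] = ""
        for (i = 9;  i <= 13; i++) num["A"] = num["A"] $i
        for (i = 14; i <= 18; i++) num["B"] = num["B"] $i
        for (i = 19; i <= 23; i++) num["C"] = num["C"] $i

        # Same conditions as the first attempt.
        if ((num["A"] != "00000" && num["A"] != "11111") ||
            num["B"] != "00000" || num["C"] != "00000")
            print
    }'
}
```

It could be run as filter_groups < file.tsv, where file.tsv stands in for the actual input file. Since the fields only contain 0 and 1, these string comparisons should select the same lines as the version that sums the fields.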
00000 11111 00000, but you WOULD want to print a line with 00011 11100 01010. Is that correct?