awk print specific number of character in columns

Question

I have a file with many columns and rows, and I want to remove the rows in which the fourth or fifth column contains more than one character.

Input:

--- 22:16050115:G:A 16050115 GGG A
--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050527:C:A 16050527 C AAA
--- 22:16050568:C:A 16050568 CC A
--- 22:16050607:G:A 16050607 G A
--- 22:16050627:G:T 16050627 G TGG
--- 22:16050646:G:T 16050646 G T
--- 22:16050655:G:A 16050655 GTAA A
...

Desired output:

--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050607:G:A 16050607 G A
--- 22:16050646:G:T 16050646 G T
...

Thank you very much.

P.... · Accepted Answer · 2017-02-27 04:33:59Z

awk 'length($4)==1 && length($5)==1' inputfile
--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050607:G:A 16050607 G A
--- 22:16050646:G:T 16050646 G T

This checks the length of $4 and $5 with awk's length() function, using the comparison operator ==. You can change it to <, >, <=, and so on. Because the condition has no action attached, awk prints every line for which it is true, so the command above prints only the lines that have exactly one character in both the 4th and 5th columns.
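As a minimal sketch of swapping the comparison operator, the script below runs the same filter on four rows copied from the question and then uses > with || to list the rows that have a longer 4th or 5th column. The temporary file and the variable names kept and dropped are assumptions made for illustration; they are not part of the original answer.

```shell
# Four sample rows from the question, written to a temporary file
# (the file and the variable names below are illustrative only).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
--- 22:16050115:G:A 16050115 GGG A
--- 22:16050213:C:T 16050213 C T
--- 22:16050527:C:A 16050527 C AAA
--- 22:16050607:G:A 16050607 G A
EOF

# Keep rows whose 4th and 5th fields are exactly one character long.
kept=$(awk 'length($4) == 1 && length($5) == 1' "$tmp")

# Rows where either the 4th or the 5th field is longer than one character.
dropped=$(awk 'length($4) > 1 || length($5) > 1' "$tmp")

printf 'kept:\n%s\n' "$kept"
printf 'dropped:\n%s\n' "$dropped"
rm -f "$tmp"
```

With these four rows, kept holds the 16050213 and 16050607 lines, and dropped holds the 16050115 and 16050527 lines.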

Nice. You can drop the {print $0} part as well: awk 'length($4)==1 && length($5)==1' file
