1

I have a long file (showing only a piece):

145 27262253 2093226 747883433 76303046 2.74331
146 27992017 2188217 747883433 76303046 2.8678
147 30385435 2433407 747883433 76303046 3.18913
148 31218703 2514902 747883433 76303046 3.29594
149 33852828 2660530 747883433 76303046 3.48679
150 36161756 2836045 747883433 76303046 3.71682

Alignments 747883433 76303046
Bases 111613795461 11392665612
1 40000373 2754292 838333186 82982133 3.31914
2 35955786 2451917 838333186 82982133 2.95475
3 33056935 2241392 838333186 82982133 2.70105
4 32241895 2172229 838333186 82982133 2.61771
145 29490370 2184347 838333186 82982133 2.63231
146 30252912 2282821 838333186 82982133 2.75098
147 32862262 2544600 838333186 82982133 3.06644
148 33769718 2631164 838333186 82982133 3.17076
149 36673113 2787718 838333186 82982133 3.35942
150 39222287 2975755 838333186 82982133 3.58602

Alignments 838333186 82982133
Bases 125129342261 12391027833
1 35736929 2509527 741319423 80995147 3.09837
2 32185143 2238927 741319423 80995147 2.76427
3 29595482 2043259 741319423 80995147 2.52269
4 28861157 1978254 741319423 80995147 2.44244

I want to match the blank line before the word Alignments, together with the Alignments line itself. Expected output:

Alignments 747883433 76303046

Alignments 838333186 82982133

Is it possible? The file has many other blank lines and Alignments lines. My attempt: | awk '{if($1 ~ /^[[:space:]]*Alignments/) {print $0}}'. However, I get the Alignments lines without the blank lines:

Alignments 747883433 76303046
Alignments 838333186 82982133
1
  • 1
    You are after a multiline grep/awk. There are several questions that ask the same question. Commented Aug 30, 2019 at 5:39

6 Answers

2
$ awk '/^$|^Alignments/' input.txt | uniq

Alignments 747883433 76303046

Alignments 838333186 82982133

The uniq collapses runs of consecutive blank lines, so there will be no more than one blank line before, after, or between any Alignments lines.

grep could be used instead, or sed -n, or perl -n, e.g.:

$ grep -E '^$|Alignments' input.txt | uniq 
3
  • How can I achieve the same behavior as uniq without using the pipe? I have created a variable in awk, so I wouldn't like to lose it. I was thinking of something like: awk '/(^$){1}|^Alignments/' input.txt but it matches more than one blank line before. Commented Aug 30, 2019 at 12:27
  • 1
    generic answer is "do more stuff in the awk script before piping to uniq"... what exactly are you trying to do? what variable have you created and what's it for? why do you think you would lose it? Commented Aug 30, 2019 at 12:47
  • Thank you @cas for your help. Finally, I realized that I needed to delete the blank line after Bases and 1 because it is not part of the original file that I was editing. I achieved it by using this: | awk 'm=($1 == "==>") {save_location=$2} !m&&$1=="1",$1=="Bases" {print > save_location}' which is a piece of a longer instruction. However, your answer addresses what I was asking, and it's neat! Commented Aug 31, 2019 at 14:44
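On the question of avoiding the pipe to uniq, one possible awk-only approach is to remember whether the previous line was empty and print an Alignments line only when it was. The sketch below is an illustration, not the answerer's method: the sample file, the `result` variable and the `prev_blank` flag are hypothetical names. It also differs slightly from the uniq pipeline, since it drops blank lines that are not directly followed by an Alignments line.

```shell
# Hypothetical sample input, modelled on the excerpt in the question.
sample=$(mktemp)
printf '%s\n' \
  '150 36161756 2836045 747883433 76303046 3.71682' \
  '' \
  'Alignments 747883433 76303046' \
  'Bases 111613795461 11392665612' \
  '' \
  '' \
  '1 40000373 2754292 838333186 82982133 3.31914' \
  '150 39222287 2975755 838333186 82982133 3.58602' \
  '' \
  'Alignments 838333186 82982133' \
  'Alignments 1 2' > "$sample"

# prev_blank records whether the previous line was empty. An Alignments line
# is printed, with a single blank line before it, only when that flag is set.
result=$(awk '
  /^Alignments/ && prev_blank { print ""; print }
  { prev_blank = ($0 == "") }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because everything happens inside one awk program, any variables set elsewhere in the same script remain available. The final `Alignments 1 2` line in the sample is skipped because the line before it is not empty.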
2

Why don't we use grep? :

grep -A1 "^$" file | grep -B1 'Alignments' | grep -v -- "^--$" 
2

Using GNU awk:

awk -v RS='\nAlignments[ 0-9]*' '{print RT}' file 

The record separator RS is set to a regular expression describing the expected match, and the text that actually matched is printed for every record using RT (record terminator).

1

You can use the Awk command below for your problem. It matches an empty line and then checks whether a line starting with Alignments follows immediately after it:

awk '!NF { line = NR; next } (NR = line + 1 ) && /^Alignments/{ printf "\n%s\n",$0; }' file 
3
  • 2
    we are hardcoding the newline in the output... it would print the same even if there were no empty line... Commented Aug 30, 2019 at 5:29
  • 2
    @msp9011: No, only if the previous line is empty! Commented Aug 30, 2019 at 5:46
  • 1
    just check by removing empty line in input file... Commented Aug 30, 2019 at 5:52
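The behaviour described in these comments appears to come from `NR = line + 1` in the command: in awk a single `=` is an assignment, so that condition does not compare anything and is true for every line. A minimal sketch using the comparison operator `==` is shown below. The `prog`, `no_blank` and `with_blank` names and the two small inputs are hypothetical, and the extra `line &&` guard is an addition that stops the first line of a file from matching before any empty line has been seen.

```shell
# Same structure as the command above, but comparing NR with == instead of
# assigning to it. "line &&" requires that an empty line has been seen.
prog='!NF { line = NR; next } line && NR == line + 1 && /^Alignments/ { printf "\n%s\n", $0 }'

# Hypothetical input with no empty line before Alignments.
no_blank=$(printf '%s\n' \
  '150 36161756 2836045 747883433 76303046 3.71682' \
  'Alignments 747883433 76303046' | awk "$prog")

# Hypothetical input with an empty line directly before Alignments.
with_blank=$(printf '%s\n' \
  '150 36161756 2836045 747883433 76303046 3.71682' \
  '' \
  'Alignments 747883433 76303046' | awk "$prog")

printf 'without empty line: [%s]\n' "$no_blank"
printf 'with empty line: [%s]\n' "$with_blank"
```

With the comparison in place, the first input should produce no output, and the second should produce the Alignments line preceded by one blank line.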
1

Sed excels at such tasks. First, if the current line is empty, we append the next line to it. Then we test the combined text and print it when it meets the criterion.

$ sed -ne '
   /./!N
   /^\nAlignments/p
' file.txt
0

I tried the command below and it worked fine:

awk '{a[++i]=$0}/Alignments/{for(x=NR-1;x<=NR;x++)print a[x]}' filename| sed -n '/^$/,+1p' 

output

Alignments 747883433 76303046

Alignments 838333186 82982133
