I need to do this:

    ##fsdfsd
    ##sdd-ver gen 5.5.7
    Xm Gen CDS 1 148 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 149 193 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 194 279 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 280 412 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 413 499 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 500 702 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen extracted region 1 148 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="437225 <- 437372";ID=Bm
    Xm Gen extracted region 149 193 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="436969 <- 437013";ID=Bm
    Xm Gen extracted region 194 279 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="435418 <- 435503";ID=Bm
    Xm Gen extracted region 280 412 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="435209 <- 435341";ID=Bm
    Xm Gen extracted region 413 499 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="434376 <- 434462";ID=Bm
    Xm Gen extracted region 500 702 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="434084 <- 434286";ID=Bm

In each (Xm Gen CDS) row, replace the start and end columns ($4 and $5) with the two values from the Extracted interval attribute of the corresponding (Xm Gen extracted region) row. For example, in the first row $4 (1) becomes 437225 and $5 (148) becomes 437372; in the second row $4 (149) becomes 436969 and $5 (193) becomes 437013, and so on. The output should look like this:

    ##gff-version 2
    ##source-version geneious 5.5.7
    Xm Gen CDS 437225 437372 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 436969 437013 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 435418 435503 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 435209 435341 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 434376 434462 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen CDS 434084 434286 . + . Name=;created by=User;modified by=User;ID=Bm
    Xm Gen extracted region 1 148 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="437225 <- 437372";ID=Bm
    Xm Gen extracted region 149 193 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="436969 <- 437013";ID=Bm
    Xm Gen extracted region 194 279 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="435418 <- 435503";ID=Bm
    Xm Gen extracted region 280 412 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="435209 <- 435341";ID=Bm
    Xm Gen extracted region 413 499 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="434376 <- 434462";ID=Bm
    Xm Gen extracted region 500 702 . + . Name=Extracted region from gi|371442828|gb|JH557032.1|;Extracted interval="434084 <- 434286";ID=Bm
  • Please take some time to read the editing help (click on the yellow ? on the top right of the question editor). In particular, indent data snippets with four spaces in the editor (use the code button or press Ctrl+K). Commented Jul 3, 2012 at 23:42
  • Just to add to @Gilles's comment, here is a reference for editing Commented Jul 4, 2012 at 0:40

1 Answer

This variant is a little complicated, but it works well:

    head -2 file
    join <(grep "Xm Gen CDS" file | cat -n) \
         <(grep "Xm Gen extracted region" file | cat -n) | \
      sed 's/^[0-9]* //;s/CDS [0-9]*\s[0-9]*\(\s.*interval="\([0-9]*\)\s<-\s\([0-9]*\)\)/CDS\t\2\t\3\t\1/;s/ Xm Gen extracted.*//'
    grep "Xm Gen extracted region" file

To run it as a shell script:

    #!/bin/bash
    FILE="$1"
    head -2 "$FILE"
    join <(grep "Xm Gen CDS" "$FILE" | cat -n) \
         <(grep "Xm Gen extracted region" "$FILE" | cat -n) | \
      sed 's/^[0-9]* //;s/CDS [0-9]*\s[0-9]*\(\s.*interval="\([0-9]*\)\s<-\s\([0-9]*\)\)/CDS\t\2\t\3\t\1/;s/ Xm Gen extracted.*//'
    grep "Xm Gen extracted region" "$FILE"
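The commands above match the fixed text "Xm Gen", so they break if the first two columns differ between files. As a rough sketch of an alternative that keys only on the CDS feature type, the same pairing can be done with awk reading the file twice. The function name remap_cds is made up for illustration. The sketch assumes that the CDS rows and the extracted region rows appear in the same order and in equal numbers, and that every interval is written as "start <- end", as in the sample data.

```shell
#!/bin/sh
# remap_cds is a hypothetical helper name, not part of any existing tool.
# Assumption: the n-th CDS row corresponds to the n-th "Extracted interval" row.
remap_cds() {
    awk '
    # Pass 1 (first read of the file): collect the intervals in order.
    NR == FNR {
        if (match($0, /Extracted interval="[0-9]+ <- [0-9]+"/)) {
            s = substr($0, RSTART, RLENGTH)
            gsub(/[^0-9]+/, " ", s)          # keep only the two numbers
            split(s, p, " ")
            n++
            lo[n] = p[1]
            hi[n] = p[2]
        }
        next
    }
    # Pass 2 (second read): rewrite start and end on the k-th CDS row.
    {
        if (match($0, /[ \t]CDS[ \t]+[0-9]+[ \t]+[0-9]+/)) {
            k++
            if (k <= n) {
                sep = substr($0, RSTART, 1)  # reuse the separator found in the row
                $0 = substr($0, 1, RSTART - 1) sep "CDS" sep lo[k] sep hi[k] \
                     substr($0, RSTART + RLENGTH)
            }
        }
        print                                # all other rows pass through unchanged
    }
    ' "$1" "$1"
}
```

With the function defined in a script or in the current shell, it would be called as remap_cds file_with_data > remapped_file. Header rows and extracted region rows are printed untouched, so no separate head or grep step is needed.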
  • It would be great if one of the experts here could provide a Perl script. Commented Jul 5, 2012 at 14:25
  • Great use of join and grep. Can you please tell me how to run this as a shell script that takes input from the user and produces output? Commented Jul 5, 2012 at 15:07
  • @bioman Just copy the second part to a script file, make it executable with chmod +x scriptname, and run it as ./scriptname file_with_data. Commented Jul 5, 2012 at 19:10
  • Thanks. Is it possible to get Perl code? The Xm Gen CDS prefix may differ between files (e.g. Ym Genx CDS), so I cannot use grep. Commented Jul 5, 2012 at 19:54
  • Unfortunately, I'm not good enough at Perl. Maybe someone else will help you. Commented Jul 5, 2012 at 20:10
