deleted 39 characters in body
Source Link
Sundeep

If perl is okay:

$ perl -F'\t' -lane 'if(!$#ARGV){ $h{$_}=1; close ARGV if eof; next } @i = grep { exists $h{$F[$_]} } 4..$#F if $.==1; print join "\t", @F[0..3, @i]' f2.txt f1.tsv
chrom   pos     ref     alt     a1      a4
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd

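The %h hash stores each line of f2.txt (the file listing the wanted column names) as a key.

Then, on the header line of the TSV file ($. is 1 again there, because closing ARGV at the end of f2.txt resets the line counter), grep collects the indices of the fields, from the fifth onward, whose names exist as hash keys.

Finally, for every line, the first 4 columns and the columns at the filtered indices are printed.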

edited body
Source Link
Sundeep

If perl is okay:

$ perl -F'\t' -lane 'if(!$#ARGV){$r .= "$p$_"; $p="|"; close ARGV if eof; next} @i = grep {$F[$_] =~ /^($r)$/} 4..$#F if $.==1; print join "\t", @F[0..3, @i]' f2.txt f1.tsv
chrom   pos     ref     alt     a1      a4
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd

The $r variable accumulates the lines of f2.txt into a regex alternation (in this example, a1|a4).

Then, on the header line of the TSV file, grep collects the indices of the fields, from the fifth onward, whose names match the regex in $r (anchored with ^ and $ to prevent partial matches).

Finally, for every line, the first 4 columns and the columns at the filtered indices are printed.

Source Link
Sundeep

If perl is okay:

$ perl -F'\t' -lane 'if(!$#ARGV){$r .= "$p$_"; $p="|"; close ARGV if eof; next} @i = grep {$F[$_] =~ /^($r)$/} 0..$#F if $.==1; print join "\t", @F[0..3, @i]' f2.txt f1.tsv
chrom   pos     ref     alt     a1      a4
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd
10      12345   C       T       aa      dd

The $r variable accumulates the lines of f2.txt into a regex alternation (in this example, a1|a4).

Then, on the header line of the TSV file, grep collects the indices of the fields whose names match the regex in $r (anchored with ^ and $ to prevent partial matches).

Finally, for every line, the first 4 columns and the columns at the filtered indices are printed.