deleted 39 characters in body

edited Mar 22, 2024 at 15:39

If perl is okay:

    $ perl -F'\t' -lane 'if(!$#ARGV){ $h{$_}=1; close ARGV if eof; next } @i = grep { exists $h{$F[$_]} } 4..$#F if $.==1; print join "\t", @F[0..3, @i]' f2.txt f1.tsv
    chrom  pos    ref  alt  a1  a4
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd

The %h hash stores each line of f2.txt (the file listing the wanted field names) as a key.

Then, on the header line of the TSV file, grep collects the indices of the fields, from the fifth column onward, whose names exist as hash keys.

Finally, every line is printed as the first 4 columns followed by the columns at the filtered indices.

edited body

edited Mar 22, 2024 at 8:32

Sundeep

If perl is okay:

    $ perl -F'\t' -lane 'if(!$#ARGV){$r .= "$p$_"; $p="|"; close ARGV if eof; next} @i = grep {$F[$_] =~ /^($r)$/} 4..$#F if $.==1; print join "\t", @F[0..3, @i]' f2.txt f1.tsv
    chrom  pos    ref  alt  a1  a4
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd

The $r variable accumulates the lines of f2.txt into an alternation regex (in this example, a1|a4).

Then, on the header line of the TSV file, grep collects the indices of the fields, from the fifth column onward, whose names match the regex in $r (anchored with ^ and $ to prevent partial matches).

Finally, every line is printed as the first 4 columns followed by the columns at the filtered indices.

answered Mar 22, 2024 at 8:25

Sundeep

If perl is okay:

    $ perl -F'\t' -lane 'if(!$#ARGV){$r .= "$p$_"; $p="|"; close ARGV if eof; next} @i = grep {$F[$_] =~ /^($r)$/} 0..$#F if $.==1; print join "\t", @F[0..3, @i]' f2.txt f1.tsv
    chrom  pos    ref  alt  a1  a4
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd
    10     12345  C    T    aa  dd

The $r variable accumulates the lines of f2.txt into an alternation regex (in this example, a1|a4).

Then, on the header line of the TSV file, grep collects the indices of the fields whose names match the regex in $r (anchored with ^ and $ to prevent partial matches).

Finally, every line is printed as the first 4 columns followed by the columns at the filtered indices.