Questions tagged [bioinformatics]
Use this tag for questions about common bioinformatics tasks performed on a *nix system, such as manipulating or converting between standard biological text formats, or extracting data of interest from such formats.
322 questions
1 vote
3 answers
128 views
Edit all the values in a specific column based on a range of row numbers
I have a PDB file (coordinates of atoms in a protein) on a Linux machine: ATOM 1 N GLY A 1 0.535 51.766 5.682 1.00 0.00 ATOM 2 CA GLY A 1 -0.712 50....
4 votes
3 answers
247 views
Add columns from variable number of files to base file
I'm dealing with a series of bed files, which look like this: chr1 100 110 0.5 chr1 150 175 0.2 chr1 200 300 1.5 With the columns being chromosome, start, end, score. I have multiple different files ...
6 votes
3 answers
1k views
bash script quoting frustration
This problem is driving me crazy. From the command prompt I can enter this command and it works as expected (records where the INFO/RegionType tag contains the value Core are emitted in the output ...
5 votes
6 answers
321 views
subset columns from the 1st file using column names in 2nd file
I have two text files. The 1st file is a tab-delimited file which looks like this: chrom pos ref alt a1 a2 a3 a4 10 12345 C T aa bb cc dd 10 12345 C T aa bb cc dd 10 12345 C ...
1 vote
5 answers
159 views
sed command to replace a word within a line following a pattern
I'm working with a file that looks like the following, containing over 50,000 lines of gene IDs followed by their sequence: gene_A:3342234 CTCTTTCTTTTACGCCT gene_A:1244-5205 CTCTTTCTTTTACGCCT ...
1 vote
4 answers
185 views
Remove everything in a third column but only keep specific text
I have a data set with three columns: https://drive.google.com/file/d/1gtCssfAXHxRjGfX8uTAaimGPWCA2cnci/view?usp=sharing Here are the first few lines: ID transcript_id go_description ...
1 vote
5 answers
106 views
How to count words in a column when consecutive cells are equal in a different column, using a shell script
I'm trying to count the number of C_R and S_R in column 9 when consecutive cells in column 2, column 3, and column 1 are the same. The file is in bed format (tab-separated format). The original file ...
0 votes
1 answer
60 views
Counting characters between grep searches
Is there a way I can use the grep command in conjunction with a series of other commands to find a character sequence (e.g. 'GAATTC' in a FASTA file) and count how many characters are between each match?...