I have an Excel sheet from a CRISPR library with gRNA sequences (30 nucleotides each) in one column. I only need to keep the first 20 nucleotides of each gRNA and delete the remaining ones. The column has 1000 rows, each holding a 30-nucleotide gRNA sequence. How can I delete the extra nucleotides using R?
Comment: Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. (Community Bot, Nov 27, 2022 at 11:44)
1 Answer
You will first need to read your data into R. There are packages that can read Excel files directly, but I would save the data as CSV first and then use read.csv(). The following creates a new column with your desired output:
data_frame$new_col <- substring(data_frame$col_with_seqs, 1, 20)
Alternatively, you can just use Excel's LEFT(), since your data is already in Excel:
=LEFT(A1, 20)
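For the R route, here is a minimal end-to-end sketch of the read, trim, and write steps. The three sequences, the temporary file paths, and the column names (col_with_seqs, new_col) are made-up placeholders, not your actual data; substitute the path of the CSV you exported from Excel and the real column name.

```r
# Hypothetical stand-in for the CSV exported from Excel:
# three made-up 30-nt gRNA sequences in a column named col_with_seqs.
input_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(col_with_seqs = c(
    "ACGTACGTACGTACGTACGTACGTACGTAC",
    "TTGACCGGTAACCGGTTAACGGCCTTAAGG",
    "GGGGCCCCAAAATTTTGGGGCCCCAAAATT"
  )),
  input_path,
  row.names = FALSE
)

# Read the sequences as plain character strings (not factors).
data_frame <- read.csv(input_path, stringsAsFactors = FALSE)

# Keep nucleotides 1 through 20 of every sequence in a new column.
data_frame$new_col <- substring(data_frame$col_with_seqs, 1, 20)

# Sanity check: every trimmed sequence should be exactly 20 nt long.
stopifnot(all(nchar(data_frame$new_col) == 20))

# Write the result back to CSV so it can be reopened in Excel.
output_path <- tempfile(fileext = ".csv")
write.csv(data_frame, output_path, row.names = FALSE)
```

The nchar() check is optional, but it will flag any row that was shorter than 20 nucleotides to begin with, since substring() returns such rows unchanged rather than raising an error.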