Convert row names into new columns in a data frame

Question

Apologies in advance if this has already been asked elsewhere, but I've tried several approaches and nothing has worked so far.

In my data frame Mesure, I would like to split the values of the column Row.names into two new columns named Sample_type and Locality. I tried a tidyverse solution, but R reports that the column name must not be duplicated. How can I fix it? Also, is it possible to remove the "<"?

> head(Mesure)
                     Row.names mean_Mesure max_Mesure min_Mesure
1 Aquatic_moss.Paris.AG-110m.<         100        110         90
2     Aquatic_moss.Paris.BE-7.         123        177         53
3   Aquatic_moss.Paris.CO-57.<          40         60         20
4   Aquatic_moss.Paris.CO-58.<          40         50         30
5   Aquatic_moss.Paris.CO-60.<          50         70         30
6  Aquatic_moss.Paris.CS-134.<         200        300        100
>
> library(tidyverse)
> new_df <- Mesure %>%
+   rownames_to_column(var = "Row.names") %>%
+   separate(Row.names, sep = ".", into = c("Sample_type", "Locality"))
Error: Column name `Row.names` must not be duplicated.
Run `rlang::last_error()` to see where the error occurred.

When you create Mesure, wouldn't as_tibble(..., rownames="Row.names") give you what you want?
– Limey, Commented Jun 24, 2020 at 14:12
Well, Mesure is the merging of several lists that come from the splitting of several different data frames. During splitting of these initial data frames, I renamed the dataframes
– Sylvain, Commented Jun 24, 2020 at 14:17

score 3 · Accepted Answer · 2020-06-24 14:28:51Z

To split the column at the first dot, you can use:

Mesure %>% separate(Row.names, sep = "\\.", into = c("Sample_type", "Locality"), extra = "merge")

Explanation:

You don't need rownames_to_column(), because "Row.names" is already a column; calling it tries to add a second column with the same name, which causes the error.
sep = "." is not enough, because sep is interpreted as a regular expression in which . matches any character; escape it as "\\." to match a literal dot.
There are many . in the column, so you need extra = "merge" to split only at the first occurrence. If you would like to keep only "Paris" without AG-110m etc., specify extra = "drop" instead.
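
The points above can be checked on a small example. This is only a sketch: the two-row data frame is a toy reconstruction from the rows printed in the question, and the variable names merged and dropped are made up for illustration.

```r
library(tidyr)

# Toy data rebuilt from the first two rows shown in the question.
Mesure <- data.frame(
  Row.names   = c("Aquatic_moss.Paris.AG-110m.<", "Aquatic_moss.Paris.BE-7."),
  mean_Mesure = c(100, 123),
  stringsAsFactors = FALSE
)

# As a regular expression, "." matches any character; "\\." matches a literal dot.
grepl(".", "abc")    # TRUE: any character matches
grepl("\\.", "abc")  # FALSE: there is no literal dot in "abc"

# extra = "merge": split at the first dot only, keep the remainder in Locality.
merged <- separate(Mesure, Row.names, sep = "\\.",
                   into = c("Sample_type", "Locality"), extra = "merge")

# extra = "drop": split at every dot, keep the first two pieces, discard the rest.
dropped <- separate(Mesure, Row.names, sep = "\\.",
                    into = c("Sample_type", "Locality"), extra = "drop")

merged$Locality   # "Paris.AG-110m.<" "Paris.BE-7."
dropped$Locality  # "Paris" "Paris"
```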

Result with extra = "merge":

   Sample_type        Locality mean_Mesure max_Mesure min_Mesure
1 Aquatic_moss Paris.AG-110m.<         100        110         90
2 Aquatic_moss     Paris.BE-7.         123        177         53
3 Aquatic_moss   Paris.CO-57.<          40         60         20
4 Aquatic_moss   Paris.CO-58.<          40         50         30
5 Aquatic_moss   Paris.CO-60.<          50         70         30
6 Aquatic_moss  Paris.CS-134.<         200        300        100

Result with extra = "drop":

   Sample_type Locality mean_Mesure max_Mesure min_Mesure
1 Aquatic_moss    Paris         100        110         90
2 Aquatic_moss    Paris         123        177         53
3 Aquatic_moss    Paris          40         60         20
4 Aquatic_moss    Paris          40         50         30
5 Aquatic_moss    Paris          50         70         30
6 Aquatic_moss    Paris         200        300        100

If you need to drop the "<" at the end of the Locality column (after assigning the result of separate() back to Mesure), run something like:

Mesure$Locality <- gsub("<$", "", Mesure$Locality)

where "<$" means "< at the end of the string".
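
To see what the anchored pattern does, here is a small base R sketch on two Locality values taken from the extra = "merge" result. The second pattern, which also strips the trailing dot, is an optional variation and not part of the original answer; the names locality, no_lt and no_trailing are illustrative.

```r
# Two Locality values as they appear after separate(..., extra = "merge").
locality <- c("Paris.AG-110m.<", "Paris.BE-7.")

# "<$" removes a "<" only when it is the last character of the string.
no_lt <- gsub("<$", "", locality)
no_lt  # "Paris.AG-110m." "Paris.BE-7."

# Variation (assumption: the trailing dot is unwanted too):
# "\\.<?$" removes a final dot together with an optional "<" after it.
no_trailing <- sub("\\.<?$", "", locality)
no_trailing  # "Paris.AG-110m" "Paris.BE-7"
```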

To drop the "<" in the same command, it is also possible to split into three columns and use extra = "drop": new_df <- Mesure %>% separate(Row.names, sep = "\\.", into = c("Sample_type", "Locality", "Chemicals"), extra = "drop")
@Sylvain Oh yes, this is even better! I was only thinking about two columns as you specified in the question, however separating the chemicals and dropping the rest is definitely a better idea.
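
A minimal sketch of the three-column variant from the comment above, run on toy data rebuilt from the first two rows of the question. Because the fourth piece (either "<" or an empty string) is discarded by extra = "drop", no separate gsub() step is needed.

```r
library(tidyr)

# Toy data rebuilt from the first two rows shown in the question.
Mesure <- data.frame(
  Row.names   = c("Aquatic_moss.Paris.AG-110m.<", "Aquatic_moss.Paris.BE-7."),
  mean_Mesure = c(100, 123),
  stringsAsFactors = FALSE
)

# Splitting on literal dots yields four pieces per row; only the first three are kept.
new_df <- separate(Mesure, Row.names, sep = "\\.",
                   into = c("Sample_type", "Locality", "Chemicals"),
                   extra = "drop")

new_df$Locality   # "Paris" "Paris"
new_df$Chemicals  # "AG-110m" "BE-7"
```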

Limey · Accepted Answer · 2020-06-24 14:20:19Z

Apologies. I should have read your question properly. The answer to the second part of your question would be:

d %>% separate(Row.names, into=c("Sample_type","Locality"), extra="drop")
# A tibble: 6 x 6
  Sample_type Locality mean_Mesure max_Mesure min_Mesure
  <chr>       <chr>          <dbl>      <dbl>      <dbl>
1 Aquatic     moss             100        110         90
2 Aquatic     moss             123        177         53
3 Aquatic     moss              40         60         20
4 Aquatic     moss              40         50         30
5 Aquatic     moss              50         70         30
6 Aquatic     moss             200        300        100

I can't help you with the first part because I don't know how you create the input data frame.
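
A note on the output above: when sep is omitted, separate() uses its default "[^[:alnum:]]+", which splits on any run of non-alphanumeric characters, including the underscore. That is why the columns contain "Aquatic" and "moss" rather than "Aquatic_moss" and "Paris". The sketch below contrasts the two on a one-row toy data frame; d here is a hypothetical stand-in, not the data used in the answer.

```r
library(tidyr)

# One-row stand-in for the data (hypothetical; taken from row 2 of the question).
d <- data.frame(
  Row.names   = "Aquatic_moss.Paris.BE-7.",
  mean_Mesure = 123,
  stringsAsFactors = FALSE
)

# Default separator: splits on "_", "." and "-" alike.
default_sep <- separate(d, Row.names,
                        into = c("Sample_type", "Locality"), extra = "drop")

# Explicit literal dot: keeps "Aquatic_moss" intact.
dot_sep <- separate(d, Row.names, sep = "\\.",
                    into = c("Sample_type", "Locality"), extra = "drop")

default_sep[, c("Sample_type", "Locality")]  # Aquatic, moss
dot_sep[, c("Sample_type", "Locality")]      # Aquatic_moss, Paris
```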
