Rename one named column in R

Question

I want to rename one column of a data frame by referencing its original name. Is this possible? For example, say I had the table 'data':

a b c
1 2 2
3 2 3
4 1 2

and I wanted to update the name of column b to 'd'. I know I could use

colnames(data)[2] <- 'd'

but can I make the change by specifically referencing b, i.e. something like

colnames(data)['b'] <- 'd'

so that if the column ordering of the dataframe changes the correct column name will still be updated.

Thanks in advance

Good question! I was trying colnames(data['b']) <- 'd', which doesn't work either. As Chase points out, this is the way: colnames(data)[colnames(data) == "b"] <- "d"
– PatrickT, commented Apr 5, 2014 at 14:15

Josh O'Brien · Accepted Answer · 2012-05-18 18:07:15Z

There is a function setnames built into package data.table for exactly that.

setnames(DT, "b", "d")

It changes the names by reference with no copy at all. Any other method using names(data)<- or names(data)[i]<- or similar will copy the entire object, usually several times, even though all you're doing is changing a column name.

DT must be a data.table for setnames to work, though, so you'd need to switch to data.table, or convert using as.data.table, to use it.
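Applied to the table from the question, that might look like the sketch below. It assumes the data.table package is installed, and the toy values are just the ones from the question. setcolorder() is used only to show that the rename still finds the right column after the columns are shuffled.

```r
library(data.table)

# Toy table from the question, converted to a data.table.
data <- data.frame(a = c(1, 3, 4), b = c(2, 2, 1), c = c(2, 3, 2))
DT <- as.data.table(data)

# Rename "b" to "d" by name. DT is modified in place, so there is
# nothing to assign back.
setnames(DT, "b", "d")
names(DT)   # "a" "d" "c"

# Shuffle the columns, then rename by name again: position is irrelevant.
setcolorder(DT, c("c", "d", "a"))
setnames(DT, "d", "b")
names(DT)   # "c" "b" "a"
```

Note that the original data.frame is left untouched here, because as.data.table() returned a converted copy.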

Here is the extract from ?setnames. The intention is that you run example(setnames) at the prompt and then the comments relate to the copies you see being reported by tracemem.

DF = data.frame(a=1:2,b=3:4)       # base data.frame to demo copies
tracemem(DF)
colnames(DF)[1] <- "A"             # 4 copies of entire object
names(DF)[1] <- "A"                # 3 copies of entire object
names(DF) <- c("A", "b")           # 2 copies of entire object
`names<-`(DF,c("A","b"))           # 1 copy of entire object
x=`names<-`(DF,c("A","b"))         # still 1 copy (so not print method)
# What if DF is large, say 10GB in RAM. Copy 10GB just to change a column name?

DT = data.table(a=1:2,b=3:4,c=5:6)
tracemem(DT)
setnames(DT,"b","B")               # by name; no match() needed. No copy.
setnames(DT,3,"C")                 # by position. No copy.
setnames(DT,2:3,c("D","E"))        # multiple. No copy.
setnames(DT,c("a","E"),c("A","F")) # multiple by name. No copy.
setnames(DT,c("X","Y","Z"))        # replace all. No copy.

But is loading a new package worth all the hassle for the sake of a simple column rename? =)
Absolutely. It can make the difference between running out of memory or not. And it's shorter, easier, and has slightly less chance of bugs.
@Tyler There are two (rather long) threads on r-devel about this, "speeding up perception" and (perhaps most relevant) "confused about NAMED", and probably others.
@Tyler Now, on these benchmarks that show data.table is slower, can you point me to just one please?
@MatthewDowle -- Just added one more tracemem test to your example, just b/c it's kind of hilarious how variable R's behavior is, and b/c I kind of like the countdown of 4, 3, 2, 1, ... data.table.

A.L · Accepted Answer · 2015-01-19 14:46:04Z

As of October 2014 this can now be done easily in the dplyr package:

rename(data, d = b)
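One thing worth spelling out: rename() returns a modified data frame rather than changing data in place, so the result needs to be assigned. A minimal sketch, assuming dplyr is installed and using the toy values from the question:

```r
library(dplyr)

data <- data.frame(a = c(1, 3, 4), b = c(2, 2, 1), c = c(2, 3, 2))

# The syntax is new_name = old_name. The column is looked up by name,
# so its position in the data frame does not matter.
data <- rename(data, d = b)
names(data)   # "a" "d" "c"
```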

answered Jan 19, 2015 at 14:20 by Sam Firke; edited Jan 19, 2015 at 14:46 by A.L

Chase · Accepted Answer · 2012-05-18 16:02:20Z

This seems like a hack, but the first thing that came to mind was to use grepl() with a sufficiently detailed search string to only get the column you want. I'm sure there are better options:

dat <- data.frame(a = 1:3, b = 1:3, c = 1:3)
colnames(dat)[grepl("b", colnames(dat))] <- "foo"
dat
#------
  a foo c
1 1   1 1
2 2   2 2
3 3   3 3

As Joran points out below, I overcomplicated things...no need for a regex at all. This saves a few characters on the typing too.

colnames(dat)[colnames(dat) == "foo"] <- "bar"
#------
  a bar c
1 1   1 1
2 2   2 2
3 3   3 3
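To tie this back to the original question, here is a small base R sketch showing that the == lookup keeps working after the columns have been reordered. The toy values are the ones from the question, and match() is shown only as an alternative way to find the position by name.

```r
data <- data.frame(a = c(1, 3, 4), b = c(2, 2, 1), c = c(2, 3, 2))

# Shuffle the columns so that "b" is no longer in position 2.
data <- data[, c("c", "a", "b")]

# Logical index: TRUE only where the name is exactly "b".
names(data)[names(data) == "b"] <- "d"
names(data)   # "c" "a" "d"

# match() returns the position of the first exact match.
pos <- match("a", names(data))
names(data)[pos] <- "A"
names(data)   # "c" "A" "d"
```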

Or you could simply index the column names using colnames(dat) == 'b', but it's going to be circular no matter what you do.
Don't use regexes for simple stuff like this. I'd rather stick with the simple == relational operator.
I thought at first glance that Chase used agrep, which could have some advantages.
@aL3xa, if you have many similar column prefixes/suffixes to rename, gsub is invaluable. But yeah, for one isolated case it's generally overkill.

Tyler Rinker · Accepted Answer · 2012-05-18 15:34:28Z

Yes but it's more difficult (as far as I know) than numeric indexing. I'm going to provide a dirty function that will do this and if you want to see how to do it just tear the function apart line by line:

rename <- function(df, column, new){
    x <- names(df)                               #Did this to avoid typing twice
    if (is.numeric(column)) column <- x[column]  #Take numeric input by indexing
    names(df)[x %in% column] <- new              #What you're interested in
    return(df)
}

#try it out
rename(mtcars, 'mpg', 'NEW')
rename(mtcars, 1, 'NEW')
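One caveat with the function above: if the name is not found, the logical index is all FALSE and nothing is renamed, with no warning. A stricter variant could look like the sketch below. rename_col is a made-up name for illustration, not something from base R or a package.

```r
# rename_col() is a hypothetical helper, not part of base R or any package.
rename_col <- function(df, old, new) {
  stopifnot(is.character(old), length(old) == 1, length(new) == 1)
  hit <- names(df) == old            # exact match on the column name
  if (!any(hit)) {
    stop("no column named '", old, "'")
  }
  names(df)[hit] <- new
  df                                 # returns a modified copy
}

cars2 <- rename_col(mtcars, "mpg", "NEW")
names(cars2)[1:3]   # "NEW" "cyl" "disp"

# A misspelt name raises an error instead of passing through silently.
res <- tryCatch(
  rename_col(mtcars, "mpgg", "NEW"),
  error = function(e) conditionMessage(e)
)
res   # "no column named 'mpgg'"
```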

aL3xa · Accepted Answer · 2012-05-18 16:06:08Z

I disagree with @Chase - the grepl solution ain't the luckiest one. I'd say: go with simple ==. Here's why:

d <- data.frame(matrix(rnorm(100), 10))
colnames(d) <- replicate(10, paste(sample(letters[1:5], size = 5, replace=TRUE, prob=c(.1, .6, .1, .1, .1)), collapse = ""))

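Now try doing grepl("b", colnames(d)): it matches every name that contains a "b", not just a column named exactly "b". Passing fixed = TRUE skips the regex interpretation but still matches substrings, so either anchor the pattern ("^b$") or, even better, do a simple colnames(d) == "b" like @joran suggested. Regex matching will generally be slower than == as well, so for simple tasks like this you may want to use simple ==.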
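A deterministic version of the same point, using a hand-picked set of names instead of random ones (the names are made up for illustration):

```r
nms <- c("b", "ab", "bcd", "a")

grepl("b", nms)                # TRUE  TRUE  TRUE FALSE  (substring match)
grepl("b", nms, fixed = TRUE)  # TRUE  TRUE  TRUE FALSE  (still substring)
grepl("^b$", nms)              # TRUE FALSE FALSE FALSE  (anchored regex)
nms == "b"                     # TRUE FALSE FALSE FALSE  (exact comparison)
```

Only the last two pick out the single column named exactly "b".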

I think I pointed out in my answer that there are probably better approaches, specifically the part that says "I'm sure there are better options". As Joran pointed out in the comments, directly using == is better, which I recognize and show an example of in my answer now too :) I'll leave the top half for posterity's sake.
This answer is essentially the same as mine in that I use colnames(d) %in% "b". In this case they're doing the same thing, though I suppose the == will be faster.
