Systematically rename column names using pre-existing name in R

Question

I have a data frame formatted like so:

val1 = c(.35, .36, .35, .22, .27, .25)
val2 = c(.35, .35, .37, .40, .42, .46)
val3 = c(.88, .9, .87, .35, .35, .36)
val4 = c(.9, .91, .82, .36, .36, .36)
df = data.frame(val1, val2, val3, val4)
colnames(df)[1] = "group 1_31"
colnames(df)[2] = "group 1_32"
colnames(df)[3] = "group 2_32"
colnames(df)[4] = "group 10_310"

I know these column names are less than ideal, but unfortunately they are automatically supplied by the program I'm running. I'd like to rename each column, such that group a_bc becomes bca, like so:

colnames(df)[1] = "311"
colnames(df)[2] = "321"
colnames(df)[3] = "322"
colnames(df)[4] = "31010"

I know I can get rid of "group" by doing:

colnames(df)=sub("group ","",colnames(df))

but that still leaves me with "1_31", "1_32", etc.

Is there a way to automatically convert a_bc to bca across all column names (I have 55 that need this conversion)?

I've read Rename Dataframe Column Names in R using Previous Column Name and Regex Pattern but I think my case is different because I need to reorder the existing column name, not just cut them off at a specific position.

akrun · Accepted Answer · 2020-02-12 18:04:45Z

3

We can capture the two sets of digits as groups and then rearrange the backreferences in the replacement.

colnames(df) <- sub('group (\\d+)_(\\d+)', "\\2\\1", colnames(df))
colnames(df)
#[1] "311"   "321"   "322"   "31010"



2 Comments

Carlo Over a year ago

Can you briefly explain what this is doing? It seems to me that you define a group with a first element (\\d+) and a second element after the _ (\\d+). Then, in the replacement argument of the sub function, you are basically saying to switch the positions. What if I put colnames(df3) = sub('group (\\d+)_(\\d+)', "\\3\\1", colnames(df3))? I obtain c(1,1,2,10). Why? Thank you

akrun Over a year ago

@Carlo the pattern matched is: after the literal "group" and a space, capture one or more digits inside brackets ((\\d+)), followed by _, then a second set of digits in another capture group. In the replacement, we switch the backreferences into reverse order, i.e. 2nd followed by 1st (\\2\\1).
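To make the exchange above concrete, here is a minimal sketch that pulls out each capture group on its own. The vector `nms` is a hypothetical stand-in for `colnames(df)`, and the variable names are illustrative only. The last call reuses Carlo's `"\\3\\1"` replacement: the pattern defines only two groups, so there is no third group to insert, which would explain why Carlo sees only the first group's digits.

```r
# Hypothetical stand-in for colnames(df) from the question
nms <- c("group 1_31", "group 1_32", "group 2_32", "group 10_310")
pattern <- "group (\\d+)_(\\d+)"

# \\1 is the first capture group: the digits before the underscore
first <- sub(pattern, "\\1", nms)
first
# [1] "1"  "1"  "2"  "10"

# \\2 is the second capture group: the digits after the underscore
second <- sub(pattern, "\\2", nms)
second
# [1] "31"  "32"  "32"  "310"

# \\2\\1 writes the second group first, then the first group
swapped <- sub(pattern, "\\2\\1", nms)
swapped
# [1] "311"   "321"   "322"   "31010"

# Carlo's variant: the pattern has no third group, so only \\1 contributes
# (Carlo reports c(1, 1, 2, 10) for this call)
sub(pattern, "\\3\\1", nms)
```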

alan ocallaghan · Accepted Answer · 2020-02-12 18:05:25Z

val1 = c(.35, .36, .35, .22, .27, .25)
val2 = c(.35, .35, .37, .40, .42, .46)
val3 = c(.88, .9, .87, .35, .35, .36)
val4 = c(.9, .91, .82, .36, .36, .36)
df = data.frame(val1, val2, val3, val4)
colnames(df)[1] = "group 1_31"
colnames(df)[2] = "group 1_32"
colnames(df)[3] = "group 2_32"
colnames(df)[4] = "group 10_310"
gsub("^group (\\d+)_(\\d+)", "\\2\\1", colnames(df))
# [1] "311"   "321"   "322"   "31010"
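Note that the `gsub()` call above only prints the converted names; it does not store them. A minimal sketch of writing them back follows, assuming a smaller data frame with an extra `id` column (hypothetical, not in the question) to show that names not matching the pattern are left untouched. The trailing `$` anchor is also an addition here, so that only names consisting entirely of `group a_bc` are rewritten.

```r
# Small illustrative data frame; the "id" column is an assumption added
# to show what happens to names that do not match the pattern
df <- data.frame(id = 1:2, x = c(0.35, 0.36), y = c(0.88, 0.90))
colnames(df) <- c("id", "group 1_31", "group 10_310")

# Assign the result back so the data frame actually gets the new names
colnames(df) <- sub("^group (\\d+)_(\\d+)$", "\\2\\1", colnames(df))
colnames(df)
# [1] "id"    "311"   "31010"
```

Because the pattern is anchored with `^`, it can match at most once per name, so `sub()` and `gsub()` give the same result here.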
