
I have a data frame formatted like so:

val1 = c(.35, .36, .35, .22, .27, .25)
val2 = c(.35, .35, .37, .40, .42, .46)
val3 = c(.88, .9, .87, .35, .35, .36)
val4 = c(.9, .91, .82, .36, .36, .36)
df = data.frame(val1, val2, val3, val4)
colnames(df)[1] = "group 1_31"
colnames(df)[2] = "group 1_32"
colnames(df)[3] = "group 2_32"
colnames(df)[4] = "group 10_310"

I know these column names are less than ideal, but unfortunately they are automatically supplied by the program I'm running. I'd like to rename each column, such that group a_bc becomes bca, like so:

colnames(df)[1] = "311"
colnames(df)[2] = "321"
colnames(df)[3] = "322"
colnames(df)[4] = "31010"

I know I can get rid of "group" by doing:

colnames(df)=sub("group ","",colnames(df)) 

but that still leaves me with "1_31", "1_32", etc.

Is there a way to automatically convert a_bc to bca across all column names (I have 55 that need this conversion)?

I've read "Rename Dataframe Column Names in R using Previous Column Name and Regex Pattern", but I think my case is different because I need to reorder the parts of each existing column name, not just cut them off at a specific position.

2 Answers


We can capture the two runs of digits as groups and then rearrange their backreferences in the replacement:

colnames(df) <- sub('group (\\d+)_(\\d+)', "\\2\\1", colnames(df))
colnames(df)
#[1] "311" "321" "322" "31010"

2 Comments

Can you briefly explain what this is doing? As I understand it, you define a first group (\\d+) and a second group (\\d+) after the _, and then the replacement in the sub function switches their positions. What if I use colnames(df3) = sub('group (\\d+)_(\\d+)', "\\3\\1", colnames(df3))? I get c(1,1,2,10). Why? Thank you
@Carlo The pattern matches the literal `group` followed by a space, then captures one or more digits in parentheses (`(\\d+)`), followed by `_`, then the second set of digits in another capture group. In the replacement, we reference the captures in reverse order, i.e. the 2nd followed by the 1st (`\\2\\1`).
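To make the capture groups in the explanation above concrete, here is a minimal sketch using base R's `regexec()` and `regmatches()` to show what each group holds. The variable names `nms`, `pattern`, `parts`, and `swapped` are illustrative and not part of the original answer.

```r
# Column names from the question
nms <- c("group 1_31", "group 1_32", "group 2_32", "group 10_310")
pattern <- "group (\\d+)_(\\d+)"

# For each name, regmatches() returns the full match followed by the
# text captured by each parenthesised group
parts <- regmatches(nms, regexec(pattern, nms))
parts[[4]]
# "group 10_310" "10" "310"  (full match, group 1, group 2)

# In the replacement, \\1 is group 1 and \\2 is group 2;
# writing them as \\2\\1 swaps their order
swapped <- sub(pattern, "\\2\\1", nms)
swapped
# "311" "321" "322" "31010"
```

Since the pattern defines only two groups, `\\3` has nothing to refer to. The result reported in the comment (1, 1, 2, 10) is consistent with `\\3` contributing an empty string, so only `\\1` remains.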

val1 = c(.35, .36, .35, .22, .27, .25)
val2 = c(.35, .35, .37, .40, .42, .46)
val3 = c(.88, .9, .87, .35, .35, .36)
val4 = c(.9, .91, .82, .36, .36, .36)
df = data.frame(val1, val2, val3, val4)
colnames(df)[1] = "group 1_31"
colnames(df)[2] = "group 1_32"
colnames(df)[3] = "group 2_32"
colnames(df)[4] = "group 10_310"
gsub("^group (\\d+)_(\\d+)", "\\2\\1", colnames(df))
#[1] "311" "321" "322" "31010"
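With 55 columns, it may be worth checking that every name actually fits the expected shape before renaming. The following is only a sketch: `rename_groups` is a hypothetical helper (not from either answer), and it assumes every column to be renamed looks exactly like `group <digits>_<digits>`. Names that do not match are left unchanged and reported in a warning.

```r
# Hypothetical helper: swap the two digit runs in names like "group 1_31"
rename_groups <- function(nms) {
  # Anchored at both ends so only names of exactly this shape are rewritten
  pattern <- "^group (\\d+)_(\\d+)$"
  matched <- grepl(pattern, nms)
  if (!all(matched)) {
    warning("Left unchanged: ", paste(nms[!matched], collapse = ", "))
  }
  # sub() returns non-matching names as they are
  sub(pattern, "\\2\\1", nms)
}

# Small example data frame with the question's column names
df <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)
colnames(df) <- c("group 1_31", "group 1_32", "group 2_32", "group 10_310")

colnames(df) <- rename_groups(colnames(df))
colnames(df)
# "311" "321" "322" "31010"
```

Because each name matches at most once, `sub()` and `gsub()` behave the same here; the `^` and `$` anchors only matter if some names carry extra text around the `group a_bc` part.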
