This is a follow-up question. I want to know how I can dynamically intercalate (interleave) the columns of the larger, merged data set.

  • Rationale: I used a for-loop to import 16 dataframes and then merged them like this:

    ### Merge all dataframes (ps: I got this code here on SO :)
    mergefun <- function(x, y) merge(x, y, by = "ID", all = TRUE)
    merged_DF <- Reduce(mergefun, dataList)

Each dataframe has an "ID" column (identical across all of them), but the remaining column names differ (I created them based on the answer to my earlier post). Hence,

  • I have, in total (the head() of each dataframe):
    ID NARR_G1_50_AAA NARR_G1_50_AAC NARR_G1_50_AC NARR_G1_50_AB
    ID NARR_G1_100_AAA NARR_G1_100_AAC NARR_G1_100_AC NARR_G1_100_AB
    ID NARR_G1_150_AAA NARR_G1_150_AAC NARR_G1_150_AC NARR_G1_150_AB
    ID NARR_G1_200_AAA NARR_G1_200_AAC NARR_G1_200_AC NARR_G1_200_AB
    ID NARR_G2_50_AAA NARR_G2_50_AAC NARR_G2_50_AC NARR_G2_50_AB
    ID NARR_G2_100_AAA NARR_G2_100_AAC NARR_G2_100_AC NARR_G2_100_AB
    ID NARR_G2_150_AAA NARR_G2_150_AAC NARR_G2_150_AC NARR_G2_150_AB
    ID NARR_G2_200_AAA NARR_G2_200_AAC NARR_G2_200_AC NARR_G2_200_AB
    ID ARG_G1_50_AAA ARG_G1_50_AAC ARG_G1_50_AC ARG_G1_50_AB
    ID ARG_G1_100_AAA ARG_G1_100_AAC ARG_G1_100_AC ARG_G1_100_AB
    ID ARG_G1_150_AAA ARG_G1_150_AAC ARG_G1_150_AC ARG_G1_150_AB
    ID ARG_G1_200_AAA ARG_G1_200_AAC ARG_G1_200_AC ARG_G1_200_AB
    ID ARG_G2_50_AAA ARG_G2_50_AAC ARG_G2_50_AC ARG_G2_50_AB
    ID ARG_G2_100_AAA ARG_G2_100_AAC ARG_G2_100_AC ARG_G2_100_AB
    ID ARG_G2_150_AAA ARG_G2_150_AAC ARG_G2_150_AC ARG_G2_150_AB
    ID ARG_G2_200_AAA ARG_G2_200_AAC ARG_G2_200_AC ARG_G2_200_AB

I need to arrange the columns of the joined dataframe in these two orders:

SET 1 :

###Desired output 1: NARR_G1_50_AAA, NARR_G2_50_AAA, NARR_G1_50_AAC, NARR_G2_50_AAC, NARR_G1_50_AC, NARR_G2_50_AC, NARR_G1_50_AB, NARR_G2_50_AB, ARG_G1_50_AAA, ARG_G2_50_AAA, ARG_G1_50_AAC, ARG_G2_50_AAC, ARG_G1_50_AC, ARG_G2_50_AC, ARG_G1_50_AB, ARG_G2_50_AB........then with 100,150 and 200 

SET 2 :

###Desired output 2: NARR_G1_50_AAA, ARG_G1_50_AAA, NARR_G2_50_AAA, ARG_G2_50_AAA, NARR_G1_50_AAC, ARG_G1_50_AAC, NARR_G2_50_AAC, ARG_G2_50_AAC, NARR_G1_50_AC, ARG_G1_50_AC, NARR_G2_50_AC, ARG_G2_50_AC, NARR_G1_50_AB, ARG_G1_50_AB, NARR_G2_50_AB, ARG_G2_50_AB,........then with 100,150 and 200 
  • I've tried many things, but I can't get the desired orders. The closest I got was this:

    dfPaired <- merged_DF %>%   ### still doesn't produce the desired output
      # dplyr::select(sort(names(.))) %>%
      dplyr::select(order(gsub("G1", "G2", names(.))))

Question:

  • How can I get the desired orders (set 1 and set 2) without manually intercalating the columns in select() ?

  • Further notes:

SET 1:

I need to intercalate (in increasing order: 50, then 100, then 150, then 200) "G1" and "G2" within each variable, e.g. NARR_G1_50_AAA, NARR_G2_50_AAA... There are 4 variables per number (AAA, AAC, AC and AB).

SET 2:

I need to intercalate (in increasing order: 50, then 100, then 150, then 200) "NARR" and "ARG" within each of G1 and G2, e.g. NARR_G1_50_AAA, ARG_G1_50_AAA, NARR_G2_50_AAA, ARG_G2_50_AAA... Thanks in advance :)

1 Answer

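If a custom order is needed, one option is to split the column names at _, convert the non-numeric parts to factors whose levels are given in the desired order, and then order the names by those parts. Ordering by the number (part 3), then NARR/ARG (part 1), then the suffix (part 4), and finally G1/G2 (part 2) gives SET 1: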

    # desired level order for each non-numeric part of the name
    lvls1 <- c("NARR", "ARG")
    lvls2 <- c("G1", "G2")
    lvls3 <- c("AAA", "AAC", "AC", "AB")
    # v1 <- names(merged_DF)[-1]  # assuming 'ID' is the first column
    # split each name at "_" into four columns (V1..V4); V3 is read as numeric
    d1 <- read.table(text = v1, header = FALSE, sep = "_")
    i1 <- !sapply(d1, is.numeric)
    d1[i1] <- Map(factor, d1[i1], levels = list(lvls1, lvls2, lvls3))
    # sort by number, then NARR/ARG, then suffix, then G1/G2
    v2 <- v1[do.call(order, d1[c(3, 1, 4, 2)])]
    library(dplyr)
    merged_DF %>%
      select(ID, all_of(v2))

where v2 is

    > v2
     [1] "NARR_G1_50_AAA" "NARR_G2_50_AAA" "NARR_G1_50_AAC" "NARR_G2_50_AAC" "NARR_G1_50_AC" "NARR_G2_50_AC" "NARR_G1_50_AB" "NARR_G2_50_AB"
     [9] "ARG_G1_50_AAA" "ARG_G2_50_AAA" "ARG_G1_50_AAC" "ARG_G2_50_AAC" "ARG_G1_50_AC" "ARG_G2_50_AC" "ARG_G1_50_AB" "ARG_G2_50_AB"
    [17] "NARR_G1_100_AAA" "NARR_G2_100_AAA" "NARR_G1_100_AAC" "NARR_G2_100_AAC" "NARR_G1_100_AC" "NARR_G2_100_AC" "NARR_G1_100_AB" "NARR_G2_100_AB"
    [25] "ARG_G1_100_AAA" "ARG_G2_100_AAA" "ARG_G1_100_AAC" "ARG_G2_100_AAC" "ARG_G1_100_AC" "ARG_G2_100_AC" "ARG_G1_100_AB" "ARG_G2_100_AB"
    [33] "NARR_G1_150_AAA" "NARR_G2_150_AAA" "NARR_G1_150_AAC" "NARR_G2_150_AAC" "NARR_G1_150_AC" "NARR_G2_150_AC" "NARR_G1_150_AB" "NARR_G2_150_AB"
    [41] "ARG_G1_150_AAA" "ARG_G2_150_AAA" "ARG_G1_150_AAC" "ARG_G2_150_AAC" "ARG_G1_150_AC" "ARG_G2_150_AC" "ARG_G1_150_AB" "ARG_G2_150_AB"
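The question also asks for SET 2, which the code above does not cover. A minimal sketch of one way to get it, assuming the same name pattern: order by the number, then the suffix, then G1/G2, and let NARR/ARG vary fastest. The `parts` grid and the column names `type`, `group`, `size` and `code` are illustrative stand-ins, not part of the original code; with real data, `v1` would come from `names(merged_DF)[-1]`.

```r
# Hypothetical stand-in for names(merged_DF)[-1]: all 64 name combinations,
# shuffled so that the ordering step has something to do
parts <- expand.grid(
  type  = c("NARR", "ARG"),
  group = c("G1", "G2"),
  size  = c(50, 100, 150, 200),
  code  = c("AAA", "AAC", "AC", "AB"),
  stringsAsFactors = FALSE
)
v1 <- sample(with(parts, paste(type, group, size, code, sep = "_")))

# Split the names at "_" and give the pieces readable (assumed) column names
d1 <- read.table(text = v1, header = FALSE, sep = "_",
                 col.names = c("type", "group", "size", "code"),
                 stringsAsFactors = FALSE)

# Factor levels encode the custom order of each non-numeric piece
d1$type  <- factor(d1$type,  levels = c("NARR", "ARG"))
d1$group <- factor(d1$group, levels = c("G1", "G2"))
d1$code  <- factor(d1$code,  levels = c("AAA", "AAC", "AC", "AB"))

# SET 2: number first, then suffix, then G1/G2, with NARR/ARG alternating
v2_set2 <- v1[order(d1$size, d1$code, d1$group, d1$type)]
head(v2_set2, 4)
```

The resulting vector can be passed to `select(ID, all_of(v2_set2))` in the same way as `v2`. Only the order of the sort keys differs from SET 1.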

data

    # it is a random order of the column names which is ordered in the code
    v1 <- c(
      "NARR_G1_100_AB", "NARR_G1_150_AAC", "NARR_G2_50_AB", "NARR_G1_150_AB",
      "NARR_G2_100_AAA", "NARR_G1_100_AAC", "ARG_G1_150_AC", "ARG_G2_50_AAA",
      "ARG_G2_150_AAA", "ARG_G1_50_AAA", "ARG_G2_100_AC", "NARR_G1_150_AAA",
      "NARR_G2_100_AC", "ARG_G1_50_AC", "NARR_G1_100_AAA", "ARG_G2_50_AB",
      "NARR_G1_150_AC", "ARG_G2_50_AAC", "ARG_G2_150_AB", "NARR_G2_100_AAC",
      "NARR_G2_150_AAA", "NARR_G1_100_AC", "ARG_G1_150_AB", "ARG_G1_50_AAC",
      "NARR_G1_50_AC", "ARG_G2_150_AAC", "NARR_G1_50_AAA", "NARR_G2_150_AB",
      "NARR_G2_150_AAC", "ARG_G1_150_AAA", "ARG_G2_50_AC", "NARR_G2_50_AC",
      "ARG_G1_150_AAC", "ARG_G1_100_AC", "ARG_G1_100_AAA", "NARR_G1_50_AAC",
      "NARR_G2_150_AC", "ARG_G1_100_AAC", "ARG_G2_100_AAA", "ARG_G2_100_AAC",
      "NARR_G1_50_AB", "NARR_G2_100_AB", "ARG_G2_100_AB", "ARG_G1_50_AB",
      "NARR_G2_50_AAA", "ARG_G1_100_AB", "ARG_G2_150_AC", "NARR_G2_50_AAC"
    )
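To see the reordering applied to an actual data frame rather than to a vector of names, here is a small base R sketch. `toy_DF` is a made-up stand-in for `merged_DF` with invented values and only four of the columns, and plain `[` indexing is used in place of `select()`, so no package is needed.

```r
# Toy stand-in for merged_DF (hypothetical values, columns deliberately shuffled)
toy_DF <- data.frame(
  ID = 1:3,
  NARR_G2_50_AAC = 4:6,
  NARR_G1_50_AAA = 7:9,
  NARR_G2_50_AAA = 10:12,
  NARR_G1_50_AAC = 13:15
)

v1 <- names(toy_DF)[-1]   # every column except 'ID'

# Split at "_": V1 = NARR/ARG, V2 = G1/G2, V3 = number, V4 = suffix
d1 <- read.table(text = v1, header = FALSE, sep = "_")
d1$V2 <- factor(d1$V2, levels = c("G1", "G2"))
d1$V4 <- factor(d1$V4, levels = c("AAA", "AAC", "AC", "AB"))

# Only one prefix here, so sort by number, then suffix, then G1 before G2
v2 <- v1[order(d1$V3, d1$V4, d1$V2)]

# Base R equivalent of select(ID, all_of(v2))
reordered <- toy_DF[c("ID", v2)]
names(reordered)
```

Indexing with a character vector fails with an error if any name is missing, which is the same safeguard `all_of()` provides in the dplyr version.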