104

Say we have the following data frame:

    > df
      A B C
    1 1 2 3
    2 4 5 6
    3 7 8 9

We can select column 'B' by its index:

    > df[,2]
    [1] 2 5 8

Is there a way to get the index (2) from the column label ('B')?

10 Answers

140

You can get the index via grep and colnames:

    grep("B", colnames(df))
    [1] 2

or use

    grep("^B$", colnames(df))
    [1] 2

to get only the columns named exactly "B", excluding those whose names merely contain a B, e.g. "ABC".
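The difference between the two patterns only shows up when another column name contains the letter. Here is a minimal sketch, assuming a hypothetical data frame `df2` with an extra `ABC` column (not part of the question's data):

```r
# Hypothetical data frame: the 'ABC' column exists only for illustration
df2 <- data.frame(A = 1:3, B = 4:6, ABC = 7:9)

# Unanchored pattern: matches every name that contains "B"
idx_any <- grep("B", colnames(df2))

# Anchored pattern: matches only the name that is exactly "B"
idx_exact <- grep("^B$", colnames(df2))

print(idx_any)    # 2 3
print(idx_exact)  # 2
```

If a single index is expected, the anchored form is probably the safer default.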

4 Comments

Your original example's advantages could be demonstrated in code if you showed its use in something like df[ , grep("^B", colnames(df)) ], i.e, returning the dataframe columns starting with "B". Feel free to use in a further edit if you agree.
Or even df[ , grep("^[BC]", colnames(df)) ], i.e., the columns that start with either B or C.
@Dwin: As @aix already said, the asker wants the index. But I also usually use grep the way you describe it.
@Henrik. Thank you so much. This must be the single most useful command to work with dplyr and variables!
110

The following will do it:

which(colnames(df)=="B") 
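As a quick check on the question's data, the comparison is an exact string match, so no regular expression is involved. The lookup of a non-existent column `"Z"` is added here only to illustrate the empty result:

```r
# Data frame from the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

idx_b <- which(colnames(df) == "B")        # exact comparison, no regex
idx_missing <- which(colnames(df) == "Z")  # "Z" is not a column (illustration only)

print(idx_b)                # 2
print(length(idx_missing))  # 0, i.e. an empty integer vector
print(df[, idx_b])          # 2 5 8
```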

6 Comments

The problem with grep is also the advantage, namely that it uses regular expressions (so you can search for any pattern in your colnames). To just get the colnames "B" use "^B$" as the pattern in grep. ^ is the metacharacter for the beginning and $ for the end of a string.
You don't even need which. You can directly use df[names(df)=="B"]
@nico The question is to get the index of the column.
"Which" worked for me in every case. I couldn't get a column with the name "fBodyAcc-meanFreq()-Z" using grep.
@Kabamaru: Grep will work as long as you escape the metacharacters. For the example you gave, this will work: grep("^fBodyAcc-meanFreq\\()-Z$",colnames(df)) or also grep("^fBodyAcc-meanFreq\\(\\)-Z$",colnames(df)).
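For names full of metacharacters, one alternative to escaping is to switch off regex interpretation with grep's `fixed = TRUE` argument. This is a sketch on a hypothetical two-column data frame `df3`; note that a fixed pattern still matches substrings, so `==` remains the strict exact match:

```r
# Hypothetical data frame; assigning colnames keeps the special characters as-is
df3 <- data.frame(x = 1:3, y = 4:6)
colnames(df3) <- c("id", "fBodyAcc-meanFreq()-Z")

target <- "fBodyAcc-meanFreq()-Z"

idx_eq <- which(colnames(df3) == target)                # exact match
idx_fixed <- grep(target, colnames(df3), fixed = TRUE)  # literal substring match

print(idx_eq)     # 2
print(idx_fixed)  # 2
```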
9

I wanted to see all the indices for the colnames because I needed to do a complicated column rearrangement, so I printed the colnames as a dataframe. The rownames are the indices.

    as.data.frame(colnames(df))
    1 A
    2 B
    3 C

2 Comments

A more concise way to do this is cbind(names(df)).
@lillemets if brevity is your goal, t(t(names(df))) saves you 2 characters ;)
8

Following on from chimeric's answer above:

To get all the column indices in the df, I used:

which(!names(df)%in%c()) 

or store them in a variable (the result is an integer vector, not a list):

indexLst<-which(!names(df)%in%c()) 
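For comparison, base R's `seq_along()` appears to give the same result more directly, and `setNames()` can turn it into a name-to-index lookup. This is a sketch on the question's data; the variable names are made up for illustration:

```r
# Data frame from the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

idx_which <- which(!names(df) %in% c())  # approach from the answer
idx_seq <- seq_along(df)                 # one index per column

# Named lookup vector: column name -> column index
idx_by_name <- setNames(seq_along(df), names(df))

print(idx_which)         # 1 2 3
print(idx_seq)           # 1 2 3
print(idx_by_name["B"])  # the value 2, labelled "B"
```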

1 Comment

I think this is the best answer because it can be generalized.
4

This seems to be an efficient way to list vars with column number:

cbind(names(df)) 

Output:

         [,1]
    [1,] "A"
    [2,] "B"
    [3,] "C"

Sometimes I like to copy variables with position into my code so I use this function:

    varnums <- function(x) {
      w = as.data.frame(c(1:length(colnames(x))), paste0('# ', colnames(x)))
      names(w) = c("# Var/Pos")
      w
    }
    varnums(df)

Output:

        # Var/Pos
    # A         1
    # B         2
    # C         3

2
match("B", names(df)) 

This also works if you have a vector of names.
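A short sketch of the vector case on the question's data. The name `"Z"` is not a real column and is included only to show that `match` returns `NA` for names it cannot find:

```r
# Data frame from the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

# One index per requested name, in the order requested; "Z" is absent
idx <- match(c("C", "A", "Z"), names(df))

print(idx)  # 3 1 NA
```

Checking the result with `anyNA(idx)` before subsetting is one way to catch misspelled names early.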

2

To generalize @NPE's answer slightly:

which(colnames(dat) %in% var) 

where var is of the form

c("colname1","colname2",...,"colnamen") 

returns the indices of whichever column names one needs.
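One thing worth knowing is that the indices come back in the data frame's column order, not in the order of `var`. Here is a sketch with an assumed three-column `dat`, with `match` shown for contrast:

```r
# Assumed example data; 'dat' and 'var' follow the naming used above
dat <- data.frame(A = 1:3, B = 4:6, C = 7:9)
var <- c("C", "A")

idx_in <- which(colnames(dat) %in% var)  # ordered as in the data frame
idx_match <- match(var, colnames(dat))   # ordered as in 'var'

print(idx_in)     # 1 3
print(idx_match)  # 3 1
```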

0

Use the t function:

    t(colnames(df))
         [,1]   [,2]   [,3]   [,4]   [,5]   [,6]
    [1,] "var1" "var2" "var3" "var4" "var5" "var6"

0

Here is an answer that will generalize Henrik's answer.

    df = data.frame(A=rnorm(100), B=rnorm(100), C=rnorm(100))
    numeric_columns <- c('A', 'B', 'C')
    numeric_index <- sapply(1:length(numeric_columns), function(i) grep(numeric_columns[i], colnames(df)))

2 Comments

That sapply is a long way to write match(numeric_columns, names(df)) --- unless you really need the regex power rather than exact string matching.
Thanks @GregorThomas... not super familiar with match. In this case it is quite a bit shorter, but I like the sapply because it's a little more explicit about what is going on... to each their own, I guess (haven't benchmarked any performance differences).
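The practical difference between the two approaches shows up when one column name is a substring of another. Here is a sketch with a hypothetical data frame `df4` containing columns `A`, `B` and `AA`:

```r
# Hypothetical data frame where "A" is also a substring of "AA"
df4 <- data.frame(A = 1:3, B = 4:6, AA = 7:9)
wanted <- c("A", "B")

# Regex matching: "A" hits two columns, so sapply cannot simplify to a vector
idx_grep <- sapply(wanted, function(nm) grep(nm, colnames(df4)))

# Exact matching: always one index per requested name
idx_match <- match(wanted, colnames(df4))

print(is.list(idx_grep))  # TRUE
print(idx_grep[["A"]])    # 1 3
print(idx_match)          # 1 2
```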
0

#I wanted the column index instead of the column name. This line of code worked for me:

    which (data.frame (colnames (datE)) == colnames (datE[c(1:15)]), arr.ind = T)[,1]

    #with datE being a regular dataframe with 15 columns (variables)
    data.frame(colnames(datE))
    #>    colnames.datE.
    #> 1              Ce
    #> 2              Eu
    #> 3              La
    #> 4              Pr
    #> 5              Nd
    #> 6              Sm
    #> 7              Gd
    #> 8              Tb
    #> 9              Dy
    #> 10             Ho
    #> 11             Er
    #> 12              Y
    #> 13             Tm
    #> 14             Yb
    #> 15             Lu

    which(data.frame(colnames(datE))==colnames(datE[c(1:15)]),arr.ind=T)[,1]
    #>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
