Below is a working example of what I would like the function to do, followed by the function itself, with a comment marking where the error occurs.

The error message is:

Error: index out of bounds 

Which I know usually means R can’t find the variable that’s being called.

Interestingly, if I group only by subgroup_name (which is passed to the function and becomes a column in the newly created data frame), the function groups by that variable successfully. However, I also want to group by variable, a column newly created by melt().

Similar code used to work for me using regroup(), but that has been deprecated. I am trying to use group_by_() but to no avail.

I have read many other posts and answers and experimented for several hours today, but still without success.

    # Initialize example dataset
    database <- ggplot2::diamonds
    database$diamond <- row.names(diamonds) # needed for melting
    subgroup_name <- "cut" # can replace with "color" or "clarity"
    subgroup_column <- 2 # can replace with 3 for color, 4 for clarity

    # This works, although it would be preferable not to need separate
    # variables for subgroup_name and subgroup_column number
    df <- database %>%
      select(diamond, subgroup_column, x,y,z) %>%
      melt(id.vars=c("diamond", subgroup_name)) %>%
      group_by(cut, variable) %>%
      summarise(value = round(mean(value, na.rm = TRUE),2))

    # This does not work, I am expecting the same output as above
    subgroup_analysis <- function(database,...){
      df <- database %>%
        select(diamond, subgroup_column, x,y,z) %>%
        melt(id.vars=c("diamond", subgroup_name)) %>%
        group_by_(subgroup_name, variable) %>% # problem appears to be with finding "variable"
        summarise(value = round(mean(value, na.rm = TRUE),2))
      print(df)
    }

    subgroup_analysis(database, subgroup_column, subgroup_name)

  • @Richard Scriven - I think I am, with the function call in the last line of the code: subgroup_analysis(database, subgroup_column, subgroup_name). However, should/could probably pass "cut" and 2 directly instead of the proxy variables. Let me know if I am missing something, and thanks for looking Commented Jan 27, 2015 at 1:02
  • Yes, sorry I didn't see the final call. Are you sure you want to do this with dots ... instead of named arguments? Commented Jan 27, 2015 at 1:07
  • I would be happy to use named arguments, and I did experiment that way too. Reading about group_by_ led to tinkering with ..., but I am not that experienced with them. Commented Jan 27, 2015 at 1:08
  • Shouldn't it be group_by_(subgroup_name, quote(variable))? Commented Jan 27, 2015 at 1:12
  • As a side note, if you plan to assign the result to a new variable, then you may want to remove the print call at the end and just write df, or not even assign df at all in the function. Otherwise the result prints even when you do x <- subgroup_analysis(...). Commented Jan 27, 2015 at 1:19

1 Answer

From the NSE vignette:

If you also want to output variables to vary, you need to pass a list of quoted objects to the .dots argument:

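Here, group_by_() uses standard evaluation, so a bare variable is looked up as an R object rather than treated as a column name. It should be quoted: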

    subgroup_analysis <- function(database,...){
      df <- database %>%
        select(diamond, subgroup_column, x,y,z) %>%
        melt(id.vars=c("diamond", subgroup_name)) %>%
        group_by_(subgroup_name, quote(variable)) %>%
        summarise(value = round(mean(value, na.rm = TRUE),2))
      print(df)
    }

    subgroup_analysis(database, subgroup_column, subgroup_name)

As mentioned by @RichardScriven, if you plan to assign the result to a new variable, you may want to remove the print call at the end and just write df, or not assign df inside the function at all. Otherwise the result prints even when you do x <- subgroup_analysis(...).
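
For readers on a recent dplyr: group_by_() has itself since been deprecated, so here is a minimal sketch of one way the same function might be written today. It is not part of the original question or answer. It assumes dplyr 1.0.0 or later (for across(), all_of(), and the .groups argument), takes the grouping column as a named string argument instead of ..., and drops the separate subgroup_column index. The as.data.frame() call is only a precaution before melt().

```r
library(dplyr)
library(reshape2)

# Hypothetical rewrite: subgroup_name is a named argument holding a column
# name as a string, so no column index or global variable is needed.
subgroup_analysis <- function(database, subgroup_name) {
  database %>%
    # all_of() selects columns from a character vector of names
    select(all_of(c("diamond", subgroup_name, "x", "y", "z"))) %>%
    melt(id.vars = c("diamond", subgroup_name)) %>%
    # group by the user-chosen column plus the "variable" column made by melt()
    group_by(across(all_of(c(subgroup_name, "variable")))) %>%
    summarise(value = round(mean(value, na.rm = TRUE), 2), .groups = "drop")
}

# Same example data as in the question
database <- as.data.frame(ggplot2::diamonds)
database$diamond <- row.names(database) # needed for melting

result <- subgroup_analysis(database, "cut") # or "color", "clarity"
print(result)
```

Because the function returns the summary instead of printing it, assigning the result with x <- subgroup_analysis(database, "color") produces no console output.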
