My current data frame looks like this:
    # Create sample data
    my_df <- data.frame(seq(1, 100),
                        rep(c("ind_1", "", "", ""), times = 25),
                        rep(c("", "ind_2", "", ""), times = 25),
                        rep(c("", "", "ind_3", ""), times = 25),
                        rep(c("", "", "", "ind_4"), times = 25))

    # Rename columns
    names(my_df)[names(my_df)=="seq.1..100."] <- "value"
    names(my_df)[names(my_df)=="rep.c..ind_1................times...25."] <- "ind_1"
    names(my_df)[names(my_df)=="rep.c......ind_2............times...25."] <- "ind_2"
    names(my_df)[names(my_df)=="rep.c..........ind_3........times...25."] <- "ind_3"
    names(my_df)[names(my_df)=="rep.c..............ind_4....times...25."] <- "ind_4"

    # Replace empty elements with NA
    my_df[my_df==''] = NA

What I want to script is a rather simple for loop that calculates the sum of the value column for each of the four ind_* columns and prints the result.
So far my very meagre attempt has been:
    # Create a vector with all individuals
    individuals <- c("ind_1", "ind_2", "ind_3", "ind_4")

    # Calculate aggregates for each individual
    for (i in individuals){
      ind <- 1
      sum_i <- aggregate(value~ind_1, data = my_df, sum)
      print(paste("Individual", i, "possesses an aggregated value of", sum_i$value))
      ind <- ind + 1
    }

As you can see, I am struggling to write the command so that the sum is calculated for one column after another. The current output, naturally, only contains the result for ind_1, because the formula in aggregate is hard-coded to that column. What needs to be changed in the aggregate command to achieve the desired result? I'm a total beginner, but I thought of using indices to proceed from one column to the next.
As a side note, the columns can be renamed in a single call instead of one name at a time:

    colnames(my_df) <- c("value", "ind_1", "ind_2", "ind_3", "ind_4")
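For the loop itself, one possible fix is to build the formula from the loop variable rather than hard-coding ind_1. The following is a minimal sketch, not the only way to do it. It assumes the data frame from the question, rebuilt here with the column names and NA values set directly so the snippet runs on its own. It uses reformulate() from base R's stats package to turn the string in i into the formula value ~ ind_1, value ~ ind_2, and so on. The vector sums is a name introduced here for illustration only.

```r
# Sample data equivalent to the question's, with names and NAs set directly
# (assumption: same values as the original my_df after renaming)
my_df <- data.frame(
  value = seq(1, 100),
  ind_1 = rep(c("ind_1", NA, NA, NA), times = 25),
  ind_2 = rep(c(NA, "ind_2", NA, NA), times = 25),
  ind_3 = rep(c(NA, NA, "ind_3", NA), times = 25),
  ind_4 = rep(c(NA, NA, NA, "ind_4"), times = 25)
)

# Create a vector with all individuals
individuals <- c("ind_1", "ind_2", "ind_3", "ind_4")

# Hypothetical helper vector that collects the results by name
sums <- numeric(0)

for (i in individuals) {
  # Build the formula value ~ <column named in i> from the string
  f <- reformulate(i, response = "value")
  # The formula method of aggregate() drops rows with NA by default
  sum_i <- aggregate(f, data = my_df, FUN = sum)
  sums[i] <- sum_i$value
  print(paste("Individual", i, "possesses an aggregated value of", sum_i$value))
}
```

The counter ind from the original attempt is not needed, since i already holds the column name on each pass. If aggregate is not a requirement, the body of the loop could also be written as sum(my_df$value[!is.na(my_df[[i]])]), which selects the column by name with [[i]].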