I have a question about how to sample a data frame. The dataset has 114 rows and 9 columns. I need to extract 3 subsets of 38 rows each (114 / 3).

I have this script, but it doesn't work for the last subset:

    install.packages("Rcmdr")
    library(Rcmdr)
    ana <- read.delim("~/Desktop/ana", header=TRUE, dec=",")
    set1 <- ana[sample(nrow(ana), 38), ]
    set1.index <- as.numeric(rownames(set1))
    ana2 <- ana[(-set1.index),]
    set2 <- ana2[sample(nrow(ana2), 38), ]
    set2.index <- as.numeric(rownames(set2))
    ana3 <- ana2[(-set2.index),]
    ana3

For set1 and set2 I get the subsets correctly, but for the third subset (ana3) I get 50 rows (or fewer) instead of 38.

(Thank you in advance! =) )

2 Comments

  • Welcome to StackOverflow! Please read about how to provide your data in a reproducible format so that it's easier for us to help you. Supplying your data as an image makes it difficult to quickly work with your example code, so consider constructing a fake dataset and supplying it using dput. Commented Feb 22, 2015 at 21:23
  • It seems that you want each row to be in exactly one of the subsets. Here's an example of how you could do it using the iris data set:

        set_number <- sample(1:3, nrow(iris), replace = TRUE)
        set1 <- iris[set_number == 1, ]
        set2 <- iris[set_number == 2, ]
        set3 <- iris[set_number == 3, ]

    or

        my_sets <- split(iris, set_number)

    Commented Feb 22, 2015 at 21:42

1 Answer

Generally @docendodiscimus gives valid advice, but the sampling code he offers will not guarantee equal numbers in the subsets (see below). Try this instead:

    set.seed(123)   # best to set a seed to allow reproducibility
    sampidx <- sample(rep(1:3, each = 38))   # shuffle a vector holding exactly 38 of each label
    set1 <- ana[sampidx == 1, ]   # logical indexing of dataframe
    set2 <- ana[sampidx == 2, ]
    set3 <- ana[sampidx == 3, ]
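
If the number of rows is not known in advance, or is not an exact multiple of 3, the same idea can be sketched with `rep(..., length.out = )` and `split()`. This is only an illustration: `iris` stands in for `ana`, and the names `df`, `k`, `grp` and `sets` are made up for the example.

```r
# Sketch: balanced random split of a data frame into k groups.
set.seed(123)            # arbitrary seed, only for reproducibility
df <- iris               # stand-in for `ana` (assumption: any data frame works)
k  <- 3                  # number of subsets wanted

# Build a label vector as long as the data, with group sizes that
# differ by at most one, then shuffle it.
grp <- sample(rep(seq_len(k), length.out = nrow(df)))

# split() returns a named list of data frames: sets[["1"]], sets[["2"]], ...
sets <- split(df, grp)

# Number of rows in each subset
sapply(sets, nrow)
```

Every row receives exactly one label, so the subsets cannot overlap and together they cover the whole data frame.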

Lack of equipartition when using sample with replacement:

    > table( sample(1:3, nrow(iris), replace = TRUE) )

     1  2  3 
    52 52 46 
    > table( sample(1:3, nrow(iris), replace = TRUE) )

     1  2  3 
    51 49 50    # notice that it also varies from draw to draw
    > table(sampidx)
    sampidx
     1  2  3 
    38 38 38 
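
As for why the script in the question returns too many rows in the third subset, a likely explanation is that it mixes row names with row positions. After the first removal, `ana2` has 76 rows but keeps its original row names, which come from 1 to 114. `as.numeric(rownames(set2))` therefore yields original row numbers, while `ana2[-set2.index, ]` drops rows by position within `ana2`: values above 76 drop nothing, and values up to 76 may drop a row that is not in `set2`. The result has anywhere between 38 and 76 rows. Below is a minimal sketch of the sequential approach that keeps positions throughout. The data frame is a made-up stand-in for the real file, and `idx1` and `idx2` are names introduced only for this example.

```r
set.seed(1)                                    # arbitrary seed
ana <- data.frame(id = 1:114, x = rnorm(114))  # hypothetical stand-in data

idx1 <- sample(nrow(ana), 38)    # positions within ana
set1 <- ana[idx1, ]
ana2 <- ana[-idx1, ]             # 76 rows; row names are still the original ones

idx2 <- sample(nrow(ana2), 38)   # positions within ana2, not row names
set2 <- ana2[idx2, ]
set3 <- ana2[-idx2, ]            # drop by the same positions: 38 rows remain

# Largest row name in ana2; it can exceed nrow(ana2), which is why
# row names cannot be reused as positions.
max(as.numeric(rownames(ana2)))
```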
1 Comment

Thank you very much, BondedDust, that's exactly what I was looking for! Thank you too, docendodiscimus! =)
