Random Sample of rows from an R dataset [duplicate]

Question

Suppose I have a dataset of dimension 90,000 x 17 (n x p), where n is the number of observations and p is the number of variables. I would like to take a random sample of 20% of the rows from the whole dataset. How can this be done in R?

After taking the random sample, I will perform a cluster analysis on it.

I tried the answers to other questions, but they were inconclusive and did not give me what I needed.

Remember to fix the seed with set.seed(1492) (or any other number) so that your sample is reproducible!
– LocoGris, Commented Mar 5, 2019 at 14:35
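
To illustrate the comment about seeding, here is a minimal base R sketch. It assumes the built-in iris data as a stand-in for the 90,000 x 17 dataset; the names n_rows, sample_size, idx_first and idx_second are made up for this example. Resetting the same seed before each draw should give the same row indices.

```r
# iris (150 x 5) stands in for the real dataset; this is an assumption.
n_rows <- nrow(iris)
sample_size <- round(0.2 * n_rows)   # 20% of the rows

set.seed(1492)                       # any fixed integer works
idx_first <- sample(n_rows, sample_size)   # row indices, without replacement

set.seed(1492)                       # reset to the same seed
idx_second <- sample(n_rows, sample_size)

# Same seed, same draw: the two index vectors match.
print(identical(idx_first, idx_second))

# Subset the data frame by the sampled row indices.
sample20 <- iris[idx_first, ]
```

Without the second set.seed call, the two draws would generally differ, which is why the seed is usually set once at the top of a script.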

NelsonGon · Accepted Answer · 2019-03-05 14:38:04Z

6

You can do it with sample_frac from dplyr. Here is an example with the iris dataset:

    library(dplyr)
    # data(iris)
    sample20 <- iris %>% sample_frac(0.2)
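
As a possible variation, the sketch below fixes a seed before sampling and uses slice_sample, which the dplyr documentation lists as the successor to sample_frac from version 1.0.0 onward. It assumes dplyr 1.0.0 or later is installed; the seed value and the iris data are only for illustration.

```r
library(dplyr)

# Fix the seed so the same rows are drawn on every run.
set.seed(1492)

# Draw 20% of the rows without replacement (the default).
sample20 <- iris %>% slice_sample(prop = 0.2)

# Check the size of the result against the full data.
print(nrow(sample20))
print(nrow(iris))
```

On an older dplyr, sample_frac(0.2) after the same set.seed call should serve the same purpose.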

edited Mar 5, 2019 at 14:38

NelsonGon


answered Mar 5, 2019 at 14:35

Derek Corcoran

