Identical values generated from random samples from a uniform distribution in dplyr

Question

This is a follow-up to a previous question. My question was not fully formulated and therefore not fully answered in my last post. Forgive me, I'm new to using Stack Overflow.

My professor has assigned a problem set, and we are required to use dplyr and other tidyverse packages. I'm very aware that most (if not all) of the tasks that I'm trying to execute are possible in base R, but that's not in agreement with my instructions.

First we are asked to generate a tibble of 1000 random samples from a uniform distribution:

2a. Create a new tibble called uniformDf containing a variable called unifSamples that contains 10000 random samples from a uniform distribution. You should use the runif() function to create the uniform samples.

    # {r 2a}
    uniformDf <- tibble(unifSamples = runif(1000))

This goes well.

Then we are asked to loop through this tibble 1000 times, each time choosing 20 random samples, computing their mean, and saving it to a tibble:

2c. Now let's loop through 1000 times, sampling 20 values from a uniform distribution and computing the mean of the sample, saving this mean to a variable called sampMean within a tibble called uniformSampleMeans.

    # {r 2c}
    unif_sample_size = 20 # sample size
    n_samples = 1000 # number of samples

    # set up a data frame to contain the results
    uniformSampleMeans <- tibble(sampMean=rep(NA,n_samples))

    # loop through all samples. for each one, take a new random sample,
    # compute the mean, and store it in the data frame
    for (i in 1:n_samples){
      uniformSampleMeans$sampMean[i] <- uniformDf %>%
        sample_n(unif_sample_size) %>%
        summarize(sampMean = mean(sampMean))
    }

This all runs well, I believe, until I look at my uniformSampleMeans tibble, which looks like this:

       1 0.471271611726843
       2 0.471271611726843
       3 0.471271611726843
       4 0.471271611726843
       5 0.471271611726843
       6 0.471271611726843
       7 0.471271611726843
     ...
    1000 0.471271611726843

All the values are identical! Does anyone have any insight as to why my output is like this? I'd be less concerned if they varied by +/- 0.000x values seeing as how this is from a distribution that ranges from 0 to 1 but the values are all identical even out to the 15th decimal place! Any help is much appreciated!

You have sampMean = mean(sampMean). You haven't shown where you create the sampMean object, but it looks like a fixed value that's produced outside the for loop. It should probably be sampMean = mean(unifSamples).
– Marius, Commented Feb 20, 2020 at 5:10
YES thank you! I now realize how dumb of a mistake this was.
– Salma Abdel-Raheem, Commented Feb 20, 2020 at 5:13
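
Applying the fix from the comment, the loop from the question could look roughly like the sketch below. The pull() step is an addition that is not in the original code: it is used here so that a plain number, rather than the one-row tibble returned by summarize(), is stored in each slot. The seed value is arbitrary and only there to make the sketch reproducible.

```r
library(dplyr)
library(tibble)

set.seed(1)  # arbitrary seed (an assumption, not part of the assignment)

uniformDf <- tibble(unifSamples = runif(1000))

unif_sample_size <- 20   # sample size
n_samples <- 1000        # number of samples

# pre-allocate a numeric column to hold the results
uniformSampleMeans <- tibble(sampMean = rep(NA_real_, n_samples))

for (i in seq_len(n_samples)) {
  uniformSampleMeans$sampMean[i] <- uniformDf %>%
    sample_n(unif_sample_size) %>%
    summarize(sampMean = mean(unifSamples)) %>%  # average the column in uniformDf
    pull(sampMean)                               # extract the single number
}
```

Because each iteration now averages a freshly drawn set of rows from unifSamples, the stored means should differ from one another instead of repeating a single value.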

Ronak Shah · Accepted Answer · 2020-02-20 05:19:59Z

The following selects unif_sample_size random rows and returns their mean:

    library(dplyr)

    uniformDf %>%
      sample_n(unif_sample_size) %>%
      pull(unifSamples) %>%
      mean
    #[1] 0.5563638

If you want to do this n times, use replicate:

    n <- 10
    replicate(n, uniformDf %>%
                   sample_n(unif_sample_size) %>%
                   pull(unifSamples) %>%
                   mean)
    #[1] 0.5070833 0.5259541 0.5617969 0.4695862 0.5030998 0.5745950 0.4688153 0.4914363 0.4449804 0.5202964
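
To end up with the tibble the assignment asks for, the vector returned by replicate() can be wrapped in tibble(). This is a minimal sketch, assuming uniformDf is built as in the question; the names uniformSampleMeans and sampMean are taken from the assignment text, and the seed is an arbitrary choice for reproducibility.

```r
library(dplyr)

set.seed(42)  # arbitrary seed (an assumption, not in the original post)

uniformDf <- tibble(unifSamples = runif(1000))
unif_sample_size <- 20
n_samples <- 1000

# replicate() re-evaluates the pipeline n_samples times and
# returns a numeric vector with one mean per repetition
uniformSampleMeans <- tibble(
  sampMean = replicate(
    n_samples,
    uniformDf %>%
      sample_n(unif_sample_size) %>%
      pull(unifSamples) %>%
      mean()
  )
)
```

Running summary(uniformSampleMeans$sampMean) afterwards is a quick way to confirm that the means vary; for samples drawn from runif() they should sit close to 0.5.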
