Revisions to "Kernel density estimation" is a convolution of what?

displayed rather than inline MathJax

edited Dec 29, 2024 at 20:38


I am trying to get a better understanding of kernel density estimation.

Using the definition from Wikipedia: https://en.wikipedia.org/wiki/Kernel_density_estimation#Definition

$$ \hat{f_h}(x) = \frac{1}{n}\sum_{i=1}^n K_h (x - x_i) \quad = \frac{1}{nh} \sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big) $$

Let's take $K$ to be the rectangular function that equals $1$ when its argument $x$ is between $-0.5$ and $0.5$ and $0$ otherwise, and take $h$ (the window size) to be $1$.

I understand that the density estimate is a convolution of two functions, but I am not sure how to define these two functions. One of them should (probably) be a function of the data which, for every point in $\mathbb{R}$, tells us how many data points we have at that location (mostly $0$). The other should probably be some modification of the kernel function, combined with the window size. But I am not sure how to define it.
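
One candidate pairing, written out as a sketch (the Dirac delta notation $\delta$ and the name $\hat{p}_n$ are my own assumptions, not part of the Wikipedia definition): take the rescaled kernel $K_h(u) = \frac{1}{h}K\big(\frac{u}{h}\big)$ on one side and, on the other, the empirical distribution of the data, which puts mass $1/n$ at each $x_i$. Then

$$ \hat{p}_n(y) = \frac{1}{n}\sum_{i=1}^n \delta(y - x_i), \qquad (K_h * \hat{p}_n)(x) = \int K_h(x - y)\,\hat{p}_n(y)\,dy = \frac{1}{n}\sum_{i=1}^n K_h(x - x_i) = \hat{f_h}(x) $$

If that is right, the data-side object is not an ordinary function but a sum of point masses, which would fit the "mostly $0$" intuition.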

Any suggestions?

Below is example R code which (I suspect) replicates the settings I defined above (with a mixture of two Gaussians and $n=100$), on which I hope to see a "proof" that the functions to be convolved are as we suspect.

    # example code:
    set.seed(2346639)
    x <- c(rnorm(50), rnorm(50,2))
    plot(density(x, kernel='rectangular', width=1, n = 10**4))
    rug(x)
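
As a rough check, here is a sketch that evaluates the Wikipedia formula directly for the rectangular kernel and overlays it on the `density()` curve. The helper names `K` and `kde_direct` are hypothetical (not part of base R), and I am assuming that `width=1` corresponds to $h=1$ for this kernel. As far as I understand from `?density`, that function bins the data and uses an FFT, so the two curves should agree only approximately.

```r
# Sketch: evaluate the KDE formula directly for a rectangular kernel.
# K and kde_direct are hypothetical helper names, not base R functions.
set.seed(2346639)
x <- c(rnorm(50), rnorm(50, 2))
h <- 1  # assumed to match width = 1 in density() for this kernel

# Rectangular kernel: 1 on [-0.5, 0.5], 0 elsewhere (integrates to 1)
K <- function(u) as.numeric(abs(u) <= 0.5)

# f_hat(t) = 1 / (n * h) * sum_i K((t - x_i) / h), evaluated at each t
kde_direct <- function(t, x, h) {
  sapply(t, function(t0) sum(K((t0 - x) / h)) / (length(x) * h))
}

# Evaluation grid wide enough to cover every kernel's support
grid <- seq(min(x) - 1, max(x) + 1, length.out = 2000)
f_direct <- kde_direct(grid, x, h)

# Overlay the direct evaluation (dashed red) on the density() estimate
d <- density(x, kernel = 'rectangular', width = 1, n = 10**4)
plot(d)
lines(grid, f_direct, col = 'red', lty = 2)
rug(x)
```

The direct sum places one rescaled rectangle at each data point and averages them, which is the same operation as the convolution being asked about.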



Notice removed Reward existing answer by Glen_b

occurred Nov 1, 2015 at 20:55

Bounty Ended with whuber's answer chosen by Glen_b

occurred Nov 1, 2015 at 20:55

Notice added Reward existing answer by Glen_b

occurred Oct 27, 2015 at 1:06

Bounty Started worth 50 reputation by Glen_b

occurred Oct 27, 2015 at 1:06

edited tags


edited Jun 1, 2015 at 1:33

Danica


r kernel kernel-smoothing convolution

Tweeted twitter.com/#!/StackStats/status/393128714831417344

occurred Oct 23, 2013 at 21:36

small tweaks to presentation


edited Oct 23, 2013 at 19:43

Nick Cox




asked Oct 23, 2013 at 19:36

Tal Galili

