Many kernels are available for a univariate KDE. R uses a Gaussian kernel by default, but the efficiency discussion seems to support the use of the Epanechnikov kernel. What should influence the choice of kernel for univariate exploratory analysis?
- Since you're doing EDA, one thought is to use a range of kernels and look at the results. In most applications you will find the choice of kernel makes little difference; the bandwidth is more important by far and usually is worth some exploration and visual fine-tuning. The largest qualitative difference among kernel shapes is between those that are discontinuous and those that are highly differentiable. (Discontinuous--uniform--kernels actually are routinely used in 2D analyses, despite the discontinuous effects they produce.) — whuber ♦, Oct 15, 2014
- @whuber Could you provide some examples of discontinuous kernels in EDA? I remember seeing the Epanechnikov one, but there were so many data points that it looked smooth anyway. — Simon Kuang, Oct 15, 2014
2 Answers
This is not really a data visualization question. The information is fairly readily available online; e.g., http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV0405/MISHRA/kde.html mentions using AMISE to select the bandwidth, and the same approach could be used to select a kernel. But for EDA, you would want to work as is recommended for histograms: re-plot with different bandwidths to learn different things about the data. Sometimes a different kernel may be helpful, but the normal kernel is generally useful, and I think the bandwidth matters more than the actual kernel.
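The "re-plot with different bandwidths and kernels" advice can be sketched with a minimal pure-Python KDE (the data values and evaluation point here are made up for illustration; any real library such as R's `density()` would be used in practice):

```python
import math

def gaussian(u):
    # Standard normal kernel
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def epanechnikov(u):
    # Parabolic kernel, zero outside [-1, 1]
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def kde(data, x, h, kernel):
    # Kernel density estimate at point x with bandwidth h:
    # average of scaled kernels centered at each data point
    return sum(kernel((x - xi) / h) for xi in data) / (len(data) * h)

# Hypothetical sample for illustration
data = [1.2, 1.9, 2.1, 2.8, 3.0, 5.5]

# The estimate changes far more across bandwidths than across kernels
for h in (0.3, 0.6, 1.2):
    g = kde(data, 2.0, h, gaussian)
    e = kde(data, 2.0, h, epanechnikov)
    print(f"h={h}: gaussian={g:.3f}, epanechnikov={e:.3f}")
```

Plotting the estimate over a grid of x values for each (kernel, bandwidth) pair makes the comparison visual, which is the point for EDA.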
I would suggest adding the tags distributions and nonparametric; you may get better answers under those topics.
The framework of regularization theory (see Regularization Theory and Neural Networks Architectures by Girosi et al.) provides a systematic way to search for a good kernel.
The idea is that the kernel is determined by a smoothness stabilizer, which is analogous to controlling complexity in the MDL sense, or to the bias-variance decomposition of the error.
Concretely, you minimize $$ H(f) = \sum_{i}\left(f(x_{i})-y_{i}\right)^{2} + \lambda ||Df||^{2} $$ where $D$ is a differential operator, for example $\frac{d^{2}}{dx^{2}}$. It can be proved that the minimizer has the form $$ f(x) = \sum_{i}c_{i}G(x-x_{i}) $$ where $G$ is the Green's function associated with the regularizer. By means of cross-validation you can then search for good values of $\lambda$ and of the order of the differential operator.
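As a sketch of how the solution is computed in practice: plugging $f(x)=\sum_i c_i G(x-x_i)$ into the objective leads to the linear system $(G + \lambda I)c = y$, where $G_{ij} = G(x_i - x_j)$. The example below assumes a Gaussian basis function (which Girosi et al. derive as the Green's function of a particular stabilizer); the data points and $\lambda$ value are hypothetical:

```python
import math

def gaussian_green(r, sigma=1.0):
    # Gaussian basis function; assumed here as the Green's function
    # of the chosen smoothness stabilizer
    return math.exp(-r * r / (2.0 * sigma ** 2))

def solve(A, b):
    # Naive Gaussian elimination with partial pivoting on the
    # augmented matrix [A | b]
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit(xs, ys, lam):
    # Regularized coefficients: solve (G + lam * I) c = y
    n = len(xs)
    G = [[gaussian_green(xs[i] - xs[j]) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(G, ys)

def predict(xs, c, x):
    # Evaluate f(x) = sum_i c_i * G(x - x_i)
    return sum(ci * gaussian_green(x - xi) for ci, xi in zip(c, xs))

# Hypothetical one-dimensional data
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.8, 0.9, 0.1]
c = fit(xs, ys, lam=0.1)
print([round(predict(xs, c, x), 3) for x in xs])
```

Larger $\lambda$ shrinks the fitted values toward a smoother function, while $\lambda \to 0$ recovers exact interpolation of the data; cross-validating over $\lambda$ is what the answer describes.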