
There are many kernels available for a univariate KDE. R uses the normal (Gaussian) kernel by default, but the efficiency discussion seems to support the use of the Epanechnikov kernel. What should influence kernel choice for univariate exploratory analysis?

  • Since you're doing EDA, one thought is to use a range of kernels and look at the results. In most applications you will find the choice of kernel makes little difference; the bandwidth is more important by far and usually is worth some exploration and visual fine-tuning. The largest qualitative difference among kernel shapes is between those that are discontinuous and those that are highly differentiable. (Discontinuous kernels, such as the uniform, actually are routinely used in 2D analyses, despite the discontinuous effects they produce.) Commented Oct 15, 2014 at 22:55
  • @whuber Could you provide some examples of discontinuous kernels in EDA? I remember seeing an Epanechnikov one, but there were so many data points that it looked smooth anyway. Commented Oct 15, 2014 at 23:10
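The advice in the first comment (fit the same sample with several kernels and compare) can be sketched directly. This is a minimal hand-rolled illustration in Python with NumPy on made-up bimodal data, not anyone's actual analysis; the kernel formulas are the standard ones:

```python
import numpy as np

# Three common kernels, each integrating to 1 (u is the scaled distance)
kernels = {
    "gaussian":     lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi),
    "epanechnikov": lambda u: np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0),
    "uniform":      lambda u: np.where(np.abs(u) <= 1, 0.5, 0.0),
}

def kde(x, data, h, kernel):
    """Univariate KDE: average of kernels centred on the data, scaled by h."""
    u = (x[:, None] - data[None, :]) / h
    return kernels[kernel](u).mean(axis=1) / h

# Made-up bimodal sample
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 1, 200), rng.normal(4, 0.5, 100)])
grid = np.linspace(-5, 8, 1000)

for name in kernels:
    f = kde(grid, data, h=0.5, kernel=name)
    area = float((f[:-1] * np.diff(grid)).sum())  # each estimate integrates to ~1
    print(f"{name}: area = {area:.3f}")
```

Plotting the three `f` curves side by side shows the point of the comment: with this many points the estimates nearly coincide, except that the uniform kernel's estimate is visibly jagged.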

2 Answers


This is not really a data visualization question. The information is fairly readily available online; for example, http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV0405/MISHRA/kde.html mentions using AMISE to select the bandwidth, and the same approach could be used for kernels. But for EDA, you would want to work as is recommended for histograms: re-plot with different binwidths to learn different things about the data. Sometimes using a different kernel may be helpful. The normal kernel is generally useful, and I think the bandwidth is more important than the actual kernel.
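The histogram analogy above (re-plot with different binwidths) can be made concrete. Here is a small sketch in Python with NumPy, using a Gaussian kernel on made-up bimodal data, with bandwidths picked purely for illustration; it counts the apparent modes at each bandwidth:

```python
import numpy as np

def gaussian_kde(x, data, h):
    """Gaussian KDE evaluated at points x with bandwidth h."""
    u = (x[:, None] - data[None, :]) / h
    return (np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)).mean(axis=1) / h

# Made-up sample with two true modes
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, 200), rng.normal(4, 0.5, 100)])
grid = np.linspace(-5, 8, 1000)

# Small h tends to show spurious wiggles; large h oversmooths
for h in [0.1, 0.5, 2.0]:
    f = gaussian_kde(grid, data, h)
    n_modes = int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))
    print(f"h = {h}: {n_modes} local maxima")
```

For the Gaussian kernel the number of modes is non-increasing in the bandwidth, so sweeping `h` like this traces out which features of the data are robust and which are artifacts of undersmoothing.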

I would suggest adding the tags distributions and nonparametric; you may get better answers under those topics.


The framework of regularization theory (see Regularization Theory and Neural Networks Architectures by Girosi et al.) allows one to tackle the problem of looking for a good kernel in a systematic way.

The idea is that the kernel is determined by a smoothness stabilizer, which is analogous to controlling complexity in the MDL sense, or to the bias–variance error decomposition.

Concretely, you attempt to minimize $$ H(f) = \sum_{i}\left(f(x_{i})-y_{i}\right)^{2} + \lambda ||Df||^{2} $$ where $D$ is a differential operator such as $\frac{d^{2}}{dx^{2}}$. It can be proved that the minimizer has the form $$ f(x) = \sum_{i}c_{i}G(x-x_{i}) $$ where $G$ is the Green function associated with the regularizer. By means of cross-validation you can search for good values of $\lambda$ and of the order of the differential operator.
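To make the recipe concrete: substituting $f(x)=\sum_i c_i G(x-x_i)$ into $H(f)$ and minimizing over the coefficients gives the linear system $(G+\lambda I)c = y$ with $G_{ij}=G(x_i-x_j)$. A minimal NumPy sketch on toy noisy-sine data; the Gaussian Green function and the fixed $\lambda$ are assumptions for illustration, where the answer would instead tune $\lambda$ by cross-validation:

```python
import numpy as np

# Toy regression data: noisy sine (made up for illustration)
rng = np.random.default_rng(2)
xi = np.sort(rng.uniform(0, 2 * np.pi, 40))
yi = np.sin(xi) + rng.normal(0, 0.2, xi.size)

def green(r, sigma=0.5):
    """Assumed Gaussian Green function; its width plays the role of the
    regularizer's scale."""
    return np.exp(-0.5 * (r / sigma) ** 2)

lam = 0.1  # regularization strength lambda (would be chosen by cross-validation)
G = green(xi[:, None] - xi[None, :])
c = np.linalg.solve(G + lam * np.eye(xi.size), yi)  # coefficients minimizing H(f)

# Evaluate f(x) = sum_i c_i G(x - x_i) on a grid
grid = np.linspace(0, 2 * np.pi, 200)
f = green(grid[:, None] - xi[None, :]) @ c
err = float(np.abs(f - np.sin(grid)).mean())
print(f"mean abs error vs. true curve: {err:.3f}")
```

Repeating the solve over a grid of `lam` values and scoring held-out points is the cross-validation search the answer describes.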
