
Suppose I'm interested in estimating the probability $p=\Pr((U,V)\in A)$ from a random sample $\{(U_i,V_i)\}_{i=1}^N$. The easiest way of doing it is to use the sample mean $\widehat{p}=\frac{1}{N}\sum_{i=1}^N 1((U_i,V_i)\in A)$, i.e., the relative-frequency estimator based on the indicator function; the weak law of large numbers guarantees the consistency of $\widehat{p}$. But the indicator function is nonsmooth, and I want a smoothed estimator. I know the Nadaraya–Watson kernel estimator, so I'm considering proposing something that might look like $\widehat{p}_s=\frac{1}{N}\sum_{i=1}^N \frac{1}{h^2} k\!\left(\frac{(U_i,V_i)???}{h}\right)????$, where $k(\cdot)$ is the kernel function and $h$ is the bandwidth. My difficulty is that I don't know what to write inside the kernel function (the question marks), so I don't know how to proceed.

Thus my question is: how do I construct a smoothed estimator (based on kernel smoothing) for this probability, and when is it consistent?

It would be great if you could lay out the conditions for the kernel and bandwidth so that the estimator is consistent for the probability of interest, and prove its consistency under your conditions.
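For concreteness, here is a sketch of one construction I could imagine, where the set $A=\{(u,v): u+v\le 1\}$, the standard normal sampling distribution, the logistic smoothing of the indicator, and the bandwidth $h$ are all placeholder choices of mine, not something I know to be the right construction:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
h = 0.1  # placeholder bandwidth

# Hypothetical example: U, V i.i.d. standard normal, A = {(u, v) : u + v <= 1}.
U = rng.standard_normal(N)
V = rng.standard_normal(N)
S = U + V  # the event (U, V) in A is equivalent to S <= 1

# Relative-frequency estimator: average of the nonsmooth indicator 1(S <= 1).
p_hat = np.mean(S <= 1.0)

# Smoothed estimator: replace the indicator 1(S <= 1) by the smooth
# logistic sigmoid sigma((1 - S)/h), which tends to the indicator as h -> 0.
p_smooth = np.mean(1.0 / (1.0 + np.exp(-(1.0 - S) / h)))

# True value for this toy example: S ~ N(0, 2), so p = Phi(1 / sqrt(2)).
p_true = 0.5 * (1.0 + math.erf((1.0 / math.sqrt(2.0)) / math.sqrt(2.0)))
print(p_hat, p_smooth, p_true)
```

Here the smoothing replaces the hard indicator with a sigmoid of the (signed, rescaled) distance to the boundary of $A$, but I don't know whether this is the standard way to set it up or what conditions on the smoothing function and $h$ give consistency.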

  • For any kernel, the NW estimator recovers the true density convolved with the kernel. It's not immediate, but you can show that the total error is the sum of a bias (because we have convolved the true density) and a variance (because of the sampling). The optimal trade-off between the two gives a value of the bandwidth $h$. You will find a rigorous treatment of this question in any reference manual. The usual recommendation is "The Elements of Statistical Learning" (Hastie & Tibshirani), but I haven't checked it to see how they deal with this particular point. – Commented Nov 4, 2024 at 11:06
  • @GuillaumeDehaene Thank you very much! I will check out the book for the consistency theory. But a more urgent question for me is what my intended estimator should look like. Do you have any ideas? – Commented Nov 4, 2024 at 12:28
  • My default choice would be either an exponential kernel or a step function. Many people would default to a Gaussian kernel, and that's also valid. In the end, I'm not sure it actually matters too much, honestly. Just make sure that you don't trust the details of the predicted density, and especially not for very low probabilities. – Commented Nov 5, 2024 at 9:40
  • @GuillaumeDehaene Thank you very much! – Commented Nov 8, 2024 at 10:46
