$\begingroup$

I have the following question about marginal parameter estimation versus joint parameter estimation.

Suppose you generate random points from a univariate normal distribution, N(0, 1). Then you generate random points from a univariate exponential distribution, Exp(1). Now you want to fit a Clayton copula model to this data. Using maximum likelihood estimation, this would require you to estimate:

  • the "mean and standard deviation" of the normal distribution
  • the "rate parameter" of the exponential distribution
  • the "correlation parameter" of the copula

We can see how this would be done using the R programming language:

[Image: R code and output for the copula fit; not reproduced here]

My Question: I can understand why the "correlation parameter" ("alpha") needs to be estimated - but why do we need to estimate the "mean and standard deviation" of the normal distribution and the "rate parameter" of the exponential distribution?

Since Copulas are able to take the marginal distributions and create a joint distribution by using the cumulative probability distributions of the marginal distributions (via the uniform distribution and the probability integral transform) - why are we required to re-estimate the "mean and standard deviation" of the normal distribution and the "rate parameter" of the exponential distribution?

Could we not just calculate the "mean and standard deviation" of the normal distribution and the "rate parameter" of the exponential distribution from the data itself, using the usual closed-form estimates? E.g.
mean_normal = sum(xi)/n , sd_normal = sqrt(sum((xi - mean_normal)^2)/n) , rate_exp = n/sum(yi), where xi are the normal observations and yi the exponential observations.
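
To make these formulas concrete, here is a minimal Python sketch using only the standard library. The seed, the sample size `n`, and the variable names are assumptions chosen for illustration; they are not taken from the R code in the question.

```python
import math
import random

random.seed(0)  # arbitrary seed, only for reproducibility
n = 5000        # hypothetical sample size

x = [random.gauss(0.0, 1.0) for _ in range(n)]   # normal sample, N(0, 1)
y = [random.expovariate(1.0) for _ in range(n)]  # exponential sample, rate 1

# Closed-form maximum likelihood estimates of the marginal parameters
mean_normal = sum(x) / n
sd_normal = math.sqrt(sum((xi - mean_normal) ** 2 for xi in x) / n)
rate_exp = n / sum(y)

print(mean_normal, sd_normal, rate_exp)
```

Each estimate uses one variable only, so these are marginal estimates in the sense of the question.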

In all fairness, I can see that the estimated mean of the normal distribution, the estimated variance of the normal distribution and the estimated rate parameter of the exponential distribution are very close to the values of these parameters used to generate the points - but why do we need to re-estimate them?

Is this because these parameters might be slightly different in a "joint estimation setting"?

Thanks!

Note: When you fit a (joint) Multivariate Normal Distribution (MVN) to some data, even if you decide to use an MVN where the variance-covariance matrix is not an identity matrix, as far as I understand you still use the standard MLE formulas for the means, variances and covariances.

For example, if you have 3 variables X1, X2, X3 and decide to fit a MVN to this data, we need to estimate the following 9 parameters:

  • Mean(X1) = Sum(X1_i) / N

  • Mean(X2) = Sum(X2_i) / N

  • Mean(X3) = Sum(X3_i) / N

  • Cov(X1,X1) = Var(X1) = Sum((X1_i - Mean(X1))^2) / N

  • Cov(X2,X2) = Var(X2) = Sum((X2_i - Mean(X2))^2) / N

  • Cov(X3,X3) = Var(X3) = Sum((X3_i - Mean(X3))^2) / N

  • Cov(X1,X2) = Cov(X2,X1) = Sum((X1_i - Mean(X1)) * (X2_i - Mean(X2))) / N

  • Cov(X3,X2) = Cov(X2,X3) = Sum((X3_i - Mean(X3)) * (X2_i - Mean(X2))) / N

  • Cov(X3,X1) = Cov(X1,X3) = Sum((X3_i - Mean(X3)) * (X1_i - Mean(X1))) / N

As we can see, even though we are fitting a "joint probability distribution" to this data - the 9 parameters that need to be estimated can effectively be estimated "marginally".
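
As a rough illustration of this point, the following Python sketch computes the MVN maximum likelihood estimates (divisor N) for three variables. The data-generating recipe and the helper names `mean` and `cov_mle` are hypothetical and only serve the example.

```python
import random

random.seed(1)  # arbitrary seed
N = 2000        # hypothetical sample size

# Hypothetical correlated data: X2 and X3 are built from X1 plus noise
X1 = [random.gauss(0, 1) for _ in range(N)]
X2 = [0.5 * a + random.gauss(0, 1) for a in X1]
X3 = [-0.3 * a + random.gauss(0, 1) for a in X1]
data = [X1, X2, X3]

def mean(v):
    """Sample mean, which is the MLE of the mean."""
    return sum(v) / len(v)

def cov_mle(a, b):
    """MLE of the covariance (divisor N, not N - 1)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

mu_hat = [mean(v) for v in data]                            # 3 means
sigma_hat = [[cov_mle(a, b) for b in data] for a in data]   # 3x3 covariance matrix

print(mu_hat)
for row in sigma_hat:
    print(row)
```

Each mean and variance is computed from a single variable, and each covariance from a single pair, which is why no joint optimization is needed in the MVN case.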

In the case of the Copula, the estimate for the mean of the normal distribution on its own (i.e. the marginal case) is different from the value it assumes within the Copula model (i.e. the joint case).
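
One way to see why the two can differ is to write out the joint log-likelihood. The notation here is an assumption introduced for illustration: $x_i$ are the normal observations, $y_i$ the exponential observations, $f_1, F_1$ the normal density and CDF with parameters $(\mu, \sigma)$, $f_2(y;\lambda) = \lambda e^{-\lambda y}$ and $F_2(y;\lambda) = 1 - e^{-\lambda y}$ the exponential density and CDF, and $c_\theta$ the Clayton copula density with $\theta > 0$.

$$
\ell(\mu,\sigma,\lambda,\theta)
= \sum_{i=1}^{n} \log f_1(x_i;\mu,\sigma)
+ \sum_{i=1}^{n} \log f_2(y_i;\lambda)
+ \sum_{i=1}^{n} \log c_\theta\big(F_1(x_i;\mu,\sigma),\, F_2(y_i;\lambda)\big),
$$

$$
c_\theta(u,v) = (1+\theta)\,(uv)^{-(1+\theta)}\,\big(u^{-\theta} + v^{-\theta} - 1\big)^{-(2 + 1/\theta)}.
$$

The first two sums alone are maximized by the usual marginal estimates. The third sum also depends on $\mu$, $\sigma$ and $\lambda$ through $F_1$ and $F_2$, so the maximizer of the full expression need not coincide with the marginal estimates. For the MVN, by contrast, the joint maximizer happens to be exactly the sample means and the divisor-$N$ sample covariances.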

$\endgroup$
  • $\begingroup$ What do you mean by re-estimate? Have you estimated them before? $\endgroup$ Commented Dec 3, 2021 at 8:37
  • $\begingroup$ OK. If you know the marginal distributions, which is not the case for most real-life data, then you just need to transform the univariate margins to the copula scale and then estimate the copula parameters. However, if you do not know the margins, then you need to estimate them first. You can estimate the margins using the nonparametric pseudo maximum likelihood method, as you mentioned. But this may give inaccurate results due to misspecification of the margins. $\endgroup$ Commented Dec 24, 2021 at 6:41
  • $\begingroup$ Copula is based on margins even if you can estimate them nonparametrically. $\endgroup$ Commented Dec 24, 2021 at 6:42
  • $\begingroup$ You may read some papers by Prof. Claudia Czado on this point. $\endgroup$ Commented Dec 24, 2021 at 6:42
  • $\begingroup$ What do you think about my answer? If it is helpful and clear, you may accept it by clicking on the tick mark to the left. Otherwise, you may ask for further clarification. This is how Cross Validated works. $\endgroup$ Commented Jan 12, 2022 at 15:50

1 Answer

$\begingroup$

Since Copulas are able to take the marginal distributions and create a joint distribution by using the cumulative probability distributions of the marginal distributions (via the uniform distribution and the probability integral transform) - why are we required to re-estimate the "mean and standard deviation" of the normal distribution and the "rate parameter" of the exponential distribution?

How are you going to implement the probability integral transform (PIT) if you do not know the parameters of the distribution? It is for the feasibility of PIT that you need to estimate the parameters. In a simulation, you will know the true parameters, but to make the exercise realistic, you pretend you do not know them and then estimate them from the data.
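
A minimal Python sketch of this step, assuming the marginal families are known (normal and exponential) but their parameters are not. The seed, sample size, and the names `mu_hat`, `sigma_hat`, `rate_hat`, `u`, `v` are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

random.seed(2)  # arbitrary seed
n = 2000        # hypothetical sample size
x = [random.gauss(0.0, 1.0) for _ in range(n)]   # normal margin
y = [random.expovariate(1.0) for _ in range(n)]  # exponential margin

# Pretend the true parameters are unknown and estimate them from the data
mu_hat = sum(x) / n
sigma_hat = math.sqrt(sum((xi - mu_hat) ** 2 for xi in x) / n)
rate_hat = n / sum(y)

# Probability integral transform: plug each observation into its fitted CDF
fitted_normal = NormalDist(mu_hat, sigma_hat)
u = [fitted_normal.cdf(xi) for xi in x]            # Phi((x - mu_hat) / sigma_hat)
v = [1.0 - math.exp(-rate_hat * yi) for yi in y]   # 1 - exp(-rate_hat * y)

print(sum(u) / n, sum(v) / n)  # both should be close to 0.5 if the fit is adequate
```

Without `mu_hat`, `sigma_hat` and `rate_hat` there is no CDF to evaluate, which is the feasibility issue described above.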

Could we not just calculate the "mean and standard deviation" of the normal distribution and the "rate parameter" of the exponential distribution based on the data itself? E.g. mean_normal = sum(xi)/n , sd_normal = sqrt(sum((xi - mean_normal)^2)/n) , rate_exp = n/sum(yi)

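Yes, you could. That would be stepwise estimation, since you would estimate the parameters of the marginals separately from those of the copula. In joint estimation, all parameters are chosen together to maximize a single likelihood that includes the copula density, so the resulting marginal estimates can differ slightly from the stepwise ones.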
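
The stepwise approach could be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure from the question's R code: the data are simulated from a Clayton copula with a hypothetical true parameter `theta_true = 2`, and the copula parameter is found with a simple golden-section search (`golden_max` is a helper written for this example) over an assumed range of 0.05 to 10.

```python
import math
import random
from statistics import NormalDist

random.seed(3)              # arbitrary seed
n, theta_true = 2000, 2.0   # hypothetical sample size and Clayton parameter

# Simulate Clayton-dependent uniforms by conditional inversion,
# then map them to N(0, 1) and Exp(1) margins.
std_normal = NormalDist()
x, y = [], []
for _ in range(n):
    u, w = random.random(), random.random()
    v = (u ** -theta_true * (w ** (-theta_true / (1 + theta_true)) - 1) + 1) ** (-1 / theta_true)
    x.append(std_normal.inv_cdf(u))
    y.append(-math.log1p(-v))

# Step 1: marginal maximum likelihood estimates
mu_hat = sum(x) / n
sigma_hat = math.sqrt(sum((xi - mu_hat) ** 2 for xi in x) / n)
rate_hat = n / sum(y)

# Step 2: probability integral transform with the fitted margins
fitted = NormalDist(mu_hat, sigma_hat)
u_hat = [fitted.cdf(xi) for xi in x]
v_hat = [-math.expm1(-rate_hat * yi) for yi in y]

# Step 3: maximize the Clayton copula log-likelihood over theta only
def clayton_loglik(theta):
    total = 0.0
    for a, b in zip(u_hat, v_hat):
        s = a ** -theta + b ** -theta - 1.0
        total += (math.log1p(theta)
                  - (1 + theta) * (math.log(a) + math.log(b))
                  - (2 + 1 / theta) * math.log(s))
    return total

def golden_max(f, lo, hi, tol=1e-4):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:               # maximizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:                     # maximizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return (a + b) / 2

theta_hat = golden_max(clayton_loglik, 0.05, 10.0)
print(mu_hat, sigma_hat, rate_hat, theta_hat)
```

In the copula literature this two-step scheme is often called inference functions for margins (IFM). A joint fit would instead maximize over `mu`, `sigma`, `rate` and `theta` simultaneously, which needs a multivariate optimizer and is not shown here.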

$\endgroup$