Revisions to Can one measure the degree of empirical data being Gaussian?

Removed gaussian-process tag

deleted 1 characters in body

edited Jun 24, 2013 at 15:03

I am looking for a simple test to determine how 'nicely Gaussian' my empirical data are. If they are 'nicely Gaussian', I can then perform some other analysis that assumes the data are Gaussian to begin with.

I am looking for a concrete test. I have a simple 1-dimensional data vector, with $N\approx 10,000$, so I have plenty of data. I want to determine if these data are Gaussian.

What I have tried:

I understand that the Gaussian PDF has zero skewness and zero excess kurtosis, so I have implemented those metrics and measured them on my data. This works OK, I think, so as it stands this is my plan B. Perhaps there is a better way?
I have heard the term "chi-squared" being thrown around. I understand that it is a PDF in its own right, but I am not sure how it might apply to this problem.
Although half in jest, my current way is to simply eyeball the data. Needless to say, this is OK for some cases, but it will not work when the data are processed automatically while I am asleep...
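For reference, here is a minimal sketch of how the skewness and kurtosis check could be turned into a single automated test. It uses the Jarque-Bera statistic, which combines the two measures and is asymptotically chi-squared distributed with 2 degrees of freedom under the Gaussian hypothesis (this may be where the "chi-squared" remark comes from). The function names, the synthetic samples, and the choice of Jarque-Bera itself are assumptions made for illustration, not something prescribed above.

```python
import math
import random


def skewness_and_excess_kurtosis(data):
    """Return (skewness, excess kurtosis) from the sample central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    if m2 == 0.0:
        raise ValueError("data have zero variance")
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # a Gaussian has kurtosis 3, so excess 0
    return skew, excess_kurt


def jarque_bera(data):
    """Return (JB statistic, asymptotic p-value) for the Gaussian hypothesis."""
    n = len(data)
    skew, excess_kurt = skewness_and_excess_kurtosis(data)
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Survival function of a chi-squared variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical data: one Gaussian sample and one symmetric long-tailed
# (Laplace) sample, built as the difference of two exponentials.
rng = random.Random(0)
gaussian_sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
long_tailed_sample = [
    rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(10_000)
]

for name, sample in [("gaussian", gaussian_sample),
                     ("long-tailed", long_tailed_sample)]:
    jb, p = jarque_bera(sample)
    print(f"{name}: JB = {jb:.2f}, p = {p:.3g}")
```

A small p-value is evidence against the Gaussian hypothesis. With $N\approx 10,000$ points, even minor departures can give a small p-value, so the skewness and excess kurtosis values themselves may be the more useful measure of "how Gaussian" the data are.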

EDIT:

It has been suggested that I talk some more about the 'other analysis' I had in mind, so here goes: If my data are Gaussian, I can readily apply thresholds that have already been developed (e.g., here), but they only apply to data that are Gaussian. Now, if my test comes back "not Gaussian", then what I would like to do is determine the closest PDF that matches the data, so that I can attempt to derive thresholds myself.

Now, thanks to everyone, I understand that there is an infinite number of PDFs, and I realize my question might have been somewhat open-ended.

So, to make the picture much clearer: my data either have a 'nice Gaussian' looking PDF, or they tend to have a "Gaussian PDF with symmetric long tails". So, if my test comes back and says "Yes, your data are Gaussian", I can use one of the canned threshold tests I linked to earlier. If, on the other hand, my test says "No, the tails are way too long for a typical Gaussian...", then I would want to: 1) know what type of PDF this is, and 2) estimate new thresholds on my own based on it.

I hope this clarifies things some more. Thanks to everyone.
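To illustrate the long-tailed case, here is a rough sketch of one way new thresholds might be estimated without first naming the PDF: take them directly from the empirical quantiles, which is feasible with about 10,000 points. It assumes, hypothetically, that a "threshold" means the value exceeded with a given one-sided tail probability; the function names and the synthetic Laplace data are made up for illustration, and the canned thresholds linked earlier may be defined differently.

```python
import random
from statistics import NormalDist, fmean, pstdev


def gaussian_threshold(data, tail_prob):
    """Threshold exceeded with probability tail_prob if the data were Gaussian."""
    return NormalDist(fmean(data), pstdev(data)).inv_cdf(1.0 - tail_prob)


def empirical_threshold(data, tail_prob):
    """Order statistic exceeded by at most a tail_prob fraction of the data."""
    ordered = sorted(data)
    n = len(ordered)
    # Number of points allowed above the threshold; the small offset guards
    # against products such as 0.29 * 100 landing just below an integer.
    allowed_above = int(tail_prob * n + 1e-9)
    index = max(0, n - allowed_above - 1)
    return ordered[index]


# Hypothetical symmetric long-tailed data (Laplace), standing in for the
# "Gaussian PDF with symmetric long tails" case.
rng = random.Random(1)
sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(10_000)]

tail_prob = 0.001
print("Gaussian-based threshold:", gaussian_threshold(sample, tail_prob))
print("Empirical threshold:     ", empirical_threshold(sample, tail_prob))
```

For long-tailed data the Gaussian-based threshold sits too low, so it would be exceeded more often than intended. The empirical quantile avoids choosing a PDF at all, at the cost of being noisy for very small tail probabilities, where only a handful of points lie beyond the threshold.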

added 1208 characters in body

edited Jun 24, 2013 at 14:49

Creatron

I would like to inquire about a simple test I might be able to perform to determine how 'nicely Gaussian' my empirical data are. If they are 'nicely Gaussian', then I can perform some other analysis, that assumes my data are Gaussian to begin with.

I am looking for a concrete test. I have a simple 1-dimensional data vector, with $N\approx 10,000$, so I have plenty of data. I want to determine if these data are Gaussian.

What I have tried:

I understand that the Gaussian PDF has no skewness, and no kurtosis, so I have implemented those metrics and taken a measure. This works OK--I think--so as it stands, this is my plan B. Perhaps there is a better way?
I have heard the term "chi-squared" being thrown around. I understand that it is a PDF in its own right, but I am not sure how this might apply to this problem.
Although half in jest, my current way is to simply eyeball the data. Needless to say, this is OK for some cases, but it will not work when data is being run and I am sleeping...

EDIT:

It has been suggested that I talk some more about the 'other analysis' I had in mind, so here goes: If my data is gaussian, I can readily apply a thresholds developed (ex, here), but they only apply for data that is gaussian. Now, if my test comes back "not gaussian", then what I would like to do is determine what is the closest PDF that matches it, so that I can attempt to derive thresholds myself.

Now, thanks to everyone, I understand that there is an infinite number of PDFs, and I realize my question might have been somewhat open ended.

So to put a lot more clarity into the picture, I can say that my data is either a 'nice gaussian' looking PDF, or, it tends to be a "gaussian PDF with symmetric long tails". So, if my test comes back and says "Yes, your data is gaussian", I can use one of the canned threshold tests I linked to earlier. If on the other hand my test says "No, the tails are way too long for a typical gaussian...", then I would want to: 1) Know what type of PDF is this, 2) Estimate new thresholds on my own based on this.

I hope this clarifies some more, thanks for everyone.
