Here are some thoughts:

  1. As @whuber notes, I doubt Gelman said that (although he may have said something similar-sounding). Five percent of cases where the null is true will yield significant results (type I errors) at an alpha of $.05$. If we assume the true power for all studies where the null was false was $80\%$, the statement could only be true if the ratio of studies undertaken where the null was true to studies in which the null was false was $100/118.75 \approx 84\%$ (see the worked calculation after this list).
  2. Model selection criteria, such as the AIC, can be seen as a way of selecting an appropriate $p$-value. To understand this more fully, it may help to read @Glen_b's answer here: Stepwise regression in R – Critical p-value. Moreover, if the AIC became the requirement for publication, nothing would prevent people from 'AIC-hacking'.
  3. A good guide to fitting models in such a manner that you don't invalidate your $p$-values would be Frank Harrell's book, Regression Modeling Strategies.
  4. I am not dogmatically opposed to using Bayesian methods, but I do not believe they would solve this problem. For example, you could just keep collecting data until the credible interval no longer includes whatever value you want to reject; you would then have 'credible interval-hacking' (see the simulation sketch after this list). As I see it, the issue is that many practitioners are not intrinsically interested in the statistical analyses they use, so they will use whichever method is required of them in an unthinking and mechanical way. For more on my perspective here, it may help to read my answer to: Effect size as the hypothesis for significance testing.
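
To make the arithmetic in point 1 explicit (reading the statement in question as the claim that only $5\%$ of significant results are type I errors, which is an assumption on my part), write $n_0$ and $n_1$ for the numbers of studies in which the null is true and false, respectively:

$$\frac{0.05\,n_0}{0.05\,n_0 + 0.80\,n_1} = 0.05 \quad\Longleftrightarrow\quad \frac{n_0}{n_1} = \frac{0.80 \times 0.05}{0.05 \times 0.95} = \frac{0.04}{0.0475} = \frac{100}{118.75} \approx 84\%.$$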
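
Here is a minimal simulation sketch of the 'credible interval-hacking' in point 4. It assumes a normal model with known standard deviation and a flat prior, so the $95\%$ credible interval for the mean coincides with $\bar{x} \pm 1.96/\sqrt{n}$; the sample sizes, seed, and stopping rule are arbitrary choices for illustration only.

```python
import numpy as np

# Sketch of 'credible interval-hacking' via optional stopping.
# Assumptions (illustrative, not from the answer): data are N(0, 1) with known
# sd, the true mean is 0, and the prior is flat, so the 95% credible interval
# for the mean is xbar +/- 1.96/sqrt(n).

rng = np.random.default_rng(2024)
n_researchers = 2000        # simulated analysts, each peeking after every new observation
n_min, n_max = 10, 500      # start checking at n = 10, give up at n = 500

stopped_early = 0
for _ in range(n_researchers):
    x = rng.normal(loc=0.0, scale=1.0, size=n_max)
    cum_mean = np.cumsum(x) / np.arange(1, n_max + 1)
    for n in range(n_min, n_max + 1):
        if abs(cum_mean[n - 1]) > 1.96 / np.sqrt(n):   # interval excludes 0
            stopped_early += 1                          # stop collecting and 'publish'
            break

print(f"Proportion whose final interval excludes the true mean: "
      f"{stopped_early / n_researchers:.2f}")           # well above the nominal 5%
```

Even though the true mean is exactly the value being 'tested', a large fraction of these simulated researchers eventually stop with an interval that excludes it, which is the same optional-stopping problem that afflicts $p$-values.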