Glen_b

Modelling is the process of identifying a suitable model.

Frequently a modeller will have a good idea of the important variables, and perhaps even a theoretical basis for a particular model. They will also know some facts about the response and the general kind of relationship it has with the predictors, but may still not be certain that their general idea of a model is completely adequate. Even with an excellent theoretical idea of how the mean should behave, they might not, for example, be confident that the variance is unrelated to the mean, or they might suspect that some serial dependence is present.
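As a rough illustration of the two doubts just mentioned, here is a minimal sketch of how one might look at residuals from a fitted mean model. Everything in it is an assumption made for the example: the straight-line mean, the simulated data, the function name `residual_checks`, and the two crude summaries it returns. It is not a recommended procedure, and plots of residuals are usually more informative than single numbers.

```python
import numpy as np

def residual_checks(x, y):
    """Fit y = a + b*x by least squares and return two rough diagnostics.

    Hypothetical helper for illustration only.
    """
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    resid = y - fitted

    # Correlation between |residual| and fitted value: a value well away
    # from 0 hints that the spread changes with the mean.
    spread_vs_mean = np.corrcoef(np.abs(resid), fitted)[0, 1]

    # Lag-1 autocorrelation of the residuals (in observation order):
    # a value well away from 0 hints at serial dependence.
    lag1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    return spread_vs_mean, lag1

# Simulated data (an assumption): the mean is linear in x, but the noise
# standard deviation grows with x, so the variance is related to the mean.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 500)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5 * x)

spread_vs_mean, lag1 = residual_checks(x, y)
print(spread_vs_mean, lag1)
```

In this simulated case the first summary should be clearly positive and the second close to zero, which would suggest revisiting the variance assumption while leaving the independence assumption alone.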

So there may be a cycle of several stages of model identification that makes reference to (at least some of) the data. The alternative is to regularly risk having quite unsuitable models.

(Of course, if they're being responsible, they must take account of how using data in this way impacts their inferences.)
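One simple way to limit that impact, offered here only as a sketch and not as the only or best approach, is to identify the model on one part of the data and reserve the rest for inference. The function name `split_indices`, the 50/50 split, and the seed are all assumptions made for the example.

```python
import numpy as np

def split_indices(n, frac_identify=0.5, seed=0):
    """Randomly split row indices 0..n-1 into two disjoint sets.

    Hypothetical helper: the first set is for model identification,
    the second is held back for inference on the chosen model.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    k = int(round(frac_identify * n))
    return np.sort(perm[:k]), np.sort(perm[k:])

identify_idx, infer_idx = split_indices(100)
print(len(identify_idx), len(infer_idx))
```

A random split like this assumes the observations are exchangeable; if serial dependence is suspected, splitting by contiguous blocks in time would be a more sensible starting point.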

The actual process varies somewhat from area to area and from person to person, but some authors explicitly list the steps in their process (e.g. Box and Jenkins outline one such approach in their book on time series). Ideas about how to do model identification change over time.
