Post Undeleted by Dikran Marsupial
Post Deleted by Dikran Marsupial
added 133 characters in body
Source Link
Dikran Marsupial
  • 58.3k
  • 9
  • 154
  • 236

G. C. Cawley and G. J. Janacek, On allometric equations for predicting body mass of dinosaurs, Journal of Zoology, vol. 280, no. 4, pp. 355-361, 2010. doi:10.1111/j.1469-7998.2009.00665.x (preprint)

This example shows some advantages of log transformation in allometry. Panels (a) and (b) show the long bone circumferences and body masses of 33 mammals (used to predict quadrupedal dinosaur body masses), plotted on log-log axes and on linear axes. This shows that a linear model on log-log transformed data is a very reasonable model. Panels (c) and (d) show the power-law model of Packard et al., which was fitted using least squares without the log transformation. It looks O.K. (ish) on linear axes, but if you back-transform the model to log-log axes, it is very clearly biased, with the body masses of smaller mammals being consistently over-predicted. The predictive error bars of the model also imply that it is possible for some smaller mammals to have a negative body mass.
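
The remark about negative body masses can be illustrated with a small sketch. The numbers below are hypothetical and chosen only for illustration (they are not taken from either paper), and the helper names `additive_interval` and `multiplicative_interval` are made up for this example. It assumes a constant residual standard deviation on the linear scale for the additive model, and a constant residual standard deviation on the natural-log scale for the log-log model.

```python
import math


def additive_interval(pred, sigma, z=2.0):
    """Interval under constant absolute errors (Gaussian noise on the linear scale)."""
    return pred - z * sigma, pred + z * sigma


def multiplicative_interval(pred, sigma_log, z=2.0):
    """Interval under constant relative errors (Gaussian noise on the log scale)."""
    return pred * math.exp(-z * sigma_log), pred * math.exp(z * sigma_log)


# Hypothetical values in kg, for illustration only.
small_pred = 5.0   # predicted mass of a small mammal
sigma_abs = 50.0   # residual s.d. on the linear scale (driven by the large animals)
sigma_log = 0.3    # residual s.d. on the natural-log scale

lo_add, hi_add = additive_interval(small_pred, sigma_abs)
lo_mul, hi_mul = multiplicative_interval(small_pred, sigma_log)

print(f"additive model:       [{lo_add:.1f}, {hi_add:.1f}] kg")
print(f"multiplicative model: [{lo_mul:.2f}, {hi_mul:.2f}] kg")
```

Because the multiplicative interval is obtained by exponentiating, its lower end can never drop below zero, whereas the additive interval does so whenever the prediction is smaller than the half-width of the error bar.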

The conventional allometric model is just about able to explain the observed body mass of an elephant (it is an outlier among mammals: it has very strong bones for its body mass, perhaps because it is a very active animal). The model fitted on linear axes, on the other hand, has the elephant lying many standard deviations away from the predicted value. The conventional model predicts the elephant as having a body mass of about 10 tons (4 would be a more accurate estimate). The Packard et al. model, though, predicts the mass to be 36 tons, which is wildly wrong (and the estimates for the body masses of dinosaurs are also much higher: the estimated mass of Apatosaurus louisiae rises from 18.2 tons to 302 tons, which is completely implausible). The reason for this is that the next heaviest mammal is the hippo, which has a relatively high body mass for its bone circumference.

added 2457 characters in body

The Cawley and Janacek paper was a comment on an earlier paper by Packard et al., which got a fair bit of media attention as it (incorrectly) asserted that dinosaurs were only about half as big as was previously thought.

G. C. Packard, T. J. Boardman, G. F. Birchard, Allometric equations for predicting body mass of dinosaurs, Journal of Zoology, vol. 279, no. 1, pp. 102-110, 2009. doi:10.1111/j.1469-7998.2009.00594.x

Such relationships often follow a power law, e.g. the strength of a bone depends on its cross-sectional area, while the body mass is related to the cube of the linear body measurements. So we would expect a power-law relationship between the circumference of major bones, such as the femur, and the body mass of the dinosaur. In practice it won't be exactly cubic, but allometric models aim to determine the appropriate scaling from the observations. If we use logarithmic transformations of both variables, then a linear model gives a power-law relationship, so that implements our prior knowledge about the problem. As a bonus, it also imposes an assumption of relative rather than absolute errors, which turns out to be the other key benefit of allometry over other transformations. Our paper was a comment on a previous study that, err..., did not fully appreciate that.
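
Written out, the argument looks roughly like this. The symbols are my own notation rather than the papers': M is body mass, C is bone circumference, a and b are the allometric coefficient and exponent, and the noise terms are assumed Gaussian purely for illustration.

```latex
% A power law on the original scale is a straight line on log-log axes
M = a\,C^{b}
\quad\Longleftrightarrow\quad
\log M = \log a + b\,\log C

% Least squares on the log scale: additive noise in log M,
% i.e. multiplicative (relative) noise in M, and M stays positive
\log M = \log a + b\,\log C + \varepsilon,
\quad \varepsilon \sim \mathcal{N}(0, \sigma^{2})
\quad\Longleftrightarrow\quad
M = a\,C^{b}\,e^{\varepsilon}

% Least squares on the original scale: additive (absolute) noise in M,
% which can push M below zero for small animals
M = a\,C^{b} + \eta,
\quad \eta \sim \mathcal{N}(0, \tau^{2})
```

The first line is the "prior knowledge about the form" part; the second and third lines are the two competing assumptions about the errors.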

This example shows some advantages of log transformation in allometry. Panels (a) and (b) show the long bone circumferences and body masses of 33 mammals (used to predict quadrupedal dinosaur body masses), plotted on log-log axes and on linear axes. This shows that a linear model on log-log transformed data is a very reasonable model. Panels (c) and (d) show the power-law model of Packard et al., which was fitted using least squares without the log transformation. It looks O.K. (ish) on linear axes, but if you back-transform the model to log-log axes, it is very clearly biased, with the body masses of smaller mammals being consistently over-predicted.

[Figure: example from paper]

Unfortunately, Packard et al. used this model to argue that the body masses of dinosaurs were only about half those previously thought, but this is because their model is very sensitive to the body mass of the largest mammal in the dataset (the elephant), owing to the assumption of absolute, rather than relative, errors. This means an error of a kilogram in the predicted weight of an elephant is as bad as an error of a kilogram in the predicted weight of a mouse! The error in the modelling assumptions is easily demonstrated by making predictions with the elephant left out of the calibration data:

[Figure: predictions with the elephant left out of the calibration data]

The conventional allometric model is just about able to explain the observed body mass of an elephant (it is an outlier among mammals: it has very strong bones for its body mass, perhaps because it is a very active animal). The model fitted on linear axes, on the other hand, has the elephant lying many standard deviations away from the predicted value. The conventional model predicts the elephant as having a body mass of about 10 tons (4 would be a more accurate estimate). The Packard et al. model, though, predicts the mass to be 36 tons, which is wildly wrong (and the estimates for the body masses of dinosaurs are also much higher). The reason for this is that the next heaviest mammal is the hippo, which has a relatively high body mass for its bone circumference.

So this shows that there are two reasons for logarithmic transformation: firstly, for including prior knowledge of the form of the regression; secondly, for including prior knowledge regarding the distribution of values around the regression (which is normally the conditional mean of the predictive distribution).
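
The second reason can be made concrete with a few lines of arithmetic. The masses below are hypothetical round numbers chosen for illustration, not values from the dataset, and the function names are made up for this sketch. It compares the squared-error penalty for a one-kilogram mistake on a mouse and on an elephant, first on the linear scale and then on the log scale.

```python
import math


def squared_error_linear(actual, predicted):
    """Penalty used by least squares on the original (linear) scale."""
    return (predicted - actual) ** 2


def squared_error_log(actual, predicted):
    """Penalty used by least squares after a log transformation."""
    return (math.log(predicted) - math.log(actual)) ** 2


# Hypothetical masses in kg: both predictions are off by exactly 1 kg.
mouse_actual, mouse_pred = 0.02, 1.02
elephant_actual, elephant_pred = 4000.0, 4001.0

lin_mouse = squared_error_linear(mouse_actual, mouse_pred)
lin_elephant = squared_error_linear(elephant_actual, elephant_pred)
log_mouse = squared_error_log(mouse_actual, mouse_pred)
log_elephant = squared_error_log(elephant_actual, elephant_pred)

print(f"linear scale: mouse {lin_mouse:.3g}, elephant {lin_elephant:.3g}")
print(f"log scale:    mouse {log_mouse:.3g}, elephant {log_elephant:.3g}")
```

On the linear scale the two mistakes cost the same, even though one is a 50-fold error and the other is negligible; on the log scale the penalty tracks the relative error instead.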

deleted 15 characters in body

@IgorF. and @SextusEmpiricus (+1) make the key point: "You should ideally use a function that makes physical sense." A nice example is allometry, which is often used in biology to model the relationships between physical measurements, for instance the long bone circumference and body mass of dinosaurs:

G. C. Cawley and G. J. Janacek, On allometric equations for predicting body mass of dinosaurs, Journal of Zoology, vol. 280, no. 4, pp. 355-361, 2010. doi:10.1111/j.1469-7998.2009.00665.x

Such relationships often follow a power law, e.g. the strength of a bone depends on its cross-sectional area, while the body mass is related to the cube of the linear body measurements. So we would expect a power-law relationship between the circumference of major bones, such as the femur, and the body mass of the dinosaur. In practice it won't be exactly cubic, but allometric models aim to determine the appropriate scaling from the observations. If we use logarithmic transformations of both variables, then a linear model gives a power-law relationship, so that implements our prior knowledge about the problem. As a bonus, it also imposes an assumption of relative rather than absolute errors, which turns out to be the other key benefit of allometry over other transformations. Our paper was a comment on a previous study that, err..., did not fully appreciate that.
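
As a minimal sketch of the idea (not the code used in the paper), here is ordinary least squares on log-transformed variables recovering a power-law exponent. The data are synthetic, generated with multiplicative noise; the "true" values 2.0 and 2.7 and the function name `fit_power_law_loglog` are arbitrary choices for this illustration.

```python
import math
import random


def fit_power_law_loglog(x, y):
    """Fit y = a * x**b by ordinary least squares on (log x, log y)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_ly = sum(ly) / n
    sxx = sum((u - mean_lx) ** 2 for u in lx)
    sxy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(lx, ly))
    b = sxy / sxx                         # slope on log-log axes = exponent
    a = math.exp(mean_ly - b * mean_lx)   # intercept on log-log axes = log(a)
    return a, b


# Synthetic data only: a power law with relative (multiplicative) noise.
rng = random.Random(0)
true_a, true_b = 2.0, 2.7
x = [10 ** rng.uniform(0, 3) for _ in range(200)]   # spans three orders of magnitude
y = [true_a * v ** true_b * math.exp(rng.gauss(0, 0.2)) for v in x]

a_hat, b_hat = fit_power_law_loglog(x, y)
print(f"estimated a = {a_hat:.2f}, estimated b = {b_hat:.2f}")
```

The slope of the straight line on log-log axes is the scaling exponent, so the model "determines the appropriate scaling from the observations" while treating every point's relative error equally.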

[I'll add a picture or two when I get to the office tomorrow]
