$\begingroup$

This has been bothering me for a while. Both $X$ and $Y$ are material properties, and their relationship can be described by a linear regression model built in log-transformed space, i.e., $\log Y=a \log X+b$. For each material, I measure $X$ $m$ times and $Y$ $n$ times. Hence, to build the regression model, I need to use the means of $X$ and $Y$. Should I use the geometric mean or the arithmetic mean here? If the geometric mean is to be used, should the standard deviation be calculated from the log-transformed variables too? Moreover, I guess the variance of both $X$ and $Y$ should be taken into account when calculating the confidence/prediction bands of the regression model, right? How can this be done? A reference (book/paper) for this question would be highly appreciated.

$\endgroup$
  • $\begingroup$ Welcome to our site! You can use LaTeX math typesetting here by enclosing it in dollar signs, so that, e.g., typing x between dollar signs produces $x$. $\endgroup$ Commented Jun 4, 2016 at 12:28

1 Answer

$\begingroup$

If the data can "be described using a linear regression model built in the log-transformed space" then (unless I'm missing something problem-specific...) you should run a linear regression in log-transformed space!

That would effectively be using the mean of $\log Y$, $\operatorname{Cov}(\log X, \log Y)$, $\operatorname{Var}(\log X)$, etc. to calculate your estimates of $a$ and $b$.
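As a rough sketch of what that looks like in practice (not a definitive recipe): the snippet below assumes each material is summarised by the mean of its logged measurements, then takes the slope as $\operatorname{Cov}(\log X, \log Y)/\operatorname{Var}(\log X)$ and the intercept from the means. The function name `fit_loglog` and the numbers are made up for illustration, and the per-material averaging is my assumption about how the unequal $m$ and $n$ would be handled.

```python
import numpy as np

def fit_loglog(x_groups, y_groups):
    """OLS fit of log Y = a * log X + b, one point per material.

    x_groups[k] and y_groups[k] hold the repeated measurements of X and Y
    for material k; the two lists may have different lengths (m and n).
    """
    # Assumption: summarise each material by the mean of its logs.
    lx = np.array([np.mean(np.log(x)) for x in x_groups])
    ly = np.array([np.mean(np.log(y)) for y in y_groups])
    # Slope = Cov(log X, log Y) / Var(log X); intercept from the means.
    a = np.cov(lx, ly, ddof=1)[0, 1] / np.var(lx, ddof=1)
    b = ly.mean() - a * lx.mean()
    return a, b

# Hypothetical measurements for three materials.
x_groups = [[1.0, 1.2, 0.9], [10.0, 11.0], [95.0, 100.0, 105.0, 98.0]]
y_groups = [[2.1, 1.9], [6.0, 6.5, 6.2], [20.0, 21.0]]
a, b = fit_loglog(x_groups, y_groups)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Note that this plain least-squares fit ignores the measurement scatter within each material, so it does not by itself address the question about confidence/prediction bands.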

Something to keep in mind too:

$$\exp\left( \frac{1}{n} \sum_{i=1}^{n} \log x_i \right) = \left( \prod_{i=1}^{n} x_i \right)^{\frac{1}{n}}$$

That is, the exponential of the arithmetic mean in logs is equal to the geometric mean in levels.
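A quick numerical check of this identity, with made-up values (the variable names are just for illustration):

```python
import numpy as np

# Hypothetical repeated measurements of one property.
x = np.array([1.0, 10.0, 100.0])

# Exponential of the arithmetic mean of the logs.
geo_from_logs = np.exp(np.mean(np.log(x)))
# Geometric mean computed directly in levels.
geo_direct = np.prod(x) ** (1.0 / len(x))
# Arithmetic mean in levels, for comparison.
arith = np.mean(x)

print(geo_from_logs, geo_direct, arith)
```

The first two numbers agree up to floating-point rounding, while the arithmetic mean in levels is larger. So averaging in log space corresponds to the geometric mean of the raw measurements, not the arithmetic one.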

$\endgroup$
  • $\begingroup$ Thanks, Matthew. Could you possibly recommend a book/paper where more details on this topic can be found? $\endgroup$ Commented Jun 7, 2016 at 19:44
