$\begingroup$

I am following some work on regression-based performance attribution.

The regression is cross-sectional. The $y$ vector is the risk free return for, say, 1,000 companies. The $X$ matrix is made up of a constant and some factors such as book-to-price, momentum, etc.; say we have 6 such factors (7 columns including the constant). Then there are dummy variables for each stock's country (twenty countries). So our matrix is $1000 \times 27$ (including the constant).

However, I thought that when you have dummy variables for $n$ categories you would not use all of them; you would use $n-1$, because including all $n$ alongside a constant introduces perfect multicollinearity. Is the regression above mis-specified?

$\endgroup$

1 Answer

$\begingroup$

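You are right that if you use binary dummy variables for $n$ possible values of some feature (the country in your case), you need only $n-1$ variables when the regression also contains a constant, because the omitted (last or first) country is indicated by all dummy variables being equal to zero. With the constant and all twenty country dummies included, the dummy columns sum to the constant column in every row, so $X$ does not have full column rank and the coefficients are not uniquely determined.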
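As a rough illustration of the rank problem, here is a minimal NumPy sketch using the dimensions from the question (1,000 stocks, 6 factors, 20 countries). The data are simulated and every variable name is made up for the example; it is not the attribution model from the work being followed, only a check of the column rank with and without one dummy dropped.

```python
import numpy as np

# Simulated data only; sizes follow the question, names are hypothetical.
rng = np.random.default_rng(0)
n_stocks, n_factors, n_countries = 1000, 6, 20

const = np.ones((n_stocks, 1))
factors = rng.standard_normal((n_stocks, n_factors))

# Assign each stock to a country so that every country appears.
country = np.arange(n_stocks) % n_countries
rng.shuffle(country)
dummies = np.eye(n_countries)[country]  # one-hot rows, shape (1000, 20)

# Every row has exactly one dummy equal to 1, so the dummy columns
# add up to the constant column.
dummies_sum_to_const = np.allclose(dummies.sum(axis=1), 1.0)

# Design matrix with all 20 dummies (27 columns) ...
X_full = np.hstack([const, factors, dummies])
# ... and with the first country dropped as the baseline (26 columns).
X_drop = np.hstack([const, factors, dummies[:, 1:]])

rank_full = np.linalg.matrix_rank(X_full)
rank_drop = np.linalg.matrix_rank(X_drop)

print("full:", X_full.shape, "rank", rank_full)
print("drop:", X_drop.shape, "rank", rank_drop)
```

Under these assumptions the 27-column matrix should come out one short of full column rank, while the 26-column matrix has full column rank. In the reduced version, each remaining country coefficient is read relative to the omitted baseline country.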

$\endgroup$
