  • A 0/1 coding scheme is mainly useful when fitting regression models, among others, although several coding schemes are possible, e.g. -1/1 (which changes the interpretation of the regression coefficients). Coding should not be confused with data entry (that is, what you actually put in your database), though. For data entry, it is better to store the full labels and convert them to numerical values, or build a dedicated design matrix, when you fit your regression model. Otherwise, good luck telling what the 0s and 1s stand for in 5 years. Commented Oct 7, 2011 at 21:11
  • I've seen gender coded in the database as male, female, and unknown. Commented May 18, 2015 at 18:20
  • I think this question is best considered as two questions confounded. The larger question is why use 0/1 coding rather than any other for an indicator or dummy variable. The smaller question is why use 1 for male and 0 for female, to which one short answer is that many other codings are in use, including the opposite (1 for female, 0 for male), as well as various more complex codings that allow for unknown gender and for other gender categories. Commented May 18, 2015 at 18:34
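
To illustrate the first comment's point that the coding scheme changes how the regression coefficients read, here is a minimal sketch in plain Python. The records, the outcome values, and the helper names `simple_ols` and `fit_with_coding` are all hypothetical and chosen only for illustration. The labels are stored in full and converted to numbers only at model-fitting time, as that comment suggests.

```python
from statistics import mean

# Hypothetical data: full labels are stored, not numeric codes.
records = [
    ("male", 10.0),
    ("male", 12.0),
    ("female", 7.0),
    ("female", 9.0),
    ("female", 8.0),
]


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def fit_with_coding(records, coding):
    """Map labels to numbers with the given coding, then fit the line."""
    x = [coding[label] for label, _ in records]
    y = [value for _, value in records]
    return simple_ols(x, y)


# 0/1 (dummy) coding: intercept is the mean of the group coded 0,
# slope is the difference between the two group means.
dummy = fit_with_coding(records, {"female": 0, "male": 1})

# -1/1 coding: intercept is the unweighted average of the two group means,
# slope is half the difference between them.
effect = fit_with_coding(records, {"female": -1, "male": 1})

print("0/1 coding:  intercept={:.2f} slope={:.2f}".format(*dummy))
print("-1/1 coding: intercept={:.2f} slope={:.2f}".format(*effect))
```

In this made-up sample the group means are 8 (female) and 11 (male). Under 0/1 coding the fit gives an intercept of 8 and a slope of 3; under -1/1 coding it gives an intercept of 9.5 and a slope of 1.5. The fitted values are identical in both cases; only the meaning of the coefficients differs.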