$\begingroup$

I am trying to reproduce a textbook example in Mathematica. The example studies the relationship between esophageal cancer and both smoking and excessive drinking. The original data are quite simple:

data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107}, {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}}; 

where each row (sub-list) contains x1 (smoking: no = 0, yes = 1), x2 (excessive drinking: no = 0, yes = 1), the total count, the number of positives and the number of negatives (total = positives + negatives).

The textbook gives, among other things, the estimates and standard errors of the linear coefficients b0 (intercept), b1 and b2; the picture below comes from http://statpages.info/logistic.html:

[image: table of coefficient estimates and standard errors]
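For reference, and assuming the textbook fits the standard logistic regression model with two binary predictors (the usual setup, though this is not confirmed from the textbook itself), the coefficients b0, b1 and b2 enter as follows, where p is the probability of a positive case:

$$\operatorname{logit}(p) = \ln\frac{p}{1-p} = b_0 + b_1 x_1 + b_2 x_2$$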


However, I then used LogitModelFit, which accepts data of the form {{x1, x2, y}, ...}. I took y to be the proportion of positive samples (number of positives / total number) and fitted the new data below:

dataNew = {{0., 0., 0.316583}, {0., 1., 0.370588}, {1., 0., 0.435644}, {1., 1., 0.637019}};
logitM = LogitModelFit[dataNew, {x1, x2}, {x1, x2}];
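The proportions in dataNew come straight from the original counts. Here is a minimal sketch of that step, assuming the row layout {x1, x2, total, positive, negative} described above:

```mathematica
(* original counts: {x1, x2, total, positive, negative} *)
data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107},
        {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};

(* keep the two predictors and replace the counts by positive/total *)
dataNew = {#[[1]], #[[2]], N[#[[4]]/#[[3]]]} & /@ data
```

This step discards the totals in data[[All, 3]], so each of the four rows enters the fit as a single observation regardless of how many subjects it summarizes.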

Then logitM["ParameterTable"] gives

[image: parameter table returned by logitM["ParameterTable"]]

The problem is that the results from Mathematica do not agree with those in the textbook.


So my questions are:

  1. Have I used LogitModelFit correctly?

  2. How can one obtain results consistent with those shown in the first picture?


P.S. I am using Windows 10 and Mathematica 11.2.

------------------------------

Conclusion to Question 1: I indeed did not use LogitModelFit correctly.

Conclusion to Question 2: See the selected answer.

And thanks a lot to @JimB and @J. M..

$\endgroup$
  • $\begingroup$ "an example in a textbook" - can you mention which textbook, please? $\endgroup$ Commented Nov 13, 2017 at 4:56
  • $\begingroup$ @J.M. The textbook is in Chinese. If necessary, I could translate it. $\endgroup$ Commented Nov 13, 2017 at 4:58
  • $\begingroup$ Hmm... did the textbook at least mention what software they were using that generated those results? $\endgroup$ Commented Nov 13, 2017 at 4:59
  • $\begingroup$ @J.M. The software used in the textbook is SPSS. $\endgroup$ Commented Nov 13, 2017 at 5:03
  • $\begingroup$ I don't have access to SPSS at the moment, but indeed I am getting conflicting results from Mathematica and the webpage you linked to. Let me check further... $\endgroup$ Commented Nov 13, 2017 at 5:16

1 Answer

$\begingroup$

Given the structure of your data, you should probably use GeneralizedLinearModelFit. First turn the data into an array whose rows have the form {smoking, drinking, status}, where status is 1 for a positive case and 0 for a negative one:

newdata = {{0, 0, 1}, {0, 1, 1}, {1, 0, 1}, {1, 1, 1}, {0, 0, 0}, {0, 1, 0}, {1, 0, 0}, {1, 1, 0}}; 

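Then construct the Weights from the associated frequency counts (the positive counts for the status = 1 rows, followed by the negative counts for the status = 0 rows):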

weights = Flatten[{data[[All, 4]], data[[All, 5]]}]; 
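Both arrays can also be generated from data rather than typed by hand. This is a minimal sketch, assuming the row layout {smoking, drinking, total, positive, negative} from the question; the names positives and negatives are introduced here only for illustration:

```mathematica
(* original counts: {smoking, drinking, total, positive, negative} *)
data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107},
        {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};

(* one row per predictor combination and outcome *)
positives = {#[[1]], #[[2]], 1} & /@ data;  (* status = 1 *)
negatives = {#[[1]], #[[2]], 0} & /@ data;  (* status = 0 *)
newdata = Join[positives, negatives];

(* frequency of each row, in the same order as newdata *)
weights = Join[data[[All, 4]], data[[All, 5]]];
```

Building the rows and the weights with the same Join order keeps each weight aligned with its row.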

The analysis is performed by

glm = GeneralizedLinearModelFit[newdata, {smoking, drinking}, {smoking, drinking}, ExponentialFamily -> "Binomial", Weights -> weights];
glm["ParameterTable"]

[image: table of estimated coefficients]

Update

You actually can fit the model with the same structure using LogitModelFit:

lmf = LogitModelFit[newdata, {smoking, drinking}, {smoking, drinking}, Weights -> weights];
lmf["ParameterTable"]
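A possible alternative, offered as a sketch rather than a tested recipe, is to keep the four proportions from the question and pass the totals as Weights. A proportion k/n with weight n contributes the same binomial log-likelihood as k positive and n - k negative cases, so this should reproduce the fit above. That equivalence is an assumption worth checking against the table. The names props, totals and lmfProp are hypothetical.

```mathematica
(* original counts: {smoking, drinking, total, positive, negative} *)
data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107},
        {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};

(* response = observed proportion of positives in each cell *)
props = {#[[1]], #[[2]], N[#[[4]]/#[[3]]]} & /@ data;

(* weight each proportion by the number of subjects behind it *)
totals = data[[All, 3]];

lmfProp = LogitModelFit[props, {smoking, drinking}, {smoking, drinking},
   Weights -> totals];
lmfProp["ParameterTable"]
```

If the two parameter tables agree, the mismatch in the question comes from omitting the weights rather than from using proportions as the response.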
$\endgroup$
  • $\begingroup$ Ah, that's more like it! So, LogitModelFit[] is broken here? $\endgroup$ Commented Nov 13, 2017 at 5:55
  • $\begingroup$ Sorry @J.M. I was too slow. It's not broken. I just have a tendency/bias to use GeneralizedLinearModelFit rather than the more specialized functions (LogitModelFit, ProbitModelFit, etc.). But I added the use of LogitModelFit in the answer. $\endgroup$ Commented Nov 13, 2017 at 6:03
  • $\begingroup$ @JimB Thank you very much! $\endgroup$ Commented Nov 13, 2017 at 6:11
  • $\begingroup$ @J.M. Also thank you for your timely response! $\endgroup$ Commented Nov 13, 2017 at 6:13
