Logistic regression for medical statistics -- Conflict between results by Mathematica and other statistical softwares

Question

I am trying to reproduce a textbook example in Mathematica. The example studies the relationship between esophageal cancer and two risk factors, smoking and excessive drinking. The original data are quite simple:

data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107}, {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};

Each row (sub-list) contains, in order: x1 (smoking: 0 = no, 1 = yes), x2 (excessive drinking: 0 = no, 1 = yes), the total number of subjects, the number of positive cases, and the number of negative cases (total = positive + negative).
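As a quick sanity check of that column layout, the following sketch pulls out the three count columns and verifies the stated identity. The names totals, positives and negatives are introduced here only for illustration; they are not part of the original question.

```mathematica
data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107},
        {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};

(* columns 3 to 5: total, positive and negative counts per group *)
{totals, positives, negatives} = Transpose[data[[All, 3 ;; 5]]];

(* True when every row satisfies total = positive + negative *)
totals == positives + negatives

(* observed proportion of positive cases in each of the four groups *)
N[positives/totals]
```

The last line yields the four proportions that appear as the response column of dataNew further down.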

The textbook gives, e.g., the estimates and standard errors of the linear coefficients b0 (intercept), b1 and b2 as follows (picture taken from http://statpages.info/logistic.html):

However, when I use LogitModelFit (which accepts data of the form {{x1, x2, y}, ...}, where I take y to be the proportion of positive samples, i.e. number of positive / total number) with the new data below

dataNew = {{0., 0., 0.316583}, {0., 1., 0.370588}, {1., 0., 0.435644}, {1., 1., 0.637019}};
logitM = LogitModelFit[dataNew, {x1, x2}, {x1, x2}];

Then logitM["ParameterTable"] gives

The problem is that the results obtained with Mathematica do not agree with those in the textbook.

So my questions are:

1. Have I used LogitModelFit correctly?
2. How can I obtain results consistent with those shown in the first picture?

P.S. I am using Windows 10 and Mathematica 11.2.

------------------------------

Conclusion to Question 1: I indeed did not use LogitModelFit correctly.

Conclusion to Question 2: See the selected answer.

Many thanks to @JimB and @J. M.

"an example in a textbook" - can you mention which textbook, please?
– J. M.'s missing motivation, Commented Nov 13, 2017 at 4:56
@J.M. The textbook is in Chinese. If necessary, I could translate it.
– Αλέξανδρος Ζεγγ, Commented Nov 13, 2017 at 4:58
Hmm... did the textbook at least mention what software they were using that generated those results?
– J. M.'s missing motivation, Commented Nov 13, 2017 at 4:59
I don't have access to SPSS at the moment, but indeed I am getting conflicting results from Mathematica and the webpage you linked to. Let me check further...
– J. M.'s missing motivation, Commented Nov 13, 2017 at 5:16

JimB · Accepted Answer · 2017-11-13 06:01:56Z

Given the structure of your data, you should probably use GeneralizedLinearModelFit. First turn the data into an array whose rows have the form {smoking, drinking, status}, with status 1 for positive and 0 for negative:

newdata = {{0, 0, 1}, {0, 1, 1}, {1, 0, 1}, {1, 1, 1}, {0, 0, 0}, {0, 1, 0}, {1, 0, 0}, {1, 1, 0}};

Then construct the Weights from the associated frequency counts:

weights = Flatten[{data[[All, 4]], data[[All, 5]]}];
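The pairing between rows and counts matters here: the first four rows of newdata (status 1) must line up with the positive counts, and the last four (status 0) with the negative counts. The sketch below rebuilds both lists directly from data and displays them side by side. Building newdata programmatically and the table layout are illustrative additions, not part of the original answer.

```mathematica
data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107},
        {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};

(* one row per (smoking, drinking, status) combination: positives first, then negatives *)
newdata = Join[{#1, #2, 1} & @@@ data, {#1, #2, 0} & @@@ data];

(* frequency counts in the same order as the rows of newdata *)
weights = Join[data[[All, 4]], data[[All, 5]]];

(* the weights should account for every subject in the study *)
Total[weights] == Total[data[[All, 3]]]

(* show each covariate pattern next to its count *)
TableForm[Append @@@ Transpose[{newdata, weights}],
  TableHeadings -> {None, {"smoking", "drinking", "status", "count"}}]
```

If the totals comparison returns True and the table rows look as expected, the weights are aligned with newdata.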

The analysis is performed by

glm = GeneralizedLinearModelFit[newdata, {smoking, drinking}, {smoking, drinking},
  ExponentialFamily -> "Binomial", Weights -> weights];
glm["ParameterTable"]
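One way to cross-check the weighted fit is to expand the grouped data into one row per subject and fit without weights. This is a sketch under the assumption that Weights acts as a frequency count here, in which case the expanded fit should agree with the weighted one. The names expanded and glmExpanded are hypothetical.

```mathematica
data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107},
        {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};
newdata = {{0, 0, 1}, {0, 1, 1}, {1, 0, 1}, {1, 1, 1},
           {0, 0, 0}, {0, 1, 0}, {1, 0, 0}, {1, 1, 0}};
weights = Flatten[{data[[All, 4]], data[[All, 5]]}];

(* repeat each {smoking, drinking, status} row as many times as its count *)
expanded = Join @@ MapThread[ConstantArray, {newdata, weights}];

(* number of rows equals the total number of subjects *)
Length[expanded] == Total[data[[All, 3]]]

(* unweighted fit on the individual-level data *)
glmExpanded = GeneralizedLinearModelFit[expanded, {smoking, drinking},
   {smoking, drinking}, ExponentialFamily -> "Binomial"];
glmExpanded["ParameterTable"]
```

The weighted form is more compact; the expanded form is mainly useful for confirming how the weights are being interpreted.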

Update

You actually can fit the model with the same structure using LogitModelFit:

lmf = LogitModelFit[newdata, {smoking, drinking}, {smoking, drinking}, Weights -> weights];
lmf["ParameterTable"]
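For comparing against the textbook numbers, it can be easier to pull the estimates and standard errors out as plain lists rather than reading them off the table. The following sketch uses standard properties of the fitted model object. The odds-ratio line is an illustrative addition (the exponential of each logit coefficient), not something computed in the original answer.

```mathematica
data = {{0, 0, 199, 63, 136}, {0, 1, 170, 63, 107},
        {1, 0, 101, 44, 57}, {1, 1, 416, 265, 151}};
newdata = {{0, 0, 1}, {0, 1, 1}, {1, 0, 1}, {1, 1, 1},
           {0, 0, 0}, {0, 1, 0}, {1, 0, 0}, {1, 1, 0}};
weights = Flatten[{data[[All, 4]], data[[All, 5]]}];

lmf = LogitModelFit[newdata, {smoking, drinking}, {smoking, drinking},
   Weights -> weights];

(* coefficient estimates in the order {intercept, smoking, drinking} *)
lmf["BestFitParameters"]

(* corresponding standard errors *)
lmf["ParameterErrors"]

(* confidence intervals for the coefficients *)
lmf["ParameterConfidenceIntervals"]

(* exponentiated coefficients; the last two are odds ratios for smoking and drinking *)
Exp[lmf["BestFitParameters"]]
```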

Ah, that's more like it! So, LogitModelFit[] is broken here?
– J. M.'s missing motivation, Commented Nov 13, 2017 at 5:55
Sorry @J.M. I was too slow. It's not broken. I just have a tendency/bias to use GeneralizedLinearModelFit rather than the more specialized functions (LogitModelFit, ProbitModelFit, etc.). But I added the use of LogitModelFit in the answer.
– JimB, Commented Nov 13, 2017 at 6:03
