compare logit models with different holdout tasks


To test whether one logit model fits significantly better than another, the log-likelihoods (LL) of the two models are subtracted to calculate a chi-square statistic, e.g. 2(LLnew - LLold).

After that, the degrees of freedom are calculated so that the statistic can be looked up in the chi-square table to check whether the difference between the models is significant.

But what do I do if there is no change in the estimated parameters, only in the number of holdout tasks (e.g. 1 vs. 2)? How do I determine the degrees of freedom for testing the significance of the model change?

asked Feb 18, 2019 by anonymous

1 Answer

0 votes
The -2LL test is a test of fit of the model to the estimation data.  It has nothing to do with holdout data.  
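
For reference, here is a minimal sketch of the -2LL (likelihood ratio) test as it applies to two nested models fit to the same estimation data. It uses only the Python standard library. The log-likelihoods and parameter counts are made-up numbers, and `chi2_sf` and `likelihood_ratio_test` are hypothetical helper names, not functions from any Sawtooth Software product. The degrees of freedom are the difference in the number of estimated parameters, so the sketch refuses to run when that difference is zero, which is the situation you describe.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite sum.
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2 * x / math.pi) * math.exp(-half) * total)

def likelihood_ratio_test(ll_old, ll_new, k_old, k_new):
    """Return (chi-square statistic, df, p-value) for nested models."""
    df = k_new - k_old
    if df <= 0:
        raise ValueError("the new model must estimate more parameters than the old one")
    stat = 2.0 * (ll_new - ll_old)
    return stat, df, chi2_sf(stat, df)

# Hypothetical values, for illustration only.
stat, df, p_value = likelihood_ratio_test(ll_old=-1250.0, ll_new=-1243.5,
                                          k_old=10, k_new=13)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

If both models estimate the same parameters, `df` is 0 and there is nothing for this test to evaluate.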

If you want to test model validity using holdout choices, then you probably need to hold out many more than 1-2 choice sets (I usually recommend at least 8).  

But to answer your question directly, when using holdouts, you should probably test the hit rates (assuming you ran HB analysis and have respondent-level utilities).  In that case you look at your prediction of how each respondent would make the holdout choices and compare it to the actual holdout choices each respondent made.  Then for each respondent you correctly predicted 0, 1 or 2 of the holdout questions (or 0 or 1 if you have only one holdout).  You can then test whether this number is higher for your New or Old model using a dependent (paired) t-test.
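
A rough sketch of that hit-rate comparison in plain Python, assuming you have already turned each model's respondent-level utilities into a predicted choice for every holdout task. All data below are invented for illustration (six respondents, two holdouts each), and `hits_per_respondent` and `paired_t` are hypothetical helper names.

```python
import math
from statistics import mean, stdev

def hits_per_respondent(predicted, actual):
    """Count correctly predicted holdout choices for each respondent."""
    return [sum(p == a for p, a in zip(pred_row, act_row))
            for pred_row, act_row in zip(predicted, actual)]

def paired_t(hits_new, hits_old):
    """Dependent (paired) t statistic and df for new-minus-old hit counts."""
    diffs = [n - o for n, o in zip(hits_new, hits_old)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Invented data: chosen concept index per respondent per holdout task.
actual   = [[1, 3], [2, 2], [3, 1], [1, 1], [2, 3], [3, 2]]
pred_old = [[1, 2], [2, 2], [1, 1], [2, 1], [2, 1], [1, 2]]
pred_new = [[1, 3], [2, 2], [3, 1], [2, 1], [2, 3], [1, 2]]

hits_old = hits_per_respondent(pred_old, actual)
hits_new = hits_per_respondent(pred_new, actual)
t_stat, t_df = paired_t(hits_new, hits_old)
print(f"hits old = {hits_old}, hits new = {hits_new}")
print(f"t = {t_stat:.3f} with {t_df} df")
```

The t statistic is then compared against a t table with the reported degrees of freedom. If SciPy is available, `scipy.stats.ttest_rel(hits_new, hits_old)` should give the same statistic along with a p-value.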
answered Feb 18, 2019 by Keith Chrzan Platinum Sawtooth Software, Inc. (95,775 points)