Modelling anchored MaxDiff

When using direct anchoring (comparing items with "zero"), Orme's Applied MaxDiff says that the comparison should be encoded as a two-row "choice" task. If the "main" MaxDiff survey has more alternatives (e.g. 3, 4, or 5), how can the modelling software fit the 2-alternative task within it?
asked Dec 22, 2019 by deekay

1 Answer

0 votes
See Exhibit 3.8 on page 32 for a specific example.  In this example, a MaxDiff set with 4 items is first coded as best and worst choice sets (the first 8 rows) and then in the next 8 rows each item is separately coded as the choice of being above the threshold or not.
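To make the row layout concrete, here is a minimal Python sketch of this kind of coding. It does not reproduce Exhibit 3.8 itself. It assumes one column per item, worst-choice rows coded as the negative of best-choice rows, and the threshold coded as an all-zero row (so its utility is fixed at zero). The function name `code_anchored_maxdiff` and its arguments are hypothetical.

```python
def code_anchored_maxdiff(shown, best, worst, above, n_items):
    """Return a list of tasks; each task is (rows, chosen_row_index).

    shown : item indices displayed in the MaxDiff set
    best, worst : item indices picked as best and worst
    above : dict mapping item index -> True if judged above the threshold
    """
    def one_hot(item, sign=1):
        row = [0] * n_items
        row[item] = sign
        return row

    tasks = []
    # Best task: one row per shown item, +1 in that item's column.
    tasks.append(([one_hot(i) for i in shown], shown.index(best)))
    # Worst task: the same items with the sign flipped (assumed convention).
    tasks.append(([one_hot(i, -1) for i in shown], shown.index(worst)))
    # Direct anchor: one two-row task per item, item row vs. all-zero threshold row.
    for i in shown:
        rows = [one_hot(i), [0] * n_items]
        tasks.append((rows, 0 if above[i] else 1))
    return tasks


tasks = code_anchored_maxdiff(
    shown=[0, 1, 2, 3], best=2, worst=0,
    above={0: False, 1: True, 2: True, 3: False}, n_items=4)

# Print every coded row followed by its 0/1 "chosen" flag.
for rows, chosen in tasks:
    for r, row in enumerate(rows):
        print(row, int(r == chosen))
```

For a set of 4 items this gives 4 + 4 rows for the best and worst tasks, followed by 4 two-row tasks, i.e. 16 rows in total.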
answered Dec 22, 2019 by Keith Chrzan Platinum Sawtooth Software, Inc. (115,750 points)
I was just looking at that page. :) I understand the coding idea, but since this means merging a 4-alternative task (the original MaxDiff) with a 2-alternative task (item vs. threshold/anchor), the modelling software has to be adapted to pick up both models and somehow merge them. This part is unclear to me. Does this mean that we essentially build two models and somehow combine them? Or is there a unified modelling formula for the anchored MaxDiff case?
I have never had to adapt software to get the model to run.  Most logit software packages I'm aware of have no problem handling choice sets with different numbers of alternatives - it's a very common thing in choice modeling.  This is the coding our software uses, and if you use this coding you could run the analysis with any of a number of other conditional logit software packages (the mlogit package, among others, in R; SAS, Systat, Nlogit and other packages, I imagine).
Of course it can handle different numbers of alternatives, but not within the same model/problem. E.g. if the MNL formula has 4 parameters (p1 = exp(u1)/(exp(u1)+exp(u2)+exp(u3)+exp(u4))), it cannot just like that use 2 parameters, at least when the model/coefficients are estimated (in the prediction phase this can be overcome, although it's not strictly correct). And in this coding scheme for anchors, two different problems/choice tasks are simply stuck together. This confuses me since I'm not sure how to estimate the model with such an approach. Orme also wrote a comment at this point, quoted: "By stacking the two types of choice tasks (MaxDiff and direct binary judgments, choices from quads and choices from pairs) within the same utility estimation, we are potentially mixing choice scenarios that likely have different contexts, response errors, and corresponding scale factors."
Actually, yes, it can just like that use different denominators for choice sets of different sizes.  This is not unique to our software, as it is a common feature in many different logit modeling software packages.  Of course, if you're concerned about the logit scale parameter being different (because of differing levels of response error in pairs and quads), you could remedy this by running a model where the scale parameter is also a variable that gets estimated.  This is pretty straightforward to do with an aggregate model, but when you're building a respondent-level model (e.g. a mixed logit with either hierarchical Bayesian MNL or with the method of simulated likelihoods), the modeling gets more complex, and it's something we don't do with our software outside of our Adaptive CBC package.
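The point about denominators can be shown with a small sketch of an aggregate conditional logit log-likelihood in plain Python. This is only an illustration, not Sawtooth's implementation; `mnl_log_likelihood` and the toy tasks are made up for the example. The parameter vector has one entry per item, and each task contributes a softmax over whichever rows it happens to contain.

```python
import math


def mnl_log_likelihood(beta, tasks):
    """Sum of log choice probabilities; each task is (rows, chosen_row_index)."""
    total = 0.0
    for rows, chosen in tasks:
        # Utility of each alternative in this task.
        utils = [sum(b * x for b, x in zip(beta, row)) for row in rows]
        # The denominator runs over this task's rows only (log-sum-exp for stability).
        m = max(utils)
        log_denom = m + math.log(sum(math.exp(u - m) for u in utils))
        total += utils[chosen] - log_denom
    return total


# One quad (best choice among 4 items) and one pair (item 2 vs. zero threshold).
quad = ([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], 2)
pair = ([[0, 0, 1, 0], [0, 0, 0, 0]], 0)

# With all utilities at zero this is log(1/4) + log(1/2).
print(mnl_log_likelihood([0.0, 0.0, 0.0, 0.0], [quad, pair]))
```

The same four parameters are used in both tasks; only the number of terms in each denominator changes.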
True, I looked again and it really is fine to mix "cases" with different numbers of alternatives. Regarding the additional estimate of the scale difference, I will look into this too; at this point it's not clear to me how to set it up. Thanks for your help anyway...
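One generic way to set up the scale difference in an aggregate model, sketched here under stated assumptions rather than as the method used by any particular package: fix the scale of the MaxDiff tasks at 1 for identification and multiply the utilities in the anchor (pair) tasks by a relative scale mu = exp(log_mu), which is then estimated together with the utilities. The name `scaled_log_likelihood` and the `is_anchor` flag are hypothetical.

```python
import math


def scaled_log_likelihood(beta, log_mu, tasks):
    """Each task is (rows, chosen_row_index, is_anchor).

    Anchor tasks have their utilities multiplied by mu = exp(log_mu);
    MaxDiff tasks keep scale 1 (assumed normalisation).
    """
    mu = math.exp(log_mu)  # exp keeps the scale positive
    total = 0.0
    for rows, chosen, is_anchor in tasks:
        scale = mu if is_anchor else 1.0
        utils = [scale * sum(b * x for b, x in zip(beta, row)) for row in rows]
        m = max(utils)
        total += utils[chosen] - m - math.log(sum(math.exp(u - m) for u in utils))
    return total


quad = ([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], 2, False)
pair = ([[0, 0, 1, 0], [0, 0, 0, 0]], 0, True)

beta = [0.0, 0.0, 1.0, 0.0]
print(scaled_log_likelihood(beta, 0.0, [quad, pair]))            # mu = 1: plain MNL
print(scaled_log_likelihood(beta, math.log(2.0), [quad, pair]))  # pairs twice as "sharp"
```

Maximising this function over `beta` and `log_mu` with a general-purpose optimizer gives the aggregate estimates; an estimated `log_mu` near zero would suggest the two task types have similar response error.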
It seems that I was right, at least partially. Although the documentation indicates the opposite, mlogit (mlogit.data code) in R doesn't support different numbers of alternatives. ChoiceModelR HB works fine. Some other models, including Stan code, also depend on having the same number of alternatives for all cases.
I would say that the issue is not so uncommon and that mixing different numbers of alternatives is not so widely supported. So it's good that you implemented it in your software. :)
Well, I eventually managed to make mlogit work with different numbers of alternatives. But it required some non-intuitive tweaking of the parameters.
You are correct, mlogit isn't the easiest to work with.  There's a new R package called Apollo that's made specifically for choice modeling, so it may be another option for you to look into.
Looks interesting, will definitely have a look. But before that, I want to set up a Stan script to support a varying number of alternatives. Do you have any experience/advice with that?
Sorry, but I do not.
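On the Stan question: Stan has no ragged arrays, so a common workaround is to stack all alternatives into one flat design matrix and pass per-task start positions and sizes, then slice out each task's utilities (in Stan, with `segment`) before applying the softmax. The sketch below shows only that indexing idea, in Python rather than Stan; `flatten_tasks` and `flat_log_likelihood` are hypothetical names, and indices are 0-based here whereas Stan's are 1-based.

```python
import math


def flatten_tasks(tasks):
    """Stack all rows and record where each task starts and how long it is."""
    X, start, size, chosen = [], [], [], []
    for rows, pick in tasks:
        start.append(len(X))   # offset of this task's first row
        size.append(len(rows)) # number of alternatives in this task
        chosen.append(pick)    # chosen row, relative to the task
        X.extend(rows)
    return X, start, size, chosen


def flat_log_likelihood(beta, X, start, size, chosen):
    # Utilities for every stacked row, computed once.
    utils = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for s, n, c in zip(start, size, chosen):
        seg = utils[s:s + n]   # this task's alternatives only
        m = max(seg)
        total += seg[c] - m - math.log(sum(math.exp(u - m) for u in seg))
    return total


quad = ([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], 2)
pair = ([[0, 0, 1, 0], [0, 0, 0, 0]], 0)
X, start, size, chosen = flatten_tasks([quad, pair])
print(start, size)
print(flat_log_likelihood([0.0, 0.0, 1.0, 0.0], X, start, size, chosen))
```

Because the task sizes are data rather than a fixed constant, the same layout works for quads, pairs, or any mix of the two.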