How sparse can sparse max diff be?

I am interested in conducting a sparse max diff with 200 items. My client has a strong preference for showing at most 5 items per max diff question and having at most 12 max diff questions per respondent.

With sparse max diff and 200 items, showing each item once will require 40 max diff questions per respondent [(200/5) * 1 = 40]. I have used sparse max diff in the past, and the sparsest design I have ever used showed each item once per respondent.

To meet the client's request, however, each item would be shown an average of 0.3 times per respondent [(200/5) * 0.3 = 12].
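The arithmetic above can be checked with a short sketch. The function names here are made up for illustration and are not part of Lighthouse Studio; the only inputs are the numbers already given in the question.

```python
def questions_needed(n_items, items_per_question, exposures_per_item):
    """Questions per respondent needed to show every item the given number of times."""
    return n_items * exposures_per_item / items_per_question


def average_exposures(n_items, items_per_question, n_questions):
    """Average number of times each item is shown to one respondent."""
    return n_questions * items_per_question / n_items


# Showing each of 200 items once, 5 items per question.
print(questions_needed(200, 5, 1))

# The client's limit: 12 questions of 5 items each.
print(average_exposures(200, 5, 12))

# 12 questions * 5 items = 60 item slots, so each respondent sees
# at most 60 of the 200 items and never sees at least 140 of them.
print(200 - 12 * 5)
```

Put another way, an average of 0.3 exposures means each respondent sees no more than 30% of the items at all.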

My question is, in sparse max diff when you need to show each item less than once per respondent, how sparse can you get? That is, is showing each item 0.3 times per respondent OK, 0.25 times per respondent OK, or have practitioners found that you should not go lower than (say) 0.5 times per respondent?

For analysis I will need to run HB estimation in Lighthouse Studio to obtain respondent-level max diff scores (as is done with regular non-sparse max diff designs). So I am looking for recommendations that apply to this type of analysis (versus aggregate logit). That is, how sparse can you go and still be able to estimate HB scores in Lighthouse Studio?

Thank you.
asked Mar 22, 2019 by anonymous

1 Answer

0 votes
We've tested Sparse MaxDiff with each item appearing one time per respondent, and we got reasonably good respondent-level utilities, but I think 40 questions will be too many for most audiences. With such a large number of items, you might want to consider a different kind of sparse method called Express MaxDiff. With Express MaxDiff you select a random subset (say 30) of the items per person and you use only those 30 in that respondent's MaxDiff. Up to 100 items we think Sparse works better than Express, but beyond that Express may be more viable than Sparse.

But I'm aware of no tests of Sparse wherein you show each item fewer than once per respondent, so I don't really know how well that would perform.
answered Mar 22, 2019 by Keith Chrzan Platinum Sawtooth Software, Inc. (115,350 points)
I like Keith's recommendations. If I could add something to the discussion...if you are only selecting 30 items out of the 200 items, HB estimation will need to impute scores for the remaining 170 items for each person. So, HB is going to do a great job of estimating the relative scores for each individual on the 30 items each respondent received (assuming you show each at least 2 or 3 times to each individual), but it's going to struggle to get precise individual-level estimates on the remaining 170 items per person. The information at the individual level on those 170 items (which are different for each respondent) will just be imputed from the population means and covariances.
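To make the Express numbers concrete, here is a minimal sketch of one respondent's design under the constraints from the question (30 of 200 items, 5 items per question, 12 questions). The function `express_design` and its shuffle-and-chunk logic are assumptions for illustration only; this is not how Lighthouse Studio builds its designs, and it makes no attempt to balance how often pairs of items appear together.

```python
import random
from collections import Counter


def express_design(n_items=200, subset_size=30, items_per_question=5,
                   n_questions=12, seed=None):
    """Hypothetical Express-style design for a single respondent.

    Draws a random subset of items, then makes repeated passes over it:
    each pass shuffles the subset and cuts it into questions, so every
    drawn item appears exactly once per pass.
    """
    if subset_size % items_per_question != 0:
        raise ValueError("subset_size must be a multiple of items_per_question")
    questions_per_pass = subset_size // items_per_question
    if n_questions % questions_per_pass != 0:
        raise ValueError("n_questions must be a multiple of questions per pass")

    rng = random.Random(seed)
    subset = rng.sample(range(n_items), subset_size)

    questions = []
    for _ in range(n_questions // questions_per_pass):
        rng.shuffle(subset)
        for start in range(0, subset_size, items_per_question):
            questions.append(subset[start:start + items_per_question])
    return sorted(subset), questions


subset, questions = express_design(seed=1)
counts = Counter(item for question in questions for item in question)

print(len(questions))         # 12 questions
print(set(counts.values()))   # {2}: every drawn item is shown twice
print(200 - len(subset))      # 170 items this respondent never sees
```

With these numbers, each of the 30 drawn items is shown twice (12 * 5 / 30 = 2), which is at the low end of the 2 to 3 exposures mentioned above, and the other 170 items rely entirely on the population-level information.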
Bryan's right, and I think the imputing from population means is why Sparse outdoes Express when the number of attributes is small enough to allow Sparse.
Thank you both for the very helpful response.